Big Data
-
Software Development
The Changing Economics of Big Data
Perhaps you’re old enough to remember when the library was the place we went to learn. We foraged through card…
-
Software Development
Real Time Credit Card Fraud Detection with Apache Spark and Event Streaming
In this post we are going to discuss building a real-time solution for credit card fraud detection. There are…
-
Software Development
Distributed Deep Learning with Caffe Using a MapR Cluster
We have experimented with CaffeOnSpark on a 5-node MapR 5.1 cluster running Spark 1.5.2 and will share our experience, difficulties,…
-
Software Development
Evolution of Big Data Storage: How to Support Real-time Analytics at Scale
Organizations embracing big data are ready to put data to work, including looking for ways to effectively analyze data from…
-
Software Development
Spark Streaming and Twitter Sentiment Analysis
This blog post is the result of my efforts to show a coworker how to get the insights he…
-
Software Development
Key Steps for Removing the Hive Metastore Password from the Hive Configuration
In a typical Hive installation with metadata in a MySQL configuration, a password is configured in a configuration file in…
-
Software Development
Spark Data Source API: Extending Our Spark SQL Query Engine
In my last post, Apache Spark as a Distributed SQL Engine, we explained how we could use SQL to query…
-
Software Development
Achieving Sub Second SQL JOINs and building a data warehouse using Spark, Cassandra, and FiloDB
Evan loves to design, build, and improve bleeding edge distributed data and backend systems using the latest in open source…
-
Software Development
The Method Behind March Madness
There are 150 quintillion (i.e. two steps past trillion) permutations to consider when completing your NCAA bracket. Some of us…