Apache Hadoop
-
Software Development
The Lord of the Things: Spark or Hadoop?
Are people in your data analytics organization contemplating the impending data avalanche from the internet of things and thus asking…
-
Software Development
Mesos and YARN: A tale of two clusters
This is a tale of two siloed clusters. The first cluster is an Apache Hadoop cluster. This is an island…
-
Software Development
What Are The Advanced Apache Hadoop MapReduce Features?
Overview: Basic MapReduce programming explains the workflow details, but it does not cover the actual working details inside…
-
Software Development
Tuning Hadoop & Cassandra: Beware of vNodes, Splits and Pages
When running Hadoop jobs against Cassandra, you will want to be careful about a few parameters. Specifically, pay special attention…
-
Enterprise Java
Delta Architectures: Unifying the Lambda Architecture and leveraging Storm from Hadoop/REST
Recently, I’ve been asked by a bunch of people to go into more detail on the Druid/Storm integration that I…
-
Enterprise Java
Running PageRank Hadoop job on AWS Elastic MapReduce
In a previous post I described an example to perform a PageRank calculation which is part of the Mining Massive…
-
Enterprise Java
Calculate PageRanks with Apache Hadoop
Currently I am following the Coursera training ‘Mining Massive Datasets’. I have been interested in MapReduce and Apache Hadoop for…
-
Software Development
Hadoop and the OpenDataPlatform
Pivotal, IBM and Hortonworks announced today the “Open Data Platform” (ODP) – an attempt to standardize Hadoop. This move seems…
-
Software Development
Lambda Architecture for Big Data
An increasing number of systems are being built to handle the Volume, Velocity and Variety of Big Data, and hopefully help gain new…