MapReduce with MongoDB
MongoDB is an open-source, document-oriented NoSQL database system written in C++. You can read more about MongoDB from here.
1. Installing MongoDB.
2. Running MongoDB
3. Starting MongoDB shell
mongo [ip_address]:[port]
e.g.: mongo localhost:4000
4. Let's create a database first.
In the MongoDB shell, type the following:
> use library
The above switches the shell to a database called 'library'. MongoDB creates the database lazily, when data is first stored in it.
To list all the databases, type the following. Note that 'library' appears in the list only after some data has been inserted into it.
> show dbs;
5. Inserting data into MongoDB.
Let’s first create two books with the following commands.
> book1 = {name : "Understanding JAVA", pages : 100}
> book2 = {name : "Understanding JSON", pages : 200}
Now, let's insert these two books into a collection called books.
> db.books.save(book1)
> db.books.save(book2)
The above two statements create a collection called books under the database library. The following statement lists the two books which we just saved.
> db.books.find();
{ "_id" : ObjectId("4f365b1ed6d9d6de7c7ae4b1"), "name" : "Understanding JAVA", "pages" : 100 }
{ "_id" : ObjectId("4f365b28d6d9d6de7c7ae4b2"), "name" : "Understanding JSON", "pages" : 200 }
Let's add a few more records.
> book = {name : "Understanding XML", pages : 300}
> db.books.save(book)
> book = {name : "Understanding Web Services", pages : 400}
> db.books.save(book)
> book = {name : "Understanding Axis2", pages : 150}
> db.books.save(book)
6. Writing the Map function
Let's process the books collection to count how many books have fewer than 250 pages and how many have 250 pages or more.
> var map = function() {
    var category;
    if ( this.pages >= 250 )
      category = 'Big Books';
    else
      category = "Small Books";
    emit(category, {name: this.name});
  };
Here, the Map function emits one pair of category and book name per book. Grouped by category, the emitted values look like this:
{"Big Books": [{name: "Understanding XML"}, {name: "Understanding Web Services"}]}
{"Small Books": [{name: "Understanding JAVA"}, {name: "Understanding JSON"}, {name: "Understanding Axis2"}]}
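To see how this grouping comes about, here is a minimal sketch in plain JavaScript that imitates the map phase outside MongoDB. The emit function and the groups object are stand-ins written for this illustration, not MongoDB's implementation. In the shell, MongoDB supplies emit itself and binds this to each document.

```javascript
// In-memory copy of the five books inserted earlier.
var books = [
  {name: "Understanding JAVA", pages: 100},
  {name: "Understanding JSON", pages: 200},
  {name: "Understanding XML", pages: 300},
  {name: "Understanding Web Services", pages: 400},
  {name: "Understanding Axis2", pages: 150}
];

// Hypothetical stand-in for MongoDB's emit(): collect the values per key.
var groups = {};
function emit(key, value) {
  if (!groups[key]) groups[key] = [];
  groups[key].push(value);
}

// Same map function as in the article.
var map = function() {
  var category;
  if (this.pages >= 250)
    category = 'Big Books';
  else
    category = "Small Books";
  emit(category, {name: this.name});
};

// Call map once per document, with `this` bound to that document.
books.forEach(function(book) { map.call(book); });

console.log(JSON.stringify(groups, null, 2));
```

Running this with Node.js prints the two groups, each holding the names of the books that fell into that category.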
7. Writing the Reduce function.
> var reduce = function(key, values) {
    var sum = 0;
    values.forEach(function(doc) {
      sum += 1;
    });
    return {books: sum};
  };
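One point worth keeping in mind: the MongoDB documentation states that reduce may be called more than once for the same key, with the output of an earlier reduce call among its values, so the value it returns should have the same shape as the values emitted by map. The sketch below is a variant, not the version used in this article. It assumes that map emits {books: 1} instead of {name: ...}, and the names countingReduce, partialA and partialB are made up for the illustration. It runs in plain JavaScript without MongoDB.

```javascript
// Variant reduce: assumes every value has the shape {books: <number>},
// for example because map called emit(category, {books: 1}).
var countingReduce = function(key, values) {
  var sum = 0;
  values.forEach(function(doc) {
    sum += doc.books;
  });
  return {books: sum};
};

// Three values as the variant map would emit them for "Small Books".
var emitted = [{books: 1}, {books: 1}, {books: 1}];

// Reduce everything in a single call.
var direct = countingReduce("Small Books", emitted);

// Reduce in two batches, then reduce the partial results again.
var partialA = countingReduce("Small Books", emitted.slice(0, 2));
var partialB = countingReduce("Small Books", emitted.slice(2));
var rereduced = countingReduce("Small Books", [partialA, partialB]);

console.log(direct.books, rereduced.books);
```

Both paths give the same count here. A reduce that adds 1 per value would instead count the two partial results in the second path and return 2.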
8. Running MapReduce against the books collection.
> var count = db.books.mapReduce(map, reduce, {out: "book_results"});
> db[count.result].find()
{ "_id" : "Big Books", "value" : { "books" : 2 } }
{ "_id" : "Small Books", "value" : { "books" : 3 } }
The output shows that we have 2 Big Books and 3 Small Books.
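The whole flow can also be imitated in memory to check this result. This is only a sketch: runMapReduce is a hypothetical helper written for this illustration, and it ignores what a real server does (output collections, repeated reduce calls, skipping reduce for keys with a single value). It groups the emitted values by key, calls reduce once per key, and returns documents in the same _id/value shape that the shell printed.

```javascript
// Set by runMapReduce so that the map function can call emit() as in the shell.
var emit;

// Hypothetical helper: imitates db.collection.mapReduce() on a plain array.
function runMapReduce(docs, map, reduce) {
  var groups = {};
  emit = function(key, value) {
    if (!groups[key]) groups[key] = [];
    groups[key].push(value);
  };
  // Map phase: one call per document, with `this` bound to the document.
  docs.forEach(function(doc) { map.call(doc); });
  // Reduce phase: one call per key, results sorted by key.
  return Object.keys(groups).sort().map(function(key) {
    return {_id: key, value: reduce(key, groups[key])};
  });
}

// The map and reduce functions from the article.
var map = function() {
  var category;
  if (this.pages >= 250)
    category = 'Big Books';
  else
    category = "Small Books";
  emit(category, {name: this.name});
};

var reduce = function(key, values) {
  var sum = 0;
  values.forEach(function(doc) {
    sum += 1;
  });
  return {books: sum};
};

var books = [
  {name: "Understanding JAVA", pages: 100},
  {name: "Understanding JSON", pages: 200},
  {name: "Understanding XML", pages: 300},
  {name: "Understanding Web Services", pages: 400},
  {name: "Understanding Axis2", pages: 150}
];

var results = runMapReduce(books, map, reduce);
results.forEach(function(doc) { console.log(JSON.stringify(doc)); });
```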
Everything done above using the MongoDB shell can be done with Java too. The following is a Java client for it. You can download the required dependent jar from here.
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MapReduceCommand;
import com.mongodb.MapReduceOutput;
import com.mongodb.Mongo;

public class MongoClient {

    /**
     * @param args
     */
    public static void main(String[] args) {

        Mongo mongo;

        try {
            mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("library");

            DBCollection books = db.getCollection("books");

            BasicDBObject book = new BasicDBObject();
            book.put("name", "Understanding JAVA");
            book.put("pages", 100);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding JSON");
            book.put("pages", 200);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding XML");
            book.put("pages", 300);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Web Services");
            book.put("pages", 400);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Axis2");
            book.put("pages", 150);
            books.insert(book);

            String map = "function() { "
                    + "var category; "
                    + "if ( this.pages >= 250 ) "
                    + "category = 'Big Books'; "
                    + "else "
                    + "category = 'Small Books'; "
                    + "emit(category, {name: this.name});}";

            String reduce = "function(key, values) { "
                    + "var sum = 0; "
                    + "values.forEach(function(doc) { "
                    + "sum += 1; "
                    + "}); "
                    + "return {books: sum};} ";

            MapReduceCommand cmd = new MapReduceCommand(books, map, reduce,
                    null, MapReduceCommand.OutputType.INLINE, null);

            MapReduceOutput out = books.mapReduce(cmd);

            for (DBObject o : out.results()) {
                System.out.println(o.toString());
            }
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
Reference: MapReduce with MongoDB from our JCG partner Prabath Siriwardena at the Facile Login blog.
Great job. This article is very informative for me. I would like to learn more about MongoDB.
This is not the way to implement map reduce tasks that I know. What is the map and reduce code? Is it jaqul code embedded in Java code?
Thanks for sharing such a simple and nice article. It was quick for me to deduce that the map and reduce anonymous functions live only in the current client shell, which was otherwise difficult for me to work out from any other tutorial.
This example is outdated. You should now use the MongoClient, MongoCollection and Document classes instead, for example:
MongoClient mongoClient = new MongoClient("localhost", 27017);
MongoDatabase database = mongoClient.getDatabase("library");
MongoCollection<Document> collection = database.getCollection("books");
// The article's map function reads this.name, so the field is called "name" here.
Document doc = new Document("name", "Understanding JAVA")
        .append("pages", 100);
collection.insertOne(doc);
….
// map and reduce are the JavaScript function strings from the article's Java client.
MapReduceIterable<Document> documents = collection.mapReduce(map, reduce);
MongoCursor<Document> iterator = documents.iterator();
while (iterator.hasNext()) {
    Document resDoc = iterator.next();
    System.out.println(resDoc);
}
mongoClient.close();
I need the exact jar file for MongoClient and the full program for your example, because I am new to MongoDB and am looking for MongoDB code to learn from. Kindly help me to learn the process.