Hadoop: How to group MongoDB mapReduce output?

I have a result of key-value pairs from a mapReduce function, and now I want to run a query on that mapReduce output.

I am using mapReduce to compute per-user stats like this:

db.order.mapReduce(
    function() { emit(this.customer, { count: 1, orderDate: this.orderDate.interval_start }); },
    function(key, values) {
        var sum = 0; var lastOrderDate;
        values.forEach(function(value) {
            if (value['orderDate']) {
                lastOrderDate = value['orderDate'];
            }
            sum += value['count'];
        });
        return { count: sum, lastOrderDate: lastOrderDate };
    },
    { query: { status: "DELIVERED" }, out: "order_total" }
).find()

which gives me output like this:

{ "_id" : ObjectId("5443765ae4b05294c8944d5b"), "value" : { "count" : 1, "orderDate" : ISODate("2014-10-18T18:30:00Z") } }
{ "_id" : ObjectId("54561911e4b07a0a501276af"), "value" : { "count" : 2, "lastOrderDate" : ISODate("2015-03-14T18:30:00Z") } }
{ "_id" : ObjectId("54561b9ce4b07a0a501276b1"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-01T18:30:00Z") } }
{ "_id" : ObjectId("5458712ee4b07a0a501276c2"), "value" : { "count" : 2, "lastOrderDate" : ISODate("2014-11-03T18:30:00Z") } }
{ "_id" : ObjectId("545f64e7e4b07a0a501276db"), "value" : { "count" : 15, "lastOrderDate" : ISODate("2015-06-04T18:30:00Z") } }
{ "_id" : ObjectId("54690771e4b0070527c657ed"), "value" : { "count" : 6, "lastOrderDate" : ISODate("2015-06-03T18:30:00Z") } }
{ "_id" : ObjectId("54696c64e4b07f3c07010b4a"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-18T18:30:00Z") } }
{ "_id" : ObjectId("546980d1e4b07f3c07010b4d"), "value" : { "count" : 4, "lastOrderDate" : ISODate("2015-03-24T18:30:00Z") } }
{ "_id" : ObjectId("54699ac4e4b07f3c07010b51"), "value" : { "count" : 30, "lastOrderDate" : ISODate("2015-05-23T18:30:00Z") } }
{ "_id" : ObjectId("54699d0be4b07f3c07010b55"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-16T18:30:00Z") } }
{ "_id" : ObjectId("5469a1dce4b07f3c07010b59"), "value" : { "count" : 2, "lastOrderDate" : ISODate("2015-04-29T18:30:00Z") } }
{ "_id" : ObjectId("5469a96ce4b07f3c07010b5e"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-16T18:30:00Z") } }
{ "_id" : ObjectId("5469c1ece4b07f3c07010b64"), "value" : { "count" : 9, "lastOrderDate" : ISODate("2015-04-15T18:30:00Z") } }
{ "_id" : ObjectId("5469f422e4b0ce7d5ee021ad"), "value" : { "count" : 5, "lastOrderDate" : ISODate("2015-06-01T18:30:00Z") } }
......

Now I want to run a query that groups the users by count into categories: for example, users with a count less than 5 in one group, 5-10 in another, and so on.

I want output something like this:

{userLessThan5: 9 }
{user5to10: 2 }
{user10to15: 1 }
{user15to20: 0 }
  ....

How can I do it?

Oct 31, 2018 in Big Data Hadoop by digger
• 27,620 points
61 views

1 answer to this question.

db.order.mapReduce(
    // Emit the same shape that reduce returns, so that re-reducing is safe.
    function() { emit(this.customer, { count: 1, lastOrderDate: this.orderDate.interval_start }); },
    function(key, values) {
        var sum = 0; var lastOrderDate;
        values.forEach(function(value) {
            // Keep the most recent order date seen for this customer.
            if (value.lastOrderDate && (!lastOrderDate || value.lastOrderDate > lastOrderDate)) {
                lastOrderDate = value.lastOrderDate;
            }
            sum += value.count;
        });
        return { count: sum, lastOrderDate: lastOrderDate };
    },
    {
        query: { status: "DELIVERED" },
        out: "order_total",
        // reduce is not called for a customer with a single order, so the
        // category is added in finalize, which runs once for every key.
        finalize: function(key, value) {
            if (value.count < 5) { value.category = "userLessThan5"; }
            else if (value.count <= 10) { value.category = "user5to10"; }
            else if (value.count <= 15) { value.category = "user10to15"; }
            else if (value.count <= 20) { value.category = "user15to20"; }
            // .... (further ranges as needed)
            return value;
        }
    }
).find()
// Count the users in each category.
db.order_total.aggregate([{ $group: { "_id": "$value.category", "users": { $sum: 1 } } }]);
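
To see how the pieces fit together without a database, here is a minimal plain-JavaScript sketch of the same flow: map, reduce, a finalize step that attaches the category, and a final count per category. It assumes MongoDB's documented behaviour that reduce is not invoked for a key with only one emitted value, which is why the category is attached in finalize here. `runMapReduce` and `categorize` are hypothetical helpers written for this illustration, not MongoDB APIs, the sample orders are made up, and the `userMoreThan20` bucket is an assumed catch-all for the ranges the answer leaves out.

```javascript
// Plain-JavaScript imitation of map -> reduce -> finalize -> $group.
// runMapReduce and categorize are hypothetical helpers, not MongoDB APIs.

function categorize(count) {
  if (count < 5) return "userLessThan5";
  if (count <= 10) return "user5to10";
  if (count <= 15) return "user10to15";
  if (count <= 20) return "user15to20";
  return "userMoreThan20"; // assumed catch-all; the original elides further ranges
}

function runMapReduce(docs, map, reduce, finalize) {
  // Collect every emitted value under its key.
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs.forEach((doc) => map(doc, emit));

  const out = [];
  for (const [key, values] of emitted) {
    // Assumption mirrored from MongoDB: reduce is skipped for single-value keys.
    const reduced = values.length === 1 ? values[0] : reduce(key, values);
    // finalize runs for every key, so every row gets a category.
    out.push({ _id: key, value: finalize(key, reduced) });
  }
  return out;
}

// Made-up sample orders: customer "a" has 1 order, "b" has 6, "c" has 2.
const orders = [];
const addOrders = (customer, n) => {
  for (let i = 0; i < n; i++) orders.push({ customer, status: "DELIVERED" });
};
addOrders("a", 1);
addOrders("b", 6);
addOrders("c", 2);

const orderTotal = runMapReduce(
  orders,
  (doc, emit) => emit(doc.customer, { count: 1 }),
  (key, values) => ({ count: values.reduce((s, v) => s + v.count, 0) }),
  (key, value) => ({ ...value, category: categorize(value.count) })
);

// Equivalent of { $group: { _id: "$value.category", users: { $sum: 1 } } }.
const usersPerCategory = {};
for (const row of orderTotal) {
  const c = row.value.category;
  usersPerCategory[c] = (usersPerCategory[c] || 0) + 1;
}

console.log(usersPerCategory);
```

Run with Node.js, this prints `{ userLessThan5: 2, user5to10: 1 }`. Customer "a" has a single order and never passes through reduce, yet still receives a category because finalize sees every key.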

answered Oct 31, 2018 by Omkar
• 67,120 points
