Hadoop How to Group mongodb - mapReduce output

0 votes

Ihave a result of key value pair from mapReduce function , now i want to run the query on this output of mapReduce.

So i am using mapReduce to find out the stats of user like this

db.order.mapReduce(function() { emit (this.customer,{count:1,orderDate:this.orderDate.interval_start}) },
function(key,values){ 
    var sum =0 ; var lastOrderDate;  
    values.forEach(function(value) {
     if(value['orderDate']){ 
        lastOrderDate=value['orderDate'];
    }  
    sum+=value['count'];
}); 
    return {count:sum,lastOrderDate:lastOrderDate}; 
},
{ query:{status:"DELIVERED"},out:"order_total"}).find()

which give me output like this

{ "_id" : ObjectId("5443765ae4b05294c8944d5b"), "value" : { "count" : 1, "orderDate" : ISODate("2014-10-18T18:30:00Z") } }
{ "_id" : ObjectId("54561911e4b07a0a501276af"), "value" : { "count" : 2, "lastOrderDate" : ISODate("2015-03-14T18:30:00Z") } }
{ "_id" : ObjectId("54561b9ce4b07a0a501276b1"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-01T18:30:00Z") } }
{ "_id" : ObjectId("5458712ee4b07a0a501276c2"), "value" : { "count" : 2, "lastOrderDate" : ISODate("2014-11-03T18:30:00Z") } }
{ "_id" : ObjectId("545f64e7e4b07a0a501276db"), "value" : { "count" : 15, "lastOrderDate" : ISODate("2015-06-04T18:30:00Z") } }
{ "_id" : ObjectId("54690771e4b0070527c657ed"), "value" : { "count" : 6, "lastOrderDate" : ISODate("2015-06-03T18:30:00Z") } }
{ "_id" : ObjectId("54696c64e4b07f3c07010b4a"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-18T18:30:00Z") } }
{ "_id" : ObjectId("546980d1e4b07f3c07010b4d"), "value" : { "count" : 4, "lastOrderDate" : ISODate("2015-03-24T18:30:00Z") } }
{ "_id" : ObjectId("54699ac4e4b07f3c07010b51"), "value" : { "count" : 30, "lastOrderDate" : ISODate("2015-05-23T18:30:00Z") } }
{ "_id" : ObjectId("54699d0be4b07f3c07010b55"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-16T18:30:00Z") } }
{ "_id" : ObjectId("5469a1dce4b07f3c07010b59"), "value" : { "count" : 2, "lastOrderDate" : ISODate("2015-04-29T18:30:00Z") } }
{ "_id" : ObjectId("5469a96ce4b07f3c07010b5e"), "value" : { "count" : 1, "orderDate" : ISODate("2014-11-16T18:30:00Z") } }
{ "_id" : ObjectId("5469c1ece4b07f3c07010b64"), "value" : { "count" : 9, "lastOrderDate" : ISODate("2015-04-15T18:30:00Z") } }
{ "_id" : ObjectId("5469f422e4b0ce7d5ee021ad"), "value" : { "count" : 5, "lastOrderDate" : ISODate("2015-06-01T18:30:00Z") } }
......

Now i want to run query and group the users on the basis of count in different categories like for user with count less than 5 in one group , 5-10 in another, etc

and want output something like this

{userLessThan5: 9 }
{user5to10: 2 }
{user10to15: 1 }
{user15to20: 0 }
  ....

How can I do it?

Oct 31, 2018 in Big Data Hadoop by digger
• 26,720 points
457 views

1 answer to this question.

0 votes
db.order.mapReduce(function() { emit (this.customer,{count:1,orderDate:this.orderDate.interval_start}) },
function(key,values){ 
var category; // add this new field
var sum =0 ; var lastOrderDate;  
values.forEach(function(value) {
 if(value['orderDate']){ 
    lastOrderDate=value['orderDate'];
}  
sum+=value['count'];
}); 
// at this point you are already aware in which category your records lies , just add a new field to mark it
 if(sum < 5){ category: userLessThan5};
 if(sum >= 5 && sum <=10){ category: user5to10};
 if(sum <= 10 && sum >= 15){ category: user10to15};
 if(sum <= 15 && sum >=20){ category: user15to20};
  ....
return {count:sum,lastOrderDate:lastOrderDate,category:category}; 
},
{ query:{status:"DELIVERED"},out:"order_total"}).find()
 db.order_total.aggregate([{ $group: { "_id": "$value.category", "users": { $sum: 1 } } }]);

Hope this helps!!

To know more about Mongodb, go for Mongodb training online without fail.

Thanks!

answered Oct 31, 2018 by Omkar
• 69,210 points

Related Questions In Big Data Hadoop

0 votes
1 answer

How to format the output being written by MapReduce in Hadoop?

Here is a simple code demonstrate the ...READ MORE

answered Sep 5, 2018 in Big Data Hadoop by Frankie
• 9,830 points
1,355 views
0 votes
1 answer

How can we send data from MongoDB to Hadoop?

The MongoDB Connector for Hadoop reads data ...READ MORE

answered Mar 27, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
1,241 views
0 votes
1 answer

How hadoop mapreduce job is submitted to worker nodes?

Alright, I think you are basically looking ...READ MORE

answered Mar 30, 2018 in Big Data Hadoop by Ashish
• 2,650 points
4,179 views
0 votes
1 answer

Integration of Hadoop with Mongo DB concept

MongoDB isn't built to work on top ...READ MORE

answered Sep 25, 2018 in Big Data Hadoop by Frankie
• 9,830 points
768 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
8,760 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
1,558 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
78,088 views
0 votes
1 answer

In Hadoop MapReduce, how can i set an Object as the Value for Map output?

Try this and see if it works: public ...READ MORE

answered Nov 21, 2018 in Big Data Hadoop by Omkar
• 69,210 points
461 views
0 votes
1 answer

Hadoop: How to get the column name along with the output in Hive?

You can get the column names by ...READ MORE

answered Nov 21, 2018 in Big Data Hadoop by Omkar
• 69,210 points
3,721 views
webinar REGISTER FOR FREE WEBINAR X
Send OTP
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP