Use of Group ALL in Pig

Can someone explain, with an example, how to use GROUP ALL in Pig?
Jul 10 in Big Data Hadoop by Esha

1 answer to this question.

Suppose we have a data set as follows:

001,Rajiv,Reddy,21,9848022337,Hyderabad

002,siddarth,Battacharya,22,9848022338,Kolkata

003,Rajesh,Khanna,22,9848022339,Delhi

004,Preethi,Agarwal,21,9848022330,Pune

005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar

006,Archana,Mishra,23,9848022335,Chennai

007,Komal,Nayak,24,9848022334,trivendram

008,Bharathi,Nambiayar,24,9848022333,Chennai

and we are loading this table as follows:
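A LOAD statement along these lines would work. This is only a sketch: the file path is a placeholder and the field names are assumptions, since only the relation name student_details appears in the GROUP statement further down. The id is typed as int because the dumped output shows 8 rather than 008.

```pig
-- Hypothetical path: replace with the real location of the file.
-- Field names are assumed for illustration.
student_details = LOAD 'student_details.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray,
        age:int, phone:chararray, city:chararray);
```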

The record that comes out of GROUP ALL has the chararray literal 'all' as its key and the complete content of the relation, in a single bag, as its value. Grouping collects all records that share the same key value, and here there is only one key, so only one reducer processes the complete data set.

GROUP ALL does not group by the columns of the relation. It puts every tuple of the relation into one single group, as shown below.

grunt> group_all = GROUP student_details ALL;

Now, verify the content of the relation group_all as shown below.

grunt> Dump group_all;

(all,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram),

(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),

(4,Preethi,Agarwal,21,9848022330,Pune),(3,Rajesh,Khanna,22,9848022339,Delhi),

(2,siddarth,Battacharya,22,9848022338,Kolkata),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
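
A common reason to use GROUP ALL is to compute an aggregate over the whole relation, such as a row count or an average. Here is a minimal sketch, assuming the fourth field was named age and typed as int when the file was loaded (the field name is an assumption, since the schema is not shown here):

```pig
-- Put every tuple of student_details into one group keyed 'all'.
group_all = GROUP student_details ALL;

-- Inside the FOREACH, student_details refers to the bag of grouped tuples.
-- 'age' is an assumed field name from the LOAD schema.
totals = FOREACH group_all GENERATE
    COUNT(student_details) AS num_students,
    AVG(student_details.age) AS avg_age;

DUMP totals;
```

With the eight rows above, this should print a single tuple, (8,22.5).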
answered Jul 10 by Roshan
