Use of Group ALL in Pig

0 votes
Can someone explain, with an example, how to use GROUP ALL in Pig?
Jul 10, 2019 in Big Data Hadoop by Esha
2,862 views

1 answer to this question.

0 votes

Suppose we have a data set as follows:

001,Rajiv,Reddy,21,9848022337,Hyderabad

002,siddarth,Battacharya,22,9848022338,Kolkata

003,Rajesh,Khanna,22,9848022339,Delhi

004,Preethi,Agarwal,21,9848022330,Pune

005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar

006,Archana,Mishra,23,9848022335,Chennai

007,Komal,Nayak,24,9848022334,trivendram

008,Bharathi,Nambiayar,24,9848022333,Chennai

and we have loaded this file into a relation named student_details.
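A load along these lines would produce the relation used in the rest of this answer. This is only a sketch: the file path and the field names are assumptions, chosen to match the six comma-separated columns of the data set. The id is typed as int because the output further down shows 1 rather than 001.

```pig
-- Hypothetical path; replace it with the actual location of the file.
-- Field names are illustrative, not taken from the original answer.
student_details = LOAD 'student_details.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray,
        age:int, phone:chararray, city:chararray);
```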

GROUP ... ALL produces a single record whose key is the chararray literal 'all' and whose value is a bag containing every tuple of the relation. Grouping sends all records with the same key to the same reducer, and here there is only one key, so a single reducer processes the complete input.

GROUP ALL does not group by all the columns; it puts every tuple of the relation into one group, as shown below.

grunt> group_all = GROUP student_details ALL;

Now, verify the content of the relation group_all as shown below.

grunt> Dump group_all;

(all,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram),
(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),
(4,Preethi,Agarwal,21,9848022330,Pune),(3,Rajesh,Khanna,22,9848022339,Delhi),
(2,siddarth,Battacharya,22,9848022338,Kolkata),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
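In practice, GROUP ALL is mostly used to compute an aggregate over the whole relation, such as a total row count or an overall average. Here is a minimal sketch, assuming student_details was loaded with a schema that has an int field named age. The field name and the aliases stats, total and avg_age are assumptions, not part of the original answer.

```pig
-- Put every tuple of the relation into one group keyed by 'all'.
group_all = GROUP student_details ALL;

-- Inside the FOREACH, student_details refers to the bag of all tuples.
-- 'age' is an assumed field name from the load schema.
stats = FOREACH group_all GENERATE
    COUNT(student_details) AS total,
    AVG(student_details.age) AS avg_age;

DUMP stats;
```

For the eight records above, this should give a count of 8 and an average age of 22.5. Because there is only one group, the reduce side of this job runs in a single reducer, though Pig can still use combiners for algebraic functions such as COUNT and AVG.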
answered Jul 10, 2019 by Roshan
