Use of Group ALL in Pig

Can someone explain with an example how to use GROUP ALL in Pig?
Jul 10 in Big Data Hadoop by Esha

1 answer to this question.

Suppose we have a data set as follows:

001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai

and we have loaded it into a relation named student_details.
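The load statement itself is not shown in the answer. A minimal sketch, assuming the data sits in a comma-separated file called student_details.txt (a hypothetical path) and using illustrative field names; only the relation name student_details and the integer id come from the statements and output in this answer:

```pig
-- Hypothetical path and field names; adjust them to your own data.
-- id is declared as int because the dumped output shows 1..8, not 001..008.
student_details = LOAD 'student_details.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray,
        age:int, phone:chararray, city:chararray);
```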

The record that comes out of GROUP ALL has the chararray literal 'all' as its key and a bag holding the complete contents of the relation as its value. Grouping collects together all records that share the same key value, and here there is only one key, so a single reducer processes the complete input.

GROUP ALL does not group by a column; it places every tuple of a relation into one single group, as shown below.

grunt> group_all = GROUP student_details ALL;

Now, verify the content of the relation group_all as shown below.

grunt> Dump group_all;

(all,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram),
(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),
(4,Preethi,Agarwal,21,9848022330,Pune),(3,Rajesh,Khanna,22,9848022339,Delhi),
(2,siddarth,Battacharya,22,9848022338,Kolkata),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
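
Because everything ends up in one bag, GROUP ALL is typically followed by an aggregate over the whole relation. A short sketch, assuming student_details was loaded with a schema in which the fourth field is an int named age (the field name is an assumption, not something stated in the answer):

```pig
-- Put every tuple of student_details into a single group keyed by 'all'.
group_all = GROUP student_details ALL;

-- Aggregate over the single bag: total row count and average age.
-- 'age' is an assumed field name; use student_details.$3 if no schema was declared.
student_stats = FOREACH group_all
    GENERATE COUNT(student_details), AVG(student_details.age);

DUMP student_stats;
-- For the eight sample rows above this prints (8,22.5)
```

Inside the FOREACH, the bag is referred to by the name of the original relation (student_details), not by the name of the grouped relation.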
answered Jul 10 by Roshan
