MapFile in Pig

What is the use of the MapFile class in Pig?
Jul 6, 2018 in Big Data Hadoop by shams

1 answer to this question.

MapFile is a Hadoop class (org.apache.hadoop.io.MapFile) that provides a file-based map from keys to values.

A map is a directory containing two files: the data file, which holds all keys and values in the map, and a smaller index file, which holds only a fraction of the keys. The fraction is determined by MapFile.Writer.getIndexInterval(); an interval of N means roughly every Nth key is written to the index.

The index file is read entirely into memory, so key implementations should be kept small. Map files are created by appending entries in sorted key order.
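To make the data-file/index-file split concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not the Hadoop API: the class name `SparseIndexMapSketch` and its methods are hypothetical, the two lists merely stand in for the data and index files, and the toy requires strictly increasing keys as a simplification. It shows why only a fraction of the keys needs to be held in memory: a lookup binary-searches the small index, then scans at most one interval of the data.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a data file plus sparse index. Hypothetical; NOT org.apache.hadoop.io.MapFile. */
class SparseIndexMapSketch {
    private final int indexInterval;                                  // index every Nth entry
    private final List<String> dataKeys = new ArrayList<>();          // stands in for the data file
    private final List<String> dataValues = new ArrayList<>();
    private final List<String> indexKeys = new ArrayList<>();         // stands in for the index file
    private final List<Integer> indexPositions = new ArrayList<>();   // position of each indexed key

    SparseIndexMapSketch(int indexInterval) {
        if (indexInterval < 1) {
            throw new IllegalArgumentException("indexInterval must be >= 1");
        }
        this.indexInterval = indexInterval;
    }

    /** Appends an entry; the toy model requires strictly increasing keys. */
    void append(String key, String value) {
        if (!dataKeys.isEmpty() && key.compareTo(dataKeys.get(dataKeys.size() - 1)) <= 0) {
            throw new IllegalArgumentException("key out of order: " + key);
        }
        // Only every indexInterval-th entry is recorded in the index.
        if (dataKeys.size() % indexInterval == 0) {
            indexKeys.add(key);
            indexPositions.add(dataKeys.size());
        }
        dataKeys.add(key);
        dataValues.add(value);
    }

    /** Returns the value for key, or null if the key is absent. */
    String get(String key) {
        // Binary search the index for the last indexed key <= the requested key.
        int lo = 0, hi = indexKeys.size() - 1, slot = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                slot = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (slot < 0) {
            return null; // key sorts before the first entry
        }
        // Scan at most one interval of the data, starting at the indexed position.
        int start = indexPositions.get(slot);
        int end = Math.min(start + indexInterval, dataKeys.size());
        for (int i = start; i < end; i++) {
            int cmp = dataKeys.get(i).compareTo(key);
            if (cmp == 0) return dataValues.get(i);
            if (cmp > 0) break; // passed the spot where the key would be
        }
        return null;
    }

    /** Number of keys held in the (in-memory) index. */
    int indexSize() {
        return indexKeys.size();
    }
}
```

In this sketch a larger interval shrinks the index, at the cost of a longer scan per lookup; a smaller interval does the reverse. The ordering check in `append` is what makes the binary search valid.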
answered Jul 6, 2018 by Data_Nerd
