MapFile in Pig

0 votes
What is the use of MapFile class in Pig?
Jul 6, 2018 in Big Data Hadoop by shams
• 3,580 points
92 views

1 answer to this question.

0 votes
MapFile is a class which serves file-based map from keys to values.

A map is a directory containing two files, the data file, containing all keys and values in the map, and a smaller index file, containing a fraction of the keys. The fraction is determined by MapFile.Writer.getIndexInterval().

The index file is read entirely into memory. Thus, key implementations should try to keep themselves small. Map files are created by adding entries in-order.
answered Jul 6, 2018 by Data_Nerd
• 2,370 points

Related Questions In Big Data Hadoop

0 votes
1 answer

Use of MapReduce in PIG

Apache Pig programs are written in a ...READ MORE

answered Jul 25, 2018 in Big Data Hadoop by shams
• 3,580 points
219 views
0 votes
1 answer

GROUP and COGROUP in PIG

Both GROUP and COGROUP operators are identical ...READ MORE

answered Jul 25, 2018 in Big Data Hadoop by shams
• 3,580 points
1,026 views
+1 vote
1 answer

How to count number of rows in alias in PIG?

COUNT is part of pig LOGS= LOAD 'log'; LOGS_GROUP= ...READ MORE

answered Oct 15, 2018 in Big Data Hadoop by Omkar
• 68,860 points
222 views
0 votes
1 answer

Hadoop Pig: How to include external jar file in PIG?

You can do this: register /local/path/to/Jar_name.jar READ MORE

answered Nov 16, 2018 in Big Data Hadoop by Omkar
• 68,860 points
45 views
0 votes
1 answer

What do we exactly mean by “Hadoop” – the definition of Hadoop?

The official definition of Apache Hadoop given ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by Shubham
307 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,840 points
3,905 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,840 points
536 views
+1 vote
11 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest> copy syntax: copyFr ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Aditya
20,745 views
0 votes
1 answer

Checkpointing in Hadoop

Checkpointing is the process of combining the ...READ MORE

answered Jul 3, 2018 in Big Data Hadoop by Data_Nerd
• 2,370 points
293 views
0 votes
1 answer

Bucketing vs Partitioning in HIve

Partition divides large amount of data into ...READ MORE

answered Jul 9, 2018 in Big Data Hadoop by Data_Nerd
• 2,370 points
4,601 views