MapFile in Pig

0 votes
What is the use of the MapFile class in Pig?
Jul 6, 2018 in Big Data Hadoop by shams

1 answer to this question.

0 votes
MapFile is a class from Hadoop's I/O library (org.apache.hadoop.io.MapFile) that provides a file-based map from keys to values.

A map is stored as a directory containing two files: a data file, which holds all the keys and values in the map, and a smaller index file, which holds only a fraction of the keys. That fraction is determined by the index interval, which you can read with MapFile.Writer.getIndexInterval().

The index file is read entirely into memory, so key implementations should be kept small. Map files are created by appending entries in sorted key order.
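
To make the layout concrete, here is a small Python toy model of the idea. It is not Hadoop's implementation and does not use the Hadoop API: ToyMapWriter, ToyMapReader and the interval of 4 are names and values made up for illustration. It only sketches the three points above: a data list holding every entry, a sparse index holding every Nth key, and a writer that rejects out-of-order keys.

```python
import bisect


class ToyMapWriter:
    """Toy stand-in for a MapFile-style writer (hypothetical, not Hadoop code)."""

    def __init__(self, index_interval=4):
        self.index_interval = index_interval  # every Nth entry is indexed
        self.data = []    # plays the role of the data file: all (key, value) pairs
        self.index = []   # plays the role of the index file: (key, position) pairs
        self._last_key = None

    def append(self, key, value):
        # Entries must be added in sorted key order.
        if self._last_key is not None and key < self._last_key:
            raise ValueError(f"key out of order: {key!r} after {self._last_key!r}")
        # Record only a fraction of the keys in the index.
        if len(self.data) % self.index_interval == 0:
            self.index.append((key, len(self.data)))
        self.data.append((key, value))
        self._last_key = key


class ToyMapReader:
    """Keeps the whole (small) index in memory and scans the data from there."""

    def __init__(self, data, index):
        self.data = data
        self.index_keys = [k for k, _ in index]
        self.index_pos = [p for _, p in index]

    def get(self, key):
        # Find the last indexed key that is <= the requested key.
        i = bisect.bisect_right(self.index_keys, key) - 1
        if i < 0:
            return None
        # Scan forward in the data from that position.
        for k, v in self.data[self.index_pos[i]:]:
            if k == key:
                return v
            if k > key:
                break
        return None


writer = ToyMapWriter(index_interval=4)
for n in range(10):
    writer.append(f"key{n:02d}", n * n)

reader = ToyMapReader(writer.data, writer.index)
print(len(writer.data), len(writer.index))  # 10 3
print(reader.get("key07"))                  # 49
```

With 10 entries and an interval of 4, only 3 keys end up in the index, which is why the index stays small enough to hold in memory while a lookup still scans at most a few entries of the data.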
answered Jul 6, 2018 by Data_Nerd
