Frequently used classes implementing InputFormat


Which is the base class for all file-based InputFormats? And which are the most frequently used InputFormats?

Jul 24 in Big Data Hadoop by Piyush

1 answer to this question.


FileInputFormat : Base class for all file-based InputFormats

Other frequently used Input Formats are:

KeyValueTextInputFormat : An InputFormat for plain text files. Files are broken into lines. Either a line feed or a carriage return signals the end of a line. Each line is divided into key and value parts by a separator byte. If no such byte exists, the key is the entire line and the value is empty.

TextInputFormat : An InputFormat for plain text files. Files are broken into lines. Either a line feed or a carriage return signals the end of a line. Keys are the position in the file, and values are the line of text.

NLineInputFormat : An InputFormat that treats every N lines of input as one split. In many "pleasantly" parallel applications, each process/mapper processes the same input file(s), but the computations are controlled by different parameters.

SequenceFileInputFormat : An InputFormat for SequenceFiles.
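
To make the record shapes described above more concrete, here is a minimal plain-Java sketch. It does not use Hadoop or reproduce its implementation; the class name `InputFormatSketch` and its methods are hypothetical, made up for illustration. It assumes `\n` line endings and UTF-8 text, and it assumes that a line is split at the first occurrence of the separator.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: mimics the key/value shapes described above without Hadoop.
final class InputFormatSketch {

    // KeyValueTextInputFormat-style record: split the line at the first separator.
    // If the separator is absent, the whole line is the key and the value is empty.
    static String[] keyValueRecord(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] {line, ""};
        }
        return new String[] {line.substring(0, pos), line.substring(pos + 1)};
    }

    // TextInputFormat-style records: key = byte offset where the line starts,
    // value = the line text. Assumes '\n' line endings and UTF-8 encoding.
    static Map<Long, String> offsetRecords(String content) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        int start = 0;
        while (start < content.length()) {
            int end = content.indexOf('\n', start);
            boolean hasNewline = end >= 0;
            if (!hasNewline) {
                end = content.length();
            }
            String line = content.substring(start, end);
            records.put(offset, line);
            // Advance by the line's byte length plus one byte for the newline.
            offset += line.getBytes(StandardCharsets.UTF_8).length + (hasNewline ? 1 : 0);
            start = end + 1;
        }
        return records;
    }

    // NLineInputFormat-style grouping: every n lines form one split;
    // the last split may hold fewer than n lines.
    static List<List<String>> nLineSplits(List<String> lines, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += n) {
            int end = Math.min(i + n, lines.size());
            splits.add(new ArrayList<>(lines.subList(i, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(keyValueRecord("user42\tlogin", '\t')));
        System.out.println(offsetRecords("first line\nsecond line\n"));
        System.out.println(nLineSplits(Arrays.asList("a", "b", "c", "d", "e"), 2));
    }
}
```

In a real job the format is chosen on the job configuration rather than hand-coded like this; the sketch only shows what a mapper would receive as keys and values under each format.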

Regarding the second query: get the files from the remote servers first, then use the appropriate InputFormat for the contents of each file. Hadoop works best when it can exploit data locality.

answered Jul 24 by Reshma
