How to specify the KeyValueTextInputFormat separator in the Hadoop 0.20 API?

0 votes

I'm using KeyValueTextInputFormat from the org.apache.hadoop.mapreduce API, and I want to specify a separator (delimiter) other than tab (the default) between the key and the value.

Sample Input :

one,first line
two,second line

Output Required :

Key : one
Value : first line
Key : two
Value : second line

I am setting KeyValueTextInputFormat on the job like this:

    Job job = new Job(conf, "Sample");

    job.setInputFormatClass(KeyValueTextInputFormat.class);
    KeyValueTextInputFormat.addInputPath(job, new Path("/home/input.txt"));

Oct 24, 2018 in Big Data Hadoop by digger
• 26,600 points

recategorized Oct 24, 2018 by digger 244 views

3 answers to this question.

0 votes

In this API you should use the mapreduce.input.keyvaluelinerecordreader.key.value.separator configuration property.

For example:

    Configuration conf = new Configuration();
    conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", ",");

    Job job = new Job(conf);
    job.setInputFormatClass(KeyValueTextInputFormat.class);
    // next job set-up
answered Oct 24, 2018 by Omkar
• 68,480 points
0 votes

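Set this property in the driver code, before the Job is created. Job copies the Configuration when it is constructed, so changes made to conf afterwards are not seen by the job. Note that key.value.separator.in.input.line is the property name used by the old org.apache.hadoop.mapred API; later Hadoop releases list it as a deprecated alias for mapreduce.input.keyvaluelinerecordreader.key.value.separator.

    conf.set("key.value.separator.in.input.line", ",");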
answered Dec 4, 2018 by Coco
0 votes
    conf.set("key.value.separator.in.input.line", ",");
    Job job = new Job(conf);
answered Dec 4, 2018 by Rio
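
To make the effect of the separator setting concrete, here is a minimal plain-Java sketch of how each input line is turned into a key and a value once "," is configured. It does not use Hadoop at all; it only mimics the documented behavior (split at the first occurrence of the separator, and treat a line without a separator as a key with an empty value). The class name SeparatorSketch and the method splitLine are hypothetical names chosen for this illustration, not part of any Hadoop API.

```java
public class SeparatorSketch {

    // Splits a line at the FIRST occurrence of the separator.
    // Returns a two-element array: {key, value}.
    // If the separator is absent, the whole line is the key and the value is empty.
    static String[] splitLine(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        // Sample input from the question.
        String[] input = { "one,first line", "two,second line" };
        for (String line : input) {
            String[] kv = splitLine(line, ',');
            System.out.println("Key : " + kv[0]);
            System.out.println("Value : " + kv[1]);
        }
    }
}
```

Running main on the sample input prints the key/value pairs in the form requested in the question. Because the split happens at the first separator only, a line such as "a,b,c" yields the key "a" and the value "b,c". As far as I know, Hadoop's record reader uses only the first character of the configured separator string, so a single-character delimiter is the safe choice.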
