How to set input split settings

0 votes

I have an object of the org.apache.hadoop.conf.Configuration class. I have a 20 GB file, and I want to fire 40 mappers. How can I do it?

Dec 27, 2018 in Big Data Hadoop by slayer
• 29,350 points
2,168 views

1 answer to this question.

0 votes

It can be controlled by setting the parameters, mapred.min.split.size and mapred.max.split.size, while configuring the job in MapReduce. The value is to be set in bytes. So if we have a 20 GB file, and we want to fire 40 mappers, then we need to set it to 20480 / 40 = 512 MB each. So for that the code would be,
conf.set("mapred.min.split.size", "536870912"); 
conf.set("mapred.max.split.size", "536870912");

answered Dec 27, 2018 by Omkar
• 69,210 points

Related Questions In Big Data Hadoop

0 votes
1 answer

How to set a custom install directory for a deb package with fpm

Here's something that you can try... the last ...READ MORE

answered Mar 26, 2018 in Big Data Hadoop by Amrinder
• 140 points
1,138 views
0 votes
1 answer

How to set the number of Map & Reduce tasks?

The map tasks created for a job ...READ MORE

answered Apr 18, 2018 in Big Data Hadoop by Shubham
• 13,490 points
1,608 views
0 votes
1 answer

How to avoid a “split-brain” scenario with NameNodes?

Okay, so let me tell you that ...READ MORE

answered Jul 11, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
3,908 views
+1 vote
0 answers

How to set up Hadoop cluster on Mac in intelliJ IDEA

I have Installed hadoop using brew and ...READ MORE

Jul 25, 2018 in Big Data Hadoop by Neha
• 6,300 points
925 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
10,600 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
2,207 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
104,755 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
4,285 views
0 votes
1 answer

Hadoop Hive: How to split string in Hive?

You can use the split function along ...READ MORE

answered Nov 6, 2018 in Big Data Hadoop by Omkar
• 69,210 points
11,277 views
0 votes
1 answer

Hadoop Hive: How to split a single row into multiple rows?

Try this SELECT ID1, Sub FROM tableName lateral view ...READ MORE

answered Nov 14, 2018 in Big Data Hadoop by Omkar
• 69,210 points
8,600 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP