How does Hadoop access files whose records are split across different block boundaries?

0 votes
May 7, 2019 in Big Data Hadoop by nitinrawat895
• 11,380 points
526 views

1 answer to this question.

0 votes

Hadoop's MapReduce does not work on the physical blocks of a file. Instead, it is designed to work on logical divisions of the data, known as input splits.

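An input split is only a logical byte range of a file: a path, a start offset and a length. By default the split size equals the HDFS block size, so split boundaries usually coincide with block boundaries. A record may therefore straddle two splits, and so touch the input of two mappers, even though only one mapper processes it.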

HDFS splits every file written to it into blocks of 128 MB each by default (the final block may be smaller), and each block is replicated 3 times by default.
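To make the block arithmetic concrete, here is a minimal Python sketch. It assumes the default values quoted above (128 MB blocks, replication factor 3); the function name `block_layout` and its return keys are hypothetical and are not part of any Hadoop API.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB   # default block size quoted above
REPLICATION = 3         # default replication factor quoted above


def block_layout(file_size, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return how a file of `file_size` bytes would be cut into blocks.

    Illustrative arithmetic only; it does not talk to HDFS.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    # Ceiling division with integers, so large sizes stay exact.
    num_blocks = -(-file_size // block_size)
    # The last block holds whatever is left over; it is not padded.
    last_block = file_size - (num_blocks - 1) * block_size if num_blocks else 0
    return {
        "num_blocks": num_blocks,
        "last_block_bytes": last_block,
        "raw_bytes_stored": file_size * replication,
    }


if __name__ == "__main__":
    print(block_layout(300 * MB))
```

For a 300 MB file this gives three blocks (128 MB, 128 MB and 44 MB) and 900 MB of raw storage across all replicas.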

For example, consider a file. Because blocks are cut at fixed byte offsets, a record in this file can begin in block a and end in block b.
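A small sketch of that situation: given a record's byte offset and length, it computes which block indices the record touches. The helper name `blocks_spanned` is hypothetical, and the 128 MB block size is the default assumed throughout this answer.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed default block size


def blocks_spanned(offset, length, block_size=BLOCK_SIZE):
    """Return the indices of the blocks touched by a record.

    `offset` is the record's first byte in the file and `length` is its
    size in bytes. Pure arithmetic; no HDFS call is made.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size  # index of the record's last byte
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A 100-byte record that starts 10 bytes before the end of block 0.
    print(blocks_spanned(BLOCK_SIZE - 10, 100))
```

A record that starts 10 bytes before the end of block 0 and is 100 bytes long touches blocks 0 and 1, which is exactly the case the question asks about.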

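HDFS does not track where records begin or end. It knows which blocks make up a file and where their replicas are stored, but nothing about the data inside them. Record boundaries are handled through the logical input splits instead. Each split marks the start offset and length of the part of the file one mapper is responsible for. The record reader for a split reads past the end of the split, into the next block if necessary, to finish its last record, and a reader whose split does not start at offset 0 skips the partial first record, because the previous reader has already consumed it.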
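The sketch below simulates in Python how a line-oriented record reader can deal with records that cross split boundaries. It is loosely modelled on the behaviour of readers such as Hadoop's LineRecordReader, but it is a simplified illustration under stated assumptions, not Hadoop's implementation: the whole file is an in-memory `bytes` object, records are newline-terminated, and the names `make_splits` and `read_split` are hypothetical.

```python
def make_splits(file_size, split_size):
    """Cut a file into (start, length) byte ranges (simplified, assumed logic)."""
    return [(start, min(split_size, file_size - start))
            for start in range(0, file_size, split_size)]


def read_split(data, start, length):
    """Return the complete newline-terminated records owned by one split."""
    end = start + length
    pos = start
    if start != 0:
        # Skip up to and including the first newline: that line belongs
        # to the reader of the previous split.
        newline = data.find(b"\n", start)
        pos = len(data) if newline == -1 else newline + 1
    records = []
    # Read every line that starts at or before `end`, even if it finishes
    # beyond the split (that is, in the next block).
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        next_pos = len(data) if newline == -1 else newline + 1
        records.append(data[pos:next_pos].rstrip(b"\n"))
        pos = next_pos
    return records


if __name__ == "__main__":
    data = b"alpha\nbravo\ncharlie\ndelta\n"
    for start, length in make_splits(len(data), 8):
        print((start, length), read_split(data, start, length))
```

With 8-byte splits, the record `charlie` starts in the second split and ends in the third, yet it is returned exactly once, by the reader of the second split. Every record is read by one reader only, whatever the split size.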


For more information, you can go through this article.

answered May 7, 2019 by ravikiran
• 4,620 points
