What is Network Topology in Hadoop

I could not understand how the distances between the nodes come out as 0, 2, 4, and 6.

According to Hadoop: The Definitive Guide:

For example, imagine a node n1 on rack r1 in data center d1. This can be represented as /d1/r1/n1. Using this notation, here are the distances for the four scenarios:

• distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)

• distance(/d1/r1/n1, /d1/r1/n2) = 2 (different nodes on the same rack)

• distance(/d1/r1/n1, /d1/r2/n3) = 4 (nodes on different racks in the same data center)

• distance(/d1/r1/n1, /d2/r3/n4) = 6 (nodes in different data centers).

• distance(/d1/r1/n1, /d2/r3/n10) = ?

What is the network distance in this last case?
Sep 6, 2018 in Big Data Hadoop by Neha

1 answer to this question.

Let's imagine your cluster as a tree with the following levels:

  • Abstract global root (Top or root)
  • Data centers (1st level)
  • Racks (2nd level)
  • Nodes (3rd level or leaves)

Drawn out, the tree looks something like this:

[Figure: Cluster topology]

Count the distance between any circle (a vertex in the tree) and its parent as 1.

The distance between any two circles is then the sum of their distances to their closest common ancestor, or 0 when both are the same node.

So the distance is always 6 for any two nodes in different data centers (for example, between /d1/r1/n1 and /d2/r4/n10).

 

                                                                          OR 

"The distance between two nodes is the sum of their distances to their closest common ancestor" (Hadoop: The Definitive Guide 4th ed, page 70)

distance(/d1/r1/n1, /d2/r3/n10) = 6

The closest common ancestor of the two nodes is the root, /.

The distance from n1 to / is 3 (n1 → r1 → d1 → /).

The distance from n10 to / is also 3 (n10 → r3 → d2 → /).

The total is 3 + 3 = 6.
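If it helps, the "sum of distances to the closest common ancestor" rule can be sketched in a few lines of Python. This is only an illustration that works on path strings such as /d1/r1/n1; the function name network_distance is hypothetical and this is not Hadoop's own implementation.

```python
def network_distance(a: str, b: str) -> int:
    """Sum of the hops from each node up to their closest common ancestor.

    Hypothetical helper for illustration: paths look like "/d1/r1/n1".
    """
    path_a = a.strip("/").split("/")
    path_b = b.strip("/").split("/")

    # Count how many leading components (data center, rack, node) match.
    common = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common += 1

    # Each component below the common ancestor costs one hop.
    return (len(path_a) - common) + (len(path_b) - common)


print(network_distance("/d1/r1/n1", "/d1/r1/n1"))   # 0: same node
print(network_distance("/d1/r1/n1", "/d1/r1/n2"))   # 2: same rack
print(network_distance("/d1/r1/n1", "/d1/r2/n3"))   # 4: same data center
print(network_distance("/d1/r1/n1", "/d2/r3/n10"))  # 6: different data centers
```

The last call is the case from the question: the paths share no leading component, so the common ancestor is the root and each node contributes 3 hops.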

Hope this helps you :)

answered Sep 6, 2018 by Frankie
