What is Network Topology in Hadoop?

Question

I could not understand how the distances between the nodes came out as 0, 2, 4, and 6.

As per Hadoop: The Definitive Guide:

For example, imagine a node n1 on rack r1 in data center d1. This can be represented as /d1/r1/n1. Using this notation, here are the distances for the four scenarios:

• distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)

• distance(/d1/r1/n1, /d1/r1/n2) = 2 (different nodes on the same rack)

• distance(/d1/r1/n1, /d1/r2/n3) = 4 (nodes on different racks in the same data center)

• distance(/d1/r1/n1, /d2/r3/n4) = 6 (nodes in different data centers)

• distance(/d1/r1/n1, /d2/r3/n10) = ?

What is the network distance?

Frankie · Answer

Let's imagine your cluster as a tree with the following levels:

• Abstract global root (top, or root)
• Data centers (1st level)
• Racks (2nd level)
• Nodes (3rd level, or leaves)

If we draw this tree, it should look something like this:

[Figure: the cluster drawn as a tree, with the root at the top and the nodes as leaves]

Let's count the distance between any circle and its parent as 1. Then the distance between any two circles is the sum of their distances to their closest common ancestor, or 0 for the same node.

So it's always 6 for any two nodes in different data centers (like between /d1/r1/n1 and /d2/r4/n10).

OR

"The distance between two nodes is the sum of their distances to their closest common ancestor" (Hadoop: The Definitive Guide, 4th ed., page 70)

distance(/d1/r1/n1, /d2/r3/n10) = 6

The common ancestor of the two nodes is /, so the distance from n1 to / is 3 and the distance from n10 to / is 3. The total is 6.

Hope this helps you :)
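The "sum of distances to the closest common ancestor" rule can be checked with a few lines of Python. This is only a minimal sketch of the rule as quoted from the book, not Hadoop's own implementation; the helper names `split_path` and `distance` are made up for illustration, and it assumes every location is written as a slash-separated path like /d1/r1/n1.

```python
def split_path(path):
    """Split a topology path such as '/d1/r1/n1' into ['d1', 'r1', 'n1']."""
    return [part for part in path.split("/") if part]


def distance(a, b):
    """Hops from a up to the closest common ancestor, plus hops from b up to it."""
    pa, pb = split_path(a), split_path(b)

    # Count how many leading path components the two nodes share.
    # Zero shared components means the closest common ancestor is the root "/".
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1

    # Each remaining component is one hop (distance 1) up the tree.
    return (len(pa) - common) + (len(pb) - common)


print(distance("/d1/r1/n1", "/d1/r1/n1"))   # 0
print(distance("/d1/r1/n1", "/d1/r1/n2"))   # 2
print(distance("/d1/r1/n1", "/d1/r2/n3"))   # 4
print(distance("/d1/r1/n1", "/d2/r3/n4"))   # 6
print(distance("/d1/r1/n1", "/d2/r3/n10"))  # 6
```

The last call is the case from the question: the two paths share no leading component, so each node is 3 hops from the root and the total is 6, regardless of which rack or node number is used in the other data center.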


