What is Network Topology in Hadoop?

I could not understand how the distances between the nodes came out as 0, 2, 4, and 6.

As per Hadoop: The Definitive Guide:

For example, imagine a node n1 on rack r1 in data center d1. This can be represented as /d1/r1/n1. Using this notation, here are the distances for the four scenarios:

• distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)

• distance(/d1/r1/n1, /d1/r1/n2) = 2 (different nodes on the same rack)

• distance(/d1/r1/n1, /d1/r2/n3) = 4 (nodes on different racks in the same data center)

• distance(/d1/r1/n1, /d2/r3/n4) = 6 (nodes in different data centers).

• distance(/d1/r1/n1, /d2/r3/n10) = ?

What is the network distance in this last case?
Sep 6, 2018 in Big Data Hadoop by Neha

1 answer to this question.

Let's imagine your cluster as a tree with the following levels:

  • Abstract global root (Top or root)
  • Data centers (1st level)
  • Racks (2nd level)
  • Nodes (3rd level or leaves)

Drawn as a tree, it looks something like this:

[Figure: Cluster topology]

Let's count the distance between any circle and its parent as 1.

Then the distance between any two circles is the sum of their distances to their closest common ancestor, or 0 if both are the same node.
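
The same rule can be written as a formula. This is only a restatement of the sentence above; the symbols depth (number of steps from the root, so the root is 0, data centers 1, racks 2, nodes 3) and lca (closest common ancestor) are notation introduced here, not taken from the book.

```latex
\[
\operatorname{distance}(a, b)
  = \bigl(\operatorname{depth}(a) - \operatorname{depth}(\operatorname{lca}(a, b))\bigr)
  + \bigl(\operatorname{depth}(b) - \operatorname{depth}(\operatorname{lca}(a, b))\bigr)
\]
\[
\begin{aligned}
\text{same node:}                &\quad (3 - 3) + (3 - 3) = 0 \\
\text{same rack:}                &\quad (3 - 2) + (3 - 2) = 2 \\
\text{same data center:}         &\quad (3 - 1) + (3 - 1) = 4 \\
\text{different data centers:}   &\quad (3 - 0) + (3 - 0) = 6
\end{aligned}
\]
```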

So the distance is always 6 for any two nodes in different data centers (for example, /d1/r1/n1 and /d2/r4/n10), because their only common ancestor is the root, 3 steps away from each.

OR

"The distance between two nodes is the sum of their distances to their closest common ancestor" (Hadoop: The Definitive Guide, 4th ed., page 70)

distance(/d1/r1/n1, /d2/r3/n10) = 6

The closest common ancestor of the two nodes is the root, /.

So the distance from n1 to / is 3 (n1 → r1 → d1 → /),

and the distance from n10 to / is 3 (n10 → r3 → d2 → /).

The total is 3 + 3 = 6.
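
If it helps to check the numbers yourself, here is a minimal Python sketch of the rule. It is only an illustration that works on path strings like /d1/r1/n1; the function name network_distance is made up for this example and this is not Hadoop's own implementation.

```python
def network_distance(a: str, b: str) -> int:
    """Sum of the steps from each node up to their closest common ancestor."""
    # Split "/d1/r1/n1" into ["d1", "r1", "n1"]; the root "/" is implicit.
    path_a = a.strip("/").split("/")
    path_b = b.strip("/").split("/")

    # Count how many leading components (data center, rack, node) match.
    common = 0
    for part_a, part_b in zip(path_a, path_b):
        if part_a != part_b:
            break
        common += 1

    # Steps from each node up to the closest common ancestor, added together.
    return (len(path_a) - common) + (len(path_b) - common)


for other in ["/d1/r1/n1", "/d1/r1/n2", "/d1/r2/n3", "/d2/r3/n4", "/d2/r3/n10"]:
    print(f"distance(/d1/r1/n1, {other}) = {network_distance('/d1/r1/n1', other)}")
```

Comparing whole path components rather than raw string prefixes matters here: n1 and n10 share the characters "n1" but are different nodes.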

Hope this helps you :)

answered Sep 6, 2018 by Frankie
