BigData in MPI

0 votes

Please let me know what is MPI in bigdata. 

Feb 14, 2019 in Big Data Hadoop by Ganesh
778 views

1 answer to this question.

0 votes

MPI is a communication protocol for programming parallel computers. MPI's goals are high performance, scalability, and portability. The MPI interface is meant to provide essential virtual topology, synchronization, and communication functionality between a set of processes (that have been mapped to nodes/servers/computer instances).MPI has been widely used in High-Performance Computing. In contrast, such efficient communication support is lacking in the field of Big Data Computing, where communication is realized by time-consuming techniques such as HTTP/RPC. This paper takes a step in bridging these two fields by extending MPI to support Hadoop-like Big Data Computing jobs, where processing and communication of a large number of key-value pair instances are needed through distributed computation models such as MapReduce, Iteration, and Streaming. We abstract the characteristics of key-value communication patterns into a bipartite communication model, which reveals four distinctions from MPI: Dichotomic, Dynamic, Data-centric, and Diversified features. Utilizing this model, we propose the specification of a minimalistic extension to MPI. An open source communication library, DataMPI, is developed to implement this specification. Performance experiments show that DataMPI has significant advantages in performance and flexibility while maintaining high productivity, scalability, and fault tolerance of Hadoop.

answered Feb 14, 2019 by Disha

Related Questions In Big Data Hadoop

0 votes
1 answer

How to run Hadoop in Docker containers?

Hi, You can run Hadoop in Docker container. Follow ...READ MORE

answered Jan 24, 2020 in Big Data Hadoop by MD
• 95,440 points
1,772 views
0 votes
1 answer

How Impala is fast compared to Hive in terms of query response?

Impala provides faster response as it uses MPP(massively ...READ MORE

answered Mar 21, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
1,860 views
+2 votes
1 answer

Is Kafka and Zookeeper are required in a Big Data Cluster?

Apache Kafka is one of the components ...READ MORE

answered Mar 23, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
1,855 views
0 votes
1 answer

How can I get the respective Bitcoin value for an input in USD when using c#

Simply make call to server and parse ...READ MORE

answered Mar 25, 2018 in Big Data Hadoop by charlie_brown
• 7,720 points
752 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
10,559 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
2,185 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
104,218 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
4,261 views
0 votes
1 answer

why do we need MaPReduce in BigData Hadoop?

Hi, As we know Hadoop provides Hdfs as ...READ MORE

answered Feb 4, 2020 in Big Data Hadoop by MD
• 95,440 points
555 views
0 votes
7 answers

How to run a jar file in hadoop?

I used this command to run my ...READ MORE

answered Dec 10, 2018 in Big Data Hadoop by Dasinto
25,534 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP