How Spark works internally

0 votes
Spark uses Scala to load and execute programs, and it also supports Python and Java. RDDs are used to store the data. But I can't understand Spark's architecture or how it runs internally.

Please explain the Spark architecture and how it works internally.
Oct 11, 2018 in Apache Spark by Meci Matt
• 9,460 points
1,415 views

1 answer to this question.

0 votes

Spark revolves around the concept of a resilient distributed dataset (RDD), which is a fault-tolerant collection of elements that can be operated on in parallel. RDDs support two types of operations: transformations, which create a new dataset from an existing one, and actions, which return a value to the driver program after running a computation on the dataset.
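To make the transformation/action split concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API or implementation: `MiniRDD` and everything inside it are hypothetical names made up for illustration. It assumes the lazy behaviour Spark is known for, where a transformation only records what to do and an action triggers the computation. The method names mirror the real RDD operations `map`, `filter`, `collect`, `count`, and `reduce`.

```python
import functools


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical; not part of Spark)."""

    def __init__(self, source, steps=()):
        self._source = list(source)  # the base data
        self._steps = tuple(steps)   # recorded transformations, not yet run

    # Transformations: return a NEW dataset and do no work yet.
    def map(self, fn):
        return MiniRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return MiniRDD(self._source, self._steps + (("filter", pred),))

    def _compute(self):
        # Replay the recorded steps over the base data.
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return data

    # Actions: run the computation and return a value to the caller
    # (the "driver program" in Spark terms).
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())

    def reduce(self, fn):
        return functools.reduce(fn, self._compute())


calls = []  # records every time square() actually runs


def square(x):
    calls.append(x)
    return x * x


numbers = MiniRDD(range(1, 6))

# Two transformations: each creates a new dataset from an existing one.
evens_squared = numbers.map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)  # still 0: nothing has been computed

# Actions: trigger the computation and hand back plain values.
result = evens_squared.collect()
total = numbers.map(square).reduce(lambda a, b: a + b)

print(calls_before_action, result, total)
```

In real Spark the same shape applies, but the data is split into partitions across the cluster and the recorded steps are what allow a lost partition to be recomputed, which is where the fault tolerance comes from.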

The image below will help you understand how Spark works internally:

image

answered Oct 11, 2018 by nitinrawat895
• 11,380 points
