Internal work of Spark

0 votes
Spark uses the Scala language to load and execute programs, and it also supports Python and Java. RDDs are used to store the data. But I can't understand the architecture of Spark and how it runs internally.

Can someone please explain the Spark architecture and how it works internally?
Oct 11, 2018 in Apache Spark by Meci Matt
• 9,400 points
73 views

1 answer to this question.

0 votes

Spark revolves around the concept of a resilient distributed dataset (RDD), which is a fault-tolerant collection of elements that can be operated on in parallel. RDDs support two types of operations: transformations, which create a new dataset from an existing one, and actions, which return a value to the driver program after running a computation on the dataset.
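To make the transformation/action distinction concrete, here is a minimal sketch in Scala. It assumes a spark-shell session where a SparkContext named sc is already available, and data.txt is a hypothetical input file used only for illustration:

// Load a text file into an RDD; nothing is read yet, because RDDs are lazy.
val lines = sc.textFile("data.txt")

// Transformation: describes a new RDD derived from the existing one, still lazy.
val lineLengths = lines.map(line => line.length)

// Action: triggers the actual computation across the cluster and
// returns a value to the driver program.
val totalLength = lineLengths.reduce((a, b) => a + b)

println(s"Total characters: $totalLength")

Nothing runs on the executors until the action (reduce) is called; the transformation (map) only records how the new dataset should be computed.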

The image below will help you understand how Spark works internally:

[Image: Spark internal architecture diagram]

answered Oct 11, 2018 by nitinrawat895
• 9,070 points

