Apache Spark vs Apache Spark 2

+1 vote
What are the key differences between the Spark 1.x and 2.x series, from both an architecture and an applications perspective?

Thanks in advance.
Apr 24, 2018 in Apache Spark by Ashish
• 2,650 points

2 answers to this question.

+1 vote
Architecturally, Spark 2.x doesn't differ much from Spark 1.x.

API usability, SQL support, and Structured Streaming are some of the major areas of change.

One of the biggest changes was the merging of the Dataset and DataFrame APIs.

The main focus was ETL.

Hope this helps
answered Apr 24, 2018 by kurt_cobain
• 9,390 points
Can you please elaborate on this? What other changes are there between Apache Spark 1.x and Apache Spark 2.x?
0 votes
  • Apache Spark 2.0.0's APIs are largely similar to 1.x, but Spark 2.0.0 does have API-breaking changes
  • Apache Spark 2.0.0 is the first release in the 2.x line
  • The major updates are API usability, SQL 2003 support, performance improvements, Structured Streaming, R UDF support, and operational improvements
answered Jun 26, 2018 by zombie
• 3,790 points
