Apache Spark vs Apache Spark 2

+1 vote
What are the key differences between the Spark 1.x and 2.x series, from both an architecture and an applications perspective?

Thanks in advance.
Apr 24, 2018 in Apache Spark by Ashish
• 2,650 points

2 answers to this question.

+1 vote
Architecturally, Spark 2.x doesn't differ much from Spark 1.x.

API usability, SQL support, and Structured Streaming are some of the major areas of change.

One of the biggest changes was the merging of the Dataset and DataFrame APIs.

The main focus was ETL.

Hope this helps
answered Apr 24, 2018 by kurt_cobain
• 9,390 points
Can you please elaborate on this? What other changes are there between Apache Spark 1.x and Apache Spark 2.x?
0 votes
  • Apache Spark 2.0.0's APIs are largely similar to 1.x, but Spark 2.0.0 does have API-breaking changes
  • Apache Spark 2.0.0 is the first release in the 2.x line
  • The major updates are API usability, SQL 2003 support, performance improvements, Structured Streaming, R UDF support, and operational improvements
answered Jun 26, 2018 by zombie
• 3,790 points
