Apache Spark vs Apache Spark 2

+1 vote
What are the key differences between the Spark 1.x and 2.x series, from both an architecture and an application perspective?

Thanks in advance.
Apr 24, 2018 in Apache Spark by Ashish

2 answers to this question.

+1 vote
Spark 2 doesn't differ much from Spark 1.x architecture-wise.

API usability, SQL support and Structured Streaming are some of the major areas of change.

One of the biggest changes was the merge of the Dataset and DataFrame APIs: in Scala and Java, DataFrame is now just a type alias for Dataset[Row].
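To make the Dataset/DataFrame merge a bit more concrete, here is a minimal sketch in Scala. It assumes Spark 2.x with spark-sql on the classpath and a local master; the Person case class, the object name and the sample rows are made up for illustration and do not come from the answer.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object DatasetDataFrameSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: running locally; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .appName("dataset-dataframe-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Typed view: a Dataset of Person objects.
    val people: Dataset[Person] = Seq(Person("Ann", 34), Person("Bob", 19)).toDS()

    // Untyped view: in Spark 2.x a DataFrame is a Dataset[Row],
    // so the assignment below compiles without any conversion.
    val df: DataFrame = people.toDF()
    val sameThing: Dataset[Row] = df

    // Typed filter: the lambda is checked against Person at compile time.
    val adults: Dataset[Person] = people.filter(_.age >= 21)

    // Untyped filter: column names are resolved at runtime.
    val adultsDf: DataFrame = df.filter($"age" >= 21)

    // Moving back from the untyped to the typed API.
    val typedAgain: Dataset[Person] = adultsDf.as[Person]

    adults.show()
    typedAgain.show()
    spark.stop()
  }
}
```

In Spark 1.6 these were two separate classes, so moving between them needed explicit conversions. In 2.x the same object can be used through either style.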

The main focus was ETL.

Hope this helps.
answered Apr 24, 2018 by kurt_cobain
Can you please elaborate on this? What other changes are there between Apache Spark and Apache Spark 2?
0 votes
  • Apache Spark 2.0.0 APIs are largely similar to 1.x, but Spark 2.0.0 does have API-breaking changes
  • Apache Spark 2.0.0 is the first release on the 2.x line
  • The major updates are API usability, SQL:2003 support, performance improvements, Structured Streaming, R UDF support, as well as operational improvements
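As a rough illustration of the API usability and SQL points in the list above, the sketch below uses SparkSession, the single entry point introduced in Spark 2.0 that covers what SQLContext and HiveContext did in 1.x, and runs a query with a scalar subquery, which Spark SQL accepts from 2.0 onwards. The table name, column names and rows are invented for the example, and a local master is assumed.

```scala
import org.apache.spark.sql.SparkSession

object SparkSessionSketch {
  def main(args: Array[String]): Unit = {
    // Spark 2.x: one builder-based entry point. In 1.x you would typically
    // create a SparkContext and wrap it in a SQLContext or HiveContext.
    val spark = SparkSession.builder()
      .appName("spark-session-sketch")
      .master("local[*]") // assumption: local run
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val orders = Seq(("a", 10.0), ("b", 25.0), ("a", 5.0)).toDF("customer", "amount")

    // createOrReplaceTempView is the 2.0 replacement for the
    // deprecated registerTempTable.
    orders.createOrReplaceTempView("orders")

    // Uncorrelated scalar subquery in the WHERE clause.
    val aboveAverage = spark.sql(
      """SELECT customer, amount
        |FROM orders
        |WHERE amount > (SELECT avg(amount) FROM orders)""".stripMargin)

    aboveAverage.show()
    spark.stop()
  }
}
```

The underlying SparkContext is still reachable as spark.sparkContext, so existing RDD code can keep working alongside the new entry point.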
answered Jun 25, 2018 by zombie
