Free Webinar on ‘Big Data Processing with Scala and Spark’

Big Data Processing with Spark and Scala

The video above is the recorded webinar session on “Big Data Processing with Spark and Scala”, held on 27 July 2014.

Introduction to Spark & Scala:

Apache Spark is a fast, general-purpose engine for large-scale data processing, originally developed in the AMPLab at UC Berkeley. Spark fits well into the Hadoop open-source ecosystem because it can read from and write to the Hadoop Distributed File System (HDFS) and run on existing Hadoop clusters. Unlike Hadoop MapReduce, Spark is not tied to the two-stage map-and-reduce paradigm: it generalizes the MapReduce computation model, which improves both performance and ease of use. Spark also provides primitives for in-memory cluster computing that let user programs load data into a cluster’s memory and query it repeatedly, which makes it well suited to iterative workloads such as machine learning algorithms.
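
The "load once, query repeatedly" idea can be sketched in a few lines of Scala. This is a minimal illustration rather than code from the webinar: it assumes the Spark core library is on the classpath and runs in local mode, and the names `HelloSpark` and `countMatches` and the sample lines are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HelloSpark {
  // Counts the lines that contain each keyword, reusing one cached dataset.
  def countMatches(sc: SparkContext,
                   lines: Seq[String],
                   keywords: Seq[String]): Map[String, Long] = {
    // parallelize distributes a local collection across the cluster;
    // sc.textFile(path) would read a file (for example from HDFS) instead.
    // cache() asks Spark to keep the dataset in memory once it is computed.
    val data = sc.parallelize(lines).cache()

    // Each count() is a separate query against the same cached data.
    keywords.map(k => k -> data.filter(_.contains(k)).count()).toMap
  }

  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark inside this JVM; a real cluster would use another master.
    val conf = new SparkConf().setAppName("HelloSpark").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val lines = Seq("spark is fast", "hadoop stores data", "spark runs on yarn")
      countMatches(sc, lines, Seq("spark", "hadoop"))
        .foreach { case (keyword, n) => println(s"$keyword: $n") }
    } finally {
      sc.stop()
    }
  }
}
```

Without `cache()`, each query would recompute the dataset from its source. With it, later queries can read the data straight from memory, which is where iterative algorithms tend to benefit.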

The name Scala is short for ‘Scalable Language’. Scala is an object-oriented language, and its scalability is the result of a careful integration of object-oriented and functional language concepts. The language supports advanced component architectures through classes and traits. Scala also includes first-class functions and a rich library of immutable data structures.
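
A small sketch can make these features concrete. It uses only the Scala standard library. The names `Greeter`, `Topic` and `ScalaSketch` and the sample values are hypothetical, chosen purely for illustration.

```scala
// A trait is a reusable unit of behaviour that classes can mix in.
trait Greeter {
  def name: String
  def greet: String = s"Hello, $name"
}

// A case class is immutable by default and here mixes in the trait.
final case class Topic(name: String, minutes: Int) extends Greeter

object ScalaSketch {
  // A first-class function: stored in a value and passed around like any other.
  val isLong: Topic => Boolean = _.minutes > 30

  // List is an immutable collection from the standard library.
  val topics: List[Topic] = List(Topic("Spark", 45), Topic("Scala", 20))

  // filter and map return new lists; the original list is never modified.
  val longTopicNames: List[String] = topics.filter(isLong).map(_.name)

  def totalMinutes(ts: List[Topic]): Int = ts.map(_.minutes).sum

  def main(args: Array[String]): Unit = {
    println(longTopicNames)
    println(topics.head.greet)
    println(totalMinutes(topics))
  }
}
```

The object-oriented part (the trait and the class) and the functional part (the function value and the immutable list operations) work on the same values, which is the integration the paragraph above describes.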

Topics covered in the Video & Presentation:

What is Big Data?
What is Spark?
Why Spark?
Spark Ecosystem
A note about Scala
Why Scala?
Hello Spark

Spark Features:

Fast Analytics
Real-Time Stream Processing
Fault Tolerant
Powerful and Integrated Data Processing
Easy to use

Please visit this link for more details about our course ‘Big Data Processing with Scala and Spark.’
Feel free to drop us a line for any clarifications.


Sree Eedupuganti says:
Feb 23, 2015 at 1:18 pm GMT
Hi everyone, I am trying to access data from Hive in Spark. When I run a query, I can’t see whether the job is running or completed, but I do get the output in the terminal. Any suggestions, please?
Netra says:
Aug 21, 2014 at 3:07 am GMT
Will Spark replace MapReduce or YARN in the future, or are they complementary?
- EdurekaSupport says:
  Sep 18, 2014 at 10:11 am GMT
  Hi Netra, Spark is not a replacement, as each has its own features. Spark also runs on a YARN cluster.
venkata murty maddula says:
Jul 28, 2014 at 1:39 pm GMT
Excellent ….
- EdurekaSupport says:
  Jul 30, 2014 at 8:35 am GMT
  Thanks Venkata!! Feel free to go through our other blog posts as well.
Kaustav Ray says:
Jul 28, 2014 at 5:56 am GMT
Being a fresher in data analytics, can I opt for learning Spark before learning Hadoop? (I understand Java and have also worked with R.)
- EdurekaSupport says:
  Aug 20, 2014 at 1:00 am GMT
  Absolutely, Kaustav! You can go for Spark. Since you already know Java, you can also go for Hadoop. You can call us at US: 1800 275 9730 (Toll Free) or India: +91 88808 62004 to discuss in detail. You can also go through this link for more information: https://www.edureka.co/big-data-hadoop-training-certification
Amitabh says:
Jul 27, 2014 at 2:05 pm GMT
Can we use Spark for unstructured data?
- EdurekaSupport says:
  Jul 28, 2014 at 5:47 am GMT
  Absolutely, Amitabh! Spark can be used for unstructured data. We can either cleanse the data first and then bring it into Spark, or do the cleansing in Spark itself.
Karuna Devanagavi says:
Jul 27, 2014 at 4:53 am GMT
Are Scala and Spark also integrated in the Cloudera virtual machine?
- EdurekaSupport says:
  Jul 28, 2014 at 5:53 am GMT
  Hi Karuna, you can install Spark on CDH4 (Cloudera) using Cloudera Manager. You can refer to the following link for this: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.8.1/Cloudera-Manager-Installation-Guide/cmig_spark_installation_standalone.html
  Spark (with Scala included) will come integrated with CDH 5.1. You can refer to the following link: http://blog.cloudera.com/blog/2014/05/apache-spark-1-0-is-released/
