Big Data Processing with Spark and Scala
The video above is the recording of our webinar “Big Data Processing with Spark and Scala”, held on 27th July 2014.
Introduction to Spark & Scala:
Apache Spark is a fast, general-purpose engine for large-scale data processing, originally developed in the AMPLab at UC Berkeley. Spark fits well into the Hadoop open-source ecosystem because it can run on top of the Hadoop Distributed File System (HDFS), yet it is not tied to the two-stage MapReduce paradigm. By generalizing the MapReduce computation model, Spark addresses the limitations of Hadoop MapReduce while dramatically improving performance and ease of use. Spark provides primitives for in-memory cluster computing, which let user programs load data into a cluster’s memory and query it repeatedly, making it well suited to iterative machine learning algorithms.
Scala stands for ‘Scalable Language’. It is an object-oriented language, and its scalability is the result of a careful integration of object-oriented and functional language concepts. The language supports advanced component architectures through classes and traits, and it also includes first-class functions and a library of rich immutable data structures.
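A minimal sketch of the Scala features just mentioned: a trait as a reusable component, a first-class function stored in a value, and immutable collections. The names here (`Greeter`, `shout`, etc.) are illustrative only, not from the webinar.

```scala
// Traits support component composition: like interfaces, but they may
// carry concrete method implementations.
trait Greeter {
  def name: String
  def greet(): String = s"Hello, $name"
}

object Demo {
  // First-class functions: a function stored in a value and passed around.
  val shout: String => String = s => s.toUpperCase

  def main(args: Array[String]): Unit = {
    val g = new Greeter { val name = "Spark" }
    println(shout(g.greet())) // HELLO, SPARK

    // Immutable data structures: operations return new collections
    // instead of modifying the original.
    val xs = List(1, 2, 3)
    val ys = xs.map(_ * 2) // List(2, 4, 6); xs itself is unchanged
    println(ys.sum)        // 12
  }
}
```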
Topics covered in the session:
- What is Big Data?
- What is Spark?
- Why Spark?
- Spark Ecosystem
- A note about Scala
- Why Scala?
- Hello Spark
- Fast Analytics
- Real-Time Stream Processing
- Fault Tolerant
- Powerful and Integrated Data Processing
- Easy to Use
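As a taste of the “Hello Spark” topic above, here is the canonical first Spark program, word count, sketched against plain Scala collections, whose API Spark’s RDD API closely mirrors. This is an illustrative sketch: in an actual Spark job the input would come from something like `sc.textFile(...)` and the counting step would typically be `rdd.reduceByKey(_ + _)`.

```scala
object WordCount {
  // Count occurrences of each word across a collection of lines.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+")) // split each line into words
      .filter(_.nonEmpty)
      .groupBy(identity)        // group identical words together
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    val lines = Seq("hello spark", "hello scala")
    count(lines).toSeq.sortBy(_._1).foreach {
      case (word, n) => println(s"$word $n")
    }
  }
}
```

With Spark, the same `flatMap`/`map`/reduce shape runs distributed across a cluster, which is part of why Scala programmers find the RDD API familiar.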
Please visit this link for more details about our course ‘Big Data Processing with Scala and Spark.’
Feel free to drop us a line for any clarifications.