$70 | 5 sections, 15 exercises, 13 quizzes (8 hours)
Apache Spark is an open-source cluster computing engine designed for fast processing of large datasets and high performance across a wide range of applications. Unlike MapReduce, Spark supports in-memory cluster computing, which greatly improves the speed of iterative algorithms and interactive data mining tasks.
We guide developers through an explanation of the Spark framework, the basics of programming in Scala (the language Spark is written in), and an outline of how to work with Spark’s primary abstraction, the resilient distributed dataset (RDD).
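As a taste of the material, working with RDDs looks like the following minimal sketch. It assumes the interactive Spark shell (`spark-shell`), where a SparkContext is predefined as `sc`; the data and variable names are illustrative only.

```scala
// Build an RDD from a local collection (for illustration; real jobs
// typically load data from HDFS or another distributed store).
val lines = sc.parallelize(Seq("spark makes clusters fast", "rdds are immutable"))

// Transformations are lazy: nothing executes until an action is called.
val wordCounts = lines
  .flatMap(_.split(" "))   // split each line into words
  .map(word => (word, 1))  // pair each word with a count of 1
  .reduceByKey(_ + _)      // sum the counts per word across the cluster

// collect() is an action: it triggers computation and returns results
// to the driver.
wordCounts.collect().foreach(println)
```

The lazy-evaluation model shown here is what lets Spark keep intermediate data in memory across the chained transformations, rather than writing to disk between steps as MapReduce does.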
‘Introduction to Apache Spark’ includes illuminating video lectures, practical hands-on Scala and Spark exercises, a guide to local installation of Spark, and quizzes.
On completing ‘Introduction to Apache Spark’, trainees should be able to explain core Spark concepts, understand the fundamentals of coding in Scala, and perform basic programming and data manipulation in Spark.
- Data scientists and engineers
- Individuals with a basic understanding of Apache Hadoop and Big Data
- Individuals with some understanding of programming languages like Scala, Java, or Python