Introducing - Cloudera Data Engineering: Developing Applications with Apache Spark



Course Length
144 mins

OnDemand Moderation



Welcome to our Introduction Series for Cloudera Education. The following contains excerpts from a course that is part of our full OnDemand Training Library. The complete library is available for purchase. Upon completion, you will see a path to continue your Cloudera training journey. Many courses include hands-on labs, and the OnDemand library comes with 100 hands-on lab hours for practicing the concepts and exercises taught. Please enjoy this section of your selected course to help you on your data journey.

Disclaimer - The following descriptions and objectives are for the full course.

About This Course

This course delivers the key concepts and knowledge developers need to use Apache Spark to develop high-performance, parallel applications on the Cloudera Data Platform (CDP).

Participants will practice writing Spark applications that integrate with CDP core components. They will learn how to use Spark SQL to query structured data, how to use Hive features to ingest and denormalize data, and how to work with big data stored in a distributed file system.

Course Length

This course includes approximately 9.5 hours of video lectures, demonstrations, and exercises.
In order to complete the self-paced exercises for this course, students must have access to CDP through their organization.

Audience and Prerequisites

This course is designed for developers and data engineers. Students are expected to have basic Linux experience and basic proficiency in either the Python or Scala programming language. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required.


During this course, you will learn how to:

  • Distribute, store, and process data in a CDP cluster.
  • Write, configure, and deploy Apache Spark applications.
  • Use the Spark interpreters and Spark applications to explore, process, and analyze distributed data.
  • Query data using Spark SQL, DataFrames, and Hive tables.
  • Deploy a Spark application on the Data Engineering Service.

Added 3 days ago, by Anonymous
Added 14 days ago, by Elvis
Added 16 days ago, by Anonymous
nice course
Added 23 days ago, by Rahmat
good course
Added 23 days ago, by Christopher
Added 26 days ago, by Ludwik
Great, straight-to-the-point introduction, with some execution/underlying technology information included. Very useful. The speech is slow, so it can be comfortably listened to at 1.5x the original speed.
Added about 1 month ago, by Robertus Agung
Added about 1 month ago, by Shobhit
Added about 1 month ago, by Khalid
Added 2 months ago, by Sakina
