Cloudera Educational Services

Welcome to our Introduction Series for Cloudera Education. The following contains excerpts from a course that is part of our full OnDemand Training Library. The complete library is available for purchase. Upon completion you will see a path to continue your Cloudera training journey. Many courses include hands-on labs, and the OnDemand library comes with 100 hands-on lab hours to practice the concepts and exercises taught. Please enjoy this section of your selected course to help you on your data journey.
Disclaimer: The following descriptions and objectives are for the full course.
About This Course
This course delivers the key concepts and knowledge developers need to use Apache Spark to develop high-performance, parallel applications on the Cloudera Data Platform (CDP). Practice writing Spark applications that integrate with CDP core components. Participants will learn how to use Spark SQL to query structured data, how to use Hive features to ingest and denormalize data, and how to work with big data stored in a distributed file system.
Course Length
This course includes approximately 9.5 hours of video lectures, demonstrations, and exercises. To complete the self-paced exercises for this course, students must have access to CDP through their organization.
Audience and Prerequisites
This course is designed for developers and data engineers. Students are expected to have basic Linux experience and basic proficiency with either the Python or Scala programming language. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required.

About This Course
This video demonstration builds a serverless website where a user can upload a picture of a receipt. After the upload to S3, an AWS Lambda function is triggered and a NiFi flow is executed to process the image and extract useful information using AWS Textract. The information is then sent to a database that serves an expense report solution.
Audience and Prerequisites
This OnDemand course is suitable for data engineers and data analysts.

About This Course
This course is part of the Skillup series. Learn how typical AI-centric challenges are addressed with CDP Machine Learning. This course includes 40 minutes of video content, including a demonstration on customer churn.
Audience and Prerequisites
This OnDemand course is suitable for data engineers, data analysts, developers, and data scientists.

About This Course
DataGen by Francois Risch. This course walks through installing and using DataGen, a tool that generates data on all services provided by Cloudera (HDFS, Hive, HBase, Ozone, Kafka, Kudu, Solr, local files) in any format (CSV, JSON, Avro, Parquet, ORC).
Course Length
This course includes 1 hour of video content.
Audience and Prerequisites
This OnDemand course is suitable for administrators who need to generate data.

About This Course
This course will teach you the key language concepts and programming techniques you need so that you can concentrate on the subjects covered in Cloudera's developer courses without also having to learn a complex programming language and a new programming paradigm on the fly.
Course Length
This module includes 2.5 hours of video content. The hands-on exercises for this free course require your own environment. They may take up to 3 hours. If you have a paid subscription, you should enroll in Just Enough Python, which includes an environment.
Audience and Prerequisites
This course is intended for students who need to become familiar with Python before progressing to one of the Cloudera developer classes. Prior knowledge of Hadoop is not required. Since this course is intended for developers who do not yet have the prerequisite skills for writing code in Python, basic programming experience in at least one commonly used programming language (ideally Java, but Ruby, Perl, Scala, C, C++, PHP, or JavaScript will suffice) is assumed. This course does not teach Big Data concepts, nor does it cover how to use Cloudera software. Instead, it is meant as a precursor to one of our developer-focused training courses, which provide those skills.

About This Course
Whether you’re building big data applications, developing data pipelines, or working on machine learning projects, it’s essential to manage changes to your code. Although developers and data scientists have employed a variety of tools for this over the years, an open source version control system called Git has emerged as the standard tool for thousands of organizations around the world. This course introduces students to the Git version control system through a series of lectures, demonstrations, and exercises.
Course Length
This module includes over an hour of video content. The hands-on exercises for this free course require your own environment. They may take up to 3 hours. If you have a paid subscription, you should enroll in Just Enough Git, which includes an environment.
Audience and Prerequisites
This course is best suited to developers and data scientists who feel comfortable performing basic operations from the Linux command line. No prior experience with Git or other revision control systems is necessary.
