Cloudera Educational Services

Upcoming Sessions


Overview

The Cloudera platform is intended to meet the most demanding technical audit standards. The significant improvements in Cloudera architecture and components make Cloudera “Secure by Design.” This four-day hands-on course is presented as a project plan for Cloudera administrators to build fully secured Cloudera clusters.

The course begins with implementing Perimeter Security by installing host-level security and Kerberos. Next, students protect data by implementing Transport Layer Security using Auto-TLS and data encryption using Key Management System and Key Trustee Server (KMS/KTS). In the third stage, students control user and data access using Apache Ranger and Apache Atlas. The fourth stage focuses on visibility practices, teaching students how to audit systems, users, and data usage. Finally, the course introduces Cloudera practices for Risk Management in a fully secured Cloudera platform. This course is 60% exercise and 40% lecture.

Who should take this course?

This immersion course is designed for Linux Administrators transitioning to Cloudera Administrator roles. Students must have proficiency in Linux (e.g., navigating the file system, using basic commands) and Linux text editors (e.g., vi, nano). Familiarity with Directory Services, Transport Layer Security, Kerberos, and SQL select statements is recommended. Prior experience with Cloudera products is required. Students must have reliable internet access to connect to the classroom environments hosted on Amazon Web Services.

DATE: November 10-13, 2025 Virtual Classroom, AMER 9:00 - 17:00 (Central US TIMEZONE)

Cloudera is a fully integrated edge-to-AI product set. Cloudera Manager is purpose-built as the DevOps tooling for building and managing the Cloudera platform. This four-day hands-on course presents detailed explanation, comprehensive theory, key skills, and recommended practices for successful platform administration. By the end of this course, a Cloudera Administrator will have learned the full range of functionality and capability of Cloudera Manager.

DATE: November 3-6, 2025 Virtual Classroom, AMER 9:00 - 17:00 (Central US TIMEZONE)

This course introduces Apache Iceberg, a high-performance open table format for organizing petabyte-scale analytic datasets on a file system or object store, available on Cloudera Data Warehouse and Cloudera Data Engineering on both Private and Public Cloud. Combined with Cloudera Data Platform, Iceberg can enable users to build an open data lakehouse architecture for multi-function analytics and to deploy large-scale end-to-end pipelines. This course covers various aspects of Apache Iceberg, such as benefits, architecture, internal operation, read and write operations, and advanced functions, all while drawing comparisons to Hive and building on the students’ existing knowledge and experience.

DATE: October 27-30, 2025 Virtual Classroom, AMER 9:00 - 17:00 (Central US TIMEZONE)

Overview

This three-day hands-on training course delivers the key concepts and expertise developers need to optimize the performance of their Apache Spark applications. During the course, participants will learn how to identify common sources of poor performance in Spark applications, techniques for avoiding or solving them, and best practices for Spark application monitoring. Optimizing Apache Spark Applications presents the architecture and concepts behind Apache Spark and the underlying data platform, then builds on this foundational understanding by teaching students how to tune Spark application code. The course format emphasizes instructor-led demonstrations that illustrate both performance issues and the techniques that address them, followed by hands-on exercises that give students an opportunity to practice what they've learned through an interactive notebook environment.

What You'll Learn

Students who successfully complete this course will be able to:
Understand Apache Spark's architecture, job execution, and how techniques such as lazy execution and pipelining can improve runtime performance
Evaluate the performance characteristics of core data structures such as RDDs and DataFrames
Select the file formats that will provide the best performance for your application
Identify and resolve performance problems caused by data skew
Use partitioning, bucketing, and join optimizations to improve SparkSQL performance
Understand the performance overhead of Python-based RDDs, DataFrames, and user-defined functions
Take advantage of caching for better application performance
Understand how the Catalyst and Tungsten optimizers work
Understand how Workload XM can help troubleshoot and proactively monitor Spark application performance
Learn how the Adaptive Query Execution engine improves performance

What to Expect

This course is designed for software developers, engineers, and data scientists who have experience developing Spark applications and want to learn how to improve the performance of their code. This is not an introduction to Spark. Spark examples and hands-on exercises are presented in Python, and the ability to program in this language is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.

DATE: November 19-21, 2025 Virtual Classroom, EMEA 9:00 - 17:00 (CET TIMEZONE)

This four-day hands-on training course delivers the key concepts and knowledge developers need to use Apache Spark to develop high-performance, parallel applications on the Cloudera Data Platform. Hands-on exercises allow students to practice writing Spark applications that integrate with Cloudera Data Platform core components. Participants will learn how to use Spark SQL to query structured data, how to use Hive features to ingest and denormalize data, and how to work with “big data” stored in a distributed file system. After taking this course, participants will be prepared to face real-world challenges and build applications that support faster decisions, better decisions, and interactive analysis, applied to a wide variety of use cases, architectures, and industries.

What you'll learn

During this course, you will learn how to:
Distribute, store, and process data in a cluster
Write, configure, and deploy Apache Spark applications
Use the Spark interpreters and Spark applications to explore, process, and analyze distributed data
Query data using Spark SQL, DataFrames, and Hive tables
Deploy a Spark application on the Data Engineering Service

What to expect

This course is designed for developers and data engineers. All students are expected to have basic Linux experience and basic proficiency with either the Python or Scala programming language. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required.

DATE: October 27-30, 2025 Virtual Classroom, EMEA, French language 9:00 - 17:00 (CET TIMEZONE)

Welcome to our Introduction Series for Cloudera Education. The following contains excerpts from Admin:230 Administering Cloudera on premises, which is part of our full OnDemand Training Library. The complete library is available for purchase. Upon completion you will see a path to continue your Cloudera training journey. Many courses include hands-on labs, and the OnDemand library comes with 100 hands-on lab hours to practice the concepts and exercises taught. Please enjoy this section of your selected course to help you on your data journey.

Disclaimer: The following descriptions and objectives are for the full course.

Overview

A lab environment is included with this course. Starting in lesson 04, you will be able to launch your environment. This course presents detailed explanation, comprehensive theory, key skills, and recommended practices for successful platform administration. By the end of this course, a Cloudera Administrator will have learned the full range of functionality and capability of Cloudera Manager. This course provides an in-depth explanation and the skills to become highly productive with Cloudera Manager and the Cloudera platform. Cloudera Manager is a full-featured and mature DevOps tool. It is used to install, configure, operate, troubleshoot, report on, and upgrade Cloudera. Many Cloudera Administrators use only a fraction of the capabilities built into Cloudera Manager. This course teaches the architecture, deployment, configuration, logging, reporting, REST API, and much more. The course provides references for architecture and recommended practices used by enterprises around the globe.

What to expect

While this course is an entry point for aspiring Cloudera Administrators, it is detailed enough for more senior Cloudera Administrators to discover new functionality and capabilities. This course is intended for Linux Administrators who are taking up roles as Platform Administrators. We recommend a minimum of 2 years of system administration experience in industry. Students must have proficiency in Linux. Knowledge of Directory Services, Transport Layer Security, Kerberos, and SQL select statements is helpful. Students must have access to the Internet to reach Amazon Web Services.
