Cloudera Educational Services

Upcoming Sessions


Overview

The Cloudera platform is intended to meet the most demanding technical audit standards. The significant improvements in Cloudera architecture and components make Cloudera “Secure by Design.” This four-day hands-on course is presented as a project plan for Cloudera administrators to build fully secured Cloudera clusters.

The course begins with implementing Perimeter Security by installing host level security and Kerberos. Next, students protect Data by implementing Transport Layer Security using Auto-TLS and data encryption using Key Management System and Key Trustee Server (KMS/KTS). Following this, in the third stage, students control access for users and to data using Apache Ranger and Apache Atlas. The fourth stage focuses on visibility practices, teaching students how to audit systems, users, and data usage. Finally, the course introduces Cloudera practices for Risk Management in a fully secured Cloudera platform. This course is 60% exercise and 40% lecture.

Who should take this course?

This immersion course is designed for Linux Administrators transitioning to Cloudera Administrator roles. Students must have proficiency in Linux (e.g., navigating the file system, using basic commands) and Linux text editors (e.g., vi, nano). Familiarity with Directory Services, Transport Layer Security, Kerberos, and SQL select statements is recommended. Prior experience with Cloudera products is required. Students must have reliable internet access to connect to the classroom environments hosted on Amazon Web Services.

DATE: June 23-26, 2026 9:30 - 17:30 (SGT timezone) Virtual Classroom, APAC

Overview

This three-day hands-on training course delivers the key concepts and expertise developers need to optimize the performance of their Apache Spark applications. During the course, participants will learn how to identify common sources of poor performance in Spark applications, techniques for avoiding or solving them, and best practices for Spark application monitoring. Optimizing Apache Spark Applications presents the architecture and concepts behind Apache Spark and the underlying data platform, then builds on this foundational understanding by teaching students how to tune Spark application code. The course format emphasizes instructor-led demonstrations that illustrate both performance issues and the techniques that address them, followed by hands-on exercises that give students an opportunity to practice what they've learned through an interactive notebook environment.

What You'll Learn

Students who successfully complete this course will be able to:
• Understand Apache Spark's architecture, job execution, and how techniques such as lazy execution and pipelining can improve runtime performance
• Evaluate the performance characteristics of core data structures such as RDD and DataFrames
• Select the file formats that will provide the best performance for your application
• Identify and resolve performance problems caused by data skew
• Use partitioning, bucketing, and join optimizations to improve SparkSQL performance
• Understand the performance overhead of Python-based RDDs, DataFrames, and user-defined functions
• Take advantage of caching for better application performance
• Understand how the Catalyst and Tungsten optimizers work
• Understand how Workload XM can help troubleshoot and proactively monitor Spark application performance
• Learn how the Adaptive Query Execution engine improves performance

What to Expect

This course is designed for software developers, engineers, and data scientists who have experience developing Spark applications and want to learn how to improve the performance of their code. This is not an introduction to Spark. Spark examples and hands-on exercises are presented in Python, and the ability to program in this language is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.

DATE: July 13-15, 2026 9:30 - 17:30 (SGT timezone) Virtual Classroom, APAC

This four-day course teaches the architecture, deployment, and configuration of Cloudera Data Services on Embedded Containerized Services (ECS). Cloudera Data Services provide a state-of-the-art, low-code platform that unifies the entire data lifecycle, reducing development costs and accelerating the development and deployment of use cases.

The course starts by covering best practices for managing Docker images and containers. Students will then build a Docker private registry. This Docker private registry will be used to deploy a Data Services cluster on ECS. Students will install, configure, and validate Cloudera Data Engineering, Cloudera Data Warehouse, and Cloudera Machine Learning. Through hands-on exercises, students will gain experience with Kubernetes, install a Private Cloud Embedded Container Service (ECS), and deploy Cloudera Data Services. Additionally, the course covers networking and hardware requirements and explains how Kubernetes pods dynamically scale to support Cloudera Data Services.

Who should take this course

This immersive course is designed for Cloudera Administrators transitioning to managing Cloudera Data Services on premises. Students should have at least 3 to 5 years of system administration experience. Students must have proficiency in the Linux Command Line Interface and knowledge of Identity Management, including Transport Layer Security and Kerberos. Familiarity with SQL select statements is recommended. Prior experience with Cloudera products is required. Students need reliable internet access to connect to the Amazon Web Services environment used in this course.

Recommended prerequisite courses
• ADMIN-230: Administering Cloudera on premises
• ADMIN-332: Securing Cloudera on premises

DATE: July 14-17, 2026 9:30 - 17:30 (SGT TIMEZONE) Virtual Classroom, APAC

Overview

The Cloudera platform is intended to meet the most demanding technical audit standards. The significant improvements in Cloudera architecture and components make Cloudera “Secure by Design.” This four-day hands-on course is presented as a project plan for Cloudera administrators to build fully secured Cloudera clusters.

The course begins with implementing Perimeter Security by installing host level security and Kerberos. Next, students protect Data by implementing Transport Layer Security using Auto-TLS and data encryption using Key Management System and Key Trustee Server (KMS/KTS). Following this, in the third stage, students control access for users and to data using Apache Ranger and Apache Atlas. The fourth stage focuses on visibility practices, teaching students how to audit systems, users, and data usage. Finally, the course introduces Cloudera practices for Risk Management in a fully secured Cloudera platform. This course is 60% exercise and 40% lecture.

Who should take this course?

This immersion course is designed for Linux Administrators transitioning to Cloudera Administrator roles. Students must have proficiency in Linux (e.g., navigating the file system, using basic commands) and Linux text editors (e.g., vi, nano). Familiarity with Directory Services, Transport Layer Security, Kerberos, and SQL select statements is recommended. Prior experience with Cloudera products is required. Students must have reliable internet access to connect to the classroom environments hosted on Amazon Web Services.

DATE: July 7-10, 2026 9:30 - 17:30 (SGT TIMEZONE) Virtual Classroom, APAC

This two-day instructor-led training course teaches students the development and operations skills needed to support Cloudera Streaming Analytics, a framework for low-latency processing and analytics powered by Apache Flink and Cloudera's innovative SQL Stream Builder. Through extensive hands-on exercises, students will gain experience deploying and managing a Flink cluster, developing and running Flink applications, and using SQL Stream Builder's continuous SQL to perform analytics on streaming data.

DATE: July 13-14, 2026 9:00 - 17:00 (CEST TIMEZONE) Virtual Classroom, EMEA

Designing Edge to AI Applications is a 4-day learning event that addresses advanced big data architecture topics for building edge to AI applications, covering streaming, operational data processing, analytics, and machine learning.

The workshop brings technical contributors together in a group setting to design and architect solutions to a challenging business problem. It addresses big data architecture problems in general, and then applies them to the design of a challenging system. Throughout the highly interactive workshop, participants apply concepts to real-world examples, resulting in detailed discussions. Participants learn techniques for architecting big data systems, not only from Cloudera’s experience but also from the experiences of fellow participants.

More specifically, this workshop addresses advanced big data architecture topics, including data formats, transformation, transactions, real-time, batch, and machine learning processing, scalability, fault tolerance, security, and privacy, minimizing the risk of an unsound architecture and technology selection.

What you'll learn
• Cloudera Data Platform
• Big Data Architecture
• Building Scalable applications
• Building Fault Tolerant Solutions
• Security and Privacy
• Deployment on Public, Private, and Hybrid Cloud

What to expect

Participants should mainly be architects, developer team leads, big data developers, data engineers, senior analysts, dev ops admins, and machine learning developers who are working on big data or streaming applications and have an interest in how to design and develop such applications on CDP. To gain the most from the workshop, participants should have working knowledge of popular Big Data and streaming technologies such as HDFS, Spark, Kafka, Hive/Impala, Data Formats, and relational database management systems. Detailed API-level knowledge is not needed, as there will not be any programming activities; the focus will instead be on architecture design. Participants will be divided into small groups to discuss the problems, develop solutions, and present those solutions.

DATE: June 23-25, 2026 9:00 - 17:00 (GMT+3 TIMEZONE) Virtual Classroom, EMEA
