Cloudera Education

ILT - DSCI-273: Generative AI on Cloudera - 5114865

About This Training
Generative AI (GenAI) and Large Language Models (LLMs) are powerful new tools that are changing every industry. To take full advantage of GenAI and LLMs, these capabilities need to be combined with your existing enterprise data. This two-day course teaches how to use Cloudera AI to train, augment, fine-tune, and host LLMs to create enterprise AI solutions.

What Skills You Will Gain
Through lecture and hands-on exercises, you will learn how to:
- Select the right LLM for a use case
- Configure a prompt for an LLM
- Use Retrieval Augmented Generation (RAG)
- Fine-tune an LLM with enterprise data
- Use the AI Model Registry and host an LLM
- Create an AI agent with CrewAI

Who Should Take This Course
This course is designed for data scientists and machine learning engineers who need to understand how to use Cloudera AI to combine their enterprise data with generative AI and Large Language Models and deliver business solutions.

DATE: October 5-6, 2026 9:00 - 17:00 (GMT+2 TIMEZONE) Virtual Classroom, EMEA

ILT - DOPS-246: Streaming on Cloudera with Apache Flink - 5114793

This two-day instructor-led training course teaches students the development and operations skills needed to support Cloudera Streaming Analytics, a framework for low-latency processing and analytics powered by Apache Flink and Cloudera's innovative SQL Stream Builder. Through extensive hands-on exercises, students will gain experience deploying and managing a Flink cluster, developing and running Flink applications, and using SQL Stream Builder's continuous SQL to perform analytics on streaming data.

DATE: August 24-25, 2026 9:00 - 17:00 (GMT+2 TIMEZONE) Virtual Classroom, EMEA

ILT - DENG-256: Optimizing Apache Spark Applications - 5114321

Overview
This three-day hands-on training course delivers the key concepts and expertise developers need to optimize the performance of their Apache Spark applications. During the course, participants will learn how to identify common sources of poor performance in Spark applications, techniques for avoiding or solving them, and best practices for Spark application monitoring.

Optimizing Apache Spark Applications presents the architecture and concepts behind Apache Spark and the underlying data platform, then builds on this foundational understanding by teaching students how to tune Spark application code. The course format emphasizes instructor-led demonstrations that illustrate both performance issues and the techniques that address them, followed by hands-on exercises that give students an opportunity to practice what they've learned in an interactive notebook environment.

What You'll Learn
Students who successfully complete this course will be able to:
- Understand Apache Spark's architecture, job execution, and how techniques such as lazy execution and pipelining can improve runtime performance
- Evaluate the performance characteristics of core data structures such as RDDs and DataFrames
- Select the file formats that will provide the best performance for your application
- Identify and resolve performance problems caused by data skew
- Use partitioning, bucketing, and join optimizations to improve Spark SQL performance
- Understand the performance overhead of Python-based RDDs, DataFrames, and user-defined functions
- Take advantage of caching for better application performance
- Understand how the Catalyst and Tungsten optimizers work
- Understand how Workload XM can help troubleshoot and proactively monitor Spark application performance
- Learn how the Adaptive Query Execution engine improves performance

What to Expect
This course is designed for software developers, engineers, and data scientists who have experience developing Spark applications and want to learn how to improve the performance of their code. This is not an introduction to Spark. Spark examples and hands-on exercises are presented in Python, and the ability to program in this language is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.

DATE: September 1-3, 2026 9:30 - 17:30 (GMT+8 SGT TIMEZONE) Virtual Classroom, APAC

ILT - DANA-262: Analyzing with Cloudera Data Warehouse - 5114319

This four-day Analyzing with Data Warehouse course will teach you to apply traditional data analytics and business intelligence skills to big data. This course presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.

What you'll learn
Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the ecosystem, learning how to:
- Use Apache Hive and Apache Impala to access data through queries
- Identify distinctions between Hive and Impala, such as differences in syntax, data formats, and supported features
- Write and execute queries that use functions, aggregate functions, and subqueries
- Use joins and unions to combine datasets
- Create, modify, and delete tables, views, and databases
- Load data into tables and store query results
- Select file formats and develop partitioning schemes for better performance
- Use analytic and windowing functions to gain insight into their data
- Store and query complex or nested data structures
- Process and analyze semi-structured and unstructured data
- Optimize and extend the capabilities of Hive and Impala
- Determine whether Hive, Impala, an RDBMS, or a mix of these is the best choice for a given task
- Utilize the benefits of CDP Public Cloud Data Warehouse

What to expect
This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Some knowledge of SQL is assumed, as is basic Linux command-line familiarity.

DATE: August 18-21, 2026 9:30 - 17:30 (GMT+8 SGT TIMEZONE) Virtual Classroom, APAC

ILT - DENG-254: Preparing with Cloudera Data Engineering and Apache Spark - 5114313

This four-day hands-on training course delivers the key concepts and knowledge developers need to use Apache Spark to develop high-performance, parallel applications on the Cloudera Data Platform (CDP). Hands-on exercises allow students to practice writing Spark applications that integrate with CDP core components. Participants will learn how to use Spark SQL to query structured data, how to use Hive features to ingest and denormalize data, and how to work with "big data" stored in a distributed file system. After taking this course, participants will be prepared to face real-world challenges and build applications that support faster decisions, better decisions, and interactive analysis across a wide variety of use cases, architectures, and industries.

What you'll learn
During this course, you will learn how to:
- Distribute, store, and process data in a CDP cluster
- Write, configure, and deploy Apache Spark applications
- Use the Spark interpreters and Spark applications to explore, process, and analyze distributed data
- Query data using Spark SQL, DataFrames, and Hive tables
- Deploy a Spark application on the Data Engineering Service

What to expect
This course is designed for developers and data engineers. All students are expected to have basic Linux experience and basic proficiency in either the Python or Scala programming language. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required.

DATE: August 11-14, 2026 9:30 - 17:30 (GMT+8 SGT TIMEZONE) Virtual Classroom, APAC

ILT - ADMIN-230: Administering Cloudera on premises - 5114310

Cloudera is a fully integrated edge-to-AI product set. Cloudera Manager is purpose-built as the DevOps tooling for building and managing the Cloudera platform. This four-day hands-on course presents detailed explanation, comprehensive theory, key skills, and recommended practices for successful platform administration. Upon completing this course, a Cloudera administrator will have learned the full range of functionality and capability of Cloudera Manager.

DATE: August 4-7, 2026 9:30 - 17:30 (GMT+8 SGT TIMEZONE) Virtual Classroom, APAC

Cloudera Educational Services

Upcoming Sessions

13

ILT - DENG-256: Optimizing Apache Spark Applications - 5015886 - public APAC

13

ILT - DOPS-246: Streaming on Cloudera with Apache Flink - 4959775 - public EMEA
