Upcoming Sessions
November 17
ILT - DENG-254: Preparing with Cloudera Data Engineering and Apache Spark - 4664969
Starting: 2025/11/17 @ 09:00 AM Berlin
Ending: 2025/11/20 @ 05:00 PM Berlin

November 17
ILT - DSCI-272: Predicting with MLOps on Cloudera AI - 4664966
Starting: 2025/11/17 @ 09:00 AM Central Time (US & Canada)
Ending: 2025/11/20 @ 05:00 PM Central Time (US & Canada)
Enterprise data science teams need collaborative access to the business data, tools, and computing resources required to develop and deploy machine learning workflows. Cloudera AI, part of the Cloudera platform, gives data science teams those resources. This four-day course covers machine learning workflows and operations using Cloudera AI. Participants explore, visualize, and analyze data, then train, evaluate, and deploy machine learning models. The course walks through an end-to-end data science and machine learning workflow based on realistic scenarios and datasets from a fictitious technology company. The demonstrations and exercises are conducted in Python (with PySpark) using Cloudera AI.
DATE: November 17-20, 2025
Virtual Classroom, AMER, 9:00 - 17:00 (Central US time zone)
Overview
This three-day hands-on training course delivers the key concepts and expertise developers need to optimize the performance of their Apache Spark applications. During the course, participants learn how to identify common sources of poor performance in Spark applications, techniques for avoiding or solving them, and best practices for Spark application monitoring. Optimizing Apache Spark Applications presents the architecture and concepts behind Apache Spark and the underlying data platform, then builds on this foundation by teaching students how to tune Spark application code. The course format emphasizes instructor-led demonstrations that illustrate both performance issues and the techniques that address them, followed by hands-on exercises that let students practice what they have learned in an interactive notebook environment.

What You'll Learn
Students who successfully complete this course will be able to:
- Understand Apache Spark's architecture, job execution, and how techniques such as lazy execution and pipelining can improve runtime performance
- Evaluate the performance characteristics of core data structures such as RDDs and DataFrames
- Select the file formats that will provide the best performance for their application
- Identify and resolve performance problems caused by data skew
- Use partitioning, bucketing, and join optimizations to improve Spark SQL performance
- Understand the performance overhead of Python-based RDDs, DataFrames, and user-defined functions
- Take advantage of caching for better application performance
- Understand how the Catalyst and Tungsten optimizers work
- Understand how Workload XM can help troubleshoot and proactively monitor Spark application performance
- Learn how the Adaptive Query Execution engine improves performance

What to Expect
This course is designed for software developers, engineers, and data scientists who have experience developing Spark applications and want to learn how to improve the performance of their code. This is not an introduction to Spark. Spark examples and hands-on exercises are presented in Python, and the ability to program in this language is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.
DATE: November 19-21, 2025
Virtual Classroom, EMEA - FRENCH, 9:00 - 17:00 (CET time zone)
Welcome to our Introduction Series for Cloudera Education. The following contains excerpts from Admin:230 Administering Cloudera on premises, which is part of our full OnDemand Training Library. The complete library is available for purchase. Upon completion you will see a path to continue your Cloudera training journey. Many courses include hands-on labs, and the OnDemand library comes with 100 hands-on lab hours to practice the concepts and exercises taught. Please enjoy this section of your selected course to help you on your data journey.
Disclaimer: The following descriptions and objectives are for the full course.

Overview
A lab environment is included with this course. Starting in lesson 04, you will be able to launch your environment. This course presents detailed explanations, comprehensive theory, key skills, and recommended practices for successful platform administration. By the end of the course, a Cloudera Administrator will have learned the full range of functionality and capability of Cloudera Manager and gained the skills to become highly productive with Cloudera Manager and the Cloudera platform. Cloudera Manager is a full-featured and mature DevOps tool. It is used to install, configure, operate, troubleshoot, report on, and upgrade Cloudera. Many Cloudera Administrators use only a fraction of the capabilities built into Cloudera Manager. This course teaches the architecture, deployment, configuration, logging, reporting, REST API, and much more. The course provides references for architecture and recommended practices used by enterprises around the globe.

What to expect
While this course is an entry point for aspiring Cloudera Administrators, it is detailed enough for more senior Cloudera Administrators to discover new functionality and capabilities. This course is intended for Linux Administrators who are taking up roles as Platform Administrators. We recommend a minimum of 2 years of system administration experience in industry. Students must have proficiency in Linux. Knowledge of Directory Services, Transport Layer Security, Kerberos, and SQL select statements is helpful. Students must have access to the Internet to reach Amazon Web Services.
Welcome to our Introduction Series for Cloudera Education. The following contains excerpts from DSCI:272 - Predicting with MLOps on Cloudera AI, which is part of our full OnDemand Training Library. The complete library is available for purchase. Upon completion you will see a path to continue your Cloudera training journey. Many courses include hands-on labs, and the OnDemand library comes with 100 hands-on lab hours to practice the concepts and exercises taught. Please enjoy this section of your selected course to help you on your data journey.
Disclaimer: The following descriptions and objectives are for the full course.

Overview
Enterprise data science teams need collaborative access to the business data, tools, and computing resources required to develop and deploy machine learning workflows. Cloudera AI, part of the Cloudera platform, gives data science teams those resources. This course covers machine learning workflows and operations using Cloudera AI. Participants explore, visualize, and analyze data, then train, evaluate, and deploy machine learning models. The course walks through an end-to-end data science and machine learning workflow based on realistic scenarios and datasets from a fictitious technology company. The demonstrations and exercises are conducted in Python (with PySpark) using Cloudera AI.

Course Length
This course includes approximately 8.5 hours of video lectures and demonstrations. You will need your own environment to work on the labs. The labs will take approximately 7 hours to complete.

What to expect
The course is designed for data scientists who need to understand how to use Cloudera AI and the Cloudera platform to achieve faster model development and deliver production machine learning at scale. Data engineers, developers, and solution architects who collaborate with data scientists will also find this course valuable.
About This Course
Explore the core features of Cloudera AI and how they power modern AI and ML workflows. This course introduces Cloudera’s end-to-end AI capabilities, from data engineering and warehousing to model deployment, all built on a unified platform designed for enterprise-grade AI. You’ll gain foundational knowledge of tools like the Cloudera AI Workbench, AI Inference Service, Model Hub, AI Registry, and Private AI architecture, and understand how they come together to support scalable, secure, and efficient AI solutions. Learn how the enterprise can Infuse, Build, and Run AI with Cloudera.
Goal: By the end of this course, you’ll be able to identify the key components of Cloudera AI and explain how they support enterprise AI initiatives from data to deployment.
TIME: 45 minutes
Recovery Assistance
Cloudera’s Recovery Assistance offering is available to help customers with time-sensitive Non-Break Fix (NBF) issues specifically targeting production down, environment-specific NBFs. Customer’s purchase, and Cloudera (Government Solutions) Inc.’s (CGSI) performance, of Recovery Assistance is governed by the terms and conditions of the existing Enterprise Subscription Master Agreement between CGSI and Customer, or between Customer and CGSI’s Authorized Partner, as applicable. If no such agreement exists, the terms and conditions of the CGSI Enterprise Subscription Master Agreement at https://www.cloudera.com/legal/commercial-terms-and-conditions/cgsi-enterprise-subscription-master-agreement.html will apply.

On a single incident per purchase basis, Cloudera will provide Recovery Assistance for an outage experienced by Customer in a Cloudera-based production system; provided that: (i) such production system was functioning properly prior to such incident; (ii) all Recovery Assistance is provided by Cloudera remotely and no personnel onboarding requirements are permitted; and (iii) Customer has submitted a Support Services ticket to Cloudera with respect to such incident and Cloudera Customer Support has determined the outage is due to one or more of the following items (which items are currently excluded from Cloudera’s Support Services under the Cloudera Support Policy): (a) the installation or removal of the Cloudera Products; (b) performance tuning; (c) information or assistance on technical issues related to the debugging, installation, administration, or use of Customer’s computer systems and enabling technologies including, but not limited to, databases, computer networks, communications, hardware, hard disks, networks, and printers; (d) Customer’s negligence or misuse of the Cloudera Products; and/or (e) any act or omission of Customer and/or any third party.

Recovery Assistance is limited solely to the restoration of the Cloudera-based production system experiencing the outage. No new Cloudera Product functionality or documentation will be provided. The Recovery Assistance purchased hereunder must be used within 12 months of the Effective Date of purchase. Recovery Assistance not used within such 12-month period will expire and no refund will be given. In the event of any conflict between the terms of the Agreement and these terms, these terms will prevail, but only with respect to the Recovery Assistance described herein. Customer’s purchase of Recovery Assistance using a credit card indicates Customer’s acceptance of the terms and conditions provided above.