Cloudera Educational Services

Upcoming Sessions

This four-day course teaches the architecture, deployment, configuration, and operation of CDP Data Services on Embedded Container Service (ECS). CDP Data Services provide state-of-the-art, low-code computing that brings the entire data lifecycle into a single set of tools, reducing the cost of developing use cases while accelerating development and deployment. The course begins with recommended practices for managing Docker images and containers, leading to the building of a private Docker registry. That registry is then used to deploy the Data Services cluster on ECS. Students will learn to install, configure, and validate Cloudera Data Engineering, Cloudera Data Warehouse, and Cloudera Machine Learning. Exercises focus on learning Kubernetes, installing Private Cloud Embedded Container Service (ECS), and deploying Cloudera Data Services. The course also covers networking and hardware requirements and explains how Kubernetes pods scale dynamically to support CDP Data Services.

Who should take this course?
This immersion course is intended for CDP administrators who are advancing into CDP Data Services running in a private cloud environment. We recommend a minimum of 3 to 5 years of industry experience in system administration. Students must be proficient with the Linux command-line interface and have knowledge of identity management, Transport Layer Security, and Kerberos. Experience with SQL SELECT statements is helpful. Prior experience with Cloudera products is expected; experience with CDP, CDH, or HDP is sufficient. Students must have access to the Internet to reach Amazon Web Services.

March 5 - 8, 2024 Virtual Classroom, APAC 9:00 - 17:00 (Singapore Time)

This three-day hands-on training course delivers the key concepts and expertise developers need to improve the performance of their Apache Spark applications. During the course, participants will learn how to identify common sources of poor performance in Spark applications, techniques for avoiding or solving them, and best practices for Spark application monitoring. Apache Spark Application Performance Tuning presents the architecture and concepts behind Apache Spark and the underlying data platform, then builds on this foundational understanding by teaching students how to tune Spark application code. The course format emphasizes instructor-led demonstrations that illustrate both performance issues and the techniques that address them, followed by hands-on exercises that give students an opportunity to practice what they've learned in an interactive notebook environment. The course applies to Spark 2.4, but also introduces the Spark 3.0 Adaptive Query Execution framework.

[DATE: Month START - END, YEAR] Virtual Classroom, [APAC, EMEA, AMER] 9:00 - 17:00 (TIMEZONE)

This four-day hands-on training course delivers the key concepts and knowledge developers need to use Apache Spark to develop high-performance, parallel applications on the Cloudera Data Platform (CDP). Hands-on exercises allow students to practice writing Spark applications that integrate with CDP core components. Participants will learn how to use Spark SQL to query structured data, how to use Hive features to ingest and denormalize data, and how to work with “big data” stored in a distributed file system. After taking this course, participants will be prepared to face real-world challenges and build applications that enable faster decisions, better decisions, and interactive analysis across a wide variety of use cases, architectures, and industries.

What you'll learn
During this course, you will learn how to:
Distribute, store, and process data in a CDP cluster
Write, configure, and deploy Apache Spark applications
Use the Spark interpreters and Spark applications to explore, process, and analyze distributed data
Query data using Spark SQL, DataFrames, and Hive tables
Deploy a Spark application on the Data Engineering Service

What to expect
This course is designed for developers and data engineers. All students are expected to have basic Linux experience and basic proficiency with either the Python or Scala programming language. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required.

2024-05-7 Virtual Classroom 9:00 - 17:00 (GMT+1)

This four-day hands-on training course delivers the key concepts and knowledge developers need to use Apache Spark to develop high-performance, parallel applications on the Cloudera Data Platform (CDP). Hands-on exercises allow students to practice writing Spark applications that integrate with CDP core components. Participants will learn how to use Spark SQL to query structured data, how to use Hive features to ingest and denormalize data, and how to work with “big data” stored in a distributed file system. After taking this course, participants will be prepared to face real-world challenges and build applications that enable faster decisions, better decisions, and interactive analysis across a wide variety of use cases, architectures, and industries.

What you'll learn
During this course, you will learn how to:
Distribute, store, and process data in a CDP cluster
Write, configure, and deploy Apache Spark applications
Use the Spark interpreters and Spark applications to explore, process, and analyze distributed data
Query data using Spark SQL, DataFrames, and Hive tables
Deploy a Spark application on the Data Engineering Service

What to expect
This course is designed for developers and data engineers. All students are expected to have basic Linux experience and basic proficiency with either the Python or Scala programming language. Basic knowledge of SQL is helpful. Prior knowledge of Spark and Hadoop is not required.

March 12 - 15, 2024 Virtual Classroom, EMEA 9:00 - 17:00 (Paris Time)

Apache Ozone is the next-generation hybrid storage service offering versatility and out-of-the-box compatibility. Ozone is an object store that overcomes the limitations of HDFS. This course teaches architecture, internal operations, installation, file system usage, best practices, security, maintenance, monitoring, tuning, and testing.

What you'll learn
This course teaches the Ozone internal architecture and how to install, use, maintain, monitor, tune, integrate, and test the Ozone service in a secure environment. Participants will gain the following skills:
Understanding the Benefits of Using Ozone
Installing and Configuring Secure Ozone
Managing Files and Objects in Ozone
Performance Tuning and Doing Baseline Tests
Controlling Replication and Understanding Failover and Recovery
Performing Maintenance Tasks
Monitoring Ozone Using Recon Service
Integrating Hive, Impala, Spark, NiFi, and Flink with Ozone
Migrating Data from HDFS to Ozone

What to expect
This advanced course is for administrators who are currently using CDP Private Cloud Base. The course will also appeal to technicians, such as data engineers and application developers, who are migrating data and applications to Apache Ozone. Prior experience with Cloudera Data Platform, including HDFS, YARN, and Hive, is expected. Students must have access to the Internet to reach the classroom environments, which are located on Amazon Web Services.

12th-15th March 2024 Virtual Classroom, APAC 9:00 - 17:00 IST
