Description
About This Course
Cloudera Data Science Workbench Training prepares learners to complete data science and machine learning projects using Cloudera Data Science Workbench (CDSW).
Course Length
This course includes over 4 hours of video content. Exercises are brief and require access to a CDSW environment on a CDP cluster running Apache Spark 2.
Audience and Prerequisites
This OnDemand course is designed for learners at organizations using CDSW under a trial license or a commercial license. The learner must have access to a CDSW environment on a CDP cluster running Apache Spark 2. Some experience with data science using Python or R is helpful but not required. No prior knowledge of Spark or other Hadoop ecosystem tools is required.
Objectives
By the end of this course, you will be able to:
- Navigate CDSW’s web user interface
- Create projects in CDSW and edit them according to your needs
- Edit and run Python code using the built-in editor, JupyterLab, or a third-party editor
- Load and transform data in CDSW using popular Python or R packages and Apache Spark
- Work with large-scale data using Apache Spark with PySpark
- Schedule workloads and implement a job dependency chain
- Implement a machine learning workflow, starting with analyzing and visualizing data, then training and testing a model
- Measure and track versions of a model using CDSW’s Experiments capability
- Deploy models as REST API endpoints serving predictions using CDSW’s Models capability
- Create custom applications and dashboards that can be shared inside and outside your team
- Work collaboratively using CDSW together with Git
Certificate
By completing and passing this course, you will earn the Cloudera OnDemand and ILT Certificate (Centered, Updated).