About This Course
Cloudera Data Science Workbench Training prepares learners to complete data science and machine learning projects using Cloudera Data Science Workbench (CDSW).
This module includes over 4 hours of video content. Exercises are brief and require the learner to have access to a CDSW environment on a CDP cluster running Apache Spark 2.
Audience and Prerequisites
This OnDemand course is designed for learners at organizations using CDSW under a trial license or a commercial license. The learner must have access to a CDSW environment on a CDP cluster running Apache Spark 2. Some experience with data science using Python or R is helpful but not required. No prior knowledge of Spark or other Hadoop ecosystem tools is required.
By the end of this course, you will be able to:
- Navigate CDSW’s web user interface
- Create projects in CDSW and edit them according to your needs
- Edit and run Python code using the built-in editor, JupyterLab, or a third-party editor
- Load and transform data in CDSW using popular Python or R packages and Apache Spark
- Work with large-scale data using Apache Spark with PySpark
- Schedule workloads and implement a job dependency chain
- Implement a machine learning workflow, starting with analyzing and visualizing data, then training and testing a model
- Measure and track versions of a model using CDSW's Experiments capability
- Deploy models as REST API endpoints serving predictions using CDSW’s Models capability
- Create custom applications and dashboards that can be shared inside and outside your team
- Work collaboratively using CDSW together with Git
By successfully completing this course, you will earn the certificate "Cloudera OnDemand and ILT Certificate (Centered, Updated)".