Difficulty
Intermediate
Course Length
4 hours
Instructor
OnDemand Moderation
Price
Free
Description
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
Cloudera Data Science Workbench Training prepares learners to complete data science and machine learning projects using Cloudera Data Science Workbench (CDSW).
Course Length
This module includes over 6 hours of video content.
Audience and Prerequisites
This OnDemand course is designed for learners at organizations using CDSW under a trial license or a commercial license. The learner must have access to a CDSW environment on a CDP cluster running Apache Spark 2. Some experience with data science using Python or R is helpful but not required. No prior knowledge of Spark or other Hadoop ecosystem tools is required.
Objectives
By the end of this course, you will be able to:
- Navigate CDSW’s web user interface
- Create projects in CDSW and edit them according to your needs
- Edit and run Python code using the built-in editor, JupyterLab, or a third-party editor
- Load and transform data in CDSW using popular Python or R packages and Apache Spark
- Work with large-scale data using Apache Spark with PySpark
- Schedule workloads and implement a job dependency chain
- Implement a machine learning workflow, starting with analyzing and visualizing data, then training and testing a model
- Measure and track versions of a model using CDSW's Experiments capability
- Deploy models as REST API endpoints serving predictions using CDSW’s Models capability
- Create custom applications and dashboards that can be shared inside and outside your team
- Work collaboratively using CDSW together with Git