Upcoming Sessions
ILT - Apache Spark Application Performance Tuning Workshop - 3704807
Starting: 2024/12/10 @ 08:00 AM (GMT+00:00) UTC
Ending: 2024/12/12 @ 04:00 PM (GMT+00:00) UTC
Type: Multi-day Session
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
Cloudera University’s Data Analyst Training course will teach you to apply traditional data analytics and business intelligence skills to big data. This course presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages. Apache Hive makes transformation and analysis of complex, multi-structured data scalable in Cloudera environments. Apache Impala enables real-time interactive analysis of the data stored in Hadoop using a native SQL environment. Together, they make multi-structured data accessible to analysts, database administrators, and others without Java programming expertise.
Course Length
This course includes 7 hours of video content, plus 2 hours of exercise review. Hands-on exercises will take approximately 10.5 hours.
Audience and Prerequisites
This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Some knowledge of SQL is assumed, as is basic Linux command-line familiarity. Prior knowledge of Apache Hadoop is not required.
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
This course delivers the key concepts and expertise developers need to use Apache Spark to develop high-performance parallel applications. Participants will learn how to use Spark SQL to query structured data and Spark Streaming to perform real-time processing on streaming data from a variety of sources. Developers will also practice writing applications that use core Spark to perform ETL processing and iterative algorithms. The course covers how to work with “big data” stored in a distributed file system and how to execute Spark applications on a Hadoop cluster. After taking this course, participants will be prepared to face real-world challenges and build applications that support faster decisions, better decisions, and interactive analysis across a wide variety of use cases, architectures, and industries.
Course Length
This course includes over 5 hours of video content. Hands-on exercises may take up to 10 hours to complete.
Audience and Prerequisites
This course is designed for developers and engineers who have programming experience. Prior knowledge of Spark and Hadoop is not required. Apache Spark examples and hands-on exercises are presented in Scala and Python, and the ability to program in one of those languages is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
This course provides an introduction to the Scala language and the functional programming paradigm.
Course Length
This course includes 2 hours of video content. Hands-on exercises can take about 2.25 hours to complete.
Audience and Prerequisites
This course is intended for students who need to become familiar with Scala before progressing to one of the Cloudera developer classes. Familiarity with Java programming, object-oriented programming, and basic computer science concepts is suggested in order to get the most out of this course. However, the class does not refer specifically to big data, data analytics, Cloudera software, or Hadoop, so no experience in those areas is needed.
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
This course will teach you the key language concepts and programming techniques you need so that you can concentrate on the subjects covered in Cloudera's developer courses without also having to learn a complex programming language and a new programming paradigm on the fly.
Course Length
This module includes 2.5 hours of video content. Hands-on exercises may take up to 3 hours.
Audience and Prerequisites
This course is intended for students who need to become familiar with Python before progressing to one of the Cloudera developer classes. Prior knowledge of Hadoop is not required. Because this course is intended for developers who do not yet have the prerequisite skills for writing code in Python, basic programming experience in at least one commonly used programming language (ideally Java, but Ruby, Perl, Scala, C, C++, PHP, or JavaScript will suffice) is assumed. This course does not teach big data concepts, nor does it cover how to use Cloudera software. Instead, it is meant as a precursor to one of our developer-focused training courses, which provide those skills.
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
Whether you’re building big data applications, developing data pipelines, or working on machine learning projects, it’s essential to manage changes to your code. Although developers and data scientists have employed a variety of tools for this over the years, an open source version control system called Git has emerged as the standard tool for thousands of organizations around the world. This course introduces students to the Git version control system through a series of lectures, demonstrations, and hands-on exercises.
Course Length
This module includes over an hour of video content. Hands-on exercises may take an additional 3 hours.
Audience and Prerequisites
This course is best suited to developers and data scientists who feel comfortable performing basic operations from the Linux command line. No prior experience with Git or other revision control systems is necessary.
Note: Enrolling here will not give you access to the actual course. The course is available by purchasing the Full OnDemand Library subscription.
About This Course
This course provides the fundamental concepts and experience necessary to automate the ingest, flow, transformation, and egress of data using Apache NiFi. Along with gaining a grasp of the key features, concepts, and benefits of NiFi, participants will create and run NiFi dataflows for a variety of scenarios. Students will gain expertise using processors, connections, and process groups, and will use NiFi Expression Language to control the flow of data from various sources to multiple destinations. Participants will monitor dataflows, examine progress of data through a dataflow, and connect dataflows to external systems such as Kafka and HDFS. After taking this course, participants will have key knowledge and expertise for configuring and managing data ingestion, movement, and transformation scenarios for the enterprise.
Course Length
This module includes 4 hours of video content. Hands-on exercises will take approximately 9 hours.
Audience and Prerequisites
This course is designed for Developers, Data Engineers, Data Scientists, and Data Stewards. It provides a no-code, graphical approach to configuring real-time data streaming, ingestion, and management solutions for a variety of use cases. Though programming experience is not required, basic experience with Linux is presumed. Exposure to big data concepts and applications is helpful.