Machine Learning with Apache Spark

This course is part of multiple programs.

Instructors: IBM Skills Network Team +1 more

What you'll learn

  •   Describe ML, explain its role in data engineering, summarize generative AI, discuss Spark's uses, and analyze ML pipelines and model persistence.
  •   Evaluate ML models, distinguish between regression, classification, and clustering models, and compare data engineering pipelines with ML pipelines.
  •   Construct data analysis processes using Spark SQL, and perform regression, classification, and clustering using SparkML.
  •   Connect to Spark clusters, build ML pipelines, perform feature extraction and transformation, and persist models.

Skills you'll gain

  •   Supervised Learning
  •   Data Transformation
  •   Extract, Transform, Load
  •   Data Processing
  •   Regression Analysis
  •   Apache Hadoop
  •   Generative AI
  •   Classification And Regression Tree (CART)
  •   Machine Learning
  •   Data Pipelines
  •   PySpark
  •   Apache Spark
  •   Applied Machine Learning
  •   Predictive Modeling
  •   Unsupervised Learning

There are 4 modules in this course

    Start by learning ML fundamentals, then use Apache Spark to build and deploy ML models for data engineering applications. Explore supervised and unsupervised learning techniques and the possibilities of generative AI through instructional readings and videos. Gain hands-on experience with Spark Structured Streaming, develop an understanding of data engineering and ML pipelines, and learn to evaluate ML models using SparkML. In practical labs, you'll use SparkML for regression, classification, and clustering to construct prediction and classification models. You'll connect to Spark clusters, analyze datasets with Spark SQL, perform ETL activities, and create ML models using SparkML and scikit-learn. Finally, you'll demonstrate your skills in a final assignment. This intermediate course is suitable for aspiring and experienced data engineers, as well as working professionals in data analysis and machine learning. Prior knowledge of Big Data, Hadoop, Spark, Python, and ETL is highly recommended.

    Machine Learning with Apache Spark

    Data Engineering for Machine Learning using Apache Spark

    Final Project

    ©2025 ementorhub.com. All rights reserved.