Generative AI Advance Fine-Tuning for LLMs
Instructors: Joseph Santarcangelo and 3 others
There are 2 modules in this course
You'll explore advanced fine-tuning techniques for causal LLMs, including instruction tuning, reward modeling, and direct preference optimization (DPO). You'll learn how LLMs act as probabilistic policies for generating responses and how to align them with human preferences using tools such as Hugging Face. You'll dive into reward calculation, reinforcement learning from human feedback (RLHF), proximal policy optimization (PPO), the PPO trainer, and optimal strategies for DPO. The hands-on labs provide real-world experience with instruction tuning, reward modeling, PPO, and DPO, giving you the tools to confidently fine-tune LLMs for high-impact applications.

Build job-ready generative AI skills in just two weeks. Enroll today and advance your career in AI!
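To give a concrete flavor of the kind of workflow the labs cover, here is a minimal DPO fine-tuning sketch using Hugging Face's trl library. The model name, dataset, and hyperparameters below are illustrative assumptions, not the course's actual lab code, and the exact DPOTrainer argument names vary across trl releases.

```python
# Minimal DPO sketch with Hugging Face trl (illustrative assumptions only;
# model, dataset, and hyperparameters are not taken from the course labs).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "Qwen/Qwen2-0.5B-Instruct"  # assumed small instruct model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference data: each example pairs a prompt with a "chosen" and a
# "rejected" response, which is what the DPO loss compares.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

# beta controls how far the tuned policy may drift from the reference model.
args = DPOConfig(output_dir="dpo-model", beta=0.1)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # called `tokenizer=` in older trl versions
)
trainer.train()
```

Unlike PPO-based RLHF, DPO needs no separately trained reward model at training time: the preference pairs themselves supply the learning signal, which is why a short script like this can stand in for a full RLHF pipeline in simple cases.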
Fine-Tuning Causal LLMs with Human Feedback and Direct Preference Optimization