Generative AI Advance Fine-Tuning for LLMs
Instructors: Joseph Santarcangelo and 3 others
There are 2 modules in this course
You'll explore advanced fine-tuning techniques for causal LLMs, including instruction tuning, reward modeling, and direct preference optimization (DPO). You'll learn how LLMs act as probabilistic policies for generating responses and how to align them with human preferences using tools such as Hugging Face. You'll dive into reward calculation, reinforcement learning from human feedback (RLHF), proximal policy optimization (PPO), the PPO trainer, and optimal strategies for DPO. The hands-on labs provide real-world experience with instruction tuning, reward modeling, PPO, and DPO, giving you the tools to confidently fine-tune LLMs for high-impact applications.

Build job-ready generative AI skills in just two weeks. Enroll today and advance your career in AI!
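To give a concrete flavor of the kind of workflow the labs cover, here is a minimal DPO fine-tuning sketch using Hugging Face's trl library. The model name, dataset, and hyperparameters below are illustrative assumptions, not the course's actual lab code, and the exact DPOTrainer argument names vary across trl releases.

```python
# Minimal DPO sketch with Hugging Face trl (illustrative assumptions only;
# model, dataset, and hyperparameters are not taken from the course labs).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "Qwen/Qwen2-0.5B-Instruct"  # assumed small instruct model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference data: each example pairs a prompt with a "chosen" and a
# "rejected" response, which is what the DPO loss compares.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

# beta controls how far the tuned policy may drift from the reference model.
args = DPOConfig(output_dir="dpo-model", beta=0.1)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # called `tokenizer=` in older trl versions
)
trainer.train()
```

Unlike PPO-based RLHF, DPO needs no separately trained reward model at training time: the preference pairs themselves supply the learning signal, which is why a short script like this can stand in for a full RLHF pipeline in simple cases.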
Fine-Tuning Causal LLMs with Human Feedback and Direct Preference Optimization