Direct Preference Optimization - Search Videos

Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math

YouTube · Umar Jamil

In this video I will explain Direct Preference Optimization (DPO), an alignment technique for language models introduced in the paper "Direct Preference Optimization: Your Language Model is Secretly a Reward Model". I start by introducing language models and how they are used for text generation. After briefly introducing the topic of AI ...

36K views · Apr 14, 2024

Direct Preference Optimization Tutorial

Direct Preference Optimization Math

YouTube · LEARNSECTOR

74 views · 1 month ago

Direct Preference Optimization (DPO) in 1 hour

YouTube · Zachary Huang

2.8K views · 8 months ago

Direct Preference Optimization (DPO) Explained | Train AI with Human Feedback

YouTube · Tech Pulse Labs

4 views · 1 month ago

Top videos

Direct Preference Optimization (DPO): Your Language Model is Secretly a Reward Model Explained

YouTube · Gabriel Mongaras

19.4K views · Aug 10, 2023

Direct Preference Optimization (DPO) | Paper Explained

2.1K views · 5 months ago

Direct Preference Optimization (DPO) - How to fine-tune LLMs directly without reinforcement learning

YouTube · Luis Serrano Academy

33.4K views · Jun 21, 2024

Direct Preference Optimization Applications

Direct Preference Optimization (DPO) Explained: AI Alignment

YouTube · VLR Software Training

13 views · 5 months ago

Why Direct Preference Optimization ! Your LLM is Secretly a Reward Model. #ai #llm #researchpaper

YouTube · Tamil AI Hub

857 views · 1 month ago

W12L53: Direct Preference Optimization (DPO)

YouTube · IIT Madras - B.S. Degree Programme

1.3K views · 9 months ago

Direct Preference Optimization: Your Language Model is Secretly a Reward Model | DPO paper explained

40.4K views · Dec 22, 2023

YouTube · AI Coffee Break with Letitia

Direct Preference Optimization (DPO) explained： Bradley-Terry model, lo…

222 views · May 5, 2025

bilibili · yaojingguo

Direct Preference Optimization (DPO) explained + OpenAI Fine-tuning exam…

831 views · Dec 26, 2024

YouTube · Simeon Emanuilov

Hands-on 10: Large Language Model Alignment with Direct Preference Opt…

3.8K views · 10 months ago

YouTube · BrainOmega

Lecture 40 : Aligning to User Preferences via Direct Preference Op…

467 views · 8 months ago

YouTube · NPTEL IIT Kharagpur

RLHF Explained (and DPO!)

18K views · Jun 12, 2024

YouTube · Mark Hennings

Aligning LLMs with Human Preferences

9 views · 3 months ago

YouTube · The AI Opus

How does DPO improve the LLM's performance? | Simple Explanation

213 views · Jan 29, 2025

DPO Coding | Direct Preference Optimization (DPO) Code implement…

445 views · Mar 19, 2025

YouTube · AILinkDeepTech

nlPUG Reading Group (April 2025) - Direct Preference Optimization

14 views · 2 months ago

Teach AI to Be Nice (DPO vs. RLHF) 😇

117 views · 2 months ago

YouTube · BookSpokify

Diffusion Model Alignment Using Direct Preference Optimization

50 views · 4 months ago

bilibili · dalaska的欢愉

AI Model Secrets: DPO, RLHF, and Model Merging Explained! #shorts

67 views · 6 months ago

YouTube · FranksWorld of AI

Lec 10 | Reinforcement Learning from Human Feedback: Part 04

363 views · 7 months ago

Stanford CME295 L-4 LLM Training in 2 Min

2 views · 1 week ago

YouTube · TenMinuteTakeaway

Aligning to User Preferences via Direct Preference Optimization #swayampr…

YouTube · CH 19: IIT BOMBAY 03: Electrical Engineering

21. Direct Preference Optimization (DPO) (Rafailov et al., 2023)

26 views · 6 months ago

YouTube · LOADING_

Direct Preference Optimization

820 views · Apr 9, 2024

YouTube · Data Science Gems

DPO : Direct Preference Optimization

340 views · Jun 20, 2024

YouTube · Dhiraj Madan

What DPO Really Is (and What It Assumes) #ml #ai #coding #data #in…

66 views · 3 months ago

YouTube · Neurons Decoded
