Reinforcement Learning LLM - Search Videos

Distributed RL training for LLM explained part 1

MSN · Deep Learning with Yacine

An introduction to distributed reinforcement learning for large language models covering core concepts, training setup, and why scaling matters. #AI #MachineLearning #LLM

Deep Reinforcement Learning

Lecture 14 | Deep Reinforcement Learning

YouTube · Stanford University School of

385.2K views · Aug 11, 2017

Grokking Deep Reinforcement Learning - Miguel Morales

Understanding Reinforcement Learning Environment and Rewards

47.4K views · Apr 1, 2019

Top videos

A new short course on Reinforcement Learning from Human Feedback (RLHF), built in collaboration with Google Cloud, is live now! 🚀 Large language models (LLMs) are trained on human-generated text, but additional methods are needed to align an LLM with human values and preferences, making them more helpful, honest, and safe. Reinforcement Learning from Human Feedback (RLHF) is a useful technique to address this issue by aligning LLMs with human values, whether you’re training an LLM from scratch

Facebook · DeepLearning.AI

1.2K views · Dec 13, 2023

MDPs and Reinforcement Learning for LLM Agents

YouTube · BlackBoard AI

5 views · 3 months ago

Reinforcement Learning for LLM Reasoning. RL / RLHF / RLAIF.

YouTube · Byte Goose AI.

185 views · 6 months ago

Reinforcement Learning Tutorial

Reinforcement Learning Tutorial | Reinforcement Learning Example Using Python | Edureka

YouTube · edureka!

133.7K views · Jan 10, 2019

Python Reinforcement Learning Tutorial for Beginners in 25 Minutes

YouTube · Nicholas Renotte

68.1K views · Mar 10, 2021

Reinforcement Learning in 3 Hours | Full Course using Python

YouTube · Nicholas Renotte

529.4K views · Jun 6, 2021

Proximal Policy Optimization (PPO) - How to train Large Language Models

83.3K views · Jan 24, 2024

YouTube · Luis Serrano Academy

Reinforcement Learning (RL) for LLMs

13.9K views · Mar 12, 2025

YouTube · Natasha Jaques

Reinforcement Learning with Human Feedback (RLHF) - How to train and fine-tune Transformer Models

34.8K views · Feb 12, 2024

YouTube · Luis Serrano Academy

[UCLA RL-LLM] Reinforcement Learning of Large Language Models

698 views · 4 months ago

bilibili · runningteeth

[UCLA RL-LLM] Chapter 3.2: Reinforcement learning with verifiable rewards (RLVR)

3.6K views · 10 months ago

YouTube · Ernest Ryu

LLMs explained (Part 6): Smarter AI through Reinforcement Learning

GRPO: The Reinforcement Learning Trick That Changed Everything

156 views · 5 months ago

YouTube · mathtartic

A new path for LLM fine-tuning — without gradients or Reinforcement Learning

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with Predibase, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead. Reasoning models have been one of the most important developments in LLMs. Reinforcement Fine-Tuning (RFT) uses rewards to encourage LLMs to find solutions to multi-step reasoning tasks such as solving

38.8K views · 11 months ago

Facebook · Andrew Ng

Reinforcement Learning in the Era of LLMs

1.8K views · Mar 13, 2024

YouTube · Arize AI

Reinforcement Learning Foundations Online Class | LinkedIn Learning, formerly Lynda.com

Reinforcement Learning for LLMs in 2025

15.6K views · Feb 10, 2025

YouTube · Trelis Research

Master LLM Training with Reinforcement Learning

13 views · 2 weeks ago

YouTube · Github Signals

I Trained an LLM to Think Deeper (Here's How)

12.6K views · Feb 24, 2025

YouTube · Adam Lucek

Get Started with Reinforcement Learning on Azure Machine Learning

Microsoft · markdefalco

Free Course: Training & Finetuning LLMs

97K views · Oct 5, 2023

YouTube · Weights & Biases

Reinforcement Learning | Course | Stanford Online

Deep Dive into LLMs like ChatGPT

6.2M views · Feb 5, 2025

YouTube · Andrej Karpathy

Stabilizing Reinforcement Learning for LLMs

24 views · 5 months ago

YouTube · AI Research Roundup

Why Reinforcement Learning Unlocks Reasoning in LLMs (Aha Moments Explained)

2.3K views · 4 months ago

YouTube · AI Papers Academy

ERL: Improving LLM Training via Self-Reflection

44 views · 2 months ago

YouTube · AI Research Roundup

Deep Reinforcement Learning

deepmind.google

Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)

34.5K views · Sep 3, 2023

YouTube · Yannic Kilcher

Reinforcement Learning in Finance: Resources and Expert Advice from Paul Bilokon

Direct Preference Optimization (DPO) - How to fine-tune LLMs directly without reinforcement learning

31.3K views · Jun 21, 2024

YouTube · Serrano.Academy

What are RLVR environments for LLMs? | Policy, rollouts & rubrics explained

MSN · Deep Learning with Yacine
