How AI Learned to Feel | 75 Years of Reinforcement Learning

Published 2024-07-31

I follow the history of model-free RL, from learning tic-tac-toe, checkers, and backgammon to physical problems such as cart and pole, walking, and grasping (OpenAI's dexterous robotic hand). I explain value functions, Q-functions, and policy functions and how they work together, including how TD learning was used.

Thanks to Jane Street for sponsoring this video. They are hiring people interested in ML! To learn more about their work and open roles (and support me), visit their website: www.janestreet.com/machine-learning/?utm_source=yt…

Along the way, we'll encounter the challenges of transferring simulated skills to the real world (domain randomization) and witness the emergence of eerily human-like behaviors in AI agents. It leaves us with a provocative question: where is the line between actions and words? What is the role of a GPT for actions?
Featuring insights from:
Claude Shannon
Arthur Samuel
Gerald Tesauro
Richard Sutton
David Silver
DeepMind, OpenAI, etc.

00:00 - Introduction
00:32 - Learning Tic Tac Toe
02:00 - Learning Cart and pole
04:20 - Shannon & Chess
06:50 - Samuel's Checkers
09:25 - TD Gammon (Gerald Tesauro)
11:00 - TD Learning
14:30 - Learning Atari (DQN)
17:28 - Direct Policy Gradient
19:40 - Domain Randomization

All Comments (21)

@ncolmt yesterday

the way you introduce the REAL AI to the world, Nice job
@belibem yesterday

Seems like reinforcement learning's been on a wild trip since forever, but the way Brit breaks it down? It's like he's got a secret map of the RL universe. He makes the crazy journey from old-school ideas to today's stuff actually make sense. It's like watching history unfold, but you know, without falling asleep!
@ArtOfTheProblem yesterday

I hope you enjoy this video; please let me know what you think below. 👇 STAY TUNED & SUBSCRIBE: Next video on Road to AGI + new topics. Please LIKE/SHARE in your network to help AOP grow. FULL AI series: youtube.com/playlist?list=PLbg3ZX2pWlgKV8K6bFJr5dh… Thanks Jane Street for sponsoring. They are hiring people interested in ML: www.janestreet.com/machine-learning/?utm_source=yt… SUPPORT AOP: www.patreon.com/artoftheproblem
@rickandelon9374 yesterday

Hands down the best AI history channel in the world
@brainmuffins6052 yesterday

I reinforced my positive behavior in watching this video with plenty of ice cream.
@jonathonreed2417 yesterday

Another great video. It's super interesting to see that DeepMind is attempting to figure out how much real-world learning vs. simulated learning is optimal while LLM researchers are simultaneously asking questions about the use of "synthetic data". Naively (if the "synthetic data" approach proves successful at scale), it seems to vaguely point towards a further generalization in the machine learning field. I think a great follow-up video to this one would be about multimodal models, and maybe at the end discuss the idea of synthesizing this robotic action model with something like ChatGPT. Or maybe not, just spitballing. EDIT: just read your pinned comment, seems like you're already a few steps ahead of me on this, not surprised.
@timl2k11 yesterday

The music that starts @ 25:40 provides a nice transition and nicely conveys the future potential of the technology.
@TheLoneCone yesterday

Seeing this video at 466 views currently and shocked it doesn’t have hundreds of thousands if not millions. Awesome video
@Ayel-wl4ix 22 hours ago

Omg it took so much to make machines to this level. Their patience and big brain 😮
@princetonpoh4637 yesterday

I really liked the historical perspective on how RL started. It helps stair-step my way up to modern day concepts :)
@AdamJeffries-r4f 23 hours ago

Thank you for the credit at the end. You compressed the data well and thus, the info regarding the value function was more easily understood, in my opinion.
@notgaybear5544 yesterday

Another amazingly lucid video. Thank you! By the end it feels like we're just getting started.
@jimlbeaver yesterday

Really great video! Awesome summary of the history of RL. Very clear. Nice job.
@Dr.Menendez yesterday

All your videos are excellent. Congratulations.
@vinniepeterss yesterday

great video as always!
@shawnbibby yesterday

lucky me for this video today!
@77batering yesterday

I love this channel so much I only wish you made videos faster but it's always such engaging content I can see why it takes a while
@KalebPeters99 yesterday

Awesome stuff!! I just love the way you explain things 🙏💕 I feel like I'm closer than ever to actually understanding AI 😅😅
@maryjanecruise1674 yesterday

Brilliant work!
@kingdodongo4126 18 hours ago

You have a magical ability to explain with such eloquence and clarity that you make me feel intelligent. All that lead up to the moment (and also from your previous videos) when you explain Domain randomization 19:45 “you actually need less precise simulation” that realization felt like an explosion in my mind. Thanks for your channel man