A Little Bit of Reinforcement Learning from Human Feedback -- Nathan Lambert
https://bsky.app/profile/natolambert.bsky.social/post/3lh5jih226k2k
Anyone interested in learning about RLHF? This text isn't complete yet, but it already looks like a pretty useful resource.