58 points | by JnBrymn 4 days ago
1 comment
Reasonable post with a decent analogy for explaining on-policy learning; the only major thing I take issue with is saying
> Reinforcement learning is a technical subject—there are whole textbooks written about it.
and then linking to the still-WIP RLHF book instead of the book on RL: Sutton & Barto.