
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model (MuZero)

Schrittwieser et al. (DeepMind) · 2019

L5.1 · Algorithmic Foundations · Nature 588 (2020) · #rl #model-based

CORE IDEA

No knowledge of the game rules is required: MuZero plans with MCTS over a learned world model.
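The core idea can be illustrated with a minimal sketch, offered as an assumption-laden toy rather than the paper's implementation. It shows planning that never calls a real environment: an observation is encoded into a hidden state, and every lookahead step uses only learned functions. The names `ToyModel`, `represent`, `dynamics`, `value`, and `plan` are hypothetical, the weights are random and untrained, the policy head is omitted, the discount is an arbitrary choice, and an exhaustive depth-limited search stands in for MCTS to keep the example short.

```python
import itertools
import math
import random


class ToyModel:
    """Hypothetical stand-in for a learned model; weights are random, not trained."""

    def __init__(self, obs_dim=4, hidden_dim=3, num_actions=2, seed=0):
        rng = random.Random(seed)
        self.num_actions = num_actions

        def matrix(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        self.w_repr = matrix(hidden_dim, obs_dim)
        self.w_dyn = [matrix(hidden_dim, hidden_dim) for _ in range(num_actions)]
        self.w_reward = matrix(num_actions, hidden_dim)
        self.w_value = [rng.uniform(-1, 1) for _ in range(hidden_dim)]

    def represent(self, obs):
        # Encode a raw observation into a hidden state.
        return [math.tanh(sum(w * o for w, o in zip(row, obs))) for row in self.w_repr]

    def dynamics(self, state, action):
        # Predict the next hidden state and the immediate reward for an action.
        reward = sum(w * s for w, s in zip(self.w_reward[action], state))
        next_state = [
            math.tanh(sum(w * s for w, s in zip(row, state)))
            for row in self.w_dyn[action]
        ]
        return next_state, reward

    def value(self, state):
        # Estimate the return obtainable from a hidden state.
        return sum(w * s for w, s in zip(self.w_value, state))


def plan(model, obs, depth=3, discount=0.99):
    """Pick the first action of the best action sequence, using the model only.

    Exhaustive search over all sequences of length `depth` (depth >= 1);
    this replaces MCTS purely for brevity.
    """
    root = model.represent(obs)
    best_return, best_action = -math.inf, None
    for seq in itertools.product(range(model.num_actions), repeat=depth):
        state, total, scale = root, 0.0, 1.0
        for action in seq:
            state, reward = model.dynamics(state, action)
            total += scale * reward
            scale *= discount
        total += scale * model.value(state)  # bootstrap from the leaf value
        if total > best_return:
            best_return, best_action = total, seq[0]
    return best_action, best_return


if __name__ == "__main__":
    action, estimate = plan(ToyModel(seed=0), [0.1, -0.2, 0.3, 0.4])
    print(action, estimate)
```

Note that `plan` receives only an observation and the model: no game rules or simulator appear anywhere in the search loop, which is the point the card is making.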

L-ANCHOR · Why it matters at this layer

A high point of model-based RL.

arXiv:1911.08265 ↗

Related papers

QuantFactor REINFORCE

DeepSeek-R1: Incentivizing Reasoning in LLMs via RL

Playing Atari with Deep Reinforcement Learning (DQN)