Mastering Chess and Shogi by Self-Play (AlphaZero)
Silver, et al. (DeepMind) · 2017
L5.1 · Algorithmic Foundations
arXiv:1712.01815
#rl
#self-play
CORE IDEA
A single algorithm plus self-play learns Go, chess, and shogi with no prior knowledge beyond the game rules.
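The self-play idea can be illustrated with a minimal sketch. It is not AlphaZero itself: the game is toy Nim (take 1 to 3 stones, last stone wins) rather than Go, chess, or shogi, and a uniform random policy stands in for the paper's search-guided neural network. All function names here (`legal_moves`, `uniform_policy`, `self_play_game`) are hypothetical. What it does show is the shape of the training data self-play produces: each visited position is stored with the move probabilities used and the final outcome from the point of view of the player to move.

```python
import random

def legal_moves(stones):
    # Toy Nim (an assumption for illustration): remove 1, 2, or 3 stones;
    # whoever takes the last stone wins.
    return [m for m in (1, 2, 3) if m <= stones]

def uniform_policy(stones):
    # Placeholder for a learned policy: uniform over legal moves.
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}

def self_play_game(start_stones, policy, rng):
    """Play one game against itself and return (state, probs, outcome) examples."""
    if start_stones <= 0:
        raise ValueError("start_stones must be positive")
    history = []  # (stones, player_to_move, move_probabilities)
    stones, player, last_mover = start_stones, 0, None
    while stones > 0:
        probs = policy(stones)
        moves = list(probs)
        move = rng.choices(moves, weights=[probs[m] for m in moves])[0]
        history.append((stones, player, probs))
        stones -= move
        last_mover = player
        player = 1 - player
    winner = last_mover  # the player who took the last stone
    # Label each position +1 if the player to move went on to win, else -1.
    return [(s, p, 1 if who == winner else -1) for s, who, p in history]

if __name__ == "__main__":
    for state, probs, outcome in self_play_game(10, uniform_policy, random.Random(0)):
        print(state, probs, outcome)
```

In a full system these examples would be used to update the policy, and the improved policy would then generate the next batch of games; that training step is omitted here.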
L-ANCHOR · Why it matters at this layer
General-purpose reinforcement learning for games
Related papers
QuantFactor REINFORCE · L0.3 · 2024
DeepSeek-R1: Incentivizing Reasoning in LLMs via RL · L4.2 · 2025
Q-Learning · L5.1 · 1989
Playing Atari with Deep Reinforcement Learning (DQN) · L5.1 · 2013