Encoding Recurrence into Transformers
Huang, Lu, Cai, Qin, Fang, Tian, Guodong Li (HKU) · 2023
L4.1 · Foundation Model Tech Stack · ICLR 2023 Oral · #architecture
CORE IDEA
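REM (recurrence encoding matrix) + RSA (self-attention with recurrence): encodes an RNN's recurrent dynamics into attention in a lightweight way, giving better sample efficiency.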
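A minimal sketch of how the core idea could look in code, not the authors' implementation. It assumes a simplified single-head form in which a gate `sigmoid(mu)` mixes ordinary causal softmax attention with a fixed "regular" REM whose entries decay as `lam ** (i - j)` below the diagonal. The names `regular_rem`, `rsa`, `lam`, and `mu` are hypothetical, and the exact REM parameterization in the paper may differ.

```python
import math


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def causal_softmax(scores):
    """Row-wise softmax where position i only attends to positions j <= i."""
    out = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]
        m = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in visible]
        total = sum(exps)
        out.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return out


def regular_rem(n, lam):
    """Assumed REM form: P[i][j] = lam ** (i - j) for j < i, else 0."""
    return [[lam ** (i - j) if j < i else 0.0 for j in range(n)]
            for i in range(n)]


def rsa(q, k, v, lam, mu):
    """Assumed RSA form: ((1 - g) * softmax(QK^T / sqrt(d)) + g * P) V, g = sigmoid(mu)."""
    n, d = len(q), len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    attn = causal_softmax(scores)
    rem = regular_rem(n, lam)
    gate = 1.0 / (1.0 + math.exp(-mu))  # share of weight given to the recurrent prior
    mixed = [[(1.0 - gate) * attn[i][j] + gate * rem[i][j] for j in range(n)]
             for i in range(n)]
    return matmul(mixed, v)


# Toy usage: 3 tokens, 1-dimensional queries, keys, and values.
q = [[0.0], [0.0], [0.0]]
k = [[0.0], [0.0], [0.0]]
v = [[1.0], [2.0], [3.0]]
out = rsa(q, k, v, lam=0.5, mu=0.0)
```

In this sketch the REM adds only two scalars per head (`lam` and `mu`), which is one way to read "lightweight": the recurrent prior is a fixed-shape positional matrix rather than a separate RNN layer.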
L-ANCHOR · Why it matters at this layer
architectural prior anchor for RQ1