[Agentic RL / Reinforcement Learning / OPD] OpenClaw-RL Source Code Reading Notes (2): On-Policy Distillation - 罗西的思考
Contents
0x00 Overview
0x00 OpenClaw-RL OpenClaw-RL / OpenClaw-RL Online RL <ul><li>openclaw-rl: Binary RL / GRPO</li><li>openclaw-opd: On-Policy Distillation (OPD)</li><li>openclaw-combine PPO RL reward OPD teacher signal… [+15560 chars]