使用 OpenCode 进行 RL 后训练的自动化 Pipeline,包含两阶段分离验证(anti-reward-hacking)和断点续跑支持。
[!CAUTION] This plugin uses OpenAI's official OAuth authentication (the same method as OpenAI's official Codex CLI) for personal development use with your ChatGPT Plus/Pro subscription.