Announcement_18
The Ant RL technical report “Reinforcement Learning with Rubric Anchors” (extending RLVR with 10k+ rubric rewards) is now released.