Our code is based on open-r1, with a customized Trainer for mixed SFT+GRPO training. Other updates focus on white-box RL (reward function design) and post-completion training (replacement ...
A new report finds that the overall college completion rate has stayed relatively steady at 61% for the past four years. The national six-year college completion rate held steady at roughly 61% this ...