Topic

#gpqa

One article exploring GPQA. Expert insights and analysis from our editorial team.

Articles

Agents & Frameworks

ml-intern's 32% GPQA Gain on a Single H100 Exposes the Assumption That Post-Training Still Needs a Human ML Researcher

ml-intern hit 32% on GPQA in under 10 hours, beating Claude Code's 22.99% on the same task, but a 51% instruction-tuned ceiling marks a gap the autonomous loop cannot close.