Topic

#reasoning-faithfulness

1 article exploring reasoning-faithfulness. Expert insights and analysis from our editorial team.

Articles

Models & Research

The Last Word Often Wins: A Format Confound Inflates Chain-of-Thought Corruption Robustness Scores

A format confound in chain-of-thought (CoT) corruption benchmarks means published faithfulness scores are inflated: suffix sensitivity collapsed 19× when final-answer text was stripped.