Topic

#reinforcement-learning

Two articles on reinforcement learning, with expert insights and analysis from our editorial team.

Articles

Models & Research

Learning, Fast and Slow: What arXiv 2605.12484 Proposes for LLMs That Adapt Continually

Fast-Slow Training splits LLM updates into prompt-based fast weights and parametric slow weights, cutting KL drift by 70% and improving sample efficiency 3× while preserving plasticity.

Models & Research

Fixed Entropy Coefficients Break Down on Mixed-Difficulty Tasks: What AER Means for Teams Running LLM RL at Scale

Static entropy regularization in GRPO underperforms on mixed-difficulty tasks. Difficulty-aware allocation closes the gap by 7–10 points on pass@1 without extra compute.