Topic
#reinforcement-learning
2 articles exploring reinforcement-learning. Expert insights and analysis from our editorial team.
Articles
Models & Research
Learning, Fast and Slow: What arXiv 2605.12484 Proposes for LLMs That Adapt Continually
Fast-Slow Training splits LLM updates into prompt fast weights and parametric slow weights, cutting KL drift by 70% and improving sample efficiency 3× while preserving plasticity.
Models & Research
Fixed Entropy Coefficients Break Down on Mixed-Difficulty Tasks: What AER Means for Teams Running LLM RL at Scale
Static entropy regularization in GRPO underperforms on mixed-difficulty tasks. Difficulty-aware allocation closes the gap by 7–10 points on pass@1 without extra compute.