Google researchers introduce ‘Internal RL,’ a technique that steers an models' hidden activations to solve long-horizon tasks ...
Why reinforcement learning plateaus without representation depth (and other key takeaways from NeurIPS 2025) ...