Transformer on MSN | Opinion
Against the METR graph
METR’s benchmark has become a bellwether of AI capability growth, but its design isn’t up to the task, argues Nathan Witkin ...