The big headlines on this release are efficiency, with OpenAI reporting that GPT-5.4 uses far fewer tokens (47% fewer on some tasks) than its predecessors).
A benchmark called OSWorld-Verified, designed to monitor AI's ability to navigate desktop environments, found that GPT 5.4 scored 75%, up from 47.3% with its GPT 5.2 model. That also beats the average ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results