Google’s new Android Bench ranks the top AI models for Android coding, with Gemini 3.1 Pro Preview leading Claude Opus 4.6 and GPT-5.2-Codex.
GPT-5.4 is also more reliable, producing 18% fewer errors and 33% fewer false claims than GPT-5.2, according to OpenAI.