GPT-5.4 Can Use Your Computer Better Than You

What Happened

OpenAI just shipped GPT-5.4, and it's the first model to beat the human baseline at using a computer, at least on one benchmark. On OSWorld, which measures computer use tasks, GPT-5.4 scores 75%. Humans score 72%.

This isn't a chatbot upgrade. GPT-5.4 combines reasoning, coding, and agentic workflows into a single frontier model. It's available in ChatGPT (as GPT-5.4 Thinking and GPT-5.4 Pro) and in the API alongside Codex.

Why This Matters

Computer use is the bridge between "AI that talks" and "AI that does." When a model can navigate interfaces, fill out forms, extract data from dashboards, and chain together multi-step workflows across applications, it stops being a tool and starts being a coworker.

For orchestrators, this changes the game. You're no longer limited to API integrations. You can now build agents that interact with any software that has a screen — including legacy systems with no API at all.

The Numbers

75%: GPT-5.4's score on the OSWorld computer use benchmark
72%: human score on the same benchmark
3 points: GPT-5.4's margin over the human score
Beyond the benchmark, it's the first model to credibly handle coding, computer use, and knowledge work at frontier level.

What Orchestrators Should Do

If you're building agent workflows, test GPT-5.4's computer use capabilities against your existing tool-calling pipelines on the same tasks. In many cases, screen-based interaction may be simpler to build and maintain than custom API integrations, especially for tools that update their APIs frequently or don't have them at all. Measure reliability on your own workflows before you switch.
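
One way to run that comparison is a small harness that feeds the same tasks to both pipelines and reports each one's success rate. The sketch below is a minimal Python version under stated assumptions: Task, success_rate, compare, and the two stub pipelines are hypothetical names, not part of any OpenAI API. The stubs stand in for your real tool-calling pipeline and your real screen-based agent, and the rates they produce are illustrative, not benchmark results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """One workflow to attempt, plus a check that decides if the result is acceptable."""
    name: str
    payload: dict
    is_correct: Callable[[object], bool]


def success_rate(run: Callable[[dict], object], tasks: List[Task]) -> float:
    """Fraction of tasks whose result passes the task's own check.

    An exception raised by the pipeline counts as a failed task.
    """
    if not tasks:
        raise ValueError("need at least one task")
    passed = 0
    for task in tasks:
        try:
            if task.is_correct(run(task.payload)):
                passed += 1
        except Exception:
            pass  # a crash is a failure, not a reason to stop the comparison
    return passed / len(tasks)


def compare(pipelines: Dict[str, Callable[[dict], object]],
            tasks: List[Task]) -> Dict[str, float]:
    """Run every pipeline on the same task list and return its success rate."""
    return {name: success_rate(run, tasks) for name, run in pipelines.items()}


# Hypothetical stand-ins. Replace these with calls into your real pipelines.
def tool_calling_pipeline(payload: dict) -> int:
    return payload["a"] + payload["b"]


def screen_agent_pipeline(payload: dict) -> int:
    # Simulated failure mode: the agent cannot find a UI element on some inputs.
    if payload["a"] > 10:
        raise RuntimeError("element not found")
    return payload["a"] + payload["b"]


tasks = [
    Task("small", {"a": 1, "b": 2}, lambda result: result == 3),
    Task("large", {"a": 20, "b": 5}, lambda result: result == 25),
]

results = compare(
    {"tool_calling": tool_calling_pipeline, "screen_agent": screen_agent_pipeline},
    tasks,
)

for name, rate in results.items():
    print(f"{name}: {rate:.0%}")
```

Scoring each task with its own check keeps the harness indifferent to how a pipeline gets its answer, so an API call and a screen-driven run are judged the same way. With real pipelines you would also want many more tasks and repeated runs, since agent behavior can vary from one attempt to the next.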

The age of screen-aware agents is here. Learn to orchestrate them.