GPT-5.5.
Faster, more efficient, and twice the price. Test before switching.
Originally posted on Substack
OpenAI shipped GPT-5.5 this week. Faster, more autonomous, and 2x the per-token price.
The headline is efficiency. It needs less hand-holding on multi-step tasks. Broader task descriptions get usable outputs. Less prompt engineering, fewer turns to completion.
The catch is price. Double per token. The pitch is that fewer tokens get consumed because the model gets there quicker. Maybe. Depends on the workload.
Don’t switch defaults yet
Everyone I know flipped their default to GPT-5.5 the day it dropped. That’s lazy.
Pick your three highest-volume tasks. Run them on GPT-5.5 and whatever you use today. Measure four things:
- Tokens used end-to-end
- Time to completion
- Output quality against a rubric you actually care about
- Cost per completed task, not per token
Per-token pricing is the wrong unit now. Per-task cost is the metric. A model that costs 2x per token is cheaper only if it finishes the task in fewer than half the tokens. At exactly half, it breaks even. A model that costs 2x per token and burns the same tokens is just more expensive.
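The per-task comparison above can be sketched in a few lines. This is a minimal sketch, not a benchmark harness: the `Run` record, the function name, the token counts, and the prices are all made-up placeholders, and the only thing taken from the post is the 2x price ratio. Plug in your own logged token counts and your provider's actual rates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One end-to-end attempt at a task (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int
    completed: bool  # True if the output passed your rubric


def cost_per_completed_task(
    runs: List[Run],
    price_in_per_mtok: float,
    price_out_per_mtok: float,
) -> float:
    """Total spend across all runs divided by the number of runs that passed.

    Failed runs still cost money, so they stay in the numerator.
    """
    total_spend = sum(
        r.input_tokens * price_in_per_mtok + r.output_tokens * price_out_per_mtok
        for r in runs
    ) / 1_000_000
    completed = sum(1 for r in runs if r.completed)
    if completed == 0:
        return float("inf")  # nothing finished, so the per-task cost is unbounded
    return total_spend / completed


# Assumed prices in dollars per million tokens. Placeholders, not real rates.
OLD_IN, OLD_OUT = 1.0, 4.0
NEW_IN, NEW_OUT = 2.0, 8.0  # 2x per token, as in the post

# Old model, one completed task.
old = cost_per_completed_task([Run(100_000, 20_000, True)], OLD_IN, OLD_OUT)
# New model finishing the same task in half the tokens.
new_half = cost_per_completed_task([Run(50_000, 10_000, True)], NEW_IN, NEW_OUT)
# New model using the same number of tokens as the old one.
new_same = cost_per_completed_task([Run(100_000, 20_000, True)], NEW_IN, NEW_OUT)

print(old, new_half, new_same)
```

With these assumed numbers, the half-token run costs the same per task as the old model, and the same-token run costs double. Dividing by completed runs rather than all runs is deliberate: a cheap model that fails the rubric half the time pays for those failures too.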
Where it wins
Coding. Data analysis. Scientific research. Anywhere the task is long, the context is dense, and the previous model needed three prompts to get to a useful answer.
I ran it against a codebase audit I normally do on Claude Opus. GPT-5.5 got there in fewer turns, with a tighter diff. Cost came out roughly even. Quality was a wash.
Where it loses
Short tasks. One-shot rewrites. Anything where the previous model already finished in one turn. You are just paying 2x for the same output.
Also anything where you have prompt scaffolding tuned for a specific model. Switching tears that up. Migration cost is real.
The bigger shift
Release cycles are now monthly. The model you picked six weeks ago is not the model you should be using this week. Tool loyalty is a tax.
Build your stack so you can swap models per task, not per company. Route the long autonomous work to whichever frontier model is leading this month. Route the cheap high-volume stuff to Haiku or whatever the small model du jour is.
If you cannot swap models in under an hour, you have already lost.
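One way to make per-task swapping cheap is to keep the task-to-model mapping in a single place, behind one call signature. The sketch below is an illustration under assumptions, not a real SDK: `ModelRouter`, `make_stub`, the model names, and the task names are all hypothetical, and the stub functions stand in for whatever vendor client calls you actually use.

```python
from typing import Callable, Dict

# A model is anything that maps a prompt to a completion.
ModelFn = Callable[[str], str]


class ModelRouter:
    """Maps task types to model names so a swap is a one-line change."""

    def __init__(self, default_model: str) -> None:
        self._models: Dict[str, ModelFn] = {}
        self._routes: Dict[str, str] = {}
        self._default_model = default_model

    def register(self, name: str, fn: ModelFn) -> None:
        """Make a model available under a name."""
        self._models[name] = fn

    def route(self, task_type: str, model_name: str) -> None:
        """Point a task type at a registered model."""
        if model_name not in self._models:
            raise KeyError(f"unknown model: {model_name}")
        self._routes[task_type] = model_name

    def run(self, task_type: str, prompt: str) -> str:
        """Send the prompt to the routed model, or the default if unrouted."""
        name = self._routes.get(task_type, self._default_model)
        return self._models[name](prompt)


def make_stub(name: str) -> ModelFn:
    """Stand-in for a real client call; it only echoes the model name."""
    return lambda prompt: f"{name}: {prompt}"


router = ModelRouter(default_model="small-model")
for model_name in ("small-model", "frontier-a", "frontier-b"):
    router.register(model_name, make_stub(model_name))

# Long autonomous work goes to this month's leader.
router.route("codebase_audit", "frontier-a")
first = router.run("codebase_audit", "audit the repo")

# Next month's re-evaluation picks a different winner: one line changes.
router.route("codebase_audit", "frontier-b")
second = router.run("codebase_audit", "audit the repo")

# Unrouted, high-volume tasks fall through to the cheap default.
fallback = router.run("one_shot_rewrite", "tighten this paragraph")

print(first, second, fallback, sep="\n")
```

The routing table is the easy part. The model-specific prompt scaffolding is where the real migration cost sits, so in practice each registered function would carry its own tuned prompts behind the same signature.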
Where this lands
GPT-5.5 is good. Not obviously worth 2x for most workloads. Test before switching.
The real takeaway is not about this model. It is about how fast the ground is moving. Stop picking a vendor. Start picking a per-task winner. Re-evaluate every month.