On April 23, 2026, OpenAI released GPT-5.5 — and immediately made it the default model in Codex.
But not every update actually changes anything in daily work. This one does.
Three things matter for a developer: fewer tokens for the same tasks,
the same speed as GPT-5.4, and a qualitatively new level of agentic work
on complex multi-step tasks.
This article contains specific numbers from official benchmarks, an honest comparison of where GPT-5.5
wins, and where Claude Opus 4.7 still leads, and a practical analysis of the new Thinking
and Fast modes. No hype — only what is useful for a developer to know in May 2026.
In short: GPT-5.5 is the most powerful model in Codex today.
Terminal-Bench 2.0: 82.7%. Fewer tokens for the same task. The same per-token latency as GPT-5.4.
Available for Plus, Pro, Business, Enterprise, Edu, and Go plans. But there are important nuances —
more on them below.
GPT-5.5 — a brief overview of the model
Short answer: GPT-5.5 was released on April 23, 2026 —
seven weeks after GPT-5.4 (March 2026). OpenAI calls it
"a new class of intelligence for real-world work."
According to internal classification, the model received the codename Spud.
As of May 2026, GPT-5.5 is the smartest model in the Codex ecosystem.
But it's important to understand that this is not just "GPT-5.4 with bigger numbers" —
it's a qualitatively different approach to agentic work:
the model is designed for tasks that require planning, using tools,
verifying its own solution, and continuing even if the initial plan changes.
Availability in Codex
An important nuance that was not present in previous models:
GPT-5.5 in Codex and GPT-5.5 in the API are different surfaces with different context windows.
| Parameter | Codex (App / CLI / IDE) | API |
| --- | --- | --- |
| Context window | 400K tokens | 1M tokens |
| Authorization | ChatGPT OAuth (subscription) | API key (since April 24, 2026) |
| API price | – | $5 / 1M input, $30 / 1M output |
| Available Codex plans | Plus, Pro, Business, Enterprise, Edu, Go | Pay-as-you-go |
| Fast mode | ✅ (1.5× faster, 2.5× limit cost) | Priority: 2.5× price |
Readers often confuse contexts: "GPT-5.5 supports 1M tokens" is true, but only for the API.
In the Codex surface, the limit is 400K. This must be taken into account when working with large repositories.
GPT-5.5 Variants
OpenAI released three variants within a single release:
GPT-5.5 (standard) — the default model in Codex and ChatGPT for
Plus and above. Best for most agentic coding tasks.
GPT-5.5 Thinking — a more reflective variant: more concise
and accurate answers for complex tasks where quality is important, not speed.
Available for Plus, Pro, Business, Enterprise.
GPT-5.5 Pro — the most powerful variant for the most complex tasks.
Only for Pro, Business, Enterprise in ChatGPT. Not yet separately distinguished in Codex.
What specifically has changed in Codex with GPT-5.5
Short answer: four real changes — two technical
(token efficiency and latency), one qualitative (better agentic work),
and one new (Fast mode). Each of them affects the daily workflow differently.
1. Token efficiency — ~40% fewer tokens for the same task
OpenAI has purposefully tuned Codex for GPT-5.5: the same task
requires approximately 40% fewer output tokens compared to GPT-5.4.
This is not just a marketing claim — it means that the real usage limit
for most tasks remains at the same level or decreases,
despite the higher per-token price in the API ($30 vs $15 per 1M output).
Practical illustration: if GPT-5.4 used, say, 10,000 tokens for module refactoring,
GPT-5.5 will perform the same task for ~6,000. With a 2× price per token, the real cost
increases by approximately 20%, not double.
If GPT-5.5 also requires fewer retries — it's break-even or savings.
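The arithmetic above can be checked with a quick back-of-the-envelope calculation. The token counts are the article's illustrative numbers, not measurements:

```python
# Back-of-the-envelope cost comparison using the illustrative numbers above.
# Prices are $ per 1M output tokens, as stated in the article.
PRICE_GPT_5_4 = 15.0
PRICE_GPT_5_5 = 30.0

def task_cost(tokens: int, price_per_million: float) -> float:
    """Output-token cost of one task in dollars."""
    return tokens / 1_000_000 * price_per_million

cost_old = task_cost(10_000, PRICE_GPT_5_4)  # GPT-5.4: 10K output tokens
cost_new = task_cost(6_000, PRICE_GPT_5_5)   # GPT-5.5: ~40% fewer tokens

print(f"GPT-5.4: ${cost_old:.3f}")                  # $0.150
print(f"GPT-5.5: ${cost_new:.3f}")                  # $0.180
print(f"increase: {cost_new / cost_old - 1:.0%}")   # 20%, not 100%
```

Double the per-token price times 0.6× the tokens gives a 1.2× total, which is where the "~20%, not double" figure comes from.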
For Codex subscriptions: Pro users received 2× Codex usage until May 31, 2026
as compensation during the new model's rollout.
2. Same per-token latency with higher intelligence
A common problem with the release of more powerful models is that they are slower.
GPT-5.5 is an exception: it fully matches GPT-5.4 in per-token latency
under real-world service conditions. This was made possible by joint development
with NVIDIA based on GB200/GB300 NVL72 rack-scale systems.
For the developer, this means: you get more results for the same waiting time.
Not "smarter but slower" — but "smarter and just as fast."
3. Qualitatively better agentic work — less back-and-forth
GPT-5.5 handles vague, multi-step tasks better without constant clarification.
It independently plans, selects tools, verifies its solution, and continues —
even if the initial plan had to be adjusted in the process.
OpenAI describes this as the ability to "understand what you're trying to do
faster and take on more work."
A specific example from statistics: on Expert-SWE — an internal OpenAI benchmark
where tasks have a median human execution time of 20 hours —
GPT-5.5 showed 73.1% compared to 68.5% for GPT-5.4. This is not a synthetic test,
but an approximation of a real agentic scenario.
A little-publicized fact: even before its official release, GPT-5.5 working in Codex
rewrote part of OpenAI's own production infrastructure.
Codex analyzed weeks of real traffic and wrote load balancing heuristics that increased token generation speed
by more than 20%.
The model literally helped optimize the system that serves it.
4. Fast mode — a new mode for interactive work
Along with GPT-5.5, Codex introduced Fast mode — a mode that did not exist before.
It generates tokens 1.5× faster for 2.5× the limit cost.
Purpose: an interactive feedback loop where response speed is important, not the depth
of autonomous planning. Essentially, an alternative to Codex-Spark for those who do not have a Pro plan
or need a context larger than 128K. More details on Fast mode in the next section.
Benchmarks: GPT-5.5 vs Competitors
Short answer: GPT-5.5 leads on most benchmarks
for agentic coding — but not all. Claude Opus 4.7 remains stronger
on SWE-Bench Pro. The honest table is below.
Important: Benchmarks show relative strength on standardized tasks.
Real-world performance on your project may vary.
Use the numbers as a starting point, not a verdict.
Model Comparison Table

| Benchmark | GPT-5.3-Codex | GPT-5.4 | GPT-5.5 | Claude Opus 4.7 | Gemini 3.1 Pro |
| --- | --- | --- | --- | --- | --- |
| Terminal-Bench 2.0 | 77.3% | 75.1% | 82.7% 🏆 | 69.4% | 68.5% |
| SWE-Bench Pro | 56.8% | – | 58.6% | 64.3% 🏆 | – |
| Expert-SWE (internal) | – | 68.5% | 73.1% 🏆 | – | – |
| FrontierMath T1-3 | – | – | 51.7% 🏆 | – | – |
| Graphwalks BFS >128K | – | 21.4% | 73.7% 🏆 | – | – |
| MRCR v2 at 1M tokens | – | 36.6% | 74.0% 🏆 | – | – |
What Each Benchmark Means for a Developer
Terminal-Bench 2.0 (82.7%) — most relevant for working in Codex.
According to OpenAI's official changelog, this benchmark measures an
agent's ability to perform complex CLI tasks requiring planning, iteration, and tool usage.
GPT-5.5 leads by over 13 percentage points against Claude Opus 4.7.
For autonomous terminal workflows, this is a decisive advantage.
SWE-Bench Pro (58.6%) — solving real GitHub issues.
Here, Claude Opus 4.7 with 64.3% remains ahead.
OpenAI notes that the difference may be partly due to memorization of parts of the benchmark —
but there is no independent confirmation of this.
An honest conclusion: for code review and repository reasoning, Claude Opus 4.7 is still competitive.
Expert-SWE (73.1%) — an internal OpenAI benchmark for the most complex
long-horizon tasks. The median human completion time is 20 hours.
GPT-5.5 showed 73.1% compared to GPT-5.4's 68.5%: a +4.6 pp improvement on the hardest tasks.
Graphwalks BFS at >128K (73.7% vs 21.4%) — numbers that show
how much better long-context handling has become.
GPT-5.4 sharply degraded on tasks exceeding 128K tokens.
GPT-5.5 maintains 73.7% at 256K — a qualitative change for working with large codebases.
Artificial Analysis Coding Index
An independent aggregated rating from
Artificial Analysis
(weighted average of 10 evaluations, including Terminal-Bench Hard, GPQA Diamond,
Humanity's Last Exam, SciCode, and others):
GPT-5.5 holds the top position at half the cost of competitors
among frontier coding models.
This is an external independent assessment, not from OpenAI.
GPT-5.5 Thinking and Fast mode — when to use which
Short answer: GPT-5.5 is not a single model, but three modes
with different trade-offs. Standard is for most tasks. Thinking is when deeper
reflection is needed. Fast mode is when feedback speed is important.
GPT-5.5 Thinking — what it is and when to use it
Thinking is a variant of GPT-5.5 with enhanced reflection before responding.
According to OpenAI: "unlocks faster assistance with more complex tasks,
providing more insightful and concise answers."
Available for Plus, Pro, Business, and Enterprise in ChatGPT.
In Codex, it's enabled via the model picker — selected as a separate option.
Thinking is your choice when:
An architectural decision where quality matters more than speed:
e.g., how to break down a monolithic service into microservices considering
existing dependencies
Complex debugging where the problem isn't obvious: a production error without a clear stack trace,
a race condition in async code, an unstable test that fails 1 in 10 times
Research tasks: choosing between two architectural approaches with trade-off analysis
Any task where you want to hear "why this way and not another,"
not just the finished code
Stick to standard GPT-5.5 when:
The task is clear and only the result is needed: writing tests, adding validation, refactoring a method
An autonomous task where Codex will solve sub-steps itself — Thinking would add overhead without benefit
Most regular agentic coding tasks — the standard mode is the recommended default
Fast mode — what it is and when to use it
Fast mode is a new mode in Codex that didn't exist before GPT-5.5.
According to 9to5Mac, it generates tokens 1.5× faster
for 2.5× the limit cost. It's enabled in the model picker next to the main GPT-5.5.
Essentially, Fast mode is an alternative to GPT-5.3-Codex-Spark
for those who don't have a Pro plan (Spark is Pro only, research preview)
or need a context larger than 128K tokens (Spark is limited to 128K).
Fast mode is available for all plans that have GPT-5.5 and uses the full 400K context.
Fast mode is your choice when:
Active debugging where you want a response in seconds, not minutes:
reviewing a stack trace, proposing a hypothesis, wanting to test the next one
Quick refactoring of a single method or class —
the task is small and doesn't require deep planning
Real-time code review: you're given a diff and want instant feedback
Iterating through implementation options: tried one approach, want to see an alternative —
and so on several times
Stick to standard GPT-5.5 when:
Long-term autonomous tasks — planning, feature implementation, writing tests: response speed is not critical, result quality is important
Limit is under pressure — Fast mode costs 2.5× the standard; with active use, the monthly limit will be exhausted much faster
Most tasks — standard GPT-5.5 is the recommended default, Fast mode for exceptional situations
How to Enable GPT-5.5 in Codex
Short answer: update the app and select GPT-5.5 in the model picker.
If the model isn't there yet — the rollout is gradual, that's normal. Below are the steps
for each interface and what to do while GPT-5.5 is unavailable.
According to the official changelog from May 2026:
"GPT-5.5 is the recommended choice for most tasks in Codex.
If you don't see GPT-5.5 — update your CLI, IDE extension, or Codex App to the latest version.
During the rollout, continue using GPT-5.4."
Codex App (macOS / Windows)
Update the Codex App to the latest version via the menu or App Store
Open a new thread → in the composer, find the model picker
Select GPT-5.5 from the list (or GPT-5.5 Thinking for complex tasks)
Fast mode is enabled by a separate toggle next to the model selection
Codex CLI
Launch with a specific model using a flag:
codex --model gpt-5.5
Change the model in an active thread without restarting:
/model gpt-5.5
Set GPT-5.5 as default in config.toml:
[model]
default = "gpt-5.5"
VS Code extension
Update the extension via the Extensions panel (Ctrl+Shift+X → update Codex)
In the composer — the model selector is below the input box
Select GPT-5.5; the change applies to the current and subsequent threads
JetBrains extension
Update via the JetBrains Marketplace (Settings → Plugins → Updates)
In the Codex composer — the model selector is below the input box, similar to VS Code
API (from April 24, 2026)
GPT-5.5 is available via the Responses and Chat Completions API
from April 24, 2026.
Model string: gpt-5.5. Context in API is 1M tokens.
// Example for Responses API
{
"model": "gpt-5.5",
"input": "Refactor this Spring Boot service..."
}
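The same request can be sketched from Python. This is an illustration, not an official SDK snippet: the endpoint path and headers follow standard OpenAI REST conventions, the `gpt-5.5` model string is the one stated above, and you would substitute a real API key:

```python
import json
import urllib.request

# Responses API endpoint (standard OpenAI REST convention; an assumption here)
API_URL = "https://api.openai.com/v1/responses"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a Responses API request for the gpt-5.5 model string."""
    payload = {"model": "gpt-5.5", "input": prompt}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Refactor this Spring Boot service...", "sk-...")
# urllib.request.urlopen(req)  # uncomment to actually send the request
print(json.loads(req.data)["model"])  # gpt-5.5
```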
Important: on Codex surfaces (App / CLI / IDE), GPT-5.5 is only available via
ChatGPT OAuth (subscription). For API-key workflows in Codex,
use gpt-5.4 or gpt-5.2-codex for now.
If GPT-5.5 hasn't appeared yet
The rollout is gradual — this is normal, not all accounts get access simultaneously
Check if the app / CLI is updated to the latest version — this is the most common reason
Temporarily: continue with GPT-5.4. For most tasks, the difference is not critical
Practical Comparison: GPT-5.4 vs GPT-5.5 for Typical Tasks
Short answer: GPT-5.5 wins most on complex,
ambiguous, multi-step tasks. On simple and well-defined ones — the difference is minimal,
and sometimes it's more important to choose the right mode (Fast mode, Thinking) than the model itself.
Module Refactoring

| Task | GPT-5.4 | GPT-5.5 |
| --- | --- | --- |
| Single class / method | Handles well | Comparable, slightly fewer tokens |
| Multi-file refactoring | May lose inter-file connections on a large scope | Better "understands the system's shape" — where the problem lies and what else will be affected |
| Legacy code with implicit dependencies | Requires clarification | Less back-and-forth, better navigation of non-obvious connections |
Recommendation: for a single class, the difference is minimal — you can stick with GPT-5.4.
For refactoring affecting multiple modules — GPT-5.5 is noticeably more accurate.
Writing Tests

| Task | GPT-5.4 | GPT-5.5 |
| --- | --- | --- |
| Unit tests for known patterns | Good | Comparable, ~30–40% fewer tokens |
| Integration tests with non-obvious edge cases | Misses non-trivial scenarios | Better at finding non-obvious boundary cases |
| Tests for legacy code without documentation | Requires detailed logic description | Better at inferring logic from code independently |
Recommendation: for typical unit tests on clean code —
use Spark (faster) or GPT-5.4 mini (cheaper). GPT-5.5 is justified
for complex integration tests and legacy code.
Debugging Production Issues

| Scenario | Recommended Model | Why |
| --- | --- | --- |
| Known stack trace, clear cause | Fast mode or Spark | Response speed is more important than depth |
| Intermittent error, unclear cause | GPT-5.5 Thinking | Reflection before responding, fewer false hypotheses |
| Production issue affecting multiple services | GPT-5.5 standard | Analysis of cross-service dependencies, planning |
Autonomous Feature Development
This is where GPT-5.5 shows the biggest gap compared to GPT-5.4.
Expert-SWE with a median human completion time of 20 hours —
73.1% vs 68.5% — is precisely about this: long, complex, multi-step tasks
where the model plans independently, encounters obstacles, and continues without losing context.
A typical autonomous workflow in Codex with GPT-5.5:
Receives a task: "Add an endpoint for exporting reports to PDF with date filtering"
Reads the existing codebase: controllers, services, repositories, DTOs
Plans: which classes to modify, which to create, how to fit into the existing architecture
Writes code, runs tests, sees a failure → determines the cause itself → fixes it
Returns a ready diff or PR for review — without intermediate clarifications
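The workflow above can be sketched as a plan → edit → test → fix loop. The toy simulation below uses stub helpers (all of them are hypothetical stand-ins, not a real Codex API) so the control flow of an agentic run is visible:

```python
# Toy simulation of the plan -> edit -> test -> fix loop described above.
# Every helper here is a simplified stand-in; real Codex edits an actual repo.
edits: list[str] = []
remaining_bugs = 2  # pretend the first two test runs fail

def make_plan(task):  return [f"modify controller for: {task}", "add service method"]
def apply_edit(step): edits.append(step)
def run_tests():
    global remaining_bugs
    if remaining_bugs > 0:
        remaining_bugs -= 1
        return ["PdfExportTest.dateFilter"]  # a failing test
    return []
def diagnose(failure): return f"cause of {failure}"
def fix(cause):        edits.append(f"fix: {cause}")
def open_pr(task):     return f"PR: {task}"

def run_agent_task(task, max_attempts=5):
    for step in make_plan(task):   # plan, then write code
        apply_edit(step)
    for _ in range(max_attempts):  # test, diagnose, fix, repeat
        failures = run_tests()
        if not failures:
            return open_pr(task)   # done: hand back a ready diff/PR
        for failure in failures:
            fix(diagnose(failure))
    raise RuntimeError("could not converge; ask the user for guidance")

print(run_agent_task("PDF export endpoint"))  # PR: PDF export endpoint
```

The point of the sketch is the shape of the loop: the model only comes back to the user when it either finishes or exhausts its attempts.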
GPT-5.4 handles this scenario, but more often requires clarification
in non-standard situations and can lose context on a large scope.
From My Experience — The AskYourDocs Case
I tested GPT-5.5 on real tasks from two Spring Boot projects.
Here are specific observations:
Case 1: Refactoring the RAG pipeline in AskYourDocs.
The task was to break down a monolithic document processing service into three separate ones:
ingestion, chunking, and embedding. It affected 6 classes and Spring AI configurations.
GPT-5.4 on the same task previously required 3–4 clarifications regarding bean dependencies.
GPT-5.5 completed it without intermediate questions — it analyzed
the @ConditionalOnProperty configuration and independently accounted for the dependency
on OpenRouter and Ollama providers. The result was accepted with minimal edits.
Case 2: Writing tests for WebsCraft.
For unit tests of simple services — the difference from GPT-5.4 is minimal.
Where the difference was felt: tests for Thymeleaf templates with JSON-LD and complex conditional blocks.
GPT-5.5 found two edge cases (empty FAQ list and missing breadcrumb parent)
that I myself missed during manual review.
What was disappointing: Fast mode, with active use,
eats up the limit quickly. In one active day of interactive debugging,
the 2.5× cost multiplier is felt by evening.
Now I only enable Fast mode for truly short interactive tasks,
and leave long autonomous ones on standard GPT-5.5.
Limitations and Nuances
Short answer: GPT-5.5 is the most powerful model in Codex,
but it has real limitations that you should be aware of before starting. Particularly important are
the difference between context in Codex and in the API, authorization limits, and the cost of Fast mode.
400K Context in Codex — Not 1M
The most common point of confusion: GPT-5.5 supports 1M tokens of context —
but only in the API. In Codex surfaces (App / CLI / IDE), the limit is 400K tokens.
This is confirmed by official documentation.
For very large repositories (over 400K tokens) —
either split the context manually, or use GPT-5.4 via API with a 1M window.
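Manual splitting can be automated with a greedy batcher. A sketch under two stated assumptions: the rough heuristic of ~4 characters per token for source code (the real tokenizer count will differ), and the 400K figure from the documentation above:

```python
CODEX_CONTEXT_LIMIT = 400_000  # tokens, the Codex surface limit stated above

def approx_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for source code."""
    return len(text) // 4

def split_repo(files: dict[str, str], budget: int = CODEX_CONTEXT_LIMIT) -> list[list[str]]:
    """Greedily group files into batches that each fit the context budget."""
    batches, current, used = [], [], 0
    for path, text in files.items():
        cost = approx_tokens(text)
        if current and used + cost > budget:
            batches.append(current)    # current batch is full, start a new one
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        batches.append(current)
    return batches

# Demo: three "files" whose sizes force two batches under a tiny budget
demo = {"a.java": "x" * 800, "b.java": "y" * 800, "c.java": "z" * 400}
print(split_repo(demo, budget=300))  # [['a.java'], ['b.java', 'c.java']]
```

In practice you would also keep logically related modules (e.g. web layer vs service layer) in the same batch, as described later in the article's WebsCraft example.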
GPT-5.5 in Codex — Only via ChatGPT OAuth
In Codex surfaces (App / CLI / IDE), GPT-5.5 is only available
when authorizing through ChatGPT OAuth (Plus subscription and above).
According to official Codex documentation:
for API-key workflows in Codex, use gpt-5.4
or gpt-5.2-codex.
In the direct API (Responses / Chat Completions), GPT-5.5 has been available via API-key since April 24, 2026.
Higher Price Per Token in API
GPT-5.5 in the API costs $5 / 1M input and $30 / 1M output tokens
— twice as expensive as GPT-5.4 ($2.50 / $15).
OpenAI claims that token efficiency (~40% fewer tokens per task)
compensates for the difference for most workloads.
For Codex subscriptions (Plus, Pro, etc.) — the cost is calculated within the subscription limit,
not directly in dollars.
Fast mode — 2.5× the Cost of the Limit
Fast mode is convenient but expensive: each request costs 2.5× more of the limit
than standard GPT-5.5. With intensive use throughout the day,
the monthly limit is depleted significantly faster than when working with the standard mode.
Recommendation: enable Fast mode only for short interactive tasks,
leave autonomous tasks on standard.
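The effect of the 2.5× multiplier is easy to quantify. A sketch with abstract "limit units" (the actual accounting inside Codex is not public, so the numbers are purely illustrative):

```python
FAST_MULTIPLIER = 2.5  # Fast mode limit cost multiplier stated above

def requests_until_exhausted(limit_units: float, cost_per_request: float,
                             fast: bool) -> int:
    """How many requests fit in the limit before it runs out."""
    cost = cost_per_request * (FAST_MULTIPLIER if fast else 1.0)
    return int(limit_units // cost)

LIMIT = 1000.0  # hypothetical monthly allowance in abstract units
print(requests_until_exhausted(LIMIT, 1.0, fast=False))  # 1000 standard requests
print(requests_until_exhausted(LIMIT, 1.0, fast=True))   # 400 Fast-mode requests
```

In other words, a limit that covers 1000 standard requests covers only 400 Fast-mode ones, which matches the "depleted significantly faster" warning above.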
Gradual Rollout — GPT-5.5 May Not Be Available
Even with an active plan and an updated application, GPT-5.5 may not appear
in your model picker. The rollout is gradual and takes several weeks.
What to do: update the app/CLI to the latest version (most common reason for absence),
continue with GPT-5.4 — for most tasks, the difference is not critical.
GPT-5.5 Pro — Only Pro/Business/Enterprise in ChatGPT
GPT-5.5 Pro (the most powerful version) is only available for Pro, Business, and Enterprise
in ChatGPT. Plus users get standard GPT-5.5 and Thinking, but not the Pro version.
In Codex, GPT-5.5 Pro is not separately highlighted in the model picker —
standard GPT-5.5 is used.
Limitations Table

| Limitation | Detail | Workaround |
| --- | --- | --- |
| Context in Codex | 400K (not 1M) | GPT-5.4 via API for tasks with >400K context |
| Authorization in Codex | ChatGPT OAuth only | gpt-5.4 or gpt-5.2-codex for API-key workflows |
| API price | 2× more expensive per token than GPT-5.4 | Token efficiency partially compensates; use Batch for non-urgent tasks |
From My Experience — First Weeks with GPT-5.5 in Codex
I tested GPT-5.5 on two Spring Boot projects — WebsCraft and AskYourDocs —
in real-world conditions, not on synthetic tasks. Here's what has actually changed
in daily work compared to GPT-5.4.
Where I Felt the Biggest Improvement
Multi-file tasks — the main difference.
On AskYourDocs, refactoring the RAG pipeline affected 6 classes simultaneously.
GPT-5.4 on similar tasks regularly "lost" dependencies between Spring beans
and required 3-4 clarifications. GPT-5.5, for the first time, completed the entire refactoring without intermediate questions —
it recognized the @ConditionalOnProperty configuration and accounted for both providers
(OpenRouter for prod, Ollama for local) without prompting.
This feels like a qualitative change, not just "a little better."
Tests for non-trivial code.
For Thymeleaf templates with JSON-LD, GPT-5.5 found two edge cases
that I myself missed during manual review:
an empty FAQ list and a missing breadcrumb parent.
GPT-5.4 in the same scenario wrote tests only for the happy path.
Where the Difference is Minimal or Spark is Better
Simple unit tests and boilerplate.
For writing tests on standard CRUD services or generating DTOs —
the difference between GPT-5.4 and GPT-5.5 is practically unnoticeable.
In these scenarios, I continue to use
GPT-5.3-Codex-Spark (if the task is small and speed is needed)
or GPT-5.4 mini (if parallel processing is needed without consuming the main limit).
Active debugging with a known stack trace.
Here, Fast mode on GPT-5.5 gives good results, but Spark is still faster —
if it's available in your plan. For monitoring and debugging during an active session,
Spark remains my first choice.
What Was Disappointing
Fast mode and the limit.
The first week, I used Fast mode too aggressively —
on tasks where it wasn't necessary. The 2.5× cost multiplier for the limit
eats up the monthly allowance very quickly. Now the rule is simple:
Fast mode — only for interactive tasks up to 10 minutes.
Anything longer — standard GPT-5.5.
400K context in Codex — a real limitation.
I tried to analyze the entire WebsCraft project (Spring Boot with all templates and configs)
in one pass — it didn't fit. I had to divide it into logical parts:
web-layer separately, service-layer separately, Thymeleaf separately.
For very large repositories, 400K is a real ceiling.
FAQ
Is it worth switching from GPT-5.4 to GPT-5.5 in Codex?
Yes — if GPT-5.5 has already appeared in your model picker.
Thanks to token efficiency (~40% fewer tokens per task),
the actual limit consumption for most tasks remains at the same level or decreases,
despite the higher price per token in the API.
For autonomous and multi-step tasks, the difference is noticeable.
If GPT-5.5 is not yet available — update the application; during rollout, continue with GPT-5.4.
What is the difference between GPT-5.5 and GPT-5.3-Codex?
GPT-5.3-Codex is a specialized coding model with 400K context and a focus on
agentic software engineering. GPT-5.5 is broader: coding + reasoning + computer use +
knowledge work, and at the same time smarter — 82.7% vs 77.3% on Terminal-Bench 2.0.
However, GPT-5.3-Codex remains available via API-key in Codex
(GPT-5.5 via API-key in Codex surfaces is not yet available),
so it is still relevant for API-key workflows.
Is GPT-5.5 available for free?
No. In Codex, GPT-5.5 is only available for paid plans:
Plus ($20/month), Pro ($100 or $200/month), Business ($30/user/month),
Enterprise, Edu, and Go.
The Free plan does not have access to GPT-5.5 in either ChatGPT or Codex.
What is GPT-5.5 Thinking and how does it differ from standard?
Thinking is a variant of GPT-5.5 with enhanced reflection before responding.
It provides more concise and accurate answers for complex tasks:
architectural decisions, deep debugging, research questions.
It may be slightly slower. Available for Plus and above.
Do not confuse with GPT-5.5 Pro — that is a separate, most powerful version
only for Pro/Business/Enterprise in ChatGPT.
How does GPT-5.5 in Codex differ from GPT-5.5 in ChatGPT and in the API?
The same model, different surfaces and context.
In Codex: 400K context window, access to repository,
terminal, browser, PR workflow, Skills, Fast mode.
In ChatGPT: standard dialogue with tools, web search, Python.
In API: 1M context window, full customization, price $5/$30 per 1M tokens.
For agentic coding, Codex is the correct surface.
Conclusions
GPT-5.5 was released on April 23, 2026 and became the default model in Codex.
Key changes for developers: ~40% fewer tokens for the same task,
the same per-token latency as GPT-5.4, qualitatively better agentic work on multi-step tasks.
Context in Codex — 400K tokens, in API — 1M.
This difference is critical when choosing a surface for working with large repositories.
GPT-5.5 leads on Terminal-Bench 2.0 (82.7%) and Expert-SWE (73.1%),
but Claude Opus 4.7 remains stronger on SWE-Bench Pro (64.3% vs 58.6%).
The choice of model depends on the type of task — not the brand.
Two new modes change the workflow:
Thinking — for architectural decisions and complex debugging;
Fast mode — for interactive tasks, but costs 2.5× the limit.
The most effective approach in 2026 is a combination of modes based on task type:
Standard GPT-5.5 for autonomous coding, Thinking for architecture,
Fast mode for interactive, Spark for real-time, mini for subagents.
Main takeaway: GPT-5.5 is a real step forward, not a marketing update.
However, the maximum benefit comes not from the model itself,
but from the correct choice between standard GPT-5.5, Thinking, and Fast mode for a specific task.
This requires a week or two of practice — after which the workflow becomes significantly more efficient.