On April 23, 2026, OpenAI released GPT-5.5 — and immediately made it the default model in Codex.
But not every update actually changes anything in daily work. This one does.
Three things matter for a developer: fewer tokens for the same tasks,
the same speed as GPT-5.4, and a qualitatively new level of agentic work
on complex multi-step tasks.
This article contains specific numbers from official benchmarks, an honest comparison of where GPT-5.5
wins, and where Claude Opus 4.7 still leads, and a practical analysis of the new Thinking
and Fast modes. No hype — only what is useful for a developer to know in May 2026.
In short: GPT-5.5 is the most powerful model in Codex today.
Terminal-Bench 2.0: 82.7%. Fewer tokens for the same task. The same per-token latency as GPT-5.4.
Available for Plus, Pro, Business, Enterprise, Edu, and Go plans. But there are important nuances —
more on them below.
GPT-5.5 — a brief overview of the model
Short answer: GPT-5.5 was released on April 23, 2026 —
seven weeks after GPT-5.4 (March 2026). OpenAI calls it
"a new class of intelligence for real-world work."
According to internal classification, the model received the codename Spud.
As of May 2026, GPT-5.5 is the smartest model in the Codex ecosystem.
But it's important to understand that this is not just "GPT-5.4 with bigger numbers" —
it's a qualitatively different approach to agentic work:
the model is designed for tasks that require planning, using tools,
verifying its own solution, and continuing even if the initial plan changes.
Availability in Codex
An important nuance that was not present in previous models:
GPT-5.5 in Codex and GPT-5.5 in the API are different surfaces with different context windows.
| Parameter | Codex (App / CLI / IDE) | API |
| --- | --- | --- |
| Context window | 400K tokens | 1M tokens |
| Authorization | ChatGPT OAuth (subscription) | API key (since April 24, 2026) |
| API price | – | $5 / 1M input, $30 / 1M output |
| Available Codex plans | Plus, Pro, Business, Enterprise, Edu, Go | Pay-as-you-go |
| Fast mode | ✅ (1.5× faster, 2.5× limit cost) | Priority: 2.5× price |
Readers often confuse contexts: "GPT-5.5 supports 1M tokens" is true, but only for the API.
In the Codex surface, the limit is 400K. This must be taken into account when working with large repositories.
GPT-5.5 Variants
OpenAI released three variants within a single release:
GPT-5.5 (standard) — the default model in Codex and ChatGPT for
Plus and above. Best for most agentic coding tasks.
GPT-5.5 Thinking — a more reflective variant: more concise
and accurate answers for complex tasks where quality is important, not speed.
Available for Plus, Pro, Business, Enterprise.
GPT-5.5 Pro — the most powerful variant for the most complex tasks.
Only for Pro, Business, Enterprise in ChatGPT. Not yet separately distinguished in Codex.
What specifically has changed in Codex with GPT-5.5
Short answer: four real changes — two technical
(token efficiency and latency), one qualitative (better agentic work),
and one new (Fast mode). Each of them affects the daily workflow differently.
1. Token efficiency — ~40% fewer tokens for the same task
OpenAI has purposefully tuned Codex for GPT-5.5: the same task
requires approximately 40% fewer output tokens compared to GPT-5.4.
This is not just a marketing claim — it means that the real usage limit
for most tasks remains at the same level or decreases,
despite the higher per-token price in the API ($30 vs $15 per 1M output).
Practical illustration: if GPT-5.4 used, say, 10,000 tokens for module refactoring,
GPT-5.5 will perform the same task for ~6,000. With a 2× price per token, the real cost
increases by approximately 20%, not double.
If GPT-5.5 also requires fewer retries — it's break-even or savings.
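The arithmetic above can be checked with a quick back-of-the-envelope calculation. The token counts are the article's illustrative numbers, not measurements:

```python
# Back-of-the-envelope cost comparison using the illustrative numbers above.
# Prices are $ per 1M output tokens, as stated in the article.
PRICE_GPT_5_4 = 15.0
PRICE_GPT_5_5 = 30.0

def task_cost(tokens: int, price_per_million: float) -> float:
    """Output-token cost of one task in dollars."""
    return tokens / 1_000_000 * price_per_million

cost_old = task_cost(10_000, PRICE_GPT_5_4)  # GPT-5.4: 10K output tokens
cost_new = task_cost(6_000, PRICE_GPT_5_5)   # GPT-5.5: ~40% fewer tokens

print(f"GPT-5.4: ${cost_old:.3f}")                  # $0.150
print(f"GPT-5.5: ${cost_new:.3f}")                  # $0.180
print(f"increase: {cost_new / cost_old - 1:.0%}")   # 20%, not 100%
```

Double the per-token price times 0.6× the tokens gives a 1.2× total, which is where the "~20%, not double" figure comes from.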
For Codex subscriptions: Pro users received 2× Codex usage until May 31, 2026
as compensation during the new model's rollout.
2. Same per-token latency with higher intelligence
A common problem with the release of more powerful models is that they are slower.
GPT-5.5 is an exception: it fully matches GPT-5.4 in per-token latency
under real-world service conditions. This was made possible by joint development
with NVIDIA based on GB200/GB300 NVL72 rack-scale systems.
For the developer, this means: you get more results for the same waiting time.
Not "smarter but slower" — but "smarter and just as fast."
3. Qualitatively better agentic work — less back-and-forth
GPT-5.5 handles vague, multi-step tasks better without constant clarification.
It independently plans, selects tools, verifies its solution, and continues —
even if the initial plan had to be adjusted in the process.
OpenAI describes this as the ability to "understand what you're trying to do
faster and take on more work."
A specific example from statistics: on Expert-SWE — an internal OpenAI benchmark
where tasks have a median human execution time of 20 hours —
GPT-5.5 showed 73.1% compared to 68.5% for GPT-5.4. This is not a synthetic test,
but an approximation of a real agentic scenario.
A little-publicized fact: even before its official release, GPT-5.5 working in Codex
rewrote part of OpenAI's own production infrastructure.
Codex analyzed weeks of real traffic and wrote load balancing heuristics that increased token generation speed
by more than 20%.
The model literally helped optimize the system that serves it.
4. Fast mode — a new mode for interactive work
Along with GPT-5.5, Codex introduced Fast mode — a mode that did not exist before.
It generates tokens 1.5× faster for 2.5× the limit cost.
Purpose: an interactive feedback loop where response speed is important, not the depth
of autonomous planning. Essentially, an alternative to Codex-Spark for those who do not have a Pro plan
or need a context larger than 128K. More details on Fast mode in the next section.
Benchmarks: GPT-5.5 vs Competitors
Short answer: GPT-5.5 leads on most benchmarks
for agentic coding — but not all. Claude Opus 4.7 remains stronger
on SWE-Bench Pro. The honest table is below.
Important: Benchmarks show relative strength on standardized tasks.
Real-world performance on your project may vary.
Use the numbers as a starting point, not a verdict.
Model Comparison Table

| Benchmark | GPT-5.3-Codex | GPT-5.4 | GPT-5.5 | Claude Opus 4.7 | Gemini 3.1 Pro |
| --- | --- | --- | --- | --- | --- |
| Terminal-Bench 2.0 | 77.3% | 75.1% | 82.7% 🏆 | 69.4% | 68.5% |
| SWE-Bench Pro | 56.8% | – | 58.6% | 64.3% 🏆 | – |
| Expert-SWE (internal) | – | 68.5% | 73.1% 🏆 | – | – |
| FrontierMath T1-3 | – | – | 51.7% 🏆 | – | – |
| Graphwalks BFS >128K | – | 21.4% | 73.7% 🏆 | – | – |
| MRCR v2 at 1M tokens | – | 36.6% | 74.0% 🏆 | – | – |
What Each Benchmark Means for a Developer
Terminal-Bench 2.0 (82.7%) — most relevant for working in Codex.
According to OpenAI's official changelog, this benchmark measures an
agent's ability to perform complex CLI tasks requiring planning, iteration, and tool usage.
GPT-5.5 leads by over 13 percentage points against Claude Opus 4.7.
For autonomous terminal workflows, this is a decisive advantage.
SWE-Bench Pro (58.6%) — solving real GitHub issues.
Here, Claude Opus 4.7 with 64.3% remains ahead.
OpenAI notes that the difference may be partly due to memorization of parts of the benchmark —
but there is no independent confirmation of this.
An honest conclusion: for code review and repository reasoning, Claude Opus 4.7 is still competitive.
Expert-SWE (73.1%) — an internal OpenAI benchmark for the most complex
long-horizon tasks. The median human completion time is 20 hours.
GPT-5.5 showed 73.1% compared to GPT-5.4's 68.5%: a +4.6 pp improvement on the hardest tasks.
Graphwalks BFS at >128K (73.7% vs 21.4%) — numbers that show
how much better long-context handling has become.
GPT-5.4 sharply degraded on tasks exceeding 128K tokens.
GPT-5.5 maintains 73.7% at 256K — a qualitative change for working with large codebases.
Artificial Analysis Coding Index
An independent aggregated rating from
Artificial Analysis
(weighted average of 10 evaluations, including Terminal-Bench Hard, GPQA Diamond,
Humanity's Last Exam, SciCode, and others):
GPT-5.5 holds the top position at half the cost of competitors
among frontier coding models.
This is an external independent assessment, not from OpenAI.
GPT-5.5 Thinking and Fast mode — when to use which
Short answer: GPT-5.5 is not a single model, but three modes
with different trade-offs. Standard is for most tasks. Thinking is when deeper
reflection is needed. Fast mode is when feedback speed is important.
GPT-5.5 Thinking — what it is and when to use it
Thinking is a variant of GPT-5.5 with enhanced reflection before responding.
According to OpenAI: "unlocks faster assistance with more complex tasks,
providing more insightful and concise answers."
Available for Plus, Pro, Business, and Enterprise in ChatGPT.
In Codex, it's enabled via the model picker — selected as a separate option.
Thinking is your choice when:
An architectural decision where quality matters more than speed:
e.g., how to break down a monolithic service into microservices considering
existing dependencies
Complex debugging where the problem isn't obvious: a production error without a clear stack trace,
a race condition in async code, an unstable test that fails 1 in 10 times
Research tasks: choosing between two architectural approaches with trade-off analysis
Any task where you want to hear "why this way and not another,"
not just the finished code
Stick to standard GPT-5.5 when:
The task is clear and only the result is needed: writing tests, adding validation, refactoring a method
An autonomous task where Codex will solve sub-steps itself — Thinking would add overhead without benefit
Most regular agentic coding tasks — the standard mode is the recommended default
Fast mode — what it is and when to use it
Fast mode is a new mode in Codex that didn't exist before GPT-5.5.
According to 9to5Mac, it generates tokens 1.5× faster
for 2.5× the limit cost. It's enabled in the model picker next to the main GPT-5.5.
Essentially, Fast mode is an alternative to GPT-5.3-Codex-Spark
for those who don't have a Pro plan (Spark is Pro only, research preview)
or need a context larger than 128K tokens (Spark is limited to 128K).
Fast mode is available for all plans that have GPT-5.5 and uses the full 400K context.
Fast mode is your choice when:
Active debugging where you want a response in seconds, not minutes:
reviewing a stack trace, proposing a hypothesis, wanting to test the next one
Quick refactoring of a single method or class —
the task is small and doesn't require deep planning
Real-time code review: you're given a diff and want instant feedback
Iterating through implementation options: tried one approach, want to see an alternative —
and so on several times
Stick to standard GPT-5.5 when:
Long-term autonomous tasks — planning, feature implementation, writing tests: response speed is not critical, result quality is important
Limit is under pressure — Fast mode costs 2.5× the standard; with active use, the monthly limit will be exhausted much faster
Most tasks — standard GPT-5.5 is the recommended default, Fast mode for exceptional situations
How to Enable GPT-5.5 in Codex
Short answer: update the app and select GPT-5.5 in the model picker.
If the model isn't there yet — the rollout is gradual, that's normal. Below are the steps
for each interface and what to do while GPT-5.5 is unavailable.
According to the official changelog from May 2026:
"GPT-5.5 is the recommended choice for most tasks in Codex.
If you don't see GPT-5.5 — update your CLI, IDE extension, or Codex App to the latest version.
During the rollout, continue using GPT-5.4."
Codex App (macOS / Windows)
Update the Codex App to the latest version via the menu or App Store
Open a new thread → in the composer, find the model picker
Select GPT-5.5 from the list (or GPT-5.5 Thinking for complex tasks)
Fast mode is enabled by a separate toggle next to the model selection
Codex CLI
Launch with a specific model using a flag:
codex --model gpt-5.5
Change the model in an active thread without restarting:
/model gpt-5.5
Set GPT-5.5 as default in config.toml:
[model]
default = "gpt-5.5"
VS Code extension
Update the extension via the Extensions panel (Ctrl+Shift+X → update Codex)
In the composer — the model selector is below the input box
Select GPT-5.5; the change applies to the current and subsequent threads
JetBrains extension
Update via the JetBrains Marketplace (Settings → Plugins → Updates)
In the Codex composer — the model selector is below the input box, similar to VS Code
API (from April 24, 2026)
GPT-5.5 is available via the Responses and Chat Completions API
from April 24, 2026.
Model string: gpt-5.5. Context in API is 1M tokens.
// Example for Responses API
{
"model": "gpt-5.5",
"input": "Refactor this Spring Boot service..."
}
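The same request can be sketched from Python. This is an illustration, not an official SDK snippet: the endpoint path and headers follow standard OpenAI REST conventions, the `gpt-5.5` model string is the one stated above, and you would substitute a real API key:

```python
import json
import urllib.request

# Responses API endpoint (standard OpenAI REST convention; an assumption here)
API_URL = "https://api.openai.com/v1/responses"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a Responses API request for the gpt-5.5 model string."""
    payload = {"model": "gpt-5.5", "input": prompt}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Refactor this Spring Boot service...", "sk-...")
# urllib.request.urlopen(req)  # uncomment to actually send the request
print(json.loads(req.data)["model"])  # gpt-5.5
```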
Important: on Codex surfaces (App / CLI / IDE), GPT-5.5 is only available via
ChatGPT OAuth (subscription). For API-key workflows in Codex,
use gpt-5.4 or gpt-5.2-codex for now.
If GPT-5.5 hasn't appeared yet
The rollout is gradual — this is normal, not all accounts get access simultaneously
Check if the app / CLI is updated to the latest version — this is the most common reason
Temporarily: continue with GPT-5.4. For most tasks, the difference is not critical
Practical Comparison: GPT-5.4 vs GPT-5.5 for Typical Tasks
Short answer: GPT-5.5 wins most on complex,
ambiguous, multi-step tasks. On simple and well-defined ones — the difference is minimal,
and sometimes it's more important to choose the right mode (Fast mode, Thinking) than the model itself.
Module Refactoring

| Task | GPT-5.4 | GPT-5.5 |
| --- | --- | --- |
| Single class / method | Handles well | Comparable, slightly fewer tokens |
| Multi-file refactoring | May lose inter-file connections on a large scope | Better "understands the system's shape" — where the problem lies and what else will be affected |
| Legacy code with implicit dependencies | Requires clarification | Less back-and-forth, better navigation of non-obvious connections |
Recommendation: for a single class, the difference is minimal — you can stick with GPT-5.4.
For refactoring affecting multiple modules — GPT-5.5 is noticeably more accurate.
Writing Tests

| Task | GPT-5.4 | GPT-5.5 |
| --- | --- | --- |
| Unit tests for known patterns | Good | Comparable, ~30–40% fewer tokens |
| Integration tests with non-obvious edge cases | Misses non-trivial scenarios | Better at finding non-obvious boundary cases |
| Tests for legacy code without documentation | Requires detailed logic description | Better at inferring logic from code independently |
Recommendation: for typical unit tests on clean code —
use Spark (faster) or GPT-5.4 mini (cheaper). GPT-5.5 is justified
for complex integration tests and legacy code.
Debugging Production Issues

| Scenario | Recommended Model | Why |
| --- | --- | --- |
| Known stack trace, clear cause | Fast mode or Spark | Response speed is more important than depth |
| Intermittent error, unclear cause | GPT-5.5 Thinking | Reflection before responding, fewer false hypotheses |
| Production issue affecting multiple services | GPT-5.5 standard | Analysis of cross-service dependencies, planning |
Autonomous Feature Development
This is where GPT-5.5 shows the biggest gap compared to GPT-5.4.
Expert-SWE with a median human completion time of 20 hours —
73.1% vs 68.5% — is precisely about this: long, complex, multi-step tasks
where the model plans independently, encounters obstacles, and continues without losing context.
A typical autonomous workflow in Codex with GPT-5.5:
Receives a task: "Add an endpoint for exporting reports to PDF with date filtering"
Reads the existing codebase: controllers, services, repositories, DTOs
Plans: which classes to modify, which to create, how to fit into the existing architecture
Writes code, runs tests, sees a failure → determines the cause itself → fixes it
Returns a ready diff or PR for review — without intermediate clarifications
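The workflow above can be sketched as a plan → edit → test → fix loop. The toy simulation below uses stub helpers (all of them are hypothetical stand-ins, not a real Codex API) so the control flow of an agentic run is visible:

```python
# Toy simulation of the plan -> edit -> test -> fix loop described above.
# Every helper here is a simplified stand-in; real Codex edits an actual repo.
edits: list[str] = []
remaining_bugs = 2  # pretend the first two test runs fail

def make_plan(task):  return [f"modify controller for: {task}", "add service method"]
def apply_edit(step): edits.append(step)
def run_tests():
    global remaining_bugs
    if remaining_bugs > 0:
        remaining_bugs -= 1
        return ["PdfExportTest.dateFilter"]  # a failing test
    return []
def diagnose(failure): return f"cause of {failure}"
def fix(cause):        edits.append(f"fix: {cause}")
def open_pr(task):     return f"PR: {task}"

def run_agent_task(task, max_attempts=5):
    for step in make_plan(task):   # plan, then write code
        apply_edit(step)
    for _ in range(max_attempts):  # test, diagnose, fix, repeat
        failures = run_tests()
        if not failures:
            return open_pr(task)   # done: hand back a ready diff/PR
        for failure in failures:
            fix(diagnose(failure))
    raise RuntimeError("could not converge; ask the user for guidance")

print(run_agent_task("PDF export endpoint"))  # PR: PDF export endpoint
```

The point of the sketch is the shape of the loop: the model only comes back to the user when it either finishes or exhausts its attempts.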
GPT-5.4 handles this scenario, but more often requires clarification
in non-standard situations and can lose context on a large scope.
From My Experience — The AskYourDocs Case
I tested GPT-5.5 on real tasks from two Spring Boot projects.
Here are specific observations:
Case 1: Refactoring the RAG pipeline in AskYourDocs.
The task was to break down a monolithic document processing service into three separate ones:
ingestion, chunking, and embedding. It affected 6 classes and Spring AI configurations.
GPT-5.4 on the same task previously required 3–4 clarifications regarding bean dependencies.
GPT-5.5 completed it without intermediate questions — it analyzed
the @ConditionalOnProperty configuration and independently accounted for the dependency
on OpenRouter and Ollama providers. The result was accepted with minimal edits.
Case 2: Writing tests for WebsCraft.
For unit tests of simple services — the difference from GPT-5.4 is minimal.
Where the difference was felt: tests for Thymeleaf templates with JSON-LD and complex conditional blocks.
GPT-5.5 found two edge cases (empty FAQ list and missing breadcrumb parent)
that I myself missed during manual review.
What was disappointing: Fast mode, with active use,
eats up the limit quickly. In one active day of interactive debugging,
the 2.5× cost multiplier is felt by evening.
Now I only enable Fast mode for truly short interactive tasks,
and leave long autonomous ones on standard GPT-5.5.
Limitations and Nuances
Short answer: GPT-5.5 is the most powerful model in Codex,
but it has real limitations that you should be aware of before starting. Particularly important are
the difference between context in Codex and in the API, authorization limits, and the cost of Fast mode.
400K Context in Codex — Not 1M
The most common point of confusion: GPT-5.5 supports 1M tokens of context —
but only in the API. In Codex surfaces (App / CLI / IDE), the limit is 400K tokens.
This is confirmed by official documentation.
For very large repositories (over 400K tokens) —
either split the context manually, or use GPT-5.4 via API with a 1M window.
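Manual splitting can be automated with a greedy batcher. A sketch under two stated assumptions: the rough heuristic of ~4 characters per token for source code (the real tokenizer count will differ), and the 400K figure from the documentation above:

```python
CODEX_CONTEXT_LIMIT = 400_000  # tokens, the Codex surface limit stated above

def approx_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for source code."""
    return len(text) // 4

def split_repo(files: dict[str, str], budget: int = CODEX_CONTEXT_LIMIT) -> list[list[str]]:
    """Greedily group files into batches that each fit the context budget."""
    batches, current, used = [], [], 0
    for path, text in files.items():
        cost = approx_tokens(text)
        if current and used + cost > budget:
            batches.append(current)    # current batch is full, start a new one
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        batches.append(current)
    return batches

# Demo: three "files" whose sizes force two batches under a tiny budget
demo = {"a.java": "x" * 800, "b.java": "y" * 800, "c.java": "z" * 400}
print(split_repo(demo, budget=300))  # [['a.java'], ['b.java', 'c.java']]
```

In practice you would also keep logically related modules (e.g. web layer vs service layer) in the same batch, as described later in the article's WebsCraft example.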
GPT-5.5 in Codex — Only via ChatGPT OAuth
In Codex surfaces (App / CLI / IDE), GPT-5.5 is only available
when authorizing through ChatGPT OAuth (Plus subscription and above).
According to official Codex documentation:
for API-key workflows in Codex, use gpt-5.4
or gpt-5.2-codex.
In the direct API (Responses / Chat Completions), GPT-5.5 has been available via API-key since April 24, 2026.
Higher Price Per Token in API
GPT-5.5 in the API costs $5 / 1M input and $30 / 1M output tokens
— twice as expensive as GPT-5.4 ($2.50 / $15).
OpenAI claims that token efficiency (~40% fewer tokens per task)
compensates for the difference for most workloads.
For Codex subscriptions (Plus, Pro, etc.) — the cost is calculated within the subscription limit,
not directly in dollars.
Fast mode — 2.5× the Cost of the Limit
Fast mode is convenient but expensive: each request costs 2.5× more of the limit
than standard GPT-5.5. With intensive use throughout the day,
the monthly limit is depleted significantly faster than when working with the standard mode.
Recommendation: enable Fast mode only for short interactive tasks,
leave autonomous tasks on standard.
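The effect of the 2.5× multiplier is easy to quantify. A sketch with abstract "limit units" (the actual accounting inside Codex is not public, so the numbers are purely illustrative):

```python
FAST_MULTIPLIER = 2.5  # Fast mode limit cost multiplier stated above

def requests_until_exhausted(limit_units: float, cost_per_request: float,
                             fast: bool) -> int:
    """How many requests fit in the limit before it runs out."""
    cost = cost_per_request * (FAST_MULTIPLIER if fast else 1.0)
    return int(limit_units // cost)

LIMIT = 1000.0  # hypothetical monthly allowance in abstract units
print(requests_until_exhausted(LIMIT, 1.0, fast=False))  # 1000 standard requests
print(requests_until_exhausted(LIMIT, 1.0, fast=True))   # 400 Fast-mode requests
```

In other words, a limit that covers 1000 standard requests covers only 400 Fast-mode ones, which matches the "depleted significantly faster" warning above.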
Gradual Rollout — GPT-5.5 May Not Be Available
Even with an active plan and an updated application, GPT-5.5 may not appear
in your model picker. The rollout is gradual and takes several weeks.
What to do: update the app/CLI to the latest version (most common reason for absence),
continue with GPT-5.4 — for most tasks, the difference is not critical.
GPT-5.5 Pro — Only Pro/Business/Enterprise in ChatGPT
GPT-5.5 Pro (the most powerful version) is only available for Pro, Business, and Enterprise
in ChatGPT. Plus users get standard GPT-5.5 and Thinking, but not the Pro version.
In Codex, GPT-5.5 Pro is not separately highlighted in the model picker —
standard GPT-5.5 is used.
Limitations Table

| Limitation | Detail | Workaround |
| --- | --- | --- |
| Context in Codex | 400K (not 1M) | GPT-5.4 via API for tasks with >400K context |
| Authorization in Codex | ChatGPT OAuth only | gpt-5.4 or gpt-5.2-codex for API-key workflows |
| API price | 2× more expensive per token than GPT-5.4 | Token efficiency partially compensates; use Batch for non-urgent tasks |
From My Experience — First Weeks with GPT-5.5 in Codex
I tested GPT-5.5 on two Spring Boot projects — WebsCraft and AskYourDocs —
in real-world conditions, not on synthetic tasks. Here's what has actually changed
in daily work compared to GPT-5.4.
Where I Felt the Biggest Improvement
Multi-file tasks — the main difference.
On AskYourDocs, refactoring the RAG pipeline affected 6 classes simultaneously.
GPT-5.4 on similar tasks regularly "lost" dependencies between Spring beans
and required 3-4 clarifications. GPT-5.5, for the first time, completed the entire refactoring without intermediate questions —
it recognized the @ConditionalOnProperty configuration and accounted for both providers
(OpenRouter for prod, Ollama for local) without prompting.
This feels like a qualitative change, not just "a little better."
Tests for non-trivial code.
For Thymeleaf templates with JSON-LD, GPT-5.5 found two edge cases
that I myself missed during manual review:
an empty FAQ list and a missing breadcrumb parent.
GPT-5.4 in the same scenario wrote tests only for the happy path.
Where the Difference is Minimal or Spark is Better
Simple unit tests and boilerplate.
For writing tests on standard CRUD services or generating DTOs —
the difference between GPT-5.4 and GPT-5.5 is practically unnoticeable.
In these scenarios, I continue to use
GPT-5.3-Codex-Spark (if the task is small and speed is needed)
or GPT-5.4 mini (if parallel processing is needed without consuming the main limit).
Active debugging with a known stack trace.
Here, Fast mode on GPT-5.5 gives good results, but Spark is still faster —
if it's available in your plan. For monitoring and debugging during an active session,
Spark remains my first choice.
What Was Disappointing
Fast mode and the limit.
The first week, I used Fast mode too aggressively —
on tasks where it wasn't necessary. The 2.5× cost multiplier for the limit
eats up the monthly allowance very quickly. Now the rule is simple:
Fast mode — only for interactive tasks up to 10 minutes.
Anything longer — standard GPT-5.5.
400K context in Codex — a real limitation.
I tried to analyze the entire WebsCraft project (Spring Boot with all templates and configs)
in one pass — it didn't fit. I had to divide it into logical parts:
web-layer separately, service-layer separately, Thymeleaf separately.
For very large repositories, 400K is a real ceiling.
FAQ
Is it worth switching from GPT-5.4 to GPT-5.5 in Codex?
Yes — if GPT-5.5 has already appeared in your model picker.
Thanks to token efficiency (~40% fewer tokens per task),
the actual limit consumption for most tasks remains at the same level or decreases,
despite the higher price per token in the API.
For autonomous and multi-step tasks, the difference is noticeable.
If GPT-5.5 is not yet available — update the application; during rollout, continue with GPT-5.4.
What is the difference between GPT-5.5 and GPT-5.3-Codex?
GPT-5.3-Codex is a specialized coding model with 400K context and a focus on
agentic software engineering. GPT-5.5 is broader: coding + reasoning + computer use +
knowledge work, and at the same time smarter — 82.7% vs 77.3% on Terminal-Bench 2.0.
However, GPT-5.3-Codex remains available via API-key in Codex
(GPT-5.5 via API-key in Codex surfaces is not yet available),
so it is still relevant for API-key workflows.
Is GPT-5.5 available for free?
No. In Codex, GPT-5.5 is only available for paid plans:
Plus ($20/month), Pro ($100 or $200/month), Business ($30/user/month),
Enterprise, Edu, and Go.
The Free plan does not have access to GPT-5.5 in either ChatGPT or Codex.
What is GPT-5.5 Thinking and how does it differ from standard?
Thinking is a variant of GPT-5.5 with enhanced reflection before responding.
It provides more concise and accurate answers for complex tasks:
architectural decisions, deep debugging, research questions.
It may be slightly slower. Available for Plus and above.
Do not confuse with GPT-5.5 Pro — that is a separate, most powerful version
only for Pro/Business/Enterprise in ChatGPT.
How does GPT-5.5 in Codex differ from GPT-5.5 in ChatGPT and in the API?
The same model, different surfaces and context.
In Codex: 400K context window, access to repository,
terminal, browser, PR workflow, Skills, Fast mode.
In ChatGPT: standard dialogue with tools, web search, Python.
In API: 1M context window, full customization, price $5/$30 per 1M tokens.
For agentic coding, Codex is the correct surface.
Conclusions
GPT-5.5 was released on April 23, 2026 and became the default model in Codex.
Key changes for developers: ~40% fewer tokens for the same task,
the same per-token latency as GPT-5.4, qualitatively better agentic work on multi-step tasks.
Context in Codex — 400K tokens, in API — 1M.
This difference is critical when choosing a surface for working with large repositories.
GPT-5.5 leads on Terminal-Bench 2.0 (82.7%) and Expert-SWE (73.1%),
but Claude Opus 4.7 remains stronger on SWE-Bench Pro (64.3% vs 58.6%).
The choice of model depends on the type of task — not the brand.
Two new modes change the workflow:
Thinking — for architectural decisions and complex debugging;
Fast mode — for interactive tasks, but costs 2.5× the limit.
The most effective approach in 2026 is a combination of modes based on task type:
Standard GPT-5.5 for autonomous coding, Thinking for architecture,
Fast mode for interactive, Spark for real-time, mini for subagents.
Main takeaway: GPT-5.5 is a real step forward, not a marketing update.
However, the maximum benefit comes not from the model itself,
but from the correct choice between standard GPT-5.5, Thinking, and Fast mode for a specific task.
This requires a week or two of practice — after which the workflow becomes significantly more efficient.