Artificial intelligence has learned to write code faster than humans can review it. Code review queues have stretched to several days and review quality has dropped, simply because there aren't enough reviewers to keep up.
Spoiler: Anthropic decided to automate the review process itself: the new Claude Code Review tool launches five parallel AI agents that find errors before a human ever sees the code.
⚡ TLDR
- ✅ Problem: AI generates more code than developers can manually review
- ✅ Solution: five parallel agents search for different classes of errors simultaneously
- ✅ Result within Anthropic: the share of thoroughly reviewed pull requests increased from 16% to 54%
- 🎯 You will learn: how it works, how much it costs, and how competitors are responding
- 👇 Below — details, figures, and market context
🎯 Why it appeared: AI creates too much code
Why review became a bottleneck
Tools like GitHub Copilot and Claude Code allow a single developer to
generate code three times faster — and within Anthropic, productivity has grown even
more: code output per engineer increased by 200% per year.
But people have to review this code at the same pace as before. The review
queue has turned into a bottleneck that slows down the entire development cycle.
"When engineers lower the barrier to creating new features, the demand for reviews sharply increases," says Cat Wu, Head of Product for Claude Code at Anthropic (TechCrunch).
Imagine a factory conveyor belt: the machines became twice as fast, but the quality control department stayed the same size. Sooner or later the warehouse overflows with unchecked parts.
This is exactly what is happening in software development worldwide right now.
Why manual review no longer scales
With AI assistants, developers write code 3–4 times faster than two years ago.
Reviewers, however, can still only get through roughly the same amount of code as before. The result is either a growing queue or superficial skim-reading. Before the launch of Claude Code Review, only 16% of pull requests at Anthropic received meaningful comments from reviewers.
Practical example
Large technology companies — Uber, Salesforce, Accenture — have already encountered this
problem. They use Claude Code for code generation and at the same time are looking for
ways to automate its review. It was their request that accelerated the arrival of Claude Code Review: according to Cat Wu, the product emerged in response to "insane market demand" from enterprise clients.
- ✔️ AI increased code writing speed by 3–4 times, and at Anthropic — by 200% per year
- ✔️ The throughput of human review remained unchanged
- ✔️ The bottleneck shifted from writing to reviewing
Conclusion: Claude Code Review is a response to a specific and painful problem that arose precisely because of the success of AI code generation.
📌 How Code Review works: five agents instead of one
Parallel review instead of sequential
Instead of one agent sequentially reading through all the code, Claude Code Review
launches several specialized agents simultaneously. Each searches for its own class of problems.
Then the results are combined, duplicates are removed, and findings are ranked by
criticality — and presented to the reviewer as a single structured comment in GitHub.
The tool finds errors before a human reviewer ever sees the code, and this is its main value (Anthropic).
The principle of operation is similar to how several teams work in parallel on one product in large companies: one checks security, another — performance, a third —
compliance with code standards. Claude Code Review does the same, but automatically.
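The fan-out/fan-in pattern described above can be sketched in a few lines of Python. Everything here (the agent names, the `Finding` structure, the ranking rule) is a hypothetical illustration of the idea, not Anthropic's actual implementation; a real system would call an LLM inside each agent.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Finding:
    agent: str       # which specialized agent produced it
    line: int        # line of the diff it refers to
    severity: int    # higher = more critical
    message: str

# Hypothetical specialized agents; each returns canned findings here.
async def security_agent(diff: str) -> list[Finding]:
    return [Finding("security", 12, 3, "possible auth bypass")]

async def logic_agent(diff: str) -> list[Finding]:
    return [Finding("logic", 12, 3, "possible auth bypass"),
            Finding("logic", 40, 2, "off-by-one in loop bound")]

async def perf_agent(diff: str) -> list[Finding]:
    return []

async def review(diff: str) -> list[Finding]:
    # 1. Run all agents in parallel on the same diff.
    results = await asyncio.gather(
        security_agent(diff), logic_agent(diff), perf_agent(diff)
    )
    # 2. Merge and de-duplicate (same line + same message = duplicate).
    merged = {}
    for finding in (f for agent_results in results for f in agent_results):
        merged.setdefault((finding.line, finding.message), finding)
    # 3. Rank by criticality for the consolidated comment.
    return sorted(merged.values(), key=lambda f: -f.severity)

for f in asyncio.run(review("...diff text...")):
    print(f.severity, f.line, f.message)
```

The interesting part is step 2: two agents flagging the same line with the same complaint collapse into one finding, which is what keeps a multi-agent review from spamming the reviewer.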
What happens inside
After a pull request is opened, the system launches its parallel agents, each specializing in one class of errors: logical bugs, security vulnerabilities, performance issues.
Next, a verification step filters out false positives. Findings are color-coded:
- red: critical
- yellow: worth reviewing
- purple: the problem exists in old code adjacent to the changes
The reviewer sees one consolidated comment plus inline annotations on specific lines.
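The verification-and-triage step can be sketched as a simple filter plus a color assignment. The confidence threshold, the `Issue` shape, and the label names are assumptions for illustration only; only the three-color scheme comes from the article.

```python
from dataclasses import dataclass

@dataclass
class Issue:
    line: int
    message: str
    confidence: float   # verifier's confidence that the issue is real
    critical: bool
    in_old_code: bool   # sits in pre-existing code adjacent to the change

def triage(issues, min_confidence=0.8):
    """Drop likely false positives, then color-code the rest."""
    kept = [i for i in issues if i.confidence >= min_confidence]
    lines = []
    for i in kept:
        if i.in_old_code:
            color = "purple"
        elif i.critical:
            color = "red"
        else:
            color = "yellow"
        lines.append(f"[{color}] line {i.line}: {i.message}")
    return "\n".join(lines)  # body of the single consolidated comment

print(triage([
    Issue(12, "auth check bypassed", 0.95, True, False),
    Issue(40, "unused import", 0.30, False, False),   # filtered out
    Issue(77, "stale lock in old code", 0.90, False, True),
]))
```

Note that the low-confidence finding never reaches the reviewer at all; this aggressive filtering is what the sub-1% rejection rate below suggests Anthropic is doing.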
Convincing figures
The effect scales with PR size. For large changes (1000+ lines of code),
84% of reviews find real problems, averaging 7.5 issues
per PR. For small PRs (less than 50 lines) — 31% of reviews provide comments.
At the same time, developers reject less than 1% of findings as irrelevant, an accuracy level no classic linter can match.
Important detail: agents do not replace humans
Agents do not approve or reject pull requests — that remains with the human.
Cat Wu explains it this way: the tool focuses exclusively on logical errors, not code style, "so that developers only receive what needs to be acted upon immediately." The reviewer spends time on solutions, not on hunting for problems.
- ✔️ Average review time — about 20 minutes
- ✔️ The share of thoroughly reviewed PRs within Anthropic increased from 16% to 54%
- ✔️ 84% of large PRs (1000+ lines) receive meaningful findings
- ✔️ Less than 1% of findings are rejected as false positives — the final word is always with the human
Conclusion: The multi-agent architecture solves the main
problem — it scales with the amount of code, while human review does not.
💡
Want to dive deeper? In the next article: the architecture from the inside, how the parallelism is structured, how it differs from SonarQube and ESLint, and what tokens cost at scale.
Under the Hood of Claude Code Review: How Multi-Agent Architecture Changes Code Review
📌 How much it costs and who has access
Short answer: $15–25 per review, enterprise only
The cost of a review ranges from $15 to $25 depending on the code volume — the price
is token-based, meaning a larger PR will cost more. The tool is available
in research preview for Claude for Teams and Claude for Enterprise clients.
It is not yet available to small businesses or individual developers.
Cat Wu puts it directly: "This product is very much targeted towards our larger scale enterprise users." These are companies like Uber, Salesforce, and Accenture, which already use Claude Code and now need help with the PR flow it generates (TechCrunch).
There's also convenience for administrators: team leaders can enable Code Review
for the entire team at once — and it will automatically run on every PR.
You can also set a monthly spending limit to make the cost predictable.
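Token-based pricing plus a monthly cap makes budgeting simple arithmetic. The $15–25 per-review range is from the article; the team sizes and the $20 average used below are made-up assumptions for illustration.

```python
# Back-of-the-envelope budgeting for token-priced reviews.
# $15-25 per review is the article's figure; $20 is an assumed midpoint.

def monthly_cost(reviews_per_month: int,
                 avg_cost_per_review: float = 20.0) -> float:
    """Estimated spend if every PR gets an automated review."""
    return reviews_per_month * avg_cost_per_review

def reviews_within_budget(monthly_limit: float,
                          avg_cost_per_review: float = 20.0) -> int:
    """How many reviews fit under a monthly spending cap."""
    return int(monthly_limit // avg_cost_per_review)

# A team merging ~10 PRs a day over ~22 working days:
print(monthly_cost(220))              # 4400.0
print(reviews_within_budget(1000.0))  # 50
```

At roughly $4,400 a month for a 10-PR-a-day team, the math only works if the tool regularly catches incidents that would have cost more, which is exactly the comparison Anthropic proposes next.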
Expensive or cheap: the right comparison
Comparing $15–25 per review with CodeRabbit ($12/month per user) or the free GitHub Copilot is the wrong frame, says Anthropic. The correct comparison is with the cost of a production incident. Within Anthropic, the tool has already caught a real bug: an innocent-looking one-line change would have broken the authentication mechanism of an entire service. One such error in production costs more than a month of Code Review.
- ✔️ Price: $15–25 per review, token-based model
- ✔️ Access: research preview for Teams and Enterprise clients
- ✔️ Monthly spending limit available for budget control
- ✔️ First clients: Uber, Salesforce, Accenture
- ✔️ Claude Code run-rate revenue exceeded $2.5 billion since launch
📌 What Anthropic says
Depth, not speed — and this is a conscious position
Anthropic positions Code Review as a tool for deep analysis, not
quick feedback. The product underwent months of internal testing
before its public launch on March 9, 2026. The company deliberately limited its focus:
only logical errors, no style.
"We decided we're going to focus purely on logic errors. This way we're catching the highest priority things to fix," says Cat Wu, Head of Product, Claude Code (TechCrunch).
The explanation is simple: developers have long learned to ignore automated
tools that flood them with comments about indentation and variable names.
If a tool is noisy, they turn it off. Anthropic decided to play differently: fewer comments, but each one actionable.
From internal test to product
Before launch, Anthropic tested Code Review on its own processes for months.
The result: the share of thoroughly reviewed PRs increased from 16% to 54%. During testing, the tool caught a real bug: a developer changed one line in a production service, and this "innocent" fix would have broken the authentication mechanism. A human reviewer would likely have missed it. The agent did not.
Customization for the team
Teams can configure their own review rules via the CLAUDE.md file —
add project-specific standards that agents will pay attention to.
This makes the tool adaptable to a specific stack and team culture, not
just a universal set of rules.
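The article does not document the rule syntax, but CLAUDE.md files generally carry plain-prose instructions the agents read before reviewing. The rules below are invented examples of what a team might put there, not real Anthropic guidance:

```markdown
# CLAUDE.md — team review rules (hypothetical example)

## Code review
- Flag any database query inside a request handler that lacks a timeout.
- Treat direct use of `time.sleep` in async code as a critical finding.
- Our error-handling convention: never swallow exceptions silently;
  log with the request ID before re-raising.
```

Because the rules are ordinary prose rather than a DSL, the barrier to encoding project-specific conventions is low.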
- ✔️ Launch: March 9, 2026, research preview
- ✔️ Focus: exclusively logical errors, not style
- ✔️ Internal result: 16% → 54% thorough reviews
- ✔️ Customization: CLAUDE.md file for custom rules
Conclusion: Anthropic deliberately sacrificed breadth for depth, and internal data confirms that this bet is justified.
📌 The market reacts: OpenAI and GitHub Copilot aren't standing still
GitHub Copilot already does reviews — but differently
GitHub Copilot Code Review exists and has already accumulated over 60 million reviews.
But its approach is different: faster and broader, not necessarily deeper. Anthropic
and GitHub have occupied different niches in the same market — and both niches are real.
The difference between players is not whether to automate reviews, but
how deeply, how quickly, and at what price.
GitHub Copilot Code Review is no longer just IDE hints.
According to GitHub, as of early 2026,
the tool conducted over 60 million reviews, and in 71% of them left
actionable comments. Copilot can already analyze an entire repository for
context, integrates with CodeQL and ESLint, and most importantly — for many
teams, it is already included in the subscription cost.
Where Anthropic's competitive advantage lies
The key difference is in depth and focus. Claude Code Review spends an
average of 20 minutes on one PR and is aimed at large, complex changes:
for PRs with 1000+ lines, it finds problems in 84% of cases. Copilot is faster (seconds instead of minutes) but is positioned as a "first pass," not deep analysis. The question the market will decide: is depth worth $15–25 per review if Copilot is already included in the subscription?
An honest look at limitations
Claude Code Review still has significant limitations: integration only with GitHub
(no GitLab, no Bitbucket), available only to Teams and Enterprise —
individual developers and small teams are currently cut off. There is also an irony: security researchers have previously found critical vulnerabilities in Claude Code itself. A tool that checks code is not immune to bugs of its own.
- ✔️ GitHub Copilot: 60+ million reviews, 71% with actionable comments, included in subscription
- ✔️ Claude Code Review: deeper analysis, 20 min per PR, $15–25, GitHub only
- ✔️ OpenAI Codex: agent tools are evolving, no direct review analogue yet
- ⚠️ Limitations: GitHub only, Teams/Enterprise only, research preview
Conclusion: Anthropic and GitHub Copilot are not direct competitors but different bets: one on depth and enterprise, the other on scale and integration into the familiar workflow.
❓ Frequently Asked Questions (FAQ)
Will Claude Code Review replace live reviewers?
No, at least not now. Agents cannot approve or reject pull requests — that remains with a human. The tool takes on the routine task of finding problems, while the reviewer focuses on solutions and architectural issues.
Is the tool suitable for small teams?
At $15–25 per review — most likely no, if you have 2–3 developers and 5 PRs per week. Savings appear at scale: dozens of PRs daily, active use of AI for code generation, large teams.
What programming languages are supported?
Anthropic does not publish an exhaustive list, but Claude Code traditionally works well with Python, JavaScript, TypeScript, Go, and major web development languages. Support for specific corporate languages may be limited.
How safe is it to transfer code to an external AI?
This is a valid question that should be asked. Anthropic offers corporate confidentiality terms, but each company must independently assess the risks according to its security requirements and jurisdiction.
✅ Conclusions
- 🔹 AI code generation created a new problem — human review cannot keep up with the pace, and Claude Code Review is the first attempt to solve this systematically
- 🔹 The multi-agent architecture with parallel checks increased the share of thoroughly reviewed PRs within Anthropic from 16% to 54%
- 🔹 The price of $15–25 per review is justified for large teams, but currently high for small businesses
- 🔹 Anthropic occupies a new niche — deep post-factum PR analysis — rather than directly competing with GitHub Copilot
Main idea:
Claude Code Review is not a tool to get rid of reviewers, but a tool to help reviewers keep up with the pace set by AI itself.