
Today, OpenAI rival Anthropic announced its Claude 4 models, which score significantly higher than Claude 3 in benchmarks, but we’re left disappointed by the unchanged 200,000-token context window limit.
In a blog post, Anthropic said Claude Opus 4 is the company’s most powerful model, and it also called it the best coding model in the industry.

For example, Claude Opus 4 scored 72.5 percent on SWE-bench (short for Software Engineering Benchmark) and 43.2 percent on Terminal-bench.
“It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, with the ability to work continuously for several hours, dramatically outperforming all Sonnet models and significantly expanding what AI agents can accomplish,” Anthropic noted.
While benchmarks put Claude Sonnet 4 and Claude Opus 4 ahead of their predecessors and of competitors like Gemini 2.5 Pro in coding, we’re still concerned about the models’ 200,000-token context window limit.

This could be one of the reasons why Claude 4 models excel at coding and complex problem-solving tasks in these benchmarks: the models aren’t being tested against a large context.
For comparison, Google’s Gemini 2.5 Pro ships with a 1 million token context window, and support for a 2 million token context window is also in the works.
OpenAI’s GPT-4.1 models also offer a context window of up to one million tokens.

| Model | Description | Input | Prompt Caching Write | Prompt Caching Read | Output | Context Window | Batch Processing Discount |
|---|---|---|---|---|---|---|---|
| Claude Opus 4 | Most intelligent model for complex tasks | $15 / MTok | $18.75 / MTok | $1.50 / MTok | $75 / MTok | 200K | 50% discount with batch processing |
| Claude Sonnet 4 | Optimal balance of intelligence, cost, and speed | $3 / MTok | $3.75 / MTok | $0.30 / MTok | $15 / MTok | 200K | 50% discount with batch processing |

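
To get a feel for what these per-million-token (MTok) prices mean in practice, here is a rough Python sketch that estimates the cost of a single request from the table’s input and output rates. The `estimate_cost` helper and its parameters are hypothetical names used for illustration, not part of any Anthropic SDK, and the sketch assumes the 50% batch discount applies equally to input and output tokens. Prompt caching rates are left out to keep the arithmetic simple.

```python
# Prices in USD per million tokens (MTok), copied from the table above.
PRICES_PER_MTOK = {
    "Claude Opus 4": {"input": 15.00, "output": 75.00},
    "Claude Sonnet 4": {"input": 3.00, "output": 15.00},
}

# Assumption: the 50% batch discount applies to both input and output tokens.
BATCH_DISCOUNT = 0.50


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Return the estimated cost in USD of one request (hypothetical helper)."""
    rates = PRICES_PER_MTOK[model]
    cost = (input_tokens / 1_000_000) * rates["input"]
    cost += (output_tokens / 1_000_000) * rates["output"]
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    # A large prompt (150,000 tokens) with a short reply (4,000 tokens).
    for model in PRICES_PER_MTOK:
        usd = estimate_cost(model, input_tokens=150_000, output_tokens=4_000)
        print(f"{model}: ${usd:.2f}")
```

Under these assumptions, a 150,000-token prompt with a 4,000-token reply works out to roughly $2.55 on Claude Opus 4 and $0.51 on Claude Sonnet 4, and about half of that with batch processing.
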
Claude is still lagging behind the competition when it comes to the context window, which is important in large projects.
