1. Blitzy (Agent): 66.5%
2. Morph WarpGrep v2 (GPT-5.3-Codex) (Agent): 59.1%
3. OpenAI GPT-5.4 (Model): 57.7%
4. OpenAI GPT-5.3-Codex (Model): 56.8%
5. Qwen 3.6 Plus (Model): 56.6%
6. MiniMax M2.7 (Model): 56.2%
7. Claude Code (Opus 4.5) (Agent): 55.4%
8. OpenAI GPT-5.4 mini (Model): 54.4%
9. Gemini 3.1 Pro (Model): 54.2%
10. Claude Opus 4.6 (Model): 53.4%

Blitzy score independently verified by Quesma. All other scores self-reported. · 6 April 2026

Chart by Quesma

Sources

Scale SEAL leaderboard: labs.scale.com/leaderboard/swe_bench_pro_public

All product names, logos, and brands (™/®) are the property of their respective owners; they're used here solely for identification and comparison, and their use does not imply affiliation, endorsement, or sponsorship.