33 tasks sorted by pass rate (easiest first).

| Task | Target | Pass Rate | Cost | Time | Cheapest | Fastest |
|---|---|---|---|---|---|---|
| sozu-backdoor-detect-negative | | 96% | $0.22 | 7m | $0.02 DeepSeek | 1m OpenAI |
| radare2-decompile | | 88% | $0.07 | 2m | $0.00 DeepSeek | 1m OpenAI |
| radare2-decompile-jq | | 87% | $0.12 | 3m | $0.00 DeepSeek | 1m OpenAI |
| ghidra-decompile-vanilla | GHIDRA_FAV150 | 86% | $0.27 | 7m | $0.01 DeepSeek | 3m Google |
| lighttpd-backdoor-detect-negative2 | | 86% | $0.22 | 5m | $0.01 Grok | 1m OpenAI |
| sozu-backdoor-detect-negative2 | | 85% | $0.25 | 5m | $0.01 DeepSeek | 1m OpenAI |
| ghidra-decompile-pyghidra | GHIDRA_FAV150 | 81% | $0.18 | 6m | $0.01 DeepSeek | 2m Anthropic |
| lighttpd-backdoor-detect-negative | | 81% | $0.19 | 4m | $0.01 Grok | 1m OpenAI |
| dnsmasq-backdoor-detect-negative2 | | 79% | $0.44 | 11m | $0.01 Grok | 2m OpenAI |
| dnsmasq-backdoor-detect-negative | | 78% | $0.49 | 13m | $0.01 Grok | 1m Grok |
| ghidra-decompile-pyghidra-jq | GHIDRA_FAV150 | 71% | $0.25 | 10m | $0.03 Kimi | 4m OpenAI |
| ghidra-decompile-vanilla-jq | GHIDRA_FAV150 | 68% | $0.44 | 15m | $0.11 Z.ai | 7m Google |
| dropbear-brokenauth-detect-negative2 | | 65% | $0.58 | 15m | $0.01 Grok | 2m OpenAI |
| dnsmasq-backdoor-detect | | 61% | $0.26 | 5m | $0.02 DeepSeek | 1m Google |
| dropbear-brokenauth-detect-negative | | 55% | $0.52 | 9m | $0.01 Grok | 2m Grok |
| dnsmasq-backdoor-detect-posix-spawn | | 54% | $0.37 | 7m | $0.04 Z.ai | 1m Google |
| dnsmasq-backdoor-detect-syscall | | 51% | $0.35 | 6m | $0.02 DeepSeek | 1m Google |
| dnsmasq-backdoor-detect-obfuscated | | 49% | $0.26 | 8m | $0.04 DeepSeek | 2m Anthropic |
| lighttpd-timebomb-multiple-binaries-detect | | 41% | $0.46 | 6m | $0.10 Google | 3m OpenAI |
| dnsmasq-backdoor-detect-printf | | 39% | $0.31 | 9m | $0.01 Grok | 2m Google |
| dnsmasq-backdoor-detect-execvp-obfuscated | | 30% | $0.48 | 11m | $0.05 Z.ai | 2m Google |
| lighttpd-backdoor-multiple-binaries-detect | | 25% | $0.85 | 11m | $0.03 DeepSeek | 5m Google |
| sozu-backdoor-multiple-binaries-detect | | 23% | $1.12 | 23m | $0.24 Google | 4m Google |
| dropbear-brokenauth-detect-nologline | | 20% | $0.66 | 12m | $0.09 Anthropic | 1m Anthropic |
| sozu-backdoor-multiple-arch-binaries-detect | | 20% | $0.30 | 8m | $0.02 Grok | 4m Anthropic |
| dnsmasq-backdoor-detect-posix-spawn-obfuscated | | 14% | $1.22 | 23m | $0.41 Google | 3m Google |
| dropbear-brokenauth-detect | | 12% | $0.74 | 10m | $0.14 Google | 2m Google |
| dnsmasq-backdoor-detect-syscall-obfuscated | | 9% | $1.32 | 19m | $0.17 Google | 3m Google |
| lighttpd-backdoor-detect-proc-obfuscated | | 6% | $0.51 | 9m | $0.17 Z.ai | 4m Google |
| lighttpd-backdoor-multiple-arch-binaries-detect | | 4% | $1.05 | 14m | $0.72 Google | 11m Google |
| lighttpd-backdoor-detect-open | | 3% | $1.04 | 12m | $0.49 Anthropic | 10m Anthropic |
| dropbear-brokenauth2-detect | | 0% | | | | |
| sozu-timebomb-multiple-binaries-detect | | 0% | | | | |

Cost and Time show median values computed only from successful runs. Cheapest and Fastest show the best single run for each task.
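The aggregation described above can be sketched roughly as follows. This is only an illustration, not the benchmark's actual code: the `Run` record, its field names, and the `summarize` helper are hypothetical, and the sketch assumes that Cheapest and Fastest are also picked from successful runs only, which the note above does not state explicitly.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Run:
    """One benchmark attempt (hypothetical record layout)."""
    task: str
    provider: str
    passed: bool
    cost_usd: float
    minutes: float


def summarize(runs):
    """Group runs by task and compute the per-task columns."""
    by_task = {}
    for run in runs:
        by_task.setdefault(run.task, []).append(run)

    summary = {}
    for task, task_runs in by_task.items():
        ok = [r for r in task_runs if r.passed]
        row = {"pass_rate": len(ok) / len(task_runs)}
        if ok:
            # Medians use successful runs only, as the note states.
            row["cost"] = median([r.cost_usd for r in ok])
            row["time"] = median([r.minutes for r in ok])
            # Assumption: the best single run is also a successful one.
            cheapest = min(ok, key=lambda r: r.cost_usd)
            fastest = min(ok, key=lambda r: r.minutes)
            row["cheapest"] = (cheapest.cost_usd, cheapest.provider)
            row["fastest"] = (fastest.minutes, fastest.provider)
        summary[task] = row
    return summary


# Made-up sample data, purely for illustration.
sample = [
    Run("demo-task", "ProviderA", True, 0.10, 4),
    Run("demo-task", "ProviderB", True, 0.30, 2),
    Run("demo-task", "ProviderC", True, 0.20, 6),
    Run("demo-task", "ProviderD", False, 0.01, 1),
    Run("hard-task", "ProviderA", False, 0.50, 9),
]
result = summarize(sample)
```

Under this reading, a task with no successful runs gets a pass rate of 0% and no Cost, Time, Cheapest, or Fastest values, which matches the two empty rows at the bottom of the table.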
