All Tasks

33 tasks sorted by pass rate (easiest first).

Task                                             Target         Pass Rate  Cost   Time  Cheapest         Fastest
sozu-backdoor-detect-negative                    -              97%        $0.19  7m    $0.02 DeepSeek   1m OpenAI
radare2-decompile                                -              83%        $0.07  2m    $0.00 DeepSeek   1m Google
sozu-backdoor-detect-negative2                   -              82%        $0.18  4m    $0.01 DeepSeek   1m OpenAI
ghidra-decompile-vanilla                         GHIDRA_FAV150  81%        $0.27  7m    $0.01 DeepSeek   3m Google
radare2-decompile-jq                             -              81%        $0.12  3m    $0.00 DeepSeek   1m Google
lighttpd-backdoor-detect-negative2               -              79%        $0.16  4m    $0.01 Grok       1m Google
lighttpd-backdoor-detect-negative                -              77%        $0.17  4m    $0.01 Grok       1m Google
dnsmasq-backdoor-detect-negative2                -              73%        $0.43  10m   $0.01 Grok       2m Grok
ghidra-decompile-pyghidra                        GHIDRA_FAV150  73%        $0.18  7m    $0.01 DeepSeek   2m Anthropic
dnsmasq-backdoor-detect-negative                 -              72%        $0.43  8m    $0.01 Grok       1m Grok
dnsmasq-backdoor-detect                          -              58%        $0.19  4m    $0.02 DeepSeek   1m Google
ghidra-decompile-pyghidra-jq                     GHIDRA_FAV150  58%        $0.26  11m   $0.03 Kimi       4m Anthropic
ghidra-decompile-vanilla-jq                      GHIDRA_FAV150  58%        $0.42  17m   $0.12 Google     8m Anthropic
dropbear-brokenauth-detect-negative2             -              55%        $0.38  12m   $0.01 Grok       2m OpenAI
dnsmasq-backdoor-detect-syscall                  -              48%        $0.22  4m    $0.02 DeepSeek   1m Google
dnsmasq-backdoor-detect-obfuscated               -              46%        $0.19  7m    $0.04 DeepSeek   2m Anthropic
dropbear-brokenauth-detect-negative              -              46%        $0.44  7m    $0.01 Grok       2m Grok
dnsmasq-backdoor-detect-posix-spawn              -              44%        $0.14  6m    $0.04 Z.ai       1m Google
dnsmasq-backdoor-detect-printf                   -              40%        $0.22  5m    $0.01 Grok       2m Google
lighttpd-timebomb-multiple-binaries-detect       -              25%        $0.44  5m    $0.10 Google     4m Anthropic
sozu-backdoor-multiple-arch-binaries-detect      -              25%        $0.28  8m    $0.02 Grok       4m Anthropic
dnsmasq-backdoor-detect-execvp-obfuscated        -              23%        $0.42  11m   $0.05 Z.ai       2m Google
lighttpd-backdoor-multiple-binaries-detect       -              23%        $0.78  10m   $0.03 DeepSeek   5m Google
dropbear-brokenauth-detect-nologline             -              19%        $0.38  6m    $0.09 Anthropic  1m Anthropic
sozu-backdoor-multiple-binaries-detect           -              17%        $1.10  20m   $0.26 Google     5m Google
dropbear-brokenauth-detect                       -              13%        $0.53  6m    $0.14 Google     2m Google
dnsmasq-backdoor-detect-posix-spawn-obfuscated   -              8%         $2.26  35m   $0.46 Google     3m Google
dnsmasq-backdoor-detect-syscall-obfuscated       -              8%         $1.32  19m   $0.35 Google     3m Google
lighttpd-backdoor-detect-proc-obfuscated         -              8%         $0.51  9m    $0.17 Z.ai       4m Google
lighttpd-backdoor-detect-open                    -              4%         $1.04  12m   $0.49 Anthropic  10m Anthropic
dropbear-brokenauth2-detect                      -              0%         -      -     -                -
lighttpd-backdoor-multiple-arch-binaries-detect  -              0%         -      -     -                -
sozu-timebomb-multiple-binaries-detect           -              0%         -      -     -                -

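Cost and Time show median values computed only from successful runs. Cheapest and Fastest show the best single run for each task, labeled with the provider of the model that achieved it. Tasks with a 0% pass rate have no successful runs, so they have no Cost, Time, Cheapest, or Fastest values.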
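The aggregation described above can be sketched in a few lines of Python. This is only an illustration, not the benchmark's actual code: the `Run` record, its field names, and the `summarize` helper are hypothetical, and the sketch assumes that Cheapest and Fastest are picked from successful runs only, which the page does not state explicitly.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Run:
    # Hypothetical record for one attempt at a task; field names are assumptions.
    provider: str
    passed: bool
    cost_usd: float
    minutes: float


def summarize(runs: list[Run]) -> dict:
    """Aggregate one task's runs into pass rate, median cost/time, and best runs."""
    if not runs:
        raise ValueError("a task needs at least one run")
    pass_rate = sum(r.passed for r in runs) / len(runs)
    ok = [r for r in runs if r.passed]
    if not ok:
        # No successful runs: only the pass rate is defined.
        return {"pass_rate": pass_rate, "cost": None, "time": None,
                "cheapest": None, "fastest": None}
    # Assumption: the best single runs are chosen among successful runs.
    cheapest = min(ok, key=lambda r: r.cost_usd)
    fastest = min(ok, key=lambda r: r.minutes)
    return {
        "pass_rate": pass_rate,
        "cost": median(r.cost_usd for r in ok),   # median over successes only
        "time": median(r.minutes for r in ok),    # median over successes only
        "cheapest": (cheapest.cost_usd, cheapest.provider),
        "fastest": (fastest.minutes, fastest.provider),
    }
```

Calling `summarize` on a task whose runs all failed returns `None` for every column except the pass rate, which matches the three 0% tasks at the bottom of the table.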
