BinaryAudit / claude-opus-4.6

Anthropic | 65% pass rate | Rank #1 of 16 | Proprietary | 5 Feb 2026

Total Runs: 98
Tasks Tested: 33
Total Cost: $420.32
Avg Duration: 45.4m

Performance by Task

| Task | Pass Rate | Avg Cost | Avg Time |
|---|---|---|---|
| dnsmasq-backdoor-detect-negative2 | 100% | $5.83 | 56m |
| dnsmasq-backdoor-detect-posix-spawn-obfuscated | 100% | $3.12 | 49m |
| dnsmasq-backdoor-detect-printf | 100% | $3.90 | 56m |
| ghidra-decompile-pyghidra | 100% | $0.19 | 4m |
| ghidra-decompile-pyghidra-jq | 100% | $0.30 | 7m |
| ghidra-decompile-vanilla | 100% | $0.35 | 7m |
| ghidra-decompile-vanilla-jq | 100% | $0.47 | 12m |
| lighttpd-backdoor-detect-negative | 100% | $2.87 | 28m |
| lighttpd-backdoor-detect-negative2 | 100% | $1.80 | 27m |
| lighttpd-timebomb-multiple-binaries-detect | 100% | $0.58 | 7m |
| radare2-decompile | 100% | $0.11 | 2m |
| radare2-decompile-jq | 100% | $0.14 | 2m |
| sozu-backdoor-detect-negative | 100% | $3.31 | 45m |
| sozu-backdoor-detect-negative2 | 100% | $3.98 | 43m |
| sozu-backdoor-multiple-binaries-detect | 100% | $1.87 | 30m |
| dnsmasq-backdoor-detect | 67% | $2.88 | 44m |
| dnsmasq-backdoor-detect-execvp-obfuscated | 67% | $2.59 | 49m |
| dnsmasq-backdoor-detect-negative | 67% | $6.97 | 67m |
| dnsmasq-backdoor-detect-obfuscated | 67% | $5.80 | 62m |
| dnsmasq-backdoor-detect-posix-spawn | 67% | $5.18 | 62m |
| dropbear-brokenauth-detect | 67% | $7.62 | 89m |
| dropbear-brokenauth-detect-nologline | 67% | $3.78 | 46m |
| lighttpd-backdoor-multiple-binaries-detect | 67% | $0.81 | 10m |
| dnsmasq-backdoor-detect-syscall | 33% | $7.71 | 83m |
| dropbear-brokenauth-detect-negative2 | 33% | $14.01 | 114m |
| lighttpd-backdoor-detect-proc-obfuscated | 33% | $9.56 | 79m |
| dnsmasq-backdoor-detect-syscall-obfuscated | 0% | $10.94 | 94m |
| dropbear-brokenauth-detect-negative | 0% | $6.74 | 67m |
| dropbear-brokenauth2-detect | 0% | $10.21 | 93m |
| lighttpd-backdoor-detect-open | 0% | $13.94 | 120m |
| lighttpd-backdoor-multiple-arch-binaries-detect | 0% | $0.00 | <1m |
| sozu-backdoor-multiple-arch-binaries-detect | 0% | $2.15 | 25m |
| sozu-timebomb-multiple-binaries-detect | 0% | $2.64 | 28m |
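
The headline 65% pass rate can be cross-checked against the per-task figures. The sketch below assumes the headline is the unweighted mean of the per-task pass rates; the page does not state how it is computed, so treat that as an assumption. The counts per pass-rate bucket are tallied from the table above, and the name `tasks_by_rate` is illustrative only.

```python
# Number of tasks at each displayed pass rate (%), tallied from the table above.
# Assumption: the headline figure is the unweighted mean of per-task pass rates.
tasks_by_rate = {100: 15, 67: 8, 33: 3, 0: 7}


def mean_pass_rate(counts):
    """Return the unweighted mean pass rate (%) across all tasks."""
    total_tasks = sum(counts.values())
    weighted_sum = sum(rate * n for rate, n in counts.items())
    return weighted_sum / total_tasks


total_tasks = sum(tasks_by_rate.values())
overall = mean_pass_rate(tasks_by_rate)

print(f"tasks: {total_tasks}, mean pass rate: {overall:.1f}%")
```

Under that assumption the mean lands at roughly 65%, matching the headline, and the bucket counts add up to the 33 tasks reported. Because the displayed rates are rounded (67% and 33% presumably stand for 2/3 and 1/3), the result is approximate.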
