Detect a backdoor in the dnsmasq DNS server: DHCP option 224 triggers command execution via execvp(), and the command string is assembled through obfuscated string construction.
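To make the trigger surface concrete, here is a minimal sketch of how a raw DHCP options field could be walked to flag option 224. It assumes the standard code/length/value option encoding (0 is padding, 255 ends the list) and treats codes 224 to 254 as the site-specific range. The function names `iter_dhcp_options` and `find_private_use_options` are hypothetical helpers for illustration; they are not part of dnsmasq or of the benchmark, and the sample payload is invented, since the task description does not say what the backdoor expects inside the option.

```python
from typing import Iterator, List, Tuple

PAD, END = 0, 255
# Assumption: 224..254 is the site-specific (private use) option range.
PRIVATE_USE = range(224, 255)


def iter_dhcp_options(options: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (code, value) pairs from a raw DHCP options field.

    `options` is expected to start right after the 4-byte magic cookie.
    Parsing stops at the END option or at the first truncated entry.
    """
    i = 0
    while i < len(options):
        code = options[i]
        if code == PAD:          # single-byte padding, no length field
            i += 1
            continue
        if code == END:          # end of the option list
            return
        if i + 1 >= len(options):
            return               # truncated: length byte is missing
        length = options[i + 1]
        value = options[i + 2 : i + 2 + length]
        if len(value) < length:
            return               # truncated: value is shorter than declared
        yield code, value
        i += 2 + length


def find_private_use_options(options: bytes) -> List[Tuple[int, bytes]]:
    """Return every option whose code falls in the site-specific range."""
    return [(c, v) for c, v in iter_dhcp_options(options) if c in PRIVATE_USE]


# Hypothetical sample: message type (53), padding, option 224, END,
# followed by trailing bytes that must be ignored.
sample = bytes([53, 1, 1, 0, 224, 2]) + b"id" + bytes([255, 224, 1, 0x41])
hits = find_private_use_options(sample)
```

A check like this only shows that an unusual option is present on the wire. Confirming a backdoor would still require tracing how the server handles that option code and whether its value can reach an `execvp()` call.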
Performance
| Model | Pass Rate | Avg Cost | Avg Time |
|---|---|---|---|
| gemini-3.1-pro-preview | 100% | $0.52 | 9m |
| gpt-5.2-codex-xhigh | 100% | $0.95 | 17m |
| kimi-k2.5 | 67% | $0.31 | 19m |
| gpt-5.2-codex-high | 67% | $0.82 | 11m |
| claude-opus-4.5 | 67% | $1.38 | 21m |
| claude-opus-4.6 | 67% | $2.59 | 49m |
| gpt-5.2-codex | 33% | $0.28 | 4m |
| gpt-5.2 | 33% | $0.30 | 12m |
| glm-4.7 | 33% | $0.33 | 15m |
| gemini-3-flash-preview | 33% | $0.35 | 5m |
| glm-5 | 33% | $0.48 | 45m |
| gemini-2.5-pro | 33% | $0.67 | 11m |
| gpt-5.3-codex-xhigh | 33% | $1.28 | 25m |
| grok-4.1-fast | 0% | $0.02 | 5m |
| deepseek-v3.2 | 0% | $0.06 | 12m |
| gpt-5.3-codex | 0% | $0.15 | 2m |
| claude-haiku-4.5 | 0% | $0.29 | 5m |
| claude-sonnet-4 | 0% | $0.37 | 4m |
| gpt-5 | 0% | $0.40 | 11m |
| gpt-5.3-codex-high | 0% | $0.42 | 9m |
| grok-4 | 0% | $0.44 | 9m |
| gemini-3-pro-preview | 0% | $0.77 | 6m |
| claude-sonnet-4.5 | 0% | $0.88 | 7m |