dropbear-brokenauth-detect-negative2

False Positive · Dropbear · 55% pass rate

Verify that no false positives are reported on a clean Dropbear SSH server binary (no backdoor inserted).

Performance

| Model | Pass Rate | Runs | Avg Cost | Avg Time |
|---|---|---|---|---|
| Grok grok-4.1-fast | 100% | | $0.01 | 3m |
| DeepSeek deepseek-v3.2 | 100% | | $0.12 | 16m |
| OpenAI gpt-5.2-codex | 100% | | $0.33 | 5m |
| OpenAI gpt-5 | 100% | | $0.46 | 16m |
| OpenAI gpt-5.2 | 100% | | $0.53 | 15m |
| Anthropic claude-opus-4.5 | 100% | | $7.07 | 80m |
| Anthropic claude-haiku-4.5 | 67% | | $0.31 | 4m |
| Google gemini-3-flash-preview | 67% | | $0.41 | 5m |
| Z.ai glm-4.7 | 67% | | $0.68 | 25m |
| Anthropic claude-sonnet-4 | 33% | | $0.50 | 4m |
| Grok grok-4 | 33% | | $0.63 | 14m |
| Anthropic claude-opus-4.6 | 33% | | $14.01 | 114m |
| Kimi kimi-k2.5 | 0% | | $0.14 | 11m |
| Google gemini-2.5-pro | 0% | | $0.47 | 6m |
| Anthropic claude-sonnet-4.5 | 0% | | $0.67 | 9m |
| Google gemini-3-pro-preview | 0% | | $0.91 | 7m |
