java-distributed-context-propagation

Java, 0% pass rate

Instrument a Java client-server application with minimal OpenTelemetry (OTEL) tracing. The client runs two workflows, one with a valid token and one with an invalid token. The run must produce exactly 2 trace IDs, one per workflow.
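One way to picture the requirement is that each workflow starts a single root trace on the client, and the server joins that trace instead of starting its own. The sketch below models this with the standard library only, using the W3C `traceparent` header layout (`version-traceid-parentid-flags`). It is not the OpenTelemetry SDK and not the task's reference solution. The class name, the `handleRequest` helper, and the token strings are hypothetical.

```java
import java.security.SecureRandom;
import java.util.LinkedHashSet;
import java.util.Set;

// Stdlib-only model of W3C traceparent propagation; this is NOT the OpenTelemetry SDK.
final class TraceContextSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns the given number of random bytes encoded as lowercase hex.
    static String randomHex(int bytes) {
        byte[] buf = new byte[bytes];
        RANDOM.nextBytes(buf);
        StringBuilder sb = new StringBuilder(bytes * 2);
        for (byte b : buf) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Client side: start a new root trace for one workflow and build the header value.
    static String newTraceparent() {
        String traceId = randomHex(16); // 32 hex characters
        String spanId = randomHex(8);   // 16 hex characters
        return "00-" + traceId + "-" + spanId + "-01";
    }

    // Server side: read the trace ID from the incoming header so the server
    // span joins the client's trace instead of starting a new one.
    static String extractTraceId(String traceparent) {
        String[] parts = traceparent.split("-");
        if (parts.length != 4 || parts[1].length() != 32) {
            throw new IllegalArgumentException("malformed traceparent: " + traceparent);
        }
        return parts[1];
    }

    // Hypothetical server handler: returns the trace ID its span would be recorded under.
    // Whether the token is valid or not does not change which trace the request belongs to.
    static String handleRequest(String traceparent, String token) {
        return extractTraceId(traceparent);
    }

    public static void main(String[] args) {
        Set<String> traceIds = new LinkedHashSet<>();
        for (String token : new String[] {"valid-token", "invalid-token"}) {
            String header = newTraceparent();           // one root trace per workflow
            traceIds.add(extractTraceId(header));       // trace ID of the client span
            traceIds.add(handleRequest(header, token)); // trace ID of the server span
        }
        System.out.println("distinct trace IDs: " + traceIds.size());
    }
}
```

In this model the server never generates a trace ID of its own, so the number of distinct IDs equals the number of workflows the client starts.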

Common failure modes

- Test expects exactly 2 trace IDs, models produce 3
- Java OTEL SDK complexity
- Models create extra traces for initialization/startup
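The 2-versus-3 mismatch can be illustrated by counting distinct trace IDs over a set of exported spans. The sketch below assumes the check counts distinct IDs across all spans; the source does not say how the test actually counts them. `SpanRecord`, the span names, and the shortened trace IDs are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class TraceIdCount {
    // Minimal stand-in for an exported span; only the fields needed for counting.
    static final class SpanRecord {
        final String traceId;
        final String name;

        SpanRecord(String traceId, String name) {
            this.traceId = traceId;
            this.name = name;
        }
    }

    // Collects the distinct trace IDs across all spans, in first-seen order.
    static Set<String> distinctTraceIds(List<SpanRecord> spans) {
        Set<String> ids = new LinkedHashSet<>();
        for (SpanRecord span : spans) {
            ids.add(span.traceId);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two workflows, each with a client span and a server span sharing one trace ID.
        List<SpanRecord> twoWorkflows = Arrays.asList(
            new SpanRecord("aaaa", "client valid-token workflow"),
            new SpanRecord("aaaa", "server handle request"),
            new SpanRecord("bbbb", "client invalid-token workflow"),
            new SpanRecord("bbbb", "server handle request"));

        // Same run, plus a span created during startup outside any workflow.
        List<SpanRecord> withStartupSpan = new ArrayList<>(twoWorkflows);
        withStartupSpan.add(new SpanRecord("cccc", "server startup"));

        System.out.println(distinctTraceIds(twoWorkflows).size());    // prints 2
        System.out.println(distinctTraceIds(withStartupSpan).size()); // prints 3
    }
}
```

A span started outside the two workflows has no parent context to inherit, so it becomes the root of a new trace and adds a third ID.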

Performance

Model	Pass Rate	Avg Cost	Avg Time
gpt-5.2-codex	0%	$0.00	121m
grok-4.1-fast	0%	$0.10	16m
gemini-3-flash-preview	0%	$0.16	5m
kimi-k2-thinking	0%	$0.16	21m
deepseek-v3.2	0%	$0.21	23m
glm-4.7	0%	$0.23	12m
gpt-5.1	0%	$0.47	18m
claude-haiku-4.5	0%	$0.53	10m
gemini-3-pro-preview	0%	$0.57	10m
claude-sonnet-4.5	0%	$0.64	7m
gpt-5.2	0%	$0.72	17m
gpt-5.1-codex-max	0%	$1.00	25m
claude-opus-4.5	0%	$1.15	9m
grok-4	0%	$1.30	22m
