OTelBench
OpenTelemetry instrumentation benchmark for AI coding agents. Tests models on real-world tasks that add distributed tracing, metrics, and logging to multi-language codebases.
Open-source benchmarks for evaluating AI coding agents on real-world software engineering tasks.
Build system benchmark for AI coding agents. Tests models on fixing compilation errors, updating dependencies, and navigating complex build configurations.