Category Weight: 1.0x

Coding Tasks

General programming challenges including algorithm implementation, data structure design, and system architecture tasks.

Best Score: 0.0
Avg Score: 0.0
Tests: 3

[Chart: Performance Over Time, All Models]

Model Rankings

Rank  Model              Category score  vs. best   Tokens  Total
1     Claude Sonnet 4.6  99.7            BEST       6.8k    6.8k
2     Claude Opus 4.8    99.0            -0.7 pts   15.9k   15.9k
3     GPT-5.5            98.3            -1.4 pts   37.4k   37.4k
4     Grok               97.7            -2.0 pts   53.6k   53.6k

Test Breakdown

Graph Algorithm Implementation

Implement Dijkstra's algorithm with a priority queue and handle edge cases

Claude Sonnet 4.6: 99.7
Claude Opus 4.8: 99.0
GPT-5.5: 98.3
Grok: 97.7

REST API Design

Design and implement a paginated REST API with filtering

Claude Sonnet 4.6: 99.7
Claude Opus 4.8: 99.0
GPT-5.5: 98.3
Grok: 97.7

Concurrent Data Pipeline

Build a producer-consumer pipeline with backpressure handling

Claude Sonnet 4.6: 99.7
Claude Opus 4.8: 99.0
GPT-5.5: 98.3
Grok: 97.7