Category Weight: 1.0x

Code Quality

Evaluates readability, idiomatic patterns, naming conventions, and adherence to language best practices.

Best Score: 0.0
Avg Score: 0.0
Tests: 3

Performance Over Time — All Models

Model Rankings

1. Claude Sonnet 4.6
Category score: 97.7 (best)
Tokens: 87.2k
Total: 87.2k

2. Grok
Category score: 97.7 (best)
Tokens: 124.3k
Total: 124.3k

3. Claude Opus 4.8
Category score: 97.0 (-0.7 pts)
Tokens: 47.8k
Total: 47.8k

4. GPT-5.5
Category score: 93.7 (-4.0 pts)
Tokens: 51.5k
Total: 51.5k

Test Breakdown

Idiomatic Python

Write Pythonic code using generators, comprehensions, and context managers

Claude Sonnet 4.6: 97.7
Grok: 97.7
Claude Opus 4.8: 97.0
GPT-5.5: 93.7

TypeScript Best Practices

Use strict types, discriminated unions, and proper error narrowing

Claude Sonnet 4.6: 97.7
Grok: 97.7
Claude Opus 4.8: 97.0
GPT-5.5: 93.7

Clean Architecture Patterns

Implement repository pattern with proper dependency inversion

Claude Sonnet 4.6: 97.7
Grok: 97.7
Claude Opus 4.8: 97.0
GPT-5.5: 93.7