Swirlsmith
#1

Claude Code

🔒 Closed

Anthropic's agentic CLI with deep shell integration, multi-file editing, extended thinking, and sub-agent spawning.

💰 Included with Claude subscription · cli

76.9 Overall Score
Non-Gameable Scoring

Scores are derived from established benchmarks, adjusted for harness-specific performance across four dimensions: Coding, Reasoning, Tool Use, and Autonomy.

Each dimension starts from public benchmark data and applies harness-specific modifiers based on tool integration, context handling, and orchestration quality. The overall score is a weighted composite that penalizes narrow optimization.
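The exact weights, modifiers, and penalty are not published here, so the following is only a minimal sketch of how a composite of this kind could be computed. The equal weights, the multiplicative modifier, and the spread-based penalty (population standard deviation across dimensions) are all assumptions made for illustration, not the leaderboard's actual formula.

```python
from statistics import pstdev

# Hypothetical equal weights; the real weighting is not stated on this page.
WEIGHTS = {"coding": 0.25, "reasoning": 0.25, "tool_use": 0.25, "autonomy": 0.25}


def dimension_score(benchmark: float, modifier: float) -> float:
    """Scale a public benchmark score (0-100) by an assumed harness modifier.

    The result is clamped to the 0-100 range.
    """
    return max(0.0, min(100.0, benchmark * modifier))


def overall_score(dimensions: dict[str, float], spread_penalty: float = 0.5) -> float:
    """Weighted composite minus an assumed penalty for uneven dimensions.

    The penalty grows with the spread between dimensions, so a harness that
    excels in one area but lags in the others scores lower than a balanced one
    with the same weighted mean.
    """
    weighted = sum(WEIGHTS[name] * score for name, score in dimensions.items())
    penalty = spread_penalty * pstdev(list(dimensions.values()))
    return round(max(0.0, weighted - penalty), 1)


# Illustrative inputs only; these are not real leaderboard data.
balanced = {"coding": 70.0, "reasoning": 70.0, "tool_use": 70.0, "autonomy": 70.0}
narrow = {"coding": 100.0, "reasoning": 60.0, "tool_use": 60.0, "autonomy": 60.0}
```

Both example profiles have the same weighted mean, but under these assumptions the narrow profile ends up with the lower overall score, which is the behavior the description of "penalizing narrow optimization" suggests.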

Model               Overall
Claude Opus 4.6     76.9
Claude Sonnet 4.6   60.1
Claude Sonnet 4.5   41.2
Claude Opus 4.5     38.5
Claude Opus 4.1     29.2
Claude Opus 4       21.9