Swirlsmith
#2

OpenClaw

Open-source agentic framework with deep shell integration, recursive planning, and multi-provider support.

91.8 (-0.4%, 24h)
Non-Gameable Scoring

Scores are derived from established benchmarks, adjusted for harness-specific performance across four dimensions: Coding, Reasoning, Tool Use, and Autonomy.

Each dimension starts from public benchmark data and applies harness-specific modifiers based on tool integration, context handling, and orchestration quality. The overall score is a weighted composite that penalizes narrow optimization.
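A minimal sketch of how a composite that penalizes narrow optimization could be computed. The page does not publish its weights or penalty formula, so the equal weights, the `SPREAD_PENALTY` constant, and the use of standard deviation as the penalty term are all assumptions made for illustration, not the leaderboard's actual method.

```python
from statistics import pstdev

# Hypothetical equal weights for the four dimensions named above.
# The real weights are not stated in the source.
WEIGHTS = {"coding": 0.25, "reasoning": 0.25, "tool_use": 0.25, "autonomy": 0.25}

# Hypothetical penalty factor applied to the spread across dimensions.
SPREAD_PENALTY = 0.5


def composite_score(scores, weights=WEIGHTS, spread_penalty=SPREAD_PENALTY):
    """Weighted mean of per-dimension scores, minus a penalty for uneven profiles."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")

    total_weight = sum(weights.values())
    weighted_mean = sum(weights[d] * scores[d] for d in weights) / total_weight

    # Population standard deviation across dimensions: zero for a perfectly
    # balanced profile, larger when a few dimensions dominate.
    spread = pstdev([scores[d] for d in weights])

    return max(0.0, weighted_mean - spread_penalty * spread)


balanced = {"coding": 90, "reasoning": 90, "tool_use": 90, "autonomy": 90}
narrow = {"coding": 100, "reasoning": 100, "tool_use": 80, "autonomy": 80}

print(composite_score(balanced))
print(composite_score(narrow))
```

Both example profiles have the same weighted mean, but under these assumptions the narrow profile scores lower because its dimensions are spread further apart.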

Model               Overall
Claude Sonnet 4.6   93.5
GPT-5.4             91.2
Grok 3              90.1