Evaluating the logical reasoning capabilities of leading AI models across a range of complex tasks
| Rank | Model | Total Score |
|---|---|---|
| 1 | Gemini 3 Pro (preview) | 36/40 |
| 2 | GPT-5.2 (xhigh) | 34/40 |
| 3 | Grok 4 Reasoning | 27/40 |
| 4 | GPT-5 | 26/40 |
| 5 | GPT-5.1 | 25/40 |
| 5 | Gemini 3 Flash (preview) | 25/40 |
| 7 | Qwen 3 Max (thinking) | 22/40 |
| 7 | Kimi K2 Thinking | 22/40 |
| 9 | Claude Sonnet 4.5 (high, via OpenRouter) | 21/40 |
| 10 | Grok 4 Fast Reasoning | 20/40 |
| 11 | Gemini 2.5 Pro | 19/40 |
| 11 | Claude Opus 4.1 | 19/40 |
| 13 | Claude Haiku 4.5 | 14/40 |
| 14 | MiniMax M2 | 10/40 |
| 15 | gpt-oss-120b | 8/40 |
| 16 | Qwen3-235B-A22B-2507 | 4/40 |
I'm working on additional AI benchmark evaluations that will be added to this page soon.
Subscribe to my YouTube channel to be notified when new benchmarks are released.