OpenAI Beats xAI’s Grok & Google Gemini in AI Chess Tournament

Magnus vs. Hikaru, Fischer vs. Spassky, Karpov vs. Kasparov. Chess has seen some great rivalries down the years. Is Musk vs. Altman next?
OpenAI's o3 model has just triumphed over Grok 4, built by Elon Musk's xAI, in a three-day AI chess tournament that showcased the power and potential of the world’s largest AI companies.
The competition took place on Google-owned platform Kaggle, bringing together eight LLMs from major AI developers including Anthropic, Google, OpenAI, xAI and Chinese firms DeepSeek and Moonshot AI.
Unlike traditional chess computers built specifically for the game, these models were designed for general consumer applications, making their strategic gameplay particularly noteworthy for AI researchers.
"Up until the semi finals, it seemed like nothing would be able to stop Grok 4 on its way to winning the event," says Pedro Pinhata, a Chess.com writer covering the tournament.
"Despite a few moments of weakness, X's AI seemed to be by far the strongest chess player... But the illusion fell through on the last day of the tournament."
Strategic errors cost Grok the championship
OpenAI's o3 went unbeaten throughout the competition, while Grok 4 made critical mistakes in the final rounds, including repeatedly losing its queen.
In his report on the tournament, Pinhata described Grok's play as "unrecognisable" and "blundering", enabling OpenAI’s model to secure "convincing wins".
Chess grandmaster Hikaru Nakamura observed during his livestream commentary: "Grok made so many mistakes in these games, but OpenAI did not."
Google's Gemini model claimed third place after defeating a different OpenAI model in earlier rounds.
Musk downplays chess focus ahead of final
Before Thursday's championship match, Musk posted on X that xAI's tournament success had been a "side effect" and the company "spent almost no effort on chess".
This comment preceded Grok's disappointing final performance, where strategic errors undermined its earlier dominance in the competition.
The rivalry between Musk and OpenAI CEO Sam Altman has intensified as both claim their latest models represent the most advanced AI systems available.
Chess remains benchmark for AI development
Tech companies have long used chess to gauge progress in computing, and modern chess engines are now virtually unbeatable, even by top human players.
AI developers employ such tests as benchmarks to examine their models' reasoning and strategic thinking abilities across rule-based games.
The tournament reflects broader industry trends where general-purpose AI models are being tested on specialised tasks to demonstrate their versatility and problem-solving capabilities.
DeepMind's AlphaGo famously defeated human Go champions in the late 2010s, leading South Korean master Lee Se-dol to retire in 2019.
“There is an entity that cannot be defeated,” he said when announcing his retirement.
Historical context of human versus machine competition
Since IBM’s Deep Blue beat Garry Kasparov in 1997, it has been widely accepted that humans cannot match computers at the game.
Kasparov later compared the US$10m computer's intelligence to that of an alarm clock, adding: "losing to a US$10m alarm clock did not make me feel any better".
AI companies now measure themselves against one another rather than against grandmasters.
These competitions continue to demonstrate AI's evolution from specialised chess computers to versatile systems capable of strategic gameplay alongside their primary consumer applications.


