Google's Gemini has beaten Pokémon Blue (with a little help)

Google's most expensive AI model seems to have crossed an important milestone: beating a 29-year-old video game.

Last night, Google CEO Sundar Pichai posted triumphantly on X, "That's great! Gemini 2.5 Pro just finished Pokémon Blue!"

To be clear, the Gemini Plays Pokémon livestream was created not by Google but by (in his own words) "a 30-year-old software engineer who is unaffiliated with Google," who goes by Joel Z.

Google executives have nonetheless cheered the project on. For example, Logan Kilpatrick, product lead for Google AI Studio, posted last month that Gemini "is making great progress on completing Pokémon" and "earned its fifth badge (the next best model only has 3 so far, though with a different agent harness)." Pichai responded jokingly, "We are working on API, Artificial Pokémon Intelligence :)".

Why Pokémon? Back in February, Anthropic highlighted the progress its Claude AI model was making in Pokémon Red, writing that Claude's "extended thinking and agent training" brought "significant improvements" on unexpected tasks such as playing a classic game. (Pokémon Red and Blue are different versions of a Game Boy title that was first released in 1996 and kicked off the long-running Pokémon series.) There is even a Claude Plays Pokémon Twitch channel, which Joel Z cited as inspiration.

Despite that progress, Claude does not appear to have beaten Pokémon Red yet. Does this mean that Gemini is objectively better at the game? Joel Z urged viewers on his Twitch page not to draw that conclusion: "Please don't think of this as a benchmark for how well an LLM can play Pokémon. You can't really make direct comparisons - Gemini and Claude have different tools and receive different information."

Also, both AI models need help to play the game. This is where an agent harness comes in: it provides the model with a screenshot of the game overlaid with additional information, lets the model decide how to respond (which may involve calling specialized agents), and then presses the button corresponding to the AI's instruction.
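For readers curious what such a harness loop looks like in practice, here is a minimal Python sketch of the capture, decide, press cycle described above. It is not Joel Z's code: `FakeEmulator`, `Observation`, `run_harness`, and `scripted_model` are hypothetical names invented for illustration, and the "model" is a scripted stand-in rather than a call to Gemini or Claude.

```python
from dataclasses import dataclass
from typing import Callable, List

# Buttons the harness is willing to press (hypothetical whitelist).
VALID_BUTTONS = {"A", "B", "UP", "DOWN", "LEFT", "RIGHT", "START", "SELECT"}


@dataclass
class Observation:
    screenshot: bytes  # raw frame captured from the emulator
    overlay: str       # extra information layered on top of the frame


class FakeEmulator:
    """Stand-in for a real Game Boy emulator; it only records button presses."""

    def __init__(self) -> None:
        self.pressed: List[str] = []

    def capture(self) -> bytes:
        return b"frame-%d" % len(self.pressed)

    def press(self, button: str) -> None:
        self.pressed.append(button)


def run_harness(
    emulator: FakeEmulator,
    model: Callable[[Observation], str],
    steps: int,
) -> List[str]:
    """Show the model an annotated frame, then press the button it asks for."""
    for _ in range(steps):
        obs = Observation(
            screenshot=emulator.capture(),
            overlay=f"step={len(emulator.pressed)}",
        )
        button = model(obs).strip().upper()
        if button not in VALID_BUTTONS:
            continue  # ignore replies that do not name a real button
        emulator.press(button)
    return emulator.pressed


def scripted_model(obs: Observation) -> str:
    # Placeholder policy; a real harness would send obs to an LLM here.
    return "a" if obs.overlay.endswith("0") else "Up"


if __name__ == "__main__":
    print(run_harness(FakeEmulator(), scripted_model, steps=3))
```

Because the harness decides what goes into the overlay and which replies count as valid, two models running under different harnesses are effectively playing under different conditions, which is the caveat Joel Z raises.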

Joel Z acknowledged that there are also "developer interventions" that help Gemini complete the game, but insisted that this isn't cheating.

"My interventions improve Gemini's overall decision-making and reasoning skills," he said. "I don't give specific hints - there are no walkthroughs or direct instructions for particular challenges like Mt. Moon. The only thing that comes close is letting Gemini know it needs to talk to a Rocket Grunt to get the Lift Key, which was a bug that was later fixed in Pokémon Yellow."

He added, “Gemini Plays Pokémon is still being developed, and the framework continues to evolve.”