Google previews Gemini 2.5 Flash hybrid reasoning model

Google has introduced an early preview of its Gemini 2.5 Flash hybrid reasoning model. The preview is available through the Gemini API via Google AI Studio and Vertex AI, according to an April 17 Google blog post.

Gemini 2.5 Flash builds on the foundation of Gemini 2.0 Flash and delivers a “major upgrade” in reasoning capabilities while prioritizing speed and cost, Google said. Gemini 2.5 Flash is Google’s first fully hybrid reasoning model, giving developers the ability to turn thinking on or off. The model allows developers to set thinking budgets to find the right tradeoff between quality, cost, and latency, the company said.
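To make the thinking-budget idea concrete, here is a minimal sketch of how a request body carrying a budget might be assembled. It is an illustration under stated assumptions, not Google's reference code. The helper build_request and the constant MAX_THINKING_BUDGET are hypothetical. The field names (generationConfig, thinkingConfig, thinkingBudget) follow the Gemini API's public REST format as we understand it. The upper bound of 24,576 tokens and the convention that a budget of 0 turns thinking off are assumptions not stated in this article. The sketch makes no network call.

```python
import json

# Assumed upper bound for the preview model; not stated in the article.
MAX_THINKING_BUDGET = 24576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style request body with a thinking budget.

    A budget of 0 is assumed to mean "thinking off"; larger values let the
    model spend more tokens reasoning before it answers.
    """
    if not 0 <= thinking_budget <= MAX_THINKING_BUDGET:
        raise ValueError(
            f"thinking_budget must be between 0 and {MAX_THINKING_BUDGET}"
        )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Low-latency request: thinking disabled (under the assumption above).
    fast = build_request("Summarize this paragraph.", 0)
    # Higher-quality request: allow up to 1,024 thinking tokens.
    careful = build_request("Solve: 17 * 23 - 4", 1024)
    print(json.dumps(careful, indent=2))
```

A developer would send a body like this to the model's generateContent endpoint and tune the budget per request, choosing a low value where latency and cost matter most and a higher one for multi-step problems.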

Google said its Gemini 2.5 models are thinking models, which can reason through their thoughts before responding. Rather than immediately generating an output, the model can perform a “thinking” process to better understand the prompt, break down complex tasks, and plan a response. For complex tasks requiring multiple steps of reasoning, such as solving math problems, the thinking process allows the model to arrive at more accurate, comprehensive answers.