VerdictBloom - Independent AI tool verdicts

News - Jul 4, 2026

Gemini 2.5 Pro Gets Smarter — But Still Costs You

Google updated Gemini 2.5 Pro this week, posting improved benchmark scores across coding, math, and reasoning tasks. On the LMArena leaderboard, the updated model climbed to the top position, displacing OpenAI's GPT-4o. In internal evals, Google reports gains on GPQA Diamond (graduate-level science reasoning) and Codeforces competitive programming benchmarks, though independent verification is still limited.

Pricing stays the same: $3.50 per million input tokens and $10.50 per million output tokens via API, with a steeper rate above the 200K context threshold. Free-tier Gemini users get access through the Gemini app, but rate limits apply and heavy users will hit paywalls fast.

Who benefits: developers doing complex code generation or multi-step reasoning tasks will likely notice real differences. Researchers using long-context document analysis get a more capable model at the same price point.

Who gets hurt: casual users expecting the free tier to keep up will find the best performance gated. Enterprises already locked into OpenAI or Anthropic contracts have little immediate reason to switch mid-cycle.

The benchmarks are either Google's own or community leaderboards; neither is a controlled independent test. Real-world performance gaps between frontier models at this level are often smaller than the numbers suggest.

What changed

Google updated the Gemini 2.5 Pro model, improving its scores on the GPQA Diamond and Codeforces benchmarks and taking the top spot on the LMArena leaderboard. Pricing is unchanged at $3.50 per million input tokens and $10.50 per million output tokens.
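
To put the per-token pricing in concrete terms, here is a minimal Python sketch that estimates the cost of a single API request at the rates quoted in this article. The function name, the constants, and the sample workload are our own illustrations, not part of any Google SDK. The article does not state the steeper rate above 200K context, so the sketch declines to price those requests rather than guess, and it assumes the threshold applies to the prompt (input) token count.

```python
from decimal import Decimal

# Rates as quoted in this article, in USD per one million tokens.
INPUT_RATE_PER_MILLION = Decimal("3.50")
OUTPUT_RATE_PER_MILLION = Decimal("10.50")

# A steeper rate applies above 200K context, but the article does not give it.
# Assumption: the threshold is measured on prompt (input) tokens.
CONTEXT_THRESHOLD_TOKENS = 200_000
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the quoted base rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_THRESHOLD_TOKENS:
        # The higher-tier rate is unknown here, so refuse instead of guessing.
        raise ValueError("prompt exceeds 200K tokens; higher rate not given")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_RATE_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_RATE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 50,000 prompt tokens and 2,000 response tokens.
    cost = estimate_request_cost(50_000, 2_000)
    print(f"Estimated cost: ${cost:.3f}")
```

For that hypothetical request, the input side comes to $0.175 and the output side to $0.021, or just under 20 cents in total. Output tokens cost three times as much as input tokens at these rates, so long generated responses drive the bill faster than long prompts do.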

Who this affects

Developers and researchers doing heavy coding, math, or long-context work get a more capable model at the same API price. Free-tier users get access but face rate limits.

Our take

Google's benchmark wins are real enough to pay attention to, but until independent researchers stress-test this at scale, treat the leaderboard position as a signal, not a verdict.

Based on Google's official model update announcements and LMArena public leaderboard data as of the update date. No sponsored content or affiliate relationship with Google.

VerdictBloom is editorially independent. No company reviewed or approved this article before publication.