Gemini in Google Sheets Hits 70% on Spreadsheet Benchmark, Nearing Human Experts

March 10, 2026 2 min read Source: Google AI Blog

A 70.48% success rate on SpreadsheetBench: that's the number Google is leading with as it rolls out new Gemini features in Google Sheets, and it's a genuinely impressive result. SpreadsheetBench is a public benchmark that tests how well AI can autonomously manipulate complex, real-world spreadsheets, such as formatting tables, writing formulas across ranges, cleaning messy data, and building pivot summaries without hand-holding. Google says this score "not only exceeds competitors but nears human expert ability."

The new beta features let you describe what you want in plain English, and Gemini handles the rest: creating sheets from scratch, reorganizing data, running complex analyses, and editing existing spreadsheets. This goes well beyond the "write me a SUM formula" assistance that's been available for a while. Google is positioning this as fully autonomous spreadsheet manipulation: you describe the outcome, and Gemini figures out the steps.

What 70% Actually Means in Practice

Benchmarks are useful but imperfect. A roughly 70% success rate means about 3 out of 10 benchmark tasks still trip up the AI, which tracks with what most people experience using AI for spreadsheet work: it's great for common patterns but stumbles on edge cases or ambiguous instructions. The fact that this "nears human expert ability" says as much about how error-prone spreadsheet work is for humans as it does about Gemini's capabilities.
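To make that arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. It uses the 70.48% figure reported above. The function names are hypothetical, and treating every task as an independent trial at the benchmark rate is a simplifying assumption: your own spreadsheet tasks may be easier or harder than the benchmark's.

```python
# Reported SpreadsheetBench success rate from the article (70.48%).
SUCCESS_RATE = 0.7048


def expected_failures(num_tasks: int, success_rate: float = SUCCESS_RATE) -> float:
    """Expected number of failed tasks out of num_tasks.

    Assumes each task fails independently at the benchmark rate,
    which is an illustrative simplification, not a measured fact.
    """
    return num_tasks * (1.0 - success_rate)


def chance_all_succeed(num_tasks: int, success_rate: float = SUCCESS_RATE) -> float:
    """Probability that all num_tasks succeed, under the same assumption."""
    return success_rate ** num_tasks


if __name__ == "__main__":
    print(f"Expected failures in 10 tasks: {expected_failures(10):.2f}")
    print(f"Chance 3 tasks in a row all succeed: {chance_all_succeed(3):.1%}")
```

Under these assumptions, the chance of a clean run drops quickly as tasks are chained together, which is one reason a headline benchmark score and day-to-day reliability can feel different.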

The features are launching in beta across Google Workspace, alongside broader Gemini updates for Drive, Docs, and Slides. Google hasn't announced specific pricing changes, so these appear to be rolling into existing Workspace plans with Gemini access.

For anyone spending hours each week wrestling with spreadsheets, this is the most concrete evidence yet that AI spreadsheet tools are getting genuinely useful rather than just demo-impressive. The gap between benchmark scores and daily reliability still matters, but 70% on real-world tasks is a meaningful threshold.

Source

Google AI Blog: "Gemini in Google Sheets just achieved state-of-the-art performance."
