How Google’s DS-STAR turns natural language questions into verified answers over raw data files — plus my open-source implementation.
I’m taking a couple weeks off between jobs right now, and wanted to keep the brain in gear (relatively speaking). I’m also a data scientist who’s into agentic workflows and more than a little paranoid they’ll eat our lunch. So instead of doom-scrolling, I spent some time with Google’s DS-STAR paper and turned it into something you can run.
Here’s the hook: Let’s say you’ve got a folder of CSVs, a JSON dump, or a mix of semi-structured files. Someone asks: “What was our top-selling product last quarter?” or “Which regions had the highest churn?” Today, that usually means writing some code, debugging it, trying a few iterations, and only then getting an answer. What if an agent could do the exploration, coding, and verification for you?
Google’s DS-STAR (Data Science Agent via Iterative Planning and Verification) tries to do exactly that. I’ve open-sourced a Python implementation so you can run it on your own data with Gemini or OpenAI models.
What DS-STAR Does
DS-STAR answers factoid questions over structured or unstructured data. You give it:
A question (e.g. “What’s the average order value by region?”)
A data directory (CSV, JSON, Markdown, etc.)
Optional guidelines for how you want the final answer formatted
It returns a concrete answer plus the code and reasoning it used to get there; no hand-written code required.
How It Works Under the Hood
The pipeline is a multi-agent loop:
Analyzer: Explores your raw data files and builds summaries so downstream agents know what’s available.
Planner → Coder → Executor → Debugger → Verifier: Plans a solution, generates code, runs it, fixes any errors, and checks that the result actually answers the question. If verification fails, a Router agent decides the next step and the loop continues.
Finalizer: Turns the verified result into a clear, formatted answer (e.g. markdown, bullet points).
So you get planning, code execution, and verification in one system, closer to how a data scientist would work than to a one-shot LLM answer.
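To make that loop concrete, here's a compact, runnable skeleton of the control flow. Every function below is a trivial stub standing in for an LLM-backed agent in the real implementation, and the names are mine rather than the repo's; only the wiring mirrors the pipeline described above.

```python
# Skeleton of the DS-STAR loop. Each helper is a stub standing in for
# an LLM-backed agent; names are illustrative, not the repo's API.
import contextlib
import io

def analyze_files(data_dir):        # Analyzer: summarize the raw files
    return f"summaries of files in {data_dir}"

def make_plan(question, summaries): # Planner: draft a step-by-step plan
    return ["load data", "aggregate", "answer"]

def generate_code(plan, summaries): # Coder: turn the plan into a script
    return "print('42')"

def execute(code):                  # run the generated script, capture output or error
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
        return buf.getvalue(), None
    except Exception as exc:
        return None, str(exc)

def debug(code, error):             # Debugger: would repair the script (identity stub)
    return code

def verify(question, plan, result): # Verifier: does the result answer the question?
    return result is not None

def route(question, plan, result):  # Router: revise the plan after a failed check
    return plan + ["refine step"]

def finalize(question, result):     # Finalizer: format the verified result
    return f"Answer: {result.strip()}"

def ds_star(question, data_dir, max_iterations=10):
    summaries = analyze_files(data_dir)
    plan = make_plan(question, summaries)
    result = None
    for _ in range(max_iterations):
        code = generate_code(plan, summaries)
        result, error = execute(code)
        if error:                   # one Debugger pass per round
            result, error = execute(debug(code, error))
        if not error and verify(question, plan, result):
            return finalize(question, result)
        plan = route(question, plan, result)
    return finalize(question, result or "no verified result")

print(ds_star("What was our top-selling product last quarter?", "data/"))
```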
Selling Points For Data Scientists
Point and ask: Give it a data directory and a plain-language question; exploration of raw data files is built in.
Auditable: You get the code that produced the answer and logs of the plan and verification steps.
Flexible: Works with mixed file types and supports both Gemini and OpenAI backends.
Try It Yourself
Install with Poetry, point it at your data directory and output directory, and call run_analysis() with your question. The repo includes a minimal example and a DABstep-based example for harder, multi-step tasks.
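For a sense of the API, a minimal run looks roughly like this after cloning the repo and running poetry install. The import path and parameter names below are my assumptions for illustration; check the repo's minimal example for the exact signature.

```python
# Minimal sketch of a DS-STAR call. The module path and parameter
# names are assumptions for illustration; see the repo for the actual API.
from ds_star import run_analysis  # hypothetical import path

answer = run_analysis(
    question="Which regions had the highest churn?",
    data_dir="path/to/your/data",    # CSV, JSON, Markdown, etc.
    output_dir="path/to/output",     # generated code and logs land here
    guidelines="Answer as a markdown table.",  # optional formatting hints
)
print(answer)
```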
Implementation (Python, Gemini + OpenAI):
https://github.com/jonathan-hsiao/DS-STAR
Paper:
DS-STAR: Data Science Agent via Iterative Planning and Verification (arXiv)
If you use it on your own datasets or extend it, I’d love to hear how it goes. Drop a note in the repo discussions or reach out.