
Designing Small Agents That Actually Run Unsupervised

February 28, 2026 · 4 min read

An AI agent that requires human review on every third step isn't an agent — it's a very expensive interface.

The AI field is currently obsessed with agents. Every conference talk, every framework, every demo is about autonomous agents doing complex multi-step tasks. Most of these demos would fall apart in production in under an hour.

I'm not against agents. I build them regularly. But I've become obsessive about the word "small."

The Scope Problem

Most agent projects fail because the scope is wrong from the start.

Teams define the agent's goal as something like "manage our customer onboarding process" or "handle support escalations." These are processes with dozens of decision points, edge cases that have accumulated over years, and failure modes that cascade in non-obvious ways.

When the agent makes its first mistake on a real task (and it will), the team either adds more human checkpoints (defeating the purpose) or tolerates errors (eroding trust until the project gets shelved).

The fix: define the agent's scope by the blast radius of its mistakes, not by the scale of the problem you wish you could automate.

A small agent does one thing in one well-understood context, where mistakes are either impossible, reversible, or cheap to catch.

What "Small" Actually Means

Small agents I've shipped that work reliably:

Document classifier. Reads incoming documents, categorizes them, routes to the right folder. Scope: classification only, no deletion, no sending. Mistakes are visible and correctable.

Meeting prep agent. 30 minutes before a scheduled meeting, pulls relevant context from CRM and internal docs, generates a one-page brief. Scope: read-only, output is advisory. Human decides whether to read it.

PR description writer. When a developer opens a PR, reads the diff and drafts a description. Scope: draft only, human edits and submits. Agent can't merge anything.

Data validation agent. After ETL runs, checks output against a schema and business rules, surfaces anomalies. Scope: read and flag. No writes.

Notice the pattern: these agents create information or categorize existing information. They don't take consequential actions directly. They keep humans in the loop for anything irreversible.

Designing for Failure First

Every agent I build has explicit failure modes before it has capabilities.

Before writing a single line of agent logic, I write down answers to:

  1. What happens if the LLM returns something unexpected?
  2. What happens if a required input is missing or malformed?
  3. What's the maximum acceptable error rate, and how will we measure it?
  4. How does a human override or correct the agent's output?
  5. What does the agent do if it's uncertain?

That last one is the most important. Agents that lack an explicit "I'm not confident, stopping and flagging" state will confidently hallucinate their way through uncertainty. This is how small mistakes become large ones.
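A minimal sketch of that explicit uncertainty state, assuming the model can be asked to return a confidence score alongside its answer (a common but model-dependent pattern; the 0.8 threshold and the field names here are illustrative, not prescriptive):

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.8  # illustrative; tune against measured error rates


@dataclass
class AgentResult:
    status: str            # "ok" or "needs_review"
    answer: Optional[str]
    reason: Optional[str] = None


def decide(answer: str, confidence: float) -> AgentResult:
    """Stop and flag for a human instead of guessing when the model is unsure."""
    if confidence < CONFIDENCE_THRESHOLD:
        return AgentResult(
            status="needs_review",
            answer=None,
            reason=f"confidence {confidence:.2f} below {CONFIDENCE_THRESHOLD}",
        )
    return AgentResult(status="ok", answer=answer)
```

The point is that "needs_review" is a first-class outcome the rest of the pipeline has to handle, not an exception path bolted on later.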

The Reliability Stack

Reliable small agents have four layers:

Input validation. Check that what the agent received is what it expected before doing anything. Fail loudly with clear error messages.
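A sketch of the fail-loudly gate, assuming the agent receives a dict-like payload; the required fields (`document_id`, `text`) are hypothetical placeholders for whatever your agent actually expects:

```python
def validate_input(payload: dict) -> dict:
    """Reject bad input before any work happens, naming exactly what is wrong."""
    required = {"document_id": str, "text": str}
    for field, expected_type in required.items():
        if field not in payload:
            raise ValueError(f"missing required field: {field!r}")
        if not isinstance(payload[field], expected_type):
            raise ValueError(
                f"field {field!r} must be {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return payload
```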

Structured output. Force the LLM to return structured data (JSON with a schema), not free text. Parse and validate the output before acting on it.
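For example, a classifier agent would parse and check the model's JSON before routing anything (the category labels here are hypothetical; a schema library like `jsonschema` or Pydantic does the same job with less hand-rolling):

```python
import json

ALLOWED_CATEGORIES = {"invoice", "contract", "other"}  # hypothetical labels


def parse_classification(raw: str) -> str:
    """Parse and validate the LLM's JSON output before acting on it."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM returned non-JSON output: {exc}") from exc
    category = data.get("category")
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    return category
```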

Idempotency. If the agent runs twice, the second run should either produce the same result or be a no-op. No double-sends, no duplicate writes.
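One simple way to get this is a deduplication key derived from the inputs; a sketch, assuming an in-memory set stands in for what would be durable storage (a database table, a Redis set) in production:

```python
import hashlib

_processed: set = set()  # stand-in for durable storage in production


def idempotency_key(document_id: str, content: str) -> str:
    """Same inputs always hash to the same key."""
    return hashlib.sha256(f"{document_id}:{content}".encode()).hexdigest()


def route_once(document_id: str, content: str, route) -> bool:
    """Run the action at most once per unique input; repeats are no-ops.

    Returns True if the action ran, False if it was skipped as a duplicate.
    """
    key = idempotency_key(document_id, content)
    if key in _processed:
        return False
    route(document_id, content)
    _processed.add(key)
    return True
```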

Observability. Log every input, every LLM call, every output, every decision. When something goes wrong (and it will), you need to reconstruct exactly what happened.
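The cheapest version of this is one structured log line per step, keyed by a run ID so a whole run can be reconstructed; a sketch (the stage names are hypothetical, and `print` stands in for whatever log sink you use):

```python
import json
import time
import uuid


def log_event(run_id: str, stage: str, **details) -> str:
    """Emit one JSON log line per step so a run can be replayed afterwards."""
    record = {"run_id": run_id, "ts": time.time(), "stage": stage, **details}
    line = json.dumps(record, default=str)
    print(line)  # stand-in for a real log sink
    return line


run_id = str(uuid.uuid4())
log_event(run_id, "input", payload={"document_id": "d1"})
log_event(run_id, "llm_call", prompt_chars=512)
log_event(run_id, "output", category="invoice")
```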

When to Scale Up

Once a small agent is running reliably in production — low error rate, predictable behavior, humans comfortable trusting it — you can consider extending its scope.

The extension pattern: add one new capability at a time, with the same failure-mode analysis for each addition. Never extend scope and trust simultaneously.

The teams I see fail with agents are trying to build the full automation in one shot. The teams I see succeed build the smallest useful thing, run it long enough to understand its failure modes, and extend it incrementally.

Boring, but it ships.