Spec-Driven AI Development: Make Your Agents Accountable
Before an agent writes a single line of code, it needs a written plan. Spec-driven development turns chaotic AI output into reviewable, predictable engineering.
What Is Spec-Driven Development?
Spec-driven development is a software engineering practice where every task starts with a written specification — a precise, bounded description of what will change, why, and how success is measured — before any implementation begins. The spec is the contract between the person who defines work and the person (or agent) who executes it.
In traditional engineering, specs are written by architects or tech leads. In spec-driven AI development, the spec serves the same purpose but its consumer is an AI coding agent. The agent reads the spec, understands the scope, and produces code that matches the acceptance criteria. Without a spec, the agent has to infer everything — scope, constraints, affected files, definition of done — from a single prompt. That inference is where things go wrong.
Contrast this with prompt-driven coding, often called vibecoding: you describe what you want in a chat message and hope the agent guesses the right scope. Vibecoding works for throwaway scripts and tiny prototypes. It breaks down the moment your codebase has more than a handful of files, or when two features need to be developed in parallel, or when you want to review code without reading every token the agent produced.
Spec-driven development is not about writing more documentation. It is about creating a shared, machine-readable definition of done before execution starts — so the agent is working toward something verifiable, not just generating plausible-looking code.
Why AI Agents Need Specs, Not Just Prompts
When you hand a prompt to an AI coding agent without a spec, three failure modes appear reliably:
Hallucinated scope
The agent decides on its own which files to touch and how many layers deep to go. A prompt like “add user authentication” could mean touching 3 files or 30 depending on what the model thinks is reasonable. You find out after the fact.
Collateral damage
Without explicit boundaries, agents overwrite things they were never supposed to touch. A refactor leaks into a payment module. A UI change reorganizes an API route. The diff is unreadable.
Unreviewable PRs
If you cannot compare the output against a written intention, code review becomes re-reading the entire diff with no frame of reference. At that point it is tempting to skim the changes and merge, which is how bugs ship.
Specs fix all three problems. A spec names the files in scope. It describes the acceptance test. It explicitly states what must not change. When the agent produces a PR, you review it against the spec — not against your memory of what you asked for.
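Reviewing a PR against a spec can be partly mechanical. The sketch below is one possible way to do it, not AIDEN's implementation: the `ScopeSpec` shape, the `reviewDiff` helper, and the payments and utils file paths are all hypothetical names invented for illustration. It sorts the changed files in a diff into "never named by the spec" and "explicitly protected by the spec".

```typescript
// Hypothetical spec shape for illustration; AIDEN's real spec format may differ.
interface ScopeSpec {
  filesInScope: string[];  // exact paths the agent may create or modify
  mustNotChange: string[]; // exact paths, or directory prefixes ending in "/"
}

interface ScopeReport {
  outOfScope: string[]; // changed files the spec never named
  forbidden: string[];  // changed files the spec explicitly protected
}

// True when `file` equals `rule`, or `rule` is a directory prefix of `file`.
function matchesRule(file: string, rule: string): boolean {
  return rule.endsWith("/") ? file.startsWith(rule) : file === rule;
}

function reviewDiff(spec: ScopeSpec, changedFiles: string[]): ScopeReport {
  const forbidden = changedFiles.filter((file) =>
    spec.mustNotChange.some((rule) => matchesRule(file, rule)),
  );
  const outOfScope = changedFiles.filter(
    (file) => !spec.filesInScope.includes(file) && !forbidden.includes(file),
  );
  return { outOfScope, forbidden };
}

// Example: one in-scope change, one protected file touched, one stray edit.
const report = reviewDiff(
  {
    filesInScope: [
      "src/lib/auth.ts",
      "src/app/api/auth/login/route.ts",
      "src/middleware.ts",
    ],
    mustNotChange: ["src/lib/stripe.ts", "src/app/api/payments/"],
  },
  [
    "src/lib/auth.ts",
    "src/app/api/payments/charge/route.ts", // hypothetical path
    "src/utils/format.ts",                  // hypothetical path
  ],
);
console.log(report);
```

An empty report does not prove the code is correct, but a non-empty one tells you where to look before you read a single line of the diff.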
There is also a parallelism argument. If you want to run two or three agents simultaneously on different features, each agent must operate in a bounded context. Otherwise agents can overwrite or silently conflict with each other's work. Specs define those boundaries precisely enough for safe parallel execution.
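One way to make "bounded context" concrete is to check that no two specs claim the same file before their agents start. This is a minimal sketch under that assumption; `ScopedStory`, `findScopeConflicts`, and the story ids and rate-limit paths are hypothetical, and a file-level check is deliberately conservative.

```typescript
// Hypothetical shape: a story id plus the files its spec puts in scope.
interface ScopedStory {
  id: string;
  filesInScope: string[];
}

interface ScopeConflict {
  file: string;
  stories: string[]; // ids of every story claiming this file
}

// Returns every file that more than one story claims.
function findScopeConflicts(stories: ScopedStory[]): ScopeConflict[] {
  const owners = new Map<string, string[]>();
  stories.forEach((story) => {
    // De-duplicate so a file listed twice in one spec is not a false conflict.
    Array.from(new Set(story.filesInScope)).forEach((file) => {
      const claimedBy = owners.get(file) ?? [];
      claimedBy.push(story.id);
      owners.set(file, claimedBy);
    });
  });

  const conflicts: ScopeConflict[] = [];
  owners.forEach((ids, file) => {
    if (ids.length > 1) conflicts.push({ file, stories: ids });
  });
  return conflicts;
}

const conflicts = findScopeConflicts([
  { id: "auth", filesInScope: ["src/lib/auth.ts", "src/middleware.ts"] },
  { id: "rate-limit", filesInScope: ["src/middleware.ts", "src/lib/rate-limit.ts"] },
  { id: "billing-ui", filesInScope: ["src/app/billing/page.tsx"] },
]);
console.log(conflicts);
```

Here the `auth` and `rate-limit` stories both claim `src/middleware.ts`, so they should run one after the other, while `billing-ui` can safely run alongside either.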
Finally, specs are auditable. Six months from now, you can read the spec for any completed story and understand exactly what decision was made, why, and what the acceptance criteria were. A chat prompt in a conversation history is almost impossible to reconstruct that way.
How Spec-Driven Development Works in AIDEN
AIDEN turns spec-driven development into a four-step workflow that runs end-to-end inside a single desktop app. Here is what happens from the moment you describe a feature to the moment a PR is ready for review.
Write a story card
You write a short story card on the AIDEN kanban board — one or two sentences in plain English. For example: "Add user authentication with email and password. Users should be able to sign up, log in, and log out. Use JWT tokens stored in httpOnly cookies."
AI generates the spec
AIDEN's AI reads your story card and your codebase structure, then generates a full spec. The spec names the exact files to create or modify (e.g., src/lib/auth.ts, src/app/api/auth/login/route.ts, src/middleware.ts), defines the acceptance criteria (login returns a 200 with a set-cookie header, invalid credentials return 401), lists what must not change (existing user table schema, the /api/payments routes), and includes example request/response shapes.
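To make those four parts tangible, here is a rough sketch of such a spec as a typed object. The field names, the `POST` method, and the sample credentials and cookie value are assumptions made for illustration; the article does not show AIDEN's actual spec format.

```typescript
// Hypothetical spec shape; field names are illustrative, not AIDEN's format.
interface AcceptanceCriterion {
  description: string;
  expectedStatus: number;
}

interface ExampleExchange {
  request: { method: string; path: string; body: unknown };
  response: { status: number; headers: Record<string, string>; body: unknown };
}

interface Spec {
  story: string;
  filesInScope: string[];
  acceptanceCriteria: AcceptanceCriterion[];
  mustNotChange: string[];
  examples: ExampleExchange[];
}

const authSpec: Spec = {
  story: "Add user authentication with email and password.",
  filesInScope: [
    "src/lib/auth.ts",
    "src/app/api/auth/login/route.ts",
    "src/middleware.ts",
  ],
  acceptanceCriteria: [
    { description: "Valid login responds with a Set-Cookie header", expectedStatus: 200 },
    { description: "Invalid credentials are rejected", expectedStatus: 401 },
  ],
  mustNotChange: ["existing user table schema", "/api/payments routes"],
  examples: [
    {
      // Assumed request shape: the HTTP method and body fields are illustrative.
      request: {
        method: "POST",
        path: "/api/auth/login",
        body: { email: "ada@example.com", password: "correct-horse" },
      },
      response: {
        status: 200,
        headers: { "Set-Cookie": "token=<jwt>; HttpOnly" },
        body: { ok: true },
      },
    },
  ],
};

console.log(JSON.stringify(authSpec, null, 2));
```

Keeping the spec as structured data rather than free text is what lets later steps, such as checking a diff against the file list, be automated.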
You approve
You review the spec in AIDEN's spec editor. If the scope looks right, you click Approve. If it is too broad or missed a constraint, you edit it before approving. Approval takes under two minutes for most tasks.
Agent branches, codes, and opens a PR
After approval, AIDEN creates a git worktree on a new branch, launches a Claude Code or Codex agent with the spec as its only context, and monitors progress. When the agent marks the task done, AIDEN runs your test suite and opens a PR with the spec attached as the description. You review the diff against the spec — not against a chat log.
The same pattern applies whether you are adding authentication, refactoring a database layer, or building a payment flow. The spec is always the starting point. The agent never starts coding without one.
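The "no coding without an approved spec" rule can be pictured as a small state machine. The sketch below is only a model of the workflow described above, with invented state and event names; it is not how AIDEN is implemented.

```typescript
// Hypothetical states and events modelling the four-step workflow.
type StoryState = "draft" | "spec_ready" | "approved" | "agent_running" | "pr_open";
type StoryEvent = "generate_spec" | "edit_spec" | "approve" | "start_agent" | "open_pr";

// Allowed transitions. Anything not listed here is rejected.
const transitions: Record<StoryState, Partial<Record<StoryEvent, StoryState>>> = {
  draft: { generate_spec: "spec_ready" },
  spec_ready: { edit_spec: "spec_ready", approve: "approved" },
  approved: { start_agent: "agent_running" },
  agent_running: { open_pr: "pr_open" },
  pr_open: {},
};

function advance(state: StoryState, event: StoryEvent): StoryState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`Cannot apply "${event}" to a story in state "${state}"`);
  }
  return next;
}

// Replays a sequence of events starting from a fresh story card.
function run(events: StoryEvent[]): StoryState {
  return events.reduce<StoryState>(advance, "draft");
}

console.log(run(["generate_spec", "edit_spec", "approve", "start_agent", "open_pr"]));

try {
  // Skipping approval: the agent is not allowed to start.
  run(["generate_spec", "start_agent"]);
} catch (err) {
  console.log((err as Error).message);
}
```

Because `start_agent` is only reachable from `approved`, there is no path through the table in which an agent begins work on an unreviewed spec.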
Writing Good Specs for AI Agents
Whether AIDEN generates the spec or you write it yourself, five rules determine whether the agent produces useful output or a mess.
Rule 1: Be specific about scope
"Add authentication" is not a spec. "Add email/password sign-up and login to the existing Express server in /src/server.ts, using bcrypt for hashing and JWT for session tokens" is. Vague scope produces vague code.
Rule 2: Name the files and functions involved
List the files the agent should create, modify, or read. If a specific function needs to change, name it. This targets the hallucinated-scope failure mode directly: the agent has an explicit boundary to work within, and any change to a file that is not on the list stands out as out of scope during review.
Rule 3: Define the acceptance test
Write a concrete, verifiable success condition. "The /api/auth/login endpoint returns status 200 and a Set-Cookie header carrying the JWT in an httpOnly cookie when given valid credentials, and returns 401 with an error message when given invalid credentials." If the spec includes a testable assertion, the agent can verify its own work before opening a PR.
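That acceptance condition translates almost word for word into a check. The sketch below assumes a simplified, synchronous `LoginHandler` type and a lowercase `set-cookie` header key so it can run without a server; both are stand-ins, and a real test would call the endpoint through your HTTP framework.

```typescript
// Simplified response shape for this sketch only.
interface LoginResponse {
  status: number;
  headers: Record<string, string>;
  body: { error?: string };
}

type LoginHandler = (email: string, password: string) => LoginResponse;

// Returns human-readable failures; an empty list means the criteria pass.
function checkLoginAcceptance(
  login: LoginHandler,
  valid: { email: string; password: string },
): string[] {
  const failures: string[] = [];

  const ok = login(valid.email, valid.password);
  if (ok.status !== 200) failures.push(`valid login: expected 200, got ${ok.status}`);
  const cookie = ok.headers["set-cookie"] ?? "";
  if (!/;\s*httponly/i.test(cookie)) {
    failures.push("valid login: Set-Cookie is missing the HttpOnly attribute");
  }

  const bad = login(valid.email, valid.password + "-wrong");
  if (bad.status !== 401) failures.push(`invalid login: expected 401, got ${bad.status}`);
  if (!bad.body.error) failures.push("invalid login: expected an error message");

  return failures;
}

// A stub handler standing in for the real endpoint.
const stubLogin: LoginHandler = (email, password) =>
  email === "ada@example.com" && password === "s3cret"
    ? { status: 200, headers: { "set-cookie": "token=stub.jwt.value; HttpOnly; Path=/" }, body: {} }
    : { status: 401, headers: {}, body: { error: "Invalid credentials" } };

const failures = checkLoginAcceptance(stubLogin, {
  email: "ada@example.com",
  password: "s3cret",
});
console.log(failures.length === 0 ? "acceptance criteria met" : failures);
```

Returning a list of failures rather than throwing on the first one gives the agent, and the reviewer, the full picture in a single run.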
Rule 4: Say what NOT to change
Explicitly list files, modules, and behaviors that must remain untouched. "Do not modify src/lib/stripe.ts, the User database schema, or any existing API routes outside of /api/auth/". This is often the most valuable part of a spec — it prevents collateral damage.
Rule 5: Include example inputs and outputs
For any function or endpoint that handles data, show a concrete example. A sample request body, a sample response payload, an example SQL row. Agents are much more accurate when they have a concrete example to match rather than an abstract description to interpret.
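The five rules can double as a pre-flight checklist. As a rough sketch, a small linter could flag a draft spec that skips any of them. The `DraftSpec` shape and the eight-word threshold for Rule 1 are arbitrary assumptions chosen for illustration; a word count is only a crude proxy for specificity.

```typescript
// Hypothetical draft-spec shape used only by this sketch.
interface DraftSpec {
  summary: string;
  filesInScope: string[];
  acceptanceCriteria: string[];
  mustNotChange: string[];
  examples: string[];
}

const MIN_SUMMARY_WORDS = 8; // arbitrary threshold chosen for this sketch

// Returns one warning per rule the draft appears to skip.
function lintSpec(spec: DraftSpec): string[] {
  const warnings: string[] = [];
  const wordCount = spec.summary
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0).length;

  if (wordCount < MIN_SUMMARY_WORDS) {
    warnings.push("Rule 1: summary is too short to bound the scope");
  }
  if (spec.filesInScope.length === 0) {
    warnings.push("Rule 2: no files or functions are named");
  }
  if (spec.acceptanceCriteria.length === 0) {
    warnings.push("Rule 3: no acceptance test is defined");
  }
  if (spec.mustNotChange.length === 0) {
    warnings.push("Rule 4: nothing is marked as off-limits");
  }
  if (spec.examples.length === 0) {
    warnings.push("Rule 5: no example inputs or outputs are given");
  }
  return warnings;
}

const vague: DraftSpec = {
  summary: "Add authentication",
  filesInScope: [],
  acceptanceCriteria: [],
  mustNotChange: [],
  examples: [],
};

const bounded: DraftSpec = {
  summary:
    "Add email/password sign-up and login to the existing Express server in " +
    "/src/server.ts, using bcrypt for hashing and JWT for session tokens",
  filesInScope: ["/src/server.ts"],
  acceptanceCriteria: ["/api/auth/login returns 200 for valid credentials and 401 otherwise"],
  mustNotChange: ["src/lib/stripe.ts"],
  examples: ['{"email": "ada@example.com", "password": "s3cret"}'],
};

console.log(lintSpec(vague));
console.log(lintSpec(bounded));
```

A clean lint result does not make a spec good, but a warning is a cheap signal that the agent will have to guess at something.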
Spec-Driven vs Prompt-Driven: Side-by-Side
Here is how the two approaches compare across the dimensions that matter for production software:
| Dimension | Prompt-driven (vibecoding) | Spec-driven (AIDEN) |
|---|---|---|
| Scope control | Agent decides on its own | Explicitly bounded by spec |
| Reviewability | Diff vs. a chat message | Diff vs. written acceptance criteria |
| Parallel agents | Risky — agents conflict | Safe — each agent has isolated scope |
| Collateral damage | Common | Prevented by explicit exclusion list |
| Auditability | Chat log (if you kept it) | Spec stored alongside PR |
| Definition of done | Implicit / subjective | Concrete acceptance test in spec |
Stop vibecoding. Start shipping with specs.
AIDEN generates specs from your story cards, runs bounded agents in parallel, and opens PRs you can actually review. Free to start.