AI Agent Harness: How AIDEN Guides AI Coding Agents

Quick answer

An AI agent harness is the structure AIDEN wraps around raw Claude Code / Codex CLI agents: an approved spec before implementation, codebase context, an isolated git branch per story, in-loop test runs, an optional LLM PR review, and a stop button. Two are enforced; four are best-effort conventions.

Hard guarantees: Git isolation per story · spec approval gate (v1.5.21+)
Best-effort conventions: Codebase context, test iteration, LLM PR review, cancellation
Works with: Your local Claude Code and/or Codex CLI
Platform: macOS 12+ · Free tier (1 project), Solo $19/mo

Why raw agents need a harness

Claude Code and Codex CLI can read a repository, plan a change, write the code, and run the tests without a human in the loop. Run one bare in a terminal and the workflow is entirely on you: you write the prompt, create the branch, watch the run, review the diff, open the PR. For one agent on one task, that works. It stops working the moment you run agents in parallel or step away, which is the core problem an agentic IDE exists to solve.

A harness is the set of conventions and primitives wrapped around the agent so its output is easier to trust and cheap to discard. None of it replaces your judgement. All of it targets the three failure modes every unstructured agent eventually hits:

Confidently wrong direction

Given an ambiguous request, an agent picks one interpretation and executes it fully. You find out the direction was wrong only after the run.

No codebase context

Agents that skip reading the project reimplement existing code, contradict deliberate architecture, and add duplicate dependencies.

Shared working directory

Two agents editing the same checkout clobber each other. A failed run leaves half-written files you have to clean up by hand.

The six conventions at a glance

AIDEN ships six conventions built into every story’s lifecycle, from the kanban card to the pull request. Two are hard guarantees; four are best-effort defaults you can lean on but should not mistake for gates.

1 · Spec approval, enforced

The AI drafts a spec from your story; the agent does not start implementing until you approve it. A hard gate since v1.5.21.

2 · Codebase context

AIDEN analyzes the whole project into a technical and business overview, so agents start with real context instead of a cold repo.

3 · Git isolation, enforced

Every story runs on its own branch or worktree, created before the agent launches. This is structural: it cannot be bypassed.

4 · In-loop test iteration

The agent has a terminal in its worktree and can run your test suite, then iterate on failures. Best-effort, not a guarantee.

5 · LLM PR review

An optional second model pass critiques the diff against the spec before you click the one-click PR button.

6 · Mid-run cancellation

Stop any agent from the story card. The worktree is preserved as-is, so you can inspect, salvage, or discard.

Ship your first agent today

Download AIDEN free and point it at your existing Claude Code or Codex setup. No credit card, running in minutes.

The two hard guarantees

Git isolation: one worktree per story

Before an agent launches, AIDEN creates a dedicated branch and worktree for its story. It is the same isolation you would build by hand with:

git worktree add ../story-142 -b story-142

Because each agent gets a separate working directory on disk, parallel agents cannot touch each other’s files, and a bad run is disposable: delete the branch and worktree, and main is untouched. This is the deepest guarantee on the page and the reason cancellation is always clean. The full mechanics are in the parallel agents and git worktrees guide.
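To make "disposable" concrete, here is a minimal sketch of the by-hand lifecycle for one story: create the worktree, and later throw it away. It is not AIDEN's implementation. The function names and the story-<id> naming are assumptions that follow the example command above; the git subcommands themselves (worktree add, worktree remove, branch -D) are standard git.

```python
import subprocess
from pathlib import Path


def worktree_add_cmd(story_id: str, parent: Path) -> list[str]:
    """Build the `git worktree add` command for one story.

    The `story-<id>` naming is an assumption taken from the example above.
    """
    branch = f"story-{story_id}"
    return ["git", "worktree", "add", str(parent / branch), "-b", branch]


def worktree_discard_cmds(story_id: str, parent: Path) -> list[list[str]]:
    """Build the commands that discard a story without touching main."""
    branch = f"story-{story_id}"
    return [
        # --force also removes a worktree that still has uncommitted changes
        ["git", "worktree", "remove", "--force", str(parent / branch)],
        # -D deletes the branch even though it was never merged
        ["git", "branch", "-D", branch],
    ]


def run_in_repo(cmd: list[str], repo: Path) -> None:
    """Run one git command from the main checkout, raising on failure."""
    subprocess.run(cmd, cwd=repo, check=True)
```

Building the commands as plain lists keeps the sketch easy to inspect before anything runs. Calling run_in_repo on each command, from inside the main checkout, performs the actual create or discard.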

The spec approval gate

A story moves from Stories to Spec Review on the board: the AI drafts a spec (scope, files to touch, edge cases), and since v1.5.21 the agent is blocked from implementing until you approve it. Catching a wrong approach here costs an edit to a document, not a discarded run. Why specs beat prompts is its own topic; see spec-driven AI development.
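One way to picture a hard gate is as a state machine in which the transition to implementation simply does not exist until approval is recorded. The sketch below is illustrative only: the Story class, the SpecNotApproved exception, and the Implementing stage name are hypothetical; only the Stories and Spec Review column names come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stage(Enum):
    STORIES = "Stories"
    SPEC_REVIEW = "Spec Review"
    IMPLEMENTING = "Implementing"  # hypothetical stage name


class SpecNotApproved(Exception):
    """Raised when implementation is requested before approval."""


@dataclass
class Story:
    title: str
    stage: Stage = Stage.STORIES
    spec: Optional[str] = None
    approved: bool = False

    def draft_spec(self, spec: str) -> None:
        # A new or edited draft always needs a fresh approval.
        self.spec = spec
        self.approved = False
        self.stage = Stage.SPEC_REVIEW

    def approve(self) -> None:
        if self.spec is None:
            raise SpecNotApproved("no spec has been drafted yet")
        self.approved = True

    def start_implementation(self) -> None:
        # The gate: there is no code path around this check.
        if not self.approved:
            raise SpecNotApproved(f"spec for {self.title!r} is not approved")
        self.stage = Stage.IMPLEMENTING
```

In this model, calling start_implementation on a story that is still in Spec Review raises instead of launching anything, which is the difference between a gate and a convention.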

The four best-effort conventions

Codebase context

AIDEN reads the whole project and builds a technical and business overview that every agent receives, alongside workspace tools: a file browser, a terminal, and git access. Project files like CLAUDE.md or AGENTS.md are picked up automatically. This is guidance, not enforcement: the agent decides how much to read, the way a new engineer does.

In-loop test iteration

The agent can run your test suite in its worktree (Jest, pytest, cargo test, anything your project uses), and failure output flows back into its context, so it fixes and re-runs as part of the loop. This is a convention, not a gate: AIDEN does not inspect exit codes or refuse to open a PR. The story card shows what the agent ran; you decide.
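The loop described above, seen from the agent's side, amounts to running a command in the worktree and reading back whatever it printed. The following is a rough sketch of that single step, not AIDEN or agent internals; SuiteResult, run_tests, and feedback_for_agent are hypothetical names. Note that nothing here blocks a PR: the result is only information.

```python
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SuiteResult:
    command: list[str]
    exit_code: int
    output: str


def run_tests(command: list[str], worktree: Path) -> SuiteResult:
    """Run the project's test command inside the story's worktree."""
    proc = subprocess.run(command, cwd=worktree, capture_output=True, text=True)
    return SuiteResult(command, proc.returncode, proc.stdout + proc.stderr)


def feedback_for_agent(result: SuiteResult, max_chars: int = 4000) -> Optional[str]:
    """Return the tail of the failure output, or None when the suite passed."""
    if result.exit_code == 0:
        return None
    return result.output[-max_chars:]


if __name__ == "__main__":
    # Stand-in for `npm test` or `pytest`: a command that fails with a message.
    fake_suite = [sys.executable, "-c", "import sys; print('1 failed'); sys.exit(1)"]
    result = run_tests(fake_suite, Path("."))
    print(feedback_for_agent(result))
```

Keeping only the tail of the output is a design choice for long test logs, where the summary of failures usually sits at the end.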

LLM PR review and one-click PR

Before opening a PR you can trigger a separate model pass over the diff: correctness concerns, missed edge cases, drift from the spec. It is informational: you can ship, send the agent back, or discard. The one-click PR sits next to it; the whole story-to-PR path is covered in the AI PR automation workflow.
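As a rough illustration of what "a separate model pass over the diff" needs as input, the sketch below assembles a review request from the approved spec and the diff. The prompt wording and the names build_review_prompt and REVIEW_FOCUS are assumptions made for this example; AIDEN's actual prompt is not documented here. The three focus areas are the ones named above.

```python
# Focus areas taken from the paragraph above; everything else is hypothetical.
REVIEW_FOCUS = (
    "correctness concerns",
    "missed edge cases",
    "drift from the spec",
)


def build_review_prompt(spec: str, diff: str) -> str:
    """Combine the approved spec and the story's diff into one review request."""
    focus = "\n".join(f"- {item}" for item in REVIEW_FOCUS)
    return (
        "Review the diff below against the approved spec.\n"
        f"Report on:\n{focus}\n\n"
        f"Spec:\n{spec.strip()}\n\n"
        f"Diff:\n{diff.strip()}\n"
    )
```

The diff itself could come from a command such as git diff main...HEAD run in the story's worktree, assuming main is the base branch.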

Mid-run cancellation

Click Stop on any story card and the run terminates, leaving the worktree exactly as the agent left it. Inspect the partial work, keep what is useful, or delete the branch. Git isolation makes this safe by construction: there is no shared state to clean up, and no other agent is affected.
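The reason a stop can leave partial work intact is that terminating a process does nothing to the files it already wrote. Here is a small sketch of that behaviour, assuming the agent runs as an ordinary child process; stop_run and demo are hypothetical names, and the "agent" is a stand-in script that writes one file and then idles.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def stop_run(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask the process to exit, and force it if it ignores the request."""
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # forced stop (SIGKILL on POSIX)
        return proc.wait()


def demo() -> str:
    """Start a stand-in agent, stop it mid-run, and read what it left behind."""
    script = (
        "import pathlib, time; "
        "pathlib.Path('partial.txt').write_text('half'); "
        "time.sleep(60)"
    )
    with tempfile.TemporaryDirectory() as tmp:
        worktree = Path(tmp)  # stands in for the story's worktree
        partial = worktree / "partial.txt"
        proc = subprocess.Popen([sys.executable, "-c", script], cwd=worktree)
        # Wait until the stand-in agent has written its partial work.
        deadline = time.time() + 10
        while time.time() < deadline:
            if partial.exists() and partial.read_text() == "half":
                break
            time.sleep(0.05)
        stop_run(proc)
        # The process is gone; the file it wrote is still there.
        return partial.read_text()
```

Nothing in stop_run touches the working directory, so whatever the run produced stays available to inspect, salvage, or delete.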

AIDEN vs raw Claude Code CLI vs chat IDE

The CLI is not the problem; AIDEN runs on top of it, as described in Claude Code orchestration. The difference is who does the workflow work: raw CLI puts the spec, branching, review, and PR on you; a chat-style IDE handles edits but not orchestration.

Convention	AIDEN	Raw Claude Code CLI	Chat IDE
Spec before implementation	Drafted + approval gate	You write it in the prompt	You write it in the chat
Codebase context	Project overview + tools	CLAUDE.md, maintained by you	Open files / index
Git isolation	Per-story worktree, enforced	Manual worktree setup	Shared working directory
Test iteration	Terminal in worktree, default	Yes, manual setup	Partial
LLM PR review + PR	Built-in pass, one-click PR	You run and open it	Usually absent
Cancellation	Stop button, worktree kept	Ctrl+C in the terminal	Stop the chat

AI agent harness: FAQ

Are the six conventions enforced or optional?

Two are enforced. Git isolation is structural: AIDEN creates a dedicated branch or worktree for every story before the agent launches, and that cannot be bypassed. The spec approval gate has been enforced since v1.5.21: the agent does not start implementing until you approve the spec. The other four (codebase context, in-loop test iteration, LLM PR review, and cancellation) are best-effort conventions built into how AIDEN configures and runs the agent, not hard gates. You decide when a PR gets opened.

Does the spec approval gate slow me down?

It adds one review step before implementation starts: the AI drafts the spec, you read it and approve or edit it. Approving a good spec is one click. In exchange, wrong directions (wrong API surface, wrong data model, wrong architecture) get caught before the agent spends its whole run implementing them, when the fix is an edit to a document instead of a discarded branch.

What happens when tests fail during an agent run?

The agent has a real terminal inside its own worktree and can run your test suite the way you would: npm test, pytest, cargo test, whatever your project uses. Failure output flows back into its context and it iterates: read the error, patch, re-run. This is best-effort, not a guarantee. AIDEN does not block PR creation on a green suite; you see what the agent ran on the story card and make the call yourself.

How is a harness different from a CI pipeline?

CI runs after a PR is opened, on your CI provider. AIDEN's conventions run earlier, inside the agent's loop, on the agent's own worktree, before a PR exists. Git isolation means a failed run is a deleted branch, not a reverted merge. The spec gate, codebase context, test iteration, and LLM review give the agent chances to catch mistakes before you see the diff. CI stays valuable afterwards; the two are complementary.

Does the harness work with Codex CLI, or only Claude Code?

Both. AIDEN orchestrates whichever you have installed locally: Claude Code, Codex CLI, or the two side by side. The conventions are the same either way: AIDEN handles the workflow (specs, branches, review, PRs) while the CLIs handle the model calls.
