Engineering With AI Agents: Ship Real Code, Not Vibes

Quick answer

Engineering with AI agents means giving each task a written, approved spec, an isolated git branch, and a reviewable pull request, instead of prompting an assistant and hoping. AIDEN runs this workflow on top of your local Claude Code and Codex CLIs.

What AIDEN is: Desktop workspace running your Claude Code + Codex CLIs on parallel git branches
Hard guarantees: Spec approval gate before coding · one branch/worktree per story
Best-effort conventions: Agents run tests and iterate · optional LLM PR review pass
Pricing: Free (1 project) · Solo $19/mo · Lifetime $169 · Team $10/seat

What Vibecoding Is, and Where It Breaks

Vibecoding is generating code by casually prompting an AI assistant: type “add a login page”, accept whatever comes back, repeat until the app roughly works. Andrej Karpathy popularized the term in early 2025, and for prototypes it is genuinely the right tool: speed is the feature and the code is disposable.

The trouble starts when a vibecoded prototype graduates to production. Each prompt is a fresh negotiation with a model that has no memory of your previous decisions, and the failure modes compound in predictable ways:

Context resets every session

The model cannot know the users table was intentionally denormalized or that auth middleware lives in a non-obvious file. Every new prompt risks contradicting a prior one, and the model complies without complaint.

Nothing is written down

Nobody owns architectural decisions because nobody recorded them. Six months later there is no spec, no rationale, no acceptance criteria, just a chat log, if you kept it.

Everything lands on one branch

One chat thread, one working copy, changes piling onto main. Two developers vibecoding the same repo produce code that conflicts at the seams.

Review becomes archaeology

With no written intent to compare against, reviewing means re-reading the whole diff cold. Most people skim and merge, which is exactly when bugs ship.

Stopping vibecoding does not mean using less AI. It means adding the thin layer of engineering discipline that an agentic IDE is built around: specs, branches, and pull requests.

The Workflow: From Story to Pull Request

Here is the loop AIDEN runs for every story, end to end. Board columns are Stories → Spec Review → In Progress → Review → Done, and each card moves through them as the steps below complete.

1. Write a story
One or two plain-English sentences on the kanban board describing what you want. AIDEN has already analyzed your codebase (architecture, dependencies, conventions), so you do not have to restate context.

2. Approve the spec
AIDEN's AI drafts a full spec from your story: files in scope, acceptance criteria, what must not change. You edit and approve it. Approval has been an enforced gate since v1.5.21 and is the core of spec-driven AI development: no agent codes without an approved spec.

3. The agent works on its own branch
AIDEN opens a git worktree on a dedicated branch and launches Claude Code or Codex with the spec as its prompt. Isolation is the hard guarantee here: it is what makes a multi-agent workflow safe when several stories are in flight at once.

4. The agent runs tests and iterates
Agents can run your test suite in their own terminal, read the failures, and fix them before marking the story done. This is a best-effort convention built into AIDEN's prompts, not a guarantee, so you still review the result.

5. Optional LLM review pass
Before you look at the diff, a second model can review the change against the spec and flag issues. It is one more filter between the agent's output and your attention.

6. One-click PR: you review diffs, not keystrokes
Opening the pull request is a single click from the story card, with the spec attached as the description. You review the outcome and merge or redirect. The full loop is covered in AI PR automation.
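
The approval gate in step 2 can be pictured as a small state machine over the board columns. The sketch below is illustrative only: the `Story` class, its methods, and the `SpecNotApproved` error are hypothetical names, not AIDEN's implementation or API. Only the column names come from the workflow above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Column(Enum):
    STORIES = "Stories"
    SPEC_REVIEW = "Spec Review"
    IN_PROGRESS = "In Progress"
    REVIEW = "Review"
    DONE = "Done"


class SpecNotApproved(Exception):
    """Raised when an agent is asked to start without an approved spec."""


@dataclass
class Story:
    # Hypothetical model of a board card; field names are assumptions.
    title: str
    column: Column = Column.STORIES
    spec: Optional[str] = None
    spec_approved: bool = False

    def draft_spec(self, spec: str) -> None:
        # Drafting moves the card to Spec Review and clears any prior approval.
        self.spec = spec
        self.spec_approved = False
        self.column = Column.SPEC_REVIEW

    def approve_spec(self) -> None:
        if self.spec is None:
            raise ValueError("nothing to approve: no spec drafted yet")
        self.spec_approved = True

    def start_agent(self) -> None:
        # The gate: coding cannot begin until a human has approved the spec.
        if not self.spec_approved:
            raise SpecNotApproved(self.title)
        self.column = Column.IN_PROGRESS
```

The point of the design is that `start_agent` has no way around the check: editing the spec again through `draft_spec` resets the approval, so an agent never runs against text nobody signed off on.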

Before and After, Side by Side

Ship your first agent today

Download AIDEN free and point it at your existing Claude Code or Codex setup. No credit card, running in minutes.

Free to start · macOS 12+ · No credit card required

Four Principles That Make It Work

Agents work on specs, not prompts

A prompt is a one-time instruction; a spec is a contract. It tells the agent exactly what done looks like (criteria, edge cases, exclusions), so the agent can check its own output before you see it.
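
To make the contract idea concrete, here is a minimal sketch of a spec as structured data with a scope check. The fields mirror the items a spec contains in the workflow (files in scope, acceptance criteria, what must not change), but the `Spec` class, the glob-pattern convention, and `out_of_scope` are assumptions for illustration, not AIDEN's actual spec format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Tuple


@dataclass(frozen=True)
class Spec:
    # Hypothetical structure; AIDEN's real spec format is not shown in this guide.
    story: str
    files_in_scope: Tuple[str, ...]        # glob patterns the agent may touch
    acceptance_criteria: Tuple[str, ...]   # what "done" looks like
    must_not_change: Tuple[str, ...] = ()  # glob patterns that are off limits


def out_of_scope(spec: Spec, changed_files: List[str]) -> List[str]:
    """Return the changed files that violate the spec's scope rules."""
    violations = []
    for path in changed_files:
        protected = any(fnmatchcase(path, pat) for pat in spec.must_not_change)
        allowed = any(fnmatchcase(path, pat) for pat in spec.files_in_scope)
        # Protected files always count, even if a scope pattern also matches them.
        if protected or not allowed:
            violations.append(path)
    return violations
```

A check like this is something an agent, or a reviewer, could run against a diff's file list before anyone reads the code itself.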

Each story gets its own branch

Shared branches mean colliding changes and bleeding context. One worktree per story removes the problem and unlocks real parallelism. This is AIDEN's hard guarantee, enforced by git itself.
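
The isolation described here maps onto plain `git worktree`. The sketch below builds the command for one story. The branch naming scheme (`story/<id>-<slug>`), the directory layout, and the helper names are assumptions for illustration and may not match what AIDEN does internally; `git worktree add -b <branch> <path> <base>` itself is standard git.

```python
import re
import subprocess
from pathlib import Path
from typing import List


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def worktree_command(story_id: int, title: str, worktrees_dir: Path, base: str = "main") -> List[str]:
    # Hypothetical naming scheme: one branch and one directory per story.
    branch = f"story/{story_id}-{slugify(title)}"
    path = worktrees_dir / f"story-{story_id}"
    return ["git", "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, story_id: int, title: str, base: str = "main") -> None:
    # Runs inside the main checkout. With -b, git refuses if the branch
    # already exists, so two stories cannot silently share a branch name.
    worktrees_dir = repo.parent / f"{repo.name}-worktrees"
    subprocess.run(worktree_command(story_id, title, worktrees_dir, base), cwd=repo, check=True)
```

By default git also refuses to check out a branch that is already checked out in another worktree, which is what keeps parallel agents from ending up in the same working copy.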

Tests live inside the agent loop

Tests are the only feedback an agent can act on without you. AIDEN's prompt conventions push agents to write and run tests as part of the story. This is a workflow discipline, not a system gate.
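
The run-tests-and-iterate convention can be sketched as a bounded loop. This illustrates the idea only: `run_tests` and `ask_agent_to_fix` are hypothetical callables standing in for a test command and an agent session, and the attempt limit is an assumption. The loop can end with tests still failing, which is exactly why this stays best-effort.

```python
from typing import Callable, Tuple


def iterate_until_green(
    run_tests: Callable[[], Tuple[bool, str]],
    ask_agent_to_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Run tests, feed failures back to the agent, stop after max_attempts runs."""
    for attempt in range(1, max_attempts + 1):
        passed, output = run_tests()
        if passed:
            return True
        if attempt < max_attempts:
            # Hand the failure output back so the agent can attempt a fix.
            ask_agent_to_fix(output)
    return False  # still red: a human decides what happens next
```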

Engineers review PRs, not keystrokes

Watching an AI type is the worst use of your attention. Reviewing a finished PR (approach, edge cases, tests) is engineering judgment applied at the right altitude.
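
Reviewing against written intent is easier when the spec travels with the PR. As a rough sketch, a spec could be rendered into a PR description like this; the function name, the section headings, and the checklist layout are assumptions, not the format AIDEN generates.

```python
from typing import Sequence


def pr_description(story: str, acceptance_criteria: Sequence[str], must_not_change: Sequence[str]) -> str:
    """Render spec fields as a Markdown PR body with a reviewer checklist."""
    lines = ["## Story", story, "", "## Acceptance criteria"]
    # One unchecked box per criterion, for the reviewer to tick off.
    lines += [f"- [ ] {item}" for item in acceptance_criteria]
    if must_not_change:
        lines += ["", "## Must not change"]
        lines += [f"- {item}" for item in must_not_change]
    return "\n".join(lines)
```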

These principles are not unique to AIDEN; they are where the best agentic tools converge. Our comparison of agentic IDEs in 2026 scores each tool against them.

Setup in Four Steps

AIDEN is a macOS 12+ desktop app (Apple Silicon and Intel, signed and notarized) that requires at least one of Claude Code or Codex CLI installed locally. It works with either, better with both:

# 1. Install a CLI if you don't have one
npm install -g @anthropic-ai/claude-code

# 2. Download AIDEN from aidenapp.org, drag to Applications
# If macOS shows an "app is damaged" warning:
xattr -cr /Applications/AIDEN.app

On first launch, AIDEN picks up your existing CLI config and inherits your MCP servers. Your keys stay in ~/.claude and ~/.codex; they are never read or transmitted. Open a local repo or clone from GitHub, let the codebase analysis run, then write your first story. Your code never leaves your machine.
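
Since AIDEN needs at least one CLI installed locally, a quick preflight check can save a confusing first launch. The sketch below assumes the CLIs are on your PATH as `claude` and `codex` and that their config directories are the ones named above. The function names are hypothetical, and this is not something AIDEN ships. It only checks that the directories exist and never opens anything inside them.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, Optional

# CLI executable name -> config directory mentioned in this guide.
CLIS = {"claude": Path.home() / ".claude", "codex": Path.home() / ".codex"}


def preflight(which: Callable[[str], Optional[str]] = shutil.which) -> Dict[str, dict]:
    """Report, per CLI, whether it is on PATH and whether its config dir exists."""
    report = {}
    for name, config_dir in CLIS.items():
        report[name] = {
            "installed": which(name) is not None,
            "config_dir_exists": config_dir.is_dir(),
        }
    return report


def ready(report: Dict[str, dict]) -> bool:
    # At least one of the two CLIs must be installed.
    return any(entry["installed"] for entry in report.values())
```

Calling `ready(preflight())` before opening the app tells you whether either CLI is reachable; the `which` parameter exists only so the lookup can be swapped out in tests.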

When to Vibe, When to Engineer

Vibecoding is not bad; it is simply the wrong shape for production. The honest version of this guide is a judgment call, not a rule:

Vibe when

Building a throwaway prototype
Exploring an unfamiliar API
Sketching a UI before a design review
Writing a one-off script
Learning a new framework

Engineer when

Shipping to production
Working in a shared codebase
Building anything that needs tests
Running multiple stories in parallel
Writing code that will outlive the week

Switching modes is cheap. AIDEN's free tier covers one project with the full workflow. If a prototype starts becoming real software, write a spec, approve it, and let an agent take it from there.

FAQ

What is vibecoding?

Vibecoding is generating code by casually prompting an AI assistant: type a rough idea, accept what comes back, repeat until the app roughly works. Andrej Karpathy coined the term in early 2025. It is genuinely fast for prototypes and throwaway scripts, but it produces untested, unstructured code that becomes a liability the moment it has to live in production.

How is engineering with AI agents different from a chat IDE like Cursor?

A chat IDE puts an assistant next to your editor: you prompt, it suggests, you accept line by line. Engineering with agents flips the unit of work: an agent receives an approved spec, implements it on its own git branch, and hands you a pull request. You review finished diffs instead of watching suggestions. AIDEN orchestrates several such agents in parallel on separate branches.

Do I need Claude Code experience to use AIDEN?

You need at least one of Claude Code (Anthropic) or Codex CLI (OpenAI) installed and authenticated locally, because AIDEN is an orchestrator on top of your CLIs, not a replacement for them. You do not need to be a CLI power user, though: AIDEN drives the sessions for you from a kanban board, and it automatically inherits your existing config, including MCP servers.

Do agents guarantee that tests pass before the PR?

No, and any tool that promises this is overselling. AIDEN's agents can run your test suite in their own terminal and iterate on failures. That convention is baked into the workflow and works well in practice, but it is best-effort, not a hard gate. The two things AIDEN does enforce are git isolation (every story on its own branch or worktree) and the spec approval gate before an agent starts coding.

Is vibecoding ever the right choice?

Yes. For throwaway prototypes, one-off scripts, exploring an unfamiliar API, or learning a new framework, the vibe loop is faster than any formal process: the code is disposable, so discipline buys you nothing. The judgment call is noticing when a prototype stops being disposable. That is the moment to write a spec and switch modes.
