Writing Good Specs
Specs are the contract between you and your agent team. A well-written spec produces well-written code; a vague spec produces chaos. Every minute spent on spec clarity saves ten minutes of rework.
- Be specific about acceptance criteria — vague ACs lead to vague tests, which lead to implementations that “work” but don’t meet expectations.
- Include happy path AND error cases — agents will only test what you specify. If you skip error handling in the spec, you skip it in the code.
- Define data models explicitly — field names, types, required vs. optional, constraints. Don’t leave agents guessing your schema.
- Include API contracts if applicable — request/response shapes, status codes, headers. The spec should be enough to build a mock server (see the sketch after this list).
- Reference related specs for cross-feature dependencies — if feature B depends on feature A’s auth module, link to the auth spec explicitly.
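To make the data-model and API-contract bullets concrete, a spec can pin the schema down with explicit types. This is a minimal sketch; the `LoginRequest`/`LoginResponse` shapes, field rules, and the `POST /api/auth/login` route are hypothetical examples, not part of ADD:

```typescript
// Hypothetical data model and API contract for a login spec.
// Field names, optionality, and status codes are spelled out so agents
// never have to guess the schema.

interface LoginRequest {
  email: string;        // required; must be a syntactically valid email
  password: string;     // required; 8–128 characters
  rememberMe?: boolean; // optional; defaults to false
}

interface LoginResponse {
  token: string;  // session token (e.g. JWT), expiry defined in the spec
  userId: string; // UUID
}

// POST /api/auth/login
//   200 -> LoginResponse
//   400 -> { error: "VALIDATION_FAILED", fields: string[] }
//   401 -> { error: "INVALID_CREDENTIALS" }
//   429 -> { error: "RATE_LIMITED" }
```

A contract at this level of detail is enough to build a mock server and to derive most of the error-case tests.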
Start with the `/add:spec` interview. Let the interview do its job — answer thoroughly, and the spec writes itself.
Acceptance Criteria That Work
Acceptance criteria are the atomic unit of verification. Each one becomes a test, and each test becomes a guarantee. Get these right and everything downstream improves.
- Use numbered format: AC-001, AC-002, etc. — makes traceability from spec to test to implementation unambiguous (see the test sketch after this list).
- Each AC should be independently testable — if you can’t write a single test for it, it’s not a good AC.
- Avoid compound ACs — “user can login AND see dashboard” is two behaviors. Split them into AC-001 (login) and AC-002 (dashboard redirect).
- Include boundary conditions — what happens at limits? What about zero? What about maximum?
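One way to keep the spec-to-test traceability unambiguous is to put the AC identifier directly in the test name. A sketch assuming a Vitest/Jest-style runner and a hypothetical `login` function; the ACs themselves are illustrative:

```typescript
import { describe, it, expect } from "vitest";
import { login } from "./auth"; // hypothetical module under test

describe("auth login", () => {
  // AC-001: valid credentials return a session token
  it("AC-001: returns a token for valid credentials", async () => {
    const result = await login({ email: "user@example.com", password: "correct-horse" });
    expect(result.token).toBeDefined();
  });

  // AC-002: successful login redirects to the dashboard (split from AC-001)
  it("AC-002: redirects to /dashboard after successful login", async () => {
    const result = await login({ email: "user@example.com", password: "correct-horse" });
    expect(result.redirectTo).toBe("/dashboard");
  });

  // Boundary condition: an empty password is rejected, not silently accepted
  it("AC-003: rejects an empty password", async () => {
    await expect(
      login({ email: "user@example.com", password: "" })
    ).rejects.toThrow();
  });
});
```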
Edge Cases
The difference between a prototype and a production system is edge case handling. Always include these categories in your specs:
- Input boundaries: empty inputs, maximum lengths, special characters, unicode, null vs. undefined (a table-driven example follows this list)
- Concurrency: simultaneous access, race conditions, duplicate submissions
- Network: timeouts, connection failures, partial responses, retries
- API-specific: malformed requests, missing authentication, expired tokens, rate limiting
- UI-specific: screen sizes, keyboard navigation, loading states, error states, empty states
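For the input-boundary category in particular, table-driven tests keep every case visible in one place. The `validateUsername` function and its rules below are hypothetical stand-ins for whatever input your spec actually covers:

```typescript
import { describe, it, expect } from "vitest";
import { validateUsername } from "./validation"; // hypothetical module under test

// One entry per boundary case named in the spec: empty input, maximum
// length, special characters, unicode.
const cases: Array<{ name: string; input: string; valid: boolean }> = [
  { name: "empty string",        input: "",             valid: false },
  { name: "single character",    input: "a",            valid: true  },
  { name: "maximum length (32)", input: "a".repeat(32), valid: true  },
  { name: "over maximum length", input: "a".repeat(33), valid: false },
  { name: "special characters",  input: "user<script>", valid: false },
  { name: "unicode",             input: "ユーザー",      valid: true  },
];

describe("username input boundaries", () => {
  for (const c of cases) {
    it(`${c.valid ? "accepts" : "rejects"} ${c.name}`, () => {
      expect(validateUsername(c.input)).toBe(c.valid);
    });
  }
});
```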
TDD Discipline
ADD enforces a strict four-phase TDD cycle. Each phase has a purpose, and skipping any of them undermines the entire methodology.
- RED phase: Write ALL failing tests before any implementation. Every acceptance criterion becomes at least one test. The test suite should be a complete expression of the spec (see the sketch after this list).
- GREEN phase: Write MINIMAL code to pass — no gold-plating, no “while I’m here” additions. The only goal is a green test suite.
- REFACTOR phase: Clean up with confidence. Your tests are the safety net. Improve naming, extract functions, reduce duplication — but change no behavior.
- VERIFY phase: An independent agent runs quality gates in a different context with no shared state. This catches assumptions the implementing agent baked in.
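A compressed RED/GREEN illustration, assuming a hypothetical `isTokenExpired` helper. In RED the tests exist and fail because the function is not implemented yet; in GREEN only the code the tests demand is added:

```typescript
// --- token.test.ts (written in RED; fails until token.ts exists) ---
import { describe, it, expect } from "vitest";
import { isTokenExpired } from "./token";

describe("token expiry", () => {
  it("should reject expired tokens", () => {
    expect(isTokenExpired({ expiresAt: Date.now() - 1_000 })).toBe(true);
  });

  it("should accept unexpired tokens", () => {
    expect(isTokenExpired({ expiresAt: Date.now() + 60_000 })).toBe(false);
  });
});

// --- token.ts (written in GREEN) ---
// Minimal implementation: no caching, no clock-skew handling,
// nothing the tests do not demand.
export function isTokenExpired(token: { expiresAt: number }): boolean {
  return token.expiresAt <= Date.now();
}
```

REFACTOR then renames, extracts, or deduplicates under the protection of these tests, and VERIFY re-runs them in a clean context.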
Test Strategy
Different test types serve different purposes. Match your test strategy to what you’re verifying.
- Unit tests for business logic — fast, isolated, high volume. These are your first line of defense.
- Integration tests for API boundaries — verify that modules talk to each other correctly. Test real interactions, not mocks of mocks (an example follows this list).
- E2E tests for critical user paths (Beta+ maturity) — expensive to run but catch what unit and integration tests miss.
- Test naming: describe what behavior is being tested, not how. `should reject expired tokens` is better than `test checkTokenExpiry function`.
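For the integration-test bullet, a boundary test should exercise the real HTTP surface rather than mocked internals. A sketch assuming a supertest-compatible app export (Express-style); the route and error body mirror the hypothetical contract above:

```typescript
import { describe, it, expect } from "vitest";
import request from "supertest"; // assumes supertest is available
import { app } from "./app";     // hypothetical app export (Express-style)

describe("POST /api/auth/login (integration)", () => {
  it("should return 401 for invalid credentials", async () => {
    const res = await request(app)
      .post("/api/auth/login")
      .send({ email: "user@example.com", password: "wrong" });

    expect(res.status).toBe(401);
    expect(res.body.error).toBe("INVALID_CREDENTIALS");
  });
});
```

Note the test name describes the behavior (rejecting invalid credentials), not the implementation.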
Coverage Targets
Coverage expectations scale with project maturity. Don’t over-invest in coverage for a POC, and don’t ship a GA product at 40%.
| Maturity Level | Coverage Target | Rationale |
|---|---|---|
| POC | None required | Validate the idea first. Tests slow exploration. |
| Alpha | 60% | Core paths tested. Gaps acceptable in experimental areas. |
| Beta | 80% | Production-bound code needs real coverage. E2E tests added. |
| GA | 90% | Ship with confidence. Remaining 10% is generated code or trivial getters. |
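If your test runner supports coverage thresholds, the maturity targets can be enforced in CI rather than by convention. A sketch assuming a recent Vitest with the v8 coverage provider (Jest's `coverageThreshold` option works similarly); the numbers correspond to the Beta row above:

```typescript
// vitest.config.ts — fail the run if coverage drops below the Beta target.
// Bump these values to 90 when the project is promoted to GA.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      thresholds: {
        lines: 80,
        functions: 80,
        branches: 80,
        statements: 80,
      },
    },
  },
});
```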
Human-Agent Collaboration
ADD defines three engagement modes that control how much autonomy agents have. Choosing the right mode prevents both bottlenecks and runaway decisions.
- Guided mode — Use when exploring unfamiliar codebases, onboarding to a new project, or making architectural decisions. The agent asks before acting.
- Balanced mode (default) — Use for established projects where specs exist and the agent understands the codebase. The agent acts on clear specs, asks on ambiguity.
- Autonomous mode — Use only for well-specified features at GA maturity. The agent executes full TDD cycles without interruption. Requires clear specs and established patterns.
Before stepping away, run `/add:away`. Review the briefing from `/add:back` before continuing work — the agent may have queued decisions that need your input.
Away Mode Best Practices
Away mode lets your agent team work while you’re gone. But autonomy without boundaries is a recipe for irreversible mistakes.
- Define scope explicitly: “Work on specs/auth-login.md only” is clear. “Make progress” is not.
- Set boundaries: “No database schema changes,” “No new dependencies,” “No merges to main.”
- Set duration: Default is 2 hours. Adjust based on work scope — larger scope needs more time, but also more risk.
- Review the away log when you return — it contains every action taken, every decision made, and every item queued for your review.
- Run `/add:back` for a structured briefing — don’t just read the log; let the agent summarize what matters.
Running Effective Retros
Retrospectives are how ADD gets smarter over time. Without them, agents repeat the same mistakes across cycles.
- Run after every milestone or every 2 weeks — whichever comes first. Stale retros lose context.
- Agent auto-checkpoints capture data continuously — after every verify, TDD cycle, deploy, and away session. The data is there; retros organize it.
- Interactive retros (`/add:retro`) capture both perspectives — the human sees things the agent missed, and vice versa. Both viewpoints matter.
- Agree on 2–3 concrete changes — don’t try to change everything at once. Small, specific improvements compound over time.
- Promote broadly useful learnings to your cross-project library — if it helped here, it will help there.
Knowledge Management
ADD’s three-tier knowledge system prevents agents from starting every project from scratch. Let it work for you.
- Let checkpoints accumulate naturally — they trigger automatically after verify, TDD cycle, deploy, and away sessions. Don’t force them.
- Review `.add/learnings.md` periodically — prune outdated entries, merge duplicates, and clarify vague ones.
- During retros, promote universal patterns to `~/.claude/add/library.md` — cross-project wisdom that helps every future project.
- Project-specific quirks stay in Tier 3 — “this project uses Prisma for ORM” is project-specific. “Always validate foreign keys in integration tests” is universal.
Knowledge Promotion
Knowledge flows upward through the tiers: project discoveries can be promoted to your user library during retros, and truly universal insights can be promoted to plugin-global (in the ADD dev project only).
| Tier | Location | Scope | Promote When |
|---|---|---|---|
| Tier 3: Project | `.add/learnings.md` | This project only | Auto-checkpoints; always active |
| Tier 2: User | `~/.claude/add/library.md` | All your projects | During `/add:retro` when a learning is universal |
| Tier 1: Global | `knowledge/global.md` | All ADD users | Rare; only for fundamental ADD insights |
Common Mistakes
These are the mistakes we see most often. Every one of them has cost teams real time — learn from their pain.
| Anti-Pattern | Why It’s Bad | Do This Instead |
|---|---|---|
| Skipping specs | Implementation drifts, no traceability, agents guess at requirements | Always /add:spec before code |
| Writing tests after code | Tests validate implementation, not behavior — they pass by definition | Strict RED → GREEN order |
| Gold-plating in GREEN phase | Wastes time, tests may not cover extras, scope creep | Minimal code to pass tests, nothing more |
| Ignoring retros | Same mistakes repeated, no learning accumulation | Run retros every 2 weeks minimum |
| Autonomous mode too early | Agent makes wrong decisions without context, rework required | Start Guided, graduate to Balanced |
| Compound acceptance criteria | Can’t test independently, unclear pass/fail | One behavior per AC |
| Skipping VERIFY phase | Quality issues slip through, self-grading homework | Always run /add:verify |
| Not setting away boundaries | Agent makes irreversible changes outside intended scope | Define explicit scope and limits |
| Promoting everything to Tier 2 | Knowledge library becomes noisy, agents waste context window | Only promote universal patterns |
| Starting at GA maturity | Overwhelming process for new projects, friction kills momentum | Start POC, promote as project matures |