Harness engineering

This is the third guide in a trilogy. In Fundamentals we covered how to pilot a coding agent; in Spec-Driven Development we learned to specify intent so the agent can execute it well. Now comes what's left: designing the system that surrounds the agent — guides, sensors, sandboxes, loops, structured context — so a team can rely on it sustainably.

This isn't a step-by-step tutorial or a manual for any particular product. It's an essay, organized in chapters, about the principles and patterns emerging from teams and practitioners already running agents at scale (OpenAI, Stripe, Thoughtworks/Böckeler, Geoffrey Huntley). These patterns show up regardless of stack or tool.

Premises

You work in a technical team with experience operating agents and skills.
You want to design the system, not find the magic prompt.
You're willing to invest in internal infrastructure (linters, sandboxes, observability, versioned docs) if it multiplies the agent's output.
You care about sustainability over months, not just the first impressive commit.

Central thesis (in one sentence)

The model is the engine; the harness (guides, sensors, sandboxes, loops, structured context) is what turns an LLM into an agent a team can rely on.
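As a rough illustration of the thesis, here is a minimal Python sketch of how the pieces could fit together. Every name in it (`Harness`, `Sensor`, `stub_model`, `has_return`) is hypothetical and not taken from any of the teams or tools cited in this guide. The model is a stub function rather than a real LLM call, and sandboxing is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical types, for illustration only.
Model = Callable[[str], str]             # prompt in, candidate output out
Sensor = Callable[[str], Optional[str]]  # error message, or None if the check passes


@dataclass
class Harness:
    guides: List[str]        # standing instructions included in every prompt
    sensors: List[Sensor]    # automated checks that turn output into feedback
    max_iterations: int = 3  # loop budget, so the agent cannot spin forever

    def run(self, model: Model, task: str) -> Optional[str]:
        feedback: List[str] = []
        for _ in range(self.max_iterations):
            # Structured context: guides, then the task, then the last round of feedback.
            prompt = "\n".join(self.guides + [task] + feedback)
            candidate = model(prompt)

            # Sensors: collect every failure message for this candidate.
            feedback = []
            for sensor in self.sensors:
                message = sensor(candidate)
                if message is not None:
                    feedback.append(message)

            if not feedback:
                return candidate  # every sensor passed
        return None  # budget exhausted; hand the task back to a human


def stub_model(prompt: str) -> str:
    # Stand-in for an LLM call: it corrects itself once it sees the sensor's feedback.
    if "missing return" in prompt:
        return "def add(a, b): return a + b"
    return "def add(a, b): a + b"


def has_return(candidate: str) -> Optional[str]:
    # A toy sensor; in practice this slot would hold linters, tests, or type checks.
    return None if "return" in candidate else "missing return"


harness = Harness(guides=["Write Python."], sensors=[has_return])
result = harness.run(stub_model, "Implement add(a, b).")
```

The point of the sketch is that the loop and the checks live outside the model: swapping the engine does not change what counts as done.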

How to read this guide

The chapters are ordered as mindset → pillars → practice → sustainability → anti-patterns → diagnosis, but each one stands on its own. If you already get the "why", jump directly to the chapter that solves a concrete problem for you.

Index

Chapter 0 — Entry point

Harness engineering: producing software at scale with agents

Part 1 — Mindset

Part 2 — The two pillars

Part 3 — Practice

Part 4 — Sustainability

Part 5 — Pitfalls

11. Anti-patterns

Part 6 — Bringing it to your team

12. Diagnosis and maturity: where you are and what to fix first

What you won't find in this version

Ready-to-copy templates (sample AGENTS.md, hooks, CI configs). Coming in a later version.
Comparisons between specific tools. The guide is deliberately tool-agnostic.
Productivity promises. Stripe's or OpenAI's numbers are references, not guarantees.

Sources

This guide synthesizes ideas from:

Birgitta Böckeler (Thoughtworks, published on martinfowler.com) — Harness Engineering. The term "harness" in this sense emerged around LangChain with the formula "Agent = Model + Harness".
Kief Morris (on martinfowler.com) — Humans and Agents in Software Engineering Cycles
Chad Fowler — Relocating Rigor — The Phoenix Architecture
Geoffrey Huntley — everything is a ralph loop
Ryan Lopopolo (OpenAI) — Harness Engineering: leveraging Codex in an agent-first world
Alistair Gray (Stripe, Leverage team) — Minions: Stripe's one-shot, end-to-end coding agents — part 1 · part 2