Harness engineering

This is the third guide in a trilogy. In Fundamentals we covered how to pilot a coding agent; in Spec-Driven Development we learned to specify intent so the agent can execute it well. Now comes the remaining piece: designing the system that surrounds the agent (guides, sensors, sandboxes, loops, structured context) so a team can rely on it sustainably.

This isn't a step-by-step tutorial or a manual for any particular product. It's an essay, organized in chapters, about the principles and patterns emerging among teams and practitioners already operating agents at team scale (OpenAI, Stripe, Thoughtworks/Böckeler, Geoffrey Huntley). These patterns show up regardless of stack or tool.

Premises

  • You work in a technical team with experience operating agents and skills.
  • You want to design the system, not find the magic prompt.
  • You're willing to invest in internal infrastructure (linters, sandboxes, observability, versioned docs) if it multiplies the agent's output.
  • You care about sustainability over months, not just the first impressive commit.

Central thesis (in one sentence)

The model is the engine; the harness (guides, sensors, sandboxes, loops, structured context) is what turns an LLM into an agent a team can rely on.

How to read this guide

The chapters are ordered as mindset → pillars → practice → sustainability → pitfalls → diagnosis, but each one stands on its own. If you already get the "why", jump directly to the chapter that solves a concrete problem for you.

Index

Chapter 0 — Entry point

Part 1 — Mindset

Part 2 — The two pillars

Part 3 — Practice

Part 4 — Sustainability

Part 5 — Pitfalls

Part 6 — Bringing it to your team

What you won't find in this version

  • Ready-to-copy templates (sample AGENTS.md, hooks, CI configs). Coming in a later version.
  • Comparisons between specific tools. The guide is deliberately tool-agnostic.
  • Productivity promises. The numbers reported by Stripe and OpenAI are reference points, not guarantees.

Sources

This guide synthesizes ideas from: