Verification locks: the pattern that stopped my agents from looping

The escalation primitives that stopped 80% of agent loops

PIVOT_NEEDED · PROBE_NEEDED · SANDBOX_BLOCKED — plus the verification-lock pattern that broke the other 20%

Follow-up to OpenCode fork for Cadence. This post zooms in on the one mechanism that stopped runaway iterations more than anything else: the verification lock.

A coding agent can claim DONE without actually being done. It can rewrite the same wrong code three times and assert success each time. It can interpret a green compile as a green test, or a green test as a green deploy. The failure isn’t dishonesty — it’s that the agent’s evaluation function is its own LLM, and the LLM has reasons to want the task to be over.

The fix is a stricter gate. Not a stricter prompt. A gate the agent doesn’t get to evaluate.

The verification-lock pattern

Every story the planner emits carries a verification Command: a literal shell string. That Command is the lock. Only a successful re-run of it turns the writer’s claim of DONE into a real DONE.

1. Planner emits story + Command (lock created)

Story: 'Replace whitespace tokenizer with tree-sitter AST walker in bash-scope.' Command: 'npm run typecheck && bun test packages/opencode/test/bash-scope.test.ts'.

2. Writer implements (claim DONE)

Edits files inside the worktree allowlist. When ready, calls writer_done with the literal Command string.

3. Verifier matches fingerprint

Strips markdown backticks from the claimed Command, then compares it against the planner's Command as a literal string. If they differ, even by one flag, the lock rejects.

4. Verifier executes (real run)

Runs the Command in the worktree. Captures exit code + stdout/stderr.

5. Pass → DONE, fail → escalate (gated)

Exit 0 + matching output pattern (if specified) → release lock. Anything else → the writer gets the verifier's output back and another iteration. Three failures → the rabbit-hole guard kicks in.

The key property: the writer can’t claim DONE for a Command different from the one the planner wrote. Adding a --skip-flaky flag changes the fingerprint. Replacing && with ; changes the fingerprint. Rewriting the test path changes the fingerprint. The lock rejects them all.
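A minimal sketch of that gate, under stated assumptions: the names `Lock`, `Runner`, and `checkClaim` are hypothetical and do not come from the fork. The runner is injected so the comparison logic stays separate from shell execution, and the optional output-pattern check is left out for brevity.

```typescript
// Hypothetical shapes; none of these names come from the actual fork.
type Lock = { storyId: string; command: string };
type RunResult = { exitCode: number; output: string };
type Runner = (command: string) => RunResult;
type Verdict =
  | { status: "DONE"; output: string }
  | { status: "REJECTED"; reason: string }
  | { status: "FAILED"; output: string };

// The claim only counts if it is the planner's Command, character for character,
// and the Command then exits 0 when the verifier (not the writer) runs it.
function checkClaim(lock: Lock, claimedCommand: string, run: Runner): Verdict {
  if (claimedCommand !== lock.command) {
    return { status: "REJECTED", reason: "claimed Command differs from the planner's" };
  }
  const result = run(lock.command);
  return result.exitCode === 0
    ? { status: "DONE", output: result.output }
    : { status: "FAILED", output: result.output };
}

const lock: Lock = {
  storyId: "bash-scope-ast", // hypothetical story id
  command: "npm run typecheck && bun test packages/opencode/test/bash-scope.test.ts",
};

// Stand-in runner for illustration only; it always reports success.
const alwaysGreen: Runner = () => ({ exitCode: 0, output: "ok" });

const honest = checkClaim(lock, lock.command, alwaysGreen);
const drifted = checkClaim(lock, lock.command.replace("&&", ";"), alwaysGreen);
console.log(honest.status, drifted.status); // DONE REJECTED
```

Note that the drifted claim is rejected before anything runs: a mismatched fingerprint never reaches the shell, so a green result for the wrong Command cannot release the lock.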

Why markdown stripping matters

A subtle one. Writers tend to wrap their Command claim in markdown:

"I ran `npm run typecheck && bun test`"

If the verifier extracts the literal string `npm run typecheck && bun test` (with backticks), it doesn’t match the planner’s npm run typecheck && bun test (without). False lock rejection.

The verifier strips markdown backticks before comparison. The writer’s claim style doesn’t affect the lock semantics.
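As a rough illustration of the stripping step, assuming a hypothetical helper named `stripBackticks` (the real verifier may normalize differently):

```typescript
// Hypothetical helper: remove markdown backticks and surrounding whitespace
// from the writer's claimed Command before comparing it to the planner's.
function stripBackticks(claim: string): string {
  return claim.replace(/`/g, "").trim();
}

const plannerCommand = "npm run typecheck && bun test";
const writerClaim = "`npm run typecheck && bun test`";

console.log(writerClaim === plannerCommand);                 // false
console.log(stripBackticks(writerClaim) === plannerCommand); // true
```

One caveat with this simple version: a Command that itself uses backticks for command substitution would lose them on the claim side, so a planner that emits such Commands would need a narrower rule, such as stripping only a leading and trailing run of backticks.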

Locks archive per iteration

After a story completes — pass or fail — the planner advances to the next iteration. The locks from the previous iteration get swept into archive/v<N>/ automatically.
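A small sketch of what that sweep might compute. Only the `archive/v<N>/` destination comes from the description above; the `locks/` source directory, the file names, and the `planArchive` function are assumptions made for illustration.

```typescript
// Hypothetical layout: active locks live in `locks/`, archives in `archive/v<N>/`.
type Move = { from: string; to: string };

// Compute the moves for one iteration; performing them is left to the caller.
function planArchive(lockFiles: string[], iteration: number): Move[] {
  return lockFiles.map((name) => ({
    from: `locks/${name}`,
    to: `archive/v${iteration}/${name}`,
  }));
}

const moves = planArchive(["story-1.lock", "story-2.lock"], 3);
console.log(moves.map((m) => `${m.from} -> ${m.to}`).join("\n"));
```

Because every iteration gets its own `v<N>` directory, a lock from an earlier iteration can never be mistaken for a live one.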

The three escalation sentinels

The lock pattern catches DONE-claim drift. The three sentinels catch the other failure modes: cases where the agent shouldn’t even be trying to claim DONE yet.

`PIVOT_NEEDED` — orchestrator wants a fresh plan

Mid-iteration, the orchestrator can decide the brief no longer fits reality. Maybe the writer surfaced an architectural issue, maybe a test revealed an assumption was wrong. Instead of forcing the current plan, the orchestrator emits PIVOT_NEEDED and the planner re-enters with mode-aware handoff. Story list gets regenerated. Locks for completed stories stay valid; locks for the in-flight story get archived.

`PROBE_NEEDED` — writer admits underspecification

The writer is mid-story and realizes the brief is genuinely ambiguous. Two reasonable interpretations exist; both would pass the verification Command but produce different behavior. Instead of guessing, the writer emits PROBE_NEEDED with the specific question. A quick research probe answers it. The writer resumes with one interpretation confirmed.

This sentinel exists because the alternative — writers guessing and locking-in via the Command match — produces silent mismatches between intent and behavior.

`SANDBOX_BLOCKED` — legitimate operation hit the allowlist

The writer needs to run something legitimate that’s outside the worktree allowlist: reading a sibling repo, calling git on the parent, fetching a remote URL not in the network allowlist. Instead of trying to bypass the sandbox, the writer emits SANDBOX_BLOCKED to the operator, who decides whether to expand the allowlist or reframe the task.

Most agent escape attempts (--no-verify, 2>/dev/null to hide errors, sed to bypass quoting) come from SANDBOX_BLOCKED situations the agent tried to solve alone instead of escalating.
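One way to model the three sentinels is a discriminated union with a single routing function. This is a sketch: the payload fields (`reason`, `question`, `operation`) and the handler names are assumptions, and only the sentinel names and their destinations come from the descriptions above.

```typescript
// Sentinel names are from the post; the payload fields are hypothetical.
type Sentinel =
  | { kind: "PIVOT_NEEDED"; reason: string }
  | { kind: "PROBE_NEEDED"; question: string }
  | { kind: "SANDBOX_BLOCKED"; operation: string };

type Handler = "planner" | "research-probe" | "operator";

// Each sentinel has exactly one destination, so the agent never has to
// improvise a workaround when it hits one of these situations.
function routeSentinel(sentinel: Sentinel): Handler {
  switch (sentinel.kind) {
    case "PIVOT_NEEDED":
      return "planner"; // regenerate the story list
    case "PROBE_NEEDED":
      return "research-probe"; // answer the specific question, then resume
    case "SANDBOX_BLOCKED":
      return "operator"; // expand the allowlist or reframe the task
  }
}

console.log(routeSentinel({ kind: "SANDBOX_BLOCKED", operation: "git -C .. status" }));
// operator
```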

The bash AST upgrade — closing a real footgun

The original OpenCode bash sandbox tokenized commands by whitespace to apply the allowlist. That tokenizer was wrong:

Whitespace tokenizer

cat $(echo /etc/passwd)

Tokenizes as:

cat
$(echo
/etc/passwd)

Allowlist sees cat → allowed. Allowlist sees $(echo → unknown token, fails open in some configs.

Substitution slips past.

Tree-sitter AST walker

Same command parses to:

command: cat
- argument: command_substitution
  - command: echo
    - argument: /etc/passwd

Allowlist walks the tree. Sees cat. Sees echo. Sees /etc/passwd as a file argument.

/etc/passwd is outside the worktree → rejected.
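The difference can be sketched without the parser itself. The tree below is built by hand to mirror the parse shown above; a real implementation would obtain it from tree-sitter's bash grammar. The node shape, the allowlist contents, the `/work/repo` root, and the rule that any argument containing a slash is a path are all simplifying assumptions, and the path handling is POSIX-only.

```typescript
// Simplified, hand-built node shape; not tree-sitter's actual node API.
type Arg = string | Command;
type Command = { name: string; args: Arg[] };

const ALLOWED_COMMANDS = new Set(["cat", "echo"]); // hypothetical allowlist
const WORKTREE = "/work/repo"; // hypothetical worktree root

// Resolve a POSIX-style path against the worktree, collapsing "." and "..".
function resolvePath(root: string, p: string): string {
  const joined = p.startsWith("/") ? p : `${root}/${p}`;
  const parts: string[] = [];
  for (const part of joined.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") parts.pop();
    else parts.push(part);
  }
  return "/" + parts.join("/");
}

function insideWorktree(p: string): boolean {
  const resolved = resolvePath(WORKTREE, p);
  return resolved === WORKTREE || resolved.startsWith(WORKTREE + "/");
}

// Walk every command, including nested substitutions, and collect rejections.
function violations(cmd: Command): string[] {
  const found: string[] = [];
  if (!ALLOWED_COMMANDS.has(cmd.name)) found.push(`command not allowed: ${cmd.name}`);
  for (const arg of cmd.args) {
    if (typeof arg !== "string") {
      found.push(...violations(arg));
    } else if (arg.includes("/") && !insideWorktree(arg)) {
      found.push(`path outside worktree: ${arg}`);
    }
  }
  return found;
}

// cat $(echo /etc/passwd), as a tree
const tree: Command = { name: "cat", args: [{ name: "echo", args: ["/etc/passwd"] }] };

// Whitespace splitting yields three tokens: "cat", "$(echo", "/etc/passwd)"
console.log("cat $(echo /etc/passwd)".split(/\s+/));
// The walker reports one violation: "path outside worktree: /etc/passwd"
console.log(violations(tree));
```

The point of the walk is that nesting no longer hides anything: the inner `echo` and its argument are checked by the same rules as the outer `cat`.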

The change is in commit 46c4a39b7 — a refactor more than a feature, but it closed a real sandbox escape vector. Worth upstreaming to OpenCode proper.

Three-strikes rabbit-hole guard

The last line of defense inside the writer itself. After three consecutive verification failures on the same story, the writer is forced to emit STUCK instead of continuing to iterate. The orchestrator then decides: pivot, probe, or escalate to the operator.

Without this guard, writers in long loops produce diminishing-returns iterations — small tweaks that don’t address the root issue. With the guard, the agent has to step back and articulate what’s blocking it.
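The guard itself is little more than a per-story counter. A minimal sketch, assuming a hypothetical `RabbitHoleGuard` class and story ids as plain strings:

```typescript
const MAX_STRIKES = 3;

// Hypothetical guard: counts consecutive verification failures per story.
class RabbitHoleGuard {
  private strikes = new Map<string, number>();

  // Record one verification result and return what the writer must do next.
  record(storyId: string, passed: boolean): "DONE" | "RETRY" | "STUCK" {
    if (passed) {
      this.strikes.delete(storyId); // a pass clears the streak
      return "DONE";
    }
    const count = (this.strikes.get(storyId) ?? 0) + 1;
    this.strikes.set(storyId, count);
    return count >= MAX_STRIKES ? "STUCK" : "RETRY";
  }
}

const guard = new RabbitHoleGuard();
const outcomes = [false, false, false].map((passed) => guard.record("bash-scope-ast", passed));
console.log(outcomes.join(" ")); // RETRY RETRY STUCK
```

Keeping the counter outside the writer's own reasoning matters for the same reason the lock does: the decision to stop is not one the agent gets to evaluate.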

What this looked like before

Loops past 5 iterations, pre-lock pattern: often
Loops past 3 iterations, post-lock pattern: rare
False DONE claims, pre-fingerprint extraction: common
False DONE claims, post-fingerprint extraction: zero

These are subjective numbers from my own iteration log — not a controlled study. But the pattern shift is real and immediate.

What I’d take to any agent framework

If you’re building agents that ship code and can’t trust the agent’s own DONE claim:

Make the evaluation external. A Command the agent didn’t generate, validated against the agent’s claim by literal string match.
Strip markup before comparison. Writers will quote, format, escape. The lock should be invariant to formatting.
Archive per iteration. Prevent stale-state contamination at the file-system level.
Give the agent escape hatches. PIVOT, PROBE, BLOCKED. Cheaper than letting the agent try to bypass.
Cap iteration count. Three strikes. Force the agent to articulate the block instead of grinding.

The next post in this series will be about the structured model sweep — what 48 calls across DeepSeek thinking modes taught me about reasoning-token costs.