Technical Design: TD-002-helix-cli
Source identity (from
02-design/technical-designs/TD-002-helix-cli.md):
ddx:
id: TD-002
status: partially-superseded
superseded_by: helix.prdPARTIALLY SUPERSEDED — The portions of this technical design that describe
helix runas the primary execution surface, queue-drain mechanics, and CLI wrapper behavior as core HELIX architecture are superseded by the current PRD (helix.prd). HELIX does not own a CLI or execution loop. This design survives only as documentation for the DDx reference-runtime adapter — the thin CLI wrapper used when running HELIX commands in a DDx-backed environment. It must not be read as specifying the portable HELIX methodology surface.
Technical Design: TD-002-helix-cli
Status: backfilled Backfill Date: 2026-03-25
Story Reference
User Story: [HELIX CLI wrapper execution] | Feature: [FEAT-002-helix-cli] | Solution Design: [SD-001-helix-supervisory-control]
Acceptance Criteria Review
- The wrapper keeps HELIX execution bounded.
- Queue-drain decisions follow the documented control contract.
- Tracker-backed execution remains deterministic and testable.
- Local installation exposes the mirrored HELIX command surface safely.
Technical Approach
Provide a thin Bash wrapper that keeps HELIX execution bounded, delegates supervisory decisions to documented HELIX actions, and delegates managed execution to DDx execution surfaces without requiring operators to assemble prompts manually. The wrapper is an entrypoint and compatibility layer, not the durable home of queue-drain mechanics.
This design defines the orchestration contract for helix run. The wrapper is
not allowed to invent its own workflow policy, and it must not grow a parallel
managed-execution substrate now that DDx ships execute-bead and is adopting
execute-loop as the queue-drain contract.
Orchestration Contract
helix run is a bounded supervisor around HELIX queue state.
- It always starts from tracker-backed ready work, not from assumptions about in-progress work.
- It delegates managed bead execution to DDx instead of owning the git-aware execution mechanics itself.
- It should converge on
ddx agent execute-loopas the queue-drain substrate, withddx agent execute-beadas the bounded execution primitive underneath. - It treats alignment as a bead-governed planning action: acquire the
kind:planning,action:alignbead, then run the stored prompt and file properly ordered follow-on beads. - It treats
checkas the source of next-step guidance after the ready queue drains. - It revalidates selected issue state at safe boundaries so concurrent local refinement does not lead to stale claim or close behavior.
- It keeps alignment and backfill as distinct follow-up actions.
- It does not auto-build blocked work on
WAIT. - It does not silently continue after a review failure.
- It persists run-controller state so
helix statuscan report lifecycle progress, cycle timing, focused epic, blocked work, and token usage. - It prefers cross-model verification when
--review-agentis configured. - It stays focused on a chosen epic until the epic completes or a blocker is recorded.
Component Changes
Modified: Wrapper entry and run loop
- Current State: Bash wrapper around HELIX actions.
- Changes: Enforce bounded execution, queue-drain checks, and review-aware cycle control while keeping DDx as the queue-drain owner.
Modified: Tracker integration
- Current State: Local JSONL issue tracker.
- Changes: Dependency-aware ready/blocked selection, ownership, and recovery-aware orchestration.
Modified: Installer and harness
- Current State: Local launcher and deterministic wrapper tests.
- Changes: Keep mirrored command install and deterministic verification in sync with the wrapper contract.
State Machine
Ready Work
When ddx bead ready --json reports one or more ready execution issues:
- Select or scope the next ready execution issue using the tracker ranking rules and HELIX queue policy.
- Re-read the issue and verify it is still a safe execution target.
- Dispatch DDx-managed execution:
- target contract:
ddx agent execute-loop --once - compatibility path while loop parity lands:
ddx agent execute-bead <id>
- target contract:
- Read the structured DDx outcome and treat landed vs preserved state as the source of truth for what happened during the attempt.
- Re-check workflow drift before any HELIX-owned post-execution review or follow-on work.
- Count the pass as a completed cycle only if the DDx-managed attempt lands and the bead closes.
Queue Drain
When the ready queue is empty, helix run must execute check once and obey
the first NEXT_ACTION result exactly:
NEXT_ACTION: BUILDRe-enter ready-work selection and continue only if the tracker now reports ready execution work.NEXT_ACTION: DESIGNRun one bounded design pass, then re-check queue state before any build resumes.NEXT_ACTION: POLISHRun one bounded issue-refinement pass, then re-check queue state before any build resumes.NEXT_ACTION: ALIGNRun one alignment pass if auto-alignment is enabled. If the follow-up check still returnsALIGN, stop and surface the alignment command rather than looping forever.NEXT_ACTION: BACKFILLStop and surface the explicithelix backfill <scope>command. Backfill is a separate cross-activity action and is not auto-run byhelix run.NEXT_ACTION: WAITStop immediately.WAITmeans execution is blocked by claimed work or a truly external dependency;helix runmust not attempt an unblock build pass.NEXT_ACTION: GUIDANCEStop immediately and surface the required user or stakeholder decision.NEXT_ACTION: STOPStop cleanly because no actionable work remains for the scope.
Cycle Accounting
The loop must distinguish between attempted work and completed work.
attempted_cycles: increment when the wrapper starts a build attempt.completed_cycles: increment only after build succeeds, the issue is closed, and post-implementation review passes when review is enabled.--review-every Nand--max-cycles Nmust usecompleted_cycles, not attempted cycles.- Failed build attempts do not count toward the cycle limit.
cycle_start,cycle_end,cycle_duration, andtotal_tokensbelong in persisted run-controller state and in the loop’s progress output.- The Codex runner must capture stdout and stderr together before token
extraction so the
tokens usedfooter is accounted for regardless of which stream Codex used. .helix/context.mdmust be regenerated at run start, on epic switch, and after every 5 completed build cycles. The generator must include:- the Quick Reference build and test commands from
AGENTS.md - current open, in-progress, ready-execution, and closed issue counts
- current focused epic metadata when epic mode is active
- the Quick Reference build and test commands from
Exponential Backoff
When an issue fails implementation, the wrapper retries with bounded exponential backoff before declaring it blocked:
- Formula:
delay = min(5 * 2^(attempt-1), 40)seconds - Attempt 1: 5s, Attempt 2: 10s, Attempt 3: 20s, Attempt 4+: 40s (cap)
- After 4 failed attempts (75s total backoff), the issue is blocked as intractable and added to the skip list.
- If the blocked issue is a child of the focused epic, the parent epic is also blocked.
- The backoff sleep can be overridden with
HELIX_BACKOFF_SLEEP(useful for testing).
Summary Mode
The --summary (or -s) flag routes verbose output to the log file only
while emitting concise progress lines with log-file line-range pointers:
helix: [14:24:01] cycle 1: hx-42 (5 ready)
helix: [14:24:35] codex complete (rc=0, 34s, 892 tokens) — log L12–L340 in .helix-logs/helix-...log
helix: [14:24:36] cycle 1: hx-42 → COMPLETE (1/3 done, 892 tokens)Implementation uses two output helpers:
summary_line: writes to fd 3 (original stderr, always visible to the operator)verbose_line: writes to the log file only in summary mode, or to stderr in normal mode
Summary mode implies --quiet (no agent startup messages or stream-json
tool-call output).
Recovery and Ownership
helix run needs a safe recovery model for crashed or orphaned sessions.
Ownership Model
Claimed work is advisory state in the tracker, not a license to rewrite the worktree blindly.
- A claimed execution issue is represented by
status: in_progressandassignee: helix. - The tracker records claim freshness metadata on
--claim:claimed-at: ISO-8601 UTC timestamp of the claimclaimed-pid: OS process ID of the claiming HELIX session
- The inverse operation
--unclaimrestores the issue toopen, clearsassignee, and nullsclaimed-atandclaimed-pid. - Recovery may only reclaim work when the claim is demonstrably stale or when the user explicitly requests recovery.
- Absence of a currently running process is not sufficient evidence that a claim is orphaned — the claim age must also exceed the staleness threshold.
Recovery Algorithm
Recovery runs at the start of helix run and after each failed implementation
cycle. For each in_progress issue with assignee = helix:
- Skip if another helix process is actively working on the issue
(
pgrepcheck). - Skip if
claimed-pidis still alive (kill -0check). - Skip if the claim age (from
claimed-at, orupdatedas fallback for legacy issues) is belowHELIX_ORPHAN_THRESHOLD(default: 7200 seconds / 2 hours). - Reclaim via
tracker update <id> --unclaim.
Recovery resets tracker state only — it does not revert worktree changes or attribute partial work to files.
Recovery Rules
Recovery must be non-destructive by default.
- The wrapper must not perform a broad
git checkout -- .or otherwise erase unrelated local changes. - After any implementation failure or timeout, the wrapper releases the
advisory claim via
--unclaimso the next cycle does not inherit stale ownership. - Recovery results are visible in the run’s stderr output and log file.
Queue Drift Fingerprinting
The wrapper detects concurrent modifications to claimed issues by computing a fingerprint before implementation and comparing it after:
- Fingerprint fields:
spec-id,parent,superseded-by,replaces - Computed: immediately after selection, before claim
- Recomputed: immediately after implementation, before close
- On mismatch: the issue is unclaimed and skipped (queue drift detected)
BUILD Loop Breaker
When check returns NEXT_ACTION: BUILD but no execution-eligible issue can
be selected, the loop tracks consecutive empty BUILD cycles. After 2
consecutive empty BUILDs:
- Run orphan recovery to free any stale in-progress issues.
- If orphan recovery increased the ready count, reset the counter and continue.
- If still no selectable issues, stop the loop with a clear message.
Concurrent Interactive Refinement
helix run must support the local operating mode where one session advances
execution while another session refines specs or tracker issues, including
direct conversational agent work that creates or updates beads without going
through a CLI wrapper.
Material Queue Drift
Material drift includes any change that can invalidate execution authority or completion status for the currently selected issue:
- changed
spec-id - changed dependencies
- changed parent or replacement relationship
- superseded execution issue
- execution-eligibility change that removes the issue from the runnable set
Revalidation Rules
- Before claim:
- re-read the issue and verify it is still runnable
- if materially changed, skip claim and return to queue evaluation
- Before close:
- re-read the issue and verify it has not been superseded or structurally invalidated
- if materially changed, do not close it from stale assumptions; stop, reopen the decision path, or create follow-up work as appropriate
Execution Eligibility
The wrapper must distinguish execution-safe work from general open work so interactive refinement issues do not get treated as build targets by accident.
Batch Selection
Batching exists to save context-loading cost without collapsing unrelated work.
- The primary related-work heuristics remain shared parent and shared
spec-id. - When those heuristics produce no sibling candidates, batching must fall back
to matching ready execution-safe issues by shared
area:*labels. - Area-label fallback must still exclude skipped issues, epics, and the primary issue itself.
- If no parent,
spec-id, or area-label siblings exist, the wrapper must execute the primary issue alone.
Review Handling
Post-build review is part of the orchestration contract.
- A clean review allows the loop to continue.
- Review findings must create or reopen follow-up work before the loop advances to the next issue.
- Review should be machine-readable enough for the wrapper to distinguish
CLEANfrom actionable findings. - If review cannot be interpreted safely, the loop must stop instead of assuming success.
- When
--review-agentis configured, review must run under the alternate model rather than the implementation model. - When an epic closes, the loop must run a scoped post-epic review against the epic’s governing artifact before releasing focus.
- Actionable findings from review, alignment, measure, or report must be filed as beads with explicit parent/dependency topology whenever ordering matters; the wrapper must not depend on operator memory to infer the next runnable slice.
Blockers And Work Absorption
- Small adjacent work discovered while satisfying the same governing acceptance, such as related manifest updates or directly coupled file edits, should be absorbed into the current issue instead of spawning avoidable tracker noise.
- Work that is adjacent but not clearly required by the current acceptance must still become separate follow-up issues.
- When the loop exits with skipped or intractable work, it must emit a blocker report that names the issue, reason, attempts, and any epic-level impact.
Components
1. Wrapper Entry Script
scripts/helix resolves the repository root, selects the workflow library
root, sources the tracker library, opens a session log, parses CLI flags, and
dispatches commands.
2. Prompt Builders
Each agent-facing command builds a prompt that points at the authoritative workflow action file and injects repository-local instructions such as tracker usage and required output trailers.
3. Agent Runner
run_agent_prompt remains the surface for non-managed prompts such as
check, review, align, design, and polish. Managed implementation
attempts should flow through DDx execution commands instead of raw
ddx agent run.
4. Built-In Tracker Library
ddx bead manages issues stored as JSONL in .ddx/beads.jsonl. It
provides creation, read/update/close flows, dependency management, ready and
blocked queue queries, and tracker health summaries.
For this design, it is the source of truth for claim ownership and claim
freshness metadata.
5. Loop Controller
run_loop is the orchestration layer for helix run. It:
- checks ready work before implementation
- persists run-controller state for status and observability
- revalidates the selected issue immediately before claim and before close
- delegates one bounded managed execution pass at a time through DDx
- calls
checkafter the queue drains - honors
DESIGNandPOLISHqueue-drain results before build resumes - can auto-run alignment once after
ALIGN - stops on
WAIT,GUIDANCE, orSTOP - stops on
BACKFILLand surfaces the explicit follow-up command - can trigger periodic alignment with
--review-everyusing completed cycles - uses epic focus, bounded exponential backoff, and blocker reporting to keep useful work moving without losing traceability
- must not attempt an unblock build pass after
WAIT
Migration note:
- HELIX-owned issue selection, epic focus, batching, and queue-drain routing still exist today.
- The execution contract being designed for is DDx-managed queue drain via
execute-loop, not indefinite growth of wrapper-owned claim/execute/close logic. - If
execute-loopcannot yet express a required HELIX supervisory behavior, that gap should become explicit DDx follow-on work, ordered in the tracker with parents and dependencies where needed, rather than hidden shell policy.
6. Installer
scripts/install-local-skills.sh links the HELIX skill entrypoints into
~/.agents/skills, mirrors them into ~/.claude/skills for Claude
compatibility, preserves package-relative access back to the repository’s
shared workflows/ library, makes scripts/helix executable, and installs a
local helix launcher under ~/.local/bin.
7. Deterministic Test Harness
tests/helix-cli.sh creates temporary git workspaces, stubs agent binaries,
seeds tracker state, and verifies command behavior without relying on live
agent sessions.
Data and Filesystem Surfaces
- Workflow docs root:
workflows/ - Tracker state:
.ddx/beads.jsonl - Session logs:
.helix-logs/helix-YYYYMMDD-HHMMSS.log - Run-controller state:
.ddx/run-state.json - Blocker reports:
.helix-logs/blockers-YYYYMMDD-HHMMSS.md - Installed launcher:
~/.local/bin/helix - Installed skill links:
${CODEX_HOME:-$HOME/.codex}/skills,${CLAUDE_HOME:-$HOME/.claude}/skills
External Dependencies
- Required for normal operation:
bash,jq - Required per agent choice:
codexorclaude - Optional for swarm mode:
ntm,tmux
Constraints
- The wrapper is intentionally bounded; it is not allowed to replace the HELIX
loop with an unconditional
while true. - Tracker readiness is dependency-aware.
- Experiment mode must refuse dirty worktrees.
- Backfill must fail if the report trailer is missing or the declared report file does not exist.
WAITis a terminal orchestration result, not a cue to try another implementation pass.- Recovery must preserve unrelated local changes and may not rely on process absence alone.
Evidence
scripts/helix:15-37scripts/helix:109-239scripts/helix:250-359scripts/helix:381-519scripts/helix:542-570scripts/helix:579-784scripts/install-local-skills.sh:4-67tests/helix-cli.sh:46-176tests/helix-cli.sh:347-414tests/helix-cli.sh:571-646