Technical Design: TD-003 — helix start/stop
Status: draft | Date: 2026-04-05 | Story Reference: hx-407ed8b8 | Feature: FEAT-002 | Solution Design: SD-001
Problem
helix run currently blocks the terminal. The helix-worker skill launches
background runs from inside an agent session, but there is no discoverable CLI
command for operators to start and stop a background HELIX run directly. Users
must know about helix run --summary & and manually manage the process.
Acceptance Criteria
- helix start launches helix run as a background process.
- A PID file is written at ${HELIX_PID_FILE:-.ddx/helix.pid}.
- helix status shows whether a run is active (PID alive check).
- helix stop kills the background run cleanly.
- Stale PID files are detected and cleaned up.
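The PID file location in the criteria above relies on shell default expansion. A minimal sketch of how that path resolves, assuming a hypothetical helper named helix_pid_file (the name is illustrative and not part of the design):

```bash
# Hypothetical helper: resolve the PID file path from the environment,
# falling back to the default named in the acceptance criteria.
helix_pid_file() {
  printf '%s\n' "${HELIX_PID_FILE:-.ddx/helix.pid}"
}

unset HELIX_PID_FILE
helix_pid_file                                  # falls back to the default path
HELIX_PID_FILE=/tmp/custom.pid helix_pid_file   # uses the override
```

The `:-` form also falls back when the variable is set but empty, so an accidentally blank HELIX_PID_FILE does not yield an empty path.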
Design
helix start
    helix start [--agent claude|codex] [--max-cycles N] [helix run options...]

Behavior:
- Check if a run is already active (PID file exists, process alive) → refuse with an error.
- Delegate to helix run --summary <forwarded-flags>.
- Run the process in the background with output redirected to the log file.
- Write the PID to ${tracker_dir}/helix.pid.
- Print: helix: started (pid=<PID>, log=<path>)
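The launch-and-record steps above can be tried in isolation. This is only a sketch under stated assumptions: `sleep 30` stands in for helix run --summary, and a temporary directory stands in for the tracker directory.

```bash
# Sketch only: `sleep 30` stands in for `helix run --summary`, and the
# temp directory stands in for the tracker directory.
demo_dir="$(mktemp -d)"
log_file="$demo_dir/helix.log"
pid_file="$demo_dir/helix.pid"

sleep 30 >>"$log_file" 2>&1 &   # background job, output appended to the log
bg_pid=$!                       # $! holds the PID of the most recent background job
printf '%s\n' "$bg_pid" > "$pid_file"
printf 'helix: started (pid=%s, log=%s)\n' "$bg_pid" "$log_file" >&2

# Read the PID back the way stop/status would, then tidy up the demo.
recorded_pid="$(cat "$pid_file")"
kill -0 "$recorded_pid" 2>/dev/null && echo "run is alive"
kill "$recorded_pid"
wait "$recorded_pid" 2>/dev/null || true
rm -rf "$demo_dir"
```

Capturing `$!` immediately after the `&` matters: any later background job would overwrite it.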
Implementation:
    run_start() {
        require_tracker   # make sure the tracker dir exists before writing the PID file
        local pid_file="${tracker_dir}/helix.pid"
        if _helix_pid_alive "$pid_file"; then
            local pid; pid="$(cat "$pid_file")"
            die "helix run already active (pid=%s). Use 'helix stop' first." "$pid"
        fi
        # Clean stale PID file
        rm -f "$pid_file"
        # Launch helix run in the background in summary mode, with output
        # redirected to the log file ($log_file is assumed to be set by the
        # existing log file infrastructure)
        helix run --summary "$@" >>"$log_file" 2>&1 &
        local bg_pid=$!
        printf '%s\n' "$bg_pid" > "$pid_file"
        printf 'helix: started (pid=%s, log=%s)\n' "$bg_pid" "$log_file" >&2
    }

helix stop
    helix stop

Behavior:
- Read PID from ${tracker_dir}/helix.pid.
- If PID is alive, send SIGTERM.
- Wait up to 10 seconds for graceful shutdown.
- If still alive, send SIGKILL.
- Remove the PID file.
- Print: helix: stopped (pid=<PID>)
Implementation:
    run_stop() {
        local pid_file="${tracker_dir}/helix.pid"
        if [[ ! -f "$pid_file" ]]; then
            die "no active helix run (no PID file)"
        fi
        local pid; pid="$(cat "$pid_file")"
        if ! kill -0 "$pid" 2>/dev/null; then
            rm -f "$pid_file"
            die "stale PID file removed (pid=%s was not running)" "$pid"
        fi
        # Ask for a graceful shutdown first (SIGTERM is kill's default signal)
        kill "$pid" 2>/dev/null || true
        local waited=0
        while kill -0 "$pid" 2>/dev/null && (( waited < 10 )); do
            sleep 1
            # (( waited++ )) returns status 1 when waited is 0, which would
            # abort the script under set -e; plain arithmetic assignment is safe
            waited=$(( waited + 1 ))
        done
        # Still alive after the grace period: force termination
        if kill -0 "$pid" 2>/dev/null; then
            kill -9 "$pid" 2>/dev/null || true
        fi
        rm -f "$pid_file"
        printf 'helix: stopped (pid=%s)\n' "$pid" >&2
    }

helix status integration
Add to run_status() after the “Last Run” section:
    # Active run check
    local pid_file="${tracker_dir}/helix.pid"
    if _helix_pid_alive "$pid_file"; then
        local pid; pid="$(cat "$pid_file")"
        printf '\nActive Run:\n'
        printf ' PID: %s\n' "$pid"
        printf ' Log: %s\n' "$log_file"
    elif [[ -f "$pid_file" ]]; then
        rm -f "$pid_file" # clean up stale file
    fi

JSON output adds active_run: {pid: N, log: "path"} | null.
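One way the active_run field might be produced is sketched below. The helper name active_run_json and the log path .ddx/helix.log are hypothetical, and the sketch assumes the log path contains no characters that need JSON escaping.

```bash
# Hypothetical helper (not part of the design): print the active_run value
# for a given PID file and log path. Assumes the log path needs no JSON escaping.
active_run_json() {
  local pid_file="$1" log_file="$2" pid=""
  [ -f "$pid_file" ] && pid="$(cat "$pid_file")"
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    printf '{"pid": %s, "log": "%s"}\n' "$pid" "$log_file"
  else
    printf 'null\n'
  fi
}

demo_pid_file="$(mktemp)"
printf '%s\n' "$$" > "$demo_pid_file"   # the current shell is certainly alive
active_run_json "$demo_pid_file" ".ddx/helix.log"
active_run_json "/nonexistent/helix.pid" ".ddx/helix.log"
rm -f "$demo_pid_file"
```

If log paths can contain quotes or backslashes, a JSON-aware tool would be a safer way to build the object than printf.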
Helper
    # Succeeds only if the PID file exists, is non-empty, and names a live process
    _helix_pid_alive() {
        local pid_file="$1"
        [[ -f "$pid_file" ]] || return 1
        local pid; pid="$(cat "$pid_file")"
        [[ -n "$pid" ]] && kill -0 "$pid" 2>/dev/null
    }

Command surface
| Command | Behavior |
|---|---|
| helix start [opts] | Launch background run, write PID file |
| helix stop | Kill background run, remove PID file |
| helix status | Show active run if PID alive |
These mirror the skill surface: helix-worker (agent-side) maps to
helix start (CLI-side).
Test Plan
Add to tests/helix-cli.sh:
- helix start writes PID file and runs in background
- helix start refuses when already running
- helix stop kills process and removes PID file
- helix stop handles stale PID file gracefully
- helix status shows active run when PID alive
- helix status cleans stale PID file
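The stale-PID cases need a PID file that points at a dead process. A possible fixture is sketched below; make_stale_pid_file is a hypothetical name, the helper is a portable copy of _helix_pid_alive so the sketch runs standalone, and the conventions of tests/helix-cli.sh are not assumed.

```bash
# Portable copy of the helper from the Helper section, so this runs standalone.
_helix_pid_alive() {
  local pid_file="$1"
  [ -f "$pid_file" ] || return 1
  local pid; pid="$(cat "$pid_file")"
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Hypothetical fixture: record the PID of a process that has already exited.
make_stale_pid_file() {
  local pid_file="$1"
  true &                  # trivially short-lived background job
  local dead_pid=$!
  wait "$dead_pid"        # reap it so the PID no longer names a live process
  printf '%s\n' "$dead_pid" > "$pid_file"
}

test_dir="$(mktemp -d)"
make_stale_pid_file "$test_dir/helix.pid"
if _helix_pid_alive "$test_dir/helix.pid"; then
  stale_result="alive"
else
  stale_result="stale"
fi
printf 'stale fixture reported as: %s\n' "$stale_result"
```

In principle the operating system could reuse the dead PID before the check runs; for a test fixture that window is small enough to ignore, but it is worth knowing about if a test ever flakes.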
Edge Cases
- Double start: Refused with error pointing to helix stop.
- Stale PID file: Detected via kill -0 check; cleaned up automatically by both start and status.
- Process tree cleanup: helix run spawns agent subprocesses. SIGTERM to the parent triggers the existing cleanup trap, which handles child process termination.
- No tracker dir: helix start calls require_tracker before writing PID.
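The process tree case depends on the parent forwarding SIGTERM to its children. The sketch below illustrates that general pattern only; it is not the existing cleanup trap in run_helix, and the subshell with a `sleep 60` child is a stand-in for helix run and an agent subprocess.

```bash
demo_dir="$(mktemp -d)"
child_pid_file="$demo_dir/child.pid"

# Stand-in for `helix run`: one child process plus a SIGTERM trap.
(
  sleep 60 &
  child=$!
  # On SIGTERM: terminate the child, reap it, then exit with 128+15.
  trap 'kill "$child" 2>/dev/null; wait "$child" 2>/dev/null; exit 143' TERM
  printf '%s\n' "$child" > "$child_pid_file"
  wait "$child"
) &
parent=$!

# The PID file is written after the trap is installed, so once it is
# non-empty it is safe to signal the parent.
while [ ! -s "$child_pid_file" ]; do sleep 0.1; done
child="$(cat "$child_pid_file")"

kill "$parent"                       # what `helix stop` sends first
parent_status=0
wait "$parent" || parent_status=$?

if kill -0 "$child" 2>/dev/null; then child_state="alive"; else child_state="gone"; fi
rm -rf "$demo_dir"
printf 'parent exit=%s, child=%s\n' "$parent_status" "$child_state"
```

Without the trap, SIGTERM would end only the parent and leave the child running, which is why helix stop can rely on a single signal to the recorded PID only if such a trap exists.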
Dependencies
- Existing helix run --summary mode
- Existing log file infrastructure
- Existing cleanup trap in run_helix
Migration
No migration needed. PID file is ephemeral and auto-cleaned.