Status: accepted · Version v0.2.1 · Filed 2026-05-03

SPEC-072 — Per-Session Daemon Lifecycle — Spawn, Termination, Cleanup

Status: draft (v0.2, revised per Texi review 2026-05-03 02:39Z)
Author: Donna (Engineering)
Reviewer: Texi (Architect)
PO: Lola
Date: 2026-05-03
Relationship to SPEC-070: narrows the launch / termination contract of SPEC-070 v0.2’s daemon. Replaces §B4 (per-OS launcher integration via LaunchAgent / systemd-user / Task Scheduler). The daemon binary architecture (FSM, IPC server, WS client, plugin loader, metrics; PRs #51/#57/#58/#59/#61/#66/#67) is unchanged.

1. Motivation

SPEC-070 v0.2 §B4 specified that the per-persona daemon would be registered with the host OS as a long-lived service (LaunchAgent on macOS, systemctl --user on Linux, Task Scheduler logon trigger on Windows). The daemon would be brought up at user-login time and torn down via prism_persona_destroy, surviving across editor restarts and idle periods. Reviewed against actual usage, that design fails to fit in two ways:
  1. The daemon runs when no Prism is in use. A user who logs in but doesn’t open any editor for the day still has a daemon process consuming a WS connection, heartbeating the backend, and occupying registration rows. The system pays for an “always-on listener” whose only consumer (the editor surface) isn’t running.
  2. OS-launcher integration is heavy and platform-specific. Each OS requires a different installer path. The install lane carries the cross-OS complexity. Every new OS adds another launcher template.
The original argument for OS-launching was the “Lola idle for hours, teammate sends urgent mail, daemon catches it” case. Empirically that case is much weaker than it sounded at ratification: bootstrap drain via prism_start already returns queued signals on next editor open, so the daemon adds only the narrow window of “system notification while no editor is open”, which the Desktop surface cannot even relay to the LLM (no notifications/claude/channel support).
This SPEC moves the daemon’s lifecycle inside Prism’s own usage envelope: the daemon’s lifetime is the bootstrapped Prism session, spawned on prism_start and terminated on prism_wrap. The structural value SPEC-070 actually delivers (single durable WS subscription, FSM-managed reconnect, signal continuity within an active session) is preserved.

2. Goals

  • Daemon’s lifetime is bounded by the Prism session: it spawns on prism_start and terminates on prism_wrap (or when the shim process dies abruptly without a wrap).
  • Never running when there’s no active Prism session.
  • No OS-launcher integration. Cross-platform contract is identical.
  • 1:1 relationship — one bootstrapped session, one daemon. No shared-resource machinery (no lockfile, no connection counting, no linger window).
  • Backend orphan-row cleanup is automatic via the existing routing-registry liveness probe (PR #56).
  • Shim-side handle tracking ensures bootstraps are idempotent — repeated prism_start calls without intervening prism_wrap do not duplicate daemons.

3. Non-goals

  • No multi-session sharing. If a future workflow puts the same persona on two editor surfaces simultaneously (each with its own bootstrapped session), two daemons spawn. Signals fan out via the existing publish path; the recipient sees the doorbell twice. Duplicate but not broken.
  • No always-on listener. “Lola idle 4h, teammate sends mail” → bootstrap drain on next session.
  • No OS service / LaunchAgent / systemd integration. SPEC-070 §B4 templates are retired.
  • No daemon-spawned-by-installer. prism install does not start the daemon.

4. The unit of life: one bootstrapped session, one daemon

The MCP shim is a long-lived process inside the editor (Desktop / Code / Codex / etc.). It survives prism_start / prism_wrap / re-bootstrap cycles for the editor’s entire lifetime (and beyond, per PR #81’s relauncher staleness check). The daemon is not scoped to the shim process; it is scoped to the bootstrapped session that the shim is currently hosting.
Lifecycle contract per shim instance:
| Event | Daemon state |
| --- | --- |
| Shim process starts (editor launch) | No daemon yet |
| prism_start called → session registered | Shim spawns daemon, retains the ChildProcess handle as a module-level singleton |
| Shim is bootstrapped (between prism_start and prism_wrap) | Daemon is alive, holding WS subscription + UDS server |
| prism_wrap called | Shim closes the daemon’s stdin pipe → daemon’s EOF handler initiates graceful shutdown → shim clears its singleton handle |
| Repeat prism_start after wrap | Singleton handle is null → fresh daemon spawn |
| Repeat prism_start without wrap (idempotent re-bootstrap) | Singleton handle non-null → no-op (existing daemon stays) |
| Shim process dies abruptly (Cmd+Q, crash, kernel kill) without prism_wrap | Stdin pipe closes via OS → daemon’s EOF handler runs as backstop |
| Daemon dies | Shim’s doorbell client sees UDS errors; backend’s liveness probe at 60s marks the daemon row stale |
The unit is one daemon per bootstrapped session. The shim’s singleton handle prevents duplication across re-bootstrap. Stdin EOF is the only normative parent-death backstop for the unclean-exit case.
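The contract in the table can be read as a small transition function over the shim’s singleton handle. The sketch below is illustrative only: it models the handle as a boolean, and ShimEvent, HandleState, and nextHandleState are hypothetical names, not identifiers from lifecycle.ts.

```typescript
// Hypothetical model of the shim's singleton handle; not the real lifecycle.ts code.
type ShimEvent = "prism_start" | "prism_wrap" | "daemon_exit";

interface HandleState {
  hasDaemon: boolean; // true while the shim holds a live ChildProcess handle
  spawns: number;     // how many daemons this shim has spawned so far
}

function nextHandleState(state: HandleState, event: ShimEvent): HandleState {
  switch (event) {
    case "prism_start":
      // Non-null handle: idempotent re-bootstrap, no second daemon.
      return state.hasDaemon
        ? state
        : { hasDaemon: true, spawns: state.spawns + 1 };
    case "prism_wrap":
    case "daemon_exit":
      // Either path leaves the handle null so the next prism_start spawns fresh.
      return { ...state, hasDaemon: false };
  }
}

// start, repeated start (no-op), wrap, start again (fresh spawn)
const events: ShimEvent[] = ["prism_start", "prism_start", "prism_wrap", "prism_start"];
const finalState = events.reduce(nextHandleState, { hasDaemon: false, spawns: 0 });
console.log(finalState); // { hasDaemon: true, spawns: 2 }
```

Two spawns for four events is the 1:1 rule in miniature: the repeated prism_start adds nothing, and only the wrap makes room for a second daemon.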

5. Spawn — on prism_start

mcp-node/src/verbs/lifecycle.ts:prism_start gains a daemon-spawn step after session registration succeeds. The shim retains the ChildProcess handle in module-level state:
import { spawn, type ChildProcess } from "node:child_process";

// Module-level singleton: the daemon backing the current bootstrapped session.
let daemon_process: ChildProcess | null = null;

async function spawnDaemonForSession(): Promise<void> {
  if (daemon_process) {
    // Idempotent: re-bootstrap inside the same shim without an intervening
    // wrap is a no-op. Existing daemon continues serving the (re-)registered
    // session.
    return;
  }
  const child = spawn(process.execPath, [daemonBinPath()], {
    stdio: ["pipe", "ignore", "ignore"],   // shim writes to daemon's stdin; stdout/stderr ignored
    detached: false,                       // implementation detail; not load-bearing for correctness
    env: {
      ...process.env,
      PRISM_DAEMON_SHIM_PID: String(process.pid),
      PRISM_DAEMON_SHIM_DOORBELL_PATH: shimDoorbellSocketPath(),
    },
  });
  daemon_process = child;
  child.unref();   // shim's event loop can exit independently of daemon's
  child.on("exit", () => {
    // Clear the handle only if it still points at this child. After a wrap
    // followed by a new prism_start, a late exit from the old daemon must
    // not null out the handle of its replacement.
    if (daemon_process === child) daemon_process = null;
  });
}
stdio: ["pipe", ...] is the load-bearing detail. The shim holds a writable stream to the daemon’s stdin. When the shim closes the pipe (either explicitly on prism_wrap or implicitly via process death), the daemon receives EOF and shuts down. This is the entire parent-side termination mechanism, identical across Mac / Linux / Windows.
detached: false is implementation hygiene only (same process group, easier debugging) and is not part of the correctness story. Stdin EOF is the only normative termination signal. Process-group membership is not a portable lifecycle guarantee.
unref() allows the shim’s event loop to exit if the shim shuts down first (uncommon; the shim usually outlives the daemon). The exit listener clears the singleton handle so a subsequent prism_start (after the daemon has gone away) will re-spawn cleanly.
daemonBinPath() resolves to mcp-node/dist/daemon/server.js per Texi’s packaging answer: the daemon ships inside the same mcp-node dist artifact, not a separate repo-root daemon tree. Same runtime, same installer surface, same version.
Spawn-failure policy: if spawn() fails (binary missing, exec permission denied), the shim logs to stderr and emits a prism_daemon_spawn_failed{reason} metric (Texi calibration). Node reports most spawn failures through the child’s "error" event rather than a synchronous throw, so the failure path has to cover both. The shim continues without daemon backing, the same posture as today’s behavior when no daemon is running. No system signal to the operator.
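The spawn-failure policy can be sketched as a wrapper that covers both failure channels. This is a minimal illustration, not the shim’s actual code: spawnDaemonNonFatal and the EmitMetric callback are hypothetical, since this SPEC does not describe the metrics API.

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Hypothetical metric hook; the real metrics API is not described in this SPEC.
type EmitMetric = (name: string, labels: Record<string, string>) => void;

function spawnDaemonNonFatal(
  nodePath: string,
  daemonPath: string,
  emitMetric: EmitMetric,
): ChildProcess | null {
  const reportFailure = (err: unknown): void => {
    const reason = (err as NodeJS.ErrnoException)?.code ?? "unknown";
    console.error(`[shim] daemon spawn failed: ${reason}`);
    emitMetric("prism_daemon_spawn_failed", { reason });
  };
  try {
    const child = spawn(nodePath, [daemonPath], {
      stdio: ["pipe", "ignore", "ignore"],
    });
    // Failures such as ENOENT or EACCES arrive asynchronously as an "error" event.
    child.once("error", reportFailure);
    return child;
  } catch (err) {
    // Synchronous throws cover invalid arguments (for example, an empty file path).
    reportFailure(err);
    return null;
  }
}
```

Registering the "error" listener also matters for stability: an "error" event with no listener is thrown as an uncaught exception, which would take the shim down. When the failure arrives asynchronously the caller already holds a non-null handle, so it would still need to clear the singleton at that point.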

6. Termination — on prism_wrap, and on shim death

6.1 Clean termination via prism_wrap

mcp-node/src/verbs/lifecycle.ts:prism_wrap gains a daemon-shutdown step before session deregistration:
async function shutdownDaemonForSession(): Promise<void> {
  if (!daemon_process) return;   // no daemon to shut down
  const proc = daemon_process;
  daemon_process = null;          // clear singleton up front so a concurrent prism_start can spawn fresh
  try {
    proc.stdin?.end();             // close the pipe → daemon's EOF handler fires
  } catch {
    /* daemon already exited; nothing to do */
  }
  // We don't await the daemon's exit; its graceful shutdown runs async.
  // Backend's liveness probe handles the registration row regardless.
}
Closing the daemon’s stdin is the same mechanism as the shim-death backstop — just initiated explicitly rather than via OS process teardown. The daemon doesn’t need to know whether stdin closed because the shim wrapped or because the shim died; its EOF handler does the same thing either way.

6.2 Daemon-side EOF handler

The daemon binary’s entry point opens a stdin reader at process start:
process.stdin.on("end", () => initiateGracefulShutdown("stdin_eof"));
process.stdin.on("error", () => initiateGracefulShutdown("stdin_error"));
process.stdin.resume();   // start reading; we don't expect data, just EOF
initiateGracefulShutdown(reason) performs:
  1. Mark FSM SHUTTING_DOWN (new terminal state added to ADR-41 FSM)
  2. Stop accepting new UDS clients
  3. Close all WS connections to backend (sends WS close frame so backend’s _forward_pubsub task observes disconnect cleanly)
  4. Call prism_wrap(kind=daemon) to deregister the daemon row (best-effort; backend liveness probe handles it if this fails)
  5. Flush any in-flight metrics
  6. process.exit(0)
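The six steps above could be wired together roughly as follows. This is a sketch under stated assumptions: the ShutdownDeps interface and all of its member names are hypothetical stand-ins for the daemon’s real FSM, UDS server, WS client, and metrics modules, and the idempotence guard is an addition suggested by the fact that both stdin handlers can fire.

```typescript
// Hypothetical dependency bundle; none of these names come from the daemon source.
interface ShutdownDeps {
  markShuttingDown(): void;                 // step 1: FSM -> SHUTTING_DOWN
  stopAcceptingUdsClients(): Promise<void>; // step 2
  closeWsConnections(): Promise<void>;      // step 3: sends WS close frames
  deregisterDaemonRow(): Promise<void>;     // step 4: prism_wrap(kind=daemon)
  flushMetrics(): Promise<void>;            // step 5
  exit(code: number): void;                 // step 6: process.exit in production
}

let shutdownStarted = false;

async function runGracefulShutdown(reason: string, deps: ShutdownDeps): Promise<void> {
  if (shutdownStarted) return; // stdin "end" and "error" may both fire
  shutdownStarted = true;
  console.error(`[daemon] shutting down: ${reason}`);

  deps.markShuttingDown();
  await deps.stopAcceptingUdsClients();
  await deps.closeWsConnections();
  try {
    await deps.deregisterDaemonRow();
  } catch {
    // Best-effort: the backend liveness probe cleans up the row if this fails.
  }
  await deps.flushMetrics();
  deps.exit(0);
}
```

Injecting exit rather than calling process.exit directly keeps the sequence testable; a production caller would pass `(code) => process.exit(code)`.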

6.3 Shim-death backstop

If the shim process dies abruptly without calling prism_wrap (Cmd+Q without wrap, segfault, OOM, kernel kill), the OS closes the shim’s open file descriptors — including the stdin pipe to the daemon. The daemon’s stdin reader fires the same EOF handler. This is the only parent-death mechanism the spec relies on; it is identical across all three target platforms.
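The stdin-EOF mechanism can be observed in isolation with a throwaway child process. The sketch below is not the daemon: childSource is an inline stand-in that exits as soon as its stdin reaches EOF, and exitCodeAfterStdinClose is a hypothetical helper name.

```typescript
import { spawn } from "node:child_process";

// Inline stand-in for the daemon: exits with code 0 as soon as stdin reaches EOF.
const childSource = `
  process.stdin.on("end", () => process.exit(0));
  process.stdin.resume();
`;

function exitCodeAfterStdinClose(): Promise<number | null> {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", childSource], {
      stdio: ["pipe", "ignore", "ignore"],
    });
    child.once("error", reject);
    child.once("exit", (code) => resolve(code));
    child.stdin?.end(); // the same call the shim makes on prism_wrap
  });
}

exitCodeAfterStdinClose().then((code) => {
  console.log(`child exited with code ${code}`);
});
```

This exercises only the explicit close. The abrupt-death path relies on the OS closing the parent’s end of the pipe, which the child observes as the same EOF.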

7. Edge cases

| Case | Behavior |
| --- | --- |
| prism_wrap (clean session end) | Shim closes daemon’s stdin → daemon EOF handler fires → graceful shutdown. Singleton handle cleared in shim. |
| prism_start after prism_wrap (same shim, new session) | Singleton handle is null → spawn fresh daemon for the new session |
| prism_start repeated without intervening wrap (idempotent re-bootstrap) | Singleton handle non-null → spawn no-op. Existing daemon continues. |
| Shim Cmd+Q (editor close) without wrap | OS closes shim’s fds → stdin pipe to daemon closes → daemon EOF handler fires |
| Shim segfault / OOM | Same as Cmd+Q: OS-level fd cleanup catches it |
| Daemon crash | daemon_process.on("exit") fires in shim; singleton cleared. Shim’s doorbell client (createShimDoorbellClient) sees UDS connect errors and runs daemon-less for the rest of this session. Next prism_start (after wrap+rebootstrap) spawns fresh. |
| Both shim + daemon crash | Backend’s liveness probe at 60s marks both stale. Next bootstrap does startup_drain and re-spawns. |
| Host suspend (laptop lid close) | Backend stops getting heartbeats. Liveness probe marks stale. On wake, daemon’s WS reconnects via FSM RECONCILING → CONNECTED; shim’s heartbeat resumes. Existing SPEC-070 / PR #56 contract; no new code path. |
| Two editors open simultaneously, same persona | Each editor’s shim hosts its own bootstrapped session, each spawning its own daemon. Signals to the persona fan out via the existing publish path; recipient sees the doorbell from both daemons. Duplicate-but-not-broken. Acceptable for v1; revisit only if observed in practice. |

8. Backend cleanup of fragments

Daemon registers as kind=daemon (PR #52) and heartbeats over the existing transport. Three terminal cleanup paths exist already:
  1. Graceful deregister — daemon’s initiateGracefulShutdown calls prism_wrap(kind=daemon). Backend marks the row released_at = NOW().
  2. Liveness probe (PR #56) — if heartbeat falls behind the 60s threshold, send-time resolution treats the daemon as not_available_stale and the routing-registry sweep marks the row stale.
  3. gRPC heartbeat continuity — bidirectional gRPC stream (via prism-server-backend-grpc container) provides faster signal than HTTP polling; broken stream is immediate evidence of daemon death. Aspirational optimization, not a v1 gate (Texi calibration). v1 ships with HTTP heartbeat.
No new sweeper or cleanup verb required.
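The first two cleanup paths reduce to a simple classification of a daemon row. The sketch below is illustrative only and is written in TypeScript for consistency with the rest of this document; the backend’s actual implementation and schema are not described here, and DaemonRow, RowStatus, and classifyDaemonRow are hypothetical names. Only the 60s threshold and the released_at column come from the text above.

```typescript
const STALE_THRESHOLD_MS = 60_000; // 60s liveness threshold (PR #56)

type RowStatus = "released" | "stale" | "live";

// Hypothetical row shape; timestamps are epoch milliseconds.
interface DaemonRow {
  lastHeartbeatAt: number;
  releasedAt: number | null; // set by graceful deregister, else null
}

function classifyDaemonRow(row: DaemonRow, nowMs: number): RowStatus {
  // Path 1: graceful deregister already marked the row released.
  if (row.releasedAt !== null) return "released";
  // Path 2: heartbeat fell behind the threshold, so the probe treats it as stale.
  return nowMs - row.lastHeartbeatAt > STALE_THRESHOLD_MS ? "stale" : "live";
}

const now = 1_000_000;
console.log(classifyDaemonRow({ lastHeartbeatAt: now - 5_000, releasedAt: null }, now));  // live
console.log(classifyDaemonRow({ lastHeartbeatAt: now - 61_000, releasedAt: null }, now)); // stale
console.log(classifyDaemonRow({ lastHeartbeatAt: now - 61_000, releasedAt: now }, now));  // released
```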

9. Cross-platform considerations

The stdin-EOF mechanism works identically on macOS, Linux, and Windows because it operates at the OS pipe level. No prctl(PR_SET_PDEATHSIG) (Linux-only), no kqueue EVFILT_PROC (macOS-only), and no Windows Job Object is needed; they are explicitly not part of this SPEC’s correctness contract.
daemonBinPath() resolution:
  • All platforms: resolved relative to the running shim binary’s location: path.join(path.dirname(import.meta.url), "daemon", "server.js") (after URL→path normalization)
  • Spawn: spawn(process.execPath, [daemonBinPath()], ...) invokes the same Node binary the shim is running under — no cross-platform shell wrapper
prism install writes the same env block on every platform. No additional templates or per-OS install scripts are introduced by this SPEC.
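The “URL→path normalization” step matters because import.meta.url is a file: URL string, not a filesystem path. A minimal sketch of the resolution, assuming the shim entry point sits at mcp-node/dist/server.js; the helper name daemonBinPathFrom and its URL parameter are hypothetical, introduced so the logic can be exercised without an ES module context.

```typescript
import * as path from "node:path";
import { fileURLToPath } from "node:url";

// Takes the module URL as a parameter; the shim would pass import.meta.url.
function daemonBinPathFrom(moduleUrl: string): string {
  // Convert the file: URL to a native path before any path arithmetic.
  const shimDir = path.dirname(fileURLToPath(moduleUrl));
  return path.join(shimDir, "daemon", "server.js");
}

// Example with a POSIX-style location (illustrative path only):
console.log(daemonBinPathFrom("file:///opt/prism/mcp-node/dist/server.js"));
```

Inside the shim, daemonBinPath() would then be `daemonBinPathFrom(import.meta.url)`. Using fileURLToPath rather than string slicing keeps percent-encoded characters and Windows drive letters correct.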

10. What changes vs SPEC-070 v0.2

| SPEC-070 §B4 element | Status under SPEC-072 |
| --- | --- |
| launchctl bootstrap LaunchAgent template | retired |
| systemctl --user enable unit file template | retired |
| schtasks /create Task Scheduler template | retired |
| prism_persona_create invokes OS launcher | retired |
| prism_persona_destroy calls IPC shutdown + 5s-then-kill escalation | retired as a daemon path; persona-destroy still cleans persona rows but doesn’t talk to a daemon |
| Per-OS launcher integration matrix | retired |
| Linger-enabled programmatically on Linux | retired |
| Daemon binary (FSM, IPC, WS, plugins, metrics) | unchanged |
| kind=daemon registration discriminator (PR #52) | unchanged |
| Routing-registry liveness probe (PR #56) | unchanged; handles fragment cleanup |
| Daemon→shim doorbell over UDS | unchanged |
| Plugin contract (SurfacePlugin interface) | unchanged |
| Open TODO 5s-shutdown-then-kill in prism_persona_destroy (project_open_todo_kill_fallback) | retired |
| Daemon binary packaging path | changed: moves to mcp-node/dist/daemon/server.js (sibling of mcp-node/dist/server.js); daemon ships in the same dist artifact as the shim per Texi calibration |
Lafonda’s in-flight worktree feat/spec-070-b4-install-lane-2026-05-02 becomes a no-op for daemon launch. Install-lane work narrows to: ensure mcp-node/dist/daemon/server.js ships in prism install output, and the env block contains PRISM_DAEMON_SHIM_PID / PRISM_DAEMON_SHIM_DOORBELL_PATH keys (latter already present per claude_code.ts:resolveShimDoorbellSocketPath).

11. Implementation plan

Phase A — Shim spawn integration (Donna)

  • A1. Module-level daemon_process: ChildProcess | null singleton in mcp-node/src/verbs/lifecycle.ts
  • A2. spawnDaemonForSession() helper called after session registration in prism_start. Idempotent against re-spawn.
  • A3. Daemon path resolution helper (daemonBinPath()) using path.dirname(import.meta.url) + relative resolve to daemon/server.js
  • A4. Spawn-failure handling — log to stderr, emit prism_daemon_spawn_failed{reason} metric, non-fatal

Phase B — Shim-side wrap shutdown + daemon stdin-EOF handler (Donna)

  • B1. shutdownDaemonForSession() in prism_wrap path — clears singleton, closes daemon stdin
  • B2. Daemon entry point opens process.stdin reader on start
  • B3. EOF / error handlers call initiateGracefulShutdown(reason)
  • B4. FSM gains SHUTTING_DOWN terminal state; reconnect logic gates on it
  • B5. Graceful shutdown sequence (UDS close → WS close → deregister → metrics flush → exit)
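For B4, “terminal” means two things: no transition leaves SHUTTING_DOWN, and reconnect logic checks the state before acting. A minimal sketch, assuming a much simpler FSM than ADR-41 actually defines: only RECONCILING, CONNECTED, and SHUTTING_DOWN are named in this SPEC, so DISCONNECTED and the DaemonFsm class are hypothetical placeholders.

```typescript
// DISCONNECTED is a placeholder; the real ADR-41 state set is not listed in this SPEC.
type DaemonState = "DISCONNECTED" | "RECONCILING" | "CONNECTED" | "SHUTTING_DOWN";

class DaemonFsm {
  private state: DaemonState = "DISCONNECTED";

  get current(): DaemonState {
    return this.state;
  }

  // Returns false when the transition is refused because the FSM is terminal.
  transition(next: DaemonState): boolean {
    if (this.state === "SHUTTING_DOWN") return false;
    this.state = next;
    return true;
  }

  // Reconnect logic gates on this before scheduling another attempt.
  shouldReconnect(): boolean {
    return this.state !== "SHUTTING_DOWN";
  }
}

const fsm = new DaemonFsm();
fsm.transition("RECONCILING");
fsm.transition("CONNECTED");
fsm.transition("SHUTTING_DOWN");
console.log(fsm.transition("RECONCILING")); // false: terminal state is sticky
console.log(fsm.shouldReconnect());         // false
```

Without this gate, the WS close performed during graceful shutdown could look like an ordinary disconnect and trigger a reconnect while the daemon is exiting.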

Phase C — Retire SPEC-070 §B4 (Lafonda + Donna)

  • C1. Lafonda: close the feat/spec-070-b4-install-lane-2026-05-02 worktree without merging the launcher templates
  • C2. Lafonda + Donna: move daemon entry to mcp-node/dist/daemon/server.js packaging; ensure prism install ships the file
  • C3. Donna: remove or repurpose prism_persona_destroy’s daemon-talking path
  • C4. Update SPEC-070 (v0.3 or annotation) to point at SPEC-072 for launch lifecycle

Phase D — Smoke (Donna + Porsche)

  • D1. End-to-end smoke: shim bootstrap → prism_start → daemon spawn → WS connection visible in prism_status daemon view → prism_wrap → daemon exits within 1s → backend row marked released. Then prism_start again in same shim → fresh daemon. Then kill shim → daemon exits via stdin EOF backstop.
  • D2. Porsche: dashboard shows daemon spawn / death events as observable mesh state (folds into SPEC-071 §11 card)
Each phase is independently shippable. Phase A + B together are the minimum shipping unit; C is cleanup; D is verification.

12. Resolved during review

Texi review 2026-05-03 02:39Z answered all v0.1 open questions:
  • Daemon binary packaging: ships at mcp-node/dist/daemon/server.js (sibling of mcp-node/dist/server.js). Same shipped artifact as the shim, same versioning unit. Adopted in §10.
  • Spawn-failure policy: non-fatal, log to stderr, emit metric. No system signal escalation. Adopted in §5.
  • gRPC heartbeat upgrade: aspirational, not a v1 gate. v1 ships with HTTP heartbeat. Adopted in §8 item 3.

13. References

  • SPEC-070 v0.2 — daemon binary architecture (foundation; this SPEC narrows §B4 only)
  • ADR-41 — daemon FSM; gains SHUTTING_DOWN terminal state via this SPEC
  • ADR-42 — project attach lifecycle (unchanged)
  • ADR-43 — daemon-via-shim (filed by Texi during SPEC-070 ratification; this SPEC provides the concrete launch contract)
  • PR #52 — agent_sessions.kind=daemon discriminator (unchanged)
  • PR #56 — routing-registry liveness probe (provides 60s stale threshold for fragment cleanup)
  • PR #57 — daemon binary skeleton
  • PR #67 — backend wake-up: daemon WS receipt is notification, not delivery
  • PR #81 — mcp-launcher staleness check (relevant: the launcher is what keeps the shim long-lived across editor restarts; the unit-of-life is therefore the bootstrapped session, not the shim process)
  • project_open_todo_kill_fallback — retired by this SPEC
  • feedback_frank_spidy_sense_pattern — encapsulation > resource economy; drives the 1:1 choice
  • feedback_optimize_later — get architecture uniform first; multi-surface-per-persona is unobserved
  • project_lane_split_install_vs_deploy — Lafonda owns install lane scope reduction in §11 Phase C
  • SPEC-071 — Signal Bus QoS (independent SPEC; daemon participates as a kind=daemon recipient via the same delivery semantics)

14. Revision log

  • v0.1 (2026-05-03 02:35Z) — initial draft circulated for review
  • v0.2 (2026-05-03 02:42Z) — Texi review revisions:
    • Finding 1: corrected the unit-of-life. Daemon is scoped to the bootstrapped session, NOT the shim process. Spawn on prism_start (idempotent against re-bootstrap), terminate on prism_wrap (shim closes daemon’s stdin, fires the same EOF mechanism). Stdin-EOF on shim death remains the unclean-exit backstop. §4-§7 rewritten.
    • Finding 2: removed the “process-group cleanup belt-and-suspenders” claim from the correctness story. Stdin EOF is the only normative parent-death mechanism. detached: false is now flagged as implementation hygiene only.
    • Finding 3: added the module-level singleton handle daemon_process in the shim. Re-bootstrap without wrap is a no-op (existing daemon continues); wrap clears the handle so the next prism_start spawns fresh. §5-§7 carry the rule.
    • Plus calibration: daemon packaging at mcp-node/dist/daemon/server.js, spawn-failure metric, gRPC aspirational not v1 gate.
Last modified on May 18, 2026