
1. Problem & framing

This plan ratifies work already shipped (§4) and adds the genuinely new model-acted ACK extension (§5). It is not a gate on Texi’s first-pass implementation, which landed during this morning’s session. Step 1 (§9) is therefore “Texi PRs the existing first-pass,” not “Texi codes the first-pass.” The remaining gap, called out in Retro #11:
Backend delivered_at and publish_path prove transport progress, not model-visible action.
Texi’s runtime_diagnostics snapshot closes the observability half (stream / strategy / surface wake / tickler events visible per-process). What’s still missing:
  1. Durability — events live in an in-process 80-entry ring; lost on shim restart, not queryable across surfaces.
  2. Model-acted signal — no explicit record that the agent’s model context actually consumed the doorbell beyond drain.
  3. Cross-process query verb: prism_runtime_diagnostics is per-runtime; no joined view per signal_id.
§5 closes those three. Everything else is documentation and validation matrix.
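The durability gap in item 1 can be illustrated with a minimal sketch. It assumes the ring behaves like a fixed-capacity queue that evicts its oldest entry; the event shape and stage name are hypothetical and not taken from runtime_diagnostics.ts.

```python
from collections import deque

RING_CAPACITY = 80  # the 80-entry ring described above

# Hypothetical stand-in for the in-process diagnostics ring.
ring = deque(maxlen=RING_CAPACITY)

# Record 100 events; a bounded deque silently evicts the oldest ones.
for seq in range(100):
    ring.append({"seq": seq, "stage": "signal_frame_received"})

retained = len(ring)              # only the newest 80 survive
oldest_retained = ring[0]["seq"]  # events 0..19 are gone

# A shim restart begins with a fresh, empty ring: nothing carries over.
ring_after_restart = deque(maxlen=RING_CAPACITY)
```

Both loss modes (eviction under load and loss on restart) are why the trace events in §5 need a store outside the process.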

2. Scope

In-scope (this plan):
  • End-to-end observability of stages 1–9 (signal_created → backend publish → backend WS frame → MCP stream receipt → strategy delivery → surface wake → model turn → drain → reply).
  • Closing the transport-vs-model gap with a model-acted ACK protocol.
  • A paced validation matrix between Donna and Texi before fleet rollout.
Out-of-scope (deferred):
  • Always-on per-persona daemon wake path (SPEC-070 — separate effort, lane: Texi).
  • Surface-adapter rendering bugs (e.g. PeerJoined Unknown — postmortem 74b62026, Donna 5-line patch lane).
  • Doorbell durability across prism_wrap → prism_start cycle (Porsche open question).
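The nine in-scope stages can be written down as an ordered enumeration, which makes "which stages has this trace not reached yet" a simple set difference. This is a sketch only: the member names are labels derived from the stage list above, not identifiers from the codebase.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages 1-9 from the in-scope list; names are illustrative labels."""
    SIGNAL_CREATED = 1
    BACKEND_PUBLISH = 2
    BACKEND_WS_FRAME = 3
    MCP_STREAM_RECEIPT = 4
    STRATEGY_DELIVERY = 5
    SURFACE_WAKE = 6
    MODEL_TURN = 7
    DRAIN = 8
    REPLY = 9


def missing_stages(observed):
    """Return, in pipeline order, the stages a trace has not recorded."""
    seen = set(observed)
    return [stage for stage in Stage if stage not in seen]
```

For example, a trace that stopped after surface wake (stages 1 to 6) reports MODEL_TURN, DRAIN, and REPLY as missing, which is exactly the transport-vs-model gap this plan targets.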

2.1 Texi Close-Out Status

Frank approved closing out the residual end-to-end work before returning to Plan #10 review. The original plan is now ratification of shipped first-pass work plus the remaining trace/ACK implementation. Implemented locally:
  • Durable trace_id on signal_queue, included in signal wire payloads and pending-signal drain payloads.
  • Durable signal_trace_events table with ordered per-trace stage events.
  • Backend trace endpoints for event recording, model ACK, and trace query.
  • MCP verbs prism_signal_trace and prism_signal_ack.
  • MCP stream/runtime propagation of trace_id through frame receipt, adapter-delivery diagnostics, and visible wake prompts.
  • Focused smoke coverage for Codex publish path, trace_id wire payload, and controller-registration lock helper.
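As a rough illustration of what "ordered per-trace stage events" means for the durable table, the sketch below uses an in-memory SQLite database as a stand-in for Postgres. The column set is an assumption for illustration; the real schema lives in the Alembic migration and may differ.

```python
import sqlite3

# Stand-in for the live Postgres database; columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE signal_trace_events (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- tie-breaker for equal ts
        trace_id  TEXT NOT NULL,
        stage     TEXT NOT NULL,
        source    TEXT NOT NULL,
        outcome   TEXT NOT NULL,
        ts        REAL NOT NULL
    )
    """
)


def record_event(trace_id, stage, source, outcome, ts):
    """Append one stage event for a trace."""
    conn.execute(
        "INSERT INTO signal_trace_events (trace_id, stage, source, outcome, ts) "
        "VALUES (?, ?, ?, ?, ?)",
        (trace_id, stage, source, outcome, ts),
    )


def events_for(trace_id):
    """Return (stage, source, outcome, ts) rows for one trace, oldest first."""
    rows = conn.execute(
        "SELECT stage, source, outcome, ts FROM signal_trace_events "
        "WHERE trace_id = ? ORDER BY ts, id",
        (trace_id,),
    )
    return rows.fetchall()
```

Ordering by timestamp with the row id as a tie-breaker keeps the timeline stable even when two processes report events with identical timestamps.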
Validated locally:
  • npm --prefix mcp-node run build
  • PYTHONPATH=backend backend/.venv/bin/python -m pytest backend/tests/test_signal_publish_path.py backend/tests/test_spec038_imports.py
  • PYTHONPATH=backend backend/.venv/bin/python -m py_compile ...
Deployment/formal validation pending:
  • Alembic migration 035_signal_wake_trace.py must be applied to the live backend database.
  • Backend and MCP runtimes must be restarted so the new endpoints and verbs are available to live agents.
  • Formal V1–V5 validation must run twice after deployment. Local implementation is ready, but this session could not deploy to server1.home.lan because SSH rejected the available credentials and local Postgres on localhost:5433 was not running.

3. Lanes

| Lane | Owner | Surface |
| --- | --- | --- |
| Backend signal pipeline + new verb | Donna | backend/app/services/signal_service.py, new prism_signal_trace verb |
| MCP runtime + surface adapters | Texi | mcp-node/src/runtime_diagnostics.ts, bootstrap/idle_tickler.ts, strategies/* |
Per memory feedback_engineering_authority.md: each owner edits within their lane; the other reviews + smokes only.

4. First Pass — Already Shipped (Texi, uncommitted local)

Land status: working tree, not yet PR’d (verified via git diff --stat HEAD -- mcp-node/). Components (~290 new lines + ~185 insertions across 9 files):
  • mcp-node/src/runtime_diagnostics.ts (217 lines) — in-process state for stream open/close/error, signal frame receipt, strategy delivery result, surface wake result, tool calls, drains, signal sends.
  • mcp-node/src/bootstrap/idle_tickler.ts (73 lines) — 4-minute stale threshold, registered+not-wrapped prerequisite, active-turn suppression.
  • prism_runtime_diagnostics MCP verb (verbs/coordination.ts) — query interface for the above.
  • Strategy instrumentation (channels_push.ts, app_server_inject.ts) — per-stage event emission.
  • Codex app-server turn/start + turn/steer adapter; Claude Code maintenance channel tick.
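The three idle-tickler conditions listed above combine into a single predicate. The sketch below is written in Python for consistency with the other examples in this plan, although the shipped tickler is TypeScript; the RuntimeState shape and the inclusive threshold boundary are assumptions.

```python
from dataclasses import dataclass

STALE_THRESHOLD_S = 4 * 60  # the 4-minute stale threshold


@dataclass
class RuntimeState:
    """Hypothetical snapshot of the runtime fields the tickler consults."""
    registered: bool
    wrapped: bool
    turn_active: bool
    last_activity_ts: float  # seconds since epoch


def should_tick(state: RuntimeState, now: float) -> bool:
    """Decide whether a maintenance tick should fire at time `now`."""
    if not state.registered or state.wrapped:
        return False  # prerequisite: registered and not wrapped
    if state.turn_active:
        return False  # active-turn suppression
    # Assumption: the boundary is inclusive (exactly 4 minutes counts as stale).
    return (now - state.last_activity_ts) >= STALE_THRESHOLD_S
```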
Validation status: Texi ran a wake-diagnostic probe (signal bc553d7f) against Donna on 2026-05-04 at ~16:49Z. Donna ACK’d via signal 34fc1ebb (publish_path: buffered_for_piggyback, woken via channel push, stages 1–6 observed OK).

5. Joint Additions — Donna Lane

5.1 trace_id in signal frame

  • Mint a UUID trace_id at signal_created (backend, signal_service.py).
  • Propagate through every stage: persisted on the row, included in every WS frame, in every MCP stream event, in every surface adapter event, in the model-acted ACK envelope.
  • One trace_id ties every stage event to a single signal across surfaces and processes.
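A minimal sketch of the mint-once, copy-everywhere rule follows. The function names and payload fields are hypothetical and do not reflect the actual signal_service.py API; the only point is that trace_id is generated in exactly one place and every later stage copies it rather than minting a new one.

```python
import uuid


def create_signal(sender, recipient, body):
    """Mint the trace_id once, at signal creation, and persist it on the row."""
    return {
        "signal_id": str(uuid.uuid4()),
        "trace_id": str(uuid.uuid4()),  # the only place a trace_id is generated
        "sender": sender,
        "recipient": recipient,
        "body": body,
    }


def to_ws_frame(row):
    """Build the WS frame from the persisted row, carrying trace_id along."""
    return {
        "type": "signal",
        "signal_id": row["signal_id"],
        "trace_id": row["trace_id"],
        "body": row["body"],
    }


def to_stage_event(frame, stage, source):
    """Build a stage event downstream of the frame; trace_id is copied, never re-minted."""
    return {"trace_id": frame["trace_id"], "stage": stage, "source": source}
```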

5.2 prism_signal_trace verb

Read-only query: prism_signal_trace(trace_id) → ordered timeline of every recorded stage event for that signal across backend, MCP runtime, surface adapter, and model ACK. Returns: [{stage, ts, source, outcome, payload_meta}]. Cheap, idempotent, non-mutating. Replaces ad-hoc log-grepping during paced probes.

5.3 Model-acted ACK protocol

A new lightweight ACK separate from prism_signal reply:
  • Surface adapter, on doorbell delivery to the model context, records delivered_to_surface_at.
  • The next model turn that observes the doorbell SHOULD emit prism_signal_ack(trace_id) as its first verb call. Records model_acted_at.
  • Gap (model_acted_at − delivered_to_surface_at) is the transport-to-model latency, the metric the retro called out as currently unmeasurable.
  • Failure mode: a doorbell with no model ACK within window N is flagged in prism_runtime_diagnostics and surfaced in the next bootstrap’s rules_reminders.
This ACK is a diagnostic primitive, not a replacement for content replies. Both can be sent in the same turn.
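The two quantities this protocol produces, the latency and the missed-ACK flag, reduce to simple timestamp arithmetic. In the sketch below the function names are hypothetical, timestamps are seconds since epoch, and window_s stands for the window N, whose value the plan leaves open.

```python
from typing import Optional


def transport_to_model_latency(
    delivered_to_surface_at: float, model_acted_at: Optional[float]
) -> Optional[float]:
    """Return model_acted_at - delivered_to_surface_at, or None if no ACK yet."""
    if model_acted_at is None:
        return None
    return model_acted_at - delivered_to_surface_at


def ack_missed(
    delivered_to_surface_at: float,
    model_acted_at: Optional[float],
    now: float,
    window_s: float,
) -> bool:
    """True when the doorbell was delivered but no ACK arrived within the window."""
    if model_acted_at is not None:
        return False
    return (now - delivered_to_surface_at) > window_s
```

Note that a late ACK still yields a latency value; whether a late ACK should also clear a previously raised missed-ACK flag is a policy choice this sketch does not settle.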

6. Validation Matrix — Joint

Pre-fleet smoke. Run after both lanes land + dist reload. Texi-driver, Donna-responder; then swap.
| # | Scenario | Expected stages | Pass criterion |
| --- | --- | --- | --- |
| V1 | Active-foreground baseline | 1–9 all observed | model_acted_at < 2s after delivered_to_surface_at |
| V2 | Backgrounded Claude tab | 1–6 observed; 7+ delayed | trace shows wake gap, ACK fires within 5s of refocus |
| V3 | Codex post-shim-respawn | 1–9 with adapter restart event | trace_id continuity across restart |
| V4 | Burst (10 signals/sec) | All trace_ids resolved | no stage skipped, no duplicate ACK |
| V5 | Cross-machine LAN (mini3 → server1 → mini3) | 1–9 with WS hop | latency budget < 500ms p95 |
Fleet rollout is gated on all five scenarios passing in two consecutive runs (one Donna→Texi, one Texi→Donna).
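Three of the pass criteria are mechanical enough to check in code. The sketch below assumes a nearest-rank p95 (the plan does not say which percentile method applies) and a hypothetical per-trace record with a list of observed stage numbers and an ACK count.

```python
def p95(samples):
    """Nearest-rank 95th percentile; integer arithmetic avoids float rounding."""
    if not samples:
        raise ValueError("p95 of an empty sample set is undefined")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def v1_passes(delivered_to_surface_at, model_acted_at):
    """V1: the model ACK lands less than 2 s after surface delivery."""
    return (model_acted_at - delivered_to_surface_at) < 2.0


def v4_passes(traces):
    """V4: every trace saw all nine stages and at most one ACK."""
    all_stages = set(range(1, 10))
    return all(
        set(t["stages"]) == all_stages and t["ack_count"] <= 1 for t in traces
    )


def v5_passes(latencies_ms):
    """V5: p95 end-to-end latency is under the 500 ms budget."""
    return p95(latencies_ms) < 500
```

With 20 samples the nearest-rank p95 is the 19th smallest value, so a single slow outlier does not fail V5 but two do.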

7. Surface-Specific Notes

Per first-pass retro:
  • Codex: app-server/turn/start and app-server/turn/steer are the wake primitives. Idle tickler is sufficient for ≤4-min idle windows.
  • Claude Code: channel notification is primary. The only fallback for a REPL/terminal poke is the stricter-flag REPL nudge, which adds risk of input-stuffing collisions and is kept as Phase 2 if V2 fails.
Maintenance ticks must NOT consume the same coalescing slot as real signal doorbells (locked invariant from retro).
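One way to picture the invariant is a coalescer that keeps a separate pending slot per doorbell kind, so a maintenance tick can never overwrite a real signal doorbell. The class below is a hypothetical sketch; the slot names and the newest-wins coalescing rule are assumptions, not the shipped adapter logic.

```python
class DoorbellCoalescer:
    """Hypothetical coalescer with one pending slot per doorbell kind."""

    def __init__(self):
        self._slots = {"signal": None, "maintenance": None}

    def offer(self, kind, payload):
        """Store a doorbell; a newer one of the same kind replaces the older."""
        if kind not in self._slots:
            raise ValueError(f"unknown doorbell kind: {kind}")
        self._slots[kind] = payload

    def drain(self):
        """Return all pending doorbells (signals first) and clear the slots."""
        pending = [(k, p) for k, p in self._slots.items() if p is not None]
        self._slots = {k: None for k in self._slots}
        return pending
```

With a single shared slot, a tick arriving after a signal doorbell would replace it and the wake would be lost; separate slots make that impossible by construction.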

8. Open Questions for Frank

  1. Approve scope? — does the trace_id + prism_signal_trace + model-acted ACK trio match what you want, or do you want narrower (just the verb) or broader (also doorbell durability across wrap/start)?
  2. Validation cadence — run V1–V5 sequentially in one session, or spread over multiple sessions to capture realistic background/idle conditions?
  3. Where do trace events persist? — Postgres only (durable, queryable, Plan-#10-aligned with Postgres-as-long-term), or also Redis ring-buffer for fast in-process query? Recommend PG-only for v1; Redis if perf matters later (per feedback_optimize_later.md).
  4. rules_reminders surfacing — should missed-ACK trigger a rules_reminders entry on next bootstrap, or is that too noisy? Recommend yes, with a per-trace cooldown.
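The per-trace cooldown recommended in question 4 could look like the sketch below. The class name and the cooldown length are hypothetical; the point is only that a missed-ACK reminder for a given trace_id is emitted at most once per cooldown period.

```python
class ReminderCooldown:
    """Hypothetical per-trace limiter for missed-ACK rules_reminders entries."""

    def __init__(self, cooldown_s):
        self.cooldown_s = cooldown_s
        self._last_emitted = {}  # trace_id -> time of last emitted reminder

    def should_emit(self, trace_id, now):
        """Return True (and record the emission) unless still cooling down."""
        last = self._last_emitted.get(trace_id)
        if last is not None and (now - last) < self.cooldown_s:
            return False
        self._last_emitted[trace_id] = now
        return True
```

Suppressed attempts do not extend the cooldown, so a trace that keeps missing ACKs still resurfaces once per period rather than being silenced indefinitely.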

9. Sequencing

Step 0:  Frank approves this plan markdown                                                ← gate
Step 1:  Texi commits first-pass + opens PR; Donna reviews + smokes (no edits)
Step 2:  Donna implements §5.1 trace_id-in-frame (backend); Texi reviews
Step 3:  Donna implements §5.2 prism_signal_trace verb; Texi reviews
Step 4:  Joint implementation §5.3 model-acted ACK (Donna backend, Texi adapter); cross-review
Step 5:  Run V1–V5 validation matrix twice; record traces in retro
Step 6:  Fleet rollout (default-on for runtime_diagnostics; ACK protocol recommended-not-required initially)
Step 7:  Postmortem + retro after 1 week of fleet data
Each step ships independently. No big-bang merge.

10. Memory & Postmortem Hooks

  • feedback_eliminate_failures_improve_perf.md — every new verb gets structured-failure returns + duration_s tracking.
  • feedback_postmortem_on_every_error.md — any V1–V5 failure files a postmortem inline.
  • feedback_completion_means_deployed.md — Steps 1–6 are not “done” until merged + deployed + smoke green.
  • project_signal_isolation_multitenant.md: prism_signal_trace must enforce the same membership-only authorization as prism_signals_pending (no cross-tenant trace exposure).
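Two of these hooks, structured-failure returns with duration_s and membership-only authorization, can be combined in one wrapper. This is a sketch under stated assumptions: the result envelope, the error codes, and the TRACE_MEMBERS map are all hypothetical, and the real authorization check would reuse whatever prism_signals_pending already does.

```python
import functools
import time


def with_structured_result(fn):
    """Wrap a verb so it always returns an envelope carrying duration_s."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            result = {"ok": True, "data": fn(*args, **kwargs)}
        except PermissionError as exc:
            result = {"ok": False, "error": "forbidden", "detail": str(exc)}
        except Exception as exc:  # any other failure is still structured
            result = {"ok": False, "error": "internal", "detail": str(exc)}
        result["duration_s"] = time.monotonic() - start
        return result

    return wrapper


# Hypothetical membership map: trace_id -> personas allowed to read the trace.
TRACE_MEMBERS = {"trace-1": {"donna", "texi"}}


@with_structured_result
def authorized_signal_trace(caller, trace_id):
    """Return the timeline only to members; unknown traces look the same as foreign ones."""
    members = TRACE_MEMBERS.get(trace_id)
    if members is None or caller not in members:
        raise PermissionError("caller is not a member of this trace")
    return []  # the timeline lookup would go here
```

Returning the same "forbidden" response for unknown and foreign trace_ids avoids leaking whether a trace exists in another tenant.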
Last modified on June 7, 2026