SPEC-100 v0.1 — Loopback Diagnostic Verb + signal_type=Loopback
Status: draft v0.1 — Texi architecture review RATIFIED as new diagnostic SPEC (signal 42f7347a, note d9f3761c, 2026-05-10). Approval is conditional on this draft capturing the minimum contract from her response. Awaiting Candi governance review for SPEC-number assignment.
Author: Donna
Reviewer (architecture): Texi
Reviewer (governance): Candi
Origin: Doorbell-render-miss class — push delivery succeeds at the backend (publish_path: pushed_to_ws, delivered: true, surface_support: full) but the channel notification never surfaces in the recipient’s model context. Cluster of recurrences:
- Postmortem 07942a99 (2026-05-10): Cherry signal 5b4a3344 sat in pending queue 40 min while drain returned empty; found via prism_signal_trace.
- This-session: Texi signal 3a241bdc reported delivered to Donna’s WS but never surfaced as a doorbell; only seen via operator-prompted prism_signals_pending drain.
Summary
A new self-addressed signal type (Loopback) and an operator-invoked verb (prism_loopback_test) that send a probe through the full delivery path (backend write → Redis publish → shim WS receipt → channel notification → model drain) and report per-layer timing. Used to bisect doorbell-render-miss class defects to a specific layer.
The verb is diagnostic only — not part of the bootstrap path, not a continuous heartbeat, not a delivery guarantee. It produces evidence; remediation lives in companion SPECs (Fix #1 receipt-at-shim guarantee, Fix #3 redis-first hot path) which are explicitly pinned out of scope until loopback evidence identifies the dominant miss layer.
Scope (v0.1)
In-scope:
- New first-class signal_type=Loopback (added to _AGENT_SIGNAL_TYPES).
- New verb prism_loopback_test(pid): operator-invoked, runs against the calling persona’s session.
- New CLI subcommand prism loopback-test: thin wrapper over the verb.
- Shim-side ephemeral ring buffer that records metadata only for every received WS frame (so the verb has shim-receipt evidence to read).
- Per-layer timestamp capture and a stable diagnostic envelope returned to the caller.

Out-of-scope:
- Fix #1: receipt-at-shim guarantee (backend retries push until shim acks). Not in this SPEC; this SPEC only measures shim receipt.
- Fix #3: redis-first hot-path ordering (flipping signal_send SOR from PG to Redis per ADR-27). Not in this SPEC.
- Cross-surface differential probe (claude_code vs codex vs cursor): single-persona scope here; surface comparison is a follow-up.
Background — the layer model
A signal flows through five rule-bearing layers between sender and recipient:

| # | Layer | Owns the rule |
|---|---|---|
| 1 | Backend WS publisher | publish_path, delivered:true semantics |
| 2 | Client WS subscriber (mcp-node shim) | frame receipt over WebSocket |
| 3 | Shim → host editor handoff | translates frame into MCP channel notification |
| 4 | MCP server instruction-block | model knows to call prism_signals_pending on doorbell |
| 5 | Model behavior | drains and acts |
Contract — minimum (per Texi 2026-05-10)
C1 — Loopback envelope shape
A Loopback signal carries the following fields in payload:

| Field | Type | Purpose |
|---|---|---|
| loopback_id | UUID | Stable correlation key across all layers |
| nonce | hex-128 | Unique per probe; foils any caching/dedup paths |
| requested_by_identity | string | Caller’s persona identity at probe time |
| requested_by_session | UUID | Caller’s session_id at probe time |
| surface | string | One of claude_code, codex, cursor, … |
| machine_id | string | Caller’s machine identifier (hostname or env) |

The probe is self-addressed: to_identity == requested_by_identity, resolved to requested_by_session via standard recipient resolution.
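
A minimal sketch of how a C1 payload might be built, written in Python because the backend is Python. The helper name build_loopback_payload and the sample values are hypothetical; only the field names come from the table above. Producing the hex-128 nonce with secrets.token_hex(16) is an assumption about how the nonce is generated, not something this SPEC mandates.

```python
import secrets
import uuid


def build_loopback_payload(identity: str, session_id: str,
                           surface: str, machine_id: str) -> dict:
    """Build a C1 payload for a self-addressed probe (hypothetical helper)."""
    return {
        "loopback_id": str(uuid.uuid4()),  # stable correlation key across layers
        "nonce": secrets.token_hex(16),    # 16 random bytes = 128 bits = 32 hex chars
        "requested_by_identity": identity,
        "requested_by_session": session_id,
        "surface": surface,
        "machine_id": machine_id,
    }


# Sample values are illustrative only.
payload = build_loopback_payload("donna", str(uuid.uuid4()), "claude_code", "mini1")
```

Because the nonce is drawn fresh on every call, two probes issued back to back never share a payload, which is what lets the probe bypass any caching or dedup path.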
C2 — Cross-layer correlation
Every layer that observes the signal must record at least:
- loopback_id (from payload)
- signal_id (assigned by backend on insert)
- trace_id (assigned by backend on dispatch)
- layer (one of the five above)
- observed_at (UTC ISO-8601)

These records are joined on loopback_id to assemble the per-layer timeline.
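
The join on loopback_id can be sketched as follows. This is an illustration under assumptions, not the backend's actual code: assemble_timelines is a hypothetical name, layers are represented by their number from the layer table, and timestamps are assumed to carry an explicit +00:00 offset so that datetime.fromisoformat parses them on older Python versions.

```python
from collections import defaultdict
from datetime import datetime


def assemble_timelines(records: list[dict]) -> dict[str, list[dict]]:
    """Group per-layer observations by loopback_id, ordered by observed_at."""
    timelines: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        timelines[rec["loopback_id"]].append(rec)
    for entries in timelines.values():
        # Parse the timestamps instead of comparing raw strings.
        entries.sort(key=lambda r: datetime.fromisoformat(r["observed_at"]))
    return dict(timelines)


# Illustrative records; IDs are placeholders.
records = [
    {"loopback_id": "lb-1", "signal_id": "sig-1", "trace_id": "tr-1",
     "layer": 2, "observed_at": "2026-05-10T12:00:00.250+00:00"},
    {"loopback_id": "lb-1", "signal_id": "sig-1", "trace_id": "tr-1",
     "layer": 1, "observed_at": "2026-05-10T12:00:00.100+00:00"},
    {"loopback_id": "lb-2", "signal_id": "sig-2", "trace_id": "tr-2",
     "layer": 1, "observed_at": "2026-05-10T12:05:00.000+00:00"},
]
timelines = assemble_timelines(records)
```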
C3 — Reply correlation
If the probe expects an echo reply (default true), the reply is itself a signal_type=Loopback with in_reply_to set to the original Loopback signal_id (not trace_id, per feedback_in_reply_to_uses_signal_id.md).
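
As a small illustration of the reply rule, the sketch below builds an echo from a received Loopback signal. The helper build_echo_reply and the dict shape of a signal are assumptions made for the example; carrying loopback_id forward in the reply payload is also an assumption, chosen so the echo stays correlatable under C2.

```python
def build_echo_reply(original: dict) -> dict:
    """Build the echo for a received Loopback signal (hypothetical helper)."""
    return {
        "signal_type": "Loopback",
        "in_reply_to": original["signal_id"],  # signal_id, never trace_id
        "payload": {"loopback_id": original["payload"]["loopback_id"]},
    }


# Placeholder IDs for illustration.
original = {
    "signal_id": "sig-123",
    "trace_id": "tr-456",
    "signal_type": "Loopback",
    "payload": {"loopback_id": "lb-1"},
}
reply = build_echo_reply(original)
```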
C4 — Result envelope (returned by prism_loopback_test)
model_ack is optional — driven by whether the model consumed the doorbell. Its absence is informational, not a failure (the verb itself doesn’t require model-side participation to return). All other fields are required in the schema; nullable when not observed.
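
One way the required/optional split could be expressed is sketched below. The envelope layout here is hypothetical: it reuses only names that appear elsewhere in this SPEC (spec, outcome, layers.<name>.observed, missing_layers, backend_write, redis_publish, shim_tee_receipt, model_ack), and the set of required layers is an assumption. The "timeout" outcome is deliberately not modeled, since its distinction from "partial" is not defined in this section.

```python
from typing import Optional

REQUIRED_LAYERS = ("backend_write", "redis_publish", "shim_tee_receipt")  # assumed set
OPTIONAL_LAYERS = ("model_ack",)  # absence is informational, not a failure


def build_result_envelope(loopback_id: str,
                          observed_at: dict[str, Optional[str]]) -> dict:
    """Assemble a C4-style envelope from per-layer timestamps (hypothetical layout)."""
    layers = {}
    for name in REQUIRED_LAYERS + OPTIONAL_LAYERS:
        ts = observed_at.get(name)
        # Every layer key is always present; values are null when not observed.
        layers[name] = {"observed": ts is not None, "observed_at": ts}
    missing = [n for n in REQUIRED_LAYERS if not layers[n]["observed"]]
    return {
        "spec": "SPEC-100",
        "loopback_id": loopback_id,
        "outcome": "complete" if not missing else "partial",
        "layers": layers,
        "missing_layers": missing,
    }


complete = build_result_envelope("lb-1", {
    "backend_write": "2026-05-10T12:00:00.100+00:00",
    "redis_publish": "2026-05-10T12:00:00.120+00:00",
    "shim_tee_receipt": "2026-05-10T12:00:00.250+00:00",
})
partial = build_result_envelope("lb-2", {
    "backend_write": "2026-05-10T12:00:00.100+00:00",
    "redis_publish": "2026-05-10T12:00:00.120+00:00",
})
```

In this sketch a probe with no model_ack still reports "complete", while a probe with no shim receipt reports "partial" and names the missing layer.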
C5 — Shim tee log (ephemeral ring buffer)
The shim records metadata only for every received WS frame in a per-persona ring buffer:

| Field | Type |
|---|---|
| received_at | UTC ISO-8601 |
| signal_id | UUID |
| trace_id | UUID (if present in frame) |
| loopback_id | UUID (if signal_type=Loopback; else null) |
| signal_type | string |
| frame_size_bytes | int |
| nonce_hash | sha256 of nonce (Loopback only) |

to_identity (already in metadata), or any user-content text.
Storage: ~/.prism/tee-frames-{identity}.jsonl, append-only, atomic-tempfile-rename for ring rotation (mirrors signals-{identity}.jsonl convention).
Read access: the CLI polls a local shim endpoint for “did you receive loopback_id=X?” (see C6).
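
The tee log's two key properties, metadata-only capture and atomic rewrite, can be sketched as below. The real shim is TypeScript; this Python version only illustrates the logic. The names tee_entry, rewrite_ring, and RING_CAPACITY are hypothetical, the capacity of 100 is taken from the rotation boundary in Testing, the frame's dict shape is assumed, and hashing the nonce's UTF-8 hex string (rather than its raw bytes) is an assumption. The 10-minute rotation boundary is not modeled.

```python
import hashlib
import json
import os
import tempfile
from collections import deque

RING_CAPACITY = 100  # assumed from the 100-frame rotation boundary


def tee_entry(frame: dict, raw_size: int, received_at: str) -> dict:
    """Extract metadata only; the payload body is never copied."""
    payload = frame.get("payload") or {}
    is_loopback = frame.get("signal_type") == "Loopback"
    nonce = payload.get("nonce") if is_loopback else None
    return {
        "received_at": received_at,
        "signal_id": frame["signal_id"],
        "trace_id": frame.get("trace_id"),
        "loopback_id": payload.get("loopback_id") if is_loopback else None,
        "signal_type": frame["signal_type"],
        "frame_size_bytes": raw_size,
        "nonce_hash": hashlib.sha256(nonce.encode()).hexdigest() if nonce else None,
    }


def rewrite_ring(path: str, ring: deque) -> None:
    """Rewrite the JSONL file from the ring via temp file plus atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        for entry in ring:
            fh.write(json.dumps(entry) + "\n")
    os.replace(tmp, path)  # readers see either the old file or the new one


ring = deque(maxlen=RING_CAPACITY)  # oldest entries fall off automatically
frame = {
    "signal_id": "sig-1",
    "trace_id": None,
    "signal_type": "Loopback",
    "payload": {"loopback_id": "lb-1", "nonce": "ab" * 16, "body": "secret"},
}
ring.append(tee_entry(frame, raw_size=512, received_at="2026-05-10T12:00:00+00:00"))
```

Building each entry from an explicit field list, rather than copying the frame and deleting keys, means a new payload field cannot leak into the log by default.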
C6 — CLI polls local shim for receipt evidence (v0.1 split orchestration)
The diagnostic must determine “did the shim receive the WS frame?” without depending on model-side cooperation. The shim is not reachable from the central backend: shims live on operator machines (mini1, mini3, server1) behind the editor process. Therefore v0.1 splits orchestration between the backend (which observes server-side layers) and the CLI (which observes shim-side layers, since it runs on the same host as the shim):
- Backend issues the probe (POST /{pid}/signal/loopback-issue) and returns a partial envelope with backend_write + redis_publish populated.
- Shim exposes a local-only HTTP endpoint GET /tee-frames?loopback_id=<uuid> returning the matching ring entry or 404.
- CLI receives the partial envelope, polls the local shim endpoint up to shim_poll_timeout_ms (default 2000ms), waits up to drain_timeout_ms for evidence of drain (optional, model-side), and assembles the full SPEC-100 §C4 envelope.
- A successful shim match populates layers.shim_tee_receipt.observed=true. A timeout populates false and adds "shim_tee_receipt" to missing_layers.

The intended end-state is prism_loopback_test returning the full envelope server-side. v0.1’s split is a pragmatic accommodation to the present architecture, not a target end-state.
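
The CLI-side polling step might look like the sketch below. The actual CLI is TypeScript, so this Python version only illustrates the control flow. poll_shim, the injected fetch callable, and interval_ms are hypothetical; in a real client fetch would wrap the GET /tee-frames?loopback_id=<uuid> call and return None on a 404. The 2000 ms default mirrors shim_poll_timeout_ms.

```python
import time
from typing import Callable, Optional


def poll_shim(fetch: Callable[[str], Optional[dict]], loopback_id: str,
              timeout_ms: int = 2000, interval_ms: int = 100) -> dict:
    """Poll until the shim reports the frame or the timeout elapses."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        entry = fetch(loopback_id)  # ring entry, or None when the shim answers 404
        if entry is not None:
            return {"observed": True, "entry": entry}
        if time.monotonic() >= deadline:
            return {"observed": False, "entry": None}
        time.sleep(interval_ms / 1000)


# Fake shim for illustration: the frame "arrives" on the third poll.
calls = {"n": 0}


def fake_fetch(loopback_id: str) -> Optional[dict]:
    calls["n"] += 1
    return {"loopback_id": loopback_id} if calls["n"] >= 3 else None


hit = poll_shim(fake_fetch, "lb-1", timeout_ms=1000, interval_ms=1)
miss = poll_shim(lambda _id: None, "lb-2", timeout_ms=20, interval_ms=5)
```

Injecting fetch keeps the loop testable without a running shim, and using a monotonic clock keeps the deadline immune to wall-clock adjustments.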
C7 — CLI surface
prism loopback-test [--json] [--timeout=<ms>]
Default output is a human-readable table.

Exit codes:
- 0: outcome == "complete" (every required layer observed within timeouts)
- 1: outcome ∈ {"partial", "timeout"} (one or more required layers missing)
- 2: invocation/config error (auth, network unreachable, missing pid, etc.)

--json emits the C4 envelope verbatim with stable schema (versioned via "spec": "SPEC-100" field).
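
The exit-code rule for the CLI can be captured in a few lines. This is a Python sketch of the mapping only (the CLI itself is TypeScript); exit_code_for and EXIT_INVOCATION_ERROR are hypothetical names, and raising on an unknown outcome is a design choice of the sketch rather than something the SPEC states.

```python
EXIT_INVOCATION_ERROR = 2  # auth, network unreachable, missing pid, etc.


def exit_code_for(outcome: str) -> int:
    """Map an envelope outcome to the CLI exit code."""
    if outcome == "complete":
        return 0
    if outcome in ("partial", "timeout"):
        return 1
    # An unrecognized outcome is treated as a bug, not silently mapped.
    raise ValueError(f"unknown outcome: {outcome!r}")


codes = {o: exit_code_for(o) for o in ("complete", "partial", "timeout")}
```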
Implementation surfaces
| Component | File | Change |
|---|---|---|
| Backend enum | backend/app/services/signal_service.py:51 | Add "Loopback" to _AGENT_SIGNAL_TYPES; defaults at lines 74–116 (category=INFO, delivery_class=sync, ttl=600s) |
| Backend verb | backend/app/routers/signal.py + backend/app/services/loopback_service.py (new module) + backend/app/schemas/loopback.py (new module) | Implement POST /{pid}/signal/loopback-issue per C4/C6 — issues self-addressed signal and returns partial envelope (backend_write + redis_publish) |
| Backend trace | signal_service.py:980/1039/1193 | Existing record_signal_trace_event already covers backend layers — no change needed; loopback verb consumes these |
| Shim tee | mcp-node/src/bootstrap/stream.ts:237 (after noteSignalFrame) | Append metadata to ring buffer; ring rotation; atomic write |
| Shim endpoint | mcp-node/src/server.ts (new local HTTP route) | GET /tee-frames?loopback_id=... |
| CLI | cli/src/index.ts (~line 2705 dispatcher + new function) | cmdLoopbackTest() per C7 |
| Tests | backend/tests/test_loopback_diagnostic.py, mcp-node/test/tee_frames.test.mjs, cli/test/loopback.test.ts | Per surface |
Testing
Unit-level:
- Backend: round-trip a Loopback signal in-process, assert all five layer entries in the result envelope, assert ring buffer matches.
- Shim: tee buffer rotation at 100 frames + 10 min boundary; metadata-only assertion (regression test against payload leakage).
- CLI: exit-code matrix (0/1/2) under all outcome states.

Operator-level:
- Operator runs prism loopback-test from each surface (claude_code Donna session, codex Texi session, etc.) and captures the result envelope. Per-surface drop-rate baseline establishes the empirical layer-attribution dataset that motivates Fix #1 and Fix #3.
Migration / rollout
- v0.1 ships behind no flag (diagnostic verb, idempotent, observability-only).
- Backwards-compat: existing signal types unaffected; tee buffer is additive.
- Default-off invariant intact (Plan #10 gate): Loopback enum is always-on but verb produces no side effects beyond a single self-addressed signal per invocation.
Out of scope — explicit (pinned)
| Concern | Pinned to |
|---|---|
| Backend retries WS push until shim acks (receipt-at-shim guarantee) | Fix #1 — separate SPEC after loopback evidence lands |
| signal_send SOR flip from PG to Redis (hot-path ordering) | Fix #3 — separate SPEC; multi-store-writes.md line 102 reconciliation rides on Fix #3 |
| Continuous heartbeat / cadence test | Out of scope; this is operator-invoked only |
| Cross-surface differential drop-rate harness | Follow-up to v0.1; needs per-surface coverage |
Cross-references
- SPEC-034: agent-to-agent signal delivery (referenced baseline; not amended)
- ADR-27: runtime state through SM (Redis-as-SOR for runtime state)
- docs/architecture/multi-store-writes.md: line 102 (signal_send SOR; reconciliation pinned to Fix #3)
- Postmortem 07942a99: signals_pending vs trace divergence
- Memory: feedback_in_reply_to_uses_signal_id.md
- Memory: project_doorbell_render_miss_filed_2026-05-03.md
Open questions for v0.2
- Push vs poll for shim-receipt evidence (C6). v0.1 polls; v0.2 may flip to push if Fix #1 is sequenced immediately after this SPEC. Texi to call.
- Should Loopback delivery be exempt from coalescing? The shim today coalesces non-system signal types in the doorbell renderer. A Loopback that gets coalesced behind another doorbell would falsify a “shim received but didn’t surface” diagnosis. Default proposal: Loopback bypasses coalescing (treat as system-priority). Texi to call.
- Multi-instance shim (worktrees). If a persona has multiple coder sessions running in different worktrees, which shim’s tee log is canonical? Default proposal: backend resolves via requested_by_session, polls only that session’s shim. Document the answer.

