Status: accepted · ADR-26 · Filed 2026-04-27

Decision

Establish a tiered diagnostic tool library so agents stop regenerating ad-hoc shell/Python for repeatable operations. Local prereqs are Docker Desktop and Node only — Python is NOT a host prereq; it lives inside the backend container image. The library architecture honors this split.
Tier 1 — Shipped diag scripts (immediate scope).
  • Container-side (Python, in backend image): modules under backend/app/diag/, invoked via docker exec prism-backend python -m app.diag.<name> --json. Python is guaranteed inside the image; never assumed on host.
  • Host-side (Node, in CLI): subcommands under cli/src/diag/, surfaced as prism diag <name>. Node is the committed host prereq alongside Docker Desktop.
  • All diag scripts emit structured JSON for agent consumption (no human-formatted output as primary).
  • Catalog: prism diag list enumerates available tools, surface, and one-line description.
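As an illustration of the Tier-1 contract above, here is a minimal sketch of what a container-side module might look like. The module name `example`, the envelope fields (`diag`, `ok`, `generated_at`, `findings`), and the severity values are assumptions made for this sketch; the ADR only requires that `--json` output be structured.

```python
"""Hypothetical container-side diag module, e.g. backend/app/diag/example.py."""
import argparse
import json
import sys
from datetime import datetime, timezone


def build_report(name, findings):
    """Wrap findings in one envelope shape (assumed) shared by every diag."""
    return {
        "diag": name,
        # The report is ok only if no finding is marked as an error.
        "ok": all(f.get("severity") != "error" for f in findings),
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "findings": findings,
    }


def main(argv=None):
    parser = argparse.ArgumentParser(prog="app.diag.example")
    parser.add_argument("--json", action="store_true",
                        help="emit machine-readable JSON")
    args = parser.parse_args(argv)

    # Placeholder check; a real diag would inspect Postgres, Redis, etc.
    findings = [{"check": "placeholder", "severity": "info",
                 "detail": "nothing inspected"}]
    report = build_report("example", findings)

    if args.json:
        json.dump(report, sys.stdout)
        sys.stdout.write("\n")
    else:
        print(f"{report['diag']}: {'ok' if report['ok'] else 'FAILED'}")
    # Non-zero exit code lets callers detect failure without parsing output.
    return 0 if report["ok"] else 1
```

In a real module, `main()` would be called under the usual `__main__` guard so that `docker exec prism-backend python -m app.diag.example --json` runs it. Keeping one envelope shape across all diags would let an agent parse every result with the same logic.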
Tier 2 — Promoted MCP verbs.
  • When a Tier-1 diag is used 3+ times across sessions, promote it to a first-class MCP verb (e.g. prism_diagnose <op>).
  • First-class registration, schema, agent-surface visibility, signal integration where useful.
  • Best for cross-machine ops where ssh + docker-exec is otherwise repeated.
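The promotion rule above could be checked mechanically. The sketch below assumes a usage log of `(diag_name, session_id)` pairs, which the ADR does not define, and reads "3+ times across sessions" as three or more recorded uses in total. A stricter reading would count distinct sessions instead.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # from the ADR: "3+ times across sessions"


def promotion_candidates(usage_log, threshold=PROMOTION_THRESHOLD):
    """Return diag names whose recorded uses meet the threshold.

    usage_log is an iterable of (diag_name, session_id) pairs; this log
    format is hypothetical and used only for illustration.
    """
    uses = Counter(name for name, _session in usage_log)
    return sorted(name for name, count in uses.items() if count >= threshold)


log = [
    ("alembic_drift", "s1"),
    ("alembic_drift", "s2"),
    ("alembic_drift", "s3"),
    ("signal_queue", "s1"),
]
print(promotion_candidates(log))  # ['alembic_drift']
```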
Tier 3 — Vector-stored recipes.
  • Code-shape patterns that aren’t full tools — recalled via semantic_recall. Curated. Retired (or promoted) once a stable pattern emerges.
Initial Tier-1 inventory (committed alongside this ADR):
Container-side (Python):
  1. diag_alembic_drift — compare alembic_version row vs versions/ files; flag missing / orphaned revisions. Direct response to the live ADR-driver: gRPC container crash-looping on missing revision 023.
  2. diag_session_registrations — list active registrations per identity, flag dead-session leaks (the bug we’re watching live as multiple Donna/Texi sessions accumulate).
  3. diag_signal_queue — pending signals, age, recipient, leak indicators (non-Signal rows mixed in — the SPEC-045 lifecycle leak we identified last session).
  4. diag_grpc_health — internal grpc_health_probe + listener state report.
  5. diag_redis_session_plane — Redis keyspace audit (sessions, locks, TTLs).
  6. diag_master_election — current master per project, contention history.
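To make the first inventory item concrete, the following is a rough sketch of the comparison that diag_alembic_drift describes. It is not the shipped implementation. It assumes the revision ids have already been read from the alembic_version table and parsed out of the versions/ files. It also assumes that "orphaned" means a file whose parent revision is absent, and it ignores merge migrations with multiple parents.

```python
def find_drift(db_revisions, file_revisions):
    """Compare revisions recorded in the database with those on disk.

    db_revisions: set of revision ids read from the alembic_version table.
    file_revisions: dict mapping each revision id found under versions/
        to its down_revision (None for the base migration).
    """
    known = set(file_revisions)
    # The database points at a revision that has no migration file.
    missing = sorted(r for r in db_revisions if r not in known)
    # A file whose parent is absent cannot be reached from the base
    # (assumed definition of "orphaned"; single-parent chains only).
    orphaned = sorted(
        rev for rev, parent in file_revisions.items()
        if parent is not None and parent not in known
    )
    return {
        "diag": "alembic_drift",
        "ok": not missing and not orphaned,
        "missing": missing,
        "orphaned": orphaned,
    }


# Mirrors the incident named above: the database expects revision 023,
# but only 021 and 022 exist on disk.
print(find_drift({"023"}, {"021": None, "022": "021"})["missing"])  # ['023']
```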
Host-side (Node):
  1. prism diag containers — docker ps for all prism containers, status, restart count, last-N log lines.
  2. prism diag connectivity — probe host:port reachability for backend HTTP, gRPC (45051 → 50051), Redis, Postgres. Direct response to Candi’s mini1 firewall question.
  3. prism diag bios — verify CLAUDE.md / PRISM.md / AGENTS.md replicas vs Prism templates (drift detection).
  4. prism diag mcp — verify MCP server registration with Claude Code / Codex; show config diff. Direct response to Texi’s launcher env-forwarding fix.
  5. prism diag dirs — verify $PROJECT_ROOT and $PRISM_ROOT, list registered projects, registry parity vs filesystem.
  6. prism diag launcher — replay coder.sh / coder.ps1 env forwarding to verify all env vars reach MCP subprocess.

Rationale

Token economy. Every regenerated 30–50 line snippet costs ~200 tokens generated + ~200 read + run cost. A prism diag X invocation is one tool call (~30 tokens including args + structured response). Compression is 10–50× on repetitive ops. With five agents (Donna, Texi, Candi, Lafonda, Desiree) independently burning the same patterns, savings compound.

Cross-agent consistency. Today Donna’s grep, Texi’s grep, and Candi’s grep can subtly differ — one tool means one diagnostic, no interpretation drift during incident response.

Operational learning compounds. Diagnostics promoted to scripts get reviewed, tested, version-controlled. Each session stops re-discovering the same shape of problem. The tool library becomes the ops-knowledge codification.

Reliability and portability. Container-side Python runs in a known image. Host-side Node runs against the explicitly committed prereq set (Docker Desktop + Node). No surprise dependency on host Python, jq, or other ad-hoc tools. Resolves the recent zsh-portability discussion at the architectural level: agents call prism diag X, not raw shell.

Promotion gate forces curation. Without escalation criteria the library becomes a graveyard of one-off scripts. Tier-2 promotion to MCP verb requires evidence of ≥3 cross-session uses — only proven patterns earn first-class status.
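The token-economy estimate can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in the rationale and leaves out run cost, which the ADR does not quantify, so it gives a floor rather than the full range.

```python
def compression_ratio(generated, read, tool_call):
    """Tokens spent regenerating a snippet divided by tokens for one tool call."""
    return (generated + read) / tool_call


# ~200 tokens generated + ~200 read, versus ~30 for one prism diag call.
ratio = compression_ratio(generated=200, read=200, tool_call=30)
print(round(ratio, 1))  # 13.3
```

The result sits near the low end of the stated 10–50× range. Run cost and longer snippets would push the figure higher.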

Alternatives Considered

Use ad-hoc shell + host Python indefinitely. Rejected — high token cost, drift, no shared learning. Also presumes host Python, which is explicitly NOT a Prism prereq.

Skip Tier-1 scripts; jump straight to MCP verbs. Rejected — promotion requires evidence of repeat use, which can’t be generated without a low-friction first tier. MCP verbs are heavyweight (schema, registration, signal integration); a one-off diag is overkill.

Vector-stored recipes only (no scripts). Rejected as primary mechanism — recipes still require per-invocation regeneration in agent context. Tokenomics worse than scripts. Vector store is right for code-shape patterns (Tier 3), wrong for stable operations.

Bundle all diag tools as a shell library on host. Rejected — host shell varies (zsh on mini3, bash on Linux, PowerShell on Windows). Portability cost is real. Node + Python-in-container avoids it cleanly.

Single combined prism diag Python CLI on host. Rejected — would re-introduce host Python prereq. Node-on-host + Python-in-container preserves the committed prereq surface.
Last modified on May 18, 2026