Multi-Prism Controller
The Multi-Prism Controller is the coordination layer that turns Prism from a single-operator memory substrate into a multi-agent, multi-operator control plane. It is specified in SPEC-030 and implements the architectural commitments from the Prism architecture paper: MCP at the boundary, gRPC bidirectional streams inside, no peer-to-peer control paths, capability-scoped tokens, and lease-based resource claiming.
This is not a separate service to deploy. The controller is a role that one MCP server instance claims per project, using a master election model inspired by NetBIOS master browser election. The same code runs in every MCP server; the difference is a runtime flag.
The identity scope a controller operates within is the four-level hierarchy defined by SPEC-093: Tenant → Org → Department → Project. Every project carries tenant_id + org_id + department_id + pid, with FK constraints enforcing the chain at insertion. Personal-install (LAN) backwards compatibility is preserved through flat-LAN seeding: when no Department is specified, a default Department is created under the default Org under the default Tenant, so existing single-user installs continue to work without operator action.
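The flat-LAN seeding rule can be sketched as a small resolver. This is an illustration only: the `ProjectScope` type, the `resolve_scope` function, and the literal `"default"` identifiers are hypothetical and are not taken from SPEC-093.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_ID = "default"  # hypothetical identifier for the seeded default rows


@dataclass(frozen=True)
class ProjectScope:
    tenant_id: str
    org_id: str
    department_id: str
    pid: str


def resolve_scope(
    pid: str,
    tenant_id: Optional[str] = None,
    org_id: Optional[str] = None,
    department_id: Optional[str] = None,
) -> ProjectScope:
    """Return the full four-level scope for a project."""
    if department_id is None:
        # Flat-LAN seeding: default Department under default Org under default Tenant.
        return ProjectScope(DEFAULT_ID, DEFAULT_ID, DEFAULT_ID, pid)
    if tenant_id is None or org_id is None:
        # A named Department needs its parents, mirroring the FK chain.
        raise ValueError("department_id requires tenant_id and org_id")
    return ProjectScope(tenant_id, org_id, department_id, pid)
```

A single-user install would call `resolve_scope("my-project")` and receive a scope whose three parent levels are all the seeded defaults.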
The master election model
Every MCP server instance on a project participates in the controller election. The election happens inside prism_start as a side effect of registration: no new verb, no configuration.
Claude Desktop always wins when present. CD is the operator console — the human-facing surface where the operator steers. It gets deterministic election priority the same way a domain controller gets priority in a NetBIOS master browser election. If CD is not running, the first agent to call prism_start on that project becomes master.
The master opens a gRPC bidirectional stream to the Prism backend. Through this stream it receives real-time push events: lease contention alerts, approval requests, state-change notifications. Non-master instances (peers) continue using normal HTTP for verb calls and pick up controller obligations asynchronously through the nudge table at their next prism_start.
prism_start:
Single-master invariant is enforced at the database level (partial UNIQUE on tenant_id, project_id, is_master=true), so no application-level race can produce two masters. CD’s priority is implemented via a Lua CAS script in Redis (SPEC-032 §5.2.1) that demotes a non-CD incumbent atomically.
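The priority rule described above can be sketched as a pure function. This is a minimal illustration, not Prism's implementation: the `Registration` type, the surface labels, and `elect_master` are hypothetical names, and the real system enforces the outcome through the Postgres constraint and the Redis CAS script rather than in application code.

```python
from dataclasses import dataclass
from typing import Optional

CD_SURFACE = "claude-desktop"  # hypothetical surface label for Claude Desktop


@dataclass(frozen=True)
class Registration:
    session_id: str
    surface: str


def elect_master(incumbent: Optional[Registration], arrival: Registration) -> Registration:
    """Return the registration that should hold master once `arrival` registers."""
    if incumbent is None:
        # No master yet: the first caller of prism_start on the project wins.
        return arrival
    if arrival.surface == CD_SURFACE and incumbent.surface != CD_SURFACE:
        # Claude Desktop preempts a non-CD incumbent.
        return arrival
    # Otherwise the incumbent keeps the role and the arrival becomes a peer.
    return incumbent
```

Keeping the decision a pure function of (incumbent, arrival) makes the rule easy to test in isolation; atomicity is a separate concern that the source assigns to the database and Redis.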
The controller registration table
Every agent on every project registers in a Postgres-backed routing table when it calls prism_start. The table tracks who is active, who is master, what surface they are running on, and when they last sent a heartbeat. A partial UNIQUE constraint enforces single-master per project at the database level, so no application-layer race is possible.
Released registrations are never deleted. They get a released_at timestamp, preserving full session history on disk. This follows the same append-never-delete pattern used throughout Prism for specs, ADRs, and leases.
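The combination of a partial UNIQUE constraint and soft release can be demonstrated with a small runnable sketch. It uses SQLite (which also supports partial unique indexes) as a stand-in for Postgres. The table name, column names, and the choice to clear `is_master` on release are assumptions made for illustration, not the actual Prism schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE controller_registration (
    session_id  TEXT PRIMARY KEY,
    tenant_id   TEXT NOT NULL,
    project_id  TEXT NOT NULL,
    is_master   INTEGER NOT NULL DEFAULT 0,
    released_at TEXT                       -- NULL while the session is active
);
-- Partial unique index: at most one master row per (tenant, project).
CREATE UNIQUE INDEX one_master_per_project
    ON controller_registration (tenant_id, project_id)
    WHERE is_master = 1;
""")


def register(session_id: str, tenant_id: str, project_id: str, is_master: bool) -> bool:
    """Insert a registration; return False if it would create a second master."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO controller_registration "
                "(session_id, tenant_id, project_id, is_master) VALUES (?, ?, ?, ?)",
                (session_id, tenant_id, project_id, int(is_master)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def release(session_id: str) -> None:
    """Soft-release: stamp released_at and clear the master flag; never DELETE."""
    with conn:
        conn.execute(
            "UPDATE controller_registration "
            "SET released_at = datetime('now'), is_master = 0 WHERE session_id = ?",
            (session_id,),
        )
```

Because the index only covers rows where `is_master = 1`, any number of peers and released rows can coexist for a project, while a second concurrent master is rejected by the database itself.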
You can inspect the routing table at any time. Think of it as net view for your agent network.
Context switching — prism_start gets smarter
When you call prism_start on a different project than the one you are currently registered on, the system detects the context switch automatically. Behind the scenes it soft-closes the old project (writing a wrap-session nudge so no work is silently lost), releases your registration on the old project, and registers you on the new one — running the master election for the new project in the same step.
From the user’s perspective: you say prism_start on a different project. Everything else is invisible. No prism_switch verb, no manual deregistration, no master re-election commands. The system figures out what needs to happen.
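The context-switch sequence can be sketched with an in-memory model. The `Backend` class and its fields are hypothetical, and the election is simplified to "first arrival wins" (Claude Desktop priority is omitted for brevity).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Backend:
    # session_id -> project the session is currently registered on
    current_project: Dict[str, str] = field(default_factory=dict)
    # project -> session_id of its current master
    master: Dict[str, str] = field(default_factory=dict)
    # append-only nudge log of (project, kind, session_id)
    nudges: List[Tuple[str, str, str]] = field(default_factory=list)

    def prism_start(self, session_id: str, project: str) -> str:
        """Register on `project` and return the resulting role."""
        old = self.current_project.get(session_id)
        if old is not None and old != project:
            # Context switch: soft-close the old project with a wrap-session nudge...
            self.nudges.append((old, "wrap-session", session_id))
            # ...and release the old registration (including master, if held).
            if self.master.get(old) == session_id:
                del self.master[old]
        self.current_project[session_id] = project
        # The election for the new project runs in the same step.
        self.master.setdefault(project, session_id)
        return "master" if self.master[project] == session_id else "peer"
```

The caller only ever invokes `prism_start`; the soft-close, release, and re-election all happen as side effects of that one call.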
Operator-driven master changes
The election state machine above handles the common case: first session wins, Claude Desktop wins on arrival, peers register quietly. SPEC-082 v0.3 adds three operator-facing verbs that turn the remaining cases (“I want a specific identity to be master right now,” “another agent’s session is stale and needs to be removed”) into explicit, idempotent operations rather than restart races.

| Verb | Caller | Purpose |
|---|---|---|
| prism_master_handoff(pid, to_identity, to_session_id?) | Current master | Cooperative transfer. Caller MUST be current master. |
| prism_master_claim(pid, to_identity, operator_id, operator_password, to_session_id?) | Anyone with operator credentials | Operator-authorized preempt. Caller proves authority via SPEC-038 §3.2 credentials; the target identity is independent of the caller. |
| prism_session_deregister(session_id) | Anyone | Surgical cleanup of a single controller row. Idempotent; releases both the Postgres row and the Redis hash + sessions-set entry. |

prism_master_handoff and prism_master_claim resolve their target through a deterministic 6-step algorithm:
- If to_session_id is provided, that exact controller row is selected (must match pid + identity, must be active). Authoritative override.
- Otherwise the candidate set is active rows for to_identity whose heartbeat is within the freshness threshold (default 30s).
- An empty candidate set returns target_not_registered.
- A non-empty set with all rows heartbeat-aged-out returns target_stale.
- Exactly one fresh candidate is selected.
- Multiple fresh candidates returns target_ambiguous, listing the candidate session_ids. The operator disambiguates by re-calling with to_session_id.
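
The resolution steps above can be sketched as follows. This is one reading of the algorithm, under two stated assumptions: the candidate set is taken to be the active rows for the identity, with freshness applied as a second filter, and an explicit to_session_id that matches no row is reported as target_not_registered (the source does not say which error applies). The `ControllerRow` and `ResolutionError` types are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

FRESHNESS_SECONDS = 30.0  # default freshness threshold from the text


@dataclass(frozen=True)
class ControllerRow:
    session_id: str
    pid: str
    identity: str
    active: bool
    heartbeat_age: float  # seconds since the last heartbeat


class ResolutionError(Exception):
    def __init__(self, code: str, candidates: Optional[List[str]] = None):
        super().__init__(code)
        self.code = code
        self.candidates = candidates or []


def resolve_target(
    rows: List[ControllerRow],
    pid: str,
    to_identity: str,
    to_session_id: Optional[str] = None,
) -> ControllerRow:
    if to_session_id is not None:
        # Step 1: explicit session id is an authoritative override (no freshness check).
        for row in rows:
            if (row.session_id == to_session_id and row.pid == pid
                    and row.identity == to_identity and row.active):
                return row
        # Assumption: the error code for an unmatched override is not specified.
        raise ResolutionError("target_not_registered")
    # Step 2: candidates are active rows for the identity on this project.
    active = [r for r in rows if r.pid == pid and r.identity == to_identity and r.active]
    if not active:
        raise ResolutionError("target_not_registered")  # step 3
    fresh = [r for r in active if r.heartbeat_age <= FRESHNESS_SECONDS]
    if not fresh:
        raise ResolutionError("target_stale")  # step 4
    if len(fresh) == 1:
        return fresh[0]  # step 5
    # Step 6: ambiguous; report the candidate session ids for disambiguation.
    raise ResolutionError("target_ambiguous", [r.session_id for r in fresh])
```

An operator who receives target_ambiguous would re-call with one of the returned session ids as to_session_id, which takes the override branch.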
controller_status rows are reconciled. prism_status reads from Redis, so it reflects the new master immediately on return. Concurrent calls are linearised by the CAS — the loser receives stale_master and can retry against the new state.
Both verbs reuse the existing MasterPreempted system signal — there is no MasterChanged enum. The payload extension carries previous_master_*, new_master_*, reason ("preempt" for prism_master_claim or election-driven preempt, "handoff" for prism_master_handoff), and by_operator=<operator_id> on the operator-authorized path. See Signal Mesh — System Signals for the full payload contract.
prism_session_deregister is the cleanup lever for stale rows that don’t represent a live session — typically because a peer crashed before its session manager could release the Redis state, or because the row predates the controller-row leak fix in PR #150. It does not change master state on its own; if the deregistered session happened to hold master, election re-runs on the next prism_start.
Consensus-first parallelism
Master election handles “one agent holds the gRPC stream”; leases handle “one agent at a time on a resource.” The middle layer, “how do many agents drive a single multi-step decision without serialising into one chat thread?”, is what SPEC-078 v0.2 codifies as consensus-first parallelism. It’s the orchestration discipline the eight-agent mesh runs on, and it’s what made Plan #10 ratify six governance artefacts in 3.5 hours instead of taking days.
The mechanism is a 3-tier consensus workflow tuned to the architectural risk of the work in flight:

| Risk tier | Trigger | Workflow |
|---|---|---|
| Low | Mechanical change inside an existing pattern (rename, dependency bump, doc fix) | Single-agent execute; PR review by lane owner. No consensus needed. |
| Medium | New scope inside an established lane (feature in shipped surface, schema migration on existing tables) | Driver routes to a designated reviewer. ReviewRequested → ReviewCompleted with structured findings. Single review pass unless findings are blocking. |
| High / architectural | Cross-lane scope, new authority surface, governance artefact, ratification arc | Multi-pass review chain: driver → architect for technical adequacy → governance/methodology for risk-tier authority and supersession → PO ratification. Each pass returns ReviewCompleted with a structured verdict (approved, approved_with_minor_nit, findings_block_ratification). Driver folds amendments into the next version, re-routes for confirmation, then ratifies. |

- method.completion.done-definition: “completion” means merged + deployed + tested, not merged alone. Reviewers and ratifiers gate on this definition rather than re-litigating it per arc.
- method.parallel.ownership-contract: explicit write ownership, single-driver-per-domain, signal-mediated handoffs. The contract that makes “no peer-to-peer command authority” work in practice.
Leases and capability tokens
When multiple agents work on the same project, they need to claim resources without colliding. The controller mediates this through leases and capability tokens.
A lease is a time-bounded claim on a resource: an entity, a file scope, a worktree. Only one agent can hold a lease on a given resource at a time, enforced by a partial UNIQUE constraint in Postgres. Leases expire automatically if not renewed. If a second agent requests a lease that is already held, the controller pushes a contention event to the master, which presents the conflict to the human for resolution.
A capability token scopes what an agent is authorized to do. Tokens are short-lived (minutes), opaque, and stored in Postgres. An agent refreshes its token via the gRPC stream. The token’s scope defines which resource types and actions the agent can perform: “can read and write specs” or “can read ADRs but not modify them.” The controller validates the token on every mutation before allowing it through to the intent queue.
Approval flow
Some mutations require human approval before they proceed. The controller handles this through two paths.
When Claude Desktop is master (the common case), approval requests arrive in real time via the gRPC stream. CD presents the request inline (“Agent Donna wants to update SPEC-020 status to superseded. Approve?”) and the human responds. The approval flows back through the stream and the mutation proceeds.
When CD is not running, the approval falls back to the nudge table. The next prism_start on any surface shows the pending approval, and the human resolves it there. Slower, but functional. The real-time path exists for speed; the async path exists for correctness.
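The two approval paths can be sketched as a small router. The `ApprovalRouter` class and its method names are hypothetical; the sketch only shows the fallback logic, not how approvals are persisted or answered.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ApprovalRouter:
    # Pushes a request onto the master's stream; None when CD is not connected.
    push_to_master: Optional[Callable[[str], None]] = None
    # Stand-in for pending approval rows in the nudge table.
    pending_nudges: List[str] = field(default_factory=list)

    def request_approval(self, description: str) -> str:
        """Route an approval request and report which path carried it."""
        if self.push_to_master is not None:
            self.push_to_master(description)  # real-time path via the stream
            return "stream"
        self.pending_nudges.append(description)  # async fallback via nudges
        return "nudge"

    def drain_on_start(self) -> List[str]:
        """Return and clear what the next prism_start would surface to the human."""
        drained, self.pending_nudges = self.pending_nudges, []
        return drained
```

With no stream attached, `request_approval` queues the request and a later `drain_on_start` returns it, which mirrors the slower but still functional path.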
How the backend routes events
The backend is the only router in the system. When a state mutation lands (through the existing intent queue from SPEC-026), the backend knows every registered agent on that project from the controller registration table. It routes the event through two delivery surfaces simultaneously: the master’s gRPC stream gets a real-time push, and every registered peer gets a nudge row written to the nudge table (SPEC-029).
One source of truth (Postgres), two delivery mechanisms (stream for speed, nudges for reliability), zero peer-to-peer relay. The master gets the fast path (push). Peers get the durable path (pull on next verb). If the master drops offline mid-stream, its next prism_start drains the nudges it missed, so real-time push is never the only delivery path. Zero lost events without a circular peer-to-peer mesh.
If the master dies (editor closes, machine sleeps), the backend detects the stream disconnect via gRPC keepalive. It marks the registration as released. Events continue accumulating in the nudge table. When the next agent calls prism_start, it re-elects a master, the new master opens a fresh stream, and catches up from the nudge table.
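The dual delivery and catch-up behaviour can be sketched in memory. The `Router` class is hypothetical, and it assumes that when the master's stream is down, events addressed to the master are written as nudges instead; the source describes the catch-up but not the exact mechanism.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class Router:
    """In-memory stand-in for the backend's event routing."""

    def __init__(self) -> None:
        self.agents: List[str] = []                         # registered sessions
        self.master: Optional[str] = None                   # current master session
        self.stream: Optional[Callable[[str], None]] = None  # master's push channel
        self.nudges: Dict[str, List[str]] = defaultdict(list)

    def route(self, event: str) -> None:
        for agent in self.agents:
            if agent == self.master and self.stream is not None:
                self.stream(event)                 # fast path: real-time push
            else:
                self.nudges[agent].append(event)   # durable path: pulled later

    def drain(self, agent: str) -> List[str]:
        """Return and clear the nudges an agent missed (as prism_start would)."""
        return self.nudges.pop(agent, [])
```

Setting `stream` to `None` models a dropped master: subsequent events accumulate as nudges and are returned by the next `drain`, so nothing depends on the stream alone.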
Scale and deployment
The controller election runs on a single Postgres instance. No distributed consensus, no etcd, no Raft. A Postgres row with a partial UNIQUE constraint IS the election. This handles the target scale comfortably: fewer than 10 users with fewer than 10 agents each, per project.
Deployment adds one new container to the existing docker-compose stack: backend-grpc, running from the same backend image with a different entrypoint. It shares the same database URL and the same Neo4j driver, and has zero shared in-memory state with the HTTP backend. The MCP server configuration in each editor does not change.
If Prism ever needs to go beyond LAN scale, the protocol stays the same. The Postgres instance moves to a hosted service. No protocol change needed — just an infrastructure swap.
What this means for existing users
If you are running Prism today as a single operator with one agent, nothing changes. prism_start still works exactly as before. You gain prism_status as a new verb, but you will never need it until you add a second agent. The controller registration happens silently: you become master of your own project with no visible difference in behavior.
The controller becomes meaningful when you add a second agent. The moment Donna joins a project that Lola is already working on, the registration table shows two entries, one is master, and the coordination infrastructure activates. No configuration, no setup, no new processes to launch.
Where to go next
Vision
The thesis — where the market is going and why Prism’s architecture is ahead of it.
Tri-Graph Architecture
The knowledge representation layer that the controller governs.
Installation
Get Prism running — the controller activates automatically.

