Status: draft · Version v0.1 · Filed 2026-05-01

spec_id: SPEC-063
version: v0.1
status: draft
authored_by: Donna
date: 2026-04-30

SPEC-063 — Postmortem + Retrospective Discipline

(Originally drafted as SPEC-062; that number was taken by Texi’s Codex thread-name spec filed 2026-05-01 01:18Z. Renumbered to 063, content unchanged.)

Status

Draft. Phase 1 ships in the same PR as this spec. Operator directive, 2026-04-30: the error rate is accelerating, and the memory system is reactive (capture-after) rather than preventive (consult-before). Filing memories doesn’t change next-action behavior because consulting them before acting is voluntary. Postmortems force the author to name the choice point (the moment verification was available and skipped), and that naming is what builds the reflex.

Problem

Eight distinct discipline failures occurred in one session (2026-04-30, Donna); see postmortem_session_2026-04-30.md. In each case a memory rule that should have prevented the failure was already loaded, and each failure was filed as a feedback memory after the fact. None of those filings prevented the next instance of the same class of failure. The memory system as currently practiced is reactive, and the gap between filing and reading is the operative failure mode. Memories sit in MEMORY.md, available to the next session, but consulting them is a voluntary action. Under any time pressure (or a simple typing-ergonomics preference), the agent defaults to pattern-completion and confabulation, then files yet another memory after the operator catches the error.

Goals

  1. Every detected error gets a postmortem. Postmortem template is closed-form. No abbreviation tolerated.
  2. Every postmortem gets a retrospective. Why was verification skipped? What concretely changes next time?
  3. Postmortems are queryable. They are stored as durable records, surfaced on bootstrap, and aggregated on the dashboard so the error rate is observable.
  4. Wraps surface unresolved postmortems. A wrap that lands while open postmortems sit unaddressed shouldn’t quietly succeed.

Architecture

§3.1 — postmortems table (Phase 1)

CREATE TABLE postmortems (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id       UUID NOT NULL REFERENCES tenants(id),
    project_id      UUID NOT NULL REFERENCES projects(id),
    captured_via    TEXT NOT NULL,            -- agent persona
    captured_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    error_summary   TEXT NOT NULL,
    what_i_did      TEXT NOT NULL,
    what_i_should   TEXT NOT NULL,
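    choice_point    TEXT NOT NULL,            -- moment verification was available and skipped
    memory_violated TEXT,                     -- memory rule that should have prevented the error, if any
    blast_radius    TEXT,
    why_skipped     TEXT NOT NULL,            -- retrospective: why verification was skipped
    what_catches    TEXT NOT NULL,
    action_item     TEXT NOT NULL,
    triggered_by_input  UUID REFERENCES operator_inputs(id),  -- see SPEC-060 §3.6
    resolved_at         TIMESTAMPTZ,          -- NULL while the postmortem is open
    resolution_kind     TEXT,                 -- one of the kinds listed in §3.3
    resolution_ref      TEXT                  -- file path, spec_id, or commit SHA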
);
CREATE INDEX ix_postmortems_unresolved ON postmortems(captured_at DESC) WHERE resolved_at IS NULL;
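
As a rough illustration of the query the partial index is meant to serve, the sketch below lists open postmortems, newest first, against an in-memory SQLite database. SQLite is only a stand-in so the example runs without a server; the reduced column set, the TEXT column types, the `open_postmortems` helper, and the sample rows are all assumptions made for illustration, not part of the Phase 1 migration.

```python
import sqlite3

# Reduced stand-in for the postmortems table. SQLite has no UUID or
# TIMESTAMPTZ types, so TEXT columns holding ISO-8601 strings are used.
SCHEMA = """
CREATE TABLE postmortems (
    id            TEXT PRIMARY KEY,
    captured_via  TEXT NOT NULL,
    captured_at   TEXT NOT NULL,
    error_summary TEXT NOT NULL,
    resolved_at   TEXT
);
CREATE INDEX ix_postmortems_unresolved
    ON postmortems(captured_at DESC) WHERE resolved_at IS NULL;
"""


def open_postmortems(conn: sqlite3.Connection) -> list:
    """Return (id, error_summary) for unresolved rows, newest first."""
    cursor = conn.execute(
        "SELECT id, error_summary FROM postmortems "
        "WHERE resolved_at IS NULL ORDER BY captured_at DESC"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Illustrative placeholder rows, not real records.
conn.executemany(
    "INSERT INTO postmortems VALUES (?, ?, ?, ?, ?)",
    [
        ("pm-1", "Donna", "2026-04-30T01:00:00Z", "example error A", None),
        ("pm-2", "Donna", "2026-04-30T02:00:00Z", "example error B",
         "2026-04-30T03:00:00Z"),
        ("pm-3", "Donna", "2026-04-30T04:00:00Z", "example error C", None),
    ],
)
open_rows = open_postmortems(conn)
```

Only rows with `resolved_at IS NULL` come back, so a resolved postmortem drops out of the listing without any extra flag column.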

§3.2 — prism_postmortem verb

prism_postmortem(
  pid, error_summary, what_i_did, what_i_should, choice_point,
  memory_violated?, blast_radius?, why_skipped, what_catches, action_item,
  triggered_by_input?
) → {id, captured_at}
Required fields enforce the template. Empty or abbreviated fields are rejected with a clear error.
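
One way the rejection rule could look is sketched below. The spec does not define what counts as "abbreviated", so the minimum length, the placeholder list, and the names `validate_postmortem` and `PostmortemValidationError` are hypothetical choices made for illustration.

```python
# Required template fields, taken from the verb signature above.
REQUIRED_FIELDS = (
    "error_summary", "what_i_did", "what_i_should", "choice_point",
    "why_skipped", "what_catches", "action_item",
)

# Hypothetical thresholds: the spec does not say what "abbreviated" means.
MIN_CHARS = 20
PLACEHOLDERS = {"n/a", "na", "none", "tbd", "todo", "-"}


class PostmortemValidationError(ValueError):
    """Raised when a required template field is missing or too thin."""


def validate_postmortem(fields: dict) -> None:
    """Raise PostmortemValidationError naming every failing required field."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = (fields.get(name) or "").strip()
        if not value:
            problems.append(f"{name}: required, got an empty value")
        elif value.lower() in PLACEHOLDERS:
            problems.append(f"{name}: placeholder text {value!r} is not accepted")
        elif len(value) < MIN_CHARS:
            problems.append(f"{name}: too short ({len(value)} < {MIN_CHARS} chars)")
    if problems:
        raise PostmortemValidationError("; ".join(problems))
```

Collecting every problem before raising means one rejected call tells the agent about all thin fields at once, rather than one per retry.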

§3.3 — prism_postmortem_resolve verb

prism_postmortem_resolve(
  pid, postmortem_id,
  resolution_kind: 'memory_filed'|'spec_filed'|'tooling_change'|'no_action_needed',
  resolution_ref?: <file path | spec_id | commit SHA>
) → {id, resolved_at}
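
A minimal sketch of the resolve step, assuming an in-memory record rather than the real service layer. The `Postmortem` dataclass and the `resolve_postmortem` function are hypothetical names; only the four resolution kinds and the `{id, resolved_at}` return shape come from the signature above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# The four kinds listed in the verb signature.
RESOLUTION_KINDS = {
    "memory_filed", "spec_filed", "tooling_change", "no_action_needed",
}


@dataclass
class Postmortem:
    """Hypothetical in-memory stand-in for one postmortems row."""
    id: str
    resolved_at: Optional[datetime] = None
    resolution_kind: Optional[str] = None
    resolution_ref: Optional[str] = None


def resolve_postmortem(
    pm: Postmortem,
    resolution_kind: str,
    resolution_ref: Optional[str] = None,
    now: Optional[datetime] = None,
) -> dict:
    """Mark a postmortem resolved and return {id, resolved_at}."""
    if resolution_kind not in RESOLUTION_KINDS:
        allowed = ", ".join(sorted(RESOLUTION_KINDS))
        raise ValueError(
            f"unknown resolution_kind {resolution_kind!r}; expected one of: {allowed}"
        )
    pm.resolved_at = now or datetime.now(timezone.utc)
    pm.resolution_kind = resolution_kind
    pm.resolution_ref = resolution_ref
    return {"id": pm.id, "resolved_at": pm.resolved_at}
```

The sketch leaves open whether an already resolved postmortem may be resolved again, since the spec does not say.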

§3.4 — Wrap-time unresolved-postmortem warning (Phase 2)

prism_wrap checks postmortems WHERE resolved_at IS NULL for the current agent. A non-zero count emits a non-blocking warning that names each open postmortem.
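
The check could be sketched roughly as follows. The row dictionaries, the `wrap` and `wrap_warning` helpers, and the `{ok, warnings}` response shape are assumptions for illustration; the spec only states that the warning is non-blocking and names each open postmortem.

```python
from typing import Optional


def wrap_warning(open_rows: list) -> Optional[str]:
    """Return warning text naming each open postmortem, or None if none are open."""
    if not open_rows:
        return None
    lines = [f"{len(open_rows)} unresolved postmortem(s):"]
    for pm in open_rows:
        lines.append(f"  - {pm['id']} ({pm['captured_at']}): {pm['error_summary']}")
    return "\n".join(lines)


def wrap(agent: str, postmortems: list) -> dict:
    """Hypothetical wrap step: always succeeds, but attaches a warning if needed."""
    mine = [
        pm for pm in postmortems
        if pm["captured_via"] == agent and pm["resolved_at"] is None
    ]
    result = {"ok": True, "warnings": []}
    warning = wrap_warning(mine)
    if warning is not None:
        # Non-blocking: the wrap still reports ok=True.
        result["warnings"].append(warning)
    return result
```

Keeping the warning in a separate list, rather than raising, matches the non-blocking intent: the wrap lands, and the open debt is visible in its output.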

§3.5 — Bootstrap-time recent-postmortems read (Phase 2)

prism_start response includes recent_postmortems[]. Bootstrap prompt updated: “Read recent_postmortems before any class-1/2/3 action this session.”
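
A sketch of how recent_postmortems[] might be assembled. The spec does not define "recent", so the newest-first ordering, the default limit of 5, the decision to include resolved rows, and the helper names are all assumptions.

```python
from datetime import datetime, timezone


def recent_postmortems(rows: list, limit: int = 5) -> list:
    """Return up to `limit` postmortems, newest first (hypothetical policy)."""
    ordered = sorted(rows, key=lambda pm: pm["captured_at"], reverse=True)
    return ordered[:limit]


def build_start_response(rows: list, limit: int = 5) -> dict:
    """Hypothetical fragment of a prism_start response body."""
    return {"recent_postmortems": recent_postmortems(rows, limit)}


# Illustrative placeholder rows, not real records.
sample = [
    {"id": "pm-1", "captured_at": datetime(2026, 4, 30, 1, 0, tzinfo=timezone.utc)},
    {"id": "pm-2", "captured_at": datetime(2026, 4, 30, 4, 0, tzinfo=timezone.utc)},
    {"id": "pm-3", "captured_at": datetime(2026, 4, 30, 2, 0, tzinfo=timezone.utc)},
]
response = build_start_response(sample, limit=2)
```

A count-based limit keeps the bootstrap payload bounded; a time window (for example, the last N days) would be an equally plausible reading of "recent".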

§3.6 — Dashboard panels (Phase 3 — deferred)

| Panel | Source | Question |
| --- | --- | --- |
| Error rate over time | postmortems count per session | Is the discipline working? |
| Open postmortems by agent | unresolved group by agent | Who has unresolved debt? |
| Top memory rules violated | group by memory_violated | Which rules don’t stop the agent? |
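
Two of the panel aggregations can be sketched over plain row dictionaries, as below. The helper names are hypothetical, and "agent" is assumed to map to the captured_via column. The error-rate panel is left out because it would need a session identifier, which the §3.1 table does not define.

```python
from collections import Counter


def open_by_agent(rows: list) -> Counter:
    """Count unresolved postmortems per agent (captured_via)."""
    return Counter(
        pm["captured_via"] for pm in rows if pm["resolved_at"] is None
    )


def top_rules_violated(rows: list, n: int = 3) -> list:
    """Return the n most frequently violated memory rules as (rule, count)."""
    counts = Counter(
        pm["memory_violated"] for pm in rows if pm["memory_violated"]
    )
    return counts.most_common(n)
```

Rows with no memory_violated value are skipped, since that column is nullable and a missing rule is not itself a rule.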

Phased rollout

  • Phase 1: table + verbs (record + resolve). Lowest risk; ships tonight in the same PR as this spec.
  • Phase 2: wrap warning + bootstrap read. Touches lifecycle paths, so it needs careful smoke testing. Ships in its own PR.
  • Phase 3: dashboard panels. Tied to other Phase-3 dashboard PRs.

Files changed (Phase 1)

| File | Change |
| --- | --- |
| backend/alembic/versions/028_spec063_postmortems.py | New |
| backend/app/models/postmortem.py | New |
| backend/app/services/postmortem_service.py | New |
| backend/app/routers/postmortem.py | New |
| backend/app/schemas/postmortem.py | New |
| backend/app/main.py | Modified: include router |
| mcp-node/src/verbs/coordination.ts | Modified: verbs |
| mcp-node/src/client/http.ts | Modified: client methods |
| mcp-node/src/verbs.ts | Modified: verb names |
| tests/test_spec063_postmortems.py | New |

References

  • feedback_postmortem_on_every_error.md — the discipline rule.
  • postmortem_session_2026-04-30.md — 8 worked examples.
  • SPEC-060 §3.6 (operator_inputs) — postmortems link via triggered_by_input.
  • Triggering event: 2026-04-30 ~02:45Z, Frank: “we need to get in the habit of writing a postmortem and retrospective on every single error.”

Authorship

Donna 2026-04-30. Same session as the directive that generated it.
Last modified on May 18, 2026