A postmortem template that engineering managers actually read

Why most write-ups fail

Many postmortems chase blamelessness so hard they forget decision support. A good write-up answers: what hurt customers, what amplified it, and what we will measurably change.

The skeleton

Summary — one paragraph, plain language, customer view first.
Impact — duration, error rates, revenue or trust signals if known; explicit “unknowns.”
Timeline — UTC, tool-sourced facts, not reconstructed hero narratives.
Detection — did we page for the right reason? False negatives are costlier than false positives here.
Root causes — plural, usually. Separate proximate trigger from systemic contributors.
What went well — real praise for automation and runbooks that worked.
Corrective actions — each with an owner and a definition of done; avoid ticket spam.
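The skeleton above can also be treated as a small data structure, so that a draft is checked mechanically before anyone reviews it. What follows is a minimal sketch, not a standard schema: the class names, field names, and the review_gaps helper are all assumptions made up for illustration, and the sample incident is invented. It flags empty sections, timeline entries that are not in UTC, and corrective actions that lack an owner or a definition of done.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class CorrectiveAction:
    # Hypothetical shape: the template only asks for an owner and a definition of done.
    description: str
    owner: str = ""
    definition_of_done: str = ""


@dataclass
class Postmortem:
    # One field per section of the skeleton; all names are illustrative.
    summary: str = ""
    impact: str = ""
    unknowns: list[str] = field(default_factory=list)
    timeline: list[tuple[datetime, str]] = field(default_factory=list)
    detection: str = ""
    root_causes: list[str] = field(default_factory=list)
    went_well: list[str] = field(default_factory=list)
    actions: list[CorrectiveAction] = field(default_factory=list)


def review_gaps(pm: Postmortem) -> list[str]:
    """Return human-readable gaps; an empty list means nothing was flagged."""
    gaps = []
    for name in ("summary", "impact", "detection"):
        if not getattr(pm, name).strip():
            gaps.append(f"{name} is empty")
    if not pm.root_causes:
        gaps.append("no root causes listed")
    for when, _event in pm.timeline:
        # utcoffset() is None for naive datetimes, so those are flagged too.
        if when.utcoffset() != timedelta(0):
            gaps.append(f"timeline entry {when.isoformat()} is not in UTC")
    for action in pm.actions:
        if not action.owner.strip():
            gaps.append(f"action '{action.description}' has no owner")
        if not action.definition_of_done.strip():
            gaps.append(f"action '{action.description}' has no definition of done")
    return gaps


if __name__ == "__main__":
    # Invented example data, for illustration only.
    draft = Postmortem(
        summary="Checkout returned errors for some customers.",
        impact="Duration and error rate still being confirmed.",
        detection="Paged on elevated error rate.",
        root_causes=["Bad config push", "No canary stage for config"],
        timeline=[(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), "First alert fired")],
        actions=[CorrectiveAction("Add canary stage for config pushes")],
    )
    for gap in review_gaps(draft):
        print(gap)
```

The check is deliberately shallow. It cannot tell whether a summary is written from the customer's view or whether a root cause is systemic; it only catches the omissions that a reviewer should not have to spend attention on.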

Cultural tradeoffs

Depth vs speed: for severe events, publish an “initial learning” doc within 24 hours, then a deeper follow-up if facts were missing.
Transparency: external postmortems earn trust; internal-only write-ups invite rumor.

What I would improve next time

Pair every action item with a budget or metric (even a lightweight one). “Add monitoring” without a signal definition tends to go stale, because no one can say when it is done.
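One way to make that pairing concrete is to give each action item an optional signal definition and an optional budget, and to list the items that have neither. This is a rough sketch under assumptions: the SignalDefinition and ActionItem shapes, the metric name checkout_5xx_ratio, and the threshold values are all hypothetical and not taken from any real monitoring system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignalDefinition:
    # Hypothetical fields: what is measured, when it fires, over what window.
    metric: str
    threshold: float
    window_minutes: int


@dataclass
class ActionItem:
    title: str
    owner: str
    signal: Optional[SignalDefinition] = None
    budget_hours: Optional[float] = None


def is_measurable(item: ActionItem) -> bool:
    """An item counts as measurable if it carries a signal or a budget."""
    return item.signal is not None or item.budget_hours is not None


def unmeasured_titles(items: list[ActionItem]) -> list[str]:
    """Titles of action items with neither a signal definition nor a budget."""
    return [item.title for item in items if not is_measurable(item)]


# Invented examples: the first is the vague form, the second names its signal.
items = [
    ActionItem(title="Add monitoring", owner="alice"),
    ActionItem(
        title="Alert on checkout error ratio",
        owner="bob",
        signal=SignalDefinition(metric="checkout_5xx_ratio", threshold=0.02, window_minutes=5),
    ),
    ActionItem(title="Rehearse config rollback", owner="carol", budget_hours=4.0),
]
print(unmeasured_titles(items))
```

Accepting either a signal or a budget keeps the bar low on purpose: the point is that every item states something checkable, not that every item becomes an alert.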