Conduct Post Mortem

Skill · Verified · Active

Conduct a blameless post-mortem analysis after an incident. Reconstruct the timeline, identify contributing factors, and generate actionable improvements. Focus on systemic issues rather than individual blame. Use after any production incident or service degradation, after a near-miss, when investigating recurring issues, or when sharing systemic learnings across teams.

Purpose

To enable teams to systematically learn from production incidents, improve system resilience, and foster a blameless culture by providing a repeatable, documented process for post-mortem analysis.

Features

  • Conducts blameless post-mortem analysis
  • Reconstructs incident timelines
  • Identifies systemic contributing factors
  • Generates actionable improvement items
  • Provides templates for reports and action tracking
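As a rough illustration of what timeline reconstruction involves, the sketch below merges events from several sources into chronological order and measures the gaps between milestones. It is a minimal sketch under stated assumptions, not the skill's actual implementation: the `TimelineEvent` class, the `reconstruct_timeline` and `elapsed` helpers, and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimelineEvent:
    """Hypothetical record for one entry in an incident timeline."""
    timestamp: datetime
    source: str       # assumed field: where the entry came from (alert, chat, deploy log)
    description: str  # what happened, worded about the system rather than a person


def reconstruct_timeline(events):
    """Return the events in chronological order, regardless of source."""
    return sorted(events, key=lambda e: e.timestamp)


def elapsed(timeline, start_text, end_text):
    """Time between the first events whose descriptions contain the given text."""
    start = next(e for e in timeline if start_text in e.description)
    end = next(e for e in timeline if end_text in e.description)
    return end.timestamp - start.timestamp


# Made-up sample data, deliberately out of order.
events = [
    TimelineEvent(datetime(2024, 1, 1, 10, 30), "chat", "mitigation applied"),
    TimelineEvent(datetime(2024, 1, 1, 10, 0), "deploy log", "impact began"),
    TimelineEvent(datetime(2024, 1, 1, 10, 12), "alert", "alert fired"),
]

timeline = reconstruct_timeline(events)
time_to_detect = elapsed(timeline, "impact began", "alert fired")
time_to_mitigate = elapsed(timeline, "impact began", "mitigation applied")
```

Keeping the `source` field on every entry makes it easier to see which systems recorded the incident and which did not, which is often a contributing factor in its own right.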

Use Cases

  • After any production incident or service degradation
  • Following a near-miss or close call
  • Investigating recurring issues
  • Sharing systemic learnings across teams

Non-Goals

  • Assigning blame to individuals
  • Performing live incident response
  • Automated root cause analysis without human input

Installation

/plugin install agent-almanac@pjt222-agent-almanac

Quality Score

Verified
99/100
Analyzed about 21 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 14
License: MIT

Similar Extensions

PM Post Mortem Facilitator

99

Blameless incident and failure post-mortem generator using 5 Whys root cause analysis. Produces structured post-mortem documents with timeline, contributing factors, corrective actions, and lessons learned. Tracks MTTR and detection time metrics. Use when someone says "post-mortem", "postmortem", "incident review", "root cause analysis", "5 whys", "what happened", "outage review", "failure analysis", "RCA".

Skill
marfoerst

Incident Response

100

Manage active production incidents through detection, triage, mitigation, communication, and resolution with structured roles and decision-making. Use this skill whenever the user has an active incident, a production issue, a service outage, a security incident, or needs to plan incident response procedures. Triggers on incident response, production incident, outage, service down, site down, P0, P1, severity, downtime, on-call, incident commander, status page, postmortem prep. Also triggers when something is actively broken in production and the user is figuring out what to do.

Skill
rampstackco

After Action Report

100

Run a structured after-action review (postmortem, retrospective) on a launch, incident, or completed project to capture timeline, root cause analysis, contributing factors, and actionable lessons. Use this skill whenever the user wants to run a postmortem, retrospective, AAR, or after-action review on any past event. Triggers on after-action report, AAR, postmortem, retrospective, retro, post-incident review, what went well what didn't, lessons learned, blameless postmortem, root cause analysis, RCA, five whys. Also triggers when the user has just shipped something or just resolved an incident and wants to capture learnings.

Skill
rampstackco

Ops Fires

100

Production incidents dashboard. Reads ECS health, Sentry errors, CI failures. Offers to dispatch fix agents for active fires.

Skill
Lifecycle-Innovations-Limited

Design On Call Rotation

100

Design sustainable on-call rotations with balanced schedules, clear escalation policies, fatigue management, and handoff procedures. Minimize burnout while maintaining incident response coverage. Use when setting up on-call for the first time, scaling a team from 2-3 to 5+ engineers, addressing on-call burnout or alert fatigue, improving incident response times, or after a post-mortem identifies handoff issues.

Skill
pjt222

Observability Designer

100

Observability Designer (POWERFUL)

Skill
alirezarezvani