Release It!

Skill · Verified · Active

Build production-ready systems with stability patterns: circuit breakers, bulkheads, timeouts, and retry logic. Use when the user mentions "production outage", "circuit breaker", "timeout strategy", "deployment pipeline", "chaos engineering", "bulkhead pattern", "retry with backoff", or "health checks". Also trigger when designing resilient microservices, planning zero-downtime deployments, or investigating cascading failure scenarios. Covers capacity planning, health checks, and anti-fragility patterns. For data systems, see ddia-systems. For system architecture, see system-design.

Purpose

To guide users in designing, deploying, and operating production-ready software systems that can withstand failures and operate reliably under real-world conditions.

Features

  • Explains stability anti-patterns and their countermeasures
  • Details essential stability patterns (circuit breakers, bulkheads, timeouts, retries)
  • Covers capacity planning and performance testing methodologies
  • Outlines safe deployment strategies (rolling, blue-green, canary, feature flags)
  • Guides the implementation of observability (logs, metrics, traces, health checks)
  • Introduces chaos engineering principles and practices
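
As an illustration of the circuit breaker pattern named in the list above, here is a minimal Python sketch. It is not taken from the skill itself: the class name `CircuitBreaker`, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all assumptions made for this example. It shows the usual three states: closed (calls pass through), open (calls fail fast), and half-open (one trial call is allowed after the timeout).

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical single-threaded breaker; names and defaults are illustrative."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so tests need not sleep
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            # Fail fast instead of tying up a thread on a dependency known to be failing.
            raise CircuitOpenError("circuit open; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Trip when the threshold is reached, or re-open if a half-open trial fails.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is a hypothetical function, and treat `CircuitOpenError` as a signal to serve a fallback. A production breaker would also need thread safety and metrics, which this sketch omits.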

Use cases

  • Designing resilient microservices
  • Planning zero-downtime deployments
  • Investigating cascading failure scenarios
  • Implementing capacity planning and health checks
  • Building systems that handle production outages gracefully

Non-goals

  • Providing specific code implementations for every pattern
  • Replacing foundational books on system design and resilience
  • Automating chaos engineering experiments directly (the skill only gives guidance on running them safely)

Practices

  • Stability Patterns
  • Capacity Planning
  • Deployment Strategies
  • Observability
  • Chaos Engineering
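
For the Stability Patterns practice, "retry with backoff" can be sketched as below. This is a hedged example rather than the skill's own code: the function names `backoff_delays` and `retry`, the default values, and the choice of "full jitter" (a random delay between zero and a capped exponential bound) are assumptions for illustration.

```python
import random
import time


def backoff_delays(max_attempts, base=0.1, cap=5.0, rng=random.random):
    """Yield one jittered delay per retry, i.e. max_attempts - 1 delays in total."""
    for attempt in range(max_attempts - 1):
        # The exponential bound doubles on each attempt and is capped at `cap` seconds.
        yield rng() * min(cap, base * 2 ** attempt)


def retry(func, max_attempts=4, base=0.1, cap=5.0,
          retry_on=(Exception,), sleep=time.sleep, rng=random.random):
    """Call func until it succeeds or max_attempts is exhausted."""
    delays = backoff_delays(max_attempts, base, cap, rng)
    while True:
        try:
            return func()
        except retry_on:
            delay = next(delays, None)
            if delay is None:
                raise  # attempts exhausted: surface the last error to the caller
            sleep(delay)
```

In practice `retry_on` should be narrowed to errors that are actually transient (for example `ConnectionError`), and retries should be bounded as shown here so that a struggling dependency is not flooded with repeated requests.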

Installation

Add the marketplace first, then install the plugin:

/plugin marketplace add wondelai/skills
/plugin install skills@wondelai-skills

Quality score

Verified
95/100
Analyzed about 24 hours ago

Trust signals

Last commit: 17 days ago
Stars: 953
License: MIT

Similar extensions

Chaos Engineer

99

Designs chaos experiments, creates failure injection frameworks, and facilitates game day exercises for distributed systems — producing runbooks, experiment manifests, rollback procedures, and post-mortem templates. Use when designing chaos experiments, implementing failure injection frameworks, or conducting game day exercises. Invoke for chaos experiments, resilience testing, blast radius control, game days, antifragile systems, fault injection, Chaos Monkey, Litmus Chaos.

Skill
jeffallan

Chaos Engineering

99

Use when planning, running, or learning from chaos engineering experiments. Triggers on "chaos experiment", "fault injection", "gameday", "resilience test", "blast radius", "steady state", "abort criteria", "Chaos Toolkit", "Chaos Mesh", "Litmus", "Gremlin", "AWS FIS", or any deliberate failure-injection question. Ships experiment designer, blast-radius calculator, and postmortem generator (all stdlib Python), 4 references on chaos principles + experiment design + attack taxonomy + tooling landscape, and a /chaos-experiment slash command. Composes with feature-flags-architect (kill switches as abort triggers) and kubernetes-operator (common chaos targets).

Skill
alirezarezvani

Run Chaos Experiment

95

Design and execute chaos engineering experiments using Litmus or Chaos Mesh. Test system resilience through controlled fault injection, validate hypothesis-driven tests, and improve failure recovery. Use before major product launches, after architecture changes to validate resilience, during GameDays or disaster recovery drills, to validate assumptions about failure modes, or as part of an SRE maturity program.

Skill
pjt222

Design On Call Rotation

100

Design sustainable on-call rotations with balanced schedules, clear escalation policies, fatigue management, and handoff procedures. Minimize burnout while maintaining incident response coverage. Use when setting up on-call for the first time, scaling a team from 2-3 to 5+ engineers, addressing on-call burnout or alert fatigue, improving incident response times, or after a post-mortem identifies handoff issues.

Skill
pjt222

Observability Designer

100

Observability Designer (POWERFUL)

Skill
alirezarezvani

Circuit Breaker Pattern

100

Implement circuit breaker logic for agentic tool calls — tracking tool health, transitioning between closed/open/half-open states, reducing task scope when tools fail, routing to alternatives via capability maps, and enforcing failure budgets to prevent error accumulation. Separates orchestration (deciding what to attempt) from execution (calling tools), following the expeditor pattern. Use when building agents that depend on multiple tools with varying reliability, designing fault-tolerant agentic workflows, recovering gracefully from tool outages mid-task, or hardening existing agents against cascading tool failures.

Skill
pjt222