Configure Alerting Rules

Configure Prometheus Alertmanager with routing trees, receivers (Slack, PagerDuty, email), inhibition rules, silences, and notification templates for actionable incident alerting. Use when implementing proactive monitoring with automated incident detection, routing alerts to the appropriate team by severity, reducing alert fatigue through grouping and deduplication, integrating with on-call systems like PagerDuty, or migrating from legacy alerting to Prometheus-based alerting.

Purpose

Help users set up actionable incident alerting with Prometheus Alertmanager, reduce alert fatigue, and deliver timely notifications to the right teams.

Features

  • Configure Alertmanager deployment and Prometheus integration
  • Define Prometheus alerting rules with best practices
  • Create notification templates for Slack, PagerDuty, and email
  • Implement advanced routing, grouping, and inhibition rules
  • Manage silences for planned maintenance and integrate with external systems
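
As a rough illustration of the routing, receiver, and inhibition features listed above, the following sketch builds a minimal Alertmanager configuration as a Python dictionary and checks that every receiver named in the routing tree is actually defined. The receiver names, the Slack channel, and the placeholder URL and routing key are assumptions made for this example and do not come from the skill itself. The field names (route, group_by, matchers, slack_configs, pagerduty_configs, inhibit_rules) follow the Alertmanager configuration format as commonly documented; verify them against the Alertmanager version in use.

```python
import json


def build_alertmanager_config() -> dict:
    """Return a minimal, illustrative Alertmanager configuration."""
    return {
        "route": {
            # Default receiver for alerts that match no child route.
            "receiver": "team-slack",
            # Alerts sharing these label values are batched into one notification.
            "group_by": ["alertname", "cluster"],
            "group_wait": "30s",
            "group_interval": "5m",
            "repeat_interval": "4h",
            "routes": [
                # Critical alerts page the on-call engineer instead.
                {"matchers": ['severity="critical"'], "receiver": "oncall-pagerduty"},
            ],
        },
        "receivers": [
            {
                "name": "team-slack",  # hypothetical receiver name
                "slack_configs": [
                    # Placeholder webhook URL, not a real endpoint.
                    {"channel": "#alerts", "api_url": "https://hooks.slack.com/services/PLACEHOLDER"}
                ],
            },
            {
                "name": "oncall-pagerduty",  # hypothetical receiver name
                "pagerduty_configs": [{"routing_key": "PLACEHOLDER"}],
            },
        ],
        "inhibit_rules": [
            # Mute warnings while a critical alert with the same
            # alertname and cluster is firing.
            {
                "source_matchers": ['severity="critical"'],
                "target_matchers": ['severity="warning"'],
                "equal": ["alertname", "cluster"],
            }
        ],
    }


def undefined_receivers(config: dict) -> set:
    """Return receiver names used in the routing tree but never defined."""
    defined = {receiver["name"] for receiver in config.get("receivers", [])}
    used = set()
    pending = [config["route"]]
    while pending:
        route = pending.pop()
        if "receiver" in route:
            used.add(route["receiver"])
        pending.extend(route.get("routes", []))
    return used - defined


if __name__ == "__main__":
    config = build_alertmanager_config()
    missing = undefined_receivers(config)
    if missing:
        raise SystemExit(f"undefined receivers: {sorted(missing)}")
    # JSON is used here only to avoid a third-party YAML dependency.
    print(json.dumps(config, indent=2))
```

A check like undefined_receivers catches one common misconfiguration before deployment. It does not replace validating the real file with the tooling that ships with Alertmanager.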

Use Cases

  • Implementing proactive monitoring with automated incident detection
  • Routing alerts to appropriate teams based on severity
  • Reducing alert fatigue through grouping and deduplication
  • Integrating monitoring with on-call systems like PagerDuty
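
The severity-based routing and grouping described in these use cases can be pictured with a deliberately simplified model. This is not Alertmanager's actual implementation, which walks a nested routing tree and supports regex matchers and a continue option. The receiver names, the route table, and the group-by labels below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical route table: first matching entry wins.
ROUTES = [
    ({"severity": "critical"}, "oncall-pagerduty"),
    ({"severity": "warning"}, "team-slack"),
]
DEFAULT_RECEIVER = "team-email"  # fallback when no route matches
GROUP_BY = ("alertname", "cluster")  # labels that define a notification group


def pick_receiver(labels: dict) -> str:
    """Return the receiver of the first route whose labels all match."""
    for required, receiver in ROUTES:
        if all(labels.get(key) == value for key, value in required.items()):
            return receiver
    return DEFAULT_RECEIVER


def group_alerts(alerts: list) -> dict:
    """Group alerts by receiver and GROUP_BY labels, dropping exact duplicates."""
    groups = defaultdict(list)
    for labels in alerts:
        key = (pick_receiver(labels),) + tuple(labels.get(name, "") for name in GROUP_BY)
        # Deduplication: an alert with an identical label set is kept once.
        if labels not in groups[key]:
            groups[key].append(labels)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"alertname": "HighLatency", "cluster": "eu", "severity": "critical", "instance": "a"},
        {"alertname": "HighLatency", "cluster": "eu", "severity": "critical", "instance": "b"},
        {"alertname": "DiskFilling", "cluster": "eu", "severity": "warning", "instance": "a"},
    ]
    for key, members in group_alerts(sample).items():
        print(key, len(members))
```

In this model, two instances firing the same critical alert in one cluster produce a single group for the paging receiver rather than two separate pages, which is the effect grouping is meant to have on alert fatigue.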

Non-Goals

  • Setting up Prometheus metrics collection
  • Writing custom alert queries
  • Managing on-call rotations directly (only integrating with systems that do)
  • Writing incident response runbooks (though it links to them)

Installation

/plugin install agent-almanac@pjt222-agent-almanac

Quality Score

Verified: 98/100
Analyzed about 21 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 14
License: MIT
