Monitor Model Drift
Implement comprehensive model drift monitoring using Evidently AI, statistical tests (PSI, KS), and custom metrics to detect data drift and concept drift in production ML systems. Set up automated alerting and reporting workflows to catch degradation before it impacts business metrics. Use when production models show unexplained performance degradation, when new data distributions differ from training data, when seasonal shifts affect input features, or when regulatory requirements mandate model monitoring.
Purpose
To proactively detect and alert on data and concept drift in production ML systems, preventing performance degradation and impact on business metrics through automated monitoring and reporting.
Features
- Implement comprehensive model drift monitoring using Evidently AI (see the example sketches after this list)
- Detect data drift with statistical tests (PSI, KS)
- Detect concept drift via performance monitoring
- Set up automated alerting and reporting workflows
- Schedule monitoring jobs for continuous oversight
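A minimal sketch of a batch data drift check, assuming the Evidently 0.4.x report API (`Report` with `DataDriftPreset`); the file paths are placeholders and the result field names follow that version's output, so adjust the imports and keys for newer Evidently releases.

```python
# Minimal data drift check; assumes the Evidently 0.4.x API and hypothetical file paths.
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

reference = pd.read_parquet("reference.parquet")  # training-time feature snapshot (placeholder path)
current = pd.read_parquet("current.parquet")      # recent production batch (placeholder path)

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

# HTML report for humans, dict output for automation
report.save_html("drift_report.html")
result = report.as_dict()
# "dataset_drift" is the 0.4.x field name for the overall drift flag; may differ in other versions
print(result["metrics"][0]["result"].get("dataset_drift"))
```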
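When a custom metric is needed instead of the preset, PSI and the KS test can be computed directly. This is a sketch using numpy and scipy for a single numeric feature; the 10-bin layout and the synthetic data are illustrative, not part of the skill.

```python
# Hand-rolled PSI plus a two-sample KS test for one numeric feature.
import numpy as np
from scipy import stats

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a reference and a current sample."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    edges[0], edges[-1] = -np.inf, np.inf  # catch values outside the reference range
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Clip to avoid log(0) / division by zero for empty bins
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

ref = np.random.normal(0.0, 1.0, 5000)   # stand-in for the training distribution
cur = np.random.normal(0.3, 1.2, 5000)   # stand-in for a drifted production batch

ks_stat, ks_p = stats.ks_2samp(ref, cur)
print(f"PSI={psi(ref, cur):.3f}  KS={ks_stat:.3f} (p={ks_p:.4f})")
```

A common convention treats PSI below 0.1 as stable, 0.1 to 0.2 as worth watching, and above 0.2 as significant drift; tune these cutoffs to your own features.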
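Alerting is glue code around whichever drift summary is produced. This is a hedged sketch that posts to a webhook when the drifted-feature share crosses a threshold, meant to run from a scheduled job (cron, Airflow, or CI) rather than per request; the URL, threshold, and payload shape are assumptions, not part of the skill.

```python
# Illustrative alerting hook: threshold, webhook URL, and payload shape are placeholders.
import json
import urllib.request

DRIFT_SHARE_THRESHOLD = 0.3  # alert if more than 30% of features drifted (illustrative)
WEBHOOK_URL = "https://example.com/hooks/ml-monitoring"  # placeholder endpoint

def maybe_alert(drift_share: float, n_drifted: int) -> None:
    """Send a webhook notification when the drift share exceeds the threshold."""
    if drift_share < DRIFT_SHARE_THRESHOLD:
        return
    payload = {"text": f"Data drift detected: {n_drifted} features drifted, share={drift_share:.2f}"}
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10)
```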
Use Cases
- When production models show unexplained performance degradation
- When new data distributions differ from training data
- When seasonal shifts affect input features
- When regulatory requirements mandate model monitoring
Non-Goals
- Automated model retraining based on drift detection
- Real-time monitoring of every production system without scheduling
- Deep analysis of root causes beyond drift identification
Installation
/plugin install agent-almanac@pjt222-agent-almanac
Similar Extensions
Azure Monitor Query Py
Azure Monitor Query SDK for Python. Use for querying Log Analytics workspaces and Azure Monitor metrics. Triggers: "azure-monitor-query", "LogsQueryClient", "MetricsQueryClient", "Log Analytics", "Kusto queries", "Azure metrics".
Hf Cli
Hugging Face Hub CLI (`hf`) for downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, jobs, and more on the Hugging Face Hub. Use when: handling authentication; managing local cache; managing Hugging Face Buckets; running or scheduling jobs on Hugging Face infrastructure; managing Hugging Face repos; discussions and pull requests; browsing models, datasets and spaces; reading, searching, or browsing academic papers; managing collections; querying datasets; configuring spaces; setting up webhooks; or deploying and managing HF Inference Endpoints. Make sure to use this skill whenever the user mentions 'hf', 'huggingface', 'Hugging Face', 'huggingface-cli', or 'hugging face cli', or wants to do anything related to the Hugging Face ecosystem and to AI and ML in general. Also use for cloud storage needs like training checkpoints, data pipelines, or agent traces. Use even if the user doesn't explicitly ask for a CLI command. Replaces the deprecated `huggingface-cli`.
Arize Experiment
Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.
Arize Evaluator
Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.
Arize Dataset
Creates, manages, and queries Arize datasets and examples. Covers dataset CRUD, appending examples, exporting data, and file-based dataset creation using the ax CLI. Use when the user needs test data, evaluation examples, or mentions create dataset, list datasets, export dataset, append examples, dataset version, golden dataset, or test set.
CE Optimize
Run metric-driven iterative optimization loops: define a measurable goal, run parallel experiments, measure each against hard gates or LLM-as-judge scores, keep improvements, and converge on the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation.