
Deploy ML Model Serving

Skill · Active

Deploy machine learning models to production serving infrastructure using MLflow, BentoML, or Seldon Core with REST/gRPC endpoints, and implement autoscaling, monitoring, and A/B testing for high-performance model inference at scale. Use when deploying trained models for real-time inference, setting up REST or gRPC prediction APIs, implementing autoscaling for variable load, running A/B tests between model versions, or migrating from batch to real-time inference.

Purpose

To enable users to deploy and manage machine learning models in production environments with robust, scalable, and observable serving infrastructure.

Capabilities

  • Deploy ML models with MLflow, BentoML, Seldon Core
  • Implement REST/gRPC endpoints for real-time inference
  • Configure autoscaling for variable load
  • Set up monitoring and observability with Prometheus/Grafana
  • Implement A/B testing and canary deployments
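To make the "REST endpoints for real-time inference" capability concrete, here is a minimal stdlib-only sketch of a prediction endpoint. The linear scorer stands in for a real model (in practice you would load one with e.g. `mlflow.pyfunc.load_model`), and the port and `/predict` route are illustrative choices, not part of the skill itself:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = [0.4, 0.6]

def predict(features):
    return sum(w * x for w, x in zip(WEIGHTS, features))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

def serve_in_background(port):
    """Run the HTTP server on a daemon thread and return it."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# Smoke-test the endpoint with one request.
server = serve_in_background(8901)
req = Request(
    "http://127.0.0.1:8901/predict",
    data=json.dumps({"features": [1.0, 2.0]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    print(json.loads(resp.read())["prediction"])  # ~1.6 (0.4*1.0 + 0.6*2.0)
server.shutdown()
```

A production deployment would replace `http.server` with the serving runtime the skill targets (BentoML service, MLflow scoring server, or a Seldon Core graph), but the request/response contract stays the same shape.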

Use Cases

  • Deploying trained models for real-time inference
  • Setting up prediction APIs
  • Implementing autoscaling for fluctuating demand
  • Running A/B tests between model versions
  • Migrating from batch to real-time inference
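The "A/B tests between model versions" use case typically needs sticky traffic splitting. A minimal pure-Python sketch, where the variant names and 90/10 split are illustrative assumptions:

```python
import hashlib

def route_variant(user_id, weights=None):
    """Deterministically assign a request to a model variant.

    Hashing the user id gives sticky assignment: the same user always
    hits the same variant, which keeps A/B comparisons clean. The
    variant names and 90/10 split below are illustrative only.
    """
    weights = weights or {"model_a": 0.9, "model_b": 0.1}
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = (int(digest, 16) % 10_000) / 10_000  # uniform in [0, 1)
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return variant
    return variant  # floating-point slack: fall through to last variant

print(route_variant("user-42") == route_variant("user-42"))  # True: sticky
```

In a Seldon Core or service-mesh deployment this split is usually expressed declaratively (e.g. traffic weights on the inference graph) rather than in application code, but the hash-bucket idea is the same.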

Non-Goals

  • Training or fine-tuning ML models
  • Managing MLflow tracking server or Kubernetes cluster infrastructure
  • Directly interacting with cloud provider deployment services (e.g., SageMaker, Vertex AI)
  • Performing deep model performance analysis or drift detection (beyond basic monitoring)

Documentation

  • info: Configuration & parameter reference. While the SKILL.md provides good procedural steps and examples, it does not explicitly document all configuration parameters, defaults, or precedence order for external tools like MLflow, BentoML, or Kubernetes.

Portability

  • warning: Structural assumption. The example Kubernetes YAML manifests and Dockerfiles assume a particular project structure and environment setup (e.g., 'your-registry/churn-classifier:v1.0', 'http://mlflow-server:5000') and may require significant adaptation by the user.
  • warning: Runtime stability. The skill relies heavily on specific tooling (MLflow, BentoML, Seldon Core, Kubernetes) and assumes it is present and properly configured; environments that differ significantly may be unstable.

Errors

  • info: Actionable error messages. While the SKILL.md outlines potential failures and recovery steps for specific deployment stages, it does not provide universally actionable error messages for every runtime issue in the deployed infrastructure.

Execution

  • warning: Pinned dependencies. The SKILL.md mentions dependencies for MLflow, BentoML, and Seldon Core, and its Dockerfiles include `pip install` commands, but versions are not always pinned and no lockfiles for Python dependencies are referenced, which could lead to dependency conflicts.

Practical Utility

  • warning: Edge cases. The SKILL.md lists common pitfalls and some recovery steps, but it does not systematically document all failure modes (e.g., dependency conflicts, network issues, specific Kubernetes errors) with clear symptoms and recovery paths.

Safety

  • info: Halt on unexpected state. The skill outlines potential failure points and recovery steps, but it does not explicitly require halting the workflow or reporting unexpected pre-state in a machine-readable checklist format.

Installation

/plugin install agent-almanac@pjt222-agent-almanac

Quality Score

94/100
Analyzed about 22 hours ago

Trust Signals

Last commit: 2 days ago
Stars: 14
License: MIT

Similar Extensions

Orchestrate ML Pipeline

99

Orchestrate end-to-end machine learning pipelines using Prefect or Airflow with DAG construction, task dependencies, retry logic, scheduling, monitoring, and integration with MLflow, DVC, and feature stores for production ML workflows. Use when automating multi-step ML workflows from data ingestion to deployment, scheduling periodic model retraining, coordinating distributed training tasks, or managing retry logic and failure recovery across pipeline stages.

Skill
pjt222

Monitor Model Drift

99

Implement comprehensive model drift monitoring using Evidently AI, statistical tests (PSI, KS), and custom metrics to detect data drift and concept drift in production ML systems. Set up automated alerting and reporting workflows to catch degradation before it impacts business metrics. Use when production models show unexplained performance degradation, when new data distributions differ from training data, when seasonal shifts affect input features, or when regulatory requirements mandate model monitoring.

Skill
pjt222

K8s Manifest Generator

100

Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.

Skill
wshobson

HF CLI

100

Hugging Face Hub CLI (`hf`) for downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, jobs, and more on the Hugging Face Hub. Use when: handling authentication; managing local cache; managing Hugging Face Buckets; running or scheduling jobs on Hugging Face infrastructure; managing Hugging Face repos; discussions and pull requests; browsing models, datasets and spaces; reading, searching, or browsing academic papers; managing collections; querying datasets; configuring spaces; setting up webhooks; or deploying and managing HF Inference Endpoints. Make sure to use this skill whenever the user mentions 'hf', 'huggingface', 'Hugging Face', 'huggingface-cli', or 'hugging face cli', or wants to do anything related to the Hugging Face ecosystem and to AI and ML in general. Also use for cloud storage needs like training checkpoints, data pipelines, or agent traces. Use even if the user doesn't explicitly ask for a CLI command. Replaces the deprecated `huggingface-cli`.

Skill
huggingface

Arize Experiment

100

Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.

Skill
github

Arize Evaluator

100

Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.

Skill
github