SkyPilot Multi-Cloud Orchestration
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.
Goal: run ML training and batch jobs efficiently and cost-effectively across diverse cloud environments by using SkyPilot's multi-cloud orchestration capabilities.
Features
- Multi-cloud orchestration for ML workloads
- Automatic cost optimization across providers
- Leverage spot instances with auto-recovery
- Unified interface for 20+ cloud providers
- Managed jobs with checkpointing and fault tolerance
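As a rough illustration of how the features above are usually expressed, SkyPilot tasks are described in a small YAML file. The sketch below renders such a file from Python using the `resources.accelerators`, `resources.use_spot`, `num_nodes`, and `run` fields. The accelerator choice, the `train.py` script, and its `--resume-from` flag are hypothetical placeholders, not part of this skill.

```python
def render_task_yaml(accelerators: str, use_spot: bool, num_nodes: int, run: str) -> str:
    """Render a minimal SkyPilot-style task YAML as plain text."""
    lines = [
        "resources:",
        f"  accelerators: {accelerators}",
        f"  use_spot: {str(use_spot).lower()}",  # YAML booleans are lowercase
        f"num_nodes: {num_nodes}",
        "run: |",
    ]
    # Indent each command so it sits inside the YAML block scalar.
    lines += [f"  {cmd}" for cmd in run.splitlines()]
    return "\n".join(lines) + "\n"


# Hypothetical training command; resuming from a checkpoint directory is what
# lets a preempted spot job pick up where it left off.
task_yaml = render_task_yaml(
    accelerators="A100:8",
    use_spot=True,
    num_nodes=1,
    run="python train.py --resume-from /checkpoints",
)
print(task_yaml)
```

Saved as, say, `task.yaml`, a file like this would typically be submitted with `sky launch task.yaml`, or as a managed job with `sky jobs launch task.yaml` in recent SkyPilot releases. Check the CLI of the installed version before relying on either command.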
Use cases
- Running training or batch jobs across AWS, GCP, Azure, and Kubernetes.
- Optimizing GPU costs by selecting the cheapest cloud or region.
- Utilizing spot instances for cost savings with automatic recovery.
- Managing distributed multi-node training jobs.
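The cost-optimization use case above amounts to comparing equivalent offers across clouds and regions and picking the cheapest one that meets the constraints. The following is a simplified sketch of that idea only. It is not SkyPilot's optimizer or API, and the `Offer` type, the `cheapest` helper, and all prices are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    cloud: str
    region: str
    hourly_usd: float  # placeholder price, not real market data
    spot: bool


def cheapest(offers, allow_spot=True):
    """Return the lowest-priced offer, optionally excluding spot capacity."""
    candidates = [o for o in offers if allow_spot or not o.spot]
    if not candidates:
        raise ValueError("no offers match the constraints")
    return min(candidates, key=lambda o: o.hourly_usd)


# Hypothetical offers for the same GPU configuration.
OFFERS = [
    Offer("aws", "us-east-1", 32.0, spot=False),
    Offer("gcp", "us-central1", 29.0, spot=False),
    Offer("azure", "eastus", 30.0, spot=False),
    Offer("gcp", "us-central1", 11.0, spot=True),
]

best = cheapest(OFFERS)
best_on_demand = cheapest(OFFERS, allow_spot=False)
print(best.cloud, best.region, best.hourly_usd, "spot" if best.spot else "on-demand")
```

Spot capacity is usually the cheapest option but can be preempted, which is why this use case pairs with automatic recovery and checkpointing.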
Non-goals
- Acting as a direct interface to individual cloud provider SDKs.
- Replacing Kubernetes for existing K8s infrastructure management.
- Handling simpler serverless GPU needs that are better suited to tools like Modal.
Installation
First, add the marketplace
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
Then install the plugin
/plugin install AI-Research-SKILLs@ai-research-skills
Similar extensions
Orchestrate Ml Pipeline (score: 99)
Orchestrate end-to-end machine learning pipelines using Prefect or Airflow with DAG construction, task dependencies, retry logic, scheduling, monitoring, and integration with MLflow, DVC, and feature stores for production ML workflows. Use when automating multi-step ML workflows from data ingestion to deployment, scheduling periodic model retraining, coordinating distributed training tasks, or managing retry logic and failure recovery across pipeline stages.
Cost Optimization (score: 98)
Optimize cloud costs across AWS, Azure, GCP, and OCI through resource rightsizing, tagging strategies, reserved instances, and spending analysis. Use when reducing cloud expenses, analyzing infrastructure costs, or implementing cost governance policies.
Skypilot Multi Cloud Orchestration (score: 95)
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.
Janitor Tokens (score: 100)
Shows how many tokens each skill consumes in the context window. Use this when the user asks about token costs, budget, capacity, or which skills waste the most context space.
Cloud Architect (score: 100)
Designs cloud architectures, creates migration plans, generates cost optimization recommendations, and produces disaster recovery strategies across AWS, Azure, and GCP. Use when designing cloud architectures, planning migrations, or optimizing multi-cloud deployments. Invoke for Well-Architected Framework, cost optimization, disaster recovery, landing zones, security architecture, serverless design.
K8s Manifest Generator (score: 100)
Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.