Weights And Biases
Skill · Verified · Active
Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform
Provides a seamless way to use the MLOps capabilities of Weights & Biases for tracking, visualizing, and managing machine learning experiments and models.
Features
- Track ML experiments with automatic metric logging (see the sketch after this list)
- Visualize training in real-time dashboards
- Optimize hyperparameters with automated sweeps
- Manage model registry with versioning and lineage
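As a minimal sketch of what the tracking workflow looks like with the `wandb` Python client (the project name, config values, and metric names here are illustrative, not part of the skill):

```python
import wandb

# Start a tracked run; project name and config values are illustrative.
run = wandb.init(project="demo-project", config={"lr": 1e-3, "epochs": 5})

for epoch in range(run.config.epochs):
    # Placeholder for a real training step.
    train_loss = 1.0 / (epoch + 1)
    # Each call streams metrics to the run's real-time dashboard.
    wandb.log({"epoch": epoch, "train_loss": train_loss})

run.finish()
```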
Use cases
- Use when you need to systematically track and compare ML model training runs.
- Use when you want to automate hyperparameter optimization for your models (a sweep sketch follows this list).
- Use when you need a centralized platform for managing model versions and artifacts (an artifact sketch appears after the non-goals).
- Use when collaborating with a team on ML projects and sharing experiment results.
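For the hyperparameter-optimization use case, a sweep is typically defined as a configuration dict and driven by an agent. This is a hedged sketch assuming the standard `wandb.sweep`/`wandb.agent` entry points; the search method, metric, and parameter range are illustrative:

```python
import wandb

def train():
    # The sweep agent starts the run and injects sampled
    # hyperparameters into run.config.
    run = wandb.init()
    lr = run.config.lr
    # Placeholder objective; a real run would train and validate a model.
    val_loss = lr * 10
    wandb.log({"val_loss": val_loss})

# Sweep configuration; values here are illustrative.
sweep_config = {
    "method": "bayes",  # alternatives: "grid", "random"
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {"lr": {"min": 0.00001, "max": 0.1}},
}

sweep_id = wandb.sweep(sweep_config, project="demo-project")
wandb.agent(sweep_id, function=train, count=10)  # run 10 trials
```

The agent calls `train` repeatedly, sampling a new `lr` for each trial according to the chosen search method.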
Non-goals
- Does not replace the core ML training frameworks (e.g., TensorFlow, PyTorch) but integrates with them.
- Does not handle the actual execution of ML training jobs; it focuses on logging and management.
- Does not provide infrastructure for distributed training or deployment (though it integrates with platforms that do).
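To illustrate the versioning-and-lineage side that the model registry builds on, here is a minimal sketch using the `wandb.Artifact` API; the artifact name and file path are illustrative stand-ins:

```python
from pathlib import Path
import wandb

run = wandb.init(project="demo-project", job_type="upload-model")

# Illustrative stand-in for a real trained model file.
Path("model.pt").write_bytes(b"placeholder weights")

# Package the file as a versioned artifact; each log_artifact call creates
# a new version (v0, v1, ...) with lineage back to this run.
artifact = wandb.Artifact(name="demo-model", type="model")
artifact.add_file("model.pt")
run.log_artifact(artifact)

run.finish()
```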
Execution
- Pinned dependencies: the skill documentation specifies `wandb` as a dependency, but there is no explicit mention of pinned versions or lockfiles within the skill's context; the `wandb` package itself likely manages its own dependencies.
Installation
First, add the marketplace:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills
Similar extensions
- Weights And Biases (score 95): Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform
- Mlflow (score 98): Track ML experiments, manage model registry with versioning, deploy models to production, and reproduce experiments with MLflow - framework-agnostic ML lifecycle platform
- MLflow (score 96): Track ML experiments, manage model registry with versioning, deploy models to production, and reproduce experiments with MLflow - framework-agnostic ML lifecycle platform
- Hf Cli (score 100): Hugging Face Hub CLI (`hf`) for downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, jobs, and more on the Hugging Face Hub. Use when: handling authentication; managing local cache; managing Hugging Face Buckets; running or scheduling jobs on Hugging Face infrastructure; managing Hugging Face repos; discussions and pull requests; browsing models, datasets and spaces; reading, searching, or browsing academic papers; managing collections; querying datasets; configuring spaces; setting up webhooks; or deploying and managing HF Inference Endpoints. Make sure to use this skill whenever the user mentions 'hf', 'huggingface', 'Hugging Face', 'huggingface-cli', or 'hugging face cli', or wants to do anything related to the Hugging Face ecosystem and to AI and ML in general. Also use for cloud storage needs like training checkpoints, data pipelines, or agent traces. Use even if the user doesn't explicitly ask for a CLI command. Replaces the deprecated `huggingface-cli`.
- Arize Experiment (score 100): Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.
- Arize Evaluator (score 100): Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.