Arize Annotation
Skill · Verified · Active
Creates and manages annotation configs (categorical, continuous, freeform label schemas) and annotation queues (human review workflows) on Arize. Applies human annotations to project spans via the Python SDK. Use when the user mentions annotation config, annotation queue, label schema, human feedback, bulk annotate spans, update_annotations, labeling queue, annotate record, or human review.
Purpose: streamline the management of data annotation workflows and configurations within the Arize platform.
Features
- Manage annotation configurations (categorical, continuous, freeform)
- Create and manage annotation queues for human review
- Apply human annotations to project spans via Python SDK
- Bulk update annotations for dataset examples and experiment records
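The three config types listed above can be pictured as small schemas that each accept a different kind of value. The following is a minimal sketch in plain Python, not the Arize SDK: the class names, field names, and the `is_valid` helper are all hypothetical and exist only to illustrate how a categorical, continuous, or freeform schema constrains an annotation.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical illustration types; these are NOT Arize SDK classes.
@dataclass(frozen=True)
class CategoricalConfig:
    name: str
    labels: tuple[str, ...]  # the fixed set of allowed label values


@dataclass(frozen=True)
class ContinuousConfig:
    name: str
    min_score: float
    max_score: float


@dataclass(frozen=True)
class FreeformConfig:
    name: str


AnnotationConfig = Union[CategoricalConfig, ContinuousConfig, FreeformConfig]


def is_valid(config: AnnotationConfig, value: object) -> bool:
    """Return True if value fits the schema described by config."""
    if isinstance(config, CategoricalConfig):
        return value in config.labels
    if isinstance(config, ContinuousConfig):
        # bool is a subclass of int, so it is excluded explicitly.
        return (
            isinstance(value, (int, float))
            and not isinstance(value, bool)
            and config.min_score <= value <= config.max_score
        )
    # Freeform: accept any non-empty text.
    return isinstance(value, str) and bool(value.strip())
```

For example, `is_valid(CategoricalConfig("quality", ("good", "bad")), "good")` is true, while a score of 1.5 against `ContinuousConfig("score", 0.0, 1.0)` is rejected.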
Use cases
- When setting up a new data labeling schema in Arize.
- When routing data to human review via an annotation queue.
- When applying bulk human annotations to existing project spans.
- When managing label schema types and values for machine learning projects.
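For the bulk-annotation use case, the preparation step usually amounts to flattening span-to-label pairs into records and sending them in chunks. The sketch below shows only that preparation in plain Python. The record keys (`span_id`, `annotation_name`, `label`) and both helper functions are hypothetical; the actual upload call and the column names it expects are not shown here and should be taken from the Arize SDK documentation.

```python
from typing import Iterator


def build_rows(span_labels: dict[str, str], annotation_name: str) -> list[dict[str, str]]:
    """Turn {span_id: label} into one flat record per span.

    The key names are placeholders, not the SDK's real column names.
    """
    return [
        {"span_id": span_id, "annotation_name": annotation_name, "label": label}
        for span_id, label in span_labels.items()
    ]


def batched(rows: list[dict[str, str]], size: int) -> Iterator[list[dict[str, str]]]:
    """Yield consecutive chunks of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each chunk produced by `batched(build_rows(...), size)` could then be handed to whatever upload call the SDK provides, which keeps a single failed request from invalidating the whole set.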
Non-goals
- Performing automated quality checks on annotations (use arize-evaluator).
- Managing Arize datasets or experiments directly (use arize-dataset/arize-experiment).
- Interacting with Arize beyond annotation and labeling workflows.
Installation
npx skills add github/awesome-copilot
Runs the Vercel skills CLI (skills.sh) via npx; requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Quality score
Verified
Trust signals
Similar extensions
Label Training Data
98
Set up systematic data labeling workflows using Label Studio or similar tools. Implement quality controls, measure inter-annotator agreement, manage labeler teams, and integrate labeled data into ML training pipelines. Use when starting a supervised ML project that requires labeled training data, when model performance is limited by insufficient labeled examples, when labeling text, images, audio, or video, or when implementing active learning to prioritize the most valuable examples.
Arize Experiment
100
Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.
Arize Evaluator
100
Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.
Arize Dataset
100
Creates, manages, and queries Arize datasets and examples. Covers dataset CRUD, appending examples, exporting data, and file-based dataset creation using the ax CLI. Use when the user needs test data, evaluation examples, or mentions create dataset, list datasets, export dataset, append examples, dataset version, golden dataset, or test set.
Annotate Source Files
100
Add PUT workflow annotations to source files using the correct language-specific comment prefix. Covers annotation syntax, skeleton generation via put_generate(), multiline annotations, .internal variables, and validation. Supports 30+ languages with automatic comment prefix detection. Use after analyzing a codebase and having an annotation plan, when adding workflow documentation to new or existing source files, or when documenting data pipelines, ETL processes, or multi-step computations.
Arize Prompt Optimization
100
Optimizes, improves, and debugs LLM prompts using production trace data, evaluations, and annotations. Extracts prompts from spans, gathers performance signal, and runs a data-driven optimization loop using the ax CLI. Use when the user mentions optimize prompt, improve prompt, make AI respond better, improve output quality, prompt engineering, prompt tuning, or system prompt improvement.