Arize Trace Skill

Downloads, exports, and inspects existing Arize traces and spans to understand what an LLM app is doing or to debug runtime issues. Covers exporting traces, spans, and sessions by ID, and root-cause investigation using the ax CLI. Use when the user wants to look at existing trace data, see what their LLM app is doing, export traces, download spans, investigate errors, or analyze behavior regressions.

Purpose

To enable users to inspect and debug their LLM applications by exporting and analyzing trace and span data from Arize.

Features

Export Arize traces by ID
Export Arize spans by ID
Export Arize sessions by ID
Investigate root causes using the ax CLI
Filter trace and span data

Use Cases

Looking at existing trace data
Seeing what an LLM app is doing
Exporting traces for offline analysis
Investigating runtime errors
Analyzing behavior regressions

Non-Goals

Modifying Arize data
Real-time monitoring of live traces
Configuring Arize itself

Workflow

Confirm that the task calls for inspecting existing Arize traces or spans.
Determine the appropriate `ax` command (e.g., `spans export`, `traces export`) based on the data needed.
Construct the command with necessary arguments (PROJECT, IDs, filters, time ranges).
Execute the command, handling authentication and profile setup as needed.
Analyze the exported JSON data for debugging or understanding application behavior.
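
The final step, analyzing the exported JSON, can be sketched in Python. This is a minimal sketch, not the skill's own tooling: it assumes the export is a JSON array of span objects, and the field names `trace_id`, `span_id`, `name`, `status_code`, and `latency_ms` are hypothetical stand-ins that should be adjusted to whatever the real export contains.

```python
import json
from collections import defaultdict

# Hypothetical export payload. The field names below are assumptions for
# illustration; check a real export and rename them as needed.
SAMPLE_EXPORT = """
[
  {"trace_id": "t1", "span_id": "s1", "name": "chat", "status_code": "OK", "latency_ms": 120},
  {"trace_id": "t1", "span_id": "s2", "name": "retrieve", "status_code": "OK", "latency_ms": 45},
  {"trace_id": "t2", "span_id": "s3", "name": "chat", "status_code": "ERROR", "latency_ms": 900}
]
"""


def load_spans(text):
    """Parse exported JSON text into a list of span dicts."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of spans")
    return data


def group_by_trace(spans):
    """Map each trace ID to the spans that belong to it."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.get("trace_id", "unknown")].append(span)
    return dict(traces)


def error_spans(spans):
    """Return spans whose status marks them as failed."""
    return [s for s in spans if str(s.get("status_code", "")).upper() == "ERROR"]


def slowest_span(spans):
    """Return the span with the highest latency, or None if there are no spans."""
    return max(spans, key=lambda s: s.get("latency_ms", 0), default=None)


if __name__ == "__main__":
    spans = load_spans(SAMPLE_EXPORT)
    for trace_id, members in group_by_trace(spans).items():
        failed = [s["span_id"] for s in error_spans(members)]
        print(f"{trace_id}: {len(members)} spans, errors: {failed}")
```

Grouping by trace first keeps a failing span next to its siblings, which is usually where a root-cause investigation starts.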

Prerequisites

Requires the ax CLI
Requires a configured Arize profile
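
A quick pre-flight check for the first prerequisite might look like the following. This sketch only assumes the CLI is installed as an executable named `ax` on the PATH; it does not verify the Arize profile, since how profiles are stored is not described here. The helper names `find_cli` and `describe_cli` are hypothetical.

```python
import shutil


def find_cli(name="ax"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


def describe_cli(name="ax"):
    """Build a human-readable status line for the CLI lookup."""
    path = find_cli(name)
    if path is None:
        return f"{name}: not found on PATH"
    return f"{name}: found at {path}"


if __name__ == "__main__":
    print(describe_cli())
```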

Installation

npx skills add github/awesome-copilot

Runs the Vercel skills CLI (skills.sh) via npx — needs Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

Verified

99/100

Analyzed about 21 hours ago

Trust Signals

Last commit: 1 day ago

GitHub owner: github

Stars: 32.9k

License: MIT

Website: awesome-copilot.github.com

Similar Extensions

Arize Link

Generates deep links to the Arize UI for traces, spans, sessions, datasets, labeling queues, evaluators, and annotation configs. Produces clickable URLs for sharing Arize resources with team members. Use when the user wants to link to or open a trace, span, session, dataset, evaluator, or annotation config in the Arize UI.

Skill

github

LangSmith Observability

LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.

Skill

Orchestra-Research

Arize Instrumentation

Adds Arize AX tracing to an LLM application for the first time. Follows a two-phase agent-assisted flow to analyze the codebase then implement instrumentation after user confirmation. Use when the user wants to instrument their app, add tracing from scratch, set up LLM observability, integrate OpenTelemetry or openinference, or get started with Arize tracing.

Skill

github

Arize Prompt Optimization

100

Optimizes, improves, and debugs LLM prompts using production trace data, evaluations, and annotations. Extracts prompts from spans, gathers performance signal, and runs a data-driven optimization loop using the ax CLI. Use when the user mentions optimize prompt, improve prompt, make AI respond better, improve output quality, prompt engineering, prompt tuning, or system prompt improvement.

Skill

github

Arize Experiment

100

Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.

Skill

github

Arize Evaluator

100

Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.

Skill

github