
Arize Trace Skill

Skill · Verified · Active

Downloads, exports, and inspects existing Arize traces and spans to understand what an LLM app is doing or debug runtime issues. Covers exporting traces by ID, spans by ID, sessions by ID, and root-cause investigation using the ax CLI. Use when the user wants to look at existing trace data, see what their LLM app is doing, export traces, download spans, investigate errors, or analyze behavior regressions.

Purpose

To enable users to inspect and debug their LLM applications by exporting and analyzing trace and span data from Arize.

Features

  • Export Arize traces by ID
  • Export Arize spans by ID
  • Export Arize sessions by ID
  • Investigate root causes using ax CLI
  • Filter trace and span data

Use Cases

  • Looking at existing trace data
  • Seeing what an LLM app is doing
  • Exporting traces for offline analysis
  • Investigating runtime errors
  • Analyzing behavior regressions

Non-Goals

  • Modifying Arize data
  • Real-time monitoring of live traces
  • Configuring Arize itself

Workflow

  1. Identify the need to inspect Arize traces/spans.
  2. Determine the appropriate `ax` command (e.g., `spans export`, `traces export`) based on the data needed.
  3. Construct the command with necessary arguments (PROJECT, IDs, filters, time ranges).
  4. Execute the command, handling authentication and profile setup as needed.
  5. Analyze the exported JSON data for debugging or understanding application behavior.
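Step 5 can be sketched in Python as follows. This is a minimal sketch, not the skill's own method: it assumes the export is either a JSON array or JSON Lines, and that each span carries a `context.trace_id` field and a `status_code` field whose error value is `ERROR`. Those field names are assumptions, so check them against an actual export and adjust the keyword arguments if they differ. The helper names `load_spans` and `error_spans_by_trace` are hypothetical.

```python
import json
import sys
from collections import defaultdict


def load_spans(text):
    """Parse exported text as a JSON array, a single JSON object, or JSON Lines."""
    text = text.strip()
    if not text:
        return []
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return data if isinstance(data, list) else [data]


def error_spans_by_trace(spans, trace_key="context.trace_id", status_key="status_code"):
    """Group spans whose status is ERROR by trace ID.

    The default key names are assumptions about the export schema.
    """
    grouped = defaultdict(list)
    for span in spans:
        if str(span.get(status_key, "")).upper() == "ERROR":
            grouped[span.get(trace_key, "unknown")].append(span)
    return dict(grouped)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python inspect_spans.py EXPORTED_FILE")
    with open(sys.argv[1], encoding="utf-8") as handle:
        exported = load_spans(handle.read())
    for trace_id, failed in error_spans_by_trace(exported).items():
        names = ", ".join(str(s.get("name", "?")) for s in failed)
        print(f"{trace_id}: {len(failed)} error span(s): {names}")
```

Grouping by trace first makes it easier to see whether failures cluster in a few traces or are spread across many, which is usually the first question in a root-cause investigation.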

Prerequisites

  • Requires the ax CLI
  • Requires a configured Arize profile
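The first prerequisite can be checked before any export is attempted. The sketch below only tests whether an executable named `ax` is on `PATH`; it does not verify that an Arize profile is configured, since how profiles are stored is not described here. The function name `describe_cli` is hypothetical.

```python
import shutil


def describe_cli(name="ax", which=shutil.which):
    """Report whether the named CLI can be resolved on PATH.

    `which` is injectable so the check can be exercised without the CLI installed.
    """
    path = which(name)
    if path is None:
        return f"{name} not found on PATH; install it before exporting traces"
    return f"{name} found at {path}"


if __name__ == "__main__":
    print(describe_cli())
```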

Installation

npx skills add github/awesome-copilot

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

Verified · 99/100 · Analyzed about 21 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 32.9k
License: MIT

Similar Extensions

Arize Link

99

Generates deep links to the Arize UI for traces, spans, sessions, datasets, labeling queues, evaluators, and annotation configs. Produces clickable URLs for sharing Arize resources with team members. Use when the user wants to link to or open a trace, span, session, dataset, evaluator, or annotation config in the Arize UI.

Skill
github

LangSmith Observability

99

LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.

Skill
Orchestra-Research

Arize Instrumentation

95

Adds Arize AX tracing to an LLM application for the first time. Follows a two-phase agent-assisted flow to analyze the codebase then implement instrumentation after user confirmation. Use when the user wants to instrument their app, add tracing from scratch, set up LLM observability, integrate OpenTelemetry or openinference, or get started with Arize tracing.

Skill
github

Arize Prompt Optimization

100

Optimizes, improves, and debugs LLM prompts using production trace data, evaluations, and annotations. Extracts prompts from spans, gathers performance signal, and runs a data-driven optimization loop using the ax CLI. Use when the user mentions optimize prompt, improve prompt, make AI respond better, improve output quality, prompt engineering, prompt tuning, or system prompt improvement.

Skill
github

Arize Experiment

100

Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. Use when the user mentions create experiment, run experiment, compare models, model performance, evaluate AI, experiment results, benchmark, A/B test models, or measure accuracy.

Skill
github

Arize Evaluator

100

Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks, trigger-run operations, column mapping, and continuous monitoring. Use when the user mentions create evaluator, LLM judge, hallucination, faithfulness, correctness, relevance, run eval, score spans, score experiment, trigger-run, column mapping, continuous monitoring, or improve evaluator prompt.

Skill
github

© 2025 SkillRepo · Find the right skill, skip the noise.