LangSmith Observability

Skill · Verified · Active

LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.

Purpose

Debug, evaluate, and monitor LLM applications using LangSmith's tracing, dataset, and monitoring features.

Features

  • LLM tracing for inputs, outputs, and latency
  • Systematic model evaluation against datasets
  • Production system monitoring for metrics and errors
  • Integration with OpenAI, Anthropic, LangChain, LlamaIndex
  • Client API for programmatic interaction with LangSmith
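As a rough illustration of the tracing feature listed above, the sketch below wraps a function with the `traceable` decorator from the `langsmith` Python SDK. The `summarize` function and its truncation logic are hypothetical stand-ins for a real model call, and the no-op fallback is only there so the snippet still runs when the SDK is not installed. Runs are sent to LangSmith only when tracing is enabled through the environment variables described under the prerequisites.

```python
try:
    # Decorator provided by the langsmith SDK (pip install langsmith).
    from langsmith import traceable
except ImportError:
    # Fallback so the sketch runs without the SDK: a decorator that does nothing.
    def traceable(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]

        def wrap(fn):
            return fn

        return wrap


@traceable(name="summarize")
def summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


if __name__ == "__main__":
    print(summarize("Hello world. More text."))
```

With tracing enabled, each call to `summarize` would be recorded as a run with its inputs, output, and latency; with tracing disabled, the function simply behaves like the undecorated version.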

Use cases

  • Debugging LLM application issues
  • Evaluating model outputs against datasets
  • Monitoring production LLM systems
  • Building regression testing pipelines for AI applications
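The regression-testing use case above can be pictured with a small local sketch. This is not the LangSmith evaluation API: `exact_match`, `run_regression`, `toy_model`, and the three-example dataset are all hypothetical names invented for illustration, showing only the general shape of scoring a target function against a dataset and gating on a threshold.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def exact_match(predicted: str, expected: str) -> float:
    """Score 1.0 when the answers match, ignoring case and outer whitespace."""
    return 1.0 if predicted.strip().lower() == expected.strip().lower() else 0.0


def run_regression(
    target: Callable[[str], str],
    dataset: List[Example],
    threshold: float = 0.8,
) -> dict:
    """Run the target on every example and compare the mean score to a threshold."""
    scores = [exact_match(target(ex["input"]), ex["expected"]) for ex in dataset]
    mean = sum(scores) / len(scores) if scores else 0.0
    return {"mean_score": mean, "passed": mean >= threshold, "n": len(scores)}


def toy_model(question: str) -> str:
    """Hypothetical stand-in for the LLM application under test."""
    answers = {"capital of france?": "Paris", "2 + 2?": "4"}
    return answers.get(question.lower(), "unknown")


DATASET = [
    {"input": "Capital of France?", "expected": "paris"},
    {"input": "2 + 2?", "expected": "4"},
    {"input": "Largest planet?", "expected": "Jupiter"},
]

report = run_regression(toy_model, DATASET, threshold=0.6)
print(report)
```

In a real pipeline the dataset would live in LangSmith and the evaluator would run through its SDK; the threshold check is the part that turns an evaluation into a regression gate, for example in CI.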

Non-goals

  • General deep learning experiment tracking (use Weights & Biases)
  • General ML lifecycle management (use MLflow)
  • ML monitoring focused on data drift (use Arize/WhyLabs)

Practices

  • LLM Observability
  • LLM Evaluation
  • LLM Monitoring
  • LLM Tracing
  • LLMOps

Prerequisites

  • Python 3.7+
  • LangSmith account and API key
  • Set LANGSMITH_API_KEY and LANGSMITH_TRACING environment variables
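A quick preflight check for the prerequisites above might look like the following sketch. The helper name `missing_prerequisites` is hypothetical; it only verifies that the Python version and the two environment variables named in the list are present, and it does not validate the key against LangSmith. Setting `LANGSMITH_TRACING` to `true` is the value assumed here for enabling tracing.

```python
import os
import sys

# Environment variables named in the prerequisites list.
REQUIRED_VARS = ("LANGSMITH_API_KEY", "LANGSMITH_TRACING")


def missing_prerequisites(env=None, version=None):
    """Return a list of problems; an empty list means the prerequisites are met."""
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if tuple(version[:2]) < (3, 7):
        problems.append("Python 3.7+ is required")
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Passing `env` and `version` explicitly keeps the check easy to test without touching the real environment.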

Execution

  • Info (pinned dependencies): Dependencies are listed in SKILL.md but are not pinned to specific versions in a lockfile, which could lead to compatibility issues.

Installation

Add the marketplace first, then install the plugin:

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

Quality score

Verified
99/100
Analyzed 1 day ago

Trust signals

Last commit: 17 days ago
Stars: 8.3k
License: MIT

Similar extensions

Playwright Best Practices

100

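Use when writing Playwright tests, fixing flaky tests, debugging failures, implementing the Page Object Model, configuring CI/CD, optimizing performance, mocking APIs, handling authentication or OAuth, testing accessibility (axe-core), file uploads/downloads, date/time mocking, WebSockets, geolocation, permissions, multi-tab/popup flows, mobile/responsive layouts, touch gestures, GraphQL, error handling, offline mode, multi-user collaboration, third-party services (payments, email verification), console error monitoring, global setup/teardown, test annotations (skip, fixme, slow), test tags (@smoke, @fast, @critical, filtered with --grep), project dependencies, security testing (XSS, CSRF, authentication), performance budgets (Web Vitals, Lighthouse), iframes, component testing, canvas/WebGL, service workers/PWA, test coverage, i18n/localization, Electron apps, or browser extension testing. Covers E2E, component, API, visual, accessibility, security, Electron, and extension testing.

Skill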
currents-dev

Status

100

Show DAG state, agent progress, and branch status for an AgentHub session.

Skill
alirezarezvani

Observability Designer

100

Observability Designer (POWERFUL)

Skill
alirezarezvani

Grafana Dashboards

99

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

Skill
wshobson

Monitor Stream

99

Stream live swarm events using the Monitor tool for real-time observability

Skill
ruvnet

Instrument Distributed Tracing

99

Instrument applications with OpenTelemetry for distributed tracing, including auto and manual instrumentation, context propagation, sampling strategies, and integration with Jaeger or Tempo. Use when debugging latency issues in distributed systems, understanding request flow across microservices, correlating traces with logs and metrics for root cause analysis, measuring end-to-end latency, or migrating from legacy tracing systems to OpenTelemetry.

Skill
pjt222