LangSmith Observability
Skill (Verified, Active)
LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.
Purpose: provide a platform for debugging, evaluating, and monitoring LLM applications using LangSmith's tracing, dataset, and monitoring features.
Features
- LLM tracing for inputs, outputs, and latency
- Systematic model evaluation against datasets
- Production system monitoring for metrics and errors
- Integration with OpenAI, Anthropic, LangChain, LlamaIndex
- Client API for programmatic interaction with LangSmith
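As a minimal sketch of the tracing feature, the snippet below decorates two functions with the `traceable` decorator from the `langsmith` SDK so that their inputs, outputs, and latency can be recorded as nested runs. The function names (`build_prompt`, `answer_question`) and the placeholder "model call" are hypothetical and only serve the illustration. The no-op fallback is also an assumption, added so the sketch still runs when the SDK is not installed.

```python
try:
    # Decorator provided by the langsmith SDK.
    from langsmith import traceable
except ImportError:
    # Assumption for this sketch only: a no-op stand-in so the example
    # runs without the SDK. It supports both @traceable and @traceable(...).
    def traceable(*dargs, **dkwargs):
        if len(dargs) == 1 and callable(dargs[0]) and not dkwargs:
            return dargs[0]

        def wrap(fn):
            return fn

        return wrap


@traceable(name="build_prompt")
def build_prompt(question: str) -> str:
    """Hypothetical helper; it shows up as a child run of its caller."""
    return f"Answer concisely: {question}"


@traceable(run_type="chain", name="answer_question")
def answer_question(question: str) -> dict:
    """Hypothetical pipeline step; a real app would call a model here."""
    prompt = build_prompt(question)
    completion = prompt.upper()  # placeholder for a real LLM call
    return {"prompt": prompt, "completion": completion}


if __name__ == "__main__":
    print(answer_question("What is tracing?"))
```

The decorated functions behave the same whether or not tracing is enabled, so instrumentation can stay in place across environments.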
Use cases
- Debugging LLM application issues
- Evaluating model outputs against datasets
- Monitoring production LLM systems
- Building regression testing pipelines for AI applications
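The regression-testing use case can be sketched locally as follows. This is a plain-Python stand-in, not LangSmith's own evaluation API: the dataset is a hypothetical in-memory list, `target` is a toy application, and `run_regression` is an illustrative helper. In a real setup the dataset would live in LangSmith and the loop would be driven by the SDK's evaluation tooling.

```python
from typing import Callable, Dict, List

# Hypothetical dataset: each example pairs inputs with reference outputs.
DATASET: List[Dict[str, dict]] = [
    {"inputs": {"question": "2 + 2"}, "outputs": {"answer": "4"}},
    {"inputs": {"question": "3 * 3"}, "outputs": {"answer": "9"}},
]


def target(inputs: dict) -> dict:
    """Toy stand-in for the application under test ("a + b" or "a * b")."""
    a, op, b = inputs["question"].split()
    result = int(a) + int(b) if op == "+" else int(a) * int(b)
    return {"answer": str(result)}


def exact_match(outputs: dict, reference_outputs: dict) -> bool:
    """Evaluator: True when the answer equals the reference answer."""
    return outputs["answer"].strip() == reference_outputs["answer"].strip()


def run_regression(
    target_fn: Callable[[dict], dict],
    dataset: List[Dict[str, dict]],
    evaluator: Callable[[dict, dict], bool],
    threshold: float = 1.0,
) -> dict:
    """Run the target over every example and compare accuracy to a threshold."""
    scores = [evaluator(target_fn(ex["inputs"]), ex["outputs"]) for ex in dataset]
    accuracy = sum(scores) / len(scores)
    return {"accuracy": accuracy, "passed": accuracy >= threshold}


if __name__ == "__main__":
    print(run_regression(target, DATASET, exact_match))
```

Keeping the evaluator a pure function of outputs and reference outputs makes it easy to reuse the same check in a CI job and in ad hoc debugging.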
Non-goals
- General deep learning experiment tracking (use Weights & Biases)
- General ML lifecycle management (use MLflow)
- ML monitoring focused on data drift (use Arize/WhyLabs)
Practices
- LLM Observability
- LLM Evaluation
- LLM Monitoring
- LLM Tracing
- LLMOps
Prerequisites
- Python 3.7+
- LangSmith account and API key
- Set LANGSMITH_API_KEY and LANGSMITH_TRACING environment variables
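A small preflight check for the environment variables above might look like this. It is a sketch under two assumptions: the helper names are hypothetical, and tracing is treated as enabled when `LANGSMITH_TRACING` is set to `true`.

```python
import os

# The two variables named in the prerequisites.
REQUIRED_VARS = ("LANGSMITH_API_KEY", "LANGSMITH_TRACING")


def missing_langsmith_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def tracing_enabled(env=None):
    """Assumption: tracing counts as enabled when the flag equals 'true'."""
    env = os.environ if env is None else env
    return env.get("LANGSMITH_TRACING", "").strip().lower() == "true"


if __name__ == "__main__":
    missing = missing_langsmith_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Tracing enabled:", tracing_enabled())
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.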
Execution
- Info: Pinned dependencies. Dependencies are listed in SKILL.md but are not pinned to exact versions in a lockfile, which could lead to compatibility issues.
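One way to address the unpinned dependencies is to record the versions that are currently installed and working. The sketch below uses the standard library's `importlib.metadata` (available from Python 3.8) to emit `name==version` lines. The helper name `pin_lines` and the package list passed to it are assumptions, since the actual dependency list in SKILL.md is not shown here.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_lines(packages):
    """Build requirements-style lines pinned to the installed versions."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Keep a visible marker rather than silently skipping the package.
            lines.append(f"# {name}: not installed")
    return lines


if __name__ == "__main__":
    # Hypothetical package list; replace with the entries from SKILL.md.
    for line in pin_lines(["langsmith"]):
        print(line)
```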
Installation
First add the marketplace, then install the plugin:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills