Instructor
Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor, a battle-tested structured-output library.
Instructor's goal is to reliably extract and validate structured data from LLM responses, simplifying complex data-processing tasks and improving the accuracy of LLM outputs.
Features
- Extract structured data with Pydantic validation
- Automatic retries on extraction failures
- Parse complex JSON with type safety
- Stream partial results for real-time processing
- Support for multiple LLM providers (Anthropic, OpenAI, local models)
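The Pydantic side of this workflow can be illustrated without an LLM call: a model declares the expected fields, and JSON (here hard-coded in place of a model response) is parsed and type-checked against it. This is a minimal sketch assuming Pydantic v2 (`model_validate_json`); the `User` model is illustrative, not part of the skill.

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int

# With Instructor, this JSON would come from the LLM; here it is hard-coded.
user = User.model_validate_json('{"name": "Ada", "age": 36}')
print(user.age + 1)  # age is already an int, no manual casting needed

try:
    # A response missing a required field fails validation with a clear error.
    User.model_validate_json('{"name": "Ada"}')
except ValidationError as err:
    print("validation failed:", err.errors()[0]["type"])
```

Instructor passes a model like `User` as the `response_model` of a chat-completion call, so the validated object above is what application code receives.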
Use Cases
- Reliably extracting entities, classifications, or complex objects from unstructured text.
- Ensuring LLM outputs conform to predefined schemas and data types.
- Building applications that require real-time processing of LLM-generated data through streaming.
- Integrating LLM-driven data extraction into existing Python applications with type safety.
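The automatic-retry behavior behind these use cases can be sketched conceptually: when parsed output fails validation, the error is fed back to the model for another attempt. The sketch below uses only the standard library, a stub generator in place of a real LLM call, and plain-dict checks in place of Pydantic; names like `extract_with_retries` are illustrative and not part of Instructor's actual API.

```python
import json

def validate_user(payload: dict) -> dict:
    # Minimal stand-in for a Pydantic model: check types, raise on failure.
    if not isinstance(payload.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(payload.get("age"), int):
        raise ValueError("age must be an integer")
    return payload

def extract_with_retries(generate, prompt: str, max_retries: int = 2) -> dict:
    # Conceptual version of the retry loop: on a validation error,
    # re-prompt the model with the error message appended.
    last_error = None
    for _ in range(max_retries + 1):
        question = prompt if last_error is None else (
            f"{prompt}\nPrevious output was invalid: {last_error}")
        raw = generate(question)
        try:
            return validate_user(json.loads(raw))
        except (ValueError, json.JSONDecodeError) as err:
            last_error = err
    raise RuntimeError(f"extraction failed after retries: {last_error}")

# Stub "LLM": returns a badly typed field first, then a valid object.
_responses = iter(['{"name": "Ada", "age": "36"}',
                   '{"name": "Ada", "age": 36}'])

def fake_llm(prompt: str) -> str:
    return next(_responses)

result = extract_with_retries(fake_llm, "Extract the user from: Ada, 36.")
print(result)  # {'name': 'Ada', 'age': 36}
```

The first attempt fails because `age` arrives as a string; the retry succeeds, which is the same recover-instead-of-crash behavior Instructor provides on top of real LLM calls.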
Non-Goals
- Performing LLM inference directly without structured output requirements.
- Replacing core LLM providers or their fundamental APIs.
- Handling complex multi-turn conversational logic beyond structured response generation.
Execution
- Pinned dependencies (info): while the SKILL.md lists dependencies, it does not explicitly mention a lockfile (e.g., `requirements.txt` or `Pipfile.lock`) for pinning specific versions, which could be an area for improvement.
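One way to address this would be a pinned requirements file alongside the skill. The package names follow from the dependencies the skill describes, but the version bounds below are illustrative assumptions, not the project's actual pins:

```text
# requirements.txt (illustrative version bounds)
instructor>=1.0,<2.0
pydantic>=2.0,<3.0
openai>=1.0,<2.0
```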
Installation
First, add the marketplace:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
Then install the plugin:
/plugin install AI-Research-SKILLs@ai-research-skills