PDF Processing OpenAI
Skill · Verified · Active
Toolkit for comprehensive PDF reading, reviewing, and creation with visual quality control. Use it to work with PDFs (.pdf files) for: (1) reading or extracting content from existing PDFs, (2) creating new PDF documents with professional formatting, (3) generating reports, documents, or layouts that require precise typography and design, and any other PDF reading or generation task.
Its purpose is to enable AI agents to read, review, and create PDFs, with a focus on visual quality control and professional formatting.
Features
- Reading and extracting content from existing PDFs
- Creating new PDF documents with formatting
- Generating reports and layouts with precise typography
- Visual quality control of rendered PDF pages
- Programmatic PDF generation using reportlab
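As an illustration of the reportlab-based generation listed above, here is a minimal sketch. The function name `build_report`, the A4 page size, and the 20 mm margins are assumptions made for this example and do not come from the skill itself; reportlab is imported lazily so the module still loads when the package is missing.

```python
from pathlib import Path


def build_report(path, title, paragraphs):
    """Write a simple one-column PDF report and return its path.

    Hypothetical helper: page size, margins, and styles are illustrative
    defaults, not values prescribed by the skill.
    """
    # Imported here so a missing reportlab install fails only when used.
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.lib.units import mm
    from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

    styles = getSampleStyleSheet()
    doc = SimpleDocTemplate(
        str(path),
        pagesize=A4,
        leftMargin=20 * mm,
        rightMargin=20 * mm,
        topMargin=20 * mm,
        bottomMargin=20 * mm,
        title=title,
    )

    # A "story" is the ordered list of flowables that platypus lays out.
    story = [Paragraph(title, styles["Title"]), Spacer(1, 6 * mm)]
    for text in paragraphs:
        story.append(Paragraph(text, styles["BodyText"]))
        story.append(Spacer(1, 3 * mm))

    doc.build(story)
    return Path(path)
```

Calling `build_report("report.pdf", "Quarterly Summary", ["First paragraph.", "Second paragraph."])` would write the file to the working directory; rendering the result to PNG afterwards is what lets the layout be checked visually.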
Use cases
- Use to extract text and data from PDFs when layout fidelity is important.
- Use to programmatically generate professional-looking PDF documents for reports or proposals.
- Use to review and validate the visual appearance of generated PDFs before delivery.
- Use for any task requiring nuanced PDF content manipulation or generation.
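For the extraction use case, a minimal sketch might look like the following. It assumes pdfplumber is preferred and pypdf is the fallback, which is an ordering chosen for this example only; `join_pages` and `extract_text` are hypothetical helper names.

```python
def join_pages(page_texts):
    """Join per-page text with blank lines between pages.

    Extractors can return None for pages without a text layer
    (for example scanned images), so None is treated as empty.
    """
    return "\n\n".join((text or "").strip() for text in page_texts)


def extract_text(path):
    """Extract text from every page of a PDF.

    Tries pdfplumber first and falls back to pypdf. This order is an
    assumption for the sketch, not a rule from the skill.
    """
    try:
        import pdfplumber
    except ImportError:
        from pypdf import PdfReader

        reader = PdfReader(str(path))
        return join_pages(page.extract_text() for page in reader.pages)

    with pdfplumber.open(str(path)) as pdf:
        # Consume the pages while the file is still open.
        return join_pages(page.extract_text() for page in pdf.pages)
```

An empty result from `extract_text` usually means the pages carry no text layer, in which case rendering them to PNG for visual inspection is the more useful next step.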
Non-goals
- Editing existing PDF content directly (focus is on creation and extraction).
- Handling highly complex interactive PDF forms beyond basic content.
- Providing a GUI or visual editor; operates via LLM prompts and scripts.
Workflow
- Render PDF pages to PNGs for visual inspection, using pdftoppm if available.
- Use reportlab for programmatic PDF creation.
- Employ pdfplumber or pypdf for text extraction and quick checks.
- Re-render pages after updates to verify alignment, spacing, and legibility.
- Clean up or remove intermediate files after final approval.
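The render-and-inspect step of the workflow above can be sketched as follows. The helper names and the 150 DPI default are assumptions for this example; the `-png` and `-r` options are standard pdftoppm flags.

```python
import shutil
import subprocess
from pathlib import Path


def build_render_command(pdf_path, out_prefix, dpi=150):
    """Return the pdftoppm argument list that renders each page to PNG."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def render_pages(pdf_path, out_dir, dpi=150):
    """Render every page of pdf_path into out_dir and return the PNG paths."""
    if shutil.which("pdftoppm") is None:
        # Poppler is a system dependency; report it rather than guessing.
        raise RuntimeError("pdftoppm not found; install Poppler utils")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # pdftoppm appends "-<page number>.png" to the prefix it is given.
    subprocess.run(
        build_render_command(pdf_path, out_dir / "page", dpi), check=True
    )
    return sorted(out_dir.glob("page-*.png"))
```

Rendering into a directory created with `tempfile.TemporaryDirectory()` is one way to make the final cleanup step automatic, since the intermediate PNGs are removed when the context exits.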
Prerequisites
- Python 3 environment
- Poppler utils (for rendering PDFs to PNGs)
- uv or pip for Python package management
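A quick way to check these prerequisites before starting is sketched below, using only the standard library. The package and tool names are taken from this page; the function name `missing_dependencies` and the message format are assumptions for the example.

```python
import importlib.util
import shutil

# Names taken from the workflow described on this page.
PYTHON_PACKAGES = ("reportlab", "pdfplumber", "pypdf")
SYSTEM_TOOLS = ("pdftoppm",)


def missing_dependencies(packages=PYTHON_PACKAGES, tools=SYSTEM_TOOLS):
    """Return a list of human-readable descriptions of what is missing."""
    missing = [
        f"python package: {name}"
        for name in packages
        if importlib.util.find_spec(name) is None
    ]
    missing += [
        f"system tool: {tool}"
        for tool in tools
        if shutil.which(tool) is None
    ]
    return missing
```

An empty list means everything was found; otherwise the entries can be shown to the user as-is, which matches the skill's stated behavior of telling the user about missing dependencies.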
Code Execution
- info: Validation. The skill relies on external tools for input handling and does not explicitly mention schema validation libraries within its own instructions.
- info: Error handling. The skill mentions telling the user about missing dependencies if installation fails, but doesn't specify structured error reporting for other operational failures.
- info: Logging. The skill mentions deleting intermediate files and keeping final artifacts organized, implying some level of state management but no explicit audit logging.
Errors
- info: Actionable error messages. The skill mentions informing the user about missing dependencies, but lacks detailed error handling for other potential operational failures.
Execution
- info: Pinned dependencies. While Python dependencies are listed, explicit pinning via a lockfile (like `uv.lock` or `poetry.lock`) is not mentioned or evident. System tool installation commands are standard but not pinned to specific versions.
Practical Utility
- info: Usage examples. The SKILL.md provides dependency installation commands and a rendering command example, but lacks end-to-end examples demonstrating input, invocation, and output for the core PDF processing tasks.
- info: Edge cases. The skill mentions handling missing dependencies as a failure mode and recovery step, but does not explicitly detail other edge cases for PDF processing (e.g., corrupted files, complex layouts).
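One of the edge cases noted above, corrupted or truncated files, could be screened with a cheap check before any parsing. The sketch below is a heuristic and not part of the skill: it only looks for the `%PDF-` header near the start and the `%%EOF` marker near the end, and the 1024-byte windows and the name `looks_like_pdf` are assumptions for this example.

```python
def looks_like_pdf(path, window=1024):
    """Heuristically check that a file has a PDF header and trailer.

    This does not validate the document structure; a file can pass
    this check and still fail to parse.
    """
    with open(path, "rb") as fh:
        head = fh.read(window)
        fh.seek(0, 2)  # jump to the end to learn the file size
        size = fh.tell()
        fh.seek(max(0, size - window))
        tail = fh.read()
    return b"%PDF-" in head and b"%%EOF" in tail
```

A file that fails this check is likely truncated or not a PDF at all, so reporting that to the user is more helpful than passing it to an extractor and surfacing a parser traceback.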
Installation
Add the marketplace first:
/plugin marketplace add lawvable/awesome-legal-skills
/plugin install pdf-processing-openai@lawvable
Quality score
Verified
Similar extensions
Extract Fleet Vehicle Registration
100 · Extract vehicle identification, owner details, registration dates, and technical specifications from vehicle registration documents.
Convert Resume to Markdown
100 · Convert a resume PDF to clean markdown for LLM parsing or candidate pipelines.
Aws Cdk Development
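100 · AWS Cloud Development Kit (CDK) expert for building cloud infrastructure with TypeScript/Python. Use when creating CDK stacks, defining CDK constructs, implementing infrastructure as code, or when the user mentions CDK, CloudFormation, IaC, cdk synth, cdk deploy, or wants to define AWS infrastructure programmatically. Covers CDK app structure, construct patterns, stack composition, and deployment workflows.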
Cleanup Cycles
100 · Detect and untangle circular dependencies. Runs madge/skott (TS), pycycle (Py), or compiler-only checks (Go/Rust). Auto-fixes leaf-extractable cycles; reports core cycles for human review. Use when the user asks to find circular imports, fix dependency cycles, or untangle module graph. Example queries: "find circular imports", "fix dependency cycles", "untangle our module graph", "why is madge complaining".
Document Extraction API
99 · Extract structured data from documents using AI-powered field extraction.
Convert Contract To Markdown
99 · Convert a contract PDF to clean markdown for clause extraction or LLM analysis.