
Document Parser Skill

Verified Skill
92


AI Summary

This skill uses the docling library to parse complex PDFs, Word documents, and images, preserving structure, extracting tables, and handling multi-column layouts. It provides Python code examples for basic and advanced usage, including batch processing and specific document types like academic papers and reports.
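A minimal sketch of the basic usage the summary describes, assuming docling is installed (`pip install docling`) and using its `DocumentConverter` entry point. The `SUPPORTED_SUFFIXES` set and the helper names `is_supported` and `parse_to_markdown` are illustrative assumptions and do not come from the skill itself.

```python
from pathlib import Path

# Assumption: file types the summary mentions (PDF, Word, images).
# This set is illustrative, not the skill's own configuration.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".png", ".jpg", ".jpeg"}


def is_supported(path):
    """Return True if the file extension is one we intend to parse."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def parse_to_markdown(path):
    """Convert one document to Markdown with docling."""
    # Imported lazily so is_supported() works without docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(str(path))
    return result.document.export_to_markdown()
```

For batch processing, one option is to filter a folder listing with `is_supported` and call `parse_to_markdown` on each remaining path.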

Scope

  • warning: Description quality. The displayed description is an empty string, providing no information about the skill's purpose or capabilities.

Documentation

  • info: Configuration & parameter reference. The skill provides Python code snippets demonstrating configuration options for the 'docling' library, but explicit parameter documentation and precedence are not detailed within the SKILL.md itself.

Code Execution

  • info: Validation. The skill documentation shows examples of using the 'docling' library, which likely has internal validation, but explicit schema-based validation for inputs and outputs within the skill's context is not detailed.
  • info: Error Handling. The Python examples show basic try-except blocks for error handling within the 'docling' library usage, but detailed error categorization, meaningful reporting, or specific recovery steps are not extensively documented in the skill.
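As an illustration of the error categorization this note finds missing, the sketch below wraps any conversion callable in a batch loop and records a coarse category per failure. The category names, `BatchReport`, `categorize`, and `run_batch` are hypothetical; the skill does not define them, and a real mapping would depend on which exceptions the converter actually raises.

```python
from dataclasses import dataclass, field


@dataclass
class BatchReport:
    succeeded: dict = field(default_factory=dict)  # path -> converted output
    failed: dict = field(default_factory=dict)     # path -> (category, message)


def categorize(exc):
    """Map an exception to a coarse, reportable category (illustrative)."""
    if isinstance(exc, FileNotFoundError):
        return "missing_file"
    if isinstance(exc, PermissionError):
        return "permission"
    if isinstance(exc, MemoryError):
        return "too_large"
    return "parse_error"


def run_batch(paths, convert):
    """Convert each path, collecting failures instead of stopping at the first."""
    report = BatchReport()
    for path in paths:
        try:
            report.succeeded[path] = convert(path)
        except Exception as exc:
            report.failed[path] = (categorize(exc), str(exc))
    return report
```

Passing the converter in as a callable keeps the loop independent of any particular library, so it can be exercised with a stub before running against real documents.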

Compliance

  • info: GDPR. The skill processes document content, which may contain personal data, but there are no specific sanitization steps mentioned within the skill itself before submission to the LLM. The 'docling' library may handle this, but it's not explicit here.
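One possible sanitization step, sketched under the assumption that parsed text is available as a plain string before it is sent anywhere. The `redact_emails` helper and its pattern are hypothetical and cover email addresses only; they are not part of the skill or of docling, and real personal-data handling would need far broader coverage.

```python
import re

# Simplified email pattern for illustration; it will miss unusual addresses.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[EMAIL]", text)
```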

Practical Utility

  • info: Edge cases. Limitations such as handling very large documents, scanned content needing OCR, complex tables, and encrypted PDFs are mentioned, but specific failure modes and recovery steps are not detailed.
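Two of the limitations listed here, very large files and encrypted PDFs, can be flagged before parsing. The sketch below is a rough heuristic with assumed names (`preflight`, `max_bytes`) and an arbitrary 50 MB default; looking for the `/Encrypt` key in the raw bytes can produce false positives and is not a substitute for a real PDF library.

```python
def preflight(data, max_bytes=50 * 1024 * 1024):
    """Return a list of warning labels for raw file bytes (heuristic only)."""
    warnings = []
    if len(data) > max_bytes:
        warnings.append("large_file")
    # Encrypted PDFs reference an /Encrypt dictionary from the trailer.
    if data.startswith(b"%PDF-") and b"/Encrypt" in data:
        warnings.append("possibly_encrypted")
    return warnings
```

A caller might skip or queue flagged files separately rather than letting the parser fail on them partway through a batch.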

Installation

npx skills add claude-office-skills/skills

Runs the Vercel skills CLI (skills.sh) via npx. This requires Node.js on your machine and at least one skills-compatible agent already installed (Claude Code, Cursor, Codex, …). It assumes the repo follows the agentskills.io format.

3 months ago
98 stars
MIT
Updated 2 days ago
View source

Similar extensions

PaddleOCR Document Parsing

98

Use this skill to extract structured Markdown/JSON from PDFs and document images—tables with cell-level precision, formulas as LaTeX, figures, seals, charts, headers/footers, multi-column layout and correct reading order. Trigger terms: 文档解析, 版面分析, 版面还原, 表格提取, 公式识别, 多栏排版, 扫描件结构化, 发票, 财报, 复杂 PDF, PDF转Markdown, 图表, 阅读顺序; reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.

Skill
aidenwu0209

Smart OCR Skill

92


Skill
claude-office-skills

PDF to DOCX Converter

98

Convert PDF files to editable Word documents using pdf2docx

Skill
claude-office-skills

PDF Extraction

95

Extract text, tables, and metadata from PDFs using pdfplumber

Skill
claude-office-skills

Table Extractor

92


Skill
claude-office-skills

AI Multimodal Processing Skill

95

Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.

Skill
samhvw8