[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"extension-skill-claude-office-skills-doc-parser-zh-CN":3,"guides-for-claude-office-skills-doc-parser":226,"similar-k1776t2fdx4h35mkwpc5h201dd866zms":227},{"_creationTime":4,"_id":5,"children":6,"community":7,"display":9,"evaluation":23,"identity":192,"isFallback":197,"parentExtension":198,"providers":199,"relations":203,"repo":205,"workflow":223},1778053148350.4373,"k1776t2fdx4h35mkwpc5h201dd866zms",[],{"reviewCount":8},0,{"description":10,"installMethods":11,"name":12,"sourceUrl":13,"tags":14},">",{},"Document Parser Skill","https://github.com/claude-office-skills/skills/tree/HEAD/doc-parser",[15,16,17,18,19,20,21,22],"parsing","document-processing","pdf","python","ocr","extraction","layout-analysis","docling",{"_creationTime":24,"_id":25,"extensionId":5,"locale":26,"result":27,"trustSignals":180,"workflow":190},1778053561145.6184,"kn73de6m698v8krhvrt0b5zdhs866847","en",{"checks":28,"evaluatedAt":170,"extensionSummary":171,"promptVersionExtension":172,"promptVersionScoring":173,"rationale":174,"score":175,"summary":176,"tags":177,"targetMarket":178,"tier":179},[29,34,37,40,44,48,53,58,61,64,68,72,75,79,82,85,88,91,94,97,100,104,108,112,116,119,122,125,129,132,135,138,141,144,148,151,154,157,160,163,167],{"category":30,"check":31,"severity":32,"summary":33},"Practical Utility","Problem relevance","pass","The description names a concrete problem: parsing complex documents with advanced features like structure preservation and multi-column layout analysis.",{"category":30,"check":35,"severity":32,"summary":36},"Unique selling proposition","The skill leverages the 'docling' library, described as IBM's state-of-the-art document understanding library, offering advanced capabilities beyond basic text extraction.",{"category":30,"check":38,"severity":32,"summary":39},"Production readiness","The skill includes Python code examples for conversion, extraction, and batch processing, and mentions supported formats and 
advanced configurations, indicating a ready-to-use state.",{"category":41,"check":42,"severity":32,"summary":43},"Scope","Single responsibility principle","The skill focuses on document parsing and extraction, aligning with its name and description without extending into unrelated domains.",{"category":41,"check":45,"severity":46,"summary":47},"Description quality","warning","The displayed description is an empty string, providing no information about the skill's purpose or capabilities.",{"category":49,"check":50,"severity":51,"summary":52},"Invocation","Scoped tools","not_applicable","This skill does not expose specific tools but relies on the underlying library's functionality, making tool-specific scoping checks not applicable.",{"category":54,"check":55,"severity":56,"summary":57},"Documentation","Configuration & parameter reference","info","The skill provides Python code snippets demonstrating configuration options for the 'docling' library, but explicit parameter documentation and precedence are not detailed within the SKILL.md itself.",{"category":41,"check":59,"severity":51,"summary":60},"Tool naming","The skill does not expose user-facing tools with distinct names; it abstracts the functionality through the 'docling' library.",{"category":41,"check":62,"severity":32,"summary":63},"Minimal I/O surface","The skill's input appears to be a document path, and its output is structured data (text, tables, figures), aligning with minimal I/O principles.",{"category":65,"check":66,"severity":32,"summary":67},"License","License usability","The license is MIT, a permissive open-source license, clearly stated in both the SKILL.md frontmatter and a dedicated LICENSE file.",{"category":69,"check":70,"severity":51,"summary":71},"Maintenance","Commit recency","No commit history is available for this specific file/skill within the provided context, making recency assessment impossible.",{"category":69,"check":73,"severity":32,"summary":74},"Dependency Management","The 
SKILL.md and README.md clearly list 'docling' and its optional dependencies (OCR, all) for installation via pip, indicating proper dependency management.",{"category":76,"check":77,"severity":51,"summary":78},"Security","Secret Management","The skill does not appear to handle or expose any secrets.",{"category":76,"check":80,"severity":32,"summary":81},"Injection","The skill processes document content as data, and the Python code examples show explicit data extraction rather than instruction execution from loaded content.",{"category":76,"check":83,"severity":32,"summary":84},"Transitive Supply-Chain Grenades","The skill relies on the 'docling' Python library, which is installed via pip, and does not appear to fetch or execute external code at runtime.",{"category":76,"check":86,"severity":32,"summary":87},"Sandbox Isolation","The skill operates on provided document files and its Python code examples suggest it processes data, with no indications of attempting to modify files outside its scope.",{"category":76,"check":89,"severity":32,"summary":90},"Sandbox escape primitives","No evidence of detached process spawns or retry loops around denied tool calls was found in the provided script examples.",{"category":76,"check":92,"severity":32,"summary":93},"Data Exfiltration","The skill's purpose is document processing and data extraction. There are no outbound calls or references to confidential data submission.",{"category":76,"check":95,"severity":32,"summary":96},"Hidden Text Tricks","The bundled files (SKILL.md, LICENSE, etc.) 
are free of hidden-steering tricks, invisible Unicode characters, or other obfuscation methods.",{"category":76,"check":98,"severity":32,"summary":99},"Opaque code execution","The skill's code examples are plain Python and the 'docling' library is installed via pip, with no indications of obfuscated or base64-encoded payloads.",{"category":101,"check":102,"severity":32,"summary":103},"Portability","Structural Assumption","The skill assumes a document file path as input, which is standard. The Python code examples do not indicate assumptions about specific project structures.",{"category":105,"check":106,"severity":51,"summary":107},"Trust","Issues Attention","No GitHub issues data was provided for this skill.",{"category":109,"check":110,"severity":32,"summary":111},"Versioning","Release Management","The SKILL.md frontmatter clearly declares the version as '1.0'.",{"category":113,"check":114,"severity":56,"summary":115},"Code Execution","Validation","The skill documentation shows examples of using the 'docling' library, which likely has internal validation, but explicit schema-based validation for inputs and outputs within the skill's context is not detailed.",{"category":76,"check":117,"severity":32,"summary":118},"Unguarded Destructive Operations","The skill's primary function is document parsing and extraction, which are read-only operations and do not involve destructive actions.",{"category":113,"check":120,"severity":56,"summary":121},"Error Handling","The Python examples show basic try-except blocks for error handling within the 'docling' library usage, but detailed error categorization, meaningful reporting, or specific recovery steps are not extensively documented in the skill.",{"category":113,"check":123,"severity":51,"summary":124},"Logging","The skill itself does not appear to implement custom logging; it relies on the underlying 'docling' library, and there are no specific instructions for audit 
logging.",{"category":126,"check":127,"severity":56,"summary":128},"Compliance","GDPR","The skill processes document content, which may contain personal data, but there are no specific sanitization steps mentioned within the skill itself before submission to the LLM. The 'docling' library may handle this, but it's not explicit here.",{"category":126,"check":130,"severity":32,"summary":131},"Target market","The skill's functionality is global and not tied to any specific geographic or legal jurisdiction. The target market is correctly set to 'global'.",{"category":101,"check":133,"severity":32,"summary":134},"Runtime stability","The skill is written in Python and relies on standard libraries and the 'docling' package, which is designed for broad compatibility. No specific OS or shell assumptions were detected.",{"category":49,"check":136,"severity":32,"summary":137},"Precise Purpose","The skill's purpose is clearly stated as advanced document parsing using 'docling', with examples of its usage provided in the prompt.",{"category":49,"check":139,"severity":32,"summary":140},"Concise Frontmatter","The frontmatter is dense and self-contained, providing essential metadata like name, description, version, and capabilities.",{"category":54,"check":142,"severity":32,"summary":143},"Concise Body","The skill body is well-structured with clear sections for overview, usage, domain knowledge, and examples, keeping the main SKILL.md concise.",{"category":145,"check":146,"severity":32,"summary":147},"Context","Progressive Disclosure","The SKILL.md includes code snippets and explanations, with links to external resources like the 'docling' GitHub and documentation, demonstrating progressive disclosure.",{"category":145,"check":149,"severity":51,"summary":150},"Forked exploration","The skill's primary function is document parsing and data extraction, not deep exploration or code review, making 'context: fork' not 
applicable.",{"category":30,"check":152,"severity":32,"summary":153},"Usage examples","Multiple clear, end-to-end Python examples are provided, demonstrating how to use the skill for various document parsing tasks, including table extraction and batch processing.",{"category":30,"check":155,"severity":56,"summary":156},"Edge cases","Limitations such as handling very large documents, scanned content needing OCR, complex tables, and encrypted PDFs are mentioned, but specific failure modes and recovery steps are not detailed.",{"category":113,"check":158,"severity":51,"summary":159},"Tool Fallback","The skill does not rely on external tools like an MCP server; it directly uses the 'docling' Python library.",{"category":101,"check":161,"severity":32,"summary":162},"Stack assumptions","The skill clearly states its Python dependency and recommends installing 'docling' with optional components for OCR or full functionality, with clear installation instructions provided.",{"category":164,"check":165,"severity":32,"summary":166},"Safety","Halt on unexpected state","While not explicitly listing preconditions as a machine-readable checklist, the provided Python code and documentation imply that malformed input or unsupported formats would lead to errors from the 'docling' library, effectively halting the workflow.",{"category":101,"check":168,"severity":32,"summary":169},"Cross-skill coupling","The skill is self-contained and focuses solely on document parsing using the 'docling' library, with no apparent reliance on other specific skills being loaded.",1778053268945,"This skill uses the docling library to parse complex PDFs, Word documents, and images, preserving structure, extracting tables, and handling multi-column layouts. 
It provides Python code examples for basic and advanced usage, including batch processing and specific document types like academic papers and reports.","2.0.0","3.4.0","The skill is well-documented with clear usage examples and leverages a specialized library ('docling') for advanced document parsing. It adheres to good practices regarding scope, security, and portability. Minor areas for improvement include more detailed error handling documentation and explicit edge case recovery steps.",92,"A robust document parsing skill leveraging the 'docling' library for advanced extraction of text, tables, and figures from various document formats.",[15,16,17,18,19,20,21,22],"global","verified",{"codeQuality":181,"collectedAt":182,"documentation":183,"maintenance":185,"security":186,"testCoverage":189},{},1778053250947,{"descriptionLength":184,"readmeSize":8},1,{},{"hasNpmPackage":187,"license":188,"smitheryVerified":187},false,"MIT",{"hasCi":187,"hasTests":187},{"updatedAt":191},1778053561145,{"githubOwner":193,"githubRepo":194,"locale":26,"slug":195,"type":196},"claude-office-skills","skills","doc-parser","skill",true,null,{"extract":200,"llm":202},{"commitSha":201,"license":188},"9c4c7d5cd2813a8936bf2c9fdb174ea883b85a11",{"promptVersionExtension":172,"promptVersionScoring":173,"score":175,"targetMarket":178,"tier":179},{"repoId":204},"kd7fw7xbj58qc2z8whrrjptbed8659db",{"_creationTime":206,"_id":204,"identity":207,"providers":209,"workflow":220},1777995558409.8474,{"githubOwner":193,"githubRepo":194,"sourceUrl":208},"https://github.com/claude-office-skills/skills",{"discover":210,"github":213},{"sources":211},[212],"skills-sh",{"closedIssues90d":8,"forks":214,"license":188,"openIssues90d":215,"pushedAt":216,"readmeSize":217,"stars":218,"topics":219},27,2,1769868236000,29630,98,[],{"discoverAt":221,"extractAt":222,"githubAt":222,"updatedAt":222},1777995558409,1778053155657,{"anyEnrichmentAt":224,"extractAt":225,"githubAt":224,"llmAt":191,"updatedAt":191},1778053151766,1778
053148350,[],[228,255,273,292,314,333],{"_creationTime":229,"_id":230,"community":231,"display":232,"identity":240,"providers":244,"relations":249,"workflow":251},1778053339109.673,"k170fjdnm4zmjtz1rgs8zwq4418663pv",{"reviewCount":8},{"description":233,"installMethods":234,"name":235,"sourceUrl":236,"tags":237},"Use this skill to extract structured Markdown/JSON from PDFs and document images—tables with cell-level precision, formulas as LaTeX, figures, seals, charts, headers/footers, multi-column layout and correct reading order. Trigger terms: 文档解析, 版面分析, 版面还原, 表格提取, 公式识别, 多栏排版, 扫描件结构化, 发票, 财报, 复杂 PDF, PDF转Markdown, 图表, 阅读顺序; reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.",{},"PaddleOCR Document Parsing","https://github.com/aidenwu0209/paddleocr-skills/tree/HEAD/skills/paddleocr-doc-parsing",[17,238,19,21,239,18],"document-parsing","paddleocr",{"githubOwner":241,"githubRepo":242,"locale":26,"slug":243,"type":196},"aidenwu0209","paddleocr-skills","paddleocr-doc-parsing",{"extract":245,"llm":248},{"commitSha":246,"license":247},"ca41406b66e5a475f43b073a5b731dfd1b9c50b1","Apache-2.0",{"promptVersionExtension":172,"promptVersionScoring":173,"score":218,"targetMarket":178,"tier":179},{"repoId":250},"kd7b1t00prnctc7258swvw0hs5865sjq",{"anyEnrichmentAt":252,"extractAt":253,"githubAt":252,"llmAt":254,"updatedAt":254},1778053339393,1778053339109,1778053352237,{"_creationTime":256,"_id":257,"community":258,"display":259,"identity":266,"providers":268,"relations":271,"workflow":272},1778053148350.4734,"k1782aqmjfqy0qysysgq76w9z1867e3x",{"reviewCount":8},{"description":10,"installMethods":260,"name":261,"sourceUrl":262,"tags":263},{},"Smart OCR 
Skill","https://github.com/claude-office-skills/skills/tree/HEAD/smart-ocr",[19,15,264,239,20,265,16],"multilingual","image-processing",{"githubOwner":193,"githubRepo":194,"locale":26,"slug":267,"type":196},"smart-ocr",{"extract":269,"llm":270},{"commitSha":201,"license":188},{"promptVersionExtension":172,"promptVersionScoring":173,"score":175,"targetMarket":178,"tier":179},{"repoId":204},{"anyEnrichmentAt":224,"extractAt":225,"githubAt":224,"llmAt":191,"updatedAt":191},{"_creationTime":274,"_id":275,"community":276,"display":277,"identity":285,"providers":287,"relations":290,"workflow":291},1778053148350.4656,"k171nxqak0bb4qq89mkfwf02s5867cf6",{"reviewCount":8},{"description":278,"installMethods":279,"name":280,"sourceUrl":281,"tags":282},"Convert PDF files to editable Word documents using pdf2docx",{},"PDF to DOCX Converter","https://github.com/claude-office-skills/skills/tree/HEAD/pdf-to-docx",[17,283,284,16,18],"docx","conversion",{"githubOwner":193,"githubRepo":194,"locale":26,"slug":286,"type":196},"pdf-to-docx",{"extract":288,"llm":289},{"commitSha":201,"license":188},{"promptVersionExtension":172,"promptVersionScoring":173,"score":218,"targetMarket":178,"tier":179},{"repoId":204},{"anyEnrichmentAt":224,"extractAt":225,"githubAt":224,"llmAt":191,"updatedAt":191},{"_creationTime":293,"_id":294,"community":295,"display":296,"identity":306,"providers":308,"relations":312,"workflow":313},1778053148350.4636,"k171dtxahnz3h8q0jz3gk6akks867ym1",{"reviewCount":8},{"description":297,"installMethods":298,"name":299,"sourceUrl":300,"tags":301},"Extract text, tables, and metadata from PDFs using pdfplumber",{},"PDF 
Extraction","https://github.com/claude-office-skills/skills/tree/HEAD/pdf-extraction",[17,20,302,303,304,305,16],"text","tables","metadata","pdfplumber",{"githubOwner":193,"githubRepo":194,"locale":26,"slug":307,"type":196},"pdf-extraction",{"extract":309,"llm":310},{"commitSha":201,"license":188},{"promptVersionExtension":172,"promptVersionScoring":173,"score":311,"targetMarket":178,"tier":179},95,{"repoId":204},{"anyEnrichmentAt":224,"extractAt":225,"githubAt":224,"llmAt":191,"updatedAt":191},{"_creationTime":315,"_id":316,"community":317,"display":318,"identity":325,"providers":327,"relations":331,"workflow":332},1778053148350.4768,"k17c4t5g480bzq5t7qrjgbjsys867fb5",{"reviewCount":8},{"description":10,"installMethods":319,"name":320,"sourceUrl":321,"tags":322},{},"Table Extractor","https://github.com/claude-office-skills/skills/tree/HEAD/table-extractor",[17,20,323,324,15],"table","camelot",{"githubOwner":193,"githubRepo":194,"locale":26,"slug":326,"type":196},"table-extractor",{"extract":328,"llm":329},{"commitSha":201,"license":188},{"promptVersionExtension":172,"promptVersionScoring":173,"score":175,"targetMarket":178,"tier":330},"flagged",{"repoId":204},{"anyEnrichmentAt":224,"extractAt":225,"githubAt":224,"llmAt":191,"updatedAt":191},{"_creationTime":334,"_id":335,"community":336,"display":337,"identity":350,"providers":354,"relations":358,"workflow":360},1778054812528.7214,"k17c4avaab2db2m79et4f4hnwn867qj1",{"reviewCount":8},{"description":338,"installMethods":339,"name":340,"sourceUrl":341,"tags":342},"Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). 
Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.",{},"AI Multimodal Processing Skill","https://github.com/samhvw8/dot-claude/tree/HEAD/skills/ai-multimodal",[343,344,345,346,347,16,348,19,349],"gemini-api","multimodal","audio","image","video","text-to-image","transcription",{"githubOwner":351,"githubRepo":352,"locale":26,"slug":353,"type":196},"samhvw8","dot-claude","ai-multimodal",{"extract":355,"llm":357},{"commitSha":356,"license":188},"28c76162116d2eedab131c0e1548fdc76a2999f7",{"promptVersionExtension":172,"promptVersionScoring":173,"score":311,"targetMarket":178,"tier":179},{"repoId":359},"kd79ad9dpqazy79y2s6rvajgjn865xek",{"anyEnrichmentAt":361,"extractAt":362,"githubAt":361,"llmAt":363,"updatedAt":363},1778054813688,1778054812528,1778054896678]