Data Extractor

Skill Warning

AI Summary

This skill leverages the unstructured Python library to process a wide range of document types, including PDFs, Word docs, emails, and HTML. It automatically detects and partitions elements, extracts text and metadata, and supports advanced features like table structure inference, OCR, and semantic chunking for RAG applications.
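The flow described in the summary (partition a document, inspect the detected elements, then chunk them for retrieval) might look roughly like the sketch below. It assumes the `unstructured` package is installed and uses its `partition` and `chunk_by_title` functions. The helper names `load_elements`, `count_categories`, and `chunk_for_rag` are hypothetical and do not come from SKILL.md.

```python
from collections import Counter
from typing import Any, Iterable


def load_elements(path: str) -> list[Any]:
    """Partition a document into elements; the file type is auto-detected."""
    # Imported lazily so the other helpers work without the library installed.
    from unstructured.partition.auto import partition

    return partition(filename=path)


def count_categories(elements: Iterable[Any]) -> dict[str, int]:
    """Count elements per category (for example Title, NarrativeText, Table)."""
    return dict(Counter(el.category for el in elements))


def chunk_for_rag(elements: list[Any], max_characters: int = 1000) -> list[str]:
    """Group elements into section-aware chunks and return their text."""
    from unstructured.chunking.title import chunk_by_title

    chunks = chunk_by_title(elements, max_characters=max_characters)
    return [chunk.text for chunk in chunks]
```

A typical call sequence would be `elements = load_elements("report.pdf")`, then `count_categories(elements)` to see what was detected, then `chunk_for_rag(elements)` to prepare text for indexing. The file name here is only a placeholder.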

Scope

critical: Description quality. The description is materially misleading: it contains only a single character ('>') and provides no actual information about the extension's functionality, which contradicts the content provided in SKILL.md.

Documentation

info: Configuration & parameter reference. While SKILL.md provides extensive code examples, it does not explicitly document all configuration options or parameters of the `partition` function or its variants, nor does it specify a precedence order for any potential configurations.

Code Execution

info: Validation. SKILL.md demonstrates `unstructured` library functions, which likely perform internal validation on file paths and parameters, but explicit schema validation within the skill's own logic is not shown.
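As an illustration of what explicit validation before extraction could look like, here is a minimal sketch that checks the file extension and existence before a path is handed to any partitioning call. The function name `validate_input` and the `SUPPORTED_SUFFIXES` set are assumptions made for this example; they are not taken from SKILL.md or from the `unstructured` library.

```python
from pathlib import Path

# Assumed allow-list for illustration only; adjust to the formats actually used.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".eml", ".html"}


def validate_input(path_str: str) -> Path:
    """Return a Path if the input looks safe to process, otherwise raise."""
    path = Path(path_str)
    # Reject unexpected file types before touching the filesystem.
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {path.suffix or '(none)'}")
    # Reject paths that do not point at an existing regular file.
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    return path
```

Checking the suffix first keeps the error for an unsupported type independent of whether the file exists.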

Compliance

info: GDPR. The skill extracts data from documents. It does not explicitly handle personal data, but the extracted content could contain PII, which would be submitted to the LLM without any additional sanitization by the skill itself.
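One way to reduce this exposure is to redact obvious identifiers from extracted text before it reaches the model. The sketch below is a deliberately simple, regex-based illustration covering only email addresses and phone-like numbers. The name `redact_pii` and both patterns are assumptions for this example, and such patterns will miss many kinds of personal data, so this is not a substitute for a proper compliance review.

```python
import re

# Simplified patterns for illustration; real PII detection needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)
```

Applied to each element's text after extraction, this keeps the document structure intact while masking the matched values.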

Installation

npx skills add claude-office-skills/skills

Runs the Vercel skills CLI (skills.sh) via npx. Requires a local Node.js installation and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repository follows the agentskills.io format.

3 months ago

claude-office-skills

98 stars

MIT

Updated 3 days ago

View source code

Similar extensions

Table Extractor

Document Parser Skill

PDF to DOCX Converter

PaddleOCR Document Parsing

PDF Extraction

Chat with PDF