PDF Extraction


Extract text, tables, and metadata from PDFs using pdfplumber

AI Summary

This skill uses the `pdfplumber` library to extract text, tables, and document metadata from PDF files. It exposes extraction parameters such as tolerances and includes examples for common use cases, such as converting tables to DataFrames and processing invoice data.
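As a rough illustration of the three kinds of output named in the summary, here is a minimal sketch built on `pdfplumber.open`, `page.extract_text`, `page.extract_tables`, and `pdf.metadata`. The helper names `table_to_records` and `extract_pdf` are hypothetical and do not come from the skill itself, and the assumption that the first table row is a header will not hold for every document.

```python
from typing import Any, Dict, List, Optional

Row = List[Optional[str]]


def table_to_records(table: List[Row]) -> List[Dict[str, Optional[str]]]:
    """Treat the first row as the header and turn each later row into a dict.

    Hypothetical helper: assumes the table really has a header row.
    """
    if not table:
        return []
    # Empty header cells get a positional placeholder name.
    header = [(cell or f"col_{i}").strip() for i, cell in enumerate(table[0])]
    return [dict(zip(header, row)) for row in table[1:]]


def extract_pdf(path: str) -> Dict[str, Any]:
    """Collect metadata, per-page text, and tables from one PDF."""
    import pdfplumber  # imported lazily so table_to_records works without it

    with pdfplumber.open(path) as pdf:
        pages = []
        for page in pdf.pages:
            pages.append({
                "number": page.page_number,
                # Guard against versions that return None for empty pages.
                "text": page.extract_text() or "",
                "tables": [table_to_records(t) for t in page.extract_tables()],
            })
        return {"metadata": pdf.metadata, "pages": pages}
```

Each list of records can be passed straight to `pandas.DataFrame(records)` when a DataFrame is wanted; pandas is left out of the sketch to keep its dependencies small.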

Documentation

Configuration & parameter reference: While the code snippets show `pdfplumber` being used with parameters such as tolerances, these parameters and their default values are not explicitly documented in SKILL.md or the accompanying files.
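Since the skill's own files do not list these parameters, the sketch below shows one way to keep them explicit in calling code. The values are what pdfplumber's README gives as defaults to the best of my knowledge, so treat them as assumptions and check them against the installed version. `TEXT_DEFAULTS`, `TABLE_DEFAULTS`, `merged`, and `extract_page` are hypothetical names, and the table settings shown are only a subset.

```python
from typing import Any, Dict, Optional

# Assumed pdfplumber defaults; verify against the version you have installed.
TEXT_DEFAULTS: Dict[str, Any] = {"x_tolerance": 3, "y_tolerance": 3}
TABLE_DEFAULTS: Dict[str, Any] = {
    "vertical_strategy": "lines",
    "horizontal_strategy": "lines",
    "snap_tolerance": 3,
    "join_tolerance": 3,
    "intersection_tolerance": 3,
}


def merged(defaults: Dict[str, Any], overrides: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a new dict with overrides applied; the defaults stay untouched."""
    return {**defaults, **(overrides or {})}


def extract_page(
    path: str,
    page_index: int = 0,
    text_opts: Optional[Dict[str, Any]] = None,
    table_opts: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Extract text and tables from one page with explicit settings."""
    import pdfplumber  # imported lazily so the settings helpers work without it

    with pdfplumber.open(path) as pdf:
        page = pdf.pages[page_index]
        text = page.extract_text(**merged(TEXT_DEFAULTS, text_opts))
        tables = page.extract_tables(
            table_settings=merged(TABLE_DEFAULTS, table_opts)
        )
    return {"text": text or "", "tables": tables}
```

For tables drawn without ruling lines, passing `table_opts={"vertical_strategy": "text", "horizontal_strategy": "text"}` is a common starting point, while smaller `x_tolerance` values can help when words run together in the extracted text.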

Code Execution

Validation: The provided code snippets demonstrate basic usage of `pdfplumber` but do not show schema validation of input parameters such as file paths or extraction options.
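One way to close that gap without adding a schema library is a small standard-library check that runs before the file is handed to `pdfplumber`. This is a sketch under assumptions: `validate_request` is a hypothetical function, and the rules it enforces (a `.pdf` suffix, an existing file starting with the `%PDF-` marker, positive tolerances) are illustrative choices, not requirements stated by the skill.

```python
from pathlib import Path
from typing import Any, Dict, Union


def validate_request(
    path: Union[str, Path],
    x_tolerance: float = 3,
    y_tolerance: float = 3,
) -> Dict[str, Any]:
    """Check a file path and tolerance options; return them normalized.

    Raises ValueError or FileNotFoundError on bad input.
    """
    pdf_path = Path(path)
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got {pdf_path.name!r}")
    if not pdf_path.is_file():
        raise FileNotFoundError(f"no such file: {pdf_path}")

    # PDF files begin with the marker "%PDF-" followed by a version number.
    with pdf_path.open("rb") as handle:
        if handle.read(5) != b"%PDF-":
            raise ValueError(f"{pdf_path.name!r} does not start with a PDF header")

    for name, value in (("x_tolerance", x_tolerance), ("y_tolerance", y_tolerance)):
        if not isinstance(value, (int, float)) or value <= 0:
            raise ValueError(f"{name} must be a positive number, got {value!r}")

    return {"path": pdf_path, "x_tolerance": x_tolerance, "y_tolerance": y_tolerance}
```

The returned dictionary can then be unpacked into whatever extraction function the caller uses. A schema library would give richer error reporting, but the checks above cover the file path and options mentioned here.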

Installation

npx skills add claude-office-skills/skills

Runs the Vercel skills CLI (skills.sh) via npx — needs Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

3 months ago

claude-office-skills

98 stars

MIT

Updated 5 days ago


Similar Extensions

Document Parser Skill

Skill

claude-office-skills

PDF to DOCX Converter

Convert PDF files to editable Word documents using pdf2docx

Skill

claude-office-skills

Chat with PDF

Answer questions about PDF content, summarize, and extract information

Skill

claude-office-skills

Table Extractor

Skill

claude-office-skills

Smart OCR Skill

Skill

claude-office-skills

GPU Document Processing

Use when processing large PDFs, document collections, or bulk text extraction tasks that benefit from GPU-accelerated processing. Triggers when the user provides large documents or needs bulk document analysis.

Skill

langchain-ai