
PaddleOCR Document Parsing

Skill Verified Active

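Use this skill to extract structured Markdown or JSON from PDFs and document images: tables with cell-level precision, formulas as LaTeX, figures, seals, charts, headers and footers, multi-column layouts, and correct reading order. Trigger terms: 文档解析 (document parsing), 版面分析 (layout analysis), 版面还原 (layout restoration), 表格提取 (table extraction), 公式识别 (formula recognition), 多栏排版 (multi-column layout), 扫描件结构化 (structuring scanned documents), 发票 (invoice), 财报 (financial report), 复杂 PDF (complex PDF), PDF转Markdown (PDF to Markdown), 图表 (charts), 阅读顺序 (reading order); reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.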

Purpose

Accurately extract structured information from complex documents and images so that the content is easy for LLMs and downstream processing to use.

Features

  • Extract tables with cell-level precision
  • Recognize formulas as LaTeX
  • Parse multi-column layouts and reading order
  • Output structured Markdown or JSON
  • Support for PDFs and document images

Use Cases

  • Processing invoices and financial reports
  • Extracting content from academic papers
  • Structuring data from scanned documents
  • Analyzing complex document layouts

Non-Goals

  • Simple text-only OCR tasks
  • Speed-critical OCR on basic images
  • Processing screenshots or basic images with clear text

Workflow

  1. Identify input source (URL or local file)
  2. Execute document parsing script with appropriate parameters
  3. Parse the JSON response (checking `ok` status and `error` fields)
  4. Extract relevant data (text, tables, formulas) from the structured output
  5. Present results to the user or use for further processing
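
Steps 3 and 4 above might be handled along the following lines. This is a minimal sketch, not the skill's own code: the listing only says the response is JSON with an `ok` status and an `error` field, so the `result` key, the shape of `error`, and the names `unwrap_response` and `ParseFailed` are all hypothetical.

```python
import json
from typing import Any, Dict


class ParseFailed(RuntimeError):
    """Raised when the parsing script reports a failure (hypothetical name)."""


def unwrap_response(raw: str) -> Dict[str, Any]:
    """Decode the script's JSON output and fail loudly unless `ok` is true."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ParseFailed(f"response is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ParseFailed("expected a JSON object at the top level")
    if payload.get("ok") is not True:
        # `error` may be a string or an object; its exact shape is an assumption.
        raise ParseFailed(f"parsing failed: {payload.get('error')!r}")
    return payload


if __name__ == "__main__":
    # The `result` key below is illustrative only, not a documented field.
    good = '{"ok": true, "error": null, "result": {"text": "# Title"}}'
    print(unwrap_response(good)["result"]["text"])

    bad = '{"ok": false, "error": {"message": "file not found"}}'
    try:
        unwrap_response(bad)
    except ParseFailed as exc:
        print(exc)
```

Checking `ok` before touching any other key keeps a failed call from being mistaken for an empty document.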

Prerequisites

  • Python 3.9+
  • uv package manager
  • Internet access for API calls
  • PADDLEOCR_DOC_PARSING_API_URL environment variable
  • PADDLEOCR_ACCESS_TOKEN environment variable
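
The prerequisites above can be checked up front before calling the parsing script. The following is a minimal sketch under that assumption; the helper name `missing_prerequisites` is hypothetical, and only the two environment variable names, the Python version and the `uv` requirement come from the list above.

```python
import os
import shutil
import sys
from typing import List, Mapping

# Variable names as given in the prerequisites list.
REQUIRED_ENV = ("PADDLEOCR_DOC_PARSING_API_URL", "PADDLEOCR_ACCESS_TOKEN")


def missing_prerequisites(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9+ is required")
    if shutil.which("uv") is None:
        problems.append("uv was not found on PATH")
    for name in REQUIRED_ENV:
        # Treat unset and blank values the same way.
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"missing prerequisite: {problem}")
```

Internet access is not checked here; a failed API call would surface that at request time.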

Installation

npx skills add PaddlePaddle/PaddleOCR

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

Verified: 99/100
Analyzed about 22 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 77.8k
License: Apache-2.0

Similar Extensions

PaddleOCR Text Recognition

99

Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bbox coordinates. Strong accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别 (text recognition), 图片转文字 (image to text), 截图识字 (recognize text in screenshots), 提取图中文字 (extract text from images), 扫描识字 (recognize text in scans), 识字 (recognize characters), 纯文字 (plain text), plain text extraction, 坐标 (coordinates), 检测框 (detection box), bbox, bounding box, image to text, screenshot, photo scan, recognize text.

Skill
PaddlePaddle

Convert Resume to Markdown

100

Convert a resume PDF to clean markdown for LLM parsing or candidate pipelines.

Skill
iterationlayer

Firecrawl Parse

99

Efficiently extract and convert the contents of any local file—such as PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, or HTML—into clean, well-formatted markdown saved to disk. Use this skill whenever the user requests to parse, read, or extract information from a file on their computer, including phrases like “parse this PDF”, “convert this document”, “read this file”, “extract text from”, or when a local file path (not a URL) is provided. This skill offers advanced options like generating AI-powered summaries and answering questions based on the file's content. Prefer this tool over `scrape` when handling local files to deliver precise, structured outputs for downstream tasks.

Skill
firecrawl

Markdown to Styled PDF

99

Generate a professionally styled PDF document from Markdown content with custom fonts, headers, and page numbers.

Skill
iterationlayer

Document Extraction API

99

Extract structured data from documents using AI-powered field extraction.

Skill
iterationlayer

Nutrient Document Processing

98

Process documents with Nutrient DWS. Use when the user wants to generate PDFs from HTML or URLs, convert Office/images/PDFs, assemble or split packets, OCR scans, extract text/tables/key-value pairs, redact PII, watermark, sign, fill forms, optimize PDFs, or produce compliance outputs like PDF/A or PDF/UA. Triggers include convert to PDF, merge these PDFs, OCR this scan, extract tables, redact PII, sign this PDF, make this PDF/A, or linearize for web delivery.

Skill
PSPDFKit-labs
