PaddleOCR Text Recognition
Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bbox coordinates. Strong accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别 (text recognition), 图片转文字 (image to text), 截图识字 (recognize text in a screenshot), 提取图中文字 (extract text from an image), 扫描识字 (recognize text in a scan), 识字 (recognize characters), 纯文字 (plain text), plain text extraction, 坐标 (coordinates), 检测框 (detection box), bbox, bounding box, image to text, screenshot, photo scan, recognize text.
Extract text from images, photos, scans, screenshots, or scanned PDFs with high accuracy, providing machine-readable strings and optional coordinate data for downstream processing.
Features
- Extract text from images and PDFs
- Support for CJK, small print, and handwritten text
- Line-level text and optional bbox coordinates
- Handles local files and URLs
- Returns structured JSON output with error details
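The listing does not give the JSON schema, so the following is only a minimal sketch of how a caller might consume line-level text with optional bbox coordinates and surface error details. The field names `ok`, `lines`, `text`, `bbox`, and `error` are hypothetical placeholders, not the skill's documented output format.

```python
import json
from typing import List, Optional, Tuple

# Hypothetical payload: every field name here is an assumption made for
# illustration; check the skill's actual output before relying on it.
SAMPLE_RESULT = json.dumps({
    "ok": True,
    "lines": [
        {"text": "Invoice No. 1024", "bbox": [12, 8, 210, 34]},
        {"text": "Total: 58.00"},  # bbox omitted, since coordinates are optional
    ],
    "error": None,
})


def extract_lines(raw: str) -> List[Tuple[str, Optional[List[int]]]]:
    """Return (text, bbox-or-None) pairs; raise if the payload reports an error."""
    result = json.loads(raw)
    if not result.get("ok", False):
        raise RuntimeError(f"OCR failed: {result.get('error')}")
    return [(line["text"], line.get("bbox")) for line in result.get("lines", [])]


if __name__ == "__main__":
    for text, bbox in extract_lines(SAMPLE_RESULT):
        print(text, bbox)
```

Treating the bbox as optional with `dict.get` keeps the same code path working whether or not coordinates were requested.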
Use Cases
- Extracting text from scanned documents for data entry.
- Getting text from screenshots for analysis.
- Digitizing text from photos of signs or documents.
- Processing scanned PDFs to make their text searchable.
Non-Goals
- Extracting text from plain text, code, or markdown files directly.
- Parsing complex document layouts like tables, formulas, or charts.
- Replacing direct text file reading capabilities.
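The use cases and non-goals above suggest a simple routing rule: send images and scanned PDFs to OCR, and read plain text, code, and markdown files directly. A rough sketch of that decision follows; the function name `choose_handler` and both extension sets are illustrative assumptions, not part of the skill.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Illustrative extension sets (assumptions, not an official list from the skill).
OCR_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".webp", ".pdf"}
PLAIN_TEXT_EXTENSIONS = {".txt", ".md", ".py", ".json", ".csv"}


def choose_handler(source: str) -> str:
    """Return 'ocr', 'read_directly', or 'unknown' for a local path or URL."""
    # For URLs, look only at the path so query strings do not hide the extension.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in OCR_EXTENSIONS:
        return "ocr"
    if suffix in PLAIN_TEXT_EXTENSIONS:
        return "read_directly"
    return "unknown"


if __name__ == "__main__":
    for candidate in ("scan.PNG", "https://example.com/a/photo.jpg?x=1", "notes.md"):
        print(candidate, "->", choose_handler(candidate))
```

An extension check cannot tell a scanned PDF from a text-based one, nor detect tables or formulas, so a real router would need a further check before choosing between this skill and a layout-aware parser.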
Installation
npx skills add PaddlePaddle/PaddleOCR
Runs the Vercel skills CLI (skills.sh) via npx. Needs Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar Extensions
PaddleOCR Document Parsing
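Score 99. Use this skill to extract structured Markdown/JSON from PDFs and document images: tables with cell-level precision, formulas as LaTeX, figures, seals, charts, headers/footers, multi-column layout, and correct reading order. Trigger terms: 文档解析 (document parsing), 版面分析 (layout analysis), 版面还原 (layout restoration), 表格提取 (table extraction), 公式识别 (formula recognition), 多栏排版 (multi-column layout), 扫描件结构化 (structuring scanned documents), 发票 (invoice), 财报 (financial report), 复杂 PDF (complex PDF), PDF转Markdown (PDF to Markdown), 图表 (charts), 阅读顺序 (reading order); reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.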
Firecrawl Parse
Score 99. Efficiently extract and convert the contents of any local file (such as PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, or HTML) into clean, well-formatted markdown saved to disk. Use this skill whenever the user requests to parse, read, or extract information from a file on their computer, including phrases like “parse this PDF”, “convert this document”, “read this file”, “extract text from”, or when a local file path (not a URL) is provided. This skill offers advanced options like generating AI-powered summaries and answering questions based on the file's content. Prefer this tool over `scrape` when handling local files to deliver precise, structured outputs for downstream tasks.
Document Extraction API
Score 99. Extract structured data from documents using AI-powered field extraction.
Nutrient Document Processing
Score 98. Process documents with Nutrient DWS. Use when the user wants to generate PDFs from HTML or URLs, convert Office/images/PDFs, assemble or split packets, OCR scans, extract text/tables/key-value pairs, redact PII, watermark, sign, fill forms, optimize PDFs, or produce compliance outputs like PDF/A or PDF/UA. Triggers include convert to PDF, merge these PDFs, OCR this scan, extract tables, redact PII, sign this PDF, make this PDF/A, or linearize for web delivery.
Generate Restaurant Menu
Score 100. Generate a branded restaurant menu PDF with sections, items, prices, and descriptions.
Extract Fleet Vehicle Registration
Score 100. Extract vehicle identification, owner details, registration dates, and technical specifications from vehicle registration documents.