PaddleOCR Text Recognition
Verified skill. Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bbox coordinates. Strong accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别, 图片转文字, 截图识字, 提取图中文字, 扫描识字, 识字, 纯文字, plain text extraction, 坐标, 检测框, bbox, bounding box, image to text, screenshot, photo scan, recognize text.
This skill leverages the PaddleOCR API to perform optical character recognition on images and PDF documents. It returns extracted text with optional line-level bounding box coordinates, supporting various file types and providing detailed error handling and configuration guidance.
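As a rough illustration of how line-level text with optional bounding boxes might be consumed downstream, the sketch below joins recognized lines into plain text in reading order. The `OcrLine` shape, the `[x_min, y_min, x_max, y_max]` bbox layout, and the `to_plain_text` helper are assumptions made for this example; they are not the skill's documented output schema.

```python
from typing import Optional, TypedDict


class OcrLine(TypedDict):
    """Hypothetical shape of one recognized line; not the skill's actual schema."""
    text: str
    bbox: Optional[list[float]]  # assumed [x_min, y_min, x_max, y_max], or None


def to_plain_text(lines: list[OcrLine]) -> str:
    """Join recognized lines top-to-bottom, then left-to-right.

    Lines without a bbox keep their original relative order and are appended last.
    """
    with_box = [ln for ln in lines if ln["bbox"] is not None]
    without_box = [ln for ln in lines if ln["bbox"] is None]
    # Sort by the top edge first, then by the left edge.
    with_box.sort(key=lambda ln: (ln["bbox"][1], ln["bbox"][0]))
    return "\n".join(ln["text"] for ln in with_box + without_box)


# Made-up sample data, deliberately out of reading order.
sample: list[OcrLine] = [
    {"text": "Total: 42.00", "bbox": [10, 80, 200, 100]},
    {"text": "Invoice #7", "bbox": [10, 10, 200, 30]},
]
print(to_plain_text(sample))
```

Sorting by the top edge and then the left edge is a simple heuristic that suits single-column images; multi-column layouts generally need real layout analysis.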
Versioning
- Warning (Release Management): No explicit versioning information (e.g., a version field in SKILL.md or package.json, or a CHANGELOG) is present for the skill itself. The README references a specific commit hash from the upstream repository.
Installation
npx skills add aidenwu0209/paddleocr-skills
Runs the Vercel skills CLI (skills.sh) via npx; requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar Extensions
PaddleOCR Document Parsing (98)
Use this skill to extract structured Markdown/JSON from PDFs and document images: tables with cell-level precision, formulas as LaTeX, figures, seals, charts, headers/footers, multi-column layout, and correct reading order. Trigger terms: 文档解析, 版面分析, 版面还原, 表格提取, 公式识别, 多栏排版, 扫描件结构化, 发票, 财报, 复杂 PDF, PDF转Markdown, 图表, 阅读顺序; reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.
PDF OCR Extraction (95)
Extract text from scanned PDFs using optical character recognition
PDF to DOCX Converter (98)
Convert PDF files to editable Word documents using pdf2docx
Office MCP Server (94)
MCP server with 39 tools for Word, Excel, PowerPoint, PDF, OCR operations
Document Parser Skill (92)
PDF Processing Guide
Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.