Firecrawl Parse
Skill · Verified · Active
Efficiently extract and convert the contents of any local file (such as PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, or HTML) into clean, well-formatted markdown saved to disk. Use this skill whenever the user asks to parse, read, or extract information from a file on their computer, including phrases like “parse this PDF”, “convert this document”, “read this file”, “extract text from”, or when a local file path (not a URL) is provided. The skill also offers advanced options such as generating AI-powered summaries and answering questions based on the file's content. Prefer this tool over `scrape` for local files so that downstream tasks receive precise, structured output.
Purpose: convert local documents into clean, well-formatted markdown so that file content is easier to access and process.
Features
- Convert local files (PDF, DOCX, XLSX, HTML, etc.) to markdown
- Generate AI-powered summaries of file content
- Answer questions based on parsed file content
- Save extracted content to disk
- Handle local files rather than URLs, which distinguishes it from URL scraping tools
Use cases
- When a user requests to parse, read, or extract information from a local file.
- When a local file path (not a URL) is provided for processing.
- To create clean markdown versions of documents for downstream tasks.
- To quickly summarize or get answers from a document without reading it fully.
Non-goals
- Processing files from URLs (use `firecrawl-scrape` instead)
- Streaming large outputs to stdout (prefer saving to disk)
- Handling files larger than 50MB
- Replacing general-purpose file viewers
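The use cases and non-goals above amount to a small routing rule: URLs go to `firecrawl-scrape`, while supported local files within the size limit go to this skill. A minimal sketch of that rule follows. The helper name `choose_tool`, the extension set (taken from the "such as" list in the description, so it may not be exhaustive), and the reading of 50MB as 50 * 1024 * 1024 bytes are all assumptions made for illustration, not part of the skill itself.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Extensions named in the skill description; the real list may be longer.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".odt", ".rtf", ".xlsx", ".xls", ".html"}

# Size ceiling from the non-goals list; the exact byte count is an assumption.
MAX_BYTES = 50 * 1024 * 1024


def choose_tool(target: str, size_bytes: Optional[int] = None) -> str:
    """Hypothetical helper: return "scrape", "parse", or "unsupported"."""
    scheme = urlparse(target).scheme.lower()
    if scheme in ("http", "https"):
        # URLs are a non-goal for this skill and belong to firecrawl-scrape.
        return "scrape"
    if Path(target).suffix.lower() not in SUPPORTED_EXTENSIONS:
        return "unsupported"
    if size_bytes is not None and size_bytes > MAX_BYTES:
        # Files over the 50MB limit are listed as a non-goal.
        return "unsupported"
    return "parse"


if __name__ == "__main__":
    for example in ("https://example.com/report.pdf", "./docs/report.pdf", "notes.txt"):
        print(example, "->", choose_tool(example))
```

The size check is optional so that the caller can decide whether to stat the file first; the sketch itself never touches the filesystem.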
Installation
First add the marketplace, then install the plugin:
/plugin marketplace add firecrawl/cli
/plugin install cli@firecrawl
Similar extensions
PaddleOCR Document Parsing
99 · Use this skill to extract structured Markdown/JSON from PDFs and document images: tables with precise cell definitions, formulas as LaTeX, figures, seals, charts, headers/footers, multi-column layout, and correct reading order. Trigger terms: 文档解析, 版面分析, 版面还原, 表格提取, 公式识别, 多栏排版, 扫描件结构化, 发票, 财报, 复杂 PDF, PDF转Markdown, 图表, 阅读顺序; reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.
Convert Resume to Markdown
100 · Convert a resume PDF to clean markdown for LLM parsing or candidate pipelines.
Paddleocr Text Recognition
99 · Use this skill when the user wants to extract text from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bounding-box coordinates. High accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别, 图片转文字, 截图识字, 提取图中文字, 扫描识字, 识字, 纯文字, plain text extraction, 坐标, 检测框, bbox, bounding box, image to text, screenshot, photo scan, recognize text.
Markdown to Styled PDF
99 · Generate a professionally styled PDF document from Markdown content with custom fonts, headers, and page numbers.
Trader Regime
100 · Detect the current market regime using npx neural-trader: bull/bear/ranging/volatile classification with a recommended strategy.
Setup
100 · Use first for install/update routing: sends setup, doctor, or MCP requests to the correct OMC setup flow.