Firecrawl Parse

Skill Verified Active

Extract and convert the contents of a local file (PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, or HTML) into clean, well-formatted markdown saved to disk. Use this skill whenever the user asks to parse, read, or extract information from a file on their computer, including phrases like “parse this PDF”, “convert this document”, “read this file”, or “extract text from”, or whenever a local file path (not a URL) is provided. Optional features include AI-generated summaries and answers to questions about the file's content. Prefer this skill over `scrape` for local files, so that downstream tasks receive precise, structured output.

Purpose

Convert local documents into clean, well-formatted markdown so that their content is easier to access and process.

Features

Convert local files (PDF, DOCX, XLSX, HTML, etc.) to markdown
Generate AI-powered summaries of file content
Answer questions based on parsed file content
Save extracted content to disk
Handle local files only, leaving URLs to the scraping tools

Use Cases

When a user requests to parse, read, or extract information from a local file.
When a local file path (not a URL) is provided for processing.
To create clean markdown versions of documents for downstream tasks.
To quickly summarize or get answers from a document without reading it fully.

Non-Goals

Processing files from URLs (use `firecrawl-scrape` instead)
Streaming large outputs to stdout (prefer saving to disk)
Handling files larger than 50MB
Replacing general-purpose file viewers
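
The boundaries above (local files only, listed formats, a 50MB ceiling) can be sketched as a small routing check. This is an illustrative sketch, not part of the Firecrawl CLI: the function `choose_skill`, the skill name `firecrawl-parse`, the returned message strings, and the reading of "50MB" as 50 × 1024 × 1024 bytes are all assumptions made for the example. Only `firecrawl-scrape` and the extension list come from the text.

```python
from pathlib import Path
from urllib.parse import urlparse

# File types named in the skill description.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".odt", ".rtf", ".xlsx", ".xls", ".html"}

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_BYTES = 50 * 1024 * 1024


def choose_skill(target: str) -> str:
    """Return which skill (hypothetical names) should handle `target`."""
    # URLs are a non-goal for parsing; they go to the scrape skill.
    if urlparse(target).scheme.lower() in {"http", "https"}:
        return "firecrawl-scrape"

    path = Path(target)
    if not path.is_file():
        return "unsupported: not a local file"
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        return "unsupported: file type"
    if path.stat().st_size > MAX_BYTES:
        return "unsupported: larger than 50MB"
    return "firecrawl-parse"
```

Calling `choose_skill("https://docs.firecrawl.dev")` returns `"firecrawl-scrape"`, while a path to an existing PDF under the size limit returns `"firecrawl-parse"`.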

Installation

First, add the marketplace:

/plugin marketplace add firecrawl/cli

Then install the plugin:

/plugin install cli@firecrawl

Quality Score

Verified

99/100

Analyzed about 15 hours ago

Trust Signals

Last commit: 1 day ago

GitHub owner: firecrawl

Stars: 383

Downloads: 51.1k

Website: docs.firecrawl.dev
