Extract Article Text

Skill Active

Extract clean article content — title, author, date, and body text — from PDFs, Word docs, and web pages.

Purpose

Extract clean, structured article content from various document formats for use in content aggregation, summarization, or knowledge base ingestion.

Features

  • Extracts title, author, and publication date
  • Extracts main body text, ignoring headers/footers
  • Supports PDFs, Word docs, and web pages
  • Supports a custom extraction schema for specific fields
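
To make the custom-schema feature more concrete, here is a minimal sketch of what a field schema and a client-side check of an extraction result might look like. The field names, the `type` / `required` / `max_length` keys, and the `validate_extraction` helper are all assumptions for illustration; the skill's actual schema format is not shown on this page.

```python
from typing import Any

# Hypothetical schema: field names and keys are illustrative,
# not the skill's documented format.
ARTICLE_SCHEMA: dict[str, dict[str, Any]] = {
    "title": {"type": str, "required": True, "max_length": 300},
    "author": {"type": str, "required": False, "max_length": 200},
    "publication_date": {"type": str, "required": False, "max_length": 32},
    "body": {"type": str, "required": True, "max_length": None},
}


def validate_extraction(
    result: dict[str, Any], schema: dict[str, dict[str, Any]]
) -> list[str]:
    """Return a list of problems; an empty list means the result fits the schema."""
    problems: list[str] = []
    for name, rule in schema.items():
        value = result.get(name)
        if value is None:
            # Absent optional fields are fine; absent required fields are not.
            if rule["required"]:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(value, rule["type"]):
            problems.append(f"wrong type for field: {name}")
            continue
        limit = rule["max_length"]
        if limit is not None and len(value) > limit:
            problems.append(f"field too long: {name}")
    return problems
```

Checking results on the client side like this is one way to catch incomplete extractions before they reach a newsletter or knowledge base pipeline.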

Use cases

  • Processing articles for a newsletter platform
  • Ingesting research papers into a knowledge base
  • Extracting key information from legal documents
  • Content summarization workflows

Non-goals

  • Analyzing or interpreting the extracted content
  • Modifying or generating documents
  • Performing OCR on image-based documents

Security

  • Warning: Secret Management. The API key appears as 'YOUR_API_KEY' in code examples and prompts, but there is no clear guidance on how to manage this key securely, such as using environment variables or a secrets manager.
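
One common way to address this is to read the key from an environment variable instead of pasting it into code. The sketch below assumes a variable named `ITERATION_LAYER_API_KEY`; that name and the `load_api_key` helper are hypothetical and not taken from the skill's documentation.

```python
import os
from collections.abc import Mapping

PLACEHOLDER = "YOUR_API_KEY"
ENV_VAR = "ITERATION_LAYER_API_KEY"  # hypothetical variable name


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment; reject empty or placeholder values."""
    key = env.get(ENV_VAR, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"Set {ENV_VAR} to a real API key before running.")
    return key
```

Failing fast on the literal placeholder helps catch copy-pasted examples before they reach the API, and keeping the key out of source files keeps it out of version control.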

Compliance

  • Info: GDPR. The skill extracts content from documents and does not explicitly operate on personal data without sanitization. However, the LLM processing of extracted content may involve personal data.

Practical Utility

  • Info: Edge cases. The API schema defines required fields and maximum lengths, but SKILL.md does not explicitly document failure modes for edge cases such as malformed inputs or API rate limits.
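
Because rate-limit behavior is undocumented, callers may want to handle it defensively. The following is a rough sketch of retrying with exponential backoff; `RateLimitError` and `call_with_backoff` are hypothetical names, and how the API actually signals a rate limit is an assumption here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a client wrapper might raise when it is rate limited."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting. Malformed inputs are a different case: retrying will not help, so those errors are better surfaced immediately.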

Installation

Add the marketplace first

/plugin marketplace add iterationlayer/skills
/plugin install skills@iterationlayer-skills

Quality score

95/100
Analyzed about 21 hours ago

Trust signals

Last commit: 16 days ago
Stars: 0
License: MIT