Document Hunter
Searches and retrieves documents from free public sources using automated browser navigation. Use when research needs primary source documents like court filings, government reports, or public records.
Purpose: automate the time-consuming task of finding and downloading primary source documents from free public archives for research.
Features
- Searches public document archives systematically
- Automates browser navigation and PDF downloads
- Organizes downloaded documents with metadata
- Reports on found documents, sources searched, and remaining gaps
Use Cases
- Retrieving court filings for legal research
- Finding government reports for policy analysis
- Gathering public records for investigative journalism
- Automating the collection of primary source material for academic research
Non-Goals
- Accessing documents behind paywalls (e.g., full PACER access without RECAP)
- Downloading copyrighted material
- Replacing manual verification of document content
- Providing legal advice based on retrieved documents
Documentation
- info: Configuration & parameter reference. The SKILL.md mentions an argument hint but does not fully document the expected arguments or their defaults. The Python script template also implies parameters, but they are not explicitly documented in the skill description.
Maintenance
- warning: Dependency Management. The SKILL.md lists Python dependencies ('playwright', 'beautifulsoup4', 'requests') but does not include a lockfile or explicit version pinning, and lacks automated vulnerability scanning.
Security
- warning: Injection. The skill targets specific websites, but it relies on general web scraping and DOM manipulation. Without careful input handling, the `query_selector_all` and `fill` calls it uses could be exposed to injection.
- warning: Transitive Supply-Chain Grenades. The skill uses Playwright, which fetches browser binaries and scripts. This is generally safe but adds external dependencies. Relying on external websites for content also introduces risk if those sites are compromised or change their structure.
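One way to reduce the injection exposure noted above is to validate every search term and URL before it reaches the browser. The following is a minimal sketch, not the skill's actual code: the function names `clean_query` and `check_url`, the character allowlist, and the host `archive.example.org` are all hypothetical placeholders.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: replace with the hosts the skill actually targets.
ALLOWED_HOSTS = frozenset({"archive.example.org"})

# Assumed policy: letters, digits, whitespace, and a little punctuation, max 200 chars.
_QUERY_RE = re.compile(r"[\w\s.,'&()\-]{1,200}")


def clean_query(raw: str) -> str:
    """Collapse whitespace and reject empty or unexpected input."""
    query = " ".join(raw.split())
    if not _QUERY_RE.fullmatch(query):
        raise ValueError(f"unsupported or empty search query: {raw!r}")
    return query


def check_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> str:
    """Accept only https URLs whose host is on the allowlist."""
    parts = urlparse(url)
    if parts.scheme != "https":
        raise ValueError(f"refusing non-https URL: {url!r}")
    if parts.hostname not in allowed_hosts:
        raise ValueError(f"host not on allowlist: {parts.hostname!r}")
    return url
```

A cleaned query can then be passed to a form-filling call, and a checked URL to page navigation, so that unexpected input fails early with a clear message instead of reaching the page.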
Portability
- warning: Structural Assumption. The `site-patterns.md` describes specific site structures and selectors, which could break if websites change their HTML or JavaScript rendering. The Python code template also assumes specific directory structures for output.
- warning: Runtime stability. The skill requires Playwright and Chromium and, per the README, assumes a POSIX environment (Linux or macOS) for installation and execution. Windows users must use WSL. Without explicit handling, this limits cross-platform stability.
- warning: Stack assumptions. The SKILL.md mentions requirements like Playwright and Chromium, but does not explicitly declare the runtime surface (e.g., Python version) or a minimum version for the bundled scripts. It assumes a POSIX environment.
Versioning
- warning: Release Management. The SKILL.md frontmatter declares a model but not a package version. While the README indicates releases, there is no explicit semver versioning wired into the skill's metadata or installation instructions, and installation defaults to `main`.
Code Execution
- warning: Validation. The Python script template includes placeholders for input validation and sanitization, but the provided code does not show explicit schema validation or sanitization for inputs like case names or URLs.
- warning: Error Handling. The Python script template wraps source functions in a basic try-except block but lacks detailed error categorization, structured error reporting, or specific recovery steps for common issues like site blocking or download failures.
- warning: Logging. Neither the SKILL.md nor the supporting files mention or implement local audit logging for destructive actions or outbound calls. Console output during execution is the primary feedback.
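The missing audit trail could be sketched as a JSON-lines log built on the standard `logging` module. This is an illustration under assumptions, not part of the skill: the logger name `document_hunter.audit`, the helper names, and the record fields are hypothetical.

```python
import json
import logging
import time


def make_audit_logger(path: str) -> logging.Logger:
    """Return a logger that appends one JSON object per line to `path`."""
    logger = logging.getLogger("document_hunter.audit")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit records out of the console output
    if not logger.handlers:  # attach the file handler only once per process
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return logger


def audit(logger: logging.Logger, action: str, **fields) -> dict:
    """Record an outbound call or file operation with a timestamp."""
    record = {"ts": time.time(), "action": action, **fields}
    logger.info(json.dumps(record, sort_keys=True))
    return record
```

Calling `audit(log, "download", url=..., status=200)` before or after each outbound request would leave a local record that can be reviewed after a run, independent of what was printed to the console.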
Install
- warning: Installation instruction. The installation instructions are primarily in the parent README and assume familiarity with the larger project. While it details Playwright setup, it lacks specific copy-paste invocations for this skill and does not clearly document authentication requirements for sources like Scribd.
Errors
- warning: Actionable error messages. The provided script template shows basic error reporting but lacks detailed, actionable messages for common failure modes like site blocking, download failures, or missing documents. Remediation steps are not clearly defined per error.
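To make the failure modes above actionable, each one could be mapped to a category and a remediation hint. The sketch below assumes failures surface either as an HTTP status code or as a `TimeoutError`; the category names, remedy texts, and the `classify_failure` helper are hypothetical and would need to match how the real script reports errors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Failure:
    category: str  # short machine-readable label
    remedy: str    # what the user can do next


def classify_failure(status: Optional[int] = None,
                     exc: Optional[BaseException] = None) -> Failure:
    """Map an HTTP status or exception to a category and a suggested remedy."""
    if isinstance(exc, TimeoutError):
        return Failure("timeout", "Retry later or raise the timeout.")
    if status in (403, 429):
        return Failure("blocked", "Slow down requests or try another source.")
    if status == 404:
        return Failure("not_found", "Check the identifier or search another archive.")
    if status is not None and status >= 500:
        return Failure("server_error", "The source is failing; retry after a delay.")
    return Failure("unknown", "Inspect the raw error and report the gap.")
```

The final report could then list each gap together with its `remedy`, so a reader knows whether to retry, switch sources, or search manually.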
Execution
- warning: Pinned dependencies. The skill lists Python dependencies but does not provide a lockfile or specify exact versions, which can lead to runtime issues due to dependency drift. The shebang/header for scripts is also not fully detailed.
Protocol
- warning: Idempotent retry & timeouts. The Python script template sets basic timeouts for page navigation, but does not implement per-call timeouts for download operations and does not address idempotency for file operations.
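A download step that is safe to retry might look like the following sketch. It assumes plain HTTP downloads via the standard library rather than the skill's own Playwright flow; `download_once`, `fetch_bytes`, and the 30-second default are hypothetical. The file is written to a temporary name and renamed into place, so an interrupted run never leaves a partial file at the destination.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def fetch_bytes(url: str, timeout: float) -> bytes:
    """Default fetcher; `timeout` bounds each blocking socket operation."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_once(url, dest, fetch=fetch_bytes, timeout: float = 30.0) -> bool:
    """Download `url` to `dest` unless it already exists. Returns True if fetched."""
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        return False  # already present: a retry is a no-op
    dest.parent.mkdir(parents=True, exist_ok=True)
    data = fetch(url, timeout)
    # Write next to the destination so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=dest.parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp, dest)  # atomic rename into place
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)  # do not leave partial files behind
        raise
    return True
```

Passing the fetcher as a parameter keeps the idempotency logic separate from the transport, so the same wrapper could sit around a `requests` call or a browser-driven download.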
Practical Utility
- warning: Usage examples. The SKILL.md provides a Python code template but lacks concrete end-to-end usage examples with specific inputs and expected outputs for the 'document-hunter' skill itself. The README has workflow examples for a larger music project.
- warning: Edge cases. The troubleshooting section in `site-patterns.md` and the main SKILL.md touch on some failure modes (site blocked, no results, download fails), but neither documents specific symptoms and recovery steps for each edge case in a structured way, as expected for a complete skill.
Safety
- warning: Halt on unexpected state. Beyond the stated Python requirements, the skill does not list machine-readable preconditions or instruct the agent to abort on unexpected pre-states such as a dirty working tree or missing dependencies. The workflow does not appear to have explicit rollback procedures.
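A preflight check is one way to halt before any browsing starts. The sketch below is an assumption-laden illustration: `preflight` is a hypothetical helper, and the minimum Python version is a placeholder because the skill does not declare one. Note that the `beautifulsoup4` package is imported under the module name `bs4`.

```python
import importlib.util
import os
import sys


def preflight(modules, output_dir, min_python=(3, 8)):
    """Return a list of problems; an empty list means it is safe to proceed."""
    problems = []
    if sys.version_info < min_python:  # placeholder minimum, not from the skill
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required")
    for name in modules:  # top-level import names, e.g. "bs4" for beautifulsoup4
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing Python package: {name}")
    if not os.path.isdir(output_dir):
        problems.append(f"output directory does not exist: {output_dir}")
    elif not os.access(output_dir, os.W_OK):
        problems.append(f"output directory is not writable: {output_dir}")
    return problems
```

A script could call `preflight(["playwright", "bs4", "requests"], output_dir)` first and exit with the listed problems if the result is non-empty, instead of failing partway through a run.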
Installation
First, add the marketplace:
/plugin marketplace add bitwize-music-studio/claude-ai-music-skills
Then install the plugin:
/plugin install claude-ai-music-skills@bitwize-music
Similar Extensions
Agent Browser
(100) Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
Manus
(100) Delegate complex, long-running tasks to Manus AI agent for autonomous execution. Use when user says 'use manus', 'delegate to manus', 'send to manus', 'have manus do', 'ask manus', 'check manus sessions', or when tasks require deep web research, market analysis, product comparisons, stock analysis, competitive research, document generation, data analysis, or multi-step workflows that benefit from autonomous agent execution with parallel processing.
Dev Browser
(99) Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.
Project Session Manager
(100) Worktree-first dev environment manager for issues, PRs, and features with optional tmux sessions
Public Google Drive
(100) Create public Google Docs or Google Sheet files without requiring OAuth. Use this skill to create and edit Google Docs and Sheets, no Google sign-in required. Documents are viewable at shareable links. Registration is automatic on first use.
Oh My Claudecode
(100) Process-first advisor routing for Claude, Codex, or Gemini via `omc ask`, with artifact capture and no raw CLI assembly