Huggingface Datasets
Use this skill for Hugging Face Dataset Viewer API workflows that fetch subset/split metadata, paginate rows, search text, apply filters, download parquet URLs, and read size or statistics.
Features
- Fetch subset/split metadata
- Paginate rows with offset and length
- Search text within dataset rows
- Apply filters with predicate syntax
- Download parquet URLs
- Read dataset size and statistics
- Validate dataset availability
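As a rough illustration of how these features map onto the public Dataset Viewer API, the sketch below builds request URLs against `https://datasets-server.huggingface.co`. The helper names (`viewer_url`, `fetch_json`), the feature labels used as dictionary keys, and the `ibm/duorc` dataset are assumptions made for this example; they are not part of the skill itself.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Public base URL of the Dataset Viewer API.
BASE_URL = "https://datasets-server.huggingface.co"

# Feature label -> endpoint path. The labels on the left are invented
# for this sketch; the paths on the right are Dataset Viewer endpoints.
ENDPOINTS = {
    "validate": "is-valid",      # dataset availability
    "splits": "splits",          # subset/split metadata
    "rows": "rows",              # offset/length pagination
    "search": "search",          # text search (query=...)
    "filter": "filter",          # predicate filters (where=...)
    "parquet": "parquet",        # parquet file URLs
    "size": "size",              # dataset size
    "statistics": "statistics",  # per-column statistics
}


def viewer_url(feature, **params):
    """Build a Dataset Viewer URL, dropping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/{ENDPOINTS[feature]}?{query}"


def fetch_json(url, token=None):
    """GET a URL and decode the JSON body (token is only needed for gated or private data)."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints URLs; call fetch_json(...) on any of them to hit the network.
    print(viewer_url("splits", dataset="ibm/duorc"))
    print(viewer_url("rows", dataset="ibm/duorc", config="SelfRC",
                     split="train", offset=0, length=100))
    print(viewer_url("search", dataset="ibm/duorc", config="SelfRC",
                     split="train", query="dog"))
```

Keeping URL construction separate from the network call makes each request easy to inspect or log before it is sent.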
Use Cases
- Exploring dataset contents programmatically
- Extracting specific subsets of data
- Searching for patterns within dataset text
- Automating data retrieval for ML tasks
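For automated retrieval, one possible pattern is to walk the `/rows` endpoint page by page. The sketch below assumes the response shape of that endpoint (a `rows` list whose items carry a `row` object, plus `num_rows_total`) and its limit of 100 rows per request. `iter_rows` and the `fetch_page` callable are hypothetical names introduced here, with the HTTP call injected so the loop can be exercised offline.

```python
from typing import Callable, Dict, Iterator

MAX_PAGE = 100  # /rows returns at most 100 rows per request


def iter_rows(fetch_page: Callable[[int, int], Dict],
              page_size: int = MAX_PAGE) -> Iterator[Dict]:
    """Yield each row by requesting successive offset/length pages.

    fetch_page(offset, length) is a hypothetical callable that returns
    the decoded JSON of one /rows response.
    """
    if not 1 <= page_size <= MAX_PAGE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE}")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        rows = page.get("rows", [])
        for item in rows:
            yield item["row"]
        offset += len(rows)
        total = page.get("num_rows_total")
        # Stop on an empty page or once the reported total is reached.
        if not rows or (total is not None and offset >= total):
            break


if __name__ == "__main__":
    # Offline stand-in for the API: 250 fake rows served in pages.
    data = [{"row_idx": i, "row": {"id": i}, "truncated_cells": []}
            for i in range(250)]

    def fake_fetch(offset, length):
        return {"rows": data[offset:offset + length],
                "num_rows_total": len(data)}

    print(sum(1 for _ in iter_rows(fake_fetch)))
```

In real use, `fetch_page` would wrap an HTTP GET to `/rows` with the `dataset`, `config`, and `split` parameters fixed and `offset` and `length` passed through.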
Non-Goals
- Creating or uploading datasets (use hf-cli)
- Running ML models
- Training or fine-tuning models
- Managing Hugging Face Hub resources beyond dataset viewing
Installation
/plugin install skills@huggingface-skills
Similar Extensions
Website Extraction Api
Score 100. Extract typed JSON from public website pages using a schema.
Extract Supplier Catalog From Website
Score 100. Extract SKUs, product names, unit prices, availability, and minimum order quantities from a supplier catalog page.
Hugging Science
Score 98. Use when the user is doing AI/ML work in a scientific domain: biology, chemistry, physics, astronomy, climate, genomics, materials science, medicine, ecology, energy, conservation, engineering, mathematics, scientific reasoning, drug discovery, protein design, weather modeling, theorem proving, single-cell, PDE solving, or anything similar. Hugging Science (huggingscience.co) is a curated catalog of scientific datasets, models, blog posts, and interactive Spaces; the `hugging-science` org on Hugging Face hosts community datasets, models, and demo Spaces. This skill helps you discover the right resource and actually use it: loading datasets via `datasets`, running models via `transformers` or the HF Inference API, calling Spaces like BoltzGen via `gradio_client`, and citing blog posts for methodology. Trigger this skill whenever a user mentions a scientific ML task, asks for "a dataset/model for X" where X is a scientific topic, wants to fine-tune on scientific data, asks about protein / molecule / genome / climate / materials / astronomy / pathology / weather ML, or needs AI tools for research, even if they never say "Hugging Science" explicitly. The catalog is purpose-built for LLM agents (it ships an `llms-full.txt`); prefer it over generic web search for these tasks.
X Twitter Scraper
Score 100. Use when the user needs X (Twitter) data or confirmation-gated X actions through Xquik: tweet search, user lookup, follower extraction, media download, monitoring, webhooks, MCP, SDKs, posting, likes, DMs, and profile updates. Requires a Xquik API key. Never ask for X login material.
Slack
Score 100. Use the Slack tool to react, pin/unpin, send, edit, delete messages, or fetch Slack member info.
Github
Score 100. Use gh for GitHub issues, PR status, CI/logs, comments, reviews, releases, and API queries.