Huggingface Datasets

Skill Verified Active

Purpose

Use this skill for Hugging Face Dataset Viewer API workflows that fetch subset/split metadata, paginate rows, search text, apply filters, download parquet URLs, and read size or statistics.

Features

Fetch subset/split metadata
Paginate rows with offset and length
Search text within dataset rows
Apply filters with predicate syntax
Download parquet URLs
Read dataset size and statistics
Validate dataset availability
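
As a rough sketch of how the features above could map to HTTP requests, the snippet below builds one URL per feature without sending anything. The base URL `https://datasets-server.huggingface.co`, the endpoint names, and the parameter names are assumptions taken from the public Dataset Viewer API, not from this page. The helper `viewer_url`, the dataset `user/dataset`, the column `label`, and the search term are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumed public base URL of the Dataset Viewer API (not stated in this listing).
BASE_URL = "https://datasets-server.huggingface.co"


def viewer_url(endpoint: str, **params) -> str:
    """Build a Dataset Viewer URL, skipping parameters whose value is None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/{endpoint}?{query}"


# Placeholder dataset, subset (called "config" by the API), and split names.
DATASET, CONFIG, SPLIT = "user/dataset", "default", "train"

urls = {
    # Validate dataset availability
    "validate": viewer_url("is-valid", dataset=DATASET),
    # Fetch subset/split metadata
    "splits": viewer_url("splits", dataset=DATASET),
    # Paginate rows with offset and length
    "rows": viewer_url("rows", dataset=DATASET, config=CONFIG, split=SPLIT,
                       offset=0, length=100),
    # Search text within dataset rows
    "search": viewer_url("search", dataset=DATASET, config=CONFIG, split=SPLIT,
                         query="climate"),
    # Apply a filter predicate (column in double quotes, SQL-like condition)
    "filter": viewer_url("filter", dataset=DATASET, config=CONFIG, split=SPLIT,
                         where='"label"=1'),
    # List parquet file URLs
    "parquet": viewer_url("parquet", dataset=DATASET),
    # Read dataset size and statistics
    "size": viewer_url("size", dataset=DATASET),
    "statistics": viewer_url("statistics", dataset=DATASET, config=CONFIG,
                             split=SPLIT),
}

for name, url in urls.items():
    print(name, url)
```

Building the URLs separately from sending them keeps the sketch testable offline; any HTTP client can then fetch them, optionally with an authorization header for private or gated datasets.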

Use Cases

Exploring dataset contents programmatically
Extracting specific subsets of data
Searching for patterns within dataset text
Automating data retrieval for ML tasks
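
For the automated-retrieval use case, one possible approach is to page through a split with `offset` and `length` until enough rows are collected. The sketch below assumes the public `/rows` endpoint at `https://datasets-server.huggingface.co`, a response shaped like `{"rows": [{"row": {...}}, ...]}`, and a per-request cap of 100 rows; none of these details appear on this page. `iter_rows`, `http_get_json`, and `PAGE_SIZE` are hypothetical names introduced for illustration.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://datasets-server.huggingface.co"  # assumed public endpoint
PAGE_SIZE = 100  # assumed maximum value accepted for `length`


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (default network fetcher)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_rows(dataset: str, config: str, split: str, limit: int,
              fetch: Callable[[str], dict] = http_get_json) -> Iterator[dict]:
    """Yield up to `limit` rows, requesting at most PAGE_SIZE rows per call."""
    offset = 0
    while offset < limit:
        length = min(PAGE_SIZE, limit - offset)
        query = urlencode({"dataset": dataset, "config": config,
                           "split": split, "offset": offset, "length": length})
        page = fetch(f"{BASE_URL}/rows?{query}")
        rows = page.get("rows", [])
        if not rows:
            return  # the split has fewer rows than requested
        for item in rows:
            yield item["row"]
        offset += len(rows)


# Example call (performs network requests; dataset name is a placeholder):
# for row in iter_rows("user/dataset", "default", "train", limit=250):
#     print(row)
```

Passing the fetcher as a parameter lets the paging logic be exercised with a stub instead of live requests, and stopping on an empty page avoids looping past the end of a short split.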

Non-Goals

Creating or uploading datasets (use hf-cli)
Running ML models
Training or fine-tuning models
Managing Hugging Face Hub resources beyond dataset viewing

Installation

/plugin install skills@huggingface-skills

Quality Score

Verified

97/100

Analyzed about 16 hours ago

Trust Signals

Last commit: 2 days ago

GitHub owner: huggingface

Stars: 10.5k

License: Apache-2.0

Website: huggingface.co
