Huggingface Datasets

Skill · Verified · Active

Purpose

Use this skill for Hugging Face Dataset Viewer API workflows that fetch subset/split metadata, paginate rows, search text, apply filters, download parquet URLs, and read size or statistics.

Features

  • Fetch subset/split metadata
  • Paginate rows with offset and length
  • Search text within dataset rows
  • Apply filters with predicate syntax
  • Download parquet URLs
  • Read dataset size and statistics
  • Validate dataset availability
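As a rough illustration of how the features above could map onto HTTP requests, the sketch below builds one URL per feature. It assumes the public Dataset Viewer API host `https://datasets-server.huggingface.co` and its endpoints (`is-valid`, `splits`, `rows`, `search`, `filter`, `parquet`, `size`, `statistics`); the listing itself names none of these, and the helper `viewer_url`, the dataset name `user/name`, and the `where` predicate are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumed public Dataset Viewer API host; the listing does not state a URL.
BASE_URL = "https://datasets-server.huggingface.co"


def viewer_url(endpoint: str, **params) -> str:
    """Build a Dataset Viewer API URL, dropping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/{endpoint}?{query}"


# Placeholder dataset coordinates (hypothetical; substitute a real dataset).
ds = {"dataset": "user/name", "config": "default", "split": "train"}

# One URL per feature in the list above.
urls = {
    "is_valid": viewer_url("is-valid", dataset=ds["dataset"]),
    "splits": viewer_url("splits", dataset=ds["dataset"]),
    "rows": viewer_url("rows", **ds, offset=0, length=100),
    "search": viewer_url("search", **ds, query="hello", offset=0, length=10),
    # Assumed predicate syntax: column in double quotes, then a comparison.
    "filter": viewer_url("filter", **ds, where='"label"=1', offset=0, length=10),
    "parquet": viewer_url("parquet", dataset=ds["dataset"]),
    "size": viewer_url("size", dataset=ds["dataset"]),
    "statistics": viewer_url("statistics", **ds),
}

for name, url in urls.items():
    print(f"{name}: {url}")
```

Keeping URL construction separate from the network call makes the request shape easy to inspect before anything is sent; each URL can then be passed to any HTTP client.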

Use Cases

  • Exploring dataset contents programmatically
  • Extracting specific subsets of data
  • Searching for patterns within dataset text
  • Automating data retrieval for ML tasks
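For the automated-retrieval use case, a pagination loop over `offset` and `length` might look like the sketch below. It assumes a response shaped like the Dataset Viewer `/rows` payload (a `rows` list whose items carry a `row` dict, plus `num_rows_total`) and a page size cap of 100; `iter_rows` and `fake_fetch` are hypothetical names, and the in-memory fetcher stands in for a real HTTP call so the example runs offline.

```python
from typing import Callable, Iterator

PAGE_SIZE = 100  # assumed per-request cap on `length`


def iter_rows(
    fetch_page: Callable[[int, int], dict], page_size: int = PAGE_SIZE
) -> Iterator[dict]:
    """Yield row dicts page by page until the source is exhausted."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        rows = page.get("rows", [])
        for item in rows:
            yield item["row"]
        offset += len(rows)
        total = page.get("num_rows_total")
        # Stop on an empty page or once the reported total has been reached.
        if not rows or (total is not None and offset >= total):
            break


def fake_fetch(offset: int, length: int) -> dict:
    """Hypothetical stand-in for an HTTP request: serves 250 synthetic rows."""
    data = [{"text": f"row {i}"} for i in range(250)]
    chunk = data[offset:offset + length]
    return {
        "rows": [{"row_idx": offset + i, "row": r} for i, r in enumerate(chunk)],
        "num_rows_total": len(data),
    }


all_rows = list(iter_rows(fake_fetch))
print(len(all_rows))
```

Passing the fetcher in as a callable keeps the loop independent of any particular HTTP client, so the same generator can be pointed at a real request function later.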

Non-Goals

  • Creating or uploading datasets (use hf-cli)
  • Running ML models
  • Training or fine-tuning models
  • Managing Hugging Face Hub resources beyond dataset viewing

Installation

/plugin install skills@huggingface-skills

Quality Score

Verified
97/100
Analyzed about 16 hours ago

Trust Signals

Last commit: 2 days ago
Stars: 10.5k
License: Apache-2.0

© 2025 SkillRepo · Find the right skill, skip the noise.