
Huggingface Vision Trainer

Plugin · Verified · Active

Train and fine-tune object detection models (RTDETRv2, YOLOS, DETR, and others) and image classification models (timm and transformers models, including MobileNetV3, MobileViT, ResNet, and ViT/DINOv3) with the Transformers Trainer API, either on Hugging Face Jobs infrastructure or locally. Includes COCO dataset format support, Albumentations augmentation, mAP/mAR metrics, trackio experiment tracking, hardware selection, and Hub persistence.

Purpose

Lets users train and fine-tune computer vision models on Hugging Face's cloud GPUs instead of provisioning and managing local GPU infrastructure.

Features

  • Train object detection models (RTDETRv2, YOLOS, DETR)
  • Train image classification models (timm, transformers)
  • Train SAM/SAM2 segmentation models
  • Support for COCO dataset format and Albumentations augmentation
  • Integration with Hugging Face Jobs for cloud GPU training
  • Automated dataset validation and Hub persistence
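
To make the COCO format and dataset validation items more concrete, here is a minimal sketch of the kind of check a detection dataset might go through before training. It relies only on the standard COCO convention that a box is stored as [x_min, y_min, width, height]. The function names coco_to_corners and find_invalid_boxes are hypothetical and are not taken from the plugin; its actual validation logic may differ.

```python
from typing import Dict, List, Tuple


def coco_to_corners(bbox: List[float]) -> Tuple[float, float, float, float]:
    """Convert a COCO box [x_min, y_min, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def find_invalid_boxes(
    annotations: List[Dict],
    image_sizes: Dict[int, Tuple[int, int]],
) -> List[int]:
    """Return ids of annotations whose box is degenerate or extends outside its image.

    image_sizes maps image_id -> (width, height). Hypothetical helper, for illustration only.
    """
    bad = []
    for ann in annotations:
        width, height = image_sizes[ann["image_id"]]
        x_min, y_min, x_max, y_max = coco_to_corners(ann["bbox"])
        degenerate = x_max <= x_min or y_max <= y_min
        outside = x_min < 0 or y_min < 0 or x_max > width or y_max > height
        if degenerate or outside:
            bad.append(ann["id"])
    return bad


if __name__ == "__main__":
    sizes = {1: (640, 480)}
    anns = [
        {"id": 10, "image_id": 1, "category_id": 0, "bbox": [10, 20, 100, 50]},
        # Spills past the right edge of a 640-pixel-wide image.
        {"id": 11, "image_id": 1, "category_id": 0, "bbox": [600, 400, 100, 50]},
        # Zero width, so the box has no area.
        {"id": 12, "image_id": 1, "category_id": 0, "bbox": [5, 5, 0, 30]},
    ]
    print(find_invalid_boxes(anns, sizes))  # prints [11, 12]
```

Catching boxes like these before a job starts is cheaper than finding them mid-run on a cloud GPU, since augmentation libraries and detection losses commonly reject or mishandle zero-area and out-of-bounds boxes.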

Use Cases

  • Fine-tuning object detection models on custom datasets.
  • Training image classification models for specific tasks.
  • Experimenting with SAM/SAM2 models for segmentation on new data.
  • Leveraging cloud GPUs for computationally intensive vision model training.

Non-Goals

  • Running training jobs on local hardware (though scripts can be run locally for inspection).
  • Providing a graphical user interface for model training.
  • Managing or providing datasets; users must supply their own datasets on the Hub.

Installation

First, add the marketplace, then install the plugin:

/plugin marketplace add huggingface/skills
/plugin install huggingface-vision-trainer@huggingface-skills

Quality Score

Verified
96/100
Analyzed about 14 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 10.5k
License: Apache-2.0
