Train Sentence Transformers
Train or fine-tune sentence-transformers models across `SentenceTransformer` (bi-encoder; dense or static embedding model for retrieval, similarity, clustering, classification, paraphrase mining, deduplication, and multimodal tasks), `CrossEncoder` (reranker; pair scoring for two-stage retrieval and pair classification), and `SparseEncoder` (SPLADE-style sparse embedding model for learned-sparse retrieval). Covers loss selection, hard-negative mining, evaluators, distillation, LoRA, Matryoshka, and Hugging Face Hub publishing. Use for any sentence-transformers training task.
The goal is to enable users to train or fine-tune sentence-transformers models for diverse NLP tasks by providing example scripts, best practices, and comprehensive reference documentation.
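For the bi-encoder case, a minimal fine-tune with the v3+ trainer API looks roughly like the sketch below; the checkpoint, the two-row toy dataset, and the output paths are placeholders, not something this skill prescribes:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Any bi-encoder checkpoint works; this one is a small, fast baseline.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# (anchor, positive) pairs; in-batch negatives come from the other rows.
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do plants make food?"],
    "positive": ["Paris is the capital of France.", "Plants make food by photosynthesis."],
})

loss = MultipleNegativesRankingLoss(model)
args = SentenceTransformerTrainingArguments(
    output_dir="models/my-embedding-model",  # placeholder path
    num_train_epochs=1,
    per_device_train_batch_size=32,  # larger batches mean more in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
model.save_pretrained("models/my-embedding-model/final")
```

Wrapping the same loss in `MatryoshkaLoss` trains embeddings that remain usable when truncated to smaller dimensions, which is the Matryoshka technique named above.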
Features
- Trains bi-encoder, cross-encoder, and sparse-encoder models
- Supports a range of training techniques (losses, hard-negative mining, distillation, LoRA); see the mining sketch after this list
- Includes runnable Python scripts for common scenarios
- Provides detailed reference documentation for configuration and troubleshooting
- Facilitates Hugging Face Hub publishing
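As one example of the mining feature, `mine_hard_negatives` from `sentence_transformers.util` searches a corpus for lookalike negatives. This sketch uses a two-row toy dataset purely for shape; real mining needs a corpus with many passages, and the column names and thresholds here are illustrative:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# (query, answer) pairs; by default the answers double as the mining corpus.
pairs = Dataset.from_dict({
    "query": ["What is the capital of France?", "How do plants make food?"],
    "answer": ["Paris is the capital of France.", "Plants make food by photosynthesis."],
})

hard = mine_hard_negatives(
    pairs,
    model,
    num_negatives=1,         # negatives mined per pair
    range_min=1,             # skip the closest hit, which is often the positive itself
    margin=0,                # keep only negatives scored below the positive
    sampling_strategy="top", # take the hardest remaining candidates
)
```

The mined dataset can then feed a triplet-style loss such as `MultipleNegativesRankingLoss` in the trainer sketch above.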
Use Cases
- Training a custom embedding model for retrieval tasks
- Fine-tuning a large language model for reranking using LoRA (see the cross-encoder sketch after this list)
- Adapting a pre-trained model to a specific domain using distillation
- Experimenting with different training strategies and hyperparameters
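For the reranking use case, a minimal `CrossEncoder` fine-tune under the v4+ trainer API might look as follows; LoRA is omitted for brevity, and the base checkpoint, toy labels, and output path are placeholders:

```python
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

# num_labels=1 yields a single relevance score per (query, passage) pair.
model = CrossEncoder("microsoft/MiniLM-L12-H384-uncased", num_labels=1)

train_dataset = Dataset.from_dict({
    "query": ["capital of France", "capital of France"],
    "passage": ["Paris is the capital of France.", "Berlin is the capital of Germany."],
    "label": [1.0, 0.0],  # 1.0 = relevant, 0.0 = irrelevant
})

loss = BinaryCrossEntropyLoss(model)
args = CrossEncoderTrainingArguments(
    output_dir="models/my-reranker",  # placeholder path
    num_train_epochs=1,
)
CrossEncoderTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss).train()
```

In a two-stage setup the bi-encoder retrieves candidates cheaply and this reranker rescores the top hits.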
Non-Goals
- Providing pre-trained models directly
- Automating hyperparameter search (though references discuss it)
- Executing training jobs without user intervention
- Handling dataset creation from raw text
Practices
- Model training
- Contrastive learning
- Distillation
- Transfer learning
- Model evaluation (see the evaluator sketch after this list)
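As one concrete evaluation setup, `EmbeddingSimilarityEvaluator` scores a bi-encoder against a semantic-similarity benchmark; the checkpoint and dataset split here are illustrative choices, not requirements:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
stsb = load_dataset("sentence-transformers/stsb", split="validation")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    main_similarity=SimilarityFunction.COSINE,
    name="stsb-dev",
)
print(evaluator(model))  # dict of metric name -> value (e.g. Spearman correlation)
```

The same evaluator can be passed to the trainer to track quality during training rather than only afterwards.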
Prerequisites
- pip install "sentence-transformers[train]>=5.0"
- pip install "datasets>=2.19.0"
- pip install "accelerate>=0.26.0"
- pip install trackio
- GPU strongly recommended (a quick environment check follows this list)
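A quick sanity check that the pinned packages import and a GPU is visible; this is a generic sketch, not part of the skill itself:

```python
import torch
import sentence_transformers, datasets, accelerate

print("sentence-transformers:", sentence_transformers.__version__)
print("datasets:", datasets.__version__)
print("accelerate:", accelerate.__version__)
print("CUDA available:", torch.cuda.is_available())
```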
Installation
/plugin install skills@huggingface-skills
Similar Extensions
PyTorch Lightning (100)
Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), and distributed training (DDP, FSDP, DeepSpeed) for scalable neural network training.
Nnsight Remote Interpretability (99)
Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when running interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.
Geniml (99)
This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.
Huggingface Llm Trainer (99)
Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO, and reward-modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts in PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, model selection/leaderboards, and model persistence. Use for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without a local GPU setup.
Transformers.js (99)
Use Transformers.js to run state-of-the-art machine learning models directly in JavaScript/TypeScript. Supports NLP (text classification, translation, summarization), computer vision (image classification, object detection), audio (speech recognition, audio classification), and multimodal tasks. Works in browsers and server-side runtimes (Node.js, Bun, Deno) with WebGPU/WASM using pre-trained models from the Hugging Face Hub.
Transformers (98)
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.