Train Sentence Transformers
Train or fine-tune sentence-transformers models across `SentenceTransformer` (bi-encoder; dense or static embedding model for retrieval, similarity, clustering, classification, paraphrase mining, deduplication, and multimodal tasks), `CrossEncoder` (reranker; pair scoring for two-stage retrieval and pair classification), and `SparseEncoder` (SPLADE-style sparse embedding model for learned-sparse retrieval). Covers loss selection, hard-negative mining, evaluators, distillation, LoRA, Matryoshka, and Hugging Face Hub publishing. Use for any sentence-transformers training task.
The goal is to enable users to train or fine-tune sentence-transformers models for diverse NLP tasks by providing example scripts, best practices, and comprehensive reference documentation.
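For the bi-encoder case, a minimal fine-tune with the v3+ trainer API looks roughly like the sketch below; the checkpoint, the two-row toy dataset, and the output paths are placeholders, not something this skill prescribes:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Any bi-encoder checkpoint works; this one is a small, fast baseline.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# (anchor, positive) pairs; in-batch negatives come from the other rows.
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do plants make food?"],
    "positive": ["Paris is the capital of France.", "Plants make food by photosynthesis."],
})

loss = MultipleNegativesRankingLoss(model)
args = SentenceTransformerTrainingArguments(
    output_dir="models/my-embedding-model",  # placeholder path
    num_train_epochs=1,
    per_device_train_batch_size=32,  # larger batches mean more in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
model.save_pretrained("models/my-embedding-model/final")
```

Wrapping the same loss in `MatryoshkaLoss` trains embeddings that remain usable when truncated to smaller dimensions, which is the Matryoshka technique named above.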
Features
- Trains bi-encoder, cross-encoder, and sparse-encoder models
- Supports a range of training techniques (losses, hard-negative mining, distillation, LoRA); see the mining sketch after this list
- Includes runnable Python scripts for common scenarios
- Provides detailed reference documentation for configuration and troubleshooting
- Facilitates Hugging Face Hub publishing
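As one example of the mining feature, `mine_hard_negatives` from `sentence_transformers.util` searches a corpus for lookalike negatives. This sketch uses a two-row toy dataset purely for shape; real mining needs a corpus with many passages, and the column names and thresholds here are illustrative:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# (query, answer) pairs; by default the answers double as the mining corpus.
pairs = Dataset.from_dict({
    "query": ["What is the capital of France?", "How do plants make food?"],
    "answer": ["Paris is the capital of France.", "Plants make food by photosynthesis."],
})

hard = mine_hard_negatives(
    pairs,
    model,
    num_negatives=1,         # negatives mined per pair
    range_min=1,             # skip the closest hit, which is often the positive itself
    margin=0,                # keep only negatives scored below the positive
    sampling_strategy="top", # take the hardest remaining candidates
)
```

The mined dataset can then feed a triplet-style loss such as `MultipleNegativesRankingLoss` in the trainer sketch above.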
Use Cases
- Training a custom embedding model for retrieval tasks
- Fine-tuning a large language model for reranking using LoRA (see the cross-encoder sketch after this list)
- Adapting a pre-trained model to a specific domain using distillation
- Experimenting with different training strategies and hyperparameters
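For the reranking use case, a minimal `CrossEncoder` fine-tune under the v4+ trainer API might look as follows; LoRA is omitted for brevity, and the base checkpoint, toy labels, and output path are placeholders:

```python
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

# num_labels=1 yields a single relevance score per (query, passage) pair.
model = CrossEncoder("microsoft/MiniLM-L12-H384-uncased", num_labels=1)

train_dataset = Dataset.from_dict({
    "query": ["capital of France", "capital of France"],
    "passage": ["Paris is the capital of France.", "Berlin is the capital of Germany."],
    "label": [1.0, 0.0],  # 1.0 = relevant, 0.0 = irrelevant
})

loss = BinaryCrossEntropyLoss(model)
args = CrossEncoderTrainingArguments(
    output_dir="models/my-reranker",  # placeholder path
    num_train_epochs=1,
)
CrossEncoderTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss).train()
```

In a two-stage setup the bi-encoder retrieves candidates cheaply and this reranker rescores the top hits.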
Non-Goals
- Providing pre-trained models directly
- Automating hyperparameter search (though references discuss it)
- Executing training jobs without user intervention
- Handling dataset creation from raw text
Practices
- Model training
- Contrastive learning
- Distillation
- Transfer learning
- Model evaluation (see the evaluator sketch after this list)
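As one concrete evaluation setup, `EmbeddingSimilarityEvaluator` scores a bi-encoder against a semantic-similarity benchmark; the checkpoint and dataset split here are illustrative choices, not requirements:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
stsb = load_dataset("sentence-transformers/stsb", split="validation")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    main_similarity=SimilarityFunction.COSINE,
    name="stsb-dev",
)
print(evaluator(model))  # dict of metric name -> value (e.g. Spearman correlation)
```

The same evaluator can be passed to the trainer to track quality during training rather than only afterwards.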
Prerequisites
- pip install "sentence-transformers[train]>=5.0"
- pip install "datasets>=2.19.0"
- pip install "accelerate>=0.26.0"
- pip install trackio
- GPU strongly recommended (a quick environment check follows this list)
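A quick sanity check that the pinned packages import and a GPU is visible; this is a generic sketch, not part of the skill itself:

```python
import torch
import sentence_transformers, datasets, accelerate

print("sentence-transformers:", sentence_transformers.__version__)
print("datasets:", datasets.__version__)
print("accelerate:", accelerate.__version__)
print("CUDA available:", torch.cuda.is_available())
```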
Installation
/plugin install skills@huggingface-skills
Similar Extensions
PyTorch Lightning (100)
Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), and distributed training (DDP, FSDP, DeepSpeed) for scalable neural network training.
Nnsight Remote Interpretability (99)
Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when running interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.
Geniml (99)
This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.
Huggingface Llm Trainer (99)
Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO, and reward-modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts in PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, model selection/leaderboards, and model persistence. Use for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without a local GPU setup.
Transformers.js (99)
Use Transformers.js to run state-of-the-art machine learning models directly in JavaScript/TypeScript. Supports NLP (text classification, translation, summarization), computer vision (image classification, object detection), audio (speech recognition, audio classification), and multimodal tasks. Works in browsers and server-side runtimes (Node.js, Bun, Deno) with WebGPU/WASM using pre-trained models from the Hugging Face Hub.
Transformers (98)
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.