
Train Sentence Transformers

Skill · Verified · Active

Train or fine-tune sentence-transformers models across three architectures: `SentenceTransformer` (bi-encoder; dense or static embedding models for retrieval, similarity, clustering, classification, paraphrase mining, deduplication, and multimodal tasks), `CrossEncoder` (reranker; pair scoring for two-stage retrieval and pair classification), and `SparseEncoder` (SPLADE-style sparse embedding models for learned-sparse retrieval). Covers loss selection, hard-negative mining, evaluators, distillation, LoRA, Matryoshka embeddings, and publishing to the Hugging Face Hub. Use for any sentence-transformers training task.

Purpose

To enable users to train or fine-tune sentence-transformers models for diverse NLP tasks by providing example scripts, best practices, and comprehensive documentation.

Features

  • Trains bi-encoder, cross-encoder, and sparse-encoder models
  • Supports various training techniques (losses, mining, distillation, LoRA)
  • Includes runnable Python scripts for common scenarios
  • Provides detailed reference documentation for configuration and troubleshooting
  • Facilitates Hugging Face Hub publishing

Use Cases

  • Training a custom embedding model for retrieval tasks
  • Fine-tuning a large language model for reranking using LoRA
  • Adapting a pre-trained model to a specific domain using distillation
  • Experimenting with different training strategies and hyperparameters

Non-Goals

  • Providing pre-trained models directly
  • Automating hyperparameter search (though references discuss it)
  • Executing training jobs without user intervention
  • Handling dataset creation from raw text

Practices

  • Model training
  • Contrastive learning
  • Distillation
  • Transfer learning
  • Model evaluation

Prerequisites

  • pip install "sentence-transformers[train]>=5.0"
  • pip install "datasets>=2.19.0"
  • pip install "accelerate>=0.26.0"
  • pip install trackio
  • GPU strongly recommended

Installation

/plugin install skills@huggingface-skills

Quality Score

Verified
98/100
Analyzed about 16 hours ago

Trust Signals

Last commit: 2 days ago
Stars: 10.5k
License: Apache-2.0

Similar Extensions

PyTorch Lightning

100

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Skill
K-Dense-AI

Nnsight Remote Interpretability

99

Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.

Skill
davila7

Geniml

99

This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.

Skill
K-Dense-AI

Huggingface Llm Trainer

99

Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, model selection/leaderboards and model persistence. Use for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

Skill
huggingface

Transformers.js

99

Use Transformers.js to run state-of-the-art machine learning models directly in JavaScript/TypeScript. Supports NLP (text classification, translation, summarization), computer vision (image classification, object detection), audio (speech recognition, audio classification), and multimodal tasks. Works in browsers and server-side runtimes (Node.js, Bun, Deno) with WebGPU/WASM using pre-trained models from Hugging Face Hub.

Skill
huggingface

Transformers

98

This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.

Skill
K-Dense-AI

© 2025 SkillRepo · Find the right skill, skip the noise.