
Train Sentence Transformers

Skill · Verified · Active

Train or fine-tune sentence-transformers models across `SentenceTransformer` (bi-encoder; dense or static embedding model; for retrieval, similarity, clustering, classification, paraphrase mining, dedup, multimodal), `CrossEncoder` (reranker; pair scoring for two-stage retrieval / pair classification), and `SparseEncoder` (SPLADE, sparse embedding model; for learned-sparse retrieval). Covers loss selection, hard-negative mining, evaluators, distillation, LoRA, Matryoshka, and Hugging Face Hub publishing. Use for any sentence-transformers training task.

Purpose

To enable users to train or fine-tune sentence-transformers models for diverse NLP tasks by providing example scripts, best practices, and comprehensive documentation.

Capabilities

  • Trains bi-encoder, cross-encoder, and sparse-encoder models
  • Supports a range of training techniques (losses, hard-negative mining, distillation, LoRA)
  • Includes runnable Python scripts for common scenarios
  • Provides detailed reference documentation for configuration and troubleshooting
  • Facilitates publishing to the Hugging Face Hub

Use cases

  • Training a custom embedding model for retrieval tasks
  • Fine-tuning a large language model for reranking using LoRA
  • Adapting a pre-trained model to a specific domain using distillation
  • Experimenting with different training strategies and hyperparameters

Non-goals

  • Providing pre-trained models directly
  • Automating hyperparameter search (though references discuss it)
  • Executing training jobs without user intervention
  • Handling dataset creation from raw text

Practices

  • Model training
  • Contrastive learning
  • Distillation
  • Transfer learning
  • Model evaluation

Prerequisites

  • `pip install "sentence-transformers[train]>=5.0"`
  • `pip install "datasets>=2.19.0"`
  • `pip install "accelerate>=0.26.0"`
  • `pip install trackio`
  • A GPU is strongly recommended

Installation

`/plugin install skills@huggingface-skills`

Quality score

Verified
98/100
Analyzed 1 day ago

Trust signals

Last commit: 2 days ago
Stars: 10.5k
License: Apache-2.0

Similar extensions

PyTorch Lightning

100

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Skill
K-Dense-AI

Nnsight Remote Interpretability

99

Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.

Skill
davila7

Geniml

99

This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.

Skill
K-Dense-AI

Huggingface Llm Trainer

99

Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, model selection/leaderboards and model persistence. Use for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

Skill
huggingface

Transformers.js

99

Use Transformers.js to run state-of-the-art machine learning models directly in JavaScript/TypeScript. Supports NLP (text classification, translation, summarization), computer vision (image classification, object detection), audio (speech recognition, audio classification), and multimodal tasks. Works in browsers and server-side runtimes (Node.js, Bun, Deno) with WebGPU/WASM using pre-trained models from Hugging Face Hub.

Skill
huggingface

Transformers

98

This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.

Skill
K-Dense-AI