Openrlhf Training

Skill · Verified · Active

High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2× faster than DeepSpeedChat with distributed architecture and GPU resource sharing.

Purpose

To enable efficient and high-performance Reinforcement Learning from Human Feedback (RLHF) training for large language models using a distributed architecture with advanced acceleration techniques.
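For orientation, here is a minimal launch sketch adapted from the OpenRLHF README's Ray-based PPO example. The model, dataset, and GPU counts are the README's illustrative values, not requirements, and flag names can shift between OpenRLHF versions, so verify against python3 -m openrlhf.cli.train_ppo_ray --help on your install:

ray job submit --address="http://127.0.0.1:8265" \
  --runtime-env-json='{"working_dir": "/openrlhf"}' \
  -- python3 -m openrlhf.cli.train_ppo_ray \
  --actor_num_nodes 1 --actor_num_gpus_per_node 8 \
  --ref_num_nodes 1 --ref_num_gpus_per_node 8 \
  --critic_num_nodes 1 --critic_num_gpus_per_node 8 \
  --reward_num_nodes 1 --reward_num_gpus_per_node 8 \
  --colocate_actor_ref --colocate_critic_reward \
  --vllm_num_engines 4 --vllm_tensor_parallel_size 2 \
  --pretrain OpenRLHF/Llama-3-8b-sft-mixture \
  --reward_pretrain OpenRLHF/Llama-3-8b-rm-mixture \
  --prompt_data OpenRLHF/prompt-collection-v0.1 \
  --zero_stage 3 --bf16

The vLLM engines serve the rollout (generation) phase while ZeRO-3 shards the training-side models; the colocate flags pack the actor/reference and critic/reward pairs onto shared GPUs, which is the GPU resource sharing the description refers to.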

Features

  • High-performance RLHF training framework
  • Support for PPO, GRPO, RLOO, DPO algorithms (selection sketched after this list)
  • Ray + vLLM acceleration for large models (7B-70B+)
  • Distributed architecture with multi-node GPU cluster support
  • Hybrid Engine for GPU resource sharing
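As a hedged sketch of how those algorithms are selected (option names follow the OpenRLHF docs; verify against your installed version): GRPO and RLOO reuse the PPO entry point with a different advantage estimator and multiple samples per prompt, while DPO is offline preference training with its own entry point. The trailing ... stands for the distributed and data flags shown in the PPO sketch above:

# GRPO: group-normalized advantages, no separate critic model
python3 -m openrlhf.cli.train_ppo_ray --advantage_estimator group_norm --n_samples_per_prompt 8 ...
# RLOO: REINFORCE with a leave-one-out baseline
python3 -m openrlhf.cli.train_ppo_ray --advantage_estimator rloo --n_samples_per_prompt 8 ...
# DPO: trains directly on chosen/rejected preference pairs
deepspeed --module openrlhf.cli.train_dpo --pretrain <sft-model> --dataset <preference-dataset> ...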

Use Cases

  • Training large language models with RLHF
  • Fine-tuning models on custom reward functions
  • Leveraging distributed computing for faster training (multi-node setup sketched after this list)
  • Accelerating inference during RLHF rollout phases
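The distributed use cases above assume a running Ray cluster. A minimal multi-node setup sketch using the standard Ray CLI (HEAD_NODE_IP is a placeholder for your head node's address):

# On the head node: start Ray; the job dashboard listens on port 8265 by default
ray start --head --num-gpus 8
# On each worker node: join the cluster via the head node's IP
ray start --address ${HEAD_NODE_IP}:6379 --num-gpus 8
# Submit the OpenRLHF training job to the cluster
ray job submit --address http://${HEAD_NODE_IP}:8265 -- python3 -m openrlhf.cli.train_ppo_ray ...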

Non-Goals

  • Single-node or basic model fine-tuning
  • Environments without GPU acceleration capabilities
  • Inference-only model serving outside of the training loop

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx; requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repository follows the agentskills.io format.

Quality Score

Verified
97/100
Analyzed about 18 hours ago

Trust Signals

Last commit: about 20 hours ago
Stars: 27.2k
License: MIT
Status: Active
View source code

Similar Extensions

Openrlhf Training (Score: 99, Skill by Orchestra-Research)

High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2× faster than DeepSpeedChat with distributed architecture and GPU resource sharing.

Verl Rl Training (Score: 99, Skill by Orchestra-Research)

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.

Moe Training (Score: 98, Skill by davila7)

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.

Ray Data (Score: 95, Skill by Orchestra-Research)

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.

Verl Rl Training (Score: 95, Skill by davila7)

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.

PyTorch Lightning (Score: 100, Skill by K-Dense-AI)

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.