Verl RL Training

Skill, Verified, Active

Part of: Agent Native Research Artifact (ARA) Tooling

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.

Purpose

Helps users train large language models at scale with reinforcement learning in the verl framework, with production-ready guidance.

Features

Guidance on verl RL training library
Support for RLHF, GRPO, PPO, and other RL algorithms
Flexible infrastructure backend configurations (FSDP, Megatron)
Detailed troubleshooting and common issue resolution
Examples for various training workflows

Use Cases

Implementing RLHF for LLM post-training
Training LLMs at scale with flexible infrastructure
Leveraging GRPO for math and reasoning tasks
Configuring PPO with a critic model for dense reward tasks
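The GRPO use case above relies on group-relative advantages: several responses are sampled for the same prompt, and each response's reward is normalized against the others in its group, so no critic model is needed. The following is a minimal sketch of that idea in plain Python, not verl's implementation. The function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of responses sampled for ONE prompt.

    Hypothetical helper for illustration only; it is not part of verl.
    Each advantage is (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one math prompt, scored 1.0 if correct, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Correct answers get positive advantages, incorrect ones negative.
print(advantages)
```

If every response in a group receives the same reward, all advantages come out as zero and the group contributes no learning signal. This is one reason binary, verifiable rewards (as in math tasks) pair well with sampling several responses per prompt.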

Non-Goals

Implementing Megatron-native training directly (use slime or miles)
Simple SFT/DPO tasks (use TRL or Axolotl)
Core LLM architecture development
Basic language model fine-tuning

Installation

First, add the marketplace, then install the plugin:

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs

/plugin install AI-Research-SKILLs@ai-research-skills

Quality Score

Verified

99/100

Analyzed 1 day ago

Trust Signals

Last commit: 17 days ago

GitHub owner: Orchestra-Research

Stars: 8.3k

Downloads: 0

License: MIT

Website: orchestra-research.com

Similar Extensions

Verl Rl Training

Openrlhf Training

Slime Rl Training

Torchforge

Fine Tuning With Trl