Slime RL Training
Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.
Provides a production-ready framework for post-training LLMs with reinforcement learning, pairing Megatron-LM for scalable distributed training with SGLang for efficient rollout generation.
Features
- Megatron-LM integration for distributed training
- SGLang for high-throughput generation rollouts
- Flexible data buffer and custom generation/reward functions (see the reward sketch after this list)
- Support for multiple LLM families (GLM, Qwen, Llama, etc.)
- Detailed workflows for various training scenarios
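As a rough illustration of the custom-reward hook, the following is a minimal sketch of a rule-based reward function. The function name, the dict-based sample layout, and the `\boxed{...}` answer convention are assumptions for illustration, not slime's actual interface; consult the slime documentation for the real hook signature.

```python
# A minimal sketch of a rule-based reward function for RL rollouts.
# The hook name and the sample fields are assumptions, not slime's real API.
import re

def boxed_answer_reward(sample: dict) -> float:
    """Return 1.0 if the \\boxed{...} answer in the response matches the label."""
    match = re.search(r"\\boxed\{(.+?)\}", sample["response"])
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == str(sample["label"]).strip() else 0.0

# Example: boxed_answer_reward({"response": r"... \boxed{42}", "label": 42}) -> 1.0
```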
Use Cases
- Training GLM models with RL
- Implementing custom data generation pipelines for LLM fine-tuning (see the rollout sketch after this list)
- Integrating Megatron-LM with SGLang for RL scaling
- Fine-tuning large language models on custom datasets using RL algorithms
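To make the custom data-generation use case concrete, here is a minimal sketch of an async rollout pipeline that fans prompts out to a locally running SGLang server. The `generate_rollout` name, the dict-based samples, and the server address are hypothetical; only the `/generate` endpoint taking `text` and `sampling_params` reflects SGLang's native HTTP API.

```python
# Hypothetical sketch of a custom rollout pipeline against an SGLang server.
# generate_rollout, the sample dicts, and the URL are assumptions for
# illustration; only the /generate payload shape is SGLang's native HTTP API.
import asyncio
import aiohttp

SGLANG_URL = "http://localhost:30000/generate"  # assumed local SGLang server

async def _complete(session: aiohttp.ClientSession, sample: dict) -> dict:
    payload = {
        "text": sample["prompt"],
        "sampling_params": {"max_new_tokens": 512, "temperature": 0.8},
    }
    async with session.post(SGLANG_URL, json=payload) as resp:
        sample["response"] = (await resp.json())["text"]
    return sample

async def generate_rollout(prompts: list[str]) -> list[dict]:
    """Fan prompts out to the inference server and collect completed samples."""
    samples = [{"prompt": p} for p in prompts]
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(_complete(session, s) for s in samples))
```

Because requests are issued concurrently, a pipeline along these lines keeps the inference server saturated during rollouts, which is the high-throughput pattern the SGLang integration targets.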
Non-Goals
- Providing a simple prompt-based agent for basic LLM tasks
- Replacing core LLM inference engines without framework integration
- Generic model training outside the RL post-training context
Installation
First, add the marketplace, then install the plugin:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills
Similar Extensions
Miles RL Training (score 97)
Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.
Verl RL Training (score 99)
Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.
Fine Tuning With TRL (score 96)
Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward-model training. Use when you need RLHF, want to align a model with preferences, or train from human feedback. Works with HuggingFace Transformers.