
Miles RL Training

Skill: Active

Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.

Purpose

To guide users in performing enterprise-grade Reinforcement Learning training for large-scale MoE models, leveraging advanced techniques like FP8/INT4 quantization and speculative RL for maximum efficiency and alignment.

Features

Low-precision training (FP8, INT4)
MoE model training and alignment (R3)
Speculative RL for throughput optimization
Train-inference alignment
Production-ready framework guidance
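As a rough illustration of what low-precision, quantization-aware training involves, the sketch below applies symmetric per-group INT4 "fake quantization": values are rounded to 4-bit integer levels and immediately mapped back to floats, so the forward pass sees the rounding error. This is the generic textbook idea only, not miles's implementation; the function name `fake_quantize_int4` and the `group_size` parameter are hypothetical.

```python
def fake_quantize_int4(values, group_size=4):
    """Quantize-dequantize floats to symmetric INT4, one group at a time.

    Hypothetical helper for illustration; not part of miles or slime.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Symmetric INT4 keeps integer levels in [-7, 7];
        # the per-group scale maps the largest magnitude to level 7.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        for v in group:
            level = max(-7, min(7, round(v / scale)))
            out.append(level * scale)
    return out


weights = [7.0, -3.0, 1.0, 0.0, 14.0, -2.0, 6.0, 1.0]
print(fake_quantize_int4(weights))
# → [7.0, -3.0, 1.0, 0.0, 14.0, -2.0, 6.0, 0.0]
```

In the second group the scale is 2.0, so the small value 1.0 falls between levels and is rounded to 0.0. Smaller groups give each scale less range to cover, at the cost of storing more scales.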

Use cases

Training large MoE models (1TB+)
Enabling FP8 or INT4 quantization-aware training
Achieving bit-wise identical train-inference alignment
Maximizing rollout throughput with speculative RL
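To make "bit-wise identical train-inference alignment" concrete, the sketch below compares per-token log-probabilities from two sources and reports whether they match bit for bit, along with the largest absolute difference. It assumes you already have the two lists of floats from your own training and rollout runs; `compare_logprobs` and `float_bits` are hypothetical helpers, not miles APIs.

```python
import struct


def float_bits(x):
    """Return the IEEE-754 float64 bit pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def compare_logprobs(train_logprobs, rollout_logprobs):
    """Compare two equal-length log-prob sequences (hypothetical helper)."""
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(train_logprobs, rollout_logprobs))
    return {
        # Bit-wise equality is stricter than ==: it distinguishes 0.0 from -0.0.
        "bitwise_identical": all(float_bits(a) == float_bits(b) for a, b in pairs),
        "max_abs_diff": max((abs(a - b) for a, b in pairs), default=0.0),
    }


print(compare_logprobs([-0.5, -1.25], [-0.5, -1.25]))
# → {'bitwise_identical': True, 'max_abs_diff': 0.0}
```

A numeric tolerance check would accept small drift between the two engines; the bit-pattern comparison accepts none, which is the stricter property named above.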

Non-goals

Serving as the research-grade original slime framework
Providing flexible backend swapping (use verl)
Offering PyTorch-native abstractions (use torchforge)

Trust

Warning: Issues need attention (open=17, closed=4). The ratio of closed to open issues in the last 90 days is low, suggesting maintainers may be slow to respond to or resolve issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx; requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
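Because the install command depends on Node.js and npx being available locally, a quick pre-flight check can save a failed run. The sketch below only looks the two executables up on `PATH`; it does not verify versions or detect installed agents, and `check_prerequisites` is a hypothetical helper name.

```python
import shutil


def check_prerequisites(tools=("node", "npx")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

If either entry reports NOT FOUND, install Node.js before running the npx command.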

Quality score

92/100

Analyzed about 22 hours ago

Trust signals

Last commit: 1 day ago

GitHub owner: davila7

Stars: 27.2k

Downloads: 23k

License: MIT

Website: aitmpl.com

Similar extensions

Miles RL Training

Slime Rl Training

Slime RL Training

Tensorrt Llm

Agentdb Learning

Verl Rl Training