
Distributed LLM Pretraining with torchtitan

Skill · Verified · Active

Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.

Purpose

To enable efficient and scalable pretraining of large language models natively within PyTorch, leveraging advanced parallelism and optimization techniques for maximum performance.
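
As a rough illustration, a single-node run typically clones the upstream pytorch/torchtitan repository, installs its requirements, and launches one of the bundled TOML configs through the provided run script. The sketch below is a minimal example, not an exact recipe: the tokenizer download script, config path, and run_train.sh name follow the upstream repository and may differ between torchtitan versions.

# Minimal single-node sketch (8 GPUs, Llama 3.1 8B); paths and script names
# follow the upstream pytorch/torchtitan repo and may vary by version.
git clone https://github.com/pytorch/torchtitan.git
cd torchtitan
pip install -r requirements.txt

# Download the Llama 3.1 tokenizer (needs a Hugging Face token with access).
python scripts/download_tokenizer.py --repo_id meta-llama/Llama-3.1-8B --hf_token=...

# Launch pretraining with one of the bundled TOML configs.
CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ./run_train.sh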

Features

  • PyTorch-native distributed LLM pretraining
  • Composable 4D parallelism (FSDP2, TP, PP, CP); see the configuration sketch after this list
  • Support for Float8 training on H100 GPUs
  • Pretraining for Llama 3.1, DeepSeek V3, and custom models
  • Distributed checkpointing and efficient resumption
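
The parallelism degrees and the Float8/compile optimizations are composed through the job's TOML config, and the run script accepts command-line overrides in a section.option form. The sketch below is illustrative only; the option names are assumptions modeled on torchtitan's config sections and have changed across releases, so check the bundled train_configs for your version.

# Sketch: overriding parallelism and optimization settings at launch time.
# Option names (parallelism.*, model.converters, training.compile) are
# assumptions based on torchtitan's TOML sections; verify them against
# the version you have checked out.
CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_70b.toml" ./run_train.sh \
  --parallelism.data_parallel_shard_degree 8 \
  --parallelism.tensor_parallel_degree 8 \
  --parallelism.pipeline_parallel_degree 2 \
  --parallelism.context_parallel_degree 1 \
  --model.converters float8 \
  --training.compile

The product of the degrees (times the data-parallel replicate degree, if used) has to match the total number of GPUs in the job; the example above composes to 128 GPUs.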

Use Cases

  • Pretraining large language models from scratch (8B to 405B+)
  • Scaling LLM training across 8 to 512+ GPUs (see the multi-node launch sketch after this list)
  • Optimizing training performance with Float8 and torch.compile
  • Integrating custom models into a distributed training pipeline
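
For multi-node jobs, one torchrun process is launched per node against a shared rendezvous endpoint; torchtitan ships a Slurm example for this pattern. The sketch below assumes a 16-node, 128-GPU cluster; HEAD_NODE_IP is a placeholder, and the training entry point and --job.config_file flag follow torchtitan's launch script but should be treated as assumptions that may differ by version.

# Sketch: run this command on each of 16 nodes (8 GPUs per node, 128 total).
# --nnodes/--nproc_per_node/--rdzv_* are standard torchrun flags; the entry
# point (shown here as torchtitan/train.py) has moved between versions and,
# like --job.config_file, is an assumption to verify against your checkout.
torchrun \
  --nnodes 16 \
  --nproc_per_node 8 \
  --rdzv_backend c10d \
  --rdzv_endpoint "${HEAD_NODE_IP}:29500" \
  torchtitan/train.py \
  --job.config_file ./torchtitan/models/llama3/train_configs/llama3_70b.toml

On Slurm, this command is typically wrapped in an sbatch script that exports the head node address and launches one copy per node via srun.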

Non-Goals

  • Fine-tuning LLMs (use alternatives like Axolotl/TRL)
  • Inference optimization (use DeepSpeed for broader ecosystem)
  • Simple single-GPU training (consider smaller educational frameworks)
  • Maximum NVIDIA-specific performance without native PyTorch integration (consider Megatron-LM)

Trust

  • Issues attention: 17 issues opened, 4 closed in the last 90 days, indicating a closure rate below 50% and a need for faster response.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

Verified
98/100
Analyzed about 22 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 27.2k
License: MIT
Status
View source code

Similar Extensions

TorchTitan Distributed LLM Pretraining

99

Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.

Skill
Orchestra-Research

Ray Train

99

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

Skill
Orchestra-Research

PyTorch Lightning

99

High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops with built-in best practices.

Skill
Orchestra-Research

OpenRLHF Training

99

High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2× faster than DeepSpeedChat with distributed architecture and GPU resource sharing.

Skill
Orchestra-Research

HuggingFace Accelerate

99

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

Skill
davila7

HuggingFace Accelerate

97

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

Skill
Orchestra-Research