Skip to main content

Pytorch Lightning

Skill Verified Active

High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops with built-in best practices.

Purpose

To enable users to write clean, organized, and production-ready PyTorch training loops with built-in best practices, reducing boilerplate and simplifying distributed training.

Features

  • Trainer class for simplified training loops
  • Automatic distributed training support (DDP, FSDP, DeepSpeed)
  • Callbacks system for modular extensions
  • Scalable code from laptop to supercomputer
  • Built-in best practices for training

Use Cases

  • Want clean, organized PyTorch code
  • Need production-ready training loops
  • Switching between single GPU, multi-GPU, or TPU
  • Leveraging built-in callbacks and logging

Non-Goals

  • Maximizing control over raw PyTorch (use alternatives like raw PyTorch)
  • Multi-node orchestration and hyperparameter tuning (use Ray Train)
  • Minimal changes to existing code (use Accelerate)
  • Serving as a TensorFlow ecosystem alternative

Execution

  • info:Pinned dependenciesDependencies are declared in SKILL.md but not explicitly pinned with lockfile details, though standard `pip install` implies standard Python dependency resolution.

Installation

First, add the marketplace

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

Quality Score

Verified
99 /100
Analyzed about 15 hours ago

Trust Signals

Last commit16 days ago
Stars8.3k
LicenseMIT
Status
View Source

Similar Extensions

Huggingface Accelerate

99

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

Skill
davila7

HuggingFace Accelerate

97

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

Skill
Orchestra-Research

PyTorch Lightning

100

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Skill
K-Dense-AI

PyHealth Clinical Pipelines

99

Build clinical/healthcare deep-learning pipelines with PyHealth — loading EHR/signal/imaging datasets (MIMIC-III/IV, eICU, OMOP, SleepEDF, ChestXray14, EHRShot), defining tasks (mortality, readmission, length-of-stay, drug recommendation, sleep staging, ICD coding, EEG events), instantiating models (Transformer, RETAIN, GAMENet, SafeDrug, MICRON, StageNet, AdaCare, CNN/RNN/MLP), training with the PyHealth Trainer, computing clinical metrics, and using medical code utilities (ICD/ATC/NDC/RxNorm lookup and cross-mapping). Use this skill whenever the user mentions PyHealth, MIMIC, eICU, OMOP, EHR modeling, clinical prediction, drug recommendation, sleep staging, medical code mapping, ICD/ATC codes, or any healthcare ML pipeline that fits the dataset → task → model → trainer → metrics pattern, even if "PyHealth" isn't named explicitly.

Skill
K-Dense-AI

Run Train

99

Trusted-lane training execution skill for deep learning research repositories. Use when a documented or selected training command should be run conservatively for startup verification, short-run verification, full kickoff, or resume, with status, checkpoint, and metric capture written to standardized `train_outputs/`. Do not use for environment setup, exploratory sweeps, speculative idea implementation, or end-to-end orchestration.

Skill
lllllllama

Nnsight Remote Interpretability

99

Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.

Skill
davila7

© 2025 SkillRepo · Find the right skill, skip the noise.