PyTorch Lightning
Skill · Verified · Active
High-level PyTorch framework with a Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), a callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with the same code. Use it when you want clean training loops with built-in best practices.
To enable users to write clean, organized, and production-ready PyTorch training loops with built-in best practices, reducing boilerplate and simplifying distributed training.
Features
- Trainer class for simplified training loops (see the sketch after this list)
- Automatic distributed training support (DDP, FSDP, DeepSpeed)
- Callbacks system for modular extensions
- Scalable code from laptop to supercomputer
- Built-in best practices for training
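As a concrete illustration of the Trainer pattern, here is a minimal sketch of a LightningModule and a Trainer run. The toy network, learning rate, and `train_loader` are placeholder assumptions for the example, not part of the skill.

```python
# Minimal sketch, assuming Lightning 2.x ("import lightning as L" style).
import torch
from torch import nn
import lightning as L

class LitClassifier(L.LightningModule):
    """Toy classifier; the architecture is a placeholder."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.cross_entropy(self.net(x.view(x.size(0), -1)), y)
        self.log("train_loss", loss)  # built-in logging hook
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# The Trainer owns the loop: epochs, device placement, checkpointing, logging.
trainer = L.Trainer(max_epochs=3)
trainer.fit(LitClassifier(), train_dataloaders=train_loader)  # train_loader: any PyTorch DataLoader, assumed defined
```

Note that the module itself contains no device code, no `.cuda()` calls, and no manual loop; that is the boilerplate the Trainer absorbs.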
Use cases
- Want clean, organized PyTorch code
- Need production-ready training loops
- Switching between single GPU, multi-GPU, or TPU (see the sketch after this list)
- Leveraging built-in callbacks and logging
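To make the hardware-switching use case concrete, here is a hedged sketch: the same module and data run on one GPU, four GPUs with DDP, or a TPU slice by changing only Trainer arguments. The specific callbacks below are illustrative choices, not requirements of the skill.

```python
# Sketch assuming Lightning 2.x; only the Trainer arguments change between targets.
import lightning as L
from lightning.pytorch.callbacks import ModelCheckpoint, EarlyStopping

callbacks = [
    ModelCheckpoint(monitor="val_loss"),            # keep the best checkpoint
    EarlyStopping(monitor="val_loss", patience=3),  # stop when validation stalls
]

trainer = L.Trainer(accelerator="gpu", devices=1, callbacks=callbacks)                  # single GPU
trainer = L.Trainer(accelerator="gpu", devices=4, strategy="ddp", callbacks=callbacks)  # multi-GPU DDP
trainer = L.Trainer(accelerator="tpu", devices=8, callbacks=callbacks)                  # TPU
```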
Non-goals
- Maximum control over every detail of the training loop (use raw PyTorch instead)
- Multi-node orchestration and hyperparameter tuning (use Ray Train)
- Minimal changes to existing code (use Accelerate)
- Serving as a TensorFlow ecosystem alternative
Execution
- Info: Pinned dependencies. Dependencies are declared in SKILL.md but are not pinned to exact versions with a lockfile; a plain `pip install` falls back to standard Python dependency resolution.
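If you need reproducibility despite that, one option is to pin versions yourself; the pins below are hypothetical placeholders, not versions tested by the skill.

```
# requirements.txt, hypothetical pins; the skill itself ships no lockfile
lightning==2.4.0
torch==2.4.1
```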
Installation
Add the marketplace first:
```
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills
```
Quality score: Verified
Similar extensions
HuggingFace Accelerate
Score: 99. Simplest distributed training API: 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.
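The "4 lines" in this description map onto Accelerate's prepare/backward pattern; a minimal sketch, assuming `model`, `optimizer`, and `dataloader` already exist and the model returns a HuggingFace-style output with a `.loss` attribute:

```python
from accelerate import Accelerator

accelerator = Accelerator()  # line 1: picks devices, precision, and strategy from config
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)  # line 2: wrap for DDP/FSDP/DeepSpeed

for batch in dataloader:
    optimizer.zero_grad()
    loss = model(**batch).loss   # HF-style output assumed
    accelerator.backward(loss)   # line 3: replaces loss.backward()
    optimizer.step()
# line 4 is the launcher: run `accelerate config` once, then `accelerate launch train.py`
```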
PyTorch Lightning
Score: 100. Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, and implement data pipelines, callbacks, logging (W&B, TensorBoard), and distributed training (DDP, FSDP, DeepSpeed) for scalable neural network training.
PyHealth Clinical Pipelines
Score: 99. Build clinical/healthcare deep-learning pipelines with PyHealth: loading EHR/signal/imaging datasets (MIMIC-III/IV, eICU, OMOP, SleepEDF, ChestXray14, EHRShot), defining tasks (mortality, readmission, length-of-stay, drug recommendation, sleep staging, ICD coding, EEG events), instantiating models (Transformer, RETAIN, GAMENet, SafeDrug, MICRON, StageNet, AdaCare, CNN/RNN/MLP), training with the PyHealth Trainer, computing clinical metrics, and using medical code utilities (ICD/ATC/NDC/RxNorm lookup and cross-mapping). Use this skill whenever the user mentions PyHealth, MIMIC, eICU, OMOP, EHR modeling, clinical prediction, drug recommendation, sleep staging, medical code mapping, ICD/ATC codes, or any healthcare ML pipeline that fits the dataset → task → model → trainer → metrics pattern, even if "PyHealth" isn't named explicitly.
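A hedged sketch of the dataset → task → model → trainer → metrics pattern described above, assuming PyHealth 1.1-style APIs (names such as `drug_recommendation_mimic3_fn` and the exact `Trainer.train` signature may differ across versions) and a local MIMIC-III extract at a placeholder path:

```python
from pyhealth.datasets import MIMIC3Dataset, split_by_patient, get_dataloader
from pyhealth.tasks import drug_recommendation_mimic3_fn
from pyhealth.models import Transformer
from pyhealth.trainer import Trainer

# dataset -> task
base = MIMIC3Dataset(
    root="/path/to/mimic3",  # placeholder path
    tables=["DIAGNOSES_ICD", "PROCEDURES_ICD", "PRESCRIPTIONS"],
    code_mapping={"NDC": "ATC"},  # medical code cross-mapping
)
samples = base.set_task(task_fn=drug_recommendation_mimic3_fn)

# splits and loaders
train_ds, val_ds, test_ds = split_by_patient(samples, [0.8, 0.1, 0.1])
train_loader = get_dataloader(train_ds, batch_size=32, shuffle=True)
val_loader = get_dataloader(val_ds, batch_size=32, shuffle=False)

# model -> trainer -> metrics
model = Transformer(dataset=samples, feature_keys=["conditions", "procedures"],
                    label_key="drugs", mode="multilabel")
trainer = Trainer(model=model)
trainer.train(train_dataloader=train_loader, val_dataloader=val_loader, epochs=5)
print(trainer.evaluate(get_dataloader(test_ds, batch_size=32, shuffle=False)))
```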
Run Train
Score: 99. Trusted-channel training-execution skill for deep learning research codebases. Use when documented or selected training commands should be run conservatively for launch validation, short-run verification, full launch, or resumption, capturing status, checkpoints, and metrics to a standardized `train_outputs/`. Do not use for environment setup, exploratory sweeps, speculative idea implementation, or end-to-end orchestration.
Nnsight Remote Interpretability
Score: 99. Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when you need to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.
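As a rough sketch of the nnsight workflow this skill covers (the model choice and layer index are arbitrary; `remote=True` for NDIF execution follows the documented pattern, but availability depends on your NDIF access):

```python
from nnsight import LanguageModel

model = LanguageModel("openai-community/gpt2", device_map="auto")

# Trace a forward pass and save intermediate values for inspection.
with model.trace("The Eiffel Tower is in the city of"):
    hidden = model.transformer.h[5].output[0].save()  # residual stream at layer 5
    logits = model.lm_head.output.save()

print(hidden.shape, logits.shape)  # older nnsight versions access saved proxies via .value

# For very large models, the same trace can run remotely on NDIF:
# with model.trace(prompt, remote=True): ...
```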