Run Train
Skill (Active)
Trusted-lane training execution skill for deep learning research repositories. Use when a documented or selected training command should be run conservatively for startup verification, short-run verification, full kickoff, or resume, with status, checkpoint, and metric capture written to standardized `train_outputs/`. Do not use for environment setup, exploratory sweeps, speculative idea implementation, or end-to-end orchestration.
The skill provides a trusted, auditable way to run deep learning training commands conservatively, with each run verified and its results captured in a structured form.
Features
- Conservative training command execution
- Structured status, checkpoint, and metric capture
- Handles startup verification, short-run verification, full kickoff, and resume
- Outputs evidence to `train_outputs/`
- Error and timeout handling
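As a rough illustration of what conservative execution with timeout handling and status capture could look like, here is a minimal Python sketch. It is not the skill's actual script: the function name `run_conservatively` and the file names `train.log` and `status.json` are hypothetical, and only the `train_outputs/` directory comes from the description above.

```python
import json
import subprocess
import sys
import time
from pathlib import Path


def run_conservatively(command, timeout_s, out_dir="train_outputs"):
    """Run one training command with a hard timeout and record the outcome.

    Hypothetical helper: the log and status file names are assumptions.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    log_path = out / "train.log"  # assumed log file name
    started = time.time()

    with log_path.open("w") as log:
        try:
            # stdout and stderr go to the same log so the evidence stays in order.
            proc = subprocess.run(
                command,
                stdout=log,
                stderr=subprocess.STDOUT,
                timeout=timeout_s,
            )
            returncode = proc.returncode
            state = "completed" if returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            returncode = None
            state = "timeout"

    status = {
        "command": list(command),
        "state": state,
        "returncode": returncode,
        "elapsed_s": round(time.time() - started, 3),
        "log": str(log_path),
    }
    # Assumed status file name; written on every outcome, including failures.
    (out / "status.json").write_text(json.dumps(status, indent=2))
    return status


if __name__ == "__main__":
    # Stand-in for a real training command.
    demo = [sys.executable, "-c", "print('step=1 loss=0.9')"]
    print(run_conservatively(demo, timeout_s=60)["state"])
```

For startup verification, a `timeout` state is not necessarily a failure: a process that is still running when the time budget expires may simply have started successfully. How the real skill interprets that case is not stated here.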
Use Cases
- Verifying training command startup in a research repository
- Running short-duration training for verification purposes
- Initiating or resuming full training runs with monitored evidence
- Capturing structured logs and checkpoints from training processes
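Capturing structured metrics from a training log can be sketched as below. The log format (`key=value` pairs such as `step=10 loss=0.693`) and the function name `parse_metrics` are assumptions made for illustration; real repositories log in many formats, and the skill's own parsing rules are not documented here.

```python
import re

# Assumed log format: whitespace-separated key=value pairs with numeric values,
# e.g. "step=10 loss=0.693 lr=1e-3". This is a hypothetical convention.
PAIR = re.compile(r"(\w+)=([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_metrics(log_text):
    """Return one dict per log line that contains a numeric `step` field."""
    records = []
    for line in log_text.splitlines():
        pairs = {key: float(value) for key, value in PAIR.findall(line)}
        if "step" in pairs:
            pairs["step"] = int(pairs["step"])
            records.append(pairs)
    return records


if __name__ == "__main__":
    sample = "Loading data\nstep=1 loss=0.9 lr=1e-3\nstep=2 loss=0.7 lr=1e-3\n"
    for record in parse_metrics(sample):
        print(record)
```

Lines without a `step` field are skipped, so free-form startup messages do not end up in the metric records.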
Non-Goals
- Environment setup or asset downloading
- Exploratory sweeps or speculative idea implementation
- End-to-end orchestration of research goals
- Choosing training commands autonomously
Versioning
- Warning (Release Management): The script itself does not have a version number, and the repository install instructions primarily reference installing from main (`npx skills add ... --all` or `... --skill ...`), making it difficult to pin a specific version of this script.
Installation
`npx skills add lllllllama/ai-paper-reproduction-skill`
Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar Extensions
Azure Monitor Query Py (score: 100)
Azure Monitor Query SDK for Python. Use for querying Log Analytics workspaces and Azure Monitor metrics. Triggers: "azure-monitor-query", "LogsQueryClient", "MetricsQueryClient", "Log Analytics", "Kusto queries", "Azure metrics".
Pytorch Lightning (score: 99)
High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops with built-in best practices.
PyTorch Lightning (score: 100)
Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.
Monitor Data Integrity (score: 100)
Design and operate a data integrity monitoring programme based on ALCOA+ principles. Covers detective controls, audit trail review schedules, anomaly detection patterns (off-hours activity, sequential modifications, bulk changes), metrics dashboards, investigation triggers, and escalation matrix definition. Use when establishing a data integrity monitoring programme for GxP systems, preparing for inspections where data integrity is a focus area, after a data integrity incident requiring enhanced monitoring, or when implementing MHRA, WHO, or PIC/S guidance.
Query Netdata Cloud (score: 100)
Query Netdata Cloud via its REST API -- metrics, logs (systemd-journal / windows-events / otel-logs), topology graphs (topology:snmp), network flows (flows:netflow), alerts, dynamic configuration (DynCfg), and generic Functions on a node. Use when the user asks about querying Netdata Cloud, fetching metrics from the cloud, querying logs / topology / netflow / sflow / ipfix through Cloud, listing or modifying configurations via DynCfg, calling agent Functions through Cloud, listing spaces/rooms/nodes, or building a curl command against `app.netdata.cloud`. Pairs with the `query-netdata-agents` skill when direct-agent access is needed.
Meta Observer (score: 100)
Track skill performance and emerging patterns