OpenVLA OFT Fine Tuning and Evaluation
Fine-tunes and evaluates OpenVLA-OFT and OpenVLA-OFT+ policies for robot action generation with continuous action heads, LoRA adaptation, and FiLM conditioning on LIBERO simulation and ALOHA real-world setups. Use when reproducing OpenVLA-OFT paper results, training custom VLA action heads (L1 or diffusion), deploying server-client inference for ALOHA, or debugging normalization, LoRA merge, and cross-GPU issues.
Enables researchers and engineers to reproduce OpenVLA-OFT paper results, train custom VLA action heads, and deploy server-client inference for robotics applications.
Features
- Fine-tuning and evaluation of OpenVLA-OFT and OFT+
- LoRA adaptation for efficient fine-tuning (a minimal sketch follows this list)
- Continuous action heads (L1 regression or diffusion)
- FiLM conditioning for enhanced language grounding (OFT+)
- Support for LIBERO simulation and ALOHA real-world setups
- Server-client deployment for ALOHA inference
- Detailed troubleshooting for common issues and invariants
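A minimal sketch of what LoRA adaptation over the OpenVLA backbone might look like with PEFT. The model id, rank, and target modules below are illustrative assumptions, not necessarily the exact settings the OFT fine-tuning scripts use.

```python
# Hedged sketch: wrap the OpenVLA backbone with a LoRA adapter via PEFT.
# Model id, rank, and target modules are assumptions for illustration.
import torch
from transformers import AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",            # assumed base checkpoint
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

lora_cfg = LoraConfig(
    r=32,                            # assumed rank
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules="all-linear",     # adapt all linear layers (assumption)
    init_lora_weights="gaussian",
)

vla = get_peft_model(vla, lora_cfg)
vla.print_trainable_parameters()     # only the LoRA adapter weights train
```

The continuous action head (L1 or diffusion) and FiLM layers are trained alongside the adapter; the sketch covers only the LoRA wrapping step.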
Use Cases
- Reproducing OpenVLA-OFT paper results
- Training custom VLA action heads (L1 or diffusion)
- Deploying server-client inference for ALOHA robots
- Debugging normalization, LoRA merge, and cross-GPU issues (see the merge sketch after this list)
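For the LoRA-merge debugging case, a hedged sketch of folding trained adapter weights back into the base model with PEFT before evaluation. The checkpoint paths are placeholders, and the OFT scripts may handle this step themselves.

```python
# Hedged sketch: merge a trained LoRA adapter into the base OpenVLA model.
# Paths are placeholders, not paths produced by the OFT training scripts.
import torch
from transformers import AutoModelForVision2Seq
from peft import PeftModel

base = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b", torch_dtype=torch.bfloat16, trust_remote_code=True
)
merged = PeftModel.from_pretrained(base, "runs/my_oft_adapter")  # placeholder adapter dir
merged = merged.merge_and_unload()        # fold LoRA deltas into the base weights
merged.save_pretrained("runs/my_oft_merged")

# Sanity check: no LoRA modules should remain after the merge.
assert not any("lora_" in name for name, _ in merged.named_parameters())
```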
Non-Goals
- General LLM fine-tuning without robot action heads
- Fine-tuning other VLA architectures (e.g., pi0/pi0.5 models)
- Using the NVIDIA Cosmos Policy stack
Practices
- Model Fine-Tuning
- Robot Action Generation
- Simulation Environments
- Real-World Deployment
- Model Evaluation
- Troubleshooting
Prerequisites
- Python 3.10+
- PyTorch 2.2.0+
- Transformers >=4.40.0
- PEFT == 0.11.1 (a version check is sketched after this list)
- Specific GPU VRAM requirements (see SKILL.md)
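A small sketch for confirming the pinned dependency versions before fine-tuning; the bounds mirror the prerequisites above, and the check itself is only illustrative.

```python
# Hedged sketch: verify installed package versions match the prerequisites above.
import sys
from importlib.metadata import version
from packaging.version import Version

assert sys.version_info >= (3, 10), "Python 3.10+ required"
assert Version(version("torch")) >= Version("2.2.0"), "PyTorch 2.2.0+ required"
assert Version(version("transformers")) >= Version("4.40.0"), "transformers >= 4.40.0 required"
assert Version(version("peft")) == Version("0.11.1"), "peft must be pinned to 0.11.1"
print("Dependency versions look OK")
```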
Installation
First, add the marketplace:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
Then install the plugin:
/plugin install AI-Research-SKILLs@ai-research-skills
Similar Extensions
- Peft Fine Tuning (score 99): Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library integrated with the transformers ecosystem.
- OpenPI Fine Tuning and Serving (score 98): Fine-tune and serve Physical Intelligence OpenPI models (pi0, pi0-fast, pi0.5) using JAX or PyTorch backends for robot policy inference across ALOHA, DROID, and LIBERO environments. Use when adapting pi0 models to custom datasets, converting JAX checkpoints to PyTorch, running policy inference servers, or debugging norm stats and GPU memory issues.
- Unsloth (score 98): Expert guidance for fast fine-tuning with Unsloth: 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization.
- Implementing Llms Litgpt (score 98): Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.
- Fine Tuning Expert (score 98): Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training datasets, setting hyperparameters for fine-tuning runs, adapter training, transfer learning, fine-tuning with Hugging Face PEFT, OpenAI fine-tuning, instruction tuning, RLHF, DPO, or quantizing and deploying fine-tuned models. Trigger terms include: LoRA, QLoRA, PEFT, finetuning, fine-tuning, adapter tuning, LLM training, model training, custom model.
- Axolotl Fine Tuning Skill (score 98): Expert guidance for fine-tuning LLMs with Axolotl: YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support.