Audiocraft Audio Generation

Skill Verified Active

Part of: Agent Native Research Artifact (ARA) Tooling

PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.

Purpose

Generate high-quality music and sound effects from text prompts using the MusicGen and AudioGen PyTorch models, suitable for creative applications and audio research.

Features

Text-to-music generation with MusicGen
Text-to-sound effects generation with AudioGen
Melody-conditioned music generation
Style-conditioned music generation
High-fidelity neural audio codec (EnCodec)
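As a rough illustration of the text-to-music feature listed above, the following sketch wraps the publicly documented MusicGen calls (`MusicGen.get_pretrained`, `set_generation_params`, `generate`, and `audio_write`). The wrapper `generate_music`, the helper `output_stems`, the `clip` file prefix, and the choice of the `facebook/musicgen-small` checkpoint are assumptions made for this example and are not taken from the skill itself.

```python
from typing import List, Sequence


def output_stems(prefix: str, count: int) -> List[str]:
    """Build one output file stem per generated clip (hypothetical naming scheme)."""
    return [f"{prefix}_{idx}" for idx in range(count)]


def generate_music(
    prompts: Sequence[str],
    duration: float = 8.0,                        # seconds; illustrative default
    model_name: str = "facebook/musicgen-small",  # assumed checkpoint choice
    out_prefix: str = "clip",                     # hypothetical file prefix
) -> List[str]:
    """Generate one clip per prompt and write each to disk as a WAV file."""
    # Imported lazily so this module loads even when audiocraft is not installed.
    from audiocraft.data.audio import audio_write
    from audiocraft.models import MusicGen

    model = MusicGen.get_pretrained(model_name)
    model.set_generation_params(duration=duration)

    # generate() returns a tensor shaped [batch, channels, samples].
    wav = model.generate(list(prompts))

    stems = output_stems(out_prefix, len(wav))
    for stem, one_wav in zip(stems, wav):
        # audio_write appends the file extension and applies loudness normalization.
        audio_write(stem, one_wav.cpu(), model.sample_rate, strategy="loudness")
    return stems


if __name__ == "__main__":
    # Downloads model weights on first use; a GPU is strongly recommended.
    print(generate_music(["upbeat acoustic folk with hand claps"], duration=5.0))
```

Sound effects would follow the same pattern with `AudioGen` in place of `MusicGen`, assuming a suitable AudioGen checkpoint name is supplied.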

Use Cases

Generating music from text descriptions
Creating sound effects and environmental audio
Building music generation applications
Performing melody-conditioned music generation

Non-Goals

Speech-to-text generation
AI-powered music editing or mixing
Real-time interactive audio generation

Code Execution

Info (Validation): The core logic consists of library calls, but the provided snippets do not define explicit validation schemas for user inputs such as prompt strings or generation parameters. Some parameters have clear constraints (duration: 1-120 s); others, such as temperature and cfg_coef, are less constrained.
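One way to close the validation gap noted above is to check inputs before they reach the model. The sketch below is a hypothetical helper, not part of the skill or of Audiocraft: the 1-120 s duration range comes from the note above, while the non-empty prompt rule, the positive-temperature rule, the non-negative cfg_coef rule, and the default values are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Duration bounds stated in the validation note (seconds).
MIN_DURATION_S = 1.0
MAX_DURATION_S = 120.0


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for user-supplied generation inputs."""
    prompt: str
    duration: float = 8.0      # illustrative default
    temperature: float = 1.0   # illustrative default
    cfg_coef: float = 3.0      # illustrative default


def validate_request(req: GenerationRequest) -> GenerationRequest:
    """Return the request unchanged if valid; raise ValueError otherwise."""
    if not isinstance(req.prompt, str) or not req.prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not (MIN_DURATION_S <= req.duration <= MAX_DURATION_S):
        raise ValueError(
            f"duration must be between {MIN_DURATION_S} and {MAX_DURATION_S} seconds"
        )
    # Assumed constraints: the source only says these are "less constrained".
    if req.temperature <= 0:
        raise ValueError("temperature must be positive")
    if req.cfg_coef < 0:
        raise ValueError("cfg_coef must be non-negative")
    return req
```

A caller would validate first and only then pass `req.duration`, `req.temperature`, and `req.cfg_coef` on to the model's generation parameters.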

Installation

First, add the marketplace:

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs

Then install the plugin:

/plugin install AI-Research-SKILLs@ai-research-skills

Quality Score

Verified

98/100

Analyzed about 16 hours ago

Trust Signals

Last commit: 16 days ago

GitHub owner: Orchestra-Research

Stars: 8.3k

Downloads: 0

License: MIT

Website: orchestra-research.com

Similar Extensions

Audiocraft Audio Generation

Skill

davila7

PyTorch Lightning

100

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Skill

K-Dense-AI

Implementing Llms Litgpt

100

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

Skill

davila7

Elevenlabs Dialogue

ElevenLabs multi-speaker dialogue generation - create conversations with different voices in a single audio file via inference.sh CLI. Capabilities: multi-voice dialogue, script-based generation, voice direction, conversation audio. Use for: podcasts, audiobooks, explainers, tutorials, character dialogue, video scripts. Triggers: elevenlabs dialogue, eleven labs dialogue, multi speaker, conversation audio, dialogue generation, text to dialogue, multi voice, voice acting, podcast dialogue, character voices, script to audio, elevenlabs conversation, two speakers

Skill

inferen-sh

AlterLab FC AI Sound Effects Designer

This skill should be used when the user asks about "AI sound effects", "text to SFX", "generate sound effects", "ElevenLabs sound effects", "foley generation", "ambient sounds", "soundscape design", "AI foley", "sound design for film", "generate audio for video", "podcast sound effects", "game audio SFX", "act as a sound effects designer", "sound effects mode", "SFX prompting", or needs expertise in AI-generated sound effects, descriptive audio prompting, soundscape layering, and foley creation on ElevenLabs. Part of the AlterLab FC Skills collection (GenAI pack).

Skill

AlterLab-IEU

Segment Anything Model

Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image.

Skill

Orchestra-Research