Pular para o conteúdo principal
Este conteúdo ainda não está disponível no seu idioma e está sendo exibido em inglês.

Happy Audio Gen

Skill Verificado
100
Faz parte de:happy-audio-gen

Universal AI voice / text-to-speech skill supporting OpenAI TTS (gpt-4o-mini-tts, tts-1), ElevenLabs multilingual TTS with voice cloning, Bailian Qwen TTS (qwen-tts / qwen3-tts-vd with voice-design custom voices, long-text chunking built in), MiniMax speech-02-hd, SiliconFlow CosyVoice / SenseVoice, and PlayHT 2.0. Use this skill whenever the user asks to read text aloud, synthesize speech, generate narration, create voice-over, dub a script, or turn any text into audio (mp3 / wav / ogg / flac). Typical phrases include "read this aloud", "generate voice for ...", "create a narration of ...", "tts this", "把这段念出来", "做个配音", "合成语音", or mentions of voices / TTS model names like Alloy, Ash, Cherry, Rachel, CosyVoice, PlayHT. Always use this skill even if the user does not specify a provider — pick one from EXTEND.md defaults or available env keys.

Resumo de IA

This skill acts as a unified interface for generating speech from text across six different TTS providers: OpenAI, ElevenLabs, Bailian, MiniMax, SiliconFlow, and PlayHT. It handles API key management, provider selection, and output file generation, with robust error handling and clear documentation for configuration and usage.

Instalação

Primeiro, adicione o marketplace

/plugin marketplace add iamzhihuix/happy-claude-skills
/plugin install happy-audio-gen@happy-claude-skills
22 days ago
285 stars
MIT
Atualizado em 6 days ago
Ver código-fonte

Extensões similares

ElevenLabs Audio Generation

93

Generate AI voiceovers, sound effects, and music using ElevenLabs APIs. Use when creating audio content for videos, podcasts, or games. Triggers include generating voiceovers, narration, dialogue, sound effects from descriptions, background music, soundtrack generation, voice cloning, or any audio synthesis task.

Skill
digitalsamba

Characteristic Voice

98

Use this skill whenever the user wants speech to sound more human, companion-like, or emotionally expressive. Triggers include: any mention of 'say like', 'talk like', 'speak like', 'companion voice', 'comfort me', 'cheer me up', 'sound more human', 'good night voice', 'good morning voice', or requests to add fillers, emotion, or personality to generated speech. Also use when the user wants to mimic a specific character's voice, apply speaking style presets (goodnight, morning, comfort, celebration, chatting), tune emotional parameters like warmth or tenderness, or make TTS output feel like a real person talking. If the user asks for a 'voice message', 'companion audio', 'character voice', or wants speech that sighs, laughs, hesitates, or sounds genuinely warm, use this skill. Do NOT use for plain text-to-speech without personality, music generation, sound effects, or general coding tasks unrelated to expressive speech.

Skill
noizai

ElevenLabs Text-to-Speech

98

Convert text to speech using ElevenLabs voice AI. Use when generating audio from text, creating voiceovers, building voice apps, or synthesizing speech in 70+ languages.

Skill
elevenlabs

ElevenLabs Speech-to-Text

98

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.

Skill
elevenlabs

ElevenLabs Music

97

Generate music using ElevenLabs Music API. Use when creating instrumental tracks, songs with lyrics, background music, jingles, or any AI-generated music composition. Supports prompt-based generation, composition plans for granular control, and detailed output with metadata.

Skill
elevenlabs

Text-to-Speech (TTS)

95

Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to convert text into natural-sounding speech, create audio content, build voice-enabled applications, or generate spoken audio files. Supports multiple voices, adjustable speed, and various audio formats.

Skill
answerzhao