
TTS

Skill Active

Use this skill whenever the user wants to convert text into speech, generate audio from text, or produce voiceovers. Triggers include: any mention of 'TTS', 'text to speech', 'speak', 'say', 'voice', 'read aloud', 'audio narration', 'voiceover', 'dubbing', or requests to turn written content into spoken audio. Also use when converting EPUB/PDF/SRT/articles to audio, cloning voices from reference audio, controlling emotion or speed in speech, aligning speech to subtitle timelines, or producing per-segment voice-mapped audio.

Purpose

Generate speech audio from text, covering needs that range from simple voiceovers to complex, time-aligned dubbing.

Features

  • Text to speech conversion
  • Voice cloning from reference audio
  • Emotion and speed control
  • Timeline-accurate audio rendering from SRT
  • Support for Noiz cloud API and local Kokoro backend
  • Guest mode for Noiz API without authentication
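To illustrate what timeline-accurate rendering from SRT involves, here is a minimal sketch of the first step: turning subtitle cues into timed segments that a synthesizer could later fill. It is not the skill's own code; the names `Cue` and `parse_srt` are hypothetical and chosen only for this example.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    """One subtitle cue (hypothetical type, for illustration only)."""
    start: float  # seconds from the start of the timeline
    end: float    # seconds from the start of the timeline
    text: str

    @property
    def duration(self) -> float:
        # The time slot the synthesized audio for this cue must fit into.
        return self.end - self.start


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt: str) -> list[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues
```

Each cue's `duration` gives the slot that the generated speech has to fit; a renderer would then adjust speed or pad with silence so that every segment starts at its `start` time.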

Use Cases

  • Generating voiceovers for videos or presentations
  • Creating audiobooks from text files or articles
  • Producing synthesized speech for chatbots or virtual assistants
  • Dubbing video content with time-aligned voiceovers
  • Cloning a specific voice for personalized audio messages

Non-Goals

  • Real-time conversational voice interaction
  • Audio editing beyond simple synthesis and alignment
  • Direct integration with chat platforms (though output can be used for it)

Maintenance

  • Warning: Dependency Management. The skill requires the 'requests' package for the Noiz backend, but there's no explicit mention of lockfiles or automated dependency updates for it.

Trust

  • Info: Issues Attention. There were 2 issues opened and 0 closed in the last 90 days, indicating low recent activity on issues.

Execution

  • Warning: Pinned dependencies. The script lists required packages like 'requests' but lacks explicit version pinning or lockfiles, potentially leading to compatibility issues.
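The pinning warning above can be checked mechanically. The following sketch assumes dependencies are listed in a requirements-style text (one package per line), which this page does not confirm; `unpinned` is a hypothetical helper, not part of the skill.

```python
import re


def unpinned(requirements: str) -> list[str]:
    """Return names of requirement lines that lack an exact '==' pin."""
    names = []
    for raw in requirements.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            # Keep only the package name, cutting at the first specifier,
            # extras bracket, environment marker, or space.
            names.append(re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0])
    return names
```

A bare `requests` line would be reported, while a line of the form `requests==<version>` would not.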

Installation

npx skills add NoizAI/skills

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

96/100
Analyzed 1 day ago

Trust Signals

Last commit: 7 days ago
Stars: 497

Similar Extensions

Speech Generation Skill

100

Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.

Skill
openai

Google TTS

100

Convert documents and text to audio using Google Cloud Text-to-Speech. Use this skill when the user wants to: narrate a document, read aloud text, generate audio from a file, convert text to speech, create a recording of documentation or analysis, create a podcast from a document, or use Google TTS/text-to-speech. Trigger phrases: "read this aloud", "narrate this", "create a recording", "text to speech", "TTS", "convert to audio", "audio from document", "listen to this", "generate audio", "google tts", "create a podcast".

Skill
sanjay3290

Characteristic Voice

95

Use this skill whenever the user wants speech to sound more human, companion-like, or emotionally expressive. Triggers include: any mention of 'say like', 'talk like', 'speak like', 'companion voice', 'comfort me', 'cheer me up', 'sound more human', 'good night voice', 'good morning voice', or requests to add fillers, emotion, or personality to generated speech. Also use when the user wants to mimic a specific character's voice, apply speaking style presets (goodnight, morning, comfort, celebration, chatting), tune emotional parameters like warmth or tenderness, or make TTS output feel like a real person talking. If the user asks for a 'voice message', 'companion audio', 'character voice', or wants speech that sighs, laughs, hesitates, or sounds genuinely warm, use this skill. Do NOT use for plain text-to-speech without personality, music generation, sound effects, or general coding tasks unrelated to expressive speech.

Skill
NoizAI

Sherpa ONNX TTS

99

Local text-to-speech via sherpa-onnx (offline, no cloud)

Skill
steipete

ElevenLabs TTS

99

ElevenLabs text-to-speech with 22+ premium voices, multilingual support, and voice tuning via inference.sh CLI. Models: eleven_multilingual_v2 (highest quality), eleven_turbo_v2_5 (low latency), eleven_flash_v2_5 (ultra-fast). Capabilities: text-to-speech, voice selection, stability/style control, 32 languages. Use for: voiceovers, audiobooks, video narration, podcasts, accessibility, IVR. Triggers: elevenlabs, eleven labs, elevenlabs tts, premium tts, professional voice, ai voice, high quality tts, multilingual tts, eleven labs voice, voice generation, natural speech, realistic voice, voice over, speech synthesis

Skill
inferen-sh

AlterLab FC AI Audio Producer

96

This skill should be used when the user asks about "audio production", "ElevenLabs", "voice isolator", "audio post-production", "AI narration", "text to speech production", "voiceover studio", "audio native", "transcription", "Scribe", "multi-track audio", "audio assembly", "batch audio processing", "audio export", "act as an audio producer", "audio producer mode", "TTS production", "podcast audio", "audiobook production", "narration workflow", "content series audio", "multi-tool audio chain", "ElevenLabs Projects", or needs expertise in end-to-end audio production pipelines using ElevenLabs tools. Part of the AlterLab FC Skills collection (GenAI pack).

Skill
AlterLab-IEU

© 2025 SkillRepo · Find the right skill, skip the noise.