Dialogue Audio

Skill · Active

Multi-speaker dialogue audio creation with ElevenLabs and Dia TTS. Covers speaker tags, emotion control, pacing, conversation flow, and post-production. Use for: podcasts, audiobooks, explainers, character dialogue, conversational content. Triggers: dialogue audio, multi speaker, conversation audio, dia tts, two speakers, podcast audio, character voices, voice acting, dialogue generation, conversation tts, multi voice, speaker tags, dialogue recording, elevenlabs dialogue, eleven labs conversation

Purpose

Enables users to create realistic multi-speaker dialogue audio for applications such as podcasts, audiobooks, and character voices, with fine-grained control over delivery.

Features

  • Multi-speaker dialogue generation (up to 2 speakers)
  • Speaker tags for voice assignment
  • Emotion control via punctuation and non-speech cues
  • Pacing control through sentence structure and pauses
  • Post-production tips for audio enhancement
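
As a rough illustration of the speaker-tag and two-speaker limit described above, the sketch below turns a list of (speaker, line) turns into a tagged script. The `[S1]`/`[S2]` tag format, the parenthesized cue `(laughs)`, and the helper name `build_script` are assumptions made for this example; the skill's own documentation is the authority on the exact syntax it expects.

```python
from typing import Dict, List, Tuple

MAX_SPEAKERS = 2  # the feature list caps dialogue at two speakers


def build_script(turns: List[Tuple[str, str]]) -> str:
    """Render (speaker, line) turns as a tagged script, one turn per line.

    The [S1]/[S2] tag format is an assumption for illustration only.
    """
    tags: Dict[str, str] = {}  # speaker name -> tag, in order of first appearance
    lines = []
    for speaker, text in turns:
        if speaker not in tags:
            if len(tags) >= MAX_SPEAKERS:
                raise ValueError(f"more than {MAX_SPEAKERS} speakers: {speaker!r}")
            tags[speaker] = f"[S{len(tags) + 1}]"
        lines.append(f"{tags[speaker]} {text.strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Punctuation (ellipsis, exclamation mark) and a non-speech cue carry the
    # emotion and pacing hints; how they are rendered depends on the TTS model.
    print(build_script([
        ("Host", "Welcome back to the show!"),
        ("Guest", "Thanks... (laughs) glad to be here."),
        ("Host", "Let's dive in."),
    ]))
```

Assigning tags by order of first appearance keeps the script stable when the same speaker returns later, and rejecting a third speaker early avoids sending a script the two-speaker limit cannot handle.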

Use cases

  • Creating podcast episodes with distinct hosts
  • Generating character dialogue for audiobooks or games
  • Producing explainer content with conversational narration
  • Developing conversational AI training data

Non-goals

  • Generating single-speaker TTS
  • Complex audio editing beyond basic merging
  • Real-time voice conversion
  • Translating dialogue

License

  • info: License usability. The README mentions an MIT license, but there is no dedicated LICENSE file or SPDX identifier in the manifests; this is considered informal wiring.

Versioning

  • warning: Release management. There is no explicit versioning in the SKILL.md frontmatter or manifests, and the installation instructions reference `main`.

Compliance

  • info: GDPR. The skill may submit user prompts to the Dia TTS service, which could include personal data if not sanitized by the user.
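
One way a user might sanitize a script before submitting it is a simple redaction pass, sketched below. The regular expressions, the placeholder strings, and the function name `redact` are assumptions for illustration; pattern matching of this kind is naive, misses many forms of personal data, and is not a compliance guarantee.

```python
import re

# Deliberately simple patterns: they catch common email and phone shapes only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return PHONE_RE.sub("[phone]", text)


if __name__ == "__main__":
    print(redact("Mail jane.doe@example.com or call +1 (555) 010-9999 today."))
```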

Installation

npx skills add inferen-sh/skills

Runs the Vercel skills CLI (skills.sh) via npx. Requires a local Node.js installation and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.). This assumes the repository follows the agentskills.io format.

Quality score

95/100
Analyzed 1 day ago

Trust signals

Last commit: 1 day ago
Stars: 433

Similar extensions

AlterLab FC AI Audio Producer

96

This skill should be used when the user asks about "audio production", "ElevenLabs", "voice isolator", "audio post-production", "AI narration", "text to speech production", "voiceover studio", "audio native", "transcription", "Scribe", "multi-track audio", "audio assembly", "batch audio processing", "audio export", "act as an audio producer", "audio producer mode", "TTS production", "podcast audio", "audiobook production", "narration workflow", "content series audio", "multi-tool audio chain", "ElevenLabs Projects", or needs expertise in end-to-end audio production pipelines using ElevenLabs tools. Part of the AlterLab FC Skills collection (GenAI pack).

Skill
AlterLab-IEU

Elevenlabs Tts

99

ElevenLabs text-to-speech with 22+ premium voices, multilingual support, and voice tuning via inference.sh CLI. Models: eleven_multilingual_v2 (highest quality), eleven_turbo_v2_5 (low latency), eleven_flash_v2_5 (ultra-fast). Capabilities: text-to-speech, voice selection, stability/style control, 32 languages. Use for: voiceovers, audiobooks, video narration, podcasts, accessibility, IVR. Triggers: elevenlabs, eleven labs, elevenlabs tts, premium tts, professional voice, ai voice, high quality tts, multilingual tts, eleven labs voice, voice generation, natural speech, realistic voice, voice over, speech synthesis

Skill
inferen-sh

Elevenlabs Dialogue

99

ElevenLabs multi-speaker dialogue generation - create conversations with different voices in a single audio file via inference.sh CLI. Capabilities: multi-voice dialogue, script-based generation, voice direction, conversation audio. Use for: podcasts, audiobooks, explainers, tutorials, character dialogue, video scripts. Triggers: elevenlabs dialogue, eleven labs dialogue, multi speaker, conversation audio, dialogue generation, text to dialogue, multi voice, voice acting, podcast dialogue, character voices, script to audio, elevenlabs conversation, two speakers

Skill
inferen-sh

Google Tts

100

Convert documents and text to audio using Google Cloud Text-to-Speech. Use this skill when the user wants to: narrate a document, read aloud text, generate audio from a file, convert text to speech, create a recording of documentation or analysis, create a podcast from a document, or use Google TTS/text-to-speech. Trigger phrases: "read this aloud", "narrate this", "create a recording", "text to speech", "TTS", "convert to audio", "audio from document", "listen to this", "generate audio", "google tts", "create a podcast".

Skill
sanjay3290

Speech Generation Skill

100

Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.

Skill
openai

Remote Interview

100

Capture professional-quality remote interviews using double-ender technique and dedicated recording platforms for podcasts, media, and content production. Use when: Setting up remote podcast interviews with guests; Recording media interviews across distances; Creating customer interview content; Producing expert interviews for thought leadership; Conducting research interviews with high audio quality

Skill
guia-matthieu