AI Podcast Creation

Skill Active

Create AI-powered podcasts with text-to-speech, music, and audio editing. Tools: Kokoro TTS, DIA TTS, Chatterbox, AI music generation, media merger. Capabilities: multi-voice conversations, background music, intro/outro, full episodes. Use for: podcast production, audiobooks, voice content, audio newsletters. Triggers: podcast, ai podcast, text to speech podcast, audio content, voice over, ai audiobook, multi voice, conversation ai, notebooklm alternative, audio generation, podcast automation, ai narrator, voice content, audio newsletter, podcast maker

Purpose

To enable users to easily produce professional-sounding AI-powered podcasts and audio content without complex setup.

Features

  • Text-to-speech with multiple voices (Kokoro TTS, DIA TTS)
  • AI music generation for intros, outros, and background
  • Audio merging and editing for full episodes
  • Support for multi-voice conversations and narration
  • Workflow examples for various podcasting needs

Use cases

  • Producing podcast episodes from scripts
  • Creating audiobooks from text
  • Generating voice content for newsletters or announcements
  • Experimenting with AI-generated audio formats

Non-goals

  • Live podcast recording or streaming
  • Advanced audio engineering beyond merging and crossfades
  • Direct integration with podcast hosting platforms

Documentation

  • info: Configuration & parameter reference. Parameters for the tools used (e.g., voice IDs, crossfade_ms) are described within the workflow examples but not in a consolidated parameter reference.
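
Since the listing names `crossfade_ms` without documenting it, the following is a minimal sketch of what a crossfade of that duration typically means: the last `crossfade_ms` of one clip is blended into the first `crossfade_ms` of the next with a linear ramp. This is an illustration only, not the skill's media merger; the `crossfade` function, the `sample_rate` default, and the list-of-floats clip representation are all assumptions made for the example.

```python
def crossfade(a, b, crossfade_ms, sample_rate=24000):
    """Join two mono clips (sequences of floats) with a linear crossfade.

    Hypothetical helper: it only illustrates the idea behind a
    `crossfade_ms` parameter and is not the skill's actual merger.
    """
    # Number of overlapping samples, capped by the shorter clip.
    n = min(int(sample_rate * crossfade_ms / 1000), len(a), len(b))
    if n == 0:
        # No overlap requested (or possible): plain concatenation.
        return list(a) + list(b)

    mixed = []
    for i in range(n):
        # Fade weight rises from just above 0 to just below 1.
        t = (i + 1) / (n + 1)
        mixed.append(a[len(a) - n + i] * (1 - t) + b[i] * t)

    # Untouched head of `a`, blended overlap, untouched tail of `b`.
    return list(a[:len(a) - n]) + mixed + list(b[n:])
```

Note that the joined clip is shorter than the two inputs combined by the length of the overlap, which matters when timing an intro against the first spoken segment.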

Versioning

  • warning: Release Management. The skill does not have a discernible versioning mechanism (e.g., semver in frontmatter or GitHub releases), and installation references 'main'.

Practical Utility

  • info: Edge cases. The SKILL.md mentions using natural punctuation and short sentences for better speech synthesis but does not extensively detail failure modes or recovery steps.
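
One way to act on the short-sentences advice is to pre-split a script into sentence-sized chunks before sending it to a TTS tool. The sketch below is a hypothetical pre-processing helper, not part of the skill: the name `split_for_tts` and the `max_chars` limit are assumptions, and the right chunk size for any given TTS model is not stated in the source.

```python
import re


def split_for_tts(text, max_chars=200):
    """Split text at sentence boundaries into chunks of at most
    `max_chars` characters where possible.

    Hypothetical helper; a single sentence longer than `max_chars`
    is kept whole rather than cut mid-sentence.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", text.strip())
        if s.strip()
    ]

    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be synthesized separately and the resulting clips merged, which also makes it easier to retry a single chunk if one synthesis call fails.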

Code Execution

  • warning: Tool Fallback. The skill explicitly requires the `belt` CLI and does not appear to offer a fallback for users who do not have it installed.
  • info: Error Handling. Error handling details are not explicitly covered in the SKILL.md; it is assumed that the underlying 'belt' CLI and inference.sh applications manage errors.
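
Because the skill offers no fallback when `belt` is missing, a wrapper script can at least fail early with a clear message. The following is a minimal preflight sketch using Python's standard `shutil.which`; the `require_cli` function and its error text are assumptions for illustration, and nothing here invokes `belt` itself, whose commands are not documented in this listing.

```python
import shutil


def require_cli(name="belt"):
    """Return the full path of `name` if it is on PATH, else raise.

    Hypothetical preflight check; it only verifies that the executable
    can be found, not that it is configured or authenticated.
    """
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"'{name}' CLI not found on PATH; install it before "
            "running the podcast workflows."
        )
    return path
```

Calling `require_cli()` at the top of a script turns a confusing mid-workflow failure into an immediate, actionable error.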

Errors

  • info: Actionable error messages. Error handling is assumed to be managed by the `belt` CLI and the underlying inference.sh applications, with specific user-facing error messages not detailed in the SKILL.md.

Installation

npx skills add inferen-sh/skills

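This runs the Vercel skills CLI (skills.sh) via npx. It requires a local Node.js installation and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.), and it assumes the repository follows the agentskills.io format.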

Quality score

90/100
Analyzed 1 day ago

Trust signals

Last commit: 1 day ago
Stars: 433

Similar extensions

Google TTS

100

Convert documents and text to audio using Google Cloud Text-to-Speech. Use this skill when the user wants to: narrate a document, read aloud text, generate audio from a file, convert text to speech, create a recording of documentation or analysis, create a podcast from a document, or use Google TTS/text-to-speech. Trigger phrases: "read this aloud", "narrate this", "create a recording", "text to speech", "TTS", "convert to audio", "audio from document", "listen to this", "generate audio", "google tts", "create a podcast".

Skill
sanjay3290

Speech Generation Skill

100

Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.

Skill
openai

Remote Interview

100

Capture professional-quality remote interviews using double-ender technique and dedicated recording platforms for podcasts, media, and content production. Use when: Setting up remote podcast interviews with guests; Recording media interviews across distances; Creating customer interview content; Producing expert interviews for thought leadership; Conducting research interviews with high audio quality

Skill
guia-matthieu

Songsee

99

Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.

Skill
steipete

Sherpa ONNX TTS

99

Local text-to-speech via sherpa-onnx (offline, no cloud)

Skill
steipete

ElevenLabs TTS

99

ElevenLabs text-to-speech with 22+ premium voices, multilingual support, and voice tuning via inference.sh CLI. Models: eleven_multilingual_v2 (highest quality), eleven_turbo_v2_5 (low latency), eleven_flash_v2_5 (ultra-fast). Capabilities: text-to-speech, voice selection, stability/style control, 32 languages. Use for: voiceovers, audiobooks, video narration, podcasts, accessibility, IVR. Triggers: elevenlabs, eleven labs, elevenlabs tts, premium tts, professional voice, ai voice, high quality tts, multilingual tts, eleven labs voice, voice generation, natural speech, realistic voice, voice over, speech synthesis

Skill
inferen-sh