C Voice
Skill Active
Convert speech to text using `sag` (ElevenLabs STT) and synthesize speech using `say` (macOS built-in TTS). Enables voice input transcription and audio output.
Integrates voice input and output into Claude Code, allowing spoken audio to be transcribed and responses to be read aloud as synthesized speech.
Features
- Speech-to-text transcription via sag (ElevenLabs STT)
- Text-to-speech synthesis via say (macOS built-in TTS)
- Recording audio from microphone
- Processing various audio file formats (MP3, WAV, M4A, FLAC)
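As a rough illustration of the text-to-speech side, the sketch below builds an argument list for the macOS `say` command and runs it only when the binary is on the PATH. The function names `build_say_command` and `speak` are hypothetical and not part of the skill. The `-v` (voice), `-r` (words per minute), and `-o` (output file) options are standard `say` flags. The `sag` side is left out because its command-line options are not documented on this page.

```python
import shutil
import subprocess
from typing import List, Optional


def build_say_command(
    text: str,
    voice: Optional[str] = None,
    rate_wpm: Optional[int] = None,
    output_path: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the macOS `say` command (hypothetical helper)."""
    cmd = ["say"]
    if voice is not None:
        cmd += ["-v", voice]          # voice name, e.g. "Samantha"
    if rate_wpm is not None:
        cmd += ["-r", str(rate_wpm)]  # speaking rate in words per minute
    if output_path is not None:
        cmd += ["-o", output_path]    # write audio to a file instead of playing it
    # Note: text that begins with "-" may be parsed as an option by `say`.
    cmd.append(text)
    return cmd


def speak(text: str, **options) -> bool:
    """Speak `text` if `say` is available; return False when it is not."""
    if shutil.which("say") is None:
        return False
    subprocess.run(build_say_command(text, **options), check=True)
    return True
```

Passing the arguments as a list, rather than a single shell string, avoids shell quoting problems when the text to be spoken contains quotes or other special characters.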
Use Cases
- Transcribing spoken commands or dictation into text
- Reading Claude's responses or summaries aloud
- Capturing audio for analysis or documentation
Non-Goals
- Real-time voice chat
- Cross-platform speech synthesis (beyond macOS for `say`)
Documentation
- warning: Configuration & parameter reference. The tools are documented, but the requirement for an ElevenLabs API key and its configuration appears only in 'Notes', not as a formal parameter or prerequisite.
Maintenance
- warning: Commit recency. The last commit was over 2 months ago (March 6, 2026), indicating a potential lack of recent maintenance.
Security
- warning: Secret Management. The skill requires an ElevenLabs API key. The notes mention it as an environment variable, but the setup and prerequisites do not explain how to handle it securely.
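One way to handle the key is sketched below: read it from the environment, fail early when it is missing, and never print it in full. The variable name `ELEVENLABS_API_KEY` is an assumption, since this page does not state the name the skill expects. `load_api_key` and `mask_key` are hypothetical helpers.

```python
import os
from typing import Mapping, Optional

# Assumed variable name; check the skill's notes for the one it actually reads.
API_KEY_VAR = "ELEVENLABS_API_KEY"


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_VAR} is not set; export it before using the skill")
    return key


def mask_key(key: str) -> str:
    """Hide all but the last four characters so the key is safe to log."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Keeping the key in the environment, rather than in a file inside the repository, reduces the chance of committing it by accident.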
Trust
- warning: Issues Attention. There is 1 open issue from the last 90 days and 0 closed issues, indicating slow or no maintainer response to recent issues.
Versioning
- warning: Release Management. The extension uses the `main` branch for installation and does not declare a specific version in its frontmatter or manifests, making version pinning difficult.
Compliance
- info: GDPR. The skill processes audio and text, which may include personal data if spoken. It sends this data to a third party only when the ElevenLabs API is explicitly used.
Portability
- warning: Runtime stability. The skill states that `say` is a macOS built-in, implying it may not work on other operating systems. The cross-platform compatibility of the `sag` tool is not documented.
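A caller could guard against the portability gap with a check like the following sketch, which reports whether each tool looks usable on the current machine. `voice_tool_status` is a hypothetical helper. It assumes only that `say` is specific to macOS and that `sag`, if installed, is on the PATH.

```python
import shutil
import sys
from typing import Callable, Dict, Optional


def voice_tool_status(
    platform: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report which of the skill's two CLIs appear usable on this machine."""
    platform = sys.platform if platform is None else platform
    return {
        # `say` ships with macOS, which Python reports as "darwin".
        "say": platform == "darwin" and which("say") is not None,
        # `sag` is a separate install; here we only check that it is on the PATH.
        "sag": which("sag") is not None,
    }
```

The `platform` and `which` parameters exist so the check can be exercised without a Mac. Calling `voice_tool_status()` with no arguments inspects the real system.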
Install
- warning: Installation instruction. The SKILL.md explains how to use the tools but assumes the user has already installed `sag` and is on macOS for `say`. It mentions the ElevenLabs API key requirement but gives no explicit steps for installing `sag`, configuring the API key, or verifying the configuration.
Execution
- warning: Pinned dependencies. The skill relies on an external CLI (`sag`) and a macOS built-in (`say`). `sag` might be installed via a package manager, but no version pin or lockfile is mentioned for it, and side-effect headers do not apply to these non-script tools.
Installation
npx skills add daxaur/openpaw
Runs the Vercel skills CLI (skills.sh) via npx. Needs Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar Extensions
Google Tts
Score: 100. Convert documents and text to audio using Google Cloud Text-to-Speech. Use this skill when the user wants to: narrate a document, read aloud text, generate audio from a file, convert text to speech, create a recording of documentation or analysis, create a podcast from a document, or use Google TTS/text-to-speech. Trigger phrases: "read this aloud", "narrate this", "create a recording", "text to speech", "TTS", "convert to audio", "audio from document", "listen to this", "generate audio", "google tts", "create a podcast".
Speech Generation Skill
Score: 100. Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.
Tts
Score: 96. Use this skill whenever the user wants to convert text into speech, generate audio from text, or produce voiceovers. Triggers include: any mention of 'TTS', 'text to speech', 'speak', 'say', 'voice', 'read aloud', 'audio narration', 'voiceover', 'dubbing', or requests to turn written content into spoken audio. Also use when converting EPUB/PDF/SRT/articles to audio, cloning voices from reference audio, controlling emotion or speed in speech, aligning speech to subtitle timelines, or producing per-segment voice-mapped audio.
Characteristic Voice
Score: 95. Use this skill whenever the user wants speech to sound more human, companion-like, or emotionally expressive. Triggers include: any mention of 'say like', 'talk like', 'speak like', 'companion voice', 'comfort me', 'cheer me up', 'sound more human', 'good night voice', 'good morning voice', or requests to add fillers, emotion, or personality to generated speech. Also use when the user wants to mimic a specific character's voice, apply speaking style presets (goodnight, morning, comfort, celebration, chatting), tune emotional parameters like warmth or tenderness, or make TTS output feel like a real person talking. If the user asks for a 'voice message', 'companion audio', 'character voice', or wants speech that sighs, laughs, hesitates, or sounds genuinely warm, use this skill. Do NOT use for plain text-to-speech without personality, music generation, sound effects, or general coding tasks unrelated to expressive speech.
Sherpa Onnx Tts
Score: 99. Local text-to-speech via sherpa-onnx (offline, no cloud).
Elevenlabs Tts
Score: 99. ElevenLabs text-to-speech with 22+ premium voices, multilingual support, and voice tuning via inference.sh CLI. Models: eleven_multilingual_v2 (highest quality), eleven_turbo_v2_5 (low latency), eleven_flash_v2_5 (ultra-fast). Capabilities: text-to-speech, voice selection, stability/style control, 32 languages. Use for: voiceovers, audiobooks, video narration, podcasts, accessibility, IVR. Triggers: elevenlabs, eleven labs, elevenlabs tts, premium tts, professional voice, ai voice, high quality tts, multilingual tts, eleven labs voice, voice generation, natural speech, realistic voice, voice over, speech synthesis.