Zai-TTS

Skill verified
85

Text-to-speech conversion using GLM-TTS service via the `uvx zai-tts` command for generating audio from text. Use when (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, podcast, driving, cooking). (3) Using pre-cloned voices for speech.

AI summary

This skill converts text into speech using the GLM-TTS service through the `uvx zai-tts` command. It allows users to specify text content, output files, and customize speech parameters such as speed, volume, and voice. Configuration requires obtaining user ID and token credentials from the `audio.z.ai` service.

Documentation

  • warning: Configuration & parameter reference. The `ZAI_AUDIO_USERID` and `ZAI_AUDIO_TOKEN` environment variables are required, but how to retrieve them is described only indirectly, through browser console actions, rather than in a direct configuration reference.

Security

  • warning: Secret Management. The skill requires the `ZAI_AUDIO_USERID` and `ZAI_AUDIO_TOKEN` environment variables, which are described as sensitive credentials obtained via browser developer tools. The method of obtaining and storing these secrets is not detailed, posing a potential security risk if they are not handled properly.
  • warning: Data Exfiltration. The skill requires a user ID and token to authenticate to the GLM-TTS service. While these are necessary for functionality, the mechanism for obtaining and submitting the credentials is not fully detailed, raising a concern about data exfiltration if they are mishandled.
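
One way to reduce the risk described in the warnings above is to confirm that both variables are present before anything calls the service, and to report only variable names, never their values. The following is a minimal sketch under that assumption; the helper names `missing_credentials` and `require_credentials` are hypothetical and are not part of the skill, and the sketch deliberately does not invoke `uvx zai-tts`, since its flags are not documented on this page.

```python
import os

# Variable names taken from the listing; everything else here is illustrative.
REQUIRED_VARS = ("ZAI_AUDIO_USERID", "ZAI_AUDIO_TOKEN")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_credentials(env=None):
    """Raise RuntimeError naming the missing variables (values are never shown)."""
    missing = missing_credentials(env)
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
```

Calling `require_credentials()` with no argument checks the current process environment, so it could run as a guard before the TTS command is launched. Passing a plain dictionary instead makes the check easy to exercise without touching real secrets.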

Versioning

  • warning: Release Management. There is no explicit versioning information (e.g., a `version` field in SKILL.md, a CHANGELOG, or GitHub releases) for this specific skill. The install instruction points to the `main` branch, making version pinning impossible.

Compliance

  • info: GDPR. The skill requires a user ID and token to authenticate to a third-party service (`audio.z.ai`). While these are not personal data in themselves, the service might process personal data, and the skill does not explicitly describe any sanitization of personal data submitted to the LLM.

Installation

npx skills add aahl/skills

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js installed locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repository follows the agentskills.io format.

Updated 2 days ago

Similar extensions

Happy Audio Gen

100

Universal AI voice / text-to-speech skill supporting OpenAI TTS (gpt-4o-mini-tts, tts-1), ElevenLabs multilingual TTS with voice cloning, Bailian Qwen TTS (qwen-tts / qwen3-tts-vd with voice-design custom voices, long-text chunking built in), MiniMax speech-02-hd, SiliconFlow CosyVoice / SenseVoice, and PlayHT 2.0. Use this skill whenever the user asks to read text aloud, synthesize speech, generate narration, create voice-over, dub a script, or turn any text into audio (mp3 / wav / ogg / flac). Typical phrases include "read this aloud", "generate voice for ...", "create a narration of ...", "tts this", "把这段念出来", "做个配音", "合成语音", or mentions of voices / TTS model names like Alloy, Ash, Cherry, Rachel, CosyVoice, PlayHT. Always use this skill even if the user does not specify a provider — pick one from EXTEND.md defaults or available env keys.

Skill
iamzhihuix

Text-to-Speech (TTS)

95

Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to convert text into natural-sounding speech, create audio content, build voice-enabled applications, or generate spoken audio files. Supports multiple voices, adjustable speed, and various audio formats.

Skill
answerzhao

ElevenLabs Audio Generation

93

Generate AI voiceovers, sound effects, and music using ElevenLabs APIs. Use when creating audio content for videos, podcasts, or games. Triggers include generating voiceovers, narration, dialogue, sound effects from descriptions, background music, soundtrack generation, voice cloning, or any audio synthesis task.

Skill
digitalsamba

Edge TTS

85

Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

Skill
aahl

Characteristic Voice

98

Use this skill whenever the user wants speech to sound more human, companion-like, or emotionally expressive. Triggers include: any mention of 'say like', 'talk like', 'speak like', 'companion voice', 'comfort me', 'cheer me up', 'sound more human', 'good night voice', 'good morning voice', or requests to add fillers, emotion, or personality to generated speech. Also use when the user wants to mimic a specific character's voice, apply speaking style presets (goodnight, morning, comfort, celebration, chatting), tune emotional parameters like warmth or tenderness, or make TTS output feel like a real person talking. If the user asks for a 'voice message', 'companion audio', 'character voice', or wants speech that sighs, laughs, hesitates, or sounds genuinely warm, use this skill. Do NOT use for plain text-to-speech without personality, music generation, sound effects, or general coding tasks unrelated to expressive speech.

Skill
noizai

ElevenLabs Text-to-Speech

98

Convert text to speech using ElevenLabs voice AI. Use when generating audio from text, creating voiceovers, building voice apps, or synthesizing speech in 70+ languages.

Skill
elevenlabs