[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"extension-skill-elevenlabs-speech-to-text-pl":3,"guides-for-elevenlabs-speech-to-text":229,"similar-k176861yt3z945kzntpp4a5m95866aq8":230},{"_creationTime":4,"_id":5,"children":6,"community":7,"display":9,"evaluation":20,"identity":190,"isFallback":193,"parentExtension":194,"providers":195,"relations":199,"repo":201,"workflow":226},1778053440456.66,"k176861yt3z945kzntpp4a5m95866aq8",[],{"reviewCount":8},0,{"description":10,"installMethods":11,"name":12,"sourceUrl":13,"tags":14},"Transcribe audio to text using ElevenLabs Scribe v2. Use when converting audio/video to text, generating subtitles, transcribing meetings, or processing spoken content.",{},"ElevenLabs Speech-to-Text","https://github.com/elevenlabs/skills/tree/HEAD/speech-to-text",[15,16,17,18,19],"transcription","audio","elevenlabs","api","speech-to-text",{"_creationTime":21,"_id":22,"extensionId":5,"locale":23,"result":24,"trustSignals":178,"workflow":188},1778053480675.1802,"kn72ge74941ms16b659mb8phsn867mh3","en",{"checks":25,"evaluatedAt":168,"extensionSummary":169,"promptVersionExtension":170,"promptVersionScoring":171,"rationale":172,"score":173,"summary":174,"tags":175,"targetMarket":176,"tier":177},[26,31,34,37,41,44,48,52,55,58,62,67,70,74,77,80,83,86,89,92,96,100,104,109,113,116,119,122,127,130,133,136,139,142,146,149,152,155,158,161,165],{"category":27,"check":28,"severity":29,"summary":30},"Practical Utility","Problem relevance","pass","The description clearly states the problem of converting audio to text and lists specific use cases like generating subtitles and transcribing meetings.",{"category":27,"check":32,"severity":29,"summary":33},"Unique selling proposition","The extension provides specific features like support for 90+ languages, speaker diarization, word-level timestamps, and keyterm prompting, which go beyond basic transcription and offer value over a simple API wrapper.",{"category":27,"check":35,"severity":29,"summary":36},"Production readiness","The extension provides comprehensive documentation, clear installation instructions for multiple languages, and example code for various use cases, indicating it is ready for production use.",{"category":38,"check":39,"severity":29,"summary":40},"Scope","Single responsibility principle","The extension's scope is focused on transcription using ElevenLabs Scribe v2, with clear sub-capabilities like real-time streaming and speaker diarization, without extending into unrelated domains.",{"category":38,"check":42,"severity":29,"summary":43},"Description quality","The displayed description is concise, readable, and accurately reflects the extension's core functionality and use cases.",{"category":45,"check":46,"severity":29,"summary":47},"Invocation","Scoped tools","The extension uses a single primary tool ('convert' or equivalent via SDKs) which then takes specific parameters for different transcription needs, fitting the verb-noun specialist pattern for its domain.",{"category":49,"check":50,"severity":29,"summary":51},"Documentation","Configuration & parameter reference","All relevant parameters for transcription, real-time streaming, and diarization are documented with clear descriptions and examples.",{"category":38,"check":53,"severity":29,"summary":54},"Tool naming","The primary action is 'convert' which is descriptive for the domain of speech-to-text. SDK examples showcase clear method calls.",{"category":38,"check":56,"severity":29,"summary":57},"Minimal I/O surface","Input parameters are clearly defined and specific to transcription tasks. Output formats are structured and well-documented, returning only the promised transcription data.",{"category":59,"check":60,"severity":29,"summary":61},"License","License usability","The extension is licensed under the MIT license, which is a permissive open-source license.",{"category":63,"check":64,"severity":65,"summary":66},"Maintenance","Commit recency","not_applicable","No commit data available for this skill specifically, but the overall repository structure suggests it's part of a larger, potentially maintained project. The repo itself has no recent commits.",{"category":63,"check":68,"severity":29,"summary":69},"Dependency Management","The installation instructions specify exact package versions (e.g., `@elevenlabs/elevenlabs-js@latest`) which allows for controlled dependency management.",{"category":71,"check":72,"severity":29,"summary":73},"Security","Secret Management","The extension correctly requires an API key via an environment variable (`ELEVENLABS_API_KEY`) and explicitly states not to expose it to the client in browser-side implementations.",{"category":71,"check":75,"severity":29,"summary":76},"Injection","The extension relies on the ElevenLabs API and does not appear to load or execute untrusted third-party data as instructions. Input parameters are well-defined.",{"category":71,"check":78,"severity":29,"summary":79},"Transitive Supply-Chain Grenades","The extension relies on the official ElevenLabs API and bundled SDKs; there are no indications of runtime downloads of code or executing remote scripts.",{"category":71,"check":81,"severity":29,"summary":82},"Sandbox Isolation","The extension interacts with an external API and does not appear to modify any files or paths outside of its intended scope.",{"category":71,"check":84,"severity":29,"summary":85},"Sandbox escape primitives","No detached process spawns or retry loops around denied tool calls were detected in the provided code examples.",{"category":71,"check":87,"severity":29,"summary":88},"Data Exfiltration","The extension's purpose is to send audio to an API for transcription; there are no indications of it reading or submitting confidential data beyond the API key, which is handled securely.",{"category":71,"check":90,"severity":29,"summary":91},"Hidden Text Tricks","The bundled markdown files are free of hidden text tricks, control characters, or invisible Unicode sequences that could steer the model.",{"category":93,"check":94,"severity":29,"summary":95},"Hooks","Opaque code execution","The provided code examples are clear, readable, and do not involve obfuscation, base64 payloads, or runtime script fetching.",{"category":97,"check":98,"severity":29,"summary":99},"Portability","Structural Assumption","The extension primarily interacts with an external API, and example code focuses on SDK usage, thus avoiding assumptions about user project file layouts.",{"category":101,"check":102,"severity":65,"summary":103},"Trust","Issues Attention","No issue tracking data is available for this specific skill or its directory within the repository.",{"category":105,"check":106,"severity":107,"summary":108},"Versioning","Release Management","warning","There is no explicit versioning information (e.g., `version` field in SKILL.md or package.json) for this skill. Installation instructions refer to `@latest`.",{"category":110,"check":111,"severity":29,"summary":112},"Code Execution","Validation","The SDKs and API documentation imply robust validation of input parameters for transcription tasks, ensuring correct data shapes and types.",{"category":71,"check":114,"severity":29,"summary":115},"Unguarded Destructive Operations","The skill's function is to send data to an external API for processing and receive text back; it does not perform any destructive file operations.",{"category":110,"check":117,"severity":29,"summary":118},"Error Handling","Examples demonstrate try-catch blocks for API calls and list common API error codes, indicating robust error handling.",{"category":110,"check":120,"severity":65,"summary":121},"Logging","The skill's functionality is primarily API interaction, and explicit local logging mechanisms are not detailed or required for its core operation.",{"category":123,"check":124,"severity":125,"summary":126},"Compliance","GDPR","info","The extension processes audio data, which may contain personal data, and sends it to a third-party API (ElevenLabs). While the API itself likely has privacy policies, the extension does not explicitly sanitize personal data before sending it.",{"category":123,"check":128,"severity":29,"summary":129},"Target market","The extension is a general-purpose transcription tool with no specific regional logic or limitations detected, making it globally applicable.",{"category":97,"check":131,"severity":29,"summary":132},"Runtime stability","The extension primarily uses well-defined SDKs and an API, making it portable across different environments that can run Python or JavaScript and access the internet.",{"category":45,"check":134,"severity":29,"summary":135},"Precise Purpose","The skill clearly defines its purpose (transcribing audio to text using ElevenLabs Scribe v2) and provides specific use cases and examples.",{"category":45,"check":137,"severity":29,"summary":138},"Concise Frontmatter","The frontmatter is concise and effectively summarizes the skill's core capability, supported by trigger phrases.",{"category":49,"check":140,"severity":29,"summary":141},"Concise Body","The SKILL.md file is well-structured, under 500 lines, and delegates deeper material to reference files, adhering to progressive disclosure.",{"category":143,"check":144,"severity":29,"summary":145},"Context","Progressive Disclosure","The SKILL.md file links to various reference files for installation, options, and real-time streaming, demonstrating effective progressive disclosure.",{"category":143,"check":147,"severity":65,"summary":148},"Forked exploration","This skill is not an exploration-style skill; it performs a direct transcription task and does not require forked context.",{"category":27,"check":150,"severity":29,"summary":151},"Usage examples","Multiple end-to-end examples are provided for Python, JavaScript, and cURL, demonstrating input, invocation, and expected outcomes.",{"category":27,"check":153,"severity":29,"summary":154},"Edge cases","The documentation lists common errors (e.g., 401, 422, 429) with explanations and provides examples of error handling, covering key failure modes.",{"category":110,"check":156,"severity":65,"summary":157},"Tool Fallback","The skill does not appear to rely on external tools like an MCP server; it directly interfaces with the ElevenLabs API via SDKs.",{"category":97,"check":159,"severity":29,"summary":160},"Stack assumptions","The skill clearly states its stack assumptions (Python, JavaScript/TypeScript) and installation requirements in the documentation.",{"category":162,"check":163,"severity":29,"summary":164},"Safety","Halt on unexpected state","The provided error handling examples demonstrate catching exceptions and printing error messages, implying a fail-closed behavior for unexpected states.",{"category":97,"check":166,"severity":29,"summary":167},"Cross-skill coupling","The skill is self-contained and focuses on transcription via the ElevenLabs API, with no indications of implicit reliance on other skills.",1778053479692,"This skill leverages the ElevenLabs Scribe v2 API to convert audio and video files into text, supporting numerous languages, speaker diarization, and word-level timestamps. It offers both batch and real-time streaming transcription capabilities through well-documented SDKs for Python and JavaScript, and direct cURL examples.","2.0.0","3.4.0","The skill is well-documented, provides clear examples for multiple languages, and focuses on a single, well-defined task. It adheres to security best practices for API key management. The only minor drawback is the lack of explicit versioning, which is common for repository-based skills.",95,"A comprehensive and production-ready skill for transcribing audio to text using the ElevenLabs Scribe v2 API.",[15,16,17,18,19],"global","verified",{"codeQuality":179,"collectedAt":180,"documentation":181,"maintenance":183,"security":184,"testCoverage":187},{},1778053454680,{"descriptionLength":182,"readmeSize":8},168,{},{"hasNpmPackage":185,"license":186,"smitheryVerified":185},false,"MIT",{"hasCi":185,"hasTests":185},{"updatedAt":189},1778053480675,{"githubOwner":17,"githubRepo":191,"locale":23,"slug":19,"type":192},"skills","skill",true,null,{"extract":196,"llm":198},{"commitSha":197,"license":186},"b476f0ccf4be0e22b2e77cc39307665425d1472b",{"promptVersionExtension":170,"promptVersionScoring":171,"score":173,"targetMarket":176,"tier":177},{"repoId":200},"kd71z3hz1pg97d1k2d6kaqeqtx864knt",{"_creationTime":202,"_id":200,"identity":203,"providers":205,"workflow":223},1777995558409.8555,{"githubOwner":17,"githubRepo":191,"sourceUrl":204},"https://github.com/elevenlabs/skills",{"discover":206,"github":209},{"sources":207},[208],"skills-sh",{"closedIssues90d":210,"forks":211,"homepage":212,"license":186,"openIssues90d":213,"pushedAt":214,"readmeSize":215,"stars":216,"topics":217},16,22,"https://elevenlabs.io",1,1777909457000,3014,216,[218,17,219,220,191,221,222],"ai-agents","music","sfx","stt","tts",{"discoverAt":224,"extractAt":225,"githubAt":225,"updatedAt":225},1777995558409,1778053441433,{"anyEnrichmentAt":227,"extractAt":228,"githubAt":227,"llmAt":189,"updatedAt":189},1778053440833,1778053440456,[],[231,251,269,295,329,348],{"_creationTime":232,"_id":233,"community":234,"display":235,"identity":243,"providers":245,"relations":249,"workflow":250},1778053440456.6584,"k17120x7me8p1n30wxpg972esx866b8q",{"reviewCount":8},{"description":236,"installMethods":237,"name":12,"sourceUrl":238,"tags":239},"Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.",{},"https://github.com/elevenlabs/skills/tree/HEAD/openclaw/elevenlabs-transcribe",[15,16,17,240,241,242],"python","realtime","batch",{"githubOwner":17,"githubRepo":191,"locale":23,"slug":244,"type":192},"elevenlabs-transcribe",{"extract":246,"llm":247},{"commitSha":197,"license":186},{"promptVersionExtension":170,"promptVersionScoring":171,"score":248,"targetMarket":176,"tier":177},98,{"repoId":200},{"anyEnrichmentAt":227,"extractAt":228,"githubAt":227,"llmAt":189,"updatedAt":189},{"_creationTime":252,"_id":253,"community":254,"display":255,"identity":262,"providers":263,"relations":267,"workflow":268},1778053440456.658,"k17b8tkx3b4vgys5rp9avrjfmn866jdq",{"reviewCount":8},{"description":256,"installMethods":257,"name":258,"sourceUrl":259,"tags":260},"Generate music using ElevenLabs Music API. Use when creating instrumental tracks, songs with lyrics, background music, jingles, or any AI-generated music composition. Supports prompt-based generation, composition plans for granular control, and detailed output with metadata.",{},"ElevenLabs Music","https://github.com/elevenlabs/skills/tree/HEAD/music",[219,17,18,261,16],"generation",{"githubOwner":17,"githubRepo":191,"locale":23,"slug":219,"type":192},{"extract":264,"llm":265},{"commitSha":197,"license":186},{"promptVersionExtension":170,"promptVersionScoring":171,"score":266,"targetMarket":176,"tier":177},97,{"repoId":200},{"anyEnrichmentAt":227,"extractAt":228,"githubAt":227,"llmAt":189,"updatedAt":189},{"_creationTime":270,"_id":271,"community":272,"display":273,"identity":282,"providers":285,"relations":289,"workflow":291},1778054691785.2515,"k17ev68gbw25zazp0w5z2a61hd8662cc",{"reviewCount":8},{"description":274,"installMethods":275,"name":276,"sourceUrl":277,"tags":278},"Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64 encoded audio files and returns accurate text transcriptions.",{},"ASR (Speech to Text) Skill","https://github.com/answerzhao/agent-skills/tree/HEAD/glm-skills/ASR",[279,19,15,280,281,16],"asr","sdk","cli",{"githubOwner":283,"githubRepo":284,"locale":23,"slug":279,"type":192},"answerzhao","agent-skills",{"extract":286,"llm":288},{"commitSha":287,"license":186},"aad73edbd0d9ffbc3d6a402b6eafa6dab96d5ebb",{"promptVersionExtension":170,"promptVersionScoring":171,"score":173,"targetMarket":176,"tier":177},{"repoId":290},"kd712v2g1pay70swwj0jpv2ggs864zgh",{"anyEnrichmentAt":292,"extractAt":293,"githubAt":292,"llmAt":294,"updatedAt":294},1778054692243,1778054691785,1778054738050,{"_creationTime":296,"_id":297,"community":298,"display":299,"identity":313,"providers":317,"relations":322,"workflow":325},1778053197391.382,"k170wvt5rx3c1hv9a5sfkyezc1866k0q",{"reviewCount":8},{"description":300,"installMethods":301,"name":302,"sourceUrl":303,"tags":304},"Universal AI voice / text-to-speech skill supporting OpenAI TTS (gpt-4o-mini-tts, tts-1), ElevenLabs multilingual TTS with voice cloning, Bailian Qwen TTS (qwen-tts / qwen3-tts-vd with voice-design custom voices, long-text chunking built in), MiniMax speech-02-hd, SiliconFlow CosyVoice / SenseVoice, and PlayHT 2.0. Use this skill whenever the user asks to read text aloud, synthesize speech, generate narration, create voice-over, dub a script, or turn any text into audio (mp3 / wav / ogg / flac). Typical phrases include \"read this aloud\", \"generate voice for ...\", \"create a narration of ...\", \"tts this\", \"把这段念出来\", \"做个配音\", \"合成语音\", or mentions of voices / TTS model names like Alloy, Ash, Cherry, Rachel, CosyVoice, PlayHT. Always use this skill even if the user does not specify a provider — pick one from EXTEND.md defaults or available env keys.",{},"Happy Audio Gen","https://github.com/iamzhihuix/happy-claude-skills/tree/HEAD/skills/happy-audio-gen",[222,305,16,306,307,17,308,309,310,311,312],"speech","voice-generation","openai","bailian","minimax","siliconflow","playht","bun",{"githubOwner":314,"githubRepo":315,"locale":23,"slug":316,"type":192},"iamzhihuix","happy-claude-skills","happy-audio-gen",{"extract":318,"llm":320},{"commitSha":319,"license":186},"f49e7782a551759c9f9e0a4d4417ff053f0a86fd",{"promptVersionExtension":170,"promptVersionScoring":171,"score":321,"targetMarket":176,"tier":177},100,{"parentExtensionId":323,"repoId":324},"k173ydbbp6c0vdpxv5r0q9yvgd867en5","kd7dbbtdq95nkcs3k7fg9w6fdn864j0b",{"anyEnrichmentAt":326,"extractAt":327,"githubAt":326,"llmAt":328,"updatedAt":328},1778053199195,1778053197391,1778053284450,{"_creationTime":330,"_id":331,"community":332,"display":333,"identity":342,"providers":343,"relations":346,"workflow":347},1778053440456.6604,"k17a2cxtswmmk54b8wmpfbp5f9866jr0",{"reviewCount":8},{"description":334,"installMethods":335,"name":336,"sourceUrl":337,"tags":338},"Convert text to speech using ElevenLabs voice AI. Use when generating audio from text, creating voiceovers, building voice apps, or synthesizing speech in 70+ languages.",{},"ElevenLabs Text-to-Speech","https://github.com/elevenlabs/skills/tree/HEAD/text-to-speech",[339,17,340,16,341],"text-to-speech","voice","synthesis",{"githubOwner":17,"githubRepo":191,"locale":23,"slug":339,"type":192},{"extract":344,"llm":345},{"commitSha":197,"license":186},{"promptVersionExtension":170,"promptVersionScoring":171,"score":248,"targetMarket":176,"tier":177},{"repoId":200},{"anyEnrichmentAt":227,"extractAt":228,"githubAt":227,"llmAt":189,"updatedAt":189},{"_creationTime":349,"_id":350,"community":351,"display":352,"identity":365,"providers":369,"relations":373,"workflow":375},1778054812528.7214,"k17c4avaab2db2m79et4f4hnwn867qj1",{"reviewCount":8},{"description":353,"installMethods":354,"name":355,"sourceUrl":356,"tags":357},"Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.",{},"AI Multimodal Processing Skill","https://github.com/samhvw8/dot-claude/tree/HEAD/skills/ai-multimodal",[358,359,16,360,361,362,363,364,15],"gemini-api","multimodal","image","video","document-processing","text-to-image","ocr",{"githubOwner":366,"githubRepo":367,"locale":23,"slug":368,"type":192},"samhvw8","dot-claude","ai-multimodal",{"extract":370,"llm":372},{"commitSha":371,"license":186},"28c76162116d2eedab131c0e1548fdc76a2999f7",{"promptVersionExtension":170,"promptVersionScoring":171,"score":173,"targetMarket":176,"tier":177},{"repoId":374},"kd79ad9dpqazy79y2s6rvajgjn865xek",{"anyEnrichmentAt":376,"extractAt":377,"githubAt":376,"llmAt":378,"updatedAt":378},1778054813688,1778054812528,1778054896678]