[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"extension-skill-answerzhao-asr-sw":3,"guides-for-answerzhao-asr":220,"similar-k17ev68gbw25zazp0w5z2a61hd8662cc":221},{"_creationTime":4,"_id":5,"children":6,"community":7,"display":9,"evaluation":21,"identity":187,"isFallback":191,"parentExtension":192,"providers":193,"relations":197,"repo":199,"workflow":217},1778054691785.2515,"k17ev68gbw25zazp0w5z2a61hd8662cc",[],{"reviewCount":8},0,{"description":10,"installMethods":11,"name":12,"sourceUrl":13,"tags":14},"Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64 encoded audio files and returns accurate text transcriptions.",{},"ASR (Speech to Text) Skill","https://github.com/answerzhao/agent-skills/tree/HEAD/glm-skills/ASR",[15,16,17,18,19,20],"asr","speech-to-text","transcription","sdk","cli","audio",{"_creationTime":22,"_id":23,"extensionId":5,"locale":24,"result":25,"trustSignals":175,"workflow":185},1778054738050.0156,"kn73eh4gndfsb0fjhxz8ps03918660n1","en",{"checks":26,"evaluatedAt":165,"extensionSummary":166,"promptVersionExtension":167,"promptVersionScoring":168,"rationale":169,"score":170,"summary":171,"tags":172,"targetMarket":173,"tier":174},[27,32,35,38,42,45,49,53,56,59,63,68,71,75,78,81,84,87,90,93,96,100,104,109,114,117,120,123,127,130,133,136,139,142,146,149,152,155,158,162],{"category":28,"check":29,"severity":30,"summary":31},"Practical Utility","Problem relevance","pass","The description clearly states the problem of implementing speech-to-text and lists specific use cases like transcribing audio files and building voice input features.",{"category":28,"check":33,"severity":30,"summary":34},"Unique selling proposition","The skill leverages the z-ai-web-dev-sdk for speech-to-text, offering a specific capability beyond standard LLM text generation, particularly with its SDK and CLI usage examples.",{"category":28,"check":36,"severity":30,"summary":37},"Production readiness","The skill provides both CLI usage for simple tasks and SDK examples for integration, covering a complete lifecycle from basic transcription to advanced use cases with error handling and best practices.",{"category":39,"check":40,"severity":30,"summary":41},"Scope","Single responsibility principle","The extension focuses solely on Automatic Speech Recognition (ASR) using a specific SDK, with no apparent extension into unrelated domains like testing or deployment.",{"category":39,"check":43,"severity":30,"summary":44},"Description quality","The displayed description accurately reflects the skill's functionality of implementing speech-to-text capabilities using the z-ai-web-dev-sdk and supports base64 encoded audio.",{"category":46,"check":47,"severity":30,"summary":48},"Invocation","Scoped tools","The CLI usage provides specific, narrow commands like `z-ai asr --file` and `z-ai asr --base64`, and the SDK usage demonstrates calls to `zai.audio.asr.create`, indicating well-scoped operations.",{"category":50,"check":51,"severity":30,"summary":52},"Documentation","Configuration & parameter reference","The CLI parameters are clearly documented with their types and whether they are required or optional. SDK usage is demonstrated in code examples, implicitly covering parameters.",{"category":39,"check":54,"severity":30,"summary":55},"Tool naming","CLI commands are descriptive (e.g., `z-ai asr`), and SDK methods are clearly named within their namespace (`zai.audio.asr.create`).",{"category":39,"check":57,"severity":30,"summary":58},"Minimal I/O surface","Tool parameters like `--file`, `--base64`, and `--output` are specific and constrained. SDK calls use explicitly named properties like `file_base64`, indicating minimal I/O.",{"category":60,"check":61,"severity":30,"summary":62},"License","License usability","A LICENSE.txt file with the MIT license is present, and the SKILL.md frontmatter explicitly declares 'license: MIT'.",{"category":64,"check":65,"severity":66,"summary":67},"Maintenance","Commit recency","not_applicable","No commit history is available for this specific skill within the provided context, so recency cannot be evaluated.",{"category":64,"check":69,"severity":66,"summary":70},"Dependency Management","The extension relies on the external 'z-ai-web-dev-sdk' which is assumed to be installed, and no other third-party dependencies are bundled or managed within the skill itself.",{"category":72,"check":73,"severity":66,"summary":74},"Security","Secret Management","The skill does not appear to handle or expose secrets directly; it relies on an external SDK which presumably handles its own authentication mechanisms securely.",{"category":72,"check":76,"severity":30,"summary":77},"Injection","The skill processes audio files and base64 strings; there's no indication of loading or executing untrusted third-party data as instructions.",{"category":72,"check":79,"severity":30,"summary":80},"Transitive Supply-Chain Grenades","The skill bundles its own scripts and relies on an external SDK; there are no runtime downloads or execution of arbitrary remote code observed.",{"category":72,"check":82,"severity":30,"summary":83},"Sandbox Isolation","The skill's operations are confined to reading audio files and interacting with the SDK. There's no evidence of attempting to modify files outside its designated scope.",{"category":72,"check":85,"severity":30,"summary":86},"Sandbox escape primitives","The provided scripts and examples do not contain any detached-process spawns or retry loops around denied tool calls.",{"category":72,"check":88,"severity":30,"summary":89},"Data Exfiltration","The skill's purpose is audio transcription, and there are no imperative instructions to read and submit confidential data to third parties.",{"category":72,"check":91,"severity":30,"summary":92},"Hidden Text Tricks","The bundled files appear to be free of hidden-steering tricks, control characters, or invisible Unicode sequences.",{"category":72,"check":94,"severity":30,"summary":95},"Opaque code execution","The bundled JavaScript code is readable and not obfuscated, base64-encoded, or minified without source maps.",{"category":97,"check":98,"severity":30,"summary":99},"Portability","Structural Assumption","The skill uses relative paths for input files (e.g., './audio.wav') and example scripts reference local files, indicating no hardcoded user-specific paths.",{"category":101,"check":102,"severity":66,"summary":103},"Trust","Issues Attention","Issue tracking data is not available for this skill.",{"category":105,"check":106,"severity":107,"summary":108},"Versioning","Release Management","warning","There is no explicit versioning information (e.g., version field in SKILL.md, CHANGELOG, or GitHub releases) provided for this skill.",{"category":110,"check":111,"severity":112,"summary":113},"Code Execution","Validation","info","While the CLI parameters are documented, there is no explicit mention or demonstration of input validation libraries (like Zod or pydantic) for script arguments or SDK parameters. The `safeTranscribe` function includes basic file existence and size checks.",{"category":72,"check":115,"severity":30,"summary":116},"Unguarded Destructive Operations","The skill's primary function is transcription, which is not a destructive operation. No destructive primitives are present in the scripts.",{"category":110,"check":118,"severity":30,"summary":119},"Error Handling","The SDK examples and the `safeTranscribe` function demonstrate robust error handling, including try-catch blocks and checks for file existence and size.",{"category":110,"check":121,"severity":66,"summary":122},"Logging","The skill's primary function is transcription and does not involve destructive actions or outbound calls that would typically require a local audit log.",{"category":124,"check":125,"severity":112,"summary":126},"Compliance","GDPR","The skill processes audio data, which may contain personal data. While it doesn't submit this data to third parties, it's not explicitly sanitized before being sent to the z-ai-web-dev-sdk's ASR service.",{"category":124,"check":128,"severity":30,"summary":129},"Target market","The skill is a general-purpose speech-to-text tool and has no apparent regional or jurisdictional limitations, thus it is considered global.",{"category":97,"check":131,"severity":30,"summary":132},"Runtime stability","The skill uses standard JavaScript and Node.js APIs, and its CLI usage is generic. It does not appear to make assumptions about specific operating systems or shells beyond Node.js runtime requirements.",{"category":46,"check":134,"severity":30,"summary":135},"Precise Purpose","The description clearly defines the skill's purpose (ASR via z-ai-web-dev-sdk) and provides specific use cases and invocation instructions for both CLI and SDK.",{"category":46,"check":137,"severity":30,"summary":138},"Concise Frontmatter","The SKILL.md frontmatter is concise, clearly stating the name, description, and license without excessive keywords.",{"category":50,"check":140,"severity":30,"summary":141},"Concise Body","The SKILL.md body is well-structured, under 500 lines, and delegates detailed examples and advanced use cases to code blocks within the markdown.",{"category":143,"check":144,"severity":30,"summary":145},"Context","Progressive Disclosure","While the SKILL.md contains code examples, it doesn't embed large blobs of data or external library documentation inline; deeper material is presented through code.",{"category":143,"check":147,"severity":66,"summary":148},"Forked exploration","This skill is a direct tool execution skill and does not involve deep exploration or multi-file inspection requiring a forked context.",{"category":28,"check":150,"severity":30,"summary":151},"Usage examples","The skill provides numerous end-to-end examples for CLI usage, basic SDK implementation, batch processing, and advanced use cases, demonstrating input, invocation, and expected outcomes.",{"category":28,"check":153,"severity":30,"summary":154},"Edge cases","The 'Best Practices' and 'Troubleshooting' sections, along with the `safeTranscribe` function, cover edge cases like unsupported formats, file size limits, empty results, and slow transcription, providing recovery steps.",{"category":110,"check":156,"severity":66,"summary":157},"Tool Fallback","The skill relies on the 'z-ai-web-dev-sdk' which is assumed to be installed, and there are no other external tools or MCP servers that require a fallback.",{"category":159,"check":160,"severity":30,"summary":161},"Safety","Halt on unexpected state","The `safeTranscribe` function and general error handling in examples demonstrate that the skill halts and reports on unexpected states like file not found or too large.",{"category":97,"check":163,"severity":30,"summary":164},"Cross-skill coupling","The skill is self-contained and focuses on ASR. It does not implicitly rely on other skills and cross-links are not necessary.",1778054709520,"This skill provides Automatic Speech Recognition (ASR) functionality, allowing users to transcribe audio files and convert speech to text. It supports both command-line interface (CLI) for quick tasks and an SDK for programmatic integration, handling base64 encoded audio and offering detailed examples for various use cases.","2.0.0","3.4.0","The ASR skill is well-documented with clear CLI and SDK examples covering basic to advanced use cases. It demonstrates robust error handling and practices like caching and batch processing. The only minor concern is the lack of explicit versioning information, which is flagged as a warning.",95,"A high-quality skill for implementing speech-to-text capabilities using the z-ai-web-dev-sdk.",[15,16,17,18,19,20],"global","verified",{"codeQuality":176,"collectedAt":177,"documentation":178,"maintenance":180,"security":181,"testCoverage":184},{},1778054695721,{"descriptionLength":179,"readmeSize":8},321,{},{"hasNpmPackage":182,"license":183,"smitheryVerified":182},false,"MIT",{"hasCi":182,"hasTests":182},{"updatedAt":186},1778054738050,{"githubOwner":188,"githubRepo":189,"locale":24,"slug":15,"type":190},"answerzhao","agent-skills","skill",true,null,{"extract":194,"llm":196},{"commitSha":195,"license":183},"aad73edbd0d9ffbc3d6a402b6eafa6dab96d5ebb",{"promptVersionExtension":167,"promptVersionScoring":168,"score":170,"targetMarket":173,"tier":174},{"repoId":198},"kd712v2g1pay70swwj0jpv2ggs864zgh",{"_creationTime":200,"_id":198,"identity":201,"providers":203,"workflow":214},1777995558409.901,{"githubOwner":188,"githubRepo":189,"sourceUrl":202},"https://github.com/answerzhao/agent-skills",{"discover":204,"github":207},{"sources":205},[206],"skills-sh",{"closedIssues90d":8,"forks":208,"openIssues90d":209,"pushedAt":210,"readmeSize":211,"stars":212,"topics":213},15,1,1768478800000,770,26,[],{"discoverAt":215,"extractAt":216,"githubAt":216,"updatedAt":216},1777995558409,1778054693420,{"anyEnrichmentAt":218,"extractAt":219,"githubAt":218,"llmAt":186,"updatedAt":186},1778054692243,1778054691785,[],[222,246,271,291,310,341],{"_creationTime":223,"_id":224,"community":225,"display":226,"identity":234,"providers":236,"relations":240,"workflow":242},1778053440456.66,"k176861yt3z945kzntpp4a5m95866aq8",{"reviewCount":8},{"description":227,"installMethods":228,"name":229,"sourceUrl":230,"tags":231},"Transcribe audio to text using ElevenLabs Scribe v2. Use when converting audio/video to text, generating subtitles, transcribing meetings, or processing spoken content.",{},"ElevenLabs Speech-to-Text","https://github.com/elevenlabs/skills/tree/HEAD/speech-to-text",[17,20,232,233,16],"elevenlabs","api",{"githubOwner":232,"githubRepo":235,"locale":24,"slug":16,"type":190},"skills",{"extract":237,"llm":239},{"commitSha":238,"license":183},"b476f0ccf4be0e22b2e77cc39307665425d1472b",{"promptVersionExtension":167,"promptVersionScoring":168,"score":170,"targetMarket":173,"tier":174},{"repoId":241},"kd71z3hz1pg97d1k2d6kaqeqtx864knt",{"anyEnrichmentAt":243,"extractAt":244,"githubAt":243,"llmAt":245,"updatedAt":245},1778053440833,1778053440456,1778053480675,{"_creationTime":247,"_id":248,"community":249,"display":250,"identity":257,"providers":260,"relations":265,"workflow":267},1778054061126.6418,"k174gs69bcph52kqrm4pyzfsa5867bcr",{"reviewCount":8},{"description":251,"installMethods":252,"name":253,"sourceUrl":254,"tags":255},"Transcribe audio files using Qwen ASR. Use when the user sends voice messages and wants them converted to text.",{},"Qwen ASR","https://github.com/aahl/skills/tree/HEAD/skills/qwen-asr",[15,17,20,256],"qwen",{"githubOwner":258,"githubRepo":235,"locale":24,"slug":259,"type":190},"aahl","qwen-asr",{"extract":261,"llm":263},{"commitSha":262,"license":183},"503806b8502ad5965d31c46b9e46584f0746f33d",{"promptVersionExtension":167,"promptVersionScoring":168,"score":264,"targetMarket":173,"tier":174},92,{"repoId":266},"kd7f9kgmrb1hqjqtdjzws1v09d865znt",{"anyEnrichmentAt":268,"extractAt":269,"githubAt":268,"llmAt":270,"updatedAt":270},1778054061476,1778054061126,1778054102990,{"_creationTime":272,"_id":273,"community":274,"display":275,"identity":283,"providers":285,"relations":289,"workflow":290},1778053440456.6584,"k17120x7me8p1n30wxpg972esx866b8q",{"reviewCount":8},{"description":276,"installMethods":277,"name":229,"sourceUrl":278,"tags":279},"Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.",{},"https://github.com/elevenlabs/skills/tree/HEAD/openclaw/elevenlabs-transcribe",[17,20,232,280,281,282],"python","realtime","batch",{"githubOwner":232,"githubRepo":235,"locale":24,"slug":284,"type":190},"elevenlabs-transcribe",{"extract":286,"llm":287},{"commitSha":238,"license":183},{"promptVersionExtension":167,"promptVersionScoring":168,"score":288,"targetMarket":173,"tier":174},98,{"repoId":241},{"anyEnrichmentAt":243,"extractAt":244,"githubAt":243,"llmAt":245,"updatedAt":245},{"_creationTime":292,"_id":293,"community":294,"display":295,"identity":304,"providers":305,"relations":308,"workflow":309},1778053440456.6575,"k17538w4f2s5zz27n2z9d7aqbs866arf",{"reviewCount":8},{"description":296,"installMethods":297,"name":298,"sourceUrl":299,"tags":300},"Build voice AI agents with ElevenLabs. Use when creating voice assistants, customer service bots, interactive voice characters, or any real-time voice conversation experience.",{},"ElevenLabs Agents","https://github.com/elevenlabs/skills/tree/HEAD/agents",[301,302,232,303,18,19],"voice-ai","agents","conversational-ai",{"githubOwner":232,"githubRepo":235,"locale":24,"slug":302,"type":190},{"extract":306,"llm":307},{"commitSha":238,"license":183},{"promptVersionExtension":167,"promptVersionScoring":168,"score":288,"targetMarket":173,"tier":174},{"repoId":241},{"anyEnrichmentAt":243,"extractAt":244,"githubAt":243,"llmAt":245,"updatedAt":245},{"_creationTime":311,"_id":312,"community":313,"display":314,"identity":327,"providers":331,"relations":335,"workflow":337},1778054812528.7214,"k17c4avaab2db2m79et4f4hnwn867qj1",{"reviewCount":8},{"description":315,"installMethods":316,"name":317,"sourceUrl":318,"tags":319},"Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.",{},"AI Multimodal Processing Skill","https://github.com/samhvw8/dot-claude/tree/HEAD/skills/ai-multimodal",[320,321,20,322,323,324,325,326,17],"gemini-api","multimodal","image","video","document-processing","text-to-image","ocr",{"githubOwner":328,"githubRepo":329,"locale":24,"slug":330,"type":190},"samhvw8","dot-claude","ai-multimodal",{"extract":332,"llm":334},{"commitSha":333,"license":183},"28c76162116d2eedab131c0e1548fdc76a2999f7",{"promptVersionExtension":167,"promptVersionScoring":168,"score":170,"targetMarket":173,"tier":174},{"repoId":336},"kd79ad9dpqazy79y2s6rvajgjn865xek",{"anyEnrichmentAt":338,"extractAt":339,"githubAt":338,"llmAt":340,"updatedAt":340},1778054813688,1778054812528,1778054896678,{"_creationTime":342,"_id":343,"community":344,"display":345,"identity":354,"providers":355,"relations":358,"workflow":359},1778054691785.2524,"k1712xyy3wyvy83c0f9z7kccg9866jg4",{"reviewCount":8},{"description":346,"installMethods":347,"name":348,"sourceUrl":349,"tags":350},"Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to convert text into natural-sounding speech, create audio content, build voice-enabled applications, or generate spoken audio files. Supports multiple voices, adjustable speed, and various audio formats.",{},"Text-to-Speech (TTS)","https://github.com/answerzhao/agent-skills/tree/HEAD/glm-skills/TTS",[351,352,20,18,353],"tts","text-to-speech","z-ai-web-dev-sdk",{"githubOwner":188,"githubRepo":189,"locale":24,"slug":351,"type":190},{"extract":356,"llm":357},{"commitSha":195,"license":183},{"promptVersionExtension":167,"promptVersionScoring":168,"score":170,"targetMarket":173,"tier":174},{"repoId":198},{"anyEnrichmentAt":218,"extractAt":219,"githubAt":218,"llmAt":186,"updatedAt":186}]