Whisper

Skill (active)

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Purpose

Provides a flexible way to convert spoken audio into text, for applications ranging from podcast transcription to multilingual audio analysis.

Features

  • Multilingual speech-to-text transcription
  • Translation to English
  • Language identification
  • Support for multiple model sizes
  • Configurable transcription options
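A minimal sketch of how the features above could map onto the open-source `openai-whisper` Python package, assuming that package and ffmpeg are installed locally. The helper names `build_transcribe_options` and `run_whisper` are hypothetical and exist only for illustration; the skill itself may drive Whisper differently.

```python
from typing import Optional


def build_transcribe_options(task: str = "transcribe",
                             language: Optional[str] = None) -> dict:
    """Build keyword arguments for Whisper's transcribe() call.

    task: "transcribe" keeps the spoken language; "translate" outputs English.
    language: optional language code such as "ja"; None lets Whisper detect it.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    options = {"task": task}
    if language is not None:
        options["language"] = language
    return options


def run_whisper(audio_path: str, model_size: str = "base", **options) -> dict:
    """Transcribe one file (assumes `pip install openai-whisper` and ffmpeg)."""
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model(model_size)  # e.g. "tiny" through "large"
    result = model.transcribe(audio_path, **options)
    # result["text"] holds the transcript; result["language"] holds the
    # detected (or supplied) language code.
    return result
```

For example, `run_whisper("episode.mp3", "small", **build_transcribe_options("translate"))` would request an English translation of a hypothetical file `episode.mp3`. Smaller model sizes trade accuracy for speed and memory.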

Use cases

  • Transcribing podcasts and videos
  • Automating meeting notes
  • Processing multilingual audio content
  • Speech-to-text conversion in noisy environments

Non-goals

  • Real-time streaming transcription (the skill mentions faster-whisper as an alternative)
  • Speaker diarization (identifying different speakers)
  • Managed API service (focus is on local execution)

Trust

  • Warning (Issues Attention): In the last 90 days, 17 issues were opened and 4 were closed, indicating a slow response rate to opened issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires a local Node.js installation and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.). Assumes the repository follows the agentskills.io format.

Quality score

95 /100
Analyzed 1 day ago

Trust signals

Latest commit: 1 day ago
Stars: 27.2k
License: MIT

Similar extensions

Whisper

97

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Skill
Orchestra-Research

Openai Whisper Api

95

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

Skill
steipete

Cli Anything Videocaptioner

99

AI-powered video captioning — transcribe speech, optimize/translate subtitles, and burn them into video via the stable VideoCaptioner backend. Free ASR and translation included.

Skill
hkuds

Transcribe

97

Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.

Skill
openai

Video to Text Bcut

96

Transcribe video/audio URL to text + word-level timestamps using Bilibili Bcut ASR API (free, no API key). Preferred for Chinese content — Bcut gives character-level timestamps vs Whisper word-level. Returns text + segments [{start, end, text}]. Requires yt-dlp + ffmpeg.

Skill
0xmariowu

Whisper Transcription

95

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives

Skill
guia-matthieu