Whisper

Skill (active)

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Purpose

To provide a powerful and flexible solution for converting spoken audio into text, suitable for a wide range of applications from podcast transcription to multilingual audio analysis.

Features

Multilingual speech-to-text transcription
Translation to English
Language identification
Support for multiple model sizes
Configurable transcription options
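
The features above can be sketched with the open-source `openai-whisper` Python package. This is a minimal illustration, not the skill's own code: `build_options`, `run_whisper`, and `MODEL_SIZES` are hypothetical names introduced here, and the size names are assumed to match the six sizes the package documents. The `whisper` import is deferred so the option helper can be used without the package installed.

```python
from typing import Any, Dict, Optional

# Assumed size names, taken from the openai-whisper README (tiny ... large, plus turbo).
MODEL_SIZES = ("tiny", "base", "small", "medium", "large", "turbo")


def build_options(task: str = "transcribe", language: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for `model.transcribe` (hypothetical helper)."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    options: Dict[str, Any] = {"task": task}
    if language is not None:
        # Language code such as "ja"; when omitted, Whisper detects the language itself.
        options["language"] = language
    return options


def run_whisper(audio_path: str, model_size: str = "base", **kwargs: Any) -> Dict[str, Any]:
    """Transcribe or translate one audio file.

    Requires `pip install openai-whisper` and ffmpeg on the PATH.
    """
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model_size!r}")
    import whisper  # deferred import: only needed when a model is actually run

    model = whisper.load_model(model_size)
    # The result is a dict with "text", "segments", and the detected "language".
    return model.transcribe(audio_path, **build_options(**kwargs))
```

With the package installed, `run_whisper("podcast.mp3")["text"]` would return the transcript, and `run_whisper("interview.mp3", task="translate")` would return an English translation. Both file names are placeholders. The `"language"` key of the result carries the identified language code.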

Use cases

Transcribing podcasts and videos
Automating meeting notes
Processing multilingual audio content
Speech-to-text conversion in noisy environments
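
For the podcast and video use case, a common follow-up step is turning timed segments into subtitles. The sketch below assumes segments shaped like the `segments` list in Whisper's transcription result (dicts with `start`, `end` in seconds, and `text`). The helpers `srt_timestamp` and `segments_to_srt` are hypothetical names for illustration only.

```python
from typing import Any, Dict, Iterable


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Dict[str, Any]]) -> str:
    """Render segments as SRT cues: index, time range, text, blank separator."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_range}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Welcome to the show."},
        {"start": 2.5, "end": 5.0, "text": " Today we talk about speech recognition."},
    ]
    print(segments_to_srt(demo))
```

Keeping the formatting separate from the model call means the same function works on segments from any transcriber that reports start and end times.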

Non-goals

Real-time streaming transcription (faster-whisper is mentioned as an alternative)
Speaker diarization (identifying different speakers)
Managed API service (focus is on local execution)

Trust

Warning: issues need attention. In the last 90 days, 17 issues were opened and 4 were closed, indicating a slow response rate to opened issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) through npx. Requires Node.js installed locally and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.). Assumes the repository follows the agentskills.io format.

Quality score

95/100

Analyzed 1 day ago

Trust signals

Last commit: 1 day ago

GitHub owner: davila7

Stars: 27.2k

Downloads: 23k

License: MIT

Website: aitmpl.com

Similar extensions

Whisper

Skill

Orchestra-Research

Openai Whisper Api

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

Skill

steipete

Cli Anything Videocaptioner

AI-powered video captioning — transcribe speech, optimize/translate subtitles, and burn them into video via the stable VideoCaptioner backend. Free ASR and translation included.

Skill

hkuds

Transcribe

Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.

Skill

openai

Video to Text Bcut

Transcribe video/audio URL to text + word-level timestamps using Bilibili Bcut ASR API (free, no API key). Preferred for Chinese content — Bcut gives character-level timestamps vs Whisper word-level. Returns text + segments [{start, end, text}]. Requires yt-dlp + ffmpeg.

Skill

0xmariowu

Whisper Transcription

Transcribe audio and video files to text using OpenAI Whisper. Use when converting podcasts to blog posts, creating video subtitles, extracting quotes from interviews, repurposing video content to text, or building searchable audio archives.

Skill

guia-matthieu