Whisper

Skill Active

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Purpose

To convert spoken audio into text on the local machine, for applications ranging from podcast transcription to multilingual audio analysis.

Features

  • Multilingual speech-to-text transcription
  • Translation to English
  • Language identification
  • Support for multiple model sizes
  • Configurable transcription options
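
The features above map onto a few calls in the open-source `openai-whisper` Python package. The following is a minimal sketch, not the skill's own code: `build_options` and `run_whisper` are hypothetical helper names, and the sketch assumes `openai-whisper` and ffmpeg are installed locally. The import is deferred so the option-building helper can be used without the package.

```python
from typing import Any, Dict, Optional


def build_options(task: str = "transcribe", language: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for Whisper's transcribe() call (hypothetical helper)."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    options: Dict[str, Any] = {"task": task}
    if language is not None:
        # When no language is given, Whisper detects it from the audio.
        options["language"] = language
    return options


def run_whisper(audio_path: str, model_name: str = "base", **kwargs: Any) -> Dict[str, Any]:
    """Transcribe (or translate to English) one audio file with a local model."""
    import whisper  # deferred import: assumes `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(model_name)  # e.g. "tiny", "base", "large"
    result = model.transcribe(audio_path, **build_options(**kwargs))
    # The result dict also carries timestamped "segments".
    return {"text": result["text"], "language": result["language"]}
```

For example, `run_whisper("talk.mp3", task="translate")` would return the English translation together with the detected source language; `talk.mp3` is a placeholder file name.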

Use Cases

  • Transcribing podcasts and videos
  • Automating meeting notes
  • Processing multilingual audio content
  • Speech-to-text conversion in noisy environments
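
For the podcast and video use cases, a transcript is often wanted as subtitles. Whisper's transcription result includes a list of segments, each with `start`, `end`, and `text` fields. The sketch below turns such a list into SubRip (SRT) text; the function names are illustrative and not part of this skill.

```python
from typing import Any, Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict[str, Any]]) -> str:
    """Render segments (dicts with start, end, text) as numbered SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```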

Non-Goals

  • Real-time streaming transcription (the skill points to faster-whisper as an alternative)
  • Speaker diarization (identifying different speakers)
  • Managed API service (focus is on local execution)

Trust

  • Warning (issues need attention): In the last 90 days, 17 issues were opened and 4 were closed, indicating a slow response rate to opened issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repository follows the agentskills.io format.

Quality Score

95/100
Analyzed about 22 hours ago

Trust Signals

Last commit: about 24 hours ago
Stars: 27.2k
License: MIT

Similar Extensions

Whisper

97

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Skill
Orchestra-Research

Openai Whisper Api

95

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

Skill
steipete

Cli Anything Videocaptioner

99

AI-powered video captioning — transcribe speech, optimize/translate subtitles, and burn them into video via the stable VideoCaptioner backend. Free ASR and translation included.

Skill
hkuds

Transcribe

97

Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.

Skill
openai

Video to Text Bcut

96

Transcribe video/audio URL to text + word-level timestamps using Bilibili Bcut ASR API (free, no API key). Preferred for Chinese content — Bcut gives character-level timestamps vs Whisper word-level. Returns text + segments [{start, end, text}]. Requires yt-dlp + ffmpeg.

Skill
0xmariowu

Whisper Transcription

95

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives

Skill
guia-matthieu

© 2025 SkillRepo · Find the right skill, skip the noise.