Whisper

Skill Active

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.
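The spread of model sizes can be made concrete with a small lookup. This is a minimal sketch, not part of the skill: the tiny and large figures come from the description above, while the intermediate counts are assumed from the upstream Whisper README and should be treated as approximate. `MODEL_PARAMS_M` and `pick_model` are hypothetical names used only for illustration.

```python
# Approximate parameter counts in millions.
# "tiny" and "large" are stated in this listing; the other four values are
# an assumption taken from the upstream Whisper README.
MODEL_PARAMS_M = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "turbo": 809,
    "large": 1550,
}


def pick_model(max_params_m: int) -> str:
    """Return the largest model whose parameter count fits the given budget."""
    candidates = [
        (params, name)
        for name, params in MODEL_PARAMS_M.items()
        if params <= max_params_m
    ]
    if not candidates:
        raise ValueError(f"no model fits within {max_params_m}M parameters")
    # Tuples compare by parameter count first, so max() picks the biggest fit.
    return max(candidates)[1]
```

A caller with a budget of roughly 100M parameters would get `pick_model(100)`, which selects "base" under these assumed figures.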

Purpose

To provide a powerful and flexible solution for converting spoken audio into text, suitable for a wide range of applications from podcast transcription to multilingual audio analysis.

Features

Multilingual speech-to-text transcription
Translation to English
Language identification
Support for multiple model sizes
Configurable transcription options
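The features above map onto a few calls in the open-source `openai-whisper` Python package. The following is a minimal sketch, assuming that package and ffmpeg are installed locally; `build_options` and `run_whisper` are hypothetical helper names, and the audio path is whatever file the caller supplies.

```python
from typing import Any, Dict, Optional


def build_options(task: str = "transcribe",
                  language: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword options for a Whisper transcription call.

    task: "transcribe" keeps the spoken language, "translate" outputs English.
    language: optional language code such as "ja"; when omitted, Whisper
    identifies the language itself.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    options: Dict[str, Any] = {"task": task}
    if language is not None:
        options["language"] = language
    return options


def run_whisper(audio_path: str,
                model_name: str = "base",
                task: str = "transcribe",
                language: Optional[str] = None) -> Dict[str, str]:
    """Transcribe or translate one audio file with a local Whisper model."""
    # Imported lazily so the helper above works without the package installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_options(task, language))
    # The result dict carries the full text and the detected language code.
    return {"text": result["text"], "language": result["language"]}
```

Calling `run_whisper("episode.mp3", task="translate")` would return English text for a non-English recording; the file name here is only a placeholder.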

Use Cases

Transcribing podcasts and videos
Automating meeting notes
Processing multilingual audio content
Speech-to-text conversion in noisy environments

Non-Goals

Real-time streaming transcription (faster-whisper is suggested as an alternative)
Speaker diarization (identifying different speakers)
Managed API service (focus is on local execution)

Trust

Warning (issues need attention): In the last 90 days, 17 issues were opened and 4 were closed, indicating a slow response rate to opened issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

95/100

Analyzed about 22 hours ago

Trust Signals

Last commit: about 24 hours ago

GitHub owner: davila7

Stars: 27.2k

Downloads: 23k

License: MIT

Website: aitmpl.com

Similar Extensions

Whisper

Skill

Orchestra-Research

Openai Whisper Api

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

Skill

steipete

Cli Anything Videocaptioner

AI-powered video captioning — transcribe speech, optimize/translate subtitles, and burn them into video via the stable VideoCaptioner backend. Free ASR and translation included.

Skill

hkuds

Transcribe

Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.

Skill

openai

Video to Text Bcut

Transcribe video/audio URL to text + word-level timestamps using Bilibili Bcut ASR API (free, no API key). Preferred for Chinese content — Bcut gives character-level timestamps vs Whisper word-level. Returns text + segments [{start, end, text}]. Requires yt-dlp + ffmpeg.

Skill

0xmariowu

Whisper Transcription

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives

Skill

guia-matthieu