
Whisper

Skill (active)

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.
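As a rough illustration of choosing among the model sizes, the sketch below picks the largest model that fits a parameter budget. The tiny and large figures come from the description above; the counts for base, small, medium, and turbo are taken from the upstream Whisper README and should be treated as approximate. The names `MODEL_PARAMS_M` and `pick_model` are hypothetical helpers, not part of the skill.

```python
# Approximate parameter counts in millions (assumption: values as listed
# in the upstream Whisper README; only tiny and large appear in this page).
MODEL_PARAMS_M = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "turbo": 809,
    "large": 1550,
}


def pick_model(max_params_m: int) -> str:
    """Return the largest model whose parameter count fits the budget."""
    candidates = [
        (params, name)
        for name, params in MODEL_PARAMS_M.items()
        if params <= max_params_m
    ]
    if not candidates:
        raise ValueError(f"no Whisper model fits within {max_params_m}M parameters")
    # max() compares tuples by parameter count first, so this is the largest fit.
    return max(candidates)[1]
```

Parameter count is only a proxy for memory use and speed, so a real selection would probably also weigh available VRAM and the accuracy needed for the target language.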

Purpose

To provide a powerful and flexible solution for converting spoken audio into text, suitable for a wide range of applications from podcast transcription to multilingual audio analysis.

Features

  • Multilingual speech-to-text transcription
  • Translation to English
  • Language identification
  • Support for multiple model sizes
  • Configurable transcription options
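The features above map onto the open-source `openai-whisper` Python package roughly as sketched below. This is a minimal sketch, assuming that package and ffmpeg are installed locally; `build_options` and `transcribe_file` are hypothetical wrapper names, while `whisper.load_model` and `model.transcribe` with the `task` and `language` options are the package's own calls.

```python
from typing import Any, Dict, Optional


def build_options(
    translate_to_english: bool = False,
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword options for Whisper's transcribe() call."""
    # task="translate" produces English text; task="transcribe" keeps the source language.
    options: Dict[str, Any] = {
        "task": "translate" if translate_to_english else "transcribe"
    }
    # Leaving the language out lets Whisper identify it from the audio.
    if language is not None:
        options["language"] = language
    return options


def transcribe_file(path: str, model_name: str = "base", **options: Any) -> Dict[str, Any]:
    """Load a Whisper model and transcribe one audio file."""
    import whisper  # imported lazily so build_options works without the package

    model = whisper.load_model(model_name)
    # The result is a dict that includes "text", "segments", and "language".
    return model.transcribe(path, **options)
```

For example, `transcribe_file("episode.mp3", **build_options(translate_to_english=True))` would return English text for a non-English recording, where `episode.mp3` is a placeholder file name.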

Use cases

  • Transcribing podcasts and videos
  • Automating meeting notes
  • Processing multilingual audio content
  • Speech-to-text conversion in noisy environments
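For the podcast and video use case, one common follow-up step is turning timed segments into subtitles. The sketch below formats segments as SRT text, assuming each segment is a dict with `start` and `end` in seconds plus a `text` field, which is the shape Whisper's transcription result uses for its `segments` list. `srt_timestamp` and `segments_to_srt` are hypothetical helper names.

```python
from typing import Any, Dict, Iterable


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Dict[str, Any]]) -> str:
    """Render timed segments as numbered SRT subtitle blocks."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = srt_timestamp(segment["start"])
        end = srt_timestamp(segment["end"])
        text = segment["text"].strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Subtitle blocks are separated by a blank line.
    return "\n".join(blocks)
```

The output can be written to a `.srt` file next to the video; the upstream command-line tool can also emit subtitle formats directly, so this helper is mainly useful when post-processing segments in Python.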

Non-goals

  • Real-time streaming transcription (faster-whisper is mentioned as an alternative)
  • Speaker diarization (identifying different speakers)
  • Managed API service (focus is on local execution)

Trust

  • Warning (issues need attention): In the last 90 days, 17 issues were opened and 4 were closed, indicating a slow response rate to opened issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality score

95/100
Analyzed about 22 hours ago

Trust signals

Last commit: 1 day ago
Stars: 27.2k
License: MIT

Similar extensions

Whisper

97

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Skill
Orchestra-Research

Openai Whisper Api

95

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

Skill
steipete

Cli Anything Videocaptioner

99

AI-powered video captioning — transcribe speech, optimize/translate subtitles, and burn them into video via the stable VideoCaptioner backend. Free ASR and translation included.

Skill
hkuds

Transcribe

97

Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.

Skill
openai

Video to Text Bcut

96

Transcribe video/audio URL to text + word-level timestamps using Bilibili Bcut ASR API (free, no API key). Preferred for Chinese content — Bcut gives character-level timestamps vs Whisper word-level. Returns text + segments [{start, end, text}]. Requires yt-dlp + ffmpeg.

Skill
0xmariowu

Whisper Transcription

95

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives.

Skill
guia-matthieu