OpenAI Whisper API
Transcribe audio via the OpenAI Audio Transcriptions API (Whisper).
This skill transcribes audio files with the OpenAI Whisper API through a convenient command-line interface.
Features
- Audio transcription via OpenAI Whisper API
- Customizable output format (text/JSON)
- Support for specifying language and prompt hints
- Configurable API base URL for proxies
- Environment variable for API key management
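The features above can be illustrated with a minimal Python sketch of how such a request might be assembled. It is not the skill's actual script. The endpoint path, the `whisper-1` model name, and the `language`, `prompt`, and `response_format` fields come from the public OpenAI Audio Transcriptions API; the function name `build_transcription_request` and the `OPENAI_BASE_URL` variable name for the proxy override are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://api.openai.com/v1"


def build_transcription_request(
    audio_path: str,
    response_format: str = "text",
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Assemble the URL, headers, and form fields for a transcription call.

    Nothing is sent here; the returned dict can be handed to any HTTP
    client that supports multipart file uploads.
    """
    # The API key is read from the environment, never hard-coded.
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")

    # This sketch only covers the two formats the feature list mentions.
    if response_format not in ("text", "json"):
        raise ValueError("response_format must be 'text' or 'json'")

    # OPENAI_BASE_URL is an assumed name for the proxy override.
    base_url = env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")

    fields = {"model": "whisper-1", "response_format": response_format}
    if language:
        fields["language"] = language  # e.g. "en"
    if prompt:
        fields["prompt"] = prompt  # hint for names or jargon in the audio

    return {
        "url": f"{base_url}/audio/transcriptions",
        "headers": {"Authorization": f"Bearer {api_key}"},
        "fields": fields,
        "file_path": audio_path,
    }
```

Calling `build_transcription_request("meeting.m4a", response_format="json", language="en")` with `OPENAI_API_KEY` set returns the pieces for a multipart POST. Keeping request construction separate from sending makes the proxy and key handling easy to check without network access.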
Use Cases
- Transcribing meeting recordings for notes
- Converting spoken content from videos into text
- Generating transcripts for podcasts or interviews
- Processing voice commands or dictations
Non-Goals
- Real-time speech-to-text streaming
- Speaker diarization or identification
- On-device or offline audio transcription
- Advanced audio editing or manipulation
Practical Utility
- Unique selling proposition: The skill is a direct wrapper around the OpenAI API with some convenience scripting. It provides a usable interface but adds little custom logic beyond the API interaction itself.
Installation
npx skills add steipete/clawdis
Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar Extensions
Whisper
Score: 97. OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.
Speech Generation Skill
Score: 100. Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.
YouTube Downloader
Score: 100. Download and process YouTube content for research. Use when: downloading competitor videos for analysis; extracting audio for podcasts; getting transcripts for content repurposing; archiving webinars; curating research content.
Sheet Music Publisher
Score: 99. Converts mastered audio to sheet music and creates printable songbooks. Use after mastering when the user wants sheet music or a songbook for their album.
Transcribe
Score: 97. Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.
Whisper Transcription
Score: 95. Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives.