
Whisper Transcription

Skill: verified, active

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives.

Purpose

To accurately convert spoken word from audio and video files into searchable text formats using advanced AI, enabling content repurposing and archival.

Features

Transcribe audio and video files
Batch processing of multiple files
Translate transcriptions to specified languages
Extract timestamps with text segments
Support for multiple output formats (txt, srt, vtt, json, tsv)
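The timestamped segments and SRT output listed above can be sketched as follows. This is a minimal illustration, not the skill's own script: it assumes segments shaped like the entries in openai-whisper's result (`start` and `end` in seconds, plus `text`), and the helper names `srt_timestamp` and `segments_to_srt` are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT cues.

    Assumption: each segment mirrors the shape of a Whisper result segment.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

VTT output would differ mainly in the `WEBVTT` header line and in using a period instead of a comma before the milliseconds.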

Use cases

Convert podcasts to blog posts
Create video subtitles (SRT/VTT)
Extract quotes from interviews
Build searchable audio archives

Non-goals

Replacing professional audio engineering
Making subjective creative decisions
Directly accessing or editing audio files
Guaranteeing commercial success of content

Workflow

Specify input file and desired command (transcribe, batch, translate, timestamps).
Select model size, output format, and optionally language.
Execute the command via Python script.
Receive the transcribed text or formatted output file.
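As a rough sketch of what these steps amount to, the snippet below calls the openai-whisper Python library directly. It does not reproduce the skill's script, whose name and flags are not shown here; `transcribe_file`, `output_path`, and the file name `interview.mp3` are hypothetical. It relies on `whisper.load_model` and `model.transcribe` with the `language` and `task` options.

```python
from pathlib import Path


def transcribe_file(path, model_size="base", language=None, translate=False):
    """Transcribe one audio/video file and return Whisper's result dict.

    Assumption: this wraps the openai-whisper library, not the skill's CLI.
    """
    import whisper  # lazy import; needs `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(model_size)
    # Whisper's "translate" task produces English text from other languages.
    task = "translate" if translate else "transcribe"
    # language=None lets Whisper detect the spoken language itself.
    return model.transcribe(str(path), language=language, task=task)


def output_path(input_path, fmt="txt"):
    """Place the output next to the input file, swapping the extension."""
    return Path(input_path).with_suffix(f".{fmt}")


# Example usage (commented out because it needs a model download and a real file):
# result = transcribe_file("interview.mp3", model_size="base")
# output_path("interview.mp3").write_text(result["text"].strip() + "\n", encoding="utf-8")
```

Smaller models load and run faster at some cost in accuracy, so the model size is usually the first parameter worth tuning.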

Prerequisites

Python 3
pip install openai-whisper torch ffmpeg-python click
ffmpeg installed on system
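Before running anything, a quick check along these lines can confirm that the prerequisites are in place. It is a sketch using only the standard library; the function name `missing_prerequisites` is hypothetical, and the import names assume that `openai-whisper` installs as `whisper` and `ffmpeg-python` as `ffmpeg`.

```python
import importlib.util
import shutil


def missing_prerequisites(
    modules=("whisper", "torch", "ffmpeg", "click"),
    binaries=("ffmpeg",),
):
    """Return a list describing Python modules and binaries that cannot be found."""
    missing = [
        f"python module: {name}"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
    # shutil.which returns None when the executable is not on PATH.
    missing += [
        f"binary: {name}" for name in binaries if shutil.which(name) is None
    ]
    return missing


# Example usage:
# for item in missing_prerequisites():
#     print("missing", item)
```

An empty list means every listed module is importable and the `ffmpeg` executable is on `PATH`.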

Code Execution

Logging (info): The script writes informative output to stdout/stderr during execution, covering model loading, transcription progress, and output file creation.

Installation

npx skills add guia-matthieu/clawfu-skills

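This runs the Vercel skills CLI (skills.sh) through npx. It requires Node.js installed locally and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.), and it assumes the repository follows the agentskills.io format.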

Quality score

Verified

95/100

Analyzed 1 day ago

Trust signals

Last commit: about 1 month ago

GitHub owner: guia-matthieu

Stars: 104

License: MIT

Website: clawfu.com


Similar extensions

YouTube Downloader

100

Download and process YouTube content for research. Use when: downloading competitor videos for analysis; extracting audio for podcasts; getting transcripts for content repurposing; archiving webinars; research content curation

Skill

guia-matthieu

Whisper

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Skill

Orchestra-Research

Openai Whisper Api

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

Skill

steipete

Speech Generation Skill

100

Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.

Skill

openai

Ffmpeg

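Video and audio processing with FFmpeg. Use for format conversion, resizing, compression, audio extraction, and preparing assets for Remotion. Triggers include converting GIF to MP4, resizing video, extracting audio, compressing files, or any media conversion task.

Skill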

digitalsamba

Sheet Music Publisher

Converts mastered audio to sheet music and creates printable songbooks. Use after mastering when the user wants sheet music or a songbook for their album.

Skill

bitwize-music-studio