Azure Speech To Text Rest Py

Skill (Active)

Azure Speech to Text REST API for short audio (Python). Use for simple speech recognition of audio files up to 60 seconds without the Speech SDK. Triggers: "speech to text REST", "short audio transcription", "speech recognition REST API", "STT REST", "recognize speech REST". DO NOT USE FOR: Long audio (>60 seconds), real-time streaming, batch transcription, custom speech models, speech translation. Use Speech SDK or Batch Transcription API instead.

Purpose

To enable simple, SDK-free speech-to-text transcription of short audio files using the Azure REST API.

Features

  • Transcribe short audio files (up to 60 seconds)
  • Use the Azure Speech to Text REST API directly
  • Support WAV and OGG audio formats
  • Provide basic and detailed response formats
  • Enable pronunciation assessment capabilities
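
As a rough illustration of the features above, the following is a minimal sketch of how a short-audio request might be built and how the reply might be read, using only the Python standard library. The endpoint path, the header names, the `format` query values, and the response fields (`RecognitionStatus`, `DisplayText`, `NBest`) are assumptions based on the publicly documented short-audio REST API and are not taken from this listing; the helper names `build_stt_request` and `extract_text` are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_stt_request(audio_bytes, region, key, language="en-US", detailed=False):
    """Build (but do not send) a short-audio recognition request.

    Assumed, not confirmed by this listing: the regional endpoint path,
    the subscription-key header, and a 16 kHz PCM WAV payload.
    """
    query = urllib.parse.urlencode(
        {"language": language, "format": "detailed" if detailed else "simple"}
    )
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=audio_bytes, headers=headers, method="POST")


def extract_text(response_body):
    """Return the recognized text, or None if recognition did not succeed.

    Field names are assumptions about the JSON reply: the detailed format
    is expected to carry an NBest list, the simple format a DisplayText.
    """
    result = json.loads(response_body)
    if result.get("RecognitionStatus") != "Success":
        return None
    if result.get("NBest"):
        return result["NBest"][0].get("Display")
    return result.get("DisplayText")
```

To actually send the request, one could pass the returned object to `urllib.request.urlopen` and feed the decoded response body to `extract_text`. Keeping construction and sending separate makes the request easy to inspect before any network call is made.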

Use Cases

  • Quickly transcribing short voice memos or notes.
  • Integrating speech-to-text into applications without the overhead of the Speech SDK.
  • Performing simple audio file analysis for content extraction.

Non-Goals

  • Transcribing audio longer than 60 seconds
  • Real-time streaming transcription
  • Batch transcription of multiple files
  • Speech translation
  • Using custom speech models
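
Since audio longer than 60 seconds is out of scope, a caller may want to check a file's length before sending it. The sketch below does this for WAV data with the standard-library `wave` module; the helper names and the `MAX_SECONDS` constant are hypothetical, and OGG input would need a different decoder.

```python
import io
import wave

# Limit taken from the skill description (audio up to 60 seconds).
MAX_SECONDS = 60.0


def wav_duration_seconds(wav_bytes):
    """Return the duration of in-memory WAV data in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_short_audio(wav_bytes, limit=MAX_SECONDS):
    """True if the WAV data fits within the short-audio limit."""
    return wav_duration_seconds(wav_bytes) <= limit
```

If `is_short_audio` returns False, the description above suggests falling back to the Speech SDK or the Batch Transcription API instead.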

Trust

  • Warning (issues need attention): There are 19 open issues and 11 closed issues in the last 90 days, indicating a closure rate below 50% and potentially slow maintainer response.

Installation

Add the marketplace first:

/plugin marketplace add microsoft/skills
/plugin install azure-sdk-python@skills

Quality Score

75/100
Analyzed 1 day ago

Trust Signals

Last commit: 1 day ago
Stars: 2.3k
License: MIT

Similar Extensions

Azure Servicebus Py

100

Azure Service Bus SDK for Python messaging. Use for queues, topics, subscriptions, and enterprise messaging patterns. Triggers: "service bus", "ServiceBusClient", "queue", "topic", "subscription", "message broker".

Skill
microsoft

Azure Monitor Query Py

100

Azure Monitor Query SDK for Python. Use for querying Log Analytics workspaces and Azure Monitor metrics. Triggers: "azure-monitor-query", "LogsQueryClient", "MetricsQueryClient", "Log Analytics", "Kusto queries", "Azure metrics".

Skill
microsoft

Azure Container Registry SDK for Python

100

Azure Container Registry SDK for Python. Use for managing container images, artifacts, and repositories. Triggers: "azure-containerregistry", "ContainerRegistryClient", "container images", "docker registry", "ACR".

Skill
microsoft

Azure App Configuration SDK for Python

100

Azure App Configuration SDK for Python. Use for centralized configuration management, feature flags, and dynamic settings. Triggers: "azure-appconfiguration", "AzureAppConfigurationClient", "feature flags", "configuration", "key-value settings".

Skill
microsoft

Elevenlabs Stt

98

ElevenLabs speech-to-text with Scribe models and forced alignment via inference.sh CLI. Models: Scribe v1/v2 (98%+ accuracy, 90+ languages). Capabilities: transcription, speaker diarization, audio event tagging, word-level timestamps, forced alignment, subtitle generation. Use for: meeting transcription, subtitles, podcast transcripts, lip-sync timing, karaoke. Triggers: elevenlabs stt, elevenlabs transcription, scribe, elevenlabs speech to text, forced alignment, word alignment, subtitle timing, diarization, speaker identification, audio event detection, eleven labs transcribe

Skill
inferen-sh

Whisper

97

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

Skill
Orchestra-Research