The Azure AI Speech Toolkit helps developers explore the Azure AI Speech service and run quickstarts and scenario use cases with a few clicks.
The Speech service provides speech to text and text to speech capabilities through a Speech resource. You can transcribe speech to text with high accuracy, produce natural-sounding text to speech voices, and translate spoken audio, among other tasks.
Speech to text: Use real-time transcription for instant transcription of live audio input with intermediate results; fast transcription for the fastest synchronous output where predictable latency is required; batch transcription for efficient processing of large volumes of prerecorded audio; and custom speech for enhanced accuracy in specific domains and conditions.
Text to speech: Convert input text into humanlike synthesized speech using neural voices, which are powered by deep neural networks. Use the Speech Synthesis Markup Language (SSML) to fine-tune pitch, pronunciation, speaking rate, volume, and more.
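As a rough illustration of the SSML tuning described above, the following sketch builds an SSML document that adjusts pitch, speaking rate, and volume through a prosody element. It uses only the Python standard library and does not call the Speech service. The helper name build_ssml and the voice name en-US-JennyNeural are assumptions chosen for illustration; substitute a voice available to your Speech resource.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace used by SSML documents.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US",
               pitch="default", rate="default", volume="default"):
    """Return an SSML string wrapping `text` in a prosody element.

    build_ssml is a hypothetical helper, not part of any SDK.
    The voice name is an example; use one your resource supports.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<prosody pitch={quoteattr(pitch)} rate={quoteattr(rate)} "
        f"volume={quoteattr(volume)}>"
        f"{escape(text)}"  # escape &, <, > so the document stays well-formed
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    # Slightly higher pitch, 10% faster, 20% louder than the voice default.
    print(build_ssml("Tom & Jerry say hello.",
                     pitch="+5%", rate="+10%", volume="+20%"))
```

The resulting string can then be passed to whichever text to speech client you use that accepts SSML input. Escaping the text and quoting the attribute values keeps the markup valid even when the input contains characters such as & or <.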
Speech translation: Speech translation enables real-time, multilingual translation of speech in your applications, tools, and devices. Use this feature for speech to speech and speech to text translation.