Turbo v2.5 Text to Speech
Low-latency, high-quality text-to-speech optimized for speed. 32 languages, ~300ms response time, 3x faster than standard models.
Key Features
~300ms Latency
Ultra-fast response time, 3x faster than Multilingual v2
32 Languages
Support for 32 languages with automatic language detection
40,000 Characters
Generate up to 40 minutes of audio in a single request
50% Lower Cost
Half the price per character compared to standard models
Premium Voice Library
20+ default voices plus access to community voice library
Word Timestamps
Get timing data for each word, perfect for synchronization
How to Use
Generate speech in three simple steps
Enter Text
Type or paste your text in any of 32 supported languages (up to 40,000 characters)
Choose Voice
Select from premium voices and customize stability, similarity, and speed
Generate Audio
Get your audio in ~300ms with optional word timestamps
Technical Specifications
Use Cases
Real-Time Voice Agents
Build conversational AI assistants with fast response times
Interactive Applications
Add voice to games, apps, and interactive experiences
Large-Scale TTS
Process high volumes of text efficiently at lower cost
Multilingual Content
Create content in 32 languages with consistent quality
IVR Systems
Build natural-sounding interactive voice response systems
Live Streaming
Generate real-time voice content for live applications
Frequently Asked Questions
Find answers to common questions about this model
Turbo v2.5 is a low-latency, high-quality text-to-speech model optimized for speed. It delivers ~300ms response time (3x faster than Multilingual v2) while maintaining excellent audio quality across 32 languages.
Turbo v2.5 generates speech in approximately 250-300ms, making it ideal for real-time applications like voice agents and interactive experiences.
Turbo v2.5 supports 32 languages, including English, Chinese, Japanese, Korean, Spanish, French, German, Italian, Portuguese, Hindi, Arabic, Dutch, Polish, Swedish, Turkish, Vietnamese, Hungarian, Norwegian, and more.
You can convert up to 40,000 characters in a single request, producing approximately 40 minutes of audio.
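For text longer than the single-request limit, one option is to split it into several requests at sentence boundaries. The following is a minimal sketch, not part of the platform itself: the function names are hypothetical, and the duration estimate simply reuses the ratio implied above (40,000 characters for roughly 40 minutes), which will vary with voice, language, and speed.

```python
import re

MAX_CHARS = 40_000        # per-request limit stated on this page
CHARS_PER_MINUTE = 1_000  # rough average implied by "40,000 characters ~ 40 minutes"


def split_for_requests(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def estimate_minutes(char_count: int) -> float:
    """Very rough audio length estimate based on the ratio quoted on this page."""
    return char_count / CHARS_PER_MINUTE
```

Each returned chunk can then be submitted as its own request, and the resulting audio files joined in order.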
Compared with Multilingual v2, Turbo v2.5 is 3x faster and costs 50% less per character. Choose Turbo for speed-critical applications; choose Multilingual v2 for maximum emotional expressiveness.
When word timestamps are enabled, Turbo v2.5 returns timing data for each word in the generated speech, which is useful for lip-sync, captions, and other synchronization tasks.
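As an illustration of the caption use case, the sketch below turns per-word timing data into SRT subtitle cues. The input shape is an assumption: this page does not document the timestamp format, so the example supposes a list of dictionaries with hypothetical keys "word", "start", and "end" (times in seconds). Adapt the field names to whatever the actual response contains.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group word timings into caption cues of up to `max_words` words each.

    `words` is assumed to look like:
        [{"word": "Hello", "start": 0.0, "end": 0.4}, ...]
    """
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        start = format_srt_time(group[0]["start"])  # first word's start
        end = format_srt_time(group[-1]["end"])     # last word's end
        text = " ".join(w["word"] for w in group)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(cues)
```

Grouping by a fixed word count keeps the example short; a production captioner would more likely break cues on punctuation or pauses between words.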
Adjust stability (0-1, controls consistency), similarity boost (0-1, voice matching), and speed (0.7x to 1.2x). For real-time apps, use higher stability settings.
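The ranges above can be checked on the client before a request is sent. This is a small sketch under stated assumptions: the class name, field names, and the default speed of 1.0 are illustrative and are not taken from the platform's API; only the numeric ranges come from this page.

```python
from dataclasses import asdict, dataclass


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError if value falls outside the inclusive range [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the three settings described on this page."""

    stability: float         # 0-1, controls consistency
    similarity_boost: float  # 0-1, voice matching
    speed: float = 1.0       # 0.7x to 1.2x; 1.0 is an assumed "normal" default

    def __post_init__(self) -> None:
        # Validate every field as soon as the object is created.
        _check_range("stability", self.stability, 0.0, 1.0)
        _check_range("similarity_boost", self.similarity_boost, 0.0, 1.0)
        _check_range("speed", self.speed, 0.7, 1.2)

    def to_dict(self) -> dict:
        """Return the settings as a plain dict, e.g. for a JSON request body."""
        return asdict(self)
```

For example, VoiceSettings(stability=0.8, similarity_boost=0.5) passes validation, while a speed of 1.5 raises a ValueError before any request is made.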
All audio generated through our platform can be used for commercial purposes, including apps, games, videos, and products.
Start Creating with Turbo v2.5
Low-latency, high-quality speech synthesis for real-time applications