SAUTI TTS
v1.0 — Swahili
Serves synthesized Swahili audio via a low-latency REST API with multiple voice options including voice cloning.
What it does
SAUTI TTS converts written Swahili text into natural-sounding speech and serves it through a production API.
How it works
Send a POST request with Swahili text and receive synthesized audio in seconds. The API supports WAV and MP3 output formats, configurable speaking rate, and multiple voice options.
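As a rough illustration of that request, here is a minimal Python sketch using only the standard library. The endpoint URL, the JSON field names (text, voice, format, rate) and the X-API-Key header are assumptions made for this example, not the documented SAUTI TTS interface; only the POST method, the WAV/MP3 formats, the 0.5 to 2.0 rate range and API key authentication come from this page.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL is not given on this page.
SAUTI_URL = "https://api.example.com/v1/tts"


def build_tts_request(text, api_key, voice="default", audio_format="wav", rate=1.0):
    """Build (but do not send) a POST request for Swahili speech synthesis."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if audio_format not in ("wav", "mp3"):
        raise ValueError("audio_format must be 'wav' or 'mp3'")
    if not 0.5 <= rate <= 2.0:
        raise ValueError("rate must be between 0.5 and 2.0")
    # Field names below are assumed for illustration only.
    payload = {"text": text, "voice": voice, "format": audio_format, "rate": rate}
    return urllib.request.Request(
        SAUTI_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Header name for the API key is an assumption.
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def synthesize(text, api_key, out_path, **options):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_tts_request(text, api_key, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return out_path
```

With a real endpoint and key in place, a call such as synthesize("Habari ya asubuhi", api_key, "habari.mp3", audio_format="mp3", rate=1.25) would save the response body as an MP3 file. Validating the format and rate on the client side avoids a round trip for requests the service would presumably reject.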
Capabilities
- Multiple voice options including custom voices
- Zero-shot voice cloning from short audio clips (in progress; see Roadmap)
- REST API with API key authentication
- WAV and MP3 output formats
- Configurable speaking rate (0.5× to 2.0×)
- Swahili-specific text processing and phonemisation
Roadmap
Voice cloning (in progress)
Upload 6 to 30 seconds of audio of any voice and get a personalised voice clone. Custom voices can support brand identities, accessibility, and personalised user experiences.
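Because the accepted clip length is bounded, a client may want to check a recording before uploading it. The sketch below is one possible local check for WAV files using Python's standard wave module. It assumes the 6 and 30 second bounds are inclusive and says nothing about how the upload itself works, since the cloning interface is not described here; the function names are hypothetical.

```python
import wave

# Bounds taken from the stated clip length; inclusivity is an assumption.
MIN_SECONDS = 6.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def is_usable_reference_clip(path):
    """True if the clip length falls within the assumed 6 to 30 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Clips in other formats, such as MP3, would need a different decoder, because the wave module reads WAV files only.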
Additional languages
After stabilising the Swahili pipeline, we plan to extend SAUTI TTS to additional East African languages, starting with Kikuyu and Luo, using the same methodology.