FiniFlow Labs API
Developer Overview
SAUTI is a voice AI platform for African languages. This documentation covers the REST APIs for text-to-speech, speech recognition, voice management, voice cloning, async job processing, the conversational voice agent, and the real-time translation WebSocket.
APIs
SAUTI TTS
Text-to-Speech API
Convert Swahili text to natural-sounding audio. REST endpoint, returns base64-encoded WAV in JSON.
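Because the endpoint returns base64-encoded WAV inside JSON, a client has to decode the payload before it can play or save the audio. The sketch below shows only that decoding step. The field name `audio` is an assumption made for illustration; the TTS reference defines the actual response shape.

```python
import base64
import io
import json
import wave


def decode_wav_from_response(response_body: str, field: str = "audio") -> bytes:
    """Extract base64-encoded WAV bytes from a JSON response body.

    The default `field` name is hypothetical; use the key given in the TTS reference.
    """
    payload = json.loads(response_body)
    return base64.b64decode(payload[field])


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of an in-memory WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

The decoded bytes can be written straight to a `.wav` file or passed to an audio library without further conversion.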
View TTS reference →
SAUTI ASR
Speech Recognition API
Transcribe Swahili audio with a fine-tuned model that achieves a 13.5% word error rate, about half the error rate of multilingual baselines.
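For context, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a generic implementation of that standard definition, not SAUTI's evaluation code. The published figure may rest on text normalisation or test data not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a 13.5% word error rate corresponds to 13.5 word-level errors per 100 reference words on average.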
View ASR reference →
Voices
Voice Registry API
List available voices and retrieve voice metadata. Filter by language to find the right voice for your application.
View Voices reference →
Voice Cloning
Personalised Voices API
Create a personalised voice from a 6–30 second reference clip and use it for synthesis through the TTS endpoint.
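Since the reference clip must be 6 to 30 seconds long, it can help to check its duration locally before uploading. This is a minimal sketch using Python's standard `wave` module. It assumes the clip is an uncompressed WAV file and that both bounds are inclusive; the function name is illustrative, and the Voice Clone reference defines the accepted formats.

```python
import wave

MIN_CLIP_SECONDS = 6.0   # lower bound from the overview; inclusivity is assumed
MAX_CLIP_SECONDS = 30.0  # upper bound from the overview; inclusivity is assumed


def clip_duration_ok(path_or_file) -> bool:
    """Return True if the WAV clip's duration falls within the 6-30 second range.

    Accepts a file path or a binary file-like object positioned at the start.
    """
    with wave.open(path_or_file, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return MIN_CLIP_SECONDS <= duration <= MAX_CLIP_SECONDS
```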
View Voice Clone reference →
Async Jobs
Long-text Synthesis API
Create, poll, and download audio for long-running TTS synthesis jobs. Texts over 2,000 characters are processed asynchronously.
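The 2,000-character split and the poll step of the create, poll, download flow can be outlined on the client side as follows. This is a sketch under stated assumptions: `fetch_status` stands in for whatever HTTP call retrieves a job, and the `status` field and its values `completed` and `failed` are hypothetical. The Jobs reference defines the real endpoints and response fields.

```python
import time
from typing import Callable

ASYNC_THRESHOLD_CHARS = 2000  # from the overview: longer texts are processed asynchronously


def needs_async_job(text: str) -> bool:
    """Return True when the text exceeds the 2,000-character threshold."""
    return len(text) > ASYNC_THRESHOLD_CHARS


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
) -> dict:
    """Call fetch_status repeatedly until the job reaches a terminal state.

    The terminal status names below are assumptions, not documented values.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_status()
        if job.get("status") in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(interval_seconds)
```

Passing the status call in as a function keeps the polling logic independent of any particular HTTP client and makes it easy to exercise with a stub.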
View Jobs reference →
Voice Agent API
Conversational AI
End-to-end Swahili conversation: ASR → LLM → TTS in one call. Text and audio modes, with preset general, banking, and health scenarios.
View Voice Agent reference →
Real-time Translation
Streaming WebSocket
Bidirectional WebSocket for live English ↔ Kiswahili speech-to-speech translation, with a target end-to-end latency of under one second.
View Translation reference →
Early access
SAUTI TTS is live. API keys are available on request: email hello@finiflowlabs.com with your use case.