FiniFlow Labs API

Developer Overview

SAUTI is a voice AI platform for African languages. This documentation covers the REST APIs for text-to-speech, speech recognition, voice management, voice cloning, asynchronous job processing, and the conversational voice agent, as well as the real-time translation WebSocket.

APIs

SAUTI TTS

Text-to-Speech API

Live

Convert Swahili text to natural-sounding audio. The REST endpoint returns base64-encoded WAV audio in a JSON response.

View TTS reference →
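
Because the audio arrives as base64 text inside JSON, a client has to decode it before playing or saving it. The sketch below shows one way to do that with the Python standard library. The field name `audio_base64` and the helper `make_fake_response` are assumptions made for illustration (the stand-in response is generated locally so the example runs without an API key); the TTS reference is the place to confirm the real response shape.

```python
import base64
import io
import json
import wave


def decode_wav_from_json(body: str, field: str = "audio_base64") -> bytes:
    """Extract and decode a base64-encoded WAV from a JSON response body.

    `field` is a hypothetical name; substitute the one the TTS reference gives.
    """
    payload = json.loads(body)
    return base64.b64decode(payload[field], validate=True)


def make_fake_response(num_frames: int = 1600, sample_rate: int = 16000) -> str:
    """Build a stand-in JSON body holding silent 16-bit mono WAV audio."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_frames)
    encoded = base64.b64encode(buf.getvalue()).decode("ascii")
    return json.dumps({"audio_base64": encoded})


# Decode the stand-in response and inspect the audio it contains.
wav_bytes = decode_wav_from_json(make_fake_response())
with wave.open(io.BytesIO(wav_bytes), "rb") as w:
    duration_s = w.getnframes() / w.getframerate()
```

The decoded bytes can be written straight to a `.wav` file or handed to an audio player.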

SAUTI ASR

Speech Recognition API

Live

Transcribe Swahili audio with a fine-tuned model that achieves a 13.5% word error rate — about half the error of multilingual baselines.

View ASR reference →
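
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The sketch below computes it with a word-level edit distance. It is a generic illustration of the metric, not the evaluation code behind the figure quoted above, and it assumes whitespace tokenisation with no text normalisation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of three reference words.
example_wer = word_error_rate("habari ya asubuhi", "habari za asubuhi")
```

On this scale, a 13.5% word error rate corresponds to roughly 13 to 14 word-level errors per 100 reference words.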

Voices

Voice Registry API

Live

List available voices and retrieve voice metadata. Filter by language to find the right voice for your application.

View Voices reference →

Voice Cloning

Personalised Voices API

Beta

Create a personalised voice from a 6–30 second reference clip and use it for synthesis through the TTS endpoint.

View Voice Clone reference →
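
It can save a round trip to check a clip's length locally before uploading it. The sketch below measures a WAV clip and tests it against the 6–30 second window. It assumes the clip is a WAV file and that both bounds are inclusive, neither of which is stated here; the helper `silent_wav` exists only to produce test input.

```python
import io
import wave

MIN_CLIP_S = 6.0   # lower bound from the overview (inclusive by assumption)
MAX_CLIP_S = 30.0  # upper bound from the overview (inclusive by assumption)


def clip_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of a WAV clip in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


def is_valid_reference_clip(wav_bytes: bytes) -> bool:
    """True when the clip length falls inside the 6-30 second window."""
    return MIN_CLIP_S <= clip_duration_seconds(wav_bytes) <= MAX_CLIP_S


def silent_wav(seconds: int, sample_rate: int = 8000) -> bytes:
    """Generate silent 16-bit mono WAV audio of the given length (test input)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * (seconds * sample_rate))
    return buf.getvalue()
```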

Async Jobs

Long-text Synthesis API

Live

Create, poll, and download audio for long-running TTS synthesis jobs. Texts over 2,000 characters are processed asynchronously.

View Jobs reference →
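
The create, poll, download flow usually comes down to a small loop on the client side. The sketch below shows the length check and a polling loop with a timeout. The status strings (`completed`, `failed`), the default interval, and the `get_status` callable are all assumptions for illustration; in a real client `get_status` would wrap the job-status request described in the Jobs reference. Note that `len` counts Unicode code points, which may not match how the service counts characters.

```python
import time
from typing import Callable

ASYNC_THRESHOLD_CHARS = 2000  # from the overview: longer texts run asynchronously


def needs_async(text: str) -> bool:
    """True when the text is long enough to be processed as an async job."""
    return len(text) > ASYNC_THRESHOLD_CHARS


def poll_until_done(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status until it reports a terminal state or the timeout passes.

    The terminal status names are hypothetical placeholders.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Simulated job that finishes on the third status check.
_statuses = iter(["queued", "processing", "completed"])
final_status = poll_until_done(lambda: next(_statuses), sleep=lambda _: None)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; once the job reports a terminal state, the client can go on to download the audio.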

Voice Agent API

Conversational AI

Beta

End-to-end Swahili conversation: ASR → LLM → TTS in one call. Text and audio modes, with preset general, banking, and health scenarios.

View Voice Agent reference →

Real-time Translation

Streaming WebSocket

Beta

Bidirectional WebSocket for live English ↔ Kiswahili speech-to-speech translation, targeting sub-second end-to-end latency.

View Translation reference →

Early access

SAUTI TTS is live. API keys are available on request — email hello@finiflowlabs.com with your use case.