Voice Cloning
Clone Any Voice
Upload 6-30 seconds of audio, enter text, and hear the AI speak in that voice. Powered by XTTS v2 zero-shot voice cloning.
1. Upload audio
2. Clone & speak
3. Listen
Reference Audio: 6-30 seconds of clear speech recommended
How it works: XTTS v2 extracts a speaker embedding from your audio, then uses it to synthesize new speech that preserves the original voice characteristics.
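For readers who want to try the same flow outside this page, here is a minimal sketch. It assumes the open-source Coqui `TTS` Python package and its published XTTS v2 model name; the page does not say which library it uses internally. The function name `clone_and_speak` and the file paths are illustrative only.

```python
# Sketch only: assumes `pip install TTS` (Coqui TTS) and PyTorch are available.
# This is not necessarily how this page runs XTTS v2.

# Model identifier used by the Coqui TTS package for XTTS v2.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def clone_and_speak(text, reference_wav, out_path="cloned.wav", language="en"):
    """Synthesize `text` in the voice heard in `reference_wav` (hypothetical helper)."""
    # Imported lazily so the module can be loaded without the heavy dependencies.
    import torch
    from TTS.api import TTS

    # Prefer a GPU if one is present; CPU inference is possible but slow.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Loading the model is the slow step on first use.
    tts = TTS(XTTS_MODEL).to(device)

    # The reference clip is passed as `speaker_wav`; the library derives the
    # speaker conditioning from it and writes the synthesized audio to disk.
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

A call such as `clone_and_speak("Hello there.", "my_voice.wav")` would write `cloned.wav`, where `my_voice.wav` is a placeholder for your own reference recording.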
Best results: Use 10-30 seconds of clear speech with minimal background noise. A single speaker works best.
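The length guidance can be checked before uploading. The sketch below uses only Python's standard `wave` module, so it handles uncompressed PCM WAV files only. The thresholds mirror the 6-30 second and 10-30 second figures on this page; the function names and return strings are hypothetical, and the check says nothing about noise or the number of speakers.

```python
import wave

# Thresholds taken from the guidance on this page.
MIN_SECONDS = 6.0               # shortest suggested reference clip
MAX_SECONDS = 30.0              # longest suggested reference clip
RECOMMENDED_MIN_SECONDS = 10.0  # lower bound of the "best results" range


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference(path):
    """Classify a reference clip by length (hypothetical helper)."""
    duration = wav_duration_seconds(path)
    if duration < MIN_SECONDS:
        return "too short"
    if duration > MAX_SECONDS:
        return "too long"
    if duration < RECOMMENDED_MIN_SECONDS:
        return "ok, but 10+ seconds recommended"
    return "ok"
```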
Note: Voice cloning requires GPU hardware. The first synthesis may take 30-60 seconds while the model loads. XTTS v2 works best with English text.