Rate Limits
SAUTI API rate limits, response headers, and guidance on handling 429 errors.
Per-minute sliding windows
Rate limits are applied per API key using sliding windows. Each endpoint category has its own limit.
| Endpoint category | Requests / minute | Examples |
|---|---|---|
| Synthesis | 10 | POST /v1/text-to-speech/*, POST /v1/tts/jobs |
| Status / polling | 60 | GET /v1/tts/jobs/* |
| Default | 30 | GET /v1/voices, all other endpoints |
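
To stay under these limits without relying on 429 responses, a client can throttle itself. The following is a minimal sketch of a client-side sliding-window limiter. The class `SlidingWindowLimiter` and the `LIMITS` mapping are hypothetical names, not part of the SAUTI API, and the server's exact window accounting is not documented here, so treat this as a conservative approximation.

```python
import time
from collections import deque

# Per-minute limits copied from the table above (category names are illustrative).
LIMITS = {"synthesis": 10, "status": 60, "default": 30}


class SlidingWindowLimiter:
    """Client-side throttle: at most `limit` calls in any `window`-second span."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self):
        """Seconds until another request fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Forget requests that have slid out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

One limiter per endpoint category mirrors how the limits are applied. Before each synthesis call, for example, sleep for `limiter.wait_time()` seconds and then call `limiter.record()`, where `limiter = SlidingWindowLimiter(LIMITS["synthesis"])`.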
Rate limit headers
Every response includes headers indicating your current quota state.
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Requests remaining before you are limited |
| X-RateLimit-Reset | Unix timestamp at which the current window resets |
| Retry-After | Seconds to wait before retrying. Present only on 429 responses. |
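
These headers make it possible to pause before a limit is hit. The sketch below reads them from any header mapping, such as `response.headers` from the `requests` library. The helper names `quota_state` and `pause_before_next_request` are hypothetical, and the code assumes `X-RateLimit-Reset` is expressed in seconds since the Unix epoch.

```python
import time


def quota_state(headers, now=None):
    """Summarize the X-RateLimit-* headers of a response.

    `headers` is any mapping of header names to string values.
    """
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # assumed: seconds since epoch
    return {
        "limit": limit,
        "remaining": remaining,
        # Never report a negative wait if the reset time has already passed.
        "seconds_until_reset": max(0.0, reset_at - now),
    }


def pause_before_next_request(headers, now=None):
    """Return how long to sleep: 0 while quota remains, else until the reset."""
    state = quota_state(headers, now)
    return 0.0 if state["remaining"] > 0 else state["seconds_until_reset"]
```

A caller might run `time.sleep(pause_before_next_request(response.headers))` after each response, which only sleeps once `X-RateLimit-Remaining` reaches zero.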
Handling 429 errors
When you exceed your limit, the API returns HTTP 429 with a Retry-After header. Do not retry immediately. Wait the specified number of seconds, then back off exponentially if later attempts are also rate limited.
```python
import time

import requests


def request_with_backoff(url, headers, payload, max_retries=3):
    """POST to the API, waiting and retrying whenever it returns HTTP 429."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429:
            return response
        # Honor Retry-After on the first 429 (falling back to 10s if absent),
        # then double the wait on each subsequent 429.
        retry_after = int(response.headers.get("Retry-After", 10))
        delay = retry_after * 2 ** attempt
        print(f"Rate limited. Retrying in {delay}s...")
        time.sleep(delay)
    raise RuntimeError("Max retries exceeded")
```

Example 429 response body:

```json
{
  "type": "https://sauti.finiflowlabs.com/errors/rate_limit_exceeded",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 10 requests per minute for synthesis. Try again shortly."
}
```

Requesting higher limits
For production workloads that require higher rate limits, email hello@finiflowlabs.com with:
- Your use case and deployment environment
- Expected peak request volume per minute
- Endpoints you are using
We are in early access. Limits are intentionally conservative while we scale infrastructure.