Rate Limits

SAUTI API rate limits, response headers, and guidance on handling 429 errors.

Per-minute sliding windows

Rate limits are applied per API key using sliding windows. Each endpoint category has its own limit.

| Endpoint category | Requests / minute | Examples |
|---|---|---|
| Synthesis | 10 | POST /v1/text-to-speech/*, POST /v1/tts/jobs |
| Status / polling | 60 | GET /v1/tts/jobs/* |
| Default | 30 | GET /v1/voices, all other endpoints |
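
One way to stay under these limits is to throttle on the client before a request is sent. The sketch below is illustrative only and is not part of any SAUTI SDK: `categorize`, `SlidingWindowLimiter`, and the path-matching rules are hypothetical names and assumptions derived from the table above, and the server's own accounting may differ from this local approximation.

```python
import time
from collections import deque

# Limits copied from the table above (requests per minute).
LIMITS = {"synthesis": 10, "status": 60, "default": 30}


def categorize(method, path):
    """Map a request to a rate-limit category (assumed matching rules)."""
    if method == "POST" and (
        path.startswith("/v1/text-to-speech/") or path == "/v1/tts/jobs"
    ):
        return "synthesis"
    if method == "GET" and path.startswith("/v1/tts/jobs/"):
        return "status"
    return "default"


class SlidingWindowLimiter:
    """Tracks request timestamps per category over a 60-second window."""

    def __init__(self, limits=LIMITS, window=60.0, clock=time.monotonic):
        self.limits = limits
        self.window = window
        self.clock = clock
        self.sent = {name: deque() for name in limits}

    def wait_time(self, category):
        """Seconds to wait before another request in this category fits."""
        now = self.clock()
        stamps = self.sent[category]
        # Drop timestamps that have slid out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) < self.limits[category]:
            return 0.0
        return self.window - (now - stamps[0])

    def record(self, category):
        """Note that a request in this category was just sent."""
        self.sent[category].append(self.clock())
```

A caller would check `wait_time(category)`, sleep for that long if it is non-zero, send the request, and then call `record(category)`. The clock is injectable so the limiter can be exercised without real delays.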

Rate limit headers

Every response includes headers indicating your current quota state.

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window. |
| X-RateLimit-Remaining | Requests remaining before you are limited. |
| X-RateLimit-Reset | Unix timestamp at which the current window resets. |
| Retry-After | Seconds to wait before retrying. Present only on 429 responses. |
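
As a rough illustration of reading these headers, the sketch below parses them into a small structure. `QuotaState`, `parse_quota`, and `seconds_until_reset` are hypothetical helper names, and the code assumes every header value is a plain integer string, as the descriptions above suggest.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class QuotaState:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp
    retry_after: Optional[int]  # only set on 429 responses


def parse_quota(headers: Mapping[str, str]) -> QuotaState:
    """Read the rate limit headers from a response's header mapping."""
    retry_after = headers.get("Retry-After")
    return QuotaState(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


def seconds_until_reset(state: QuotaState, now: Optional[float] = None) -> float:
    """Seconds left until the window resets, never negative."""
    if now is None:
        now = time.time()
    return max(0.0, float(state.reset_at) - now)
```

With the `requests` library, `response.headers` can be passed to `parse_quota` directly because its header mapping is case-insensitive. A plain `dict` must use the exact header spelling shown in the table.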

Handling 429 errors

When you exceed your limit, the API returns HTTP 429 with a Retry-After header. Do not retry immediately. Wait the specified number of seconds, then apply exponential backoff if subsequent attempts also fail.

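python
import time
import requests

def request_with_backoff(url, headers, payload, max_retries=3):
    """POST to the API, retrying on HTTP 429 with increasing delays."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429:
            return response
        if attempt == max_retries - 1:
            break  # no retries left, so do not sleep again
        # Wait the server-specified time first, then double it on each
        # subsequent 429 (exponential backoff).
        retry_after = int(response.headers.get("Retry-After", 10))
        delay = retry_after * 2 ** attempt
        print(f"Rate limited. Retrying in {delay}s...")
        time.sleep(delay)
    raise RuntimeError("Max retries exceeded")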

Example 429 response body:

json
{
  "type": "https://sauti.finiflowlabs.com/errors/rate_limit_exceeded",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 10 requests per minute for synthesis. Try again shortly."
}
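
For logging or user-facing messages, a client might turn this body into a short string. The sketch below is a minimal example under the assumption that every 429 body follows the shape shown above. `describe_rate_limit_error` and `RATE_LIMIT_TYPE_SUFFIX` are hypothetical names, not part of the API.

```python
import json

# Assumed: rate limit errors are identified by this suffix of the "type" URL.
RATE_LIMIT_TYPE_SUFFIX = "/errors/rate_limit_exceeded"


def describe_rate_limit_error(status_code, body_text):
    """Return a short message for a 429 body, or None if it is not one."""
    if status_code != 429:
        return None
    try:
        body = json.loads(body_text)
    except ValueError:
        return "Rate limited (response body was not valid JSON)"
    if not isinstance(body, dict):
        return "Rate limited (unexpected response body)"
    if not str(body.get("type", "")).endswith(RATE_LIMIT_TYPE_SUFFIX):
        return "HTTP 429 with unrecognized error type"
    return f"{body.get('title', 'Rate Limit Exceeded')}: {body.get('detail', '')}"
```

The function falls back to generic messages rather than raising, so a malformed or unexpected body does not mask the underlying 429.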

Requesting higher limits

For production workloads that require higher rate limits, email hello@finiflowlabs.com with:

  • Your use case and deployment environment
  • Expected peak request volume per minute
  • Endpoints you are using

We are in early access. Limits are intentionally conservative while we scale infrastructure.