Rate Limiting

100 requests per minute per key, and how to back off cleanly

The Servicebay API enforces rate limits to keep the service responsive for everyone. Limits are applied per API key, not per IP, so multiple integrations using the same key share one bucket.

Current limit: 100 requests per minute per API key. If you need more, email support@servicebay.io — higher limits are available for production integrations on request.
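
Because every integration on a key draws from the same 100-request budget, it can help to throttle on the client before the server has to refuse a call. The following is a minimal sketch of a sliding-window limiter, not something the Servicebay API provides. The class name `SlidingWindowLimiter` and its parameters are hypothetical, it is not thread-safe, and it only coordinates callers inside a single process.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window (hypothetical helper)."""

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window, then re-check.
            self._sleep(self.period - (now - self._calls[0]))
```

Calling `acquire()` before each request keeps one process under the limit. Integrations running in separate processes or on separate hosts would still need a shared store or a lower per-process budget.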

Response headers

Every response includes the current rate-limit state:

Header                   Description
X-RateLimit-Limit        Maximum requests per minute
X-RateLimit-Remaining    Remaining requests in the current window
X-RateLimit-Reset        Unix timestamp when the limit resets

Example:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1705325400
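
To act on these headers in code, it helps to parse them into a single value first. This is a minimal sketch, assuming all three headers carry base-10 integers as in the example above. `RateLimitState` and `parse_rate_limit` are illustrative names, not part of any Servicebay SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitState:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_unix: int  # X-RateLimit-Reset (Unix timestamp, seconds)

    def seconds_until_reset(self, now: float) -> float:
        """Seconds from `now` (a Unix timestamp) until the window resets, never negative."""
        return max(0.0, self.reset_unix - now)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the parsed state, or None if any header is missing or not an integer."""
    # HTTP header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitState(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_unix=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None


# The example headers shown above.
state = parse_rate_limit({
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "95",
    "X-RateLimit-Reset": "1705325400",
})
print(state)
```

Returning `None` instead of raising lets callers fall back to a fixed delay when a proxy or an error path strips the headers.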

When you hit the limit

The API returns 429 Too Many Requests, with the same envelope as any other error:

{
  "success": false,
  "error": "Rate limit exceeded. Please wait before making more requests."
}
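
A client can tell a rate-limit rejection from other failures by the status code and take the message from the envelope. This sketch assumes the body is JSON shaped as above and that successful responses either omit `success` or set it to true. `describe_failure` is a hypothetical helper. It matches on the status code rather than the message text, since the wording of the message could change.

```python
import json
from typing import Optional


def describe_failure(status_code: int, body: str) -> Optional[str]:
    """Return a short description of a failed call, or None if the call succeeded."""
    try:
        envelope = json.loads(body)
    except json.JSONDecodeError:
        envelope = {}
    if not isinstance(envelope, dict):
        envelope = {}  # valid JSON, but not the expected object shape

    # Assumption: a missing "success" field on a non-error status means success.
    if status_code < 400 and envelope.get("success", True):
        return None

    message = envelope.get("error", "unknown error")
    kind = "rate limited" if status_code == 429 else "request failed"
    return f"{kind} ({status_code}): {message}"
```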

Retry pattern

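A small wrapper that retries on 429, waiting until the time given in X-RateLimit-Reset before each retry. Once the attempts are used up (three by default), it returns the last 429 response to the caller: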

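// JavaScript: retry on 429, waiting until the time given in X-RateLimit-Reset.
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Anything other than 429 goes straight back to the caller.
    if (response.status !== 429) return response;

    // Out of retries: hand the 429 back so the caller can decide what to do.
    if (attempt === maxRetries - 1) return response;

    // X-RateLimit-Reset is a Unix timestamp in seconds.
    const resetUnix = Number.parseInt(
      response.headers.get("X-RateLimit-Reset") ?? "",
      10
    );
    // If the header is missing or unparseable, wait 1 second instead of retrying at once.
    const waitMs = Number.isNaN(resetUnix)
      ? 1000
      : Math.max(0, resetUnix * 1000 - Date.now());

    console.log(`Rate limited. Waiting ${waitMs}ms...`);
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }

  // Only reached when maxRetries is less than 1.
  throw new Error("Max retries exceeded");
}

# Python: the same wrapper, using the requests package.
import time
import requests

def fetch_with_retry(method, url, max_retries=3, **kwargs):
    for attempt in range(max_retries):
        res = requests.request(method, url, **kwargs)

        # Anything other than 429 goes straight back to the caller.
        if res.status_code != 429:
            return res

        # Out of retries: hand the 429 back so the caller can decide what to do.
        if attempt == max_retries - 1:
            return res

        # X-RateLimit-Reset is a Unix timestamp in seconds.
        try:
            reset_unix = int(res.headers["X-RateLimit-Reset"])
            wait_seconds = max(0, reset_unix - int(time.time()))
        except (KeyError, ValueError):
            # Header missing or unparseable: wait 1 second instead of retrying at once.
            wait_seconds = 1

        print(f"Rate limited. Waiting {wait_seconds}s...")
        time.sleep(wait_seconds)

    # Only reached when max_retries is less than 1.
    raise RuntimeError("Max retries exceeded")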

Best practices

  1. Monitor X-RateLimit-Remaining and slow down pre-emptively, before it reaches zero.
  2. Use exponential backoff for retries beyond the first one.
  3. Cache read-mostly data to reduce repeat calls.
  4. Batch operations where possible: combine multiple writes into fewer round-trips.
  5. Poll responsibly: for "watch for changes" use cases, prefer polling at 30-second intervals over tight loops.
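
Practices 1 and 2 above can be sketched as two small pure functions. Neither is an official Servicebay recommendation. `pacing_delay` spreads the remaining budget evenly over the time left in the window, and `backoff_delay` applies capped exponential backoff with full jitter, a common general-purpose technique. The function names, the 1-second base, and the 60-second cap are assumptions chosen for illustration.

```python
import random


def pacing_delay(remaining: int, seconds_until_reset: float) -> float:
    """Seconds to pause before the next request so the budget lasts until the reset."""
    if seconds_until_reset <= 0:
        return 0.0  # window already reset (or clock skew): no need to wait
    if remaining <= 0:
        return seconds_until_reset  # budget exhausted: wait for the reset
    return seconds_until_reset / remaining


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0, rng=random.random) -> float:
    """Capped exponential backoff with full jitter; `attempt` is 0 for the first retry."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling  # uniform in [0, ceiling)
```

With 5 requests left and 30 seconds until the reset, `pacing_delay(5, 30.0)` returns 6.0, so the client sends one request every six seconds. A client that wants to run at full speed while the budget is healthy could apply the pacing only once X-RateLimit-Remaining drops below a threshold of its choosing.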
