
Data Retention, Quotas, and Rate Limits

Koldan enforces data retention policies, usage quotas, and rate limits per user. These controls determine how long your data is kept, how much you can store and process, and how frequently you can call the API.

All three are resolved using the same priority chain:

Priority  Source             Description
1         User override      Set directly on your account by an admin
2         Subscription plan  From your active subscription plan
3         Tenant default     Baseline configured for your organization

The first non-null value wins. If neither a user override nor a subscription plan value is set, the tenant default applies.
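The resolution rule above can be sketched in a few lines of Python. This is only an illustration of "first non-null wins"; the function name `resolve_effective` and its parameters are hypothetical and are not part of the Koldan API.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_effective(
    user_override: Optional[T],
    plan_value: Optional[T],
    tenant_default: Optional[T],
) -> Optional[T]:
    """Return the first value that is not None, in priority order."""
    for value in (user_override, plan_value, tenant_default):
        if value is not None:
            return value
    return None


# A user override of 0 is still a value, so it wins over the plan's 30.
print(resolve_effective(0, 30, 90))       # 0
# With no override and no plan value, the tenant default applies.
print(resolve_effective(None, None, 90))  # 90
```

The check is `is not None` rather than a truthiness test, so an explicit limit of 0 is treated as a real setting and not as "unset".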


Data Retention

Retention policies define how many days your data is kept before Koldan automatically removes it. Different retention periods apply depending on the resource type and its state.

Speech Services

Resource                        Retention applies to
Source media                    Original uploaded audio/video binary - auto-discarded after the configured number of days
Deleted files                   Deleted file content - permanently purged after a retention window
Transcription results           Completed transcription output
Failed/canceled transcriptions  Job records for unsuccessful transcription attempts
Deleted transcriptions          Deleted transcription data
Summary results                 Completed summary output
Failed/canceled summaries       Job records for unsuccessful summary attempts
Deleted summaries               Deleted summary data
Translation results             Completed translation output
Failed/canceled translations    Job records for unsuccessful translation attempts
Deleted translations            Deleted translation data
Listening audio                 Generated MP3 playback files
Deleted listening audio         Deleted listening audio files

Text Services

Resource              Retention applies to
Translation history   On-demand text translation records
Deleted translations  Deleted text translation data

Automatic Deletion

When a retention period expires, the associated data is permanently deleted and cannot be recovered. Download or export any data you need before it reaches its retention limit.
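To plan exports ahead of automatic deletion, a client can estimate when a resource will be purged. The sketch below is an assumption-heavy illustration: this page does not say which event starts the retention clock (creation, completion, or deletion), so `clock_start` is a placeholder for whichever timestamp applies, and both function names and the `warn_days` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def purge_deadline(clock_start: datetime, retention_days: int) -> datetime:
    """Estimated time at which data becomes eligible for deletion."""
    return clock_start + timedelta(days=retention_days)


def should_export_now(
    clock_start: datetime,
    retention_days: int,
    now: datetime,
    warn_days: int = 7,
) -> bool:
    """True once `now` is within `warn_days` of the estimated deadline."""
    warning_point = purge_deadline(clock_start, retention_days) - timedelta(days=warn_days)
    return now >= warning_point


# Assumed example: clock starts 1 March 2024 with a 30-day retention period.
start = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(purge_deadline(start, 30).date())  # 2024-03-31
```

Treat the result as an estimate only; the values returned by the retention endpoint remain the authority on how long data is kept.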

Check Your Retention Policy

GET /api/v1/retention

Returns your effective retention periods (in days) for all resource types, including whether each value is a user-level override or the tenant default.
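A minimal sketch of calling this endpoint with Python's standard library follows. The host `https://koldan.example` and the bearer-token `Authorization` header are assumptions, since this page does not describe the base URL or the authentication scheme; the helper names `build_get` and `fetch_json` are hypothetical. The response shape is not documented here, so the sketch returns the decoded JSON without interpreting it.

```python
import json
import urllib.request
from typing import Any


def build_get(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request for an API path (auth scheme is an assumption)."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder host and token; replace with real values before sending.
request = build_get("https://koldan.example", "/api/v1/retention", "YOUR_TOKEN")
print(request.full_url)  # https://koldan.example/api/v1/retention
```

Calling `fetch_json(request)` performs the actual network call; it is left out of the example so the block runs without network access.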


Quotas

Quotas limit how much you can store and process. They are tracked as a combination of a limit and your current usage.

Speech Services

Quota               Unit
Storage             Bytes
Transcription       Minutes per month (includes offline jobs and online streaming sessions)
Summaries           Requests per month
Summary tokens      LLM tokens per month
Translations        Requests per month
Translation tokens  LLM tokens per month

Text Services

Quota                    Unit
Text translations        Requests per month
Text translation tokens  LLM tokens per month

Quota Response Format

Each quota is returned as an object with three fields:

Field      Description
limit      Maximum allowed value
used       Current consumption
available  Remaining capacity (limit - used)

Monthly quotas reset automatically on the 1st of each month.
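The relationship between the three fields, and the monthly reset date, can be modeled client-side as follows. This is a sketch only: the `Quota` class and `next_reset` function are hypothetical names, integer values are assumed, and the time zone in which the reset happens is not stated on this page, so the example works with plain calendar dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Quota:
    limit: int
    used: int

    @property
    def available(self) -> int:
        # Mirrors the documented definition: limit - used.
        return self.limit - self.used

    @property
    def exhausted(self) -> bool:
        return self.available <= 0


def next_reset(today: date) -> date:
    """First day of the month following `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


print(Quota(limit=600, used=450).available)  # 150
print(next_reset(date(2024, 12, 15)))        # 2025-01-01
```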

Exceeding a Quota

When a quota is exhausted (available = 0), further requests for that operation return 402 Payment Required or 403 Forbidden until the quota resets or an administrator increases your limit.

Check Your Quotas

GET /api/v1/quotas

Returns your effective quota limits and current usage across all services.


Rate Limits

Rate limits restrict how many requests per minute you can make to specific operations and to the API as a whole. Streaming and file downloads are also capped on concurrency, session duration, and bandwidth.

Per-Operation Limits

Operation                         Applies to
File uploads                      Uploading audio/video files
Transcription jobs                Creating new transcription jobs
Summary executions                Generating summaries
Translation executions            Generating translations
Text translation executions       On-demand text translations
User lookups                      Searching / looking up users
Stream session starts per minute  Starting new online streaming sessions
Stream concurrent sessions        Maximum concurrent streaming sessions per user
Stream session duration           Maximum allowed streaming session duration (seconds)
Stream bytes per second           Maximum audio bandwidth per streaming session (bytes per second)
File download bytes per second    Maximum file download speed (bytes per second)

Global Limit

In addition to per-operation limits, a global requests-per-minute cap applies across all API endpoints combined.
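One way to stay under a requests-per-minute cap is to throttle on the client before sending. The sketch below uses a sliding 60-second window; this is a client-side choice, not a description of how Koldan counts requests on the server, which this page does not specify. The class name `MinuteWindowLimiter` and its parameters are hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteWindowLimiter:
    """Client-side limiter: at most `limit` calls per `window` seconds."""

    def __init__(
        self,
        limit: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait until one more request fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        # The oldest recorded call must expire before another one fits.
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, then call `record()`. The injectable `clock` exists so the limiter can be tested without real delays.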

Rate Limit Headers

Every API response includes rate limit headers so you can track your remaining budget:

Header                 Description
X-RateLimit-Limit      Your per-minute limit for this operation
X-RateLimit-Remaining  Requests remaining in the current window
X-RateLimit-Reset      Unix epoch timestamp when the window resets
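The three headers can be read into a small structure so a client can decide whether to slow down. This is a minimal sketch that assumes the header values are plain integers, as in the descriptions above; `RateLimitStatus` and `parse_rate_limit` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int
    remaining: int
    reset_epoch: int

    def seconds_until_reset(self, now_epoch: float) -> float:
        """Time left in the current window, never negative."""
        return max(float(self.reset_epoch) - now_epoch, 0.0)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
    )


status = parse_rate_limit({
    "X-RateLimit-Limit": "20",
    "X-RateLimit-Remaining": "3",
    "X-RateLimit-Reset": "1712073600",
})
print(status.remaining)  # 3
```

In real use, pass `time.time()` as `now_epoch` to `seconds_until_reset`.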

Handling 429 Too Many Requests

When you exceed a rate limit, the API responds with:

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712073600

Header       What to do
Retry-After  Number of seconds to wait before retrying

Best Practice

Implement exponential backoff with the Retry-After header value as the minimum wait time. Avoid tight retry loops - they will continue to receive 429 responses and may delay your recovery.
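The following sketch shows one way to combine exponential backoff with Retry-After as a floor. The base delay, cap, attempt count, and the shape of the `send` callable (returning a status code, a header mapping, and a body) are all assumptions made for illustration, and Retry-After is assumed to carry a number of seconds as in the example response above.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.0,
) -> float:
    """Exponential delay for `attempt` (0-based), never below Retry-After."""
    exponential = min(base * (2 ** attempt), cap)
    delay = max(exponential, retry_after or 0.0)
    # Optional random jitter spreads out clients that retry together.
    return delay + random.uniform(0.0, jitter)


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        raw = headers.get("Retry-After")
        sleep(backoff_delay(attempt, float(raw) if raw is not None else None))
    raise AssertionError("unreachable")
```

With the defaults, a first 429 carrying `Retry-After: 12` produces a 12-second wait, because the exponential term (1 second) is below the server's minimum. By the fifth retry the exponential term (16 seconds) takes over.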

Check Your Rate Limits

GET /api/v1/rate-limits

Returns your effective per-operation and global rate limits.