Data Retention, Quotas, and Rate Limits

Koldan enforces data retention policies, usage quotas, and rate limits per user. These controls determine how long your data is kept, how much you can store and process, and how frequently you can call the API.

All three are resolved using the same priority chain:

Priority	Source	Description
1 (highest)	User override	Set directly on your account by an admin
2	Subscription plan	Inherited from your active subscription plan
3 (lowest)	Tenant default	Baseline configured for your organization

The first non-null value in this order wins. If neither a user override nor a subscription plan value is set, the tenant default applies.
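The priority chain can be illustrated with a minimal sketch. The function name `resolve_effective` and its parameters are hypothetical and exist only to show the "first non-null wins" rule; they are not part of the Koldan API.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_effective(
    user_override: Optional[T],
    plan_value: Optional[T],
    tenant_default: Optional[T],
) -> Optional[T]:
    """Return the first non-null value in priority order (hypothetical helper)."""
    for value in (user_override, plan_value, tenant_default):
        if value is not None:
            return value
    return None


# Only None falls through to the next source; a value of 0 still counts as set.
print(resolve_effective(None, 90, 30))    # 90
print(resolve_effective(None, None, 30))  # 30
print(resolve_effective(0, 90, 30))       # 0
```

Note the comparison against `None` rather than a truthiness check: under this reading of the rule, an explicit override of `0` is a real value and should not be skipped.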

Data Retention

Retention policies define how many days your data is kept before Koldan automatically removes it. Different retention periods apply depending on the resource type and its state.

Speech Services

Resource	Retention applies to
Source media	Original uploaded audio/video binary - auto-discarded after the configured number of days
Deleted files	Deleted file content - permanently purged after a retention window
Transcription results	Completed transcription output
Failed/canceled transcriptions	Job records for unsuccessful transcription attempts
Deleted transcriptions	Deleted transcription data
Summary results	Completed summary output
Failed/canceled summaries	Job records for unsuccessful summary attempts
Deleted summaries	Deleted summary data
Translation results	Completed translation output
Failed/canceled translations	Job records for unsuccessful translation attempts
Deleted translations	Deleted translation data
Listening audio	Generated MP3 playback files
Deleted listening audio	Deleted listening audio files

Text Services

Resource	Retention applies to
Translation history	On-demand text translation records
Deleted translations	Deleted text translation data

Automatic Deletion

When a retention period expires, the associated data is permanently deleted and cannot be recovered. Download or export any data you need before it reaches its retention limit.
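To plan exports ahead of time, you can estimate when a resource will reach its limit. The sketch below assumes the retention clock starts at a known timestamp (`started_at`, for example creation or deletion time); this page does not state which event starts the clock or exactly when the purge runs, so treat the result as an estimate. Both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retention_deadline(started_at: datetime, retention_days: int) -> datetime:
    """Estimated moment the retention period expires (assumes the clock starts at started_at)."""
    return started_at + timedelta(days=retention_days)


def days_left(started_at: datetime, retention_days: int, now: datetime) -> float:
    """Days remaining before the estimated deadline; negative once it has passed."""
    return (retention_deadline(started_at, retention_days) - now) / timedelta(days=1)


started = datetime(2024, 4, 1, 12, 0, tzinfo=timezone.utc)
now = datetime(2024, 4, 25, 12, 0, tzinfo=timezone.utc)
print(retention_deadline(started, 30).isoformat())  # 2024-05-01T12:00:00+00:00
print(days_left(started, 30, now))                  # 6.0
```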

Check Your Retention Policy

GET /api/v1/retention

Returns your effective retention periods (in days) for all resource types, including whether each value is a user-level override or the tenant default.

Quotas

Quotas limit how much you can store and process. They are tracked as a combination of a limit and your current usage.

Speech Services

Quota	Unit
Storage	Bytes
Transcription	Minutes per month (includes offline jobs and online streaming sessions)
Summaries	Requests per month
Summary tokens	LLM tokens per month
Translations	Requests per month
Translation tokens	LLM tokens per month

Text Services

Quota	Unit
Text translations	Requests per month
Text translation tokens	LLM tokens per month

Quota Response Format

Each quota is returned as an object with three fields:

Field	Description
`limit`	Maximum allowed value
`used`	Current consumption
`available`	Remaining capacity (`limit - used`)

Monthly quotas reset automatically on the 1st of each month.
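The relationship between the three fields, and the monthly reset, can be sketched as follows. The `Quota` class and `next_monthly_reset` function are hypothetical client-side helpers. Clamping `available` at zero is a defensive assumption, and the time zone in which the reset happens is not specified on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Quota:
    limit: int
    used: int

    @property
    def available(self) -> int:
        # Documented relationship: available = limit - used.
        # Clamped at 0 as a defensive assumption in case used exceeds limit.
        return max(self.limit - self.used, 0)

    def can_consume(self, amount: int) -> bool:
        """True if `amount` more units fit within the remaining capacity."""
        return amount <= self.available


def next_monthly_reset(today: date) -> date:
    """First day of the month after `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


transcription = Quota(limit=600, used=585)
print(transcription.available)                 # 15
print(transcription.can_consume(20))           # False
print(next_monthly_reset(date(2024, 12, 15)))  # 2025-01-01
```

A pre-check like `can_consume` only avoids requests that are certain to fail; the server remains the authority on usage.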

Exceeding a Quota

When a quota is exhausted (`available` = 0), further requests for that operation return `402 Payment Required` or `403 Forbidden` until the quota resets or an administrator increases your limit.

Check Your Quotas

GET /api/v1/quotas

Returns your effective quota limits and current usage across all services.

Rate Limits

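Rate limits restrict how many requests per minute you can make to specific operations and to the API as a whole. Streaming and file downloads are additionally capped on concurrency, session duration, and bandwidth.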

Per-Operation Limits

Operation	Applies to
File uploads	Uploading audio/video files
Transcription jobs	Creating new transcription jobs
Summary executions	Generating summaries
Translation executions	Generating translations
Text translation executions	On-demand text translations
User lookups	Searching / looking up users
Stream session starts per minute	Starting new online streaming sessions
Stream concurrent sessions	Maximum concurrent streaming sessions per user
Stream session duration	Maximum allowed streaming session duration (seconds)
Stream bytes per second	Maximum audio bandwidth per streaming session (bytes per second)
File download bytes per second	Maximum file download speed (bytes per second)

Global Limit

In addition to per-operation limits, a global requests-per-minute cap applies across all API endpoints combined.

Rate Limit Headers

Every API response includes rate limit headers so you can track your remaining budget:

Header	Description
`X-RateLimit-Limit`	Your per-minute limit for this operation
`X-RateLimit-Remaining`	Requests remaining in the current window
`X-RateLimit-Reset`	Unix epoch timestamp when the window resets
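A client can read these headers to decide whether to pause before the next call. The sketch below is a minimal, hypothetical parser (`parse_rate_limit_headers` and `RateLimitStatus` are illustrative names). It assumes the header values are integers and that the reset timestamp is in seconds, which matches the sample `429` response on this page.

```python
from typing import Mapping, NamedTuple


class RateLimitStatus(NamedTuple):
    limit: int
    remaining: int
    reset_epoch: int

    def seconds_until_reset(self, now_epoch: float) -> float:
        """Seconds until the window resets, never negative."""
        return max(self.reset_epoch - now_epoch, 0.0)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
    )


status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "20",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1712073600",
})
print(status.remaining)                          # 0
print(status.seconds_until_reset(1712073588.0))  # 12.0
```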

Handling `429 Too Many Requests`

When you exceed a rate limit, the API responds with:

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712073600

Header	What to do
`Retry-After`	Number of seconds to wait before retrying

Best Practice

Implement exponential backoff and treat the `Retry-After` value as the minimum wait time. Avoid tight retry loops: they keep receiving `429` responses and may delay your recovery.
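One way to combine the two rules is to compute an exponential delay and then take the larger of that delay and `Retry-After`. The sketch below is one possible client-side policy, not a prescribed one. The function name and the `base`, `cap`, and `jitter` parameters are assumptions chosen for illustration.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float],
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    exponential = min(cap, base * (2 ** attempt))
    # Optional random jitter spreads out clients that were throttled together.
    exponential += random.uniform(0.0, jitter)
    # Retry-After is a floor: never wait less than the server asked for.
    floor = retry_after if retry_after is not None else 0.0
    return max(exponential, floor)


# With Retry-After: 12 and no jitter, early attempts wait 12 s,
# then the exponential curve takes over.
print([backoff_delay(n, 12.0) for n in range(6)])
# [12.0, 12.0, 12.0, 12.0, 16.0, 32.0]
```

In a real retry loop you would sleep for the returned delay, stop after a bounded number of attempts, and set a non-zero `jitter` so that many clients do not retry at the same instant.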

Check Your Rate Limits

GET /api/v1/rate-limits

Returns your effective per-operation and global rate limits.
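The three read-only endpoints on this page (`/api/v1/retention`, `/api/v1/quotas`, `/api/v1/rate-limits`) can be queried in the same way. The sketch below uses only the Python standard library. The host `https://api.example.com`, the bearer-token header, and the JSON response body are all assumptions; check the Authentication page for the actual scheme.

```python
import json
import urllib.request

# Hypothetical host: replace with your real Koldan base URL.
BASE_URL = "https://api.example.com"


def build_get(path: str, token: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method="GET",
        headers={
            # Assumption: bearer-token auth; see the Authentication page.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


def fetch_json(path: str, token: str) -> object:
    """Send the request and decode the body, which is assumed to be JSON."""
    with urllib.request.urlopen(build_get(path, token), timeout=10) as response:
        return json.load(response)


# Building the requests needs no network access; fetch_json sends them.
for path in ("/api/v1/retention", "/api/v1/quotas", "/api/v1/rate-limits"):
    request = build_get(path, token="YOUR_TOKEN")
    print(request.get_method(), request.full_url)
```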

Related Pages

Subscriptions - how subscription plans override quotas and rate limits
Authentication - authenticating your API requests
Admin Rate Limits - managing tenant-wide default rate limits
