Data Retention, Quotas, and Rate Limits
Koldan enforces data retention policies, usage quotas, and rate limits per user. These controls determine how long your data is kept, how much you can store and process, and how frequently you can call the API.
All three are resolved using the same priority chain:
| Priority | Source | Description |
|---|---|---|
| 1 (highest) | User override | Set directly on your account by an admin |
| 2 | Subscription plan | From your active subscription plan |
| 3 (lowest) | Tenant default | Baseline configured for your organization |
The first non-null value wins. If nothing is overridden, tenant defaults apply.
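The resolution rule can be sketched in a few lines. This is an illustration of "first non-null wins" only, not Koldan's implementation; the function name and the sample values are hypothetical.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_effective(user_override: Optional[T],
                      plan_value: Optional[T],
                      tenant_default: Optional[T]) -> Optional[T]:
    """Return the first non-null value, checked in priority order."""
    for value in (user_override, plan_value, tenant_default):
        if value is not None:
            return value
    return None


# Hypothetical values: no user override, plan sets 600, tenant default is 300.
print(resolve_effective(None, 600, 300))   # 600
# With neither an override nor a plan value, the tenant default applies.
print(resolve_effective(None, None, 300))  # 300
```

The check is `is not None` rather than truthiness, so an explicit value of 0 would still count as set.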
Data Retention
Retention policies define how many days your data is kept before Koldan automatically removes it. Different retention periods apply depending on the resource type and its state.
Speech Services
| Resource | Retention applies to |
|---|---|
| Source media | Original uploaded audio/video binary - auto-discarded after the configured number of days |
| Deleted files | Deleted file content - permanently purged after a retention window |
| Transcription results | Completed transcription output |
| Failed/canceled transcriptions | Job records for unsuccessful transcription attempts |
| Deleted transcriptions | Deleted transcription data |
| Summary results | Completed summary output |
| Failed/canceled summaries | Job records for unsuccessful summary attempts |
| Deleted summaries | Deleted summary data |
| Translation results | Completed translation output |
| Failed/canceled translations | Job records for unsuccessful translation attempts |
| Deleted translations | Deleted translation data |
| Listening audio | Generated MP3 playback files |
| Deleted listening audio | Deleted listening audio files |
Text Services
| Resource | Retention applies to |
|---|---|
| Translation history | On-demand text translation records |
| Deleted translations | Deleted text translation data |
Automatic Deletion
When a retention period expires, the associated data is permanently deleted and cannot be recovered. Download or export any data you need before it reaches its retention limit.
Check Your Retention Policy
Returns your effective retention periods (in days) for all resource types, including whether each value is a user-level override or the tenant default.
Quotas
Quotas limit how much you can store and process. They are tracked as a combination of a limit and your current usage.
Speech Services
| Quota | Unit |
|---|---|
| Storage | Bytes |
| Transcription | Minutes per month (includes offline jobs and online streaming sessions) |
| Summaries | Requests per month |
| Summary tokens | LLM tokens per month |
| Translations | Requests per month |
| Translation tokens | LLM tokens per month |
Text Services
| Quota | Unit |
|---|---|
| Text translations | Requests per month |
| Text translation tokens | LLM tokens per month |
Quota Response Format
Each quota is returned as an object with three fields:
| Field | Description |
|---|---|
| limit | Maximum allowed value |
| used | Current consumption |
| available | Remaining capacity (limit - used) |
Monthly quotas reset automatically on the 1st of each month.
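To plan around the monthly reset, a client can compute the next reset date locally. The sketch below works on calendar dates only; the time zone in which the reset happens is not stated here, so treat the result as approximate near month boundaries. The function names are illustrative.

```python
from datetime import date


def next_reset(today: date) -> date:
    """First day of the month after `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


def days_until_reset(today: date) -> int:
    """Whole days between `today` and the next monthly reset."""
    return (next_reset(today) - today).days


print(next_reset(date(2024, 12, 15)))       # 2025-01-01
print(days_until_reset(date(2024, 2, 10)))  # 20
```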
Exceeding a Quota
When a quota is exhausted (available = 0), further requests for that operation return 402 Payment Required or 403 Forbidden until the quota resets or an administrator increases your limit.
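A client can avoid these errors by checking the quota object before submitting work. The following is a minimal sketch that assumes the quota object arrives as JSON with the three fields described above; the sample payload and the `has_capacity` helper are hypothetical.

```python
import json

# Hypothetical quota object, using the limit / used / available fields.
sample = '{"limit": 600, "used": 587, "available": 13}'


def has_capacity(quota: dict, needed: int = 1) -> bool:
    """True if at least `needed` units remain in this quota."""
    return quota["available"] >= needed


quota = json.loads(sample)
print(has_capacity(quota, 10))  # True
print(has_capacity(quota, 30))  # False
```

A local check like this is advisory only: usage can change between the check and the request, so the error responses still need handling.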
Check Your Quotas
Returns your effective quota limits and current usage across all services.
Rate Limits
Rate limits restrict how many requests per minute you can make to specific operations and to the API as a whole. For streaming and file downloads, they also cap concurrency, session duration, and bandwidth.
Per-Operation Limits
| Operation | Applies to |
|---|---|
| File uploads | Uploading audio/video files |
| Transcription jobs | Creating new transcription jobs |
| Summary executions | Generating summaries |
| Translation executions | Generating translations |
| Text translation executions | On-demand text translations |
| User lookups | Searching / looking up users |
| Stream session starts per minute | Starting new online streaming sessions per minute |
| Stream concurrent sessions | Maximum concurrent streaming sessions per user |
| Stream session duration | Maximum allowed streaming session duration (seconds) |
| Stream bytes per second | Maximum audio bandwidth per streaming session (bytes per second) |
| File download bytes per second | Maximum file download speed (bytes per second) |
Global Limit
In addition to per-operation limits, a global requests-per-minute cap applies across all API endpoints combined.
Rate Limit Headers
Every API response includes rate limit headers so you can track your remaining budget:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Your per-minute limit for this operation |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix epoch timestamp when the window resets |
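As an illustration, the three headers can be turned into a small status record. This sketch assumes the headers are available as a string-to-string mapping with exactly the names shown above; `rate_limit_status` is a hypothetical helper, and the sample values are invented.

```python
import time
from typing import Mapping, Optional


def rate_limit_status(headers: Mapping[str, str],
                      now: Optional[float] = None) -> dict:
    """Summarize the rate limit headers of one response."""
    now = time.time() if now is None else now
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        # Clamp at zero in case the window has already reset.
        "seconds_until_reset": max(0.0, float(reset_at) - now),
    }


sample_headers = {
    "X-RateLimit-Limit": "20",
    "X-RateLimit-Remaining": "3",
    "X-RateLimit-Reset": "1712073600",
}
status = rate_limit_status(sample_headers, now=1712073588.0)
print(status["remaining"], status["seconds_until_reset"])  # 3 12.0
```

A plain `dict` is case-sensitive. Most HTTP client libraries expose headers through a case-insensitive mapping, which is the safer input here.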
Handling 429 Too Many Requests
When you exceed a rate limit, the API responds with:
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712073600
| Header | What to do |
|---|---|
| Retry-After | Number of seconds to wait before retrying |
Best Practice
Implement exponential backoff with the Retry-After header value as the minimum wait time. Avoid tight retry loops - they will continue to receive 429 responses and may delay your recovery.
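One way to follow this advice is sketched below: exponential backoff with random jitter, using Retry-After as a floor. The `send` callable, the base and cap values, and the attempt count are all assumptions for illustration; adapt them to your HTTP client.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying; `attempt` is 0-based."""
    exponential = min(cap, base * (2 ** attempt))
    jittered = random.uniform(0, exponential)  # spread out retries
    floor = retry_after if retry_after is not None else 0.0
    return max(floor, jittered)  # never wait less than Retry-After


def call_with_retries(send: Callable[[], tuple], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep):
    """Call `send` until it returns a non-429 status.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        raw = headers.get("Retry-After")
        sleep(backoff_delay(attempt, float(raw) if raw else None))
    raise RuntimeError("still rate limited after all retry attempts")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. The sketch parses Retry-After as a number of seconds, matching the description above.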
Check Your Rate Limits
Returns your effective per-operation and global rate limits.
Related Pages
- Subscriptions - how subscription plans override quotas and rate limits
- Authentication - authenticating your API requests
- Admin Rate Limits - managing tenant-wide default rate limits