Speech Models
Speech models are the core of Koldan's transcription engine. Each model is trained for specific languages, use cases, and performance characteristics. When you create a transcription, you select a model - and Koldan handles the rest.
How Models Work
Every model name in the API is a stable identifier that the server resolves to a specific speech recognition engine version. This means:
- You always reference models by name - e.g., general-v3, medical-en, telephony.
- The server resolves your model name to the best available engine version behind the scenes.
- Model updates are transparent - when a new engine version is deployed, Family and Pinned model names automatically point to the improved version, while Concrete names stay fixed (see Model Types). No code changes needed.
flowchart LR
A["Your API Request\n<code>model: general-v3</code>"] --> B["Koldan Server"]
B --> C["Resolved Engine\nVersion"]
C --> D["Transcription Result"]
Model Properties
Each model exposes the following information:
| Property | Description |
|---|---|
| Name | The stable identifier you use in API requests (e.g., general-v3) |
| Display Name | A human-readable label (e.g., "General v3") |
| Description | What the model is designed for |
| Status | Current availability - see Model Status below |
| Current Version | The engine version this model currently resolves to |
| Capabilities | Supported languages, streaming, and auto-detection - see Capabilities |
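The properties above can be mirrored in a small client-side type. The following is a minimal sketch only: the JSON field names (`displayName`, `currentVersion`, `autoDetect`, and so on) are assumptions chosen for illustration, so check the REST API Reference for the actual response shape.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Capabilities:
    languages: Tuple[str, ...] = ()
    auto_detect: bool = False
    streaming: bool = False


@dataclass(frozen=True)
class SpeechModel:
    name: str
    display_name: str
    description: str
    status: str
    current_version: str
    capabilities: Capabilities


def parse_model(payload: dict) -> SpeechModel:
    """Build a SpeechModel from a decoded JSON object.

    All key names below are hypothetical; adjust them to the real payload.
    """
    caps = payload.get("capabilities", {})
    return SpeechModel(
        name=payload["name"],
        display_name=payload.get("displayName", payload["name"]),
        description=payload.get("description", ""),
        status=payload["status"],
        current_version=payload.get("currentVersion", ""),
        capabilities=Capabilities(
            languages=tuple(caps.get("languages", [])),
            auto_detect=bool(caps.get("autoDetect", False)),
            streaming=bool(caps.get("streaming", False)),
        ),
    )
```

Keeping the parsed model immutable makes it safe to cache the result of a list call and share it across requests.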
Model Status
| Status | Meaning |
|---|---|
| AVAILABLE | The model is ready for use |
| UNAVAILABLE | The model is not currently deployed on this server |
| MAINTENANCE | Temporarily offline for updates - try again later |
| DEPRECATED | Still functional but scheduled for removal - migrate to the suggested replacement |
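A client can turn the four status values into a simple decision before submitting work. This is an illustrative sketch: the status strings come from the table above, while the action labels returned by `next_action` are hypothetical names invented for this example.

```python
from enum import Enum


class ModelStatus(Enum):
    AVAILABLE = "AVAILABLE"
    UNAVAILABLE = "UNAVAILABLE"
    MAINTENANCE = "MAINTENANCE"
    DEPRECATED = "DEPRECATED"


def next_action(status_text: str) -> str:
    """Map a status string to a suggested client action (labels are illustrative)."""
    status = ModelStatus(status_text)  # raises ValueError for unknown values
    if status is ModelStatus.AVAILABLE:
        return "use"
    if status is ModelStatus.DEPRECATED:
        # Still functional, but the caller should schedule a migration.
        return "use-and-plan-migration"
    if status is ModelStatus.MAINTENANCE:
        return "retry-later"
    return "choose-another-model"
```

Failing loudly on an unrecognized status is a deliberate choice here: it surfaces new server-side states instead of silently treating them as usable.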
Deprecated Models
When a model is deprecated, the API response includes a deprecationDate and a deprecationMessage with migration guidance. Deprecated models continue to work until their sunset date, after which requests return 410 Gone. Plan your migration early.
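To plan a migration, it can help to compute how much time is left. The sketch below assumes that `deprecationDate` is an ISO 8601 date or timestamp and that it marks the sunset date; neither is confirmed here, so verify both against the REST API Reference.

```python
from datetime import date
from typing import Optional


def days_until_sunset(model: dict, today: date) -> Optional[int]:
    """Return days remaining before the sunset date, or None if not deprecated.

    Assumes model["deprecationDate"] starts with YYYY-MM-DD (an assumption).
    A negative result means the date has passed and requests may return 410 Gone.
    """
    raw = model.get("deprecationDate")
    if not raw:
        return None
    sunset = date.fromisoformat(raw[:10])  # keep only the date part
    return (sunset - today).days
```

For example, a scheduled job could log `deprecationMessage` whenever the returned value drops below a threshold that suits your release cycle.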
Capabilities
Each model declares what it can do. Check capabilities before using a model to ensure it fits your use case.
| Capability | Description |
|---|---|
| Languages | List of supported BCP-47 language codes (e.g., en, he, de, ar) |
| Auto-detect | Whether the model can automatically identify the spoken language |
| Streaming | Whether the model supports real-time streaming transcription |
Check Languages Before Transcribing
If you specify a language that the model doesn't support, the transcription will fail. Use the model languages endpoint to verify supported languages, or enable auto-detection if the model supports it.
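The check described above can be done client-side once you have the model's language list and auto-detect flag. This is a minimal sketch; `choose_language` is a hypothetical helper, and using `None` to mean "let the model auto-detect" is a convention of this example, not of the API.

```python
from typing import Iterable, Optional


def choose_language(
    requested: Optional[str],
    supported: Iterable[str],
    auto_detect: bool,
) -> Optional[str]:
    """Return the language code to send, or None to rely on auto-detection.

    BCP-47 tags are case-insensitive, so comparison ignores case.
    Raises ValueError when the language is unsupported and the model
    cannot auto-detect.
    """
    by_lower = {code.lower(): code for code in supported}
    if requested is not None:
        match = by_lower.get(requested.lower())
        if match is not None:
            return match
    if auto_detect:
        return None
    raise ValueError(f"language {requested!r} is not supported by this model")
```

The sketch matches whole tags only. Whether the server accepts a regional tag such as `en-US` for a model that lists `en` is not stated here, so the example does not attempt that fallback.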
Model Types
Models are organized into three categories that determine how they resolve to engine versions:
| Type | Behavior | Example |
|---|---|---|
| Family | Always resolves to the latest version in the model family. Automatically upgrades when new versions are deployed. | general → currently resolves to general-v3-20240915 |
| Pinned | Points to a specific major version but may receive minor updates (patches, accuracy improvements). | general-v3 → currently resolves to general-v3-20240915 |
| Concrete | Locked to an exact engine version. Never changes. Use when you need deterministic, reproducible results. | general-v3-20240915 → always this exact version |
Which Type Should I Use?
- Use Family models for most applications - you'll always get the best available version.
- Use Pinned models when you want a stable major version but still benefit from patches.
- Use Concrete models only when reproducibility is critical (e.g., compliance, benchmarking).
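The difference between the three types is easiest to see when a new engine version is deployed. The following sketch models resolution as a plain lookup table; it illustrates the behavior described above and is not Koldan's server implementation. The version `general-v3-20250101` is a made-up example of a later patch.

```python
from typing import Dict

# Resolution table matching the examples in the Model Types table.
BEFORE: Dict[str, str] = {
    "general": "general-v3-20240915",              # Family
    "general-v3": "general-v3-20240915",           # Pinned
    "general-v3-20240915": "general-v3-20240915",  # Concrete
}


def deploy_patch(table: Dict[str, str], family: str, pinned: str, new_version: str) -> Dict[str, str]:
    """Return a new table after a patch within the pinned major version.

    Family and Pinned names move to the new version; existing Concrete
    names keep resolving to themselves.
    """
    updated = dict(table)
    updated[family] = new_version
    updated[pinned] = new_version
    updated[new_version] = new_version
    return updated


def resolve(table: Dict[str, str], name: str) -> str:
    """Look up the engine version a model name currently resolves to."""
    if name not in table:
        raise KeyError(f"unknown model name: {name}")
    return table[name]


# Hypothetical later patch of the v3 engine.
AFTER = deploy_patch(BEFORE, "general", "general-v3", "general-v3-20250101")
```

After the patch, `general` and `general-v3` both resolve to the new version, while `general-v3-20240915` still resolves to itself. A new major version would move only the Family name.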
Default Model
Each Koldan deployment has a default model. If you create a transcription without specifying a model, the default is used automatically.
To find out which model is the default, call the list models endpoint and look for the model marked as default.
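Finding the default in a decoded list response might look like the sketch below. It assumes each model object carries a boolean `default` field; the actual marker name is not specified here, so treat it as a placeholder.

```python
from typing import Iterable


def find_default(models: Iterable[dict]) -> str:
    """Return the name of the single model flagged as default.

    The "default" key is an assumed field name. Raises LookupError if
    zero or several models are flagged, since either case is unexpected.
    """
    defaults = [m["name"] for m in models if m.get("default") is True]
    if len(defaults) != 1:
        raise LookupError(f"expected exactly one default model, found {len(defaults)}")
    return defaults[0]
```

Because the list endpoint only returns models assigned to your role, the default may be absent from the list for some users; the `LookupError` covers that case.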
Role-Based Model Access
Not all models are available to all users. Administrators can restrict which models each role can access. When you call the list models endpoint, you only see models assigned to your role.
If you need access to a model that isn't listed, contact your administrator.
→ See Roles and Permissions for more on how roles work.
Checking Available Models
Use the Speech Models API to discover what's available to you:
| What You Need | Endpoint |
|---|---|
| List all models you can access | GET /api/v1/speech-services/models |
| Get details for a specific model | GET /api/v1/speech-services/models/{model} |
| Check supported languages for a model | GET /api/v1/speech-services/models/{model}/languages |
→ Full API details in the REST API Reference.
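As a starting point, the three endpoints can be wrapped with the Python standard library. The paths come from the table above; the base URL, the bearer-token `Authorization` header, and the helper names are assumptions for illustration, so confirm the authentication scheme in the REST API Reference.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

MODELS_PATH = "/api/v1/speech-services/models"


def models_url(base_url: str, model: Optional[str] = None, languages: bool = False) -> str:
    """Build the URL for the list, detail, or languages endpoint."""
    url = base_url.rstrip("/") + MODELS_PATH
    if model is not None:
        url += "/" + quote(model, safe="")  # escape any unsafe characters
        if languages:
            url += "/languages"
    return url


def get_json(url: str, token: str):
    """GET a URL and decode the JSON body.

    Bearer authentication is an assumption, not a documented requirement.
    """
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `get_json(models_url("https://koldan.example.com", "general-v3", languages=True), token)` would then fetch the languages for one model, where the host name is a placeholder for your deployment.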
Related Pages
- Files, Transcriptions, and Summaries - how models are used in the transcription workflow
- Languages - full language management API
- Roles and Permissions - role-based model access
- Models and Aliases Administration - admin model management