Speech Models

Speech models are the core of Koldan's transcription engine. Each model is trained for specific languages and use cases and has its own performance characteristics. When you create a transcription, you select a model and Koldan handles the rest.

How Models Work

Every model name you see in the API is a stable identifier that the server resolves to a specific speech recognition engine version. This means:

You always reference models by name - e.g., general-v3, medical-en, telephony.
The server resolves your model name to the best available engine version behind the scenes.
Model updates are transparent - when a new engine version is deployed, family and pinned model names automatically point to the improved version, with no code changes needed. Concrete names never change (see Model Types).

flowchart LR
    A["Your API Request\n<code>model: general-v3</code>"] --> B["Koldan Server"]
    B --> C["Resolved Engine\nVersion"]
    C --> D["Transcription Result"]

Model Properties

Each model exposes the following information:

Property	Description
Name	The stable identifier you use in API requests (e.g., `general-v3`)
Display Name	A human-readable label (e.g., "General v3")
Description	What the model is designed for
Status	Current availability - see Model Status below
Current Version	The engine version this model currently resolves to
Capabilities	Supported languages, streaming, and auto-detection - see Capabilities

Model Status

Status	Meaning
`AVAILABLE`	The model is ready for use
`UNAVAILABLE`	The model is not currently deployed on this server
`MAINTENANCE`	Temporarily offline for updates - try again later
`DEPRECATED`	Still functional but scheduled for removal - migrate to the suggested replacement
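
As a rough illustration of how a client might act on these statuses, the sketch below maps each value to a decision. The helper names `can_transcribe` and `should_retry_later` are hypothetical, and treating `DEPRECATED` as usable simply follows the table's note that such models are still functional.

```python
from enum import Enum


class ModelStatus(str, Enum):
    """The four status values listed in the Model Status table."""
    AVAILABLE = "AVAILABLE"
    UNAVAILABLE = "UNAVAILABLE"
    MAINTENANCE = "MAINTENANCE"
    DEPRECATED = "DEPRECATED"


def can_transcribe(status: str) -> bool:
    """Return True if a model with this status should accept requests."""
    # DEPRECATED models still work until their sunset date.
    return ModelStatus(status) in {ModelStatus.AVAILABLE, ModelStatus.DEPRECATED}


def should_retry_later(status: str) -> bool:
    """Return True if the outage is described as temporary."""
    return ModelStatus(status) is ModelStatus.MAINTENANCE
```

An unrecognized status string raises `ValueError`, so a value added to the API later is reported rather than silently treated as usable.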

Deprecated Models

When a model is deprecated, the API response includes a `deprecationDate` and a `deprecationMessage` with migration guidance. Deprecated models continue to work until their sunset date; after that, requests return `410 Gone`. Plan your migration early.
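
A client could surface this guidance before the sunset date arrives. The sketch below assumes the model details arrive as a JSON object with `name` and `status` keys, and that `deprecationDate` is an ISO 8601 value marking the sunset date. This page does not confirm those details, so adjust the code to the actual response.

```python
from datetime import date
from typing import Optional


def deprecation_notice(model: dict, today: date) -> Optional[str]:
    """Build a warning for a deprecated model, or return None otherwise."""
    if model.get("status") != "DEPRECATED":
        return None
    name = model.get("name")
    message = model.get("deprecationMessage") or "No migration guidance provided."
    raw_date = model.get("deprecationDate")
    if not raw_date:
        return f"{name} is deprecated. {message}"
    # Assumption: the value starts with YYYY-MM-DD (a date or a full timestamp).
    sunset = date.fromisoformat(raw_date[:10])
    days_left = (sunset - today).days
    if days_left < 0:
        return f"{name} passed its sunset date; expect 410 Gone. {message}"
    return f"{name} is deprecated; {days_left} day(s) until sunset. {message}"
```

Logging the returned string at startup, or failing a CI check once `days_left` drops below a threshold, is one way to act on the advice to migrate early.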

Capabilities

Each model declares what it can do. Check capabilities before using a model to ensure it fits your use case.

Capability	Description
Languages	List of supported BCP 47 language tags (e.g., `en`, `he`, `de`, `ar`)
Auto-detect	Whether the model can automatically identify the spoken language
Streaming	Whether the model supports real-time streaming transcription

Check Languages Before Transcribing

If you specify a language that the model doesn't support, the transcription fails. Use the model languages endpoint (`GET /api/v1/speech-services/models/{model}/languages`) to verify supported languages, or enable auto-detection if the model supports it.
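
One way to apply this check on the client side is sketched below. The function name `choose_language` and its arguments are hypothetical; it takes the list of supported codes and the auto-detect flag as plain values and does not call the API itself.

```python
from typing import Iterable, Optional


def choose_language(
    requested: Optional[str],
    supported: Iterable[str],
    auto_detect: bool,
) -> Optional[str]:
    """Return the language code to send, or None to rely on auto-detection."""
    # BCP 47 tags are case-insensitive, so compare in lowercase.
    by_lower = {code.lower(): code for code in supported}
    if requested is not None and requested.lower() in by_lower:
        return by_lower[requested.lower()]
    if auto_detect:
        # Fall back to auto-detection when the model offers it.
        return None
    raise ValueError(
        f"language {requested!r} is not supported; "
        f"choose one of {sorted(by_lower.values())}"
    )
```

Raising before the request is sent turns a failed transcription into an immediate, descriptive error. Falling back to auto-detection for an unsupported code is a design choice of this sketch, not documented Koldan behavior.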

Model Types

Models come in three types, which determine how their names resolve to engine versions:

Type	Behavior	Example
Family	Always resolves to the latest version in the model family. Automatically upgrades when new versions are deployed.	`general` → currently resolves to `general-v3-20240915`
Pinned	Points to a specific major version but may receive minor updates (patches, accuracy improvements).	`general-v3` → currently resolves to `general-v3-20240915`
Concrete	Locked to an exact engine version. Never changes. Use when you need deterministic, reproducible results.	`general-v3-20240915` → always this exact version

Which Type Should I Use?

Use Family models for most applications - you'll always get the best available version.
Use Pinned models when you want a stable major version but still benefit from patches.
Use Concrete models only when reproducibility is critical (e.g., compliance, benchmarking).
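
To make the three resolution behaviors concrete, here is a small client-side simulation. It is not Koldan's server logic: the `<family>-v<major>-<YYYYMMDD>` naming pattern is inferred from the examples in the table, and `resolve` and the sample version list are hypothetical.

```python
import re
from typing import Iterable

# Assumption: concrete names look like <family>-v<major>-<YYYYMMDD>.
CONCRETE = re.compile(r"^(?P<family>.+)-v(?P<major>\d+)-(?P<build>\d{8})$")
PINNED = re.compile(r"^(?P<family>.+)-v(?P<major>\d+)$")


def resolve(name: str, deployed: Iterable[str]) -> str:
    """Resolve a family, pinned, or concrete name to a concrete engine version."""
    deployed = list(deployed)
    if name in deployed:
        return name  # Concrete: always this exact version.
    pinned = PINNED.match(name)
    candidates = []
    for version in deployed:
        parts = CONCRETE.match(version)
        if parts is None:
            continue
        major = int(parts["major"])
        if pinned:
            # Pinned: same family and same major version.
            matches = parts["family"] == pinned["family"] and major == int(pinned["major"])
        else:
            # Family: any version in the family.
            matches = parts["family"] == name
        if matches:
            candidates.append((major, parts["build"], version))
    if not candidates:
        raise LookupError(f"no deployed version matches {name!r}")
    # Highest major version wins, then the most recent build date.
    return max(candidates)[2]


versions = ["general-v2-20230110", "general-v3-20240601", "general-v3-20240915"]
print(resolve("general", versions))              # family name
print(resolve("general-v2", versions))           # pinned name
print(resolve("general-v3-20240601", versions))  # concrete name
```

With this sample list, the family name follows the newest build, the pinned name stays within major version 2, and the concrete name is returned unchanged.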

Default Model

Each Koldan deployment has a default model. If you create a transcription without specifying a model, the default is used automatically.

To find out which model is the default, call the list models endpoint (`GET /api/v1/speech-services/models`) and look for the model marked as default.
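
A minimal sketch of that lookup follows. It assumes the list response decodes to a list of JSON objects and that the default is marked with a boolean key; the key name `isDefault` is a guess, since this page does not name the field, so it is exposed as a parameter.

```python
from typing import Optional


def find_default(models: list, flag: str = "isDefault") -> Optional[dict]:
    """Return the first model whose default flag is truthy, or None."""
    # Assumption: each entry is a dict and the flag key holds a boolean.
    return next((m for m in models if m.get(flag)), None)
```

Because the list endpoint only returns models assigned to your role, a `None` result may simply mean the default model is not visible to you.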

Role-Based Model Access

Not all models are available to all users. Administrators can restrict which models each role can access. When you call the list models endpoint, you only see models assigned to your role.

If you need access to a model that isn't listed, contact your administrator.

→ See Roles and Permissions for more on how roles work.

Checking Available Models

Use the Speech Models API to discover what's available to you:

What You Need	Endpoint
List all models you can access	`GET /api/v1/speech-services/models`
Get details for a specific model	`GET /api/v1/speech-services/models/{model}`
Check supported languages for a model	`GET /api/v1/speech-services/models/{model}/languages`

→ Full API details in the REST API Reference.
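
The three endpoints can be wrapped in a small helper using only the Python standard library, as sketched below. The base URL is a placeholder and `models_url` and `get_json` are hypothetical names. Authentication is not described on this page, so any required headers are left to the caller.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request, urlopen

API_PREFIX = "/api/v1/speech-services/models"


def models_url(base_url: str, model: Optional[str] = None, languages: bool = False) -> str:
    """Build the URL for the list, detail, or languages endpoint."""
    url = base_url.rstrip("/") + API_PREFIX
    if model is not None:
        # Percent-encode the name so it is safe as a single path segment.
        url += "/" + quote(model, safe="")
        if languages:
            url += "/languages"
    return url


def get_json(url: str, headers: Optional[dict] = None):
    """Send a GET request and decode the JSON response body."""
    # Assumption: the server returns JSON; pass auth headers if your deployment needs them.
    request = Request(url, headers=headers or {}, method="GET")
    with urlopen(request) as response:
        return json.load(response)
```

For example, `get_json(models_url("https://koldan.example.com", "general-v3", languages=True))` would request the supported languages for `general-v3` from a deployment at that placeholder address.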

Related Topics

Files, Transcriptions, and Summaries - how models are used in the transcription workflow
Languages - full language management API
Roles and Permissions - role-based model access
Models and Aliases Administration - admin model management