Streaming Sessions
Manage and retrieve results from real-time speech recognition streaming sessions. Access session history, download recordings, and manage the lifecycle of session data.
Base path: /api/v1/speech-services/streams
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/speech-services/streams | List Sessions |
| GET | /api/v1/speech-services/streams/{id} | Get Session |
| GET | /api/v1/speech-services/streams/{id}/segments | Get Session Segments |
| GET | /api/v1/speech-services/streams/{id}/recording | Download Recording |
| DELETE | /api/v1/speech-services/streams/{id} | Delete Session |
| DELETE | /api/v1/speech-services/streams/{id}/recording | Purge Recording |
| POST | /api/v1/speech-services/streams/{id}/purge | Purge Session Content |
List Sessions
GET /api/v1/speech-services/streams
Requires Authentication - Scopes: speech:sessions:read
Retrieve a paginated list of streaming sessions. By default, only sessions belonging to the authenticated user are returned.
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| page | integer | No | 0 | Page number (zero-based). |
| size | integer | No | 20 | Page size (max 100). |
| state | string | No | - | Filter by session state: CREATED, ACTIVE, FINISHED, CANCELLED, FAILED. |
| modelAlias | string | No | - | Filter by model alias (partial, case-insensitive). |
| startedAfter | string (ISO 8601) | No | - | Return sessions started on or after this timestamp. |
| startedBefore | string (ISO 8601) | No | - | Return sessions started on or before this timestamp. |
| recordingRequested | boolean | No | - | Filter by whether recording was requested. |
| sort | string | No | startTime,desc | Sort field and direction. Allowed fields: startTime, endTime, durationMilliseconds, wordsCount, segmentCount. |
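As a rough illustration of how these query parameters combine, the sketch below builds the request path for List Sessions using only the Python standard library. The function name `build_list_sessions_path`, the lower bound of 1 on `size`, and the serialization of booleans as `true`/`false` are assumptions made for illustration, not confirmed API behavior.

```python
from urllib.parse import urlencode

BASE_PATH = "/api/v1/speech-services/streams"
ALLOWED_SORT_FIELDS = {
    "startTime", "endTime", "durationMilliseconds", "wordsCount", "segmentCount",
}
MAX_PAGE_SIZE = 100  # documented maximum page size


def build_list_sessions_path(page=0, size=20, state=None, model_alias=None,
                             started_after=None, started_before=None,
                             recording_requested=None, sort="startTime,desc"):
    """Return the path plus query string for a List Sessions request."""
    if page < 0:
        raise ValueError("page is zero-based and must not be negative")
    if not 1 <= size <= MAX_PAGE_SIZE:  # lower bound of 1 is an assumption
        raise ValueError("size must be between 1 and 100")
    sort_field = sort.split(",")[0]
    if sort_field not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {sort_field}")

    params = {"page": page, "size": size, "sort": sort}
    optional = {
        "state": state,
        "modelAlias": model_alias,
        "startedAfter": started_after,    # ISO 8601 string
        "startedBefore": started_before,  # ISO 8601 string
    }
    # Only send filters the caller actually set.
    params.update({k: v for k, v in optional.items() if v is not None})
    if recording_requested is not None:
        # Assumption: booleans are sent as lowercase "true"/"false".
        params["recordingRequested"] = "true" if recording_requested else "false"
    return f"{BASE_PATH}?{urlencode(params)}"
```

Calling `build_list_sessions_path(state="FINISHED", started_after="2026-04-01T00:00:00Z")` yields a path whose timestamp is percent-encoded, so it can be appended to the host name as-is.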
SpeechServiceSessionsListResponse
| Field | Type | Nullable | Description |
|---|---|---|---|
| sessions | SpeechServiceSessionResponse[] | No | Array of streaming session objects. |
| page | integer | No | Current page number. |
| size | integer | No | Number of items returned in this page. |
| total | long | No | Total number of matching sessions across all pages. |
{
  "page": 0,
  "size": 20,
  "total": 72,
  "sessions": [
    {
      "sessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "state": "FINISHED",
      "startTime": "2026-04-06T10:00:00Z",
      ...
    }
  ]
}
| Status | Description |
|---|---|
| 200 OK | Sessions retrieved successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope. |
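Because results are paginated, a client that needs every matching session has to walk the pages. The following is a minimal sketch of that loop, relying only on the `sessions` and `total` fields of the response. `fetch_page` is a hypothetical callable supplied by the caller; it is expected to perform the authenticated HTTP request and return the decoded JSON body as a dict.

```python
def iter_sessions(fetch_page, size=20):
    """Yield every session by requesting pages until `total` is reached.

    `fetch_page(page, size)` is a hypothetical callable that returns one
    decoded SpeechServiceSessionsListResponse body as a dict.
    """
    page = 0  # page numbers are zero-based
    seen = 0
    while True:
        body = fetch_page(page, size)
        sessions = body["sessions"]
        yield from sessions
        seen += len(sessions)
        # Stop once everything is consumed, or if the server returns an
        # empty page (guards against looping forever if `total` changes).
        if not sessions or seen >= body["total"]:
            break
        page += 1
```

Keeping the HTTP call outside the generator makes the paging logic easy to exercise with a stub and leaves the choice of HTTP client to the caller.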
Get Session
GET /api/v1/speech-services/streams/{id}
Requires Authentication - Scopes: speech:sessions:read
Retrieve detailed metadata for a specific streaming session by its unique identifier.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier of the session. |
SpeechServiceSessionResponse
Returns the session metadata object.
| Status | Description |
|---|---|
| 200 OK | Session details retrieved successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope or not authorized to access this session. |
| 404 Not Found | Session not found. |
Get Session Segments
GET /api/v1/speech-services/streams/{id}/segments
Requires Authentication - Scopes: speech:sessions:read
Retrieve all finalized transcript segments for a streaming session, in order of occurrence.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier of the session. |
SpeechServiceSessionSegmentResponse[]
Returns an array of segment objects.
| Field | Type | Nullable | Description |
|---|---|---|---|
| id | string (UUID) | No | Unique identifier of the segment. |
| segmentIndex | integer | No | Zero-based index of the segment within the session. |
| transcript | string | No | Finalized transcription text for this segment. |
| confidence | double | Yes | Confidence score (0.0–1.0). |
| words | SpeechServiceSessionWordResponse[] | Yes | Word-level timing and confidence. |
| startSeconds | double | No | Start time in seconds relative to session start. |
| endSeconds | double | No | End time in seconds relative to session start. |
| createdAt | string (ISO 8601) | No | Timestamp when the segment was finalized. |
[
  {
    "id": "segment-uuid-...",
    "segmentIndex": 0,
    "transcript": "hello world",
    "confidence": 0.95,
    "words": [
      { "word": "hello", "startSeconds": 0.5, "endSeconds": 0.8, "confidence": 0.98 },
      { "word": "world", "startSeconds": 0.9, "endSeconds": 1.2, "confidence": 0.92 }
    ],
    "startSeconds": 0.5,
    "endSeconds": 1.5,
    "createdAt": "2026-04-06T10:00:05Z"
  }
]
| Status | Description |
|---|---|
| 200 OK | Segments retrieved successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope or not authorized. |
| 404 Not Found | Session not found. |
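Once the segment array has been fetched, a common next step is to rebuild the full transcript and flag uncertain words. The sketch below works on the decoded JSON array and takes into account that `confidence` and `words` are nullable. The helper names and the 0.9 default threshold are illustrative choices, not part of the API.

```python
def join_transcript(segments):
    """Concatenate segment transcripts in segmentIndex order."""
    ordered = sorted(segments, key=lambda seg: seg["segmentIndex"])
    texts = (seg["transcript"].strip() for seg in ordered)
    return " ".join(text for text in texts if text)


def low_confidence_words(segments, threshold=0.9):
    """Return (word, confidence) pairs whose confidence is below threshold."""
    flagged = []
    for seg in segments:
        # `words` is nullable, so treat a missing or null value as empty.
        for word in seg.get("words") or []:
            confidence = word.get("confidence")  # nullable as well
            if confidence is not None and confidence < threshold:
                flagged.append((word["word"], confidence))
    return flagged
```

Sorting by `segmentIndex` is defensive: the endpoint already returns segments in order, but the sort keeps the helper correct if segments from several requests are merged.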
Download Recording
GET /api/v1/speech-services/streams/{id}/recording
Requires Authentication - Scopes: speech:sessions:read
Bandwidth Throttled - This endpoint enforces stricter bandwidth limits
Download the raw audio recording of the streaming session. A recording is available only if it was requested at session start and recording is enabled for the tenant.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier of the session. |
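import requests

JWT = "<your-access-token>"  # placeholder: supply a valid bearer token
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"

resp = requests.get(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}/recording",
    headers={"Authorization": f"Bearer {JWT}"},
    stream=True,
)
resp.raise_for_status()  # stop on 401/403/404/410 instead of saving an error body

# Write the raw PCM stream to disk in 8 KiB chunks.
with open("session-audio.pcm", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)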
Response
Returns the raw binary audio data with the audio/x-pcm content type.
Audio format:
- Sample rate: 16,000 Hz
- Bit depth: 16-bit signed little-endian
- Channels: 1 (mono)
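Raw PCM has no header, so most audio players cannot open the downloaded file directly. The sketch below wraps the bytes in a WAV container with Python's standard `wave` module, using the format parameters listed above, and also derives the duration from the byte count. The function names are illustrative.

```python
import io
import wave

SAMPLE_RATE_HZ = 16_000   # documented sample rate
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
CHANNELS = 1              # mono


def pcm_duration_seconds(pcm_bytes):
    """Duration implied by the byte count of a raw PCM recording."""
    bytes_per_second = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS
    return len(pcm_bytes) / bytes_per_second


def pcm_to_wav_bytes(pcm_bytes):
    """Wrap raw 16-bit little-endian mono PCM in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm_bytes)  # WAV stores 16-bit PCM little-endian too
    return buffer.getvalue()
```

For example, reading `session-audio.pcm` in binary mode and writing the result of `pcm_to_wav_bytes` to `session-audio.wav` produces a file that standard players can open. No sample conversion is needed because WAV uses the same signed little-endian layout for 16-bit audio.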
| Status | Description |
|---|---|
| 200 OK | Recording content returned successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope or not authorized. |
| 404 Not Found | Session not found or recording was not requested. |
| 410 Gone | Recording has been purged and is no longer available. |
Delete Session
DELETE /api/v1/speech-services/streams/{id}
Requires Authentication - Scopes: speech:sessions:delete
Deletes a streaming session. This is a "soft" deletion; the session record remains in the database for auditing and quota purposes but will no longer appear in standard list results.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier of the session. |
Response
No response body.
| Status | Description |
|---|---|
| 204 No Content | Session deleted successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope or not authorized. |
| 404 Not Found | Session not found or already deleted. |
Purge Recording
DELETE /api/v1/speech-services/streams/{id}/recording
Requires Authentication - Scopes: speech:sessions:delete
Rate Limited - This endpoint enforces stricter rate limits
Permanently removes the audio recording for a session from storage. Session metadata and transcript segments are preserved. This operation is irreversible.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier of the session. |
Response
No response body.
| Status | Description |
|---|---|
| 204 No Content | Recording purged successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope or not authorized. |
| 404 Not Found | Session not found or no recording was requested. |
Purge Session Content
POST /api/v1/speech-services/streams/{id}/purge
Requires Authentication - Scopes: speech:sessions:delete
Rate Limited - This endpoint enforces stricter rate limits
Permanently deletes all transcript segments and the audio recording for a session. Session metadata remains available for auditing and quota tracking. This operation is irreversible.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier of the session. |
Response
No response body.
| Status | Description |
|---|---|
| 204 No Content | Session content purged successfully. |
| 401 Unauthorized | Missing or invalid authentication. |
| 403 Forbidden | Insufficient scope or not authorized. |
| 404 Not Found | Session not found. |
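The three destructive operations differ in HTTP method and path, which is easy to mix up: Delete Session and Purge Recording use DELETE, while Purge Session Content uses POST. The sketch below maps each one to its method and path. The operation keys (`delete_session`, `purge_recording`, `purge_content`) and the function name are hypothetical labels chosen for this example.

```python
import uuid

BASE_PATH = "/api/v1/speech-services/streams"

# Hypothetical operation keys mapped to (HTTP method, path suffix).
_OPERATIONS = {
    "delete_session": ("DELETE", ""),             # soft delete, record is kept
    "purge_recording": ("DELETE", "/recording"),  # irreversible, audio only
    "purge_content": ("POST", "/purge"),          # irreversible, audio + segments
}


def destructive_request(operation, session_id):
    """Return the (method, path) pair for a delete or purge operation."""
    if operation not in _OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    uuid.UUID(session_id)  # raises ValueError if the id is not a valid UUID
    method, suffix = _OPERATIONS[operation]
    return method, f"{BASE_PATH}/{session_id}{suffix}"
```

All three calls require the speech:sessions:delete scope and return 204 No Content on success, so a caller only needs to check the status code.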
Data Types
SpeechServiceSessionResponse
| Field | Type | Nullable | Description |
|---|---|---|---|
| sessionId | string (UUID) | No | Unique identifier of the streaming session. |
| state | string | No | Current session state: CREATED, ACTIVE, FINISHED, CANCELLED, FAILED. |
| requestedModelAlias | string | No | The model alias requested by the client. |
| resolvedModelName | string | No | The actual speech model name used for processing. |
| usedFallback | boolean | No | Whether a fallback model was used. |
| startTime | string (ISO 8601) | No | Timestamp when the session was started. |
| endTime | string (ISO 8601) | Yes | Timestamp when the session ended. |
| durationMilliseconds | long | No | Total duration of processed audio in milliseconds. |
| wordsCount | integer | No | Total number of words recognized in this session. |
| segmentCount | integer | No | Total number of finalized segments. |
| recordingRequested | boolean | No | Whether audio recording was requested for this session. |
| recordingPurged | boolean | No | Whether the recording has been purged. |
| recordingPurgedAt | string (ISO 8601) | Yes | Timestamp when the recording was purged. |
| clientAddress | string | No | IP address of the client that started the session. |
| userAgent | string | Yes | User-Agent header of the client. |
| failureReason | string | Yes | Description of why the session failed (if state is FAILED). |
| metadata | object | Yes | Custom key-value metadata associated with the session. |
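To show how the nullable fields might be handled on the client side, the sketch below converts a decoded session object into a small typed summary. The `SessionSummary` class and the `recording_available` rule (requested and not yet purged) are assumptions for illustration; the API itself also requires recording to be enabled for the tenant, which this object does not expose.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, passing None through for nullable fields."""
    if value is None:
        return None
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class SessionSummary:  # hypothetical client-side type
    session_id: str
    state: str
    start_time: datetime
    end_time: Optional[datetime]
    duration_ms: int
    recording_available: bool


def summarize_session(payload):
    """Build a SessionSummary from a decoded SpeechServiceSessionResponse."""
    return SessionSummary(
        session_id=payload["sessionId"],
        state=payload["state"],
        start_time=parse_timestamp(payload["startTime"]),
        end_time=parse_timestamp(payload.get("endTime")),  # nullable
        duration_ms=payload["durationMilliseconds"],
        # Assumption: a recording can be downloaded if it was requested
        # and has not been purged.
        recording_available=(
            payload["recordingRequested"] and not payload["recordingPurged"]
        ),
    )
```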
SpeechServiceSessionWordResponse
| Field | Type | Nullable | Description |
|---|---|---|---|
| word | string | No | The transcribed word. |
| startSeconds | double | No | Start time of the word in seconds. |
| endSeconds | double | No | End time of the word in seconds. |
| confidence | double | Yes | Confidence score for this word (0.0–1.0). |
| speakerTag | integer | Yes | Identified speaker label (if diarization was active). |
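When diarization was active, `speakerTag` makes it possible to split a transcript by speaker. The following sketch groups words by that tag and files words without a tag under `None`, since the field is nullable. The function name is illustrative.

```python
from collections import defaultdict


def words_by_speaker(words):
    """Group transcribed words by speakerTag, preserving word order.

    Words whose speakerTag is missing or null are grouped under None.
    """
    grouped = defaultdict(list)
    for word in words:
        grouped[word.get("speakerTag")].append(word["word"])
    return dict(grouped)
```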
Enumerations
SpeechServiceSessionState
| Value | Description |
|---|---|
| CREATED | Session record created, engine connection in progress. |
| ACTIVE | Engine connected, actively processing audio. |
| FINISHED | Session completed normally. |
| CANCELLED | Client disconnected or session was cancelled. |
| FAILED | An error occurred during the session. |
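Client code often needs to know whether a session can still change. The sketch below models the states as a Python enum and treats FINISHED, CANCELLED, and FAILED as terminal. That grouping is inferred from the descriptions above and is an assumption, not a documented guarantee.

```python
from enum import Enum


class SessionState(str, Enum):
    CREATED = "CREATED"
    ACTIVE = "ACTIVE"
    FINISHED = "FINISHED"
    CANCELLED = "CANCELLED"
    FAILED = "FAILED"


# Assumption: these states are final and the session will not leave them.
TERMINAL_STATES = frozenset(
    {SessionState.FINISHED, SessionState.CANCELLED, SessionState.FAILED}
)


def is_terminal(state):
    """Return True if the state string denotes a session that has ended.

    Raises ValueError for a value outside the documented enumeration.
    """
    return SessionState(state) in TERMINAL_STATES
```

A poller could, for instance, call Get Session periodically and stop once `is_terminal(session["state"])` returns True.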
Real-time Processing
This REST API is used for managing session history and retrieving final results. For real-time interaction and audio streaming, use the Streaming WebSocket API.
Quota Tracking
Streaming sessions count toward your monthly transcription minutes quota. Duration is still tracked accurately if the connection fails.
Related Documentation
- Streaming Fundamentals: Learn about the concepts, lifecycle, and best practices of streaming.
- Streaming WebSocket API: Detailed protocol and message formats for real-time transcription.
- Transcription API: REST API for pre-recorded file transcriptions.