Streaming Sessions

Manage and retrieve results from real-time speech recognition streaming sessions. Access session history, download recordings, and manage the lifecycle of session data.


Base path: /api/v1/speech-services/streams

Method Endpoint Description
GET /api/v1/speech-services/streams List Sessions
GET /api/v1/speech-services/streams/{id} Get Session
GET /api/v1/speech-services/streams/{id}/segments Get Session Segments
GET /api/v1/speech-services/streams/{id}/recording Download Recording
DELETE /api/v1/speech-services/streams/{id} Delete Session
DELETE /api/v1/speech-services/streams/{id}/recording Purge Recording
POST /api/v1/speech-services/streams/{id}/purge Purge Session Content

List Sessions

GET /api/v1/speech-services/streams

Requires Authentication - Scopes: speech:sessions:read

Retrieve a paginated list of streaming sessions. By default, only sessions belonging to the authenticated user are returned.

Query Parameters
Parameter Type Required Default Description
page integer No 0 Page number (zero-based).
size integer No 20 Page size (max 100).
state string No - Filter by session state: CREATED, ACTIVE, FINISHED, CANCELLED, FAILED.
modelAlias string No - Filter by model alias (partial, case-insensitive).
startedAfter string (ISO 8601) No - Return sessions started on or after this timestamp.
startedBefore string (ISO 8601) No - Return sessions started on or before this timestamp.
recordingRequested boolean No - Filter by whether recording was requested.
sort string No startTime,desc Sort field and direction. Allowed fields: startTime, endTime, durationMilliseconds, wordsCount, segmentCount.
curl -X GET "https://koldan.dixilang.com/api/v1/speech-services/streams?page=0&size=20&state=FINISHED" \
  -H "X-API-Key: $KOLDAN_API_KEY"
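import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]

resp = requests.get(
    "https://koldan.dixilang.com/api/v1/speech-services/streams",
    headers={"Authorization": f"Bearer {JWT}"},
    params={"page": 0, "size": 20, "state": "FINISHED"},
)
resp.raise_for_status()  # raise on 401/403 instead of printing an error body
print(resp.json())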
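The filter parameters above can also be assembled programmatically. The following is a minimal sketch, not part of the API: the helper names `to_iso8601_utc` and `build_list_params` are hypothetical, and it assumes the service accepts UTC timestamps in the same `2026-04-06T10:00:00Z` form that appears in the response example.

```python
from datetime import datetime, timezone
from typing import Optional


def to_iso8601_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string ending in 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_list_params(
    page: int = 0,
    size: int = 20,
    state: Optional[str] = None,
    started_after: Optional[datetime] = None,
    started_before: Optional[datetime] = None,
    sort: str = "startTime,desc",
) -> dict:
    """Build the query dictionary for GET /api/v1/speech-services/streams."""
    if size > 100:
        raise ValueError("size must not exceed 100 (documented maximum)")
    params = {"page": page, "size": size, "sort": sort}
    # Optional filters are only sent when the caller sets them.
    if state is not None:
        params["state"] = state
    if started_after is not None:
        params["startedAfter"] = to_iso8601_utc(started_after)
    if started_before is not None:
        params["startedBefore"] = to_iso8601_utc(started_before)
    return params
```

The resulting dictionary can be passed as the `params` argument of `requests.get`.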
SpeechServiceSessionsListResponse
Field Type Nullable Description
sessions SpeechServiceSessionResponse[] No Array of streaming session objects.
page integer No Current page number.
size integer No Number of items returned in this page.
total long No Total number of matching sessions across all pages.
SpeechServiceSessionsListResponse
{
  "page": 0,
  "size": 20,
  "total": 72,
  "sessions": [
    {
      "sessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "state": "FINISHED",
      "startTime": "2026-04-06T10:00:00Z",
      ...
    }
  ]
}
Status Description
200 OK Sessions retrieved successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope.
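
Because the list is paginated, reading every matching session means walking the pages until `total` is reached. The sketch below shows one way to do that. `iter_sessions` and `fetch_page` are hypothetical names: `fetch_page(page, size)` is assumed to be a caller-supplied function that performs the HTTP request and returns the decoded `SpeechServiceSessionsListResponse` body.

```python
from typing import Callable, Iterator


def iter_sessions(
    fetch_page: Callable[[int, int], dict], size: int = 20
) -> Iterator[dict]:
    """Yield every session by requesting zero-based pages until `total` is reached."""
    page = 0
    seen = 0
    while True:
        body = fetch_page(page, size)
        sessions = body["sessions"]
        for session in sessions:
            yield session
        seen += len(sessions)
        # Stop once all matching sessions were seen. The empty-page check
        # guards against `total` changing between requests.
        if not sessions or seen >= body["total"]:
            break
        page += 1
```

In practice `fetch_page` would wrap a call such as `requests.get(url, headers=..., params={"page": page, "size": size}).json()`.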

Get Session

GET /api/v1/speech-services/streams/{id}

Requires Authentication - Scopes: speech:sessions:read

Retrieve detailed metadata for a specific streaming session by its unique identifier.

Path Parameters
Parameter Type Required Description
id string (UUID) Yes Unique identifier of the session.
curl -X GET https://koldan.dixilang.com/api/v1/speech-services/streams/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  -H "X-API-Key: $KOLDAN_API_KEY"
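import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
resp = requests.get(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}",
    headers={"Authorization": f"Bearer {JWT}"},
)
resp.raise_for_status()  # raise on 401/403/404 instead of printing an error body
print(resp.json())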
SpeechServiceSessionResponse

Returns the session metadata object.

Status Description
200 OK Session details retrieved successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope or not authorized to access this session.
404 Not Found Session not found.

Get Session Segments

GET /api/v1/speech-services/streams/{id}/segments

Requires Authentication - Scopes: speech:sessions:read

Retrieve all finalized transcript segments for a streaming session, in the order they occurred.

Path Parameters
Parameter Type Required Description
id string (UUID) Yes Unique identifier of the session.
curl -X GET https://koldan.dixilang.com/api/v1/speech-services/streams/a1b2c3d4-e5f6-7890-abcd-ef1234567890/segments \
  -H "X-API-Key: $KOLDAN_API_KEY"
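import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
resp = requests.get(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}/segments",
    headers={"Authorization": f"Bearer {JWT}"},
)
resp.raise_for_status()  # raise on 401/403/404 instead of printing an error body
print(resp.json())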
SpeechServiceSessionSegmentResponse[]

Returns an array of segment objects.

Field Type Nullable Description
id string (UUID) No Unique identifier of the segment.
segmentIndex integer No Zero-based index of the segment within the session.
transcript string No Finalized transcription text for this segment.
confidence double Yes Confidence score (0.0–1.0).
words SpeechServiceSessionWordResponse[] Yes Word-level timing and confidence.
startSeconds double No Start time in seconds relative to session start.
endSeconds double No End time in seconds relative to session start.
createdAt string (ISO 8601) No Timestamp when the segment was finalized.
SpeechServiceSessionSegmentResponse
[
  {
    "id": "segment-uuid-...",
    "segmentIndex": 0,
    "transcript": "hello world",
    "confidence": 0.95,
    "words": [
      { "word": "hello", "startSeconds": 0.5, "endSeconds": 0.8, "confidence": 0.98 },
      { "word": "world", "startSeconds": 0.9, "endSeconds": 1.2, "confidence": 0.92 }
    ],
    "startSeconds": 0.5,
    "endSeconds": 1.5,
    "createdAt": "2026-04-06T10:00:05Z"
  }
]
Status Description
200 OK Segments retrieved successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope or not authorized.
404 Not Found Session not found.
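
Once the segment array is retrieved, a common next step is to turn it into a single transcript or to flag uncertain words. The sketch below is illustrative only: the function names are hypothetical, joining segments with a space is an assumption that suits space-delimited languages, and the 0.95 threshold is an arbitrary example value.

```python
def full_transcript(segments: list) -> str:
    """Join segment transcripts in segmentIndex order, separated by spaces."""
    ordered = sorted(segments, key=lambda s: s["segmentIndex"])
    return " ".join(s["transcript"] for s in ordered)


def low_confidence_words(segments: list, threshold: float = 0.95) -> list:
    """Return (word, startSeconds) pairs whose confidence is below the threshold."""
    flagged = []
    for segment in segments:
        # `words` is nullable, so treat a missing or null value as an empty list.
        for word in segment.get("words") or []:
            confidence = word.get("confidence")  # also nullable
            if confidence is not None and confidence < threshold:
                flagged.append((word["word"], word["startSeconds"]))
    return flagged
```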

Download Recording

GET /api/v1/speech-services/streams/{id}/recording

Requires Authentication - Scopes: speech:sessions:read

Bandwidth Throttled - This endpoint enforces stricter bandwidth limits

Download the raw audio recording of a streaming session. A recording is available only if it was requested at session start and recording is enabled for the tenant.

Path Parameters
Parameter Type Required Description
id string (UUID) Yes Unique identifier of the session.
curl -X GET https://koldan.dixilang.com/api/v1/speech-services/streams/a1b2c3d4-e5f6-7890-abcd-ef1234567890/recording \
  -H "X-API-Key: $KOLDAN_API_KEY" \
  -o session-audio.pcm
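import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
resp = requests.get(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}/recording",
    headers={"Authorization": f"Bearer {JWT}"},
    stream=True,
)
resp.raise_for_status()  # avoid writing a 404/410 error body into the audio file
# Stream the body to disk in chunks so the whole recording is never held in memory.
with open("session-audio.pcm", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)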
Response

Returns the raw binary audio data with the audio/x-pcm content type.

Audio format:

  • Sample rate: 16,000 Hz
  • Bit depth: 16-bit signed little-endian
  • Channels: 1 (mono)
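
Raw PCM has no header, so most audio players cannot open the downloaded file directly. Given the format listed above, the bytes can be wrapped in a WAV container with Python's standard `wave` module. This is a minimal sketch; the helper names are hypothetical.

```python
import io
import wave

SAMPLE_RATE_HZ = 16_000   # from the documented audio format
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
CHANNELS = 1              # mono


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw 16 kHz, 16-bit, mono PCM in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buffer.getvalue()


def pcm_duration_seconds(pcm_byte_count: int) -> float:
    """Duration implied by the byte count: bytes / (rate * width * channels)."""
    return pcm_byte_count / (SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS)
```

At this format one second of audio occupies 32,000 bytes, so the file size gives a quick check on the recording length.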
Status Description
200 OK Recording content returned successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope or not authorized.
404 Not Found Session not found or recording was not requested.
410 Gone Recording has been purged and is no longer available.
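
The status codes above distinguish a recording that never existed (404) from one that was purged (410), which a client may want to report differently. A small sketch of that mapping follows; `RecordingOutcome` and `classify_recording_status` are hypothetical names, and only the documented codes are covered.

```python
from enum import Enum


class RecordingOutcome(Enum):
    AVAILABLE = "available"        # 200: body contains the audio
    UNAUTHORIZED = "unauthorized"  # 401 or 403
    MISSING = "missing"            # 404: no session, or recording not requested
    PURGED = "purged"              # 410: recording existed but was purged


_STATUS_TO_OUTCOME = {
    200: RecordingOutcome.AVAILABLE,
    401: RecordingOutcome.UNAUTHORIZED,
    403: RecordingOutcome.UNAUTHORIZED,
    404: RecordingOutcome.MISSING,
    410: RecordingOutcome.PURGED,
}


def classify_recording_status(status_code: int) -> RecordingOutcome:
    """Map a documented status code of the download endpoint to an outcome."""
    try:
        return _STATUS_TO_OUTCOME[status_code]
    except KeyError:
        raise ValueError(f"undocumented status code: {status_code}") from None
```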

Delete Session

DELETE /api/v1/speech-services/streams/{id}

Requires Authentication - Scopes: speech:sessions:delete

Deletes a streaming session. This is a "soft" deletion; the session record remains in the database for auditing and quota purposes but will no longer appear in standard list results.

Path Parameters
Parameter Type Required Description
id string (UUID) Yes Unique identifier of the session.
curl -X DELETE https://koldan.dixilang.com/api/v1/speech-services/streams/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  -H "X-API-Key: $KOLDAN_API_KEY"
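import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
resp = requests.delete(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}",
    headers={"Authorization": f"Bearer {JWT}"},
)
print(resp.status_code)  # 204 on success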
Response

No response body.

Status Description
204 No Content Session deleted successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope or not authorized.
404 Not Found Session not found or already deleted.

Purge Recording

DELETE /api/v1/speech-services/streams/{id}/recording

Requires Authentication - Scopes: speech:sessions:delete

Rate Limited - This endpoint enforces stricter rate limits

Permanently removes the audio recording for a session from storage. Session metadata and transcript segments are preserved. This operation is irreversible.

Path Parameters
Parameter Type Required Description
id string (UUID) Yes Unique identifier of the session.
curl -X DELETE https://koldan.dixilang.com/api/v1/speech-services/streams/a1b2c3d4-e5f6-7890-abcd-ef1234567890/recording \
  -H "X-API-Key: $KOLDAN_API_KEY"
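import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
resp = requests.delete(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}/recording",
    headers={"Authorization": f"Bearer {JWT}"},
)
print(resp.status_code)  # 204 on success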
Response

No response body.

Status Description
204 No Content Recording purged successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope or not authorized.
404 Not Found Session not found or no recording was requested.

Purge Session Content

POST /api/v1/speech-services/streams/{id}/purge

Requires Authentication - Scopes: speech:sessions:delete

Rate Limited - This endpoint enforces stricter rate limits

Permanently deletes all transcript segments and the audio recording for a session. Session metadata remains available for auditing and quota tracking. This operation is irreversible.

Path Parameters
Parameter Type Required Description
id string (UUID) Yes Unique identifier of the session.
curl -X POST https://koldan.dixilang.com/api/v1/speech-services/streams/a1b2c3d4-e5f6-7890-abcd-ef1234567890/purge \
  -H "X-API-Key: $KOLDAN_API_KEY"
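import os
import requests

# Assumption: the JWT is supplied through a KOLDAN_JWT environment variable (hypothetical name).
JWT = os.environ["KOLDAN_JWT"]
session_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
resp = requests.post(
    f"https://koldan.dixilang.com/api/v1/speech-services/streams/{session_id}/purge",
    headers={"Authorization": f"Bearer {JWT}"},
)
print(resp.status_code)  # 204 on success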
Response

No response body.

Status Description
204 No Content Session content purged successfully.
401 Unauthorized Missing or invalid authentication.
403 Forbidden Insufficient scope or not authorized.
404 Not Found Session not found.
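
Since purging is irreversible, it can help to compute the list of candidate sessions separately and review it before calling the endpoint. The sketch below illustrates one possible retention rule and is not a documented feature: the function names, the 30-day example window, and the choice to skip sessions without an `endTime` are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, normalizing a trailing 'Z' for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def sessions_to_purge(sessions: list, max_age: timedelta, now: datetime) -> list:
    """Return the sessionId of each ended session that started more than max_age ago."""
    cutoff = now - max_age
    return [
        s["sessionId"]
        for s in sessions
        # Sessions with no endTime may still be running, so they are left alone.
        if s.get("endTime") is not None and parse_utc(s["startTime"]) < cutoff
    ]
```

Each returned identifier could then be passed to the purge endpoint shown above, for example with `max_age=timedelta(days=30)` and `now=datetime.now(timezone.utc)`.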

Data Types

SpeechServiceSessionResponse

Field Type Nullable Description
sessionId string (UUID) No Unique identifier of the streaming session.
state string No Current session state: CREATED, ACTIVE, FINISHED, CANCELLED, FAILED.
requestedModelAlias string No The model alias requested by the client.
resolvedModelName string No The actual speech model name used for processing.
usedFallback boolean No Whether a fallback model was used.
startTime string (ISO 8601) No Timestamp when the session was started.
endTime string (ISO 8601) Yes Timestamp when the session ended.
durationMilliseconds long No Total duration of processed audio in milliseconds.
wordsCount integer No Total number of words recognized in this session.
segmentCount integer No Total number of finalized segments.
recordingRequested boolean No Whether audio recording was requested for this session.
recordingPurged boolean No Whether the recording has been purged.
recordingPurgedAt string (ISO 8601) Yes Timestamp when the recording was purged.
clientAddress string No IP address of the client that started the session.
userAgent string Yes User-Agent header of the client.
failureReason string Yes Description of why the session failed (if state is FAILED).
metadata object Yes Custom key-value metadata associated with the session.
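
Client code often maps this payload onto a typed structure so that nullable fields are handled in one place. The following sketch covers only a subset of the fields; `SessionSummary` and `parse_timestamp` are hypothetical names, not types provided by the API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string, passing None through for nullable fields."""
    if value is None:
        return None
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class SessionSummary:
    session_id: str
    state: str
    start_time: datetime
    end_time: Optional[datetime]   # nullable: may be absent until the session ends
    duration_ms: int
    failure_reason: Optional[str]  # nullable: described for FAILED sessions

    @classmethod
    def from_json(cls, payload: dict) -> "SessionSummary":
        """Build a summary from a decoded SpeechServiceSessionResponse object."""
        return cls(
            session_id=payload["sessionId"],
            state=payload["state"],
            start_time=parse_timestamp(payload["startTime"]),
            end_time=parse_timestamp(payload.get("endTime")),
            duration_ms=payload["durationMilliseconds"],
            failure_reason=payload.get("failureReason"),
        )
```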

SpeechServiceSessionWordResponse

Field Type Nullable Description
word string No The transcribed word.
startSeconds double No Start time of the word in seconds.
endSeconds double No End time of the word in seconds.
confidence double Yes Confidence score for this word (0.0–1.0).
speakerTag integer Yes Identified speaker label (if diarization was active).
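
When diarization was active, the `speakerTag` field makes it possible to regroup words into speaker turns. The sketch below collapses consecutive words that share a tag; `speaker_turns` is a hypothetical helper, and words without a tag are grouped under `None`.

```python
from itertools import groupby


def speaker_turns(words: list) -> list:
    """Collapse consecutive words with the same speakerTag into (speaker, text) turns."""
    turns = []
    # groupby only merges adjacent items, which preserves the order of turns.
    for speaker, group in groupby(words, key=lambda w: w.get("speakerTag")):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns
```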

Enumerations

SpeechServiceSessionState

Value Description
CREATED Session record created, engine connection in progress.
ACTIVE Engine connected, actively processing audio.
FINISHED Session completed normally.
CANCELLED Client disconnected or session was cancelled.
FAILED An error occurred during the session.
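
On the client side the states can be mirrored in an enumeration. The sketch below also groups FINISHED, CANCELLED, and FAILED as terminal states; that grouping is an assumption inferred from the descriptions above, since the documentation does not spell out the allowed transitions.

```python
from enum import Enum


class SessionState(str, Enum):
    CREATED = "CREATED"
    ACTIVE = "ACTIVE"
    FINISHED = "FINISHED"
    CANCELLED = "CANCELLED"
    FAILED = "FAILED"


# Assumption: a session in one of these states will not change state again.
TERMINAL_STATES = frozenset(
    {SessionState.FINISHED, SessionState.CANCELLED, SessionState.FAILED}
)


def is_terminal(state: str) -> bool:
    """Return True if the state string names a (presumed) terminal state."""
    return SessionState(state) in TERMINAL_STATES
```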

Real-time Processing

This REST API is used for managing session history and retrieving final results. For real-time interaction and audio streaming, use the Streaming WebSocket API.

Quota Tracking

Streaming sessions count toward your monthly transcription minutes quota. Duration is tracked accurately even in case of connection failures.
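
For a rough client-side estimate of usage, the `durationMilliseconds` values of the listed sessions can be summed. This is only a sketch: it assumes the quota is derived from that field and applies no rounding, neither of which is confirmed here, so the billed figure may differ. `total_minutes` is a hypothetical helper.

```python
def total_minutes(sessions: list) -> float:
    """Sum durationMilliseconds across sessions and convert to minutes."""
    total_ms = sum(s["durationMilliseconds"] for s in sessions)
    return total_ms / 60_000
```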