POST /v1/audio/transcriptions
Speech-to-Text
curl --request POST \
  --url https://stt.freyavoice.ai/v1/audio/transcriptions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form model=freya-stt \
  --form response_format=json \
  --form temperature=0

{
  "text": "<string>",
  "inference_time_ms": 123
}

Authorizations

Authorization
string
header
required

Bearer token issued by your workspace.
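As a minimal sketch, the header could be assembled as shown here. The helper name auth_headers and the environment variable FREYA_API_TOKEN are hypothetical and not part of the documented API; only the "Bearer <token>" header format comes from this page.

```python
import os


def auth_headers(token: str) -> dict[str, str]:
    """Return the Authorization header for a workspace bearer token."""
    token = token.strip()
    if not token:
        raise ValueError("bearer token must not be empty")
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # FREYA_API_TOKEN is an assumed variable name; use whatever your setup defines.
    print(auth_headers(os.environ.get("FREYA_API_TOKEN", "<token>")))
```

Reading the token from the environment keeps it out of source control; any other secret store would work equally well.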

Body

multipart/form-data
file
file
required

The audio file to transcribe. Supported formats: wav, mp3, flac, ogg, m4a, webm. Maximum size: 15 MB.
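A client-side pre-flight check can catch unsupported or oversized files before uploading. The sketch below is illustrative only: check_audio_file is a hypothetical helper, the extension test is a heuristic (the server decides what it actually accepts), and reading "15 MB" as 15 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import Path

# Formats as listed in the file parameter description above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a", ".webm"}
# Assumption: "15 MB" means 15 * 1024 * 1024 bytes; the page does not say which unit is used.
MAX_FILE_BYTES = 15 * 1024 * 1024


def check_audio_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(no extension)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    return problems
```

For a file on disk, a call might look like check_audio_file(path.name, path.stat().st_size).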

model
string
default:freya-stt

The model to use for transcription.

response_format
enum<string>
default:json

The format of the response. json returns a JSON object containing the transcribed text, for example {"text": "..."}. text returns the transcript as plain text. verbose_json includes language, timing, and word-level data.

Available options: json, text, verbose_json

temperature
number
default:0

Sampling temperature, between 0 and 1. Lower values make the output more deterministic.
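The body parameters above can be assembled into a request using only the Python standard library. This is a sketch under stated assumptions, not an official client: build_multipart and build_request are hypothetical helpers, the file part is sent as application/octet-stream, and the filename is assumed to contain no double quotes. The URL, field names, defaults, and allowed values are taken from this page.

```python
import urllib.request
import uuid

API_URL = "https://stt.freyavoice.ai/v1/audio/transcriptions"
RESPONSE_FORMATS = {"json", "text", "verbose_json"}


def build_multipart(fields: dict[str, str], filename: str, audio: bytes, boundary: str) -> bytes:
    """Encode text fields plus one part named "file" as multipart/form-data."""
    lines: list[bytes] = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",  # assumed generic type for the audio part
        b"",
        audio,
        f"--{boundary}--".encode(),  # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines)


def build_request(token: str, filename: str, audio: bytes,
                  response_format: str = "json",
                  temperature: float = 0) -> urllib.request.Request:
    """Validate the parameters and return an unsent POST request."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(RESPONSE_FORMATS)}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    boundary = uuid.uuid4().hex
    fields = {
        "model": "freya-stt",
        "response_format": response_format,
        "temperature": str(temperature),
    }
    body = build_multipart(fields, filename, audio, boundary)
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")
```

To send the request, pass the result to urllib.request.urlopen inside a with block and read the response body. Keeping construction separate from sending makes the encoding easy to inspect without network access.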

Response

Transcription successful.

Standard JSON response (response_format=json)

text
string

The transcribed text.

inference_time_ms
number

Server-side inference time in milliseconds.
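Reading the response might look like the sketch below. parse_transcription is a hypothetical helper; it covers only the json and text formats, because the verbose_json schema is not described in this section.

```python
import json


def parse_transcription(body: bytes, response_format: str = "json") -> dict:
    """Return the transcript and, when available, the server-side inference time."""
    decoded = body.decode("utf-8")
    if response_format == "text":
        # Plain-text responses carry only the transcript.
        return {"text": decoded, "inference_time_ms": None}
    if response_format == "json":
        payload = json.loads(decoded)
        return {
            "text": payload["text"],
            "inference_time_ms": payload.get("inference_time_ms"),
        }
    raise ValueError(f"format not handled in this sketch: {response_format}")
```

Using payload.get for inference_time_ms keeps the helper tolerant of responses that omit the timing field.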