Elevenlabs v3 Text to Speech
POST
Converts text to speech using your chosen voice and returns audio.
Request Headers
Enum: application/json
Bearer authentication format: Bearer {{API Key}}.
Request Body
If specified, the system will attempt to sample deterministically. Repeated requests with the same seed and parameters should return the same result, but full determinism is not guaranteed.
Range: [0, 4294967295]
The text to convert to speech.
The voice ID to use.
The text that comes after the current request text. Used to improve speech continuity when concatenating multiple generations.
Language code (ISO 639-1) used for the model and text normalization. An error will be returned if the model does not support this language code.
Output format for the generated audio. Format is codec_sample_rate_bitrate. The 192kbps bitrate for MP3 requires a Creator or higher account; the 44.1kHz sample rate for PCM requires a Pro or higher account.
Possible values: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64, mp3_44100_96, mp3_44100_128, mp3_44100_192, pcm_8000, pcm_16000, pcm_22050, pcm_24000, pcm_32000, pcm_44100, pcm_48000, ulaw_8000, alaw_8000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192
The text that comes before the current request text. Used to improve speech continuity when concatenating multiple generations.
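The output format naming scheme (codec_sample_rate_bitrate) can be unpacked mechanically, which may help when checking account requirements before sending a request. The following is a minimal sketch, not part of the API: the names AudioFormat, parse_output_format, and required_tier are hypothetical helpers, and the tier rules encode only the two constraints stated in the format description.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None for values without a bitrate part, e.g. pcm_16000


def parse_output_format(value: str) -> AudioFormat:
    """Split a value such as 'mp3_44100_128' into its components."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output format: {value!r}")
    codec = parts[0]
    sample_rate_hz = int(parts[1])
    bitrate_kbps = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(codec, sample_rate_hz, bitrate_kbps)


def required_tier(fmt: AudioFormat) -> Optional[str]:
    """Return the minimum account tier named in the description, if any."""
    if fmt.codec == "mp3" and fmt.bitrate_kbps == 192:
        return "Creator"
    if fmt.codec == "pcm" and fmt.sample_rate_hz == 44100:
        return "Pro"
    return None
```

For example, parse_output_format("pcm_44100") yields a format with no bitrate, and required_tier reports "Pro" for it.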
If true, uses the IVC version of the voice instead of the PVC version. This is a temporary workaround for higher latency with the PVC version.
List of request_ids for subsequent samples. Used to maintain speech continuity when regenerating samples. Up to 3 request_ids can be provided.
Array length: 0 - 3
List of request_ids for previously generated samples before the current generation. Can be used to improve speech continuity. Up to 3 request_ids can be provided.
Array length: 0 - 3
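When long text is split into several generations, each request can carry the neighbouring text and the IDs of earlier generations as context. The sketch below shows one way to prepare that context. It is an illustration under assumptions: the field names text, previous_text, and next_text are not given in this section and are used here as hypothetical names, and with_context and recent_request_ids are hypothetical helpers.

```python
from typing import Dict, List

MAX_REQUEST_IDS = 3  # the parameter descriptions allow up to 3 request_ids


def with_context(chunks: List[str]) -> List[Dict[str, str]]:
    """Attach the preceding and following chunk to each chunk of text."""
    items = []
    for i, chunk in enumerate(chunks):
        item = {"text": chunk}  # field names are assumptions
        if i > 0:
            item["previous_text"] = chunks[i - 1]
        if i < len(chunks) - 1:
            item["next_text"] = chunks[i + 1]
        items.append(item)
    return items


def recent_request_ids(all_ids: List[str]) -> List[str]:
    """Keep only the most recent IDs so the array stays within the limit."""
    return all_ids[-MAX_REQUEST_IDS:]
```

The first chunk gets no preceding text and the last chunk gets no following text, so those fields are simply omitted for them.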
Controls text normalization. ‘auto’ lets the system decide, ‘on’ always normalizes, ‘off’ skips normalization.
Possible values: auto, on, off
Controls language-specific text normalization for certain supported languages to achieve more natural pronunciation. Warning: may significantly increase latency. Currently only supports Japanese.
List of pronunciation dictionary locators (id, version_id) to apply to the text. Applied in order. Up to 3 locators per request.
Array length: 0 - 3
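The constraints listed for the request body (seed range, normalization values, array lengths) can be checked on the client before a request is sent. The following is a rough sketch under assumptions: this section does not list the JSON field names, so text, voice_id, seed, apply_text_normalization, and pronunciation_dictionary_locators are hypothetical, as is the build_payload helper itself.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

MAX_SEED = 4294967295                      # upper bound of the documented seed range
NORMALIZATION_MODES = {"auto", "on", "off"}
MAX_LOCATORS = 3                           # up to 3 locators per request


def build_payload(
    text: str,
    voice_id: str,
    *,
    seed: Optional[int] = None,
    text_normalization: str = "auto",
    locators: Sequence[Tuple[str, str]] = (),
) -> Dict[str, Any]:
    """Validate inputs against the documented limits and build a JSON-ready dict."""
    if not text:
        raise ValueError("text must not be empty")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be in [0, {MAX_SEED}]")
    if text_normalization not in NORMALIZATION_MODES:
        raise ValueError("text_normalization must be one of auto, on, off")
    if len(locators) > MAX_LOCATORS:
        raise ValueError(f"at most {MAX_LOCATORS} pronunciation dictionary locators")

    # Field names below are assumptions, not confirmed by this section.
    payload: Dict[str, Any] = {
        "text": text,
        "voice_id": voice_id,
        "apply_text_normalization": text_normalization,
    }
    if seed is not None:
        payload["seed"] = seed
    if locators:
        # Order is preserved because locators are applied in order.
        payload["pronunciation_dictionary_locators"] = [
            {"id": dict_id, "version_id": version_id}
            for dict_id, version_id in locators
        ]
    return payload
```

Optional fields are left out of the payload when unset rather than sent as null, which keeps the request body minimal.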
Response
The generated audio file.
Format: binary
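Because the response is raw binary audio rather than JSON, the body should be written to disk in binary mode. The sketch below prepares a POST request with the documented headers and saves returned bytes. It rests on assumptions: the endpoint URL is a placeholder (this section does not state it), the header names Content-Type and Authorization are the standard HTTP names for the documented values, and build_request and save_audio are hypothetical helpers.

```python
import json
import urllib.request
from typing import Any, Dict

# Placeholder only: the real endpoint URL is not given in this section.
API_URL = "https://api.example.com/text-to-speech"


def build_request(api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request with Bearer authentication."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def save_audio(audio_bytes: bytes, path: str) -> None:
    """Write the binary response body to a file unchanged."""
    with open(path, "wb") as f:
        f.write(audio_bytes)
```

A request built this way could be sent with urllib.request.urlopen, and the bytes returned by the response's read() passed to save_audio.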