POST /v3/elevenlabs-tts-flash-v2
ElevenLabs Flash v2 Text to Speech
curl --request POST \
  --url https://api.myrouter.ai/v3/elevenlabs-tts-flash-v2 \
  --header 'Authorization: Bearer <API Key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "seed": 123,
  "text": "<string>",
  "voice_id": "<string>",
  "next_text": "<string>",
  "language_code": "<string>",
  "output_format": "<string>",
  "previous_text": "<string>",
  "use_pvc_as_ivc": true,
  "voice_settings": {
    "speed": 123,
    "style": 123,
    "stability": 123,
    "similarity_boost": 123,
    "use_speaker_boost": true
  },
  "next_request_ids": [
    {}
  ],
  "previous_request_ids": [
    {}
  ],
  "apply_text_normalization": "<string>",
  "apply_language_text_normalization": true,
  "pronunciation_dictionary_locators": [
    {
      "version_id": "<string>",
      "pronunciation_dictionary_id": "<string>"
    }
  ]
}
'
Converts text to speech using your chosen voice and returns audio.

Request Headers

Content-Type
string
required
Enum: application/json
Authorization
string
required
Bearer authentication format: Bearer {{API Key}}.
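
As a minimal Python sketch of the two required headers described above: the helper name `build_headers` is hypothetical and not part of the API.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the two required request headers for this endpoint.

    `build_headers` is an illustrative helper, not an official client function.
    """
    if not api_key:
        raise ValueError("API key must not be empty")
    return {
        # The only documented Content-Type value is application/json.
        "Content-Type": "application/json",
        # Bearer authentication: "Bearer " followed by the API key.
        "Authorization": f"Bearer {api_key}",
    }
```

Reading the key from an environment variable or a secrets store, rather than hard-coding it, keeps it out of source control.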

Request Body

seed
integer
If specified, the system will attempt to sample deterministically. Repeated requests with the same seed and parameters should return the same result, but full determinism is not guaranteed.
Range: [0, 4294967295]
text
string
required
The text to convert to speech.
voice_id
string
required
The voice ID to use.
next_text
string
The text that comes after the current request text. Used to improve speech continuity when concatenating multiple generations.
language_code
string
Language code (ISO 639-1) used for the model and text normalization. An error will be returned if the model does not support this language code.
output_format
string
default:"mp3_44100_128"
Output format for the generated audio. Format is codec_sample_rate_bitrate. The 192kbps bitrate for MP3 requires a Creator or higher account; the 44.1kHz sample rate for PCM requires a Pro or higher account.
Possible values: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64, mp3_44100_96, mp3_44100_128, mp3_44100_192, pcm_8000, pcm_16000, pcm_22050, pcm_24000, pcm_32000, pcm_44100, pcm_48000, ulaw_8000, alaw_8000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192
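
The codec_sample_rate_bitrate naming can be split apart on the client side, for example to choose a file extension or configure a decoder. The sketch below is illustrative only: `OutputFormat` and `parse_output_format` are hypothetical names, and treating the two-part values (such as pcm_16000 or ulaw_8000) as having no bitrate is an assumption drawn from the value list.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None for values without a bitrate part


def parse_output_format(value: str) -> OutputFormat:
    """Split an output_format string such as 'mp3_44100_128' into its parts."""
    parts = value.split("_")
    if len(parts) == 2:
        # e.g. pcm_16000, ulaw_8000: codec and sample rate only (assumed)
        codec, rate = parts
        return OutputFormat(codec, int(rate), None)
    if len(parts) == 3:
        # e.g. mp3_44100_128: codec, sample rate, bitrate
        codec, rate, bitrate = parts
        return OutputFormat(codec, int(rate), int(bitrate))
    raise ValueError(f"unexpected output_format: {value!r}")
```

For the default value, `parse_output_format("mp3_44100_128")` yields the codec `mp3`, a sample rate of 44100, and a bitrate of 128.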
previous_text
string
The text that comes before the current request text. Used to improve speech continuity when concatenating multiple generations.
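
When a long text is split into several requests, previous_text and next_text can carry the neighbouring chunks. The following sketch shows one way to build the request bodies; `build_chunk_payloads` is a hypothetical helper, and passing the whole adjacent chunk as context is an assumption, not a documented requirement.

```python
def build_chunk_payloads(chunks: list[str], voice_id: str) -> list[dict]:
    """Build one request body per chunk, with neighbouring chunks as context."""
    payloads = []
    for i, text in enumerate(chunks):
        payload = {"text": text, "voice_id": voice_id}
        if i > 0:
            # Text that was spoken just before this chunk.
            payload["previous_text"] = chunks[i - 1]
        if i < len(chunks) - 1:
            # Text that will be spoken right after this chunk.
            payload["next_text"] = chunks[i + 1]
        payloads.append(payload)
    return payloads
```

The first payload has no previous_text and the last has no next_text, since both fields are optional.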
use_pvc_as_ivc
boolean
default:false
If true, uses the IVC version of the voice instead of the PVC version. This is a temporary workaround for higher latency with the PVC version.
voice_settings
object
next_request_ids
array
List of request_ids for subsequent samples. Used to maintain speech continuity when regenerating samples. Up to 3 request_ids can be provided.
Array length: 0 - 3
previous_request_ids
array
List of request_ids for previously generated samples before the current generation. Can be used to improve speech continuity. Up to 3 request_ids can be provided.
Array length: 0 - 3
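
Because at most 3 request_ids are accepted, a client generating a long sequence needs to keep only the most recent ones. The sketch below uses a bounded deque for that. `RequestIdHistory` is a hypothetical class, request IDs are assumed to be plain strings, and how a request ID is obtained from a response is not covered on this page.

```python
from collections import deque

MAX_REQUEST_IDS = 3  # documented array length limit


class RequestIdHistory:
    """Keep the most recent request IDs for use as previous_request_ids."""

    def __init__(self) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._ids: deque = deque(maxlen=MAX_REQUEST_IDS)

    def record(self, request_id: str) -> None:
        self._ids.append(request_id)

    def as_previous_request_ids(self) -> list[str]:
        # Oldest first, newest last.
        return list(self._ids)
```

After recording four IDs, only the last three remain, so the list always fits the documented limit.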
apply_text_normalization
string
default:"auto"
Controls text normalization. ‘auto’ lets the system decide, ‘on’ always normalizes, ‘off’ skips normalization.
Possible values: auto, on, off
apply_language_text_normalization
boolean
default:false
Controls language-specific text normalization for certain supported languages to achieve more natural pronunciation. Warning: may significantly increase latency. Currently only supports Japanese.
pronunciation_dictionary_locators
array
List of pronunciation dictionary locators (id, version_id) to apply to the text. Applied in order. Up to 3 locators per request.
Array length: 0 - 3
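
Several of the constraints listed above (required fields, the seed range, the normalization enum, and the array length limits) can be checked before a request is sent. This is a rough client-side sketch: `validate_payload` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
SEED_MAX = 4294967295
NORMALIZATION_VALUES = {"auto", "on", "off"}
MAX_ARRAY_LENGTH = 3
LIMITED_ARRAYS = (
    "next_request_ids",
    "previous_request_ids",
    "pronunciation_dictionary_locators",
)


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    for field in ("text", "voice_id"):
        if not isinstance(payload.get(field), str):
            errors.append(f"{field} is required and must be a string")
    seed = payload.get("seed")
    if seed is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(seed, int) and not isinstance(seed, bool)
        if not is_int or not 0 <= seed <= SEED_MAX:
            errors.append(f"seed must be an integer in [0, {SEED_MAX}]")
    mode = payload.get("apply_text_normalization")
    if mode is not None and mode not in NORMALIZATION_VALUES:
        errors.append("apply_text_normalization must be auto, on, or off")
    for field in LIMITED_ARRAYS:
        items = payload.get(field)
        if items is not None and len(items) > MAX_ARRAY_LENGTH:
            errors.append(f"{field} accepts at most {MAX_ARRAY_LENGTH} items")
    return errors
```

Returning all problems at once, rather than raising on the first, makes it easier to fix a payload in one pass.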

Response

The generated audio file.
Format: binary
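
Since the response body is binary audio, it should be written to disk in binary mode rather than decoded as text. The sketch below uses only the Python standard library. `build_request` and `synthesize_to_file` are hypothetical helper names, the 60-second timeout is an arbitrary choice, and error handling for non-2xx responses is left out.

```python
import json
import urllib.request

API_URL = "https://api.myrouter.ai/v3/elevenlabs-tts-flash-v2"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize_to_file(payload: dict, api_key: str, path: str) -> int:
    """Send the request and write the binary audio to `path`.

    Returns the number of bytes written. urlopen raises HTTPError on
    non-2xx responses; callers may want to catch it.
    """
    req = build_request(payload, api_key)
    with urllib.request.urlopen(req, timeout=60) as resp:
        audio = resp.read()
    with open(path, "wb") as out:  # binary mode: the body is audio, not text
        out.write(audio)
    return len(audio)
```

A call might look like `synthesize_to_file({"text": "Hello", "voice_id": "<voice id>"}, api_key, "hello.mp3")`, where the `.mp3` extension matches the default output_format.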