ElevenLabs Flash v2 Text to Speech

curl --request POST \
  --url https://api.myrouter.ai/v3/elevenlabs-tts-flash-v2 \
  --header 'Authorization: Bearer <API Key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "seed": 123,
  "text": "<string>",
  "voice_id": "<string>",
  "next_text": "<string>",
  "language_code": "<string>",
  "output_format": "<string>",
  "previous_text": "<string>",
  "use_pvc_as_ivc": true,
  "voice_settings": {
    "speed": 123,
    "style": 123,
    "stability": 123,
    "similarity_boost": 123,
    "use_speaker_boost": true
  },
  "next_request_ids": [
    {}
  ],
  "previous_request_ids": [
    {}
  ],
  "apply_text_normalization": "<string>",
  "apply_language_text_normalization": true,
  "pronunciation_dictionary_locators": [
    {
      "version_id": "<string>",
      "pronunciation_dictionary_id": "<string>"
    }
  ]
}
'

POST

elevenlabs-tts-flash-v2

Converts text to speech using your chosen voice and returns audio.

Request Headers

Content-Type

string

required

Enum: application/json

Authorization

string

required

Bearer authentication format: Bearer {{API Key}}.
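As a minimal sketch of the two required headers, the following builds (but does not send) a request with Python's standard library. The helper name `build_request` and the placeholder key are illustrative assumptions; the URL and header values are taken from this page.

```python
import json
import urllib.request

API_URL = "https://api.myrouter.ai/v3/elevenlabs-tts-flash-v2"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare a POST request with the two required headers (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Bearer authentication format: "Bearer <API Key>"
            "Authorization": f"Bearer {api_key}",
            # The only documented Content-Type value
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Nothing is sent until the request object is passed to `urllib.request.urlopen`, so the headers and body can be inspected first.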

Request Body

seed

integer

If specified, the system will attempt to sample deterministically. Repeated requests with the same seed and parameters should return the same result, but full determinism is not guaranteed. Range: [0, 4294967295]
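The documented range is that of an unsigned 32-bit integer. A small client-side check, sketched below, can reject out-of-range seeds before a request is made. The helper name `with_seed` is an assumption for illustration and is not part of the API.

```python
MAX_SEED = 4294967295  # 2**32 - 1, the documented upper bound


def with_seed(body: dict, seed: int) -> dict:
    """Return a copy of the request body with a validated seed (hypothetical helper)."""
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError("seed must be an integer")
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be in [0, {MAX_SEED}]")
    return {**body, "seed": seed}
```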

text

string

required

The text to convert to speech.

voice_id

string

required

The voice ID to use.

next_text

string

The text that comes after the current request text. Used to improve speech continuity when concatenating multiple generations.

language_code

string

Language code (ISO 639-1) used for the model and text normalization. An error will be returned if the model does not support this language code.

output_format

string

default:"mp3_44100_128"

Output format for the generated audio. Format is codec_sample_rate_bitrate. The 192kbps bitrate for MP3 requires a Creator or higher account; the 44.1kHz sample rate for PCM requires a Pro or higher account. Possible values: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64, mp3_44100_96, mp3_44100_128, mp3_44100_192, pcm_8000, pcm_16000, pcm_22050, pcm_24000, pcm_32000, pcm_44100, pcm_48000, ulaw_8000, alaw_8000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192
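To illustrate the codec_sample_rate_bitrate naming, the sketch below validates a value against the list above and splits it into its parts. The function name is hypothetical. The PCM, u-law, and A-law values carry no bitrate component, so the third element is returned as None for them.

```python
OUTPUT_FORMATS = frozenset(
    "mp3_22050_32 mp3_24000_48 mp3_44100_32 mp3_44100_64 mp3_44100_96 "
    "mp3_44100_128 mp3_44100_192 pcm_8000 pcm_16000 pcm_22050 pcm_24000 "
    "pcm_32000 pcm_44100 pcm_48000 ulaw_8000 alaw_8000 opus_48000_32 "
    "opus_48000_64 opus_48000_96 opus_48000_128 opus_48000_192".split()
)


def parse_output_format(value: str):
    """Split an output_format value into (codec, sample_rate_hz, bitrate_kbps or None)."""
    if value not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {value!r}")
    parts = value.split("_")
    codec = parts[0]
    sample_rate = int(parts[1])
    # Only the mp3 and opus values include a bitrate component.
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return codec, sample_rate, bitrate
```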

previous_text

string

The text that comes before the current request text. Used to improve speech continuity when concatenating multiple generations.
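When a long text is split into several requests, previous_text and next_text can carry the neighbouring chunks. The sketch below builds one request body per chunk under that reading. The helper name and the choice to pass only the immediately adjacent chunk are assumptions, not documented requirements.

```python
def build_chunk_payloads(chunks: list, voice_id: str) -> list:
    """Build one request body per chunk, attaching neighbouring text for continuity."""
    payloads = []
    for i, chunk in enumerate(chunks):
        body = {"text": chunk, "voice_id": voice_id}
        if i > 0:
            # Text spoken just before this chunk
            body["previous_text"] = chunks[i - 1]
        if i < len(chunks) - 1:
            # Text that will be spoken right after this chunk
            body["next_text"] = chunks[i + 1]
        payloads.append(body)
    return payloads
```

The first chunk gets no previous_text and the last chunk gets no next_text, since both fields are optional.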

use_pvc_as_ivc

boolean

default:false

If true, uses the IVC (Instant Voice Cloning) version of the voice instead of the PVC (Professional Voice Cloning) version. This is a temporary workaround for higher latency with the PVC version.

voice_settings

object

Properties of voice_settings:

speed

number

default:1

Adjusts the speed of the voice. 1.0 is the default speed; values below 1.0 slow down the speech, values above 1.0 speed it up.

style

number

default:0

Controls the exaggeration of the voice style. Attempts to amplify the style of the original speaker. Setting to a non-zero value consumes more compute resources and may increase latency.

stability

number

Controls the stability of voice generation and the randomness between each generation. Lower values produce a wider emotional range; higher values may result in a more monotone voice.

similarity_boost

number

Controls how closely the AI attempts to replicate the original voice.

use_speaker_boost

boolean

default:true

Enhances similarity to the original speaker. Requires slightly more compute and increases latency.
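A voice_settings object could be assembled as in the sketch below. The defaults for speed, style, and use_speaker_boost are the ones listed above. No defaults are documented here for stability and similarity_boost, so this hypothetical helper omits them unless the caller supplies values, and it does not check ranges because none are stated on this page.

```python
def make_voice_settings(speed=1.0, style=0.0, stability=None,
                        similarity_boost=None, use_speaker_boost=True) -> dict:
    """Build a voice_settings object, leaving out fields with no documented default."""
    settings = {
        "speed": speed,                          # 1.0 is normal speed
        "style": style,                          # non-zero values may add latency
        "use_speaker_boost": use_speaker_boost,  # adds some latency when true
    }
    if stability is not None:
        settings["stability"] = stability
    if similarity_boost is not None:
        settings["similarity_boost"] = similarity_boost
    return settings
```

For latency-sensitive use, the descriptions above suggest keeping style at 0 and considering use_speaker_boost=False.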

next_request_ids

array

List of request_ids for subsequent samples. Used to maintain speech continuity when regenerating samples. Up to 3 request_ids can be provided. Array length: 0 - 3

previous_request_ids

array

List of request_ids for samples generated before the current one. Can be used to improve speech continuity. Up to 3 request_ids can be provided. Array length: 0 - 3
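Because both arrays accept at most 3 entries, a client that tracks earlier generations may want to send only the most recent ones. The sketch below assumes request_ids are strings collected by the caller. How a request_id is obtained from a response is not described on this page, so that part is left out.

```python
MAX_REQUEST_IDS = 3  # documented array length: 0 - 3


def latest_request_ids(history: list) -> list:
    """Return at most the last three request_ids, oldest first (hypothetical helper)."""
    return list(history[-MAX_REQUEST_IDS:])
```

The result could be passed as previous_request_ids in the next request body.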

apply_text_normalization

string

default:"auto"

Controls text normalization. ‘auto’ lets the system decide, ‘on’ always normalizes, ‘off’ skips normalization. Possible values: auto, on, off

apply_language_text_normalization

boolean

default:false

Controls language-specific text normalization for certain supported languages to achieve more natural pronunciation. Warning: may significantly increase latency. Currently only supports Japanese.
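The two normalization fields can be combined as in the following sketch. It assumes that language-specific normalization should only be requested together with language_code set to "ja" (the ISO 639-1 code for Japanese). That pairing is a client-side guard inferred from the description above, not a documented server rule, and the function name is hypothetical.

```python
TEXT_NORMALIZATION_MODES = {"auto", "on", "off"}


def normalization_options(mode="auto", language_code=None,
                          language_normalization=False) -> dict:
    """Build the normalization-related fields of the request body."""
    if mode not in TEXT_NORMALIZATION_MODES:
        raise ValueError(f"apply_text_normalization must be one of {sorted(TEXT_NORMALIZATION_MODES)}")
    options = {"apply_text_normalization": mode}
    if language_code is not None:
        options["language_code"] = language_code
    if language_normalization:
        # Assumption: only enable for Japanese, the sole language listed as supported.
        if language_code != "ja":
            raise ValueError("language-specific normalization is documented only for Japanese")
        options["apply_language_text_normalization"] = True
    return options
```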

pronunciation_dictionary_locators

array

List of pronunciation dictionary locators (id, version_id) to apply to the text. Applied in order. Up to 3 locators per request. Array length: 0 - 3

Properties of each locator:

version_id

string

The ID of the pronunciation dictionary version. If not specified, the latest version will be used.

pronunciation_dictionary_id

string

required

The ID of the pronunciation dictionary.
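The locator list could be assembled as sketched below, with version_id left out when the latest version is wanted. The helper name and the (dictionary_id, version_id) tuple input are illustrative assumptions; the field names and the limit of 3 come from the descriptions above.

```python
MAX_LOCATORS = 3  # documented array length: 0 - 3


def build_locators(dictionaries) -> list:
    """Turn (dictionary_id, version_id or None) pairs into locator objects, in order."""
    locators = []
    for dictionary_id, version_id in dictionaries:
        locator = {"pronunciation_dictionary_id": dictionary_id}
        if version_id is not None:
            # Omitting version_id means the latest version is used.
            locator["version_id"] = version_id
        locators.append(locator)
    if len(locators) > MAX_LOCATORS:
        raise ValueError(f"at most {MAX_LOCATORS} locators per request")
    return locators
```

Since locators are applied in order, the input order is preserved.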

Response

The generated audio file. Format: binary
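Since the response body is the raw audio, a client can write it straight to disk. The sketch below does this with Python's standard library. The function name and the injectable `opener` parameter are assumptions added to keep the example testable without network access.

```python
import urllib.request
from pathlib import Path


def synthesize_to_file(request, out_path, opener=urllib.request.urlopen) -> int:
    """Send a prepared request and save the binary audio response; returns bytes written."""
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with opener(request) as response:
        audio = response.read()
    Path(out_path).write_bytes(audio)
    return len(audio)
```

The file extension should match the requested output_format, for example .mp3 for the default mp3_44100_128.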
