POST /v3/minimax-speech-2.8-hd
MiniMax Speech 2.8 HD Sync Text-to-Speech
curl --request POST \
  --url https://api.myrouter.ai/v3/minimax-speech-2.8-hd \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "text": "<string>",
  "stream": true,
  "voice_modify": {
    "pitch": 123,
    "timbre": 123,
    "intensity": 123,
    "sound_effects": "<string>"
  },
  "audio_setting": {
    "format": "<string>",
    "bitrate": 123,
    "channel": 123,
    "force_cbr": true,
    "sample_rate": 123
  },
  "output_format": "<string>",
  "voice_setting": {
    "vol": 123,
    "pitch": 123,
    "speed": 123,
    "emotion": "<string>",
    "voice_id": "<string>",
    "latex_read": true,
    "text_normalization": true
  },
  "aigc_watermark": true,
  "language_boost": "<string>",
  "stream_options": {
    "exclude_aggregated_audio": true
  },
  "timber_weights": [
    {
      "weight": 123,
      "voice_id": "<string>"
    }
  ],
  "subtitle_enable": true,
  "continuous_sound": true,
  "pronunciation_dict": {
    "tone": [
      {}
    ]
  }
}
'
{
  "data": {},
  "trace_id": "<string>",
  "base_resp": {},
  "extra_info": {}
}
Converts text to speech, with support for multiple voices, emotion control, speed adjustment, and more. The text must be shorter than 10,000 characters. For text longer than 3,000 characters, streaming output is recommended.

Request Headers

Content-Type
string
required
Enum: application/json
Authorization
string
required
Bearer authentication format: Bearer {{API Key}}.
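
As a minimal sketch of how the two required headers fit together, the following builds a POST request for this endpoint without sending it. The function name `build_request` and the sample key are hypothetical; the URL, header names, and Bearer format come from this page.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example on this page.
API_URL = "https://api.myrouter.ai/v3/minimax-speech-2.8-hd"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the required headers.

    `build_request` is a hypothetical helper, not part of any official SDK.
    """
    headers = {
        "Content-Type": "application/json",    # the only documented enum value
        "Authorization": f"Bearer {api_key}",  # Bearer {{API Key}} format
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(API_URL, data=data, headers=headers, method="POST")
```

The request object can then be passed to `urllib.request.urlopen` once a real API key is supplied.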

Request Body

text
string
required
The text to synthesize. Must be shorter than 10,000 characters. For text longer than 3,000 characters, streaming output is recommended. Supports paragraph breaks (newline characters), pause control (<#x#> markers), and interjection tags such as (laughs) and (coughs). Interjection tags are supported only by speech-2.8-hd and speech-2.8-turbo.
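
The length rules for text can be checked on the client before a request is sent. The sketch below is one possible approach: it rejects text at or above the 10,000-character limit and turns on stream above 3,000 characters. The helper name is hypothetical, and counting characters with Python's `len` (Unicode code points) is an assumption, since this page does not say how the service counts characters.

```python
MAX_CHARS = 10_000                # text must be shorter than this
STREAM_RECOMMENDED_ABOVE = 3_000  # streaming is recommended above this


def text_request_fields(text: str) -> dict:
    """Return the `text` and `stream` fields for a request body.

    Hypothetical helper. Assumes the service counts characters the same
    way as Python's len(), which this page does not confirm.
    """
    if len(text) >= MAX_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; it must be shorter than {MAX_CHARS}"
        )
    # Follow the recommendation: stream only for longer inputs.
    return {"text": text, "stream": len(text) > STREAM_RECOMMENDED_ABOVE}
```

Streaming stays off for short inputs here because false is the documented default for stream.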
stream
boolean
default:false
Controls whether streaming output is enabled. Default: false (streaming disabled).
voice_modify
object
audio_setting
object
output_format
string
default:"hex"
Controls the output format. Possible values: url, hex. Default: hex. This parameter takes effect only for non-streaming requests; streaming requests support only hex output. A returned URL is valid for 24 hours.
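
With hex output, the audio arrives as a hexadecimal string that has to be decoded into raw bytes before it can be saved or played. A minimal sketch follows. The helper name is hypothetical, and this page does not say which response field carries the hex string, so the caller is assumed to have extracted it already.

```python
def hex_audio_to_bytes(hex_audio: str) -> bytes:
    """Decode hex-encoded audio into raw bytes.

    Hypothetical helper. The location of the hex string inside the
    response is not documented on this page, so it is passed in directly.
    """
    # bytes.fromhex raises ValueError on malformed input.
    return bytes.fromhex(hex_audio)
```

For example, `hex_audio_to_bytes("494433")` returns `b"ID3"`. The decoded bytes can be written to a file opened in binary mode.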
voice_setting
object
aigc_watermark
boolean
default:false
Controls whether to add an audio rhythm identifier at the end of the synthesized audio. Default: false. This parameter only applies to non-streaming synthesis.
language_boost
string
Whether to enhance recognition of specified minority languages and dialects. Default: null. Can be set to auto to let the model automatically determine the language type. Possible values: Chinese, Chinese,Yue, English, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi, Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans, auto
stream_options
object
timber_weights
array
Mixed voice settings. Supports mixing up to 4 voices.
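
The four-voice limit can be enforced when the array is built. The sketch below turns a mapping of voice IDs to weights into the array shape shown in the request example (objects with voice_id and weight). The helper name and sample voice IDs are hypothetical, and no weight range is validated because this page does not state one.

```python
MAX_MIXED_VOICES = 4  # "Supports mixing up to 4 voices."


def build_timber_weights(weights: dict) -> list:
    """Build the `timber_weights` array from a {voice_id: weight} mapping.

    Hypothetical helper. The valid range of `weight` is not documented
    on this page, so it is passed through unchecked.
    """
    if not weights:
        raise ValueError("at least one voice is required")
    if len(weights) > MAX_MIXED_VOICES:
        raise ValueError(f"at most {MAX_MIXED_VOICES} voices can be mixed")
    return [{"voice_id": vid, "weight": w} for vid, w in weights.items()]
```

For example, `build_timber_weights({"voice-a": 60, "voice-b": 40})` yields two entries in insertion order, where `voice-a` and `voice-b` stand in for real voice IDs.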
subtitle_enable
boolean
default:false
Controls whether to enable the subtitle service. Default: false. This parameter takes effect only for non-streaming output, and only with the speech-2.6-hd, speech-2.6-turbo, speech-02-turbo, speech-02-hd, speech-01-turbo, and speech-01-hd models.
continuous_sound
boolean
default:false
Enable this parameter to make clause transitions more natural. Only supported for speech-2.8-hd and speech-2.8-turbo models.
pronunciation_dict
object
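
Three of the parameters above (output_format, aigc_watermark, subtitle_enable) are documented as taking effect only for non-streaming requests. The following sketch lists which of them a request body sets in a way that streaming would not honor. The helper name is hypothetical, and whether the service rejects or silently ignores such combinations is not stated on this page.

```python
def ignored_when_streaming(body: dict) -> list:
    """List non-streaming-only parameters set in a streaming request body.

    Hypothetical helper based on the parameter descriptions on this page.
    """
    if not body.get("stream", False):
        return []  # non-streaming request: nothing conflicts
    ignored = []
    # Streaming supports only hex output, so "url" cannot take effect.
    if body.get("output_format", "hex") == "url":
        ignored.append("output_format")
    # Both flags default to false and apply only to non-streaming synthesis.
    for flag in ("aigc_watermark", "subtitle_enable"):
        if body.get(flag, False):
            ignored.append(flag)
    return ignored
```

A caller might log the returned names as a warning before sending the request.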

Response

data
object
The returned synthesis data object. May be null, so check for null before using it.
trace_id
string
The session ID for this request, used to help locate issues during inquiries or feedback.
base_resp
object
The status code and details of this request.
extra_info
object
Additional information about the audio.
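
Because data may be null, a client should check it before reading any audio. The sketch below is one way to do that on an already-parsed response. It raises an error that carries trace_id and base_resp, so the failure can be reported with the information this page says is used to locate issues. The names `SynthesisError` and `require_data` are hypothetical, and the inner fields of base_resp are not documented here, so the object is included as-is.

```python
class SynthesisError(Exception):
    """Raised when a response carries no synthesis data (hypothetical type)."""


def require_data(response: dict) -> dict:
    """Return the `data` object, or raise if it is null or missing."""
    data = response.get("data")
    if data is None:
        # Keep trace_id and base_resp in the message for later inquiries.
        raise SynthesisError(
            f"no synthesis data (trace_id={response.get('trace_id')!r}, "
            f"base_resp={response.get('base_resp')!r})"
        )
    return data
```

Here `response` is the JSON body parsed into a dict, for example with `json.loads`.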