
Speak text

POST /calls/:call_control_id/actions/speak

Convert text to speech and play it back on the call. If multiple speak text commands are issued consecutively, the audio files will be placed in a queue awaiting playback.

Expected Webhooks (see callback schema below):

  • call.speak.started
  • call.speak.ended
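As an illustration of how an application might react to these two events, here is a minimal Python sketch of a webhook dispatcher. The callback schema is not reproduced in this section, so the envelope used here (an `event_type` and a `payload.call_control_id` nested under `data`) is an assumption, and `handle_speak_webhook` is a hypothetical helper name.

```python
import json


def handle_speak_webhook(raw_body: bytes) -> str:
    """Classify a webhook body as speak started, speak ended, or ignored.

    Assumption: the event name is at data.event_type and the call is
    identified by data.payload.call_control_id.
    """
    event = json.loads(raw_body)
    data = event.get("data", {})
    event_type = data.get("event_type")
    call_control_id = data.get("payload", {}).get("call_control_id")

    if event_type == "call.speak.started":
        return f"speak started on {call_control_id}"
    if event_type == "call.speak.ended":
        # A natural place to issue the next command for this call.
        return f"speak ended on {call_control_id}"
    return "ignored"
```

Returning a string keeps the sketch easy to test; a real handler would more likely trigger the next call control command when `call.speak.ended` arrives.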

Request

Path Parameters

    call_control_id string required

    Unique identifier and token for controlling the call

Body

required

Speak request

    payload string required

    The text or SSML to be converted into speech. There is a 3,000 character limit.

    payload_type string

    Possible values: [text, ssml]

    Default value: text

    The type of the provided payload. The payload can either be plain text, or Speech Synthesis Markup Language (SSML).

    service_level string

    Possible values: [basic, premium]

    Default value: premium

    This parameter impacts speech quality, language options and payload types. When using basic, only the en-US language and payload type text are allowed.
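The constraints on payload, payload_type, and service_level described so far can be checked on the client before a request is sent. The following Python sketch is one way to do that; `build_speak_body` is a hypothetical helper, not part of any SDK, and it assumes the 3,000 character limit can be approximated with Python's `len()`.

```python
MAX_PAYLOAD_CHARS = 3000  # limit stated for the payload field


def build_speak_body(payload, voice, payload_type="text",
                     service_level="premium", language=None):
    """Return a request body dict, raising ValueError on documented violations."""
    if len(payload) > MAX_PAYLOAD_CHARS:
        raise ValueError("payload exceeds the 3,000 character limit")
    if payload_type not in ("text", "ssml"):
        raise ValueError("payload_type must be 'text' or 'ssml'")
    if service_level not in ("basic", "premium"):
        raise ValueError("service_level must be 'basic' or 'premium'")

    # basic only allows plain text and the en-US language.
    if service_level == "basic":
        if payload_type != "text":
            raise ValueError("service_level 'basic' only allows payload_type 'text'")
        if language not in (None, "en-US"):
            raise ValueError("service_level 'basic' only allows language 'en-US'")

    body = {
        "payload": payload,
        "voice": voice,
        "payload_type": payload_type,
        "service_level": service_level,
    }
    if language is not None:
        body["language"] = language
    return body
```

For example, `build_speak_body("Hello", "female", service_level="basic", language="en-US")` returns a body dict, while passing `payload_type="ssml"` together with `service_level="basic"` raises `ValueError`.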

    stop string

    Stops audio that is currently playing before the new payload is spoken. Specify current to stop the current audio file and play the next file in the queue. Specify all to stop the current audio file and also clear all audio files from the queue.

    voice string required

    Specifies the voice used in speech synthesis.

    • Define voices using the format <Provider>.<Model>.<VoiceId>. Specifying only the provider will give default values for voice_id and model_id.

    Supported Providers:

    • AWS: Use AWS.Polly.<VoiceId> (e.g., AWS.Polly.Joanna). For neural voices, which provide more realistic, human-like speech, append -Neural to the VoiceId (e.g., AWS.Polly.Joanna-Neural). Check the available voices for compatibility.
    • Azure: Use Azure.<VoiceId> (e.g., Azure.en-CA-ClaraNeural, Azure.en-CA-LiamNeural, Azure.en-US-BrianMultilingualNeural, Azure.en-US-AvaMultilingualNeural). For a complete list of voices, go to Azure Voice Gallery.
    • ElevenLabs: Use ElevenLabs.<ModelId>.<VoiceId> (e.g., ElevenLabs.eleven_multilingual_v2.21m00Tcm4TlvDq8ikWAM). The ModelId part is optional. To use ElevenLabs, you must provide your ElevenLabs API key as an integration identifier secret in "voice_settings": {"api_key_ref": "<secret_identifier>"}. See integration secrets documentation for details. Check available voices.

    For service_level basic, you may define the gender of the speaker (male or female).
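The voice naming scheme above can be assembled programmatically. The sketch below is illustrative only: `format_voice` and `elevenlabs_fields` are hypothetical helper names, and the secret identifier passed in is whatever name the ElevenLabs API key was stored under as an integration secret.

```python
def format_voice(provider, voice_id=None, model=None):
    """Join the parts as <Provider>.<Model>.<VoiceId>, skipping omitted parts."""
    parts = [provider, model, voice_id]
    return ".".join(p for p in parts if p)


def elevenlabs_fields(voice_id, secret_identifier, model_id=None):
    """Return the voice and voice_settings fields for an ElevenLabs voice."""
    return {
        "voice": format_voice("ElevenLabs", voice_id, model_id),
        # api_key_ref holds the identifier of the stored secret, not the key itself.
        "voice_settings": {"api_key_ref": secret_identifier},
    }


aws_voice = format_voice("AWS", "Joanna-Neural", model="Polly")
azure_voice = format_voice("Azure", "en-CA-ClaraNeural")
eleven = elevenlabs_fields("21m00Tcm4TlvDq8ikWAM", "my_elevenlabs_key",
                           model_id="eleven_multilingual_v2")
```

Here `aws_voice` is `AWS.Polly.Joanna-Neural`, `azure_voice` is `Azure.en-CA-ClaraNeural`, and `my_elevenlabs_key` is a placeholder secret identifier.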

    voice_settings object

    The settings associated with the selected voice.

    language string

    Possible values: [arb, cmn-CN, cy-GB, da-DK, de-DE, en-AU, en-GB, en-GB-WLS, en-IN, en-US, es-ES, es-MX, es-US, fr-CA, fr-FR, hi-IN, is-IS, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, ro-RO, ru-RU, sv-SE, tr-TR]

    The language you want spoken. This parameter is ignored when a Polly.* voice is specified.

    client_state string

    Use this field to add state to every subsequent webhook. It must be a valid Base64-encoded string.
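Because client_state must be Base64-encoded, applications usually encode their state before sending it and decode it when it comes back in a webhook. A minimal Python sketch follows, assuming the state is UTF-8 text; the helper names are hypothetical.

```python
import base64


def encode_client_state(state: str) -> str:
    """Encode application state as a Base64 string for client_state."""
    return base64.b64encode(state.encode("utf-8")).decode("ascii")


def decode_client_state(value: str) -> str:
    """Decode a client_state value back to text; raises on invalid Base64."""
    return base64.b64decode(value, validate=True).decode("utf-8")


encoded = encode_client_state("have a nice day =]")
decoded = decode_client_state(encoded)
```

The string used here encodes to `aGF2ZSBhIG5pY2UgZGF5ID1d`, the same client_state value that appears in the request sample.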

    command_id string

    Use this field to avoid duplicate commands. Telnyx will ignore any command with the same command_id for the same call_control_id.
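One way to use command_id is to generate a unique value once per logical command and reuse it on every retry, so that a retried request is ignored rather than spoken twice. The sketch below uses a random UUID; the UUID format is an assumption based on the request sample, not a stated requirement, and `new_command_id` is a hypothetical helper.

```python
import uuid


def new_command_id() -> str:
    """Return a fresh identifier to attach to one logical command."""
    return str(uuid.uuid4())


# Generate the id once, then send the same body again if a retry is needed.
command_id = new_command_id()
first_attempt = {"payload": "Say this on the call", "command_id": command_id}
retry_attempt = dict(first_attempt)  # same command_id, so it counts as a duplicate
```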

Responses

200: Successful response upon making a call control command.

default: Unexpected error

Callbacks

Request samples


curl -L 'https://api.telnyx.com/v2/calls/:call_control_id/actions/speak' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'Authorization: Bearer <TOKEN>' \
-d '{
"payload": "Say this on the call",
"payload_type": "text",
"service_level": "basic",
"stop": "current",
"voice": "female",
"language": "en-US",
"client_state": "aGF2ZSBhIG5pY2UgZGF5ID1d",
"command_id": "891510ac-f3e4-11e8-af5b-de00688a4901"
}'

Response samples


{
"data": {
"result": "ok"
}
}
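The curl sample can be mirrored in Python using only the standard library. The sketch below builds the request without sending it; `build_speak_request` is a hypothetical helper, and percent-encoding the call_control_id in the path is a defensive assumption rather than a documented requirement.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telnyx.com/v2"


def build_speak_request(call_control_id: str, token: str, body: dict):
    """Build (but do not send) the POST request for the speak command."""
    encoded_id = urllib.parse.quote(call_control_id, safe="")
    url = f"{API_BASE}/calls/{encoded_id}/actions/speak"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_speak_request(
    "example-call-control-id",  # placeholder value
    "<TOKEN>",                  # placeholder API token
    {"payload": "Say this on the call", "voice": "AWS.Polly.Joanna"},
)
```

Passing `req` to `urllib.request.urlopen` would send it; a successful call should return a body like the response sample above.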