Speak text
POST /calls/:call_control_id/actions/speak
Convert text to speech and play it back on the call. If multiple speak text commands are issued consecutively, the audio files will be placed in a queue awaiting playback.
Expected Webhooks (see callback schema below):
call.speak.started
call.speak.ended
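As a rough illustration of consuming these two webhooks, the sketch below classifies an incoming webhook body by event name. It assumes the event name is carried at data.event_type in the JSON body; that location is an assumption here and should be checked against the callback schema. The function name classify_speak_webhook is hypothetical.

```python
import json

# Event names taken from the "Expected Webhooks" list above.
SPEAK_EVENTS = ("call.speak.started", "call.speak.ended")


def classify_speak_webhook(raw_body: str) -> str:
    """Return 'started', 'ended', or 'other' for a raw webhook body.

    Assumption: the event name lives at data.event_type.
    Verify this against the callback schema before relying on it.
    """
    event = json.loads(raw_body)
    event_type = event.get("data", {}).get("event_type", "")
    if event_type in SPEAK_EVENTS:
        # "call.speak.ended" -> "ended"
        return event_type.rsplit(".", 1)[1]
    return "other"
```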
Request
Path Parameters
call_control_id
Unique identifier and token for controlling the call
- application/json
Body
required
Speak request
- Define voices using the format <Provider>.<Model>.<VoiceId>. Specifying only the provider will give default values for voice_id and model_id.
- AWS: Use AWS.Polly.<VoiceId> (e.g., AWS.Polly.Joanna). For neural voices, which provide more realistic, human-like speech, append -Neural to the VoiceId (e.g., AWS.Polly.Joanna-Neural). Check the available voices for compatibility.
- Azure: Use the Azure. prefix followed by the voice name (e.g., Azure.en-CA-ClaraNeural, Azure.en-CA-LiamNeural, Azure.en-US-BrianMultilingualNeural, Azure.en-US-AvaMultilingualNeural). For a complete list of voices, go to Azure Voice Gallery.
- ElevenLabs: Use ElevenLabs.<ModelId>.<VoiceId> (e.g., ElevenLabs.eleven_multilingual_v2.21m00Tcm4TlvDq8ikWAM). The ModelId part is optional. To use ElevenLabs, you must provide your ElevenLabs API key as an integration identifier secret in "voice_settings": {"api_key_ref": "<secret_identifier>"}. See integration secrets documentation for details. Check available voices.
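The voice formats above can be assembled programmatically. The following is a minimal sketch, not part of any official SDK: the helper names are hypothetical, and the ElevenLabs form without a model (ElevenLabs.<VoiceId>) is an assumption based on the statement that the ModelId part is optional.

```python
from typing import Optional


def polly_voice(voice_id: str, neural: bool = False) -> str:
    """Build an AWS voice string, appending -Neural for neural voices."""
    suffix = "-Neural" if neural else ""
    return f"AWS.Polly.{voice_id}{suffix}"


def elevenlabs_voice(voice_id: str, model_id: Optional[str] = None) -> str:
    """Build an ElevenLabs voice string.

    Assumption: when model_id is omitted, the string is
    "ElevenLabs.<VoiceId>" (the docs only say ModelId is optional).
    """
    parts = ["ElevenLabs"]
    if model_id:
        parts.append(model_id)
    parts.append(voice_id)
    return ".".join(parts)


def elevenlabs_settings(secret_identifier: str) -> dict:
    """voice_settings object that references the stored API key secret."""
    return {"api_key_ref": secret_identifier}
```

For example, polly_voice("Joanna", neural=True) yields AWS.Polly.Joanna-Neural, matching the example in the list above.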
payload
The text or SSML to be converted into speech. There is a 3,000 character limit.
payload_type
Possible values: [text, ssml]
Default value: text
The type of the provided payload. The payload can be either plain text or Speech Synthesis Markup Language (SSML).
service_level
Possible values: [basic, premium]
Default value: premium
This parameter impacts speech quality, language options and payload types. When using basic, only the en-US language and payload type text are allowed.
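The constraints stated so far (the 3,000 character limit, the allowed payload types, and the restrictions of the basic service level) could be checked client-side before sending a request. This is only a sketch under those stated rules; the function name check_speak_request is hypothetical, and the API itself remains the authority on what is accepted.

```python
MAX_PAYLOAD_CHARS = 3000
PAYLOAD_TYPES = ("text", "ssml")
SERVICE_LEVELS = ("basic", "premium")


def check_speak_request(payload, payload_type="text",
                        service_level="premium", language=None):
    """Return a list of problems found; an empty list means none detected."""
    problems = []
    if len(payload) > MAX_PAYLOAD_CHARS:
        problems.append("payload exceeds 3,000 characters")
    if payload_type not in PAYLOAD_TYPES:
        problems.append(f"unknown payload_type: {payload_type}")
    if service_level not in SERVICE_LEVELS:
        problems.append(f"unknown service_level: {service_level}")
    if service_level == "basic":
        # basic allows only payload type text and the en-US language
        if payload_type != "text":
            problems.append("basic service_level requires payload_type text")
        if language is not None and language != "en-US":
            problems.append("basic service_level supports only en-US")
    return problems
```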
stop
When specified, it stops the current audio being played. Specify current to stop the current audio being played and play the next file in the queue. Specify all to stop the current audio file being played and also clear all audio files from the queue.
voice
Specifies the voice used in speech synthesis.
Supported Providers: AWS, Azure, ElevenLabs (see the voice format list above).
For service_level basic, you may define the gender of the speaker (male or female).
voice_settings
object
The settings associated with the voice selected
language
Possible values: [arb, cmn-CN, cy-GB, da-DK, de-DE, en-AU, en-GB, en-GB-WLS, en-IN, en-US, es-ES, es-MX, es-US, fr-CA, fr-FR, hi-IN, is-IS, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, ro-RO, ru-RU, sv-SE, tr-TR]
The language you want spoken. This parameter is ignored when a Polly.* voice is specified.
client_state
Use this field to add state to every subsequent webhook. It must be a valid Base-64 encoded string.
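Because this field must be valid Base-64, application state is typically encoded before sending and decoded when it comes back. A minimal sketch using only the Python standard library follows; the helper names are hypothetical, and treating the state as UTF-8 text is an assumption of this example.

```python
import base64


def encode_client_state(state: str) -> str:
    """Encode application state as a Base-64 string."""
    return base64.b64encode(state.encode("utf-8")).decode("ascii")


def decode_client_state(encoded: str) -> str:
    """Decode a Base-64 state string; raises an error on invalid input."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")
```

The value used in the request sample, aGF2ZSBhIG5pY2UgZGF5ID1d, decodes to "have a nice day =]".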
command_id
Use this field to avoid duplicate commands. Telnyx will ignore any command with the same command_id for the same call_control_id.
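One way to benefit from this deduplication is to generate a single identifier per logical command and reuse it on every retry, so a retried request is not spoken twice. The sketch below illustrates the idea; speak_with_retry and the send callable are hypothetical stand-ins for whatever HTTP client code is in use, and using a random UUID is a choice made for this example.

```python
import uuid
from typing import Callable


def speak_with_retry(send: Callable[[dict], bool], body: dict,
                     attempts: int = 3) -> bool:
    """Send a speak command, reusing one command_id across retries.

    `send` is a hypothetical callable that performs the request and
    returns True on success.
    """
    # Generate the id once so every retry carries the same value.
    request = dict(body, command_id=str(uuid.uuid4()))
    for _ in range(attempts):
        if send(request):
            return True
    return False
```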
Responses
200: Successful response upon making a call control command.
- application/json
default: Unexpected error
- application/json
Request samples
curl -L 'https://api.telnyx.com/v2/calls/:call_control_id/actions/speak' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <TOKEN>' \
  -d '{
    "payload": "Say this on the call",
    "payload_type": "text",
    "service_level": "basic",
    "stop": "current",
    "voice": "female",
    "language": "en-US",
    "client_state": "aGF2ZSBhIG5pY2UgZGF5ID1d",
    "command_id": "891510ac-f3e4-11e8-af5b-de00688a4901"
  }'
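For readers working in Python, the same request can be prepared with the standard library. This is a sketch rather than an official client: build_speak_request is a hypothetical helper, it only constructs the request object, and nothing is sent until the caller passes it to urllib.request.urlopen.

```python
import json
import urllib.request

API_BASE = "https://api.telnyx.com/v2"


def build_speak_request(call_control_id: str, token: str,
                        body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the speak endpoint."""
    url = f"{API_BASE}/calls/{call_control_id}/actions/speak"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Calling urllib.request.urlopen(build_speak_request(...)) would perform the call; on success the response body is expected to match the 200 sample.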
Response samples
200:
{
  "data": {
    "result": "ok"
  }
}
default:
{
  "errors": [
    {
      "code": "string",
      "title": "string",
      "detail": "string",
      "source": {
        "pointer": "string",
        "parameter": "string"
      },
      "meta": {}
    }
  ]
}
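The error body above is a list of objects with code, title, detail, and an optional source. A small helper can flatten it into readable messages for logging. This sketch relies only on the field names shown in the sample; summarize_errors is a hypothetical name, and the message layout is an arbitrary choice.

```python
def summarize_errors(response_body: dict) -> list:
    """Turn an error response into one human-readable line per error."""
    lines = []
    for err in response_body.get("errors", []):
        source = err.get("source") or {}
        # Prefer the JSON pointer, fall back to the parameter name.
        where = source.get("pointer") or source.get("parameter") or ""
        text = f"[{err.get('code', '?')}] {err.get('title', '')}: {err.get('detail', '')}"
        if where:
            text += f" (at {where})"
        lines.append(text)
    return lines
```

A successful response, which has no errors key, produces an empty list.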