
Embed documents

POST /ai/embeddings

Embed the documents in a Telnyx Storage bucket using an embedding model. The currently supported file types are:

  • PDF
  • HTML
  • TXT / unstructured text files
  • JSON
  • CSV
  • audio / video (mp3, mp4, mpeg, mpga, m4a, wav, or webm), with a maximum file size of 100 MB

For any file that does not match the types above, the service attempts to embed it as unstructured text.

Embedding can be slow, so it runs as a background task. You can check the status of the task using the endpoint /ai/embeddings/{task_id}.
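Because the task runs in the background, a client typically polls the status endpoint until the task finishes. The following is a minimal sketch of such a loop, not an official client. It assumes the status endpoint shares the https://api.telnyx.com/v2 base URL used in the request sample, that the status response wraps the task in a "data" object like the POST response does, and that the terminal status names are "success" and "failure". The source does not list the actual status values, so treat those names as placeholders. The `fetch_status` callable is hypothetical and is injected so that you choose the HTTP client and authentication.

```python
import time
from typing import Callable, Dict

# Assumption: placeholder terminal status names. The documentation above does
# not enumerate the values that the "status" field can take.
HYPOTHETICAL_TERMINAL_STATUSES = {"success", "failure"}


def status_url(task_id: str, base_url: str = "https://api.telnyx.com/v2") -> str:
    """Build the status URL for a background embedding task."""
    return f"{base_url}/ai/embeddings/{task_id}"


def wait_for_task(
    fetch_status: Callable[[str], Dict],
    task_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the task reports a terminal status or attempts run out.

    fetch_status takes a URL and returns the decoded JSON body.
    """
    url = status_url(task_id)
    for attempt in range(max_attempts):
        body = fetch_status(url)
        # Assumption: accept both a {"data": {...}} wrapper and a bare task object.
        task = body.get("data", body)
        if task.get("status") in HYPOTHETICAL_TERMINAL_STATUSES:
            return task
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Bounding the number of attempts keeps a stuck task from blocking the caller indefinitely; adjust `interval_s` and `max_attempts` to the size of the bucket being embedded.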

Important Note: When you update documents in a Telnyx Storage bucket, their associated embeddings are kept up to date automatically. If you add or update a file, it is embedded automatically. If you delete a file, the embeddings for that file are deleted.

You can also specify a custom loader parameter. Currently, the only supported custom loader value is intercom, which loads Intercom article JSON files as specified by the Intercom article API. This loader splits each article into paragraphs and saves additional parameters relevant to Intercom docs, such as article_url and heading. These values are returned by the /v2/ai/embeddings/similarity-search endpoint in the loader_metadata field.

Request

Body

required

    bucket_name Bucket Name (string) required

    document_chunk_size Document Chunk Size (integer)

    Default value: 1024

    document_chunk_overlap_size Document Chunk Overlap Size (integer)

    Default value: 512

    embedding_model SupportedEmbeddingModels (string)

    Possible values: [thenlper/gte-large, intfloat/multilingual-e5-large, sentence-transformers/all-mpnet-base-v2]

    Supported models to vectorize and embed documents.

    loader SupportedEmbeddingLoaders (string)

    Possible values: [default, intercom]

    Supported types of custom document loaders for embeddings.
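To make the relationship between document_chunk_size and document_chunk_overlap_size concrete, here is a generic sliding-window sketch. It is an illustration of overlapping chunks in general, not the service's actual splitter: the source does not say whether sizes are measured in characters or tokens, nor how boundaries are chosen, so the function name and its behavior are assumptions.

```python
from typing import List, Sequence


def chunk_with_overlap(
    items: Sequence, chunk_size: int = 1024, overlap: int = 512
) -> List[Sequence]:
    """Split items into windows of chunk_size that share `overlap` items.

    Assumption: a plain sliding window; the real service may split differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(items), step):
        chunks.append(items[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(items):
            break
    return chunks


if __name__ == "__main__":
    # With a size of 4 and an overlap of 2, each chunk repeats the last
    # two items of the previous one.
    print(chunk_with_overlap("abcdefghij", chunk_size=4, overlap=2))
```

Under this model, the default values of 1024 and 512 mean each chunk starts halfway through the previous one, so content near a chunk boundary appears in two chunks.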

Responses

200: Successful Response

422: Validation Error

Request samples


curl -L 'https://api.telnyx.com/v2/ai/embeddings' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <TOKEN>' \
  -d '{
    "bucket_name": "string",
    "document_chunk_size": 1024,
    "document_chunk_overlap_size": 512,
    "embedding_model": "thenlper/gte-large",
    "loader": "default"
  }'
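The same request can be sketched in Python with only the standard library. This mirrors the curl sample above and is not an official SDK: the function names `build_embedding_request` and `send` are hypothetical, and because the source states defaults only for the two chunk parameters, embedding_model and loader are left out of the body unless you pass them.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

API_URL = "https://api.telnyx.com/v2/ai/embeddings"  # URL from the curl sample


def build_embedding_request(
    token: str,
    bucket_name: str,
    document_chunk_size: int = 1024,
    document_chunk_overlap_size: int = 512,
    embedding_model: Optional[str] = None,
    loader: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for embedding a bucket."""
    payload: Dict[str, Any] = {
        "bucket_name": bucket_name,
        "document_chunk_size": document_chunk_size,
        "document_chunk_overlap_size": document_chunk_overlap_size,
    }
    # Optional fields are only included when the caller sets them.
    if embedding_model is not None:
        payload["embedding_model"] = embedding_model
    if loader is not None:
        payload["loader"] = loader

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending makes the body easy to inspect or log before any network call is made, for example `send(build_embedding_request("<TOKEN>", "my-bucket", loader="intercom"))`, where "my-bucket" is a placeholder bucket name.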

Response samples


{
  "data": {
    "task_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "task_name": "string",
    "status": "string",
    "created_at": "string",
    "finished_at": "string",
    "user_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6"
  }
}
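A client will usually pull task_id out of this response so it can query the task later. The sketch below parses the sample body into a small typed record. The `EmbeddingTask` class and `parse_task` function are hypothetical names, and treating finished_at as optional is an assumption (a newly created task may not have a finish time yet); the source does not state which fields can be absent.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmbeddingTask:
    """Fields from the "data" object of the response sample."""
    task_id: str
    task_name: str
    status: str
    created_at: str
    finished_at: Optional[str]  # Assumption: may be missing while the task runs.
    user_id: str


def parse_task(body: str) -> EmbeddingTask:
    """Parse a JSON response body into an EmbeddingTask."""
    data = json.loads(body)["data"]
    return EmbeddingTask(
        task_id=data["task_id"],
        task_name=data["task_name"],
        status=data["status"],
        created_at=data["created_at"],
        finished_at=data.get("finished_at"),
        user_id=data["user_id"],
    )
```

The returned `task_id` is the value to substitute into /ai/embeddings/{task_id} when checking the task's status.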