Embeddings
An embedding is a vector of floating-point numbers. The distance between two vectors measures their relatedness: a small distance suggests high relatedness, whereas a large distance suggests low relatedness.
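For example, relatedness is often measured with cosine similarity. The following is a minimal sketch; the vectors and their values are made up for illustration and are far shorter than real embeddings.

import math

def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

v1 = [0.12, -0.03, 0.45]  # embedding of one text string (illustrative values)
v2 = [0.10, -0.01, 0.40]  # embedding of a related text string (illustrative values)

similarity = cosine_similarity(v1, v2)  # close to 1.0 -> highly related
distance = 1.0 - similarity             # small distance -> high relatedness
print(similarity, distance)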
Compass provides access to the Embeddings 3 Large embedding model (text-embedding-3-large).
Embeddings are used for:
- Clustering: text strings are grouped by their similarity.
- Search: results are ranked by their relevance to a query string (see the sketch after this list).
- Classification: text strings are classified by their most similar label.
- Recommendations: items with related text strings are recommended.
- Anomaly detection: outliers with little relatedness to the rest of the data are identified.
- Diversity measurement: similarity distributions are analyzed.
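The search use case, for example, reduces to ranking candidate texts by the similarity of their embeddings to the query embedding. Here is a minimal sketch that reuses cosine_similarity from the example above and assumes a hypothetical embed helper that returns the embedding vector for a text string; neither is part of the API described on this page.

def search(query, documents, embed):
    """Rank documents by cosine similarity of their embeddings to the query embedding."""
    query_vec = embed(query)
    scored = [(doc, cosine_similarity(query_vec, embed(doc))) for doc in documents]
    # Highest similarity (i.e. smallest distance) first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)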
To get an embedding, send a text string to the embedding model. For more about the embedding API endpoints, see the API reference documentation.
The response contains an embedding, a list of floating-point numbers that you can extract, store in a vector database, and use for the applications described above. For example, the following request body asks for an embedding of the string "hi":
{
"input": "hi",
"model": "text-embedding-3-large"
}
The response will contain an embedding vector with some additional metadata.
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [
-0.006654263,
0.0054927324,
-0.0018895327,
... (3072 floats total for text-embedding-3-large)
]
}
],
"model": "text-embedding-3-large",
"usage": {
"prompt_tokens": 1,
"total_tokens": 1
}
}
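As a rough sketch, the request above could be sent and the vector extracted as shown below. The endpoint URL, authentication header, and environment variable name are assumptions, not the documented values; check the API reference documentation for the actual endpoint and authentication scheme.

import os
import requests

API_URL = "https://example.com/v1/embeddings"  # placeholder endpoint (assumption)
API_KEY = os.environ["COMPASS_API_KEY"]        # hypothetical environment variable name

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": "hi", "model": "text-embedding-3-large"},
)
response.raise_for_status()

# The embedding vector is under data[0].embedding in the response body.
embedding = response.json()["data"][0]["embedding"]
print(len(embedding))  # 3072 by default for text-embedding-3-large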
By default, the embedding vector for text-embedding-3-large has 3072 dimensions. You can reduce the number of dimensions by passing the dimensions parameter without the embedding losing its concept-representing properties.
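A minimal sketch of requesting a shorter vector follows; it assumes the parameter is named dimensions and reuses the same placeholder endpoint and key as the previous sketch.

import os
import requests

API_URL = "https://example.com/v1/embeddings"  # placeholder endpoint (assumption)
API_KEY = os.environ["COMPASS_API_KEY"]        # hypothetical environment variable name

payload = {
    "input": "hi",
    "model": "text-embedding-3-large",
    "dimensions": 1024,  # assumed parameter name; request 1024 dimensions instead of the default 3072
}
response = requests.post(API_URL, headers={"Authorization": f"Bearer {API_KEY}"}, json=payload)
response.raise_for_status()
print(len(response.json()["data"][0]["embedding"]))  # 1024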