Semantic search with OpenAI embeddings
Introduction
This guide walks you through setting up Meilisearch with OpenAI embeddings to enable semantic search. By combining Meilisearch's AI features with OpenAI's embedding API, you can improve your search experience and retrieve more relevant results.
Requirements
To follow this guide, you'll need:
- A Meilisearch Cloud project running version 1.10 or above with the Vector store activated.
- An OpenAI account with an API key for embedding generation. You can sign up for an account on the OpenAI website.
- No backend required.
Setting up Meilisearch
To set up an embedder in Meilisearch, you need to add it to your index settings. Refer to the Meilisearch documentation for more details on updating the embedder settings.
OpenAI offers three main embedding models:
- text-embedding-3-large: 3,072 dimensions
- text-embedding-3-small: 1,536 dimensions
- text-embedding-ada-002: 1,536 dimensions
Here's an example of embedder settings for OpenAI:
{
  "openai": {
    "source": "openAi",
    "apiKey": "<OpenAI API Key>",
    "dimensions": 1536,
    "documentTemplate": "<Custom template (Optional, but recommended)>",
    "model": "text-embedding-3-small"
  }
}
In this configuration:
- source: Specifies the source of the embedder, which is set to "openAi" for using OpenAI's API.
- apiKey: Replace <OpenAI API Key> with your actual OpenAI API key.
- dimensions: Specifies the dimensions of the embeddings. Set to 1536 for text-embedding-3-small and text-embedding-ada-002, or 3072 for text-embedding-3-large.
- documentTemplate: Optionally, you can provide a custom template for generating embeddings from your documents.
- model: Specifies the OpenAI model to use for generating embeddings. Choose from text-embedding-3-large, text-embedding-3-small, or text-embedding-ada-002.
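Because the dimensions value has to match the chosen model, it can help to derive one from the other rather than typing both by hand. The following is a minimal Python sketch of that idea, not an official client: the helper name build_embedder_settings is hypothetical, and the document template assumes documents with title and overview fields, which the guide does not specify.

```python
import json

# Output dimensions of the OpenAI models listed in this guide.
MODEL_DIMENSIONS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}


def build_embedder_settings(api_key, model="text-embedding-3-small",
                            name="openai", document_template=None):
    """Return an embedder settings object shaped like the example above.

    `name` is the embedder name you will later reference at search time.
    """
    if model not in MODEL_DIMENSIONS:
        raise ValueError(f"unknown OpenAI embedding model: {model!r}")
    embedder = {
        "source": "openAi",
        "apiKey": api_key,
        "dimensions": MODEL_DIMENSIONS[model],  # always consistent with the model
        "model": model,
    }
    if document_template is not None:
        embedder["documentTemplate"] = document_template
    return {name: embedder}


settings = build_embedder_settings(
    api_key="<OpenAI API Key>",
    # Hypothetical template: assumes documents have `title` and `overview` fields.
    document_template="A document titled '{{doc.title}}': {{doc.overview}}",
)
print(json.dumps(settings, indent=2))
```

Calling the helper with model="text-embedding-3-large" yields the same structure with dimensions set to 3072.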
Once you've configured the embedder settings, Meilisearch will automatically generate embeddings for your documents and store them in the vector store.
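As a rough illustration of how these settings might be sent, the sketch below builds a PATCH request against the index's settings/embedders route using only the Python standard library. The project URL, the index name movies, and the helper name embedder_settings_request are placeholders chosen for this example; the request is constructed but deliberately not sent.

```python
import json
from urllib.request import Request


def embedder_settings_request(base_url, index_uid, meili_api_key, embedders):
    """Build (but do not send) a PATCH request that updates embedder settings."""
    url = f"{base_url.rstrip('/')}/indexes/{index_uid}/settings/embedders"
    return Request(
        url,
        data=json.dumps(embedders).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {meili_api_key}",
        },
        method="PATCH",
    )


request = embedder_settings_request(
    base_url="https://example.meilisearch.io",  # placeholder project URL
    index_uid="movies",                         # hypothetical index name
    meili_api_key="<Meilisearch API Key>",
    embedders={
        "openai": {
            "source": "openAi",
            "apiKey": "<OpenAI API Key>",
            "dimensions": 1536,
            "model": "text-embedding-3-small",
        }
    },
)
print(request.get_method(), request.full_url)
# To actually send it, pass `request` to urllib.request.urlopen().
```

Settings updates are processed asynchronously, so a successful call should return a task reference rather than the final result. Embeddings are generated once that task has been processed.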
Note that OpenAI applies rate limits to its API, which Meilisearch manages for you. If you have a free OpenAI account, indexing may take some time, but Meilisearch will handle it with a retry strategy.
We recommend monitoring the task queue to make sure everything is running smoothly. You can access it through the Cloud UI or the Meilisearch API.
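If you prefer to check the queue from a script, one possible approach is to query the tasks route filtered by index and status, then summarize what comes back. The sketch below assumes the response carries a results list whose entries have uid, status, and error fields; the sample response is invented for illustration and is not real output.

```python
from collections import Counter
from urllib.parse import urlencode


def tasks_url(base_url, index_uid,
              statuses=("enqueued", "processing", "failed")):
    """Build a URL listing tasks for one index, filtered by status."""
    query = urlencode(
        {"indexUids": index_uid, "statuses": ",".join(statuses)},
        safe=",",  # keep the comma-separated status list readable
    )
    return f"{base_url.rstrip('/')}/tasks?{query}"


def summarize_tasks(response):
    """Count tasks per status and collect (uid, message) for failed ones."""
    counts = Counter(task["status"] for task in response["results"])
    failures = [
        (task["uid"], task["error"]["message"])
        for task in response["results"]
        if task["status"] == "failed"
    ]
    return counts, failures


# Invented sample response, for illustration only.
sample_response = {
    "results": [
        {"uid": 12, "status": "succeeded", "error": None},
        {"uid": 13, "status": "processing", "error": None},
        {"uid": 14, "status": "failed",
         "error": {"message": "hypothetical error message"}},
    ]
}

url = tasks_url("https://example.meilisearch.io", "movies")
counts, failures = summarize_tasks(sample_response)
print(url)
print(dict(counts), failures)
```

A growing count of enqueued or processing tasks with no failures usually just means indexing is still in progress, which is expected when OpenAI rate limits apply.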
Testing semantic search
With the embedder set up, you can now perform semantic searches using Meilisearch. When you send a search query, Meilisearch will generate an embedding for the query using the configured embedder and then use it to find the most semantically similar documents in the vector store. To perform a semantic search, you simply need to make a normal search request but include the hybrid parameter:
{
  "q": "<Query made by the user>",
  "hybrid": {
    "semanticRatio": 1,
    "embedder": "openai"
  }
}
In this request:
- q: Represents the user's search query.
- hybrid: Specifies the configuration for the hybrid search.
  - semanticRatio: Allows you to control the balance between semantic search and traditional search. A value of 1 indicates pure semantic search, while a value of 0 represents full-text search. You can adjust this parameter to achieve a hybrid search experience.
  - embedder: The name of the embedder used for generating embeddings. Make sure to use the same name as specified in the embedder configuration, which in this case is "openai".
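To make the request concrete, here is a small Python sketch that assembles the search body and wraps it in a POST request to the index's search route. The helper names, the project URL, and the index name movies are assumptions made for this example, and the request is built without being sent.

```python
import json
from urllib.request import Request


def build_hybrid_search(query, embedder="openai", semantic_ratio=1.0):
    """Return a search body with the hybrid parameter described above."""
    if not 0.0 <= semantic_ratio <= 1.0:
        raise ValueError("semantic_ratio must be between 0 and 1")
    return {
        "q": query,
        "hybrid": {"semanticRatio": semantic_ratio, "embedder": embedder},
    }


def search_request(base_url, index_uid, meili_api_key, body):
    """Build (but do not send) a POST request to the index's search route."""
    return Request(
        f"{base_url.rstrip('/')}/indexes/{index_uid}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {meili_api_key}",
        },
        method="POST",
    )


# Pure semantic search (ratio 1) and an even keyword/semantic blend (ratio 0.5).
semantic = build_hybrid_search("films about space travel")
blended = build_hybrid_search("films about space travel", semantic_ratio=0.5)

request = search_request(
    "https://example.meilisearch.io",  # placeholder project URL
    "movies",                          # hypothetical index name
    "<Meilisearch API Key>",
    semantic,
)
print(request.get_method(), request.full_url)
print(json.dumps(blended))
```

Starting at a ratio of 1 makes it easy to confirm the embedder works on its own. Lowering the ratio afterwards blends keyword matches back into the results.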
You can use the Meilisearch API or client libraries to perform searches and retrieve the relevant documents based on semantic similarity.
Conclusion
By following this guide, you should now have Meilisearch set up with OpenAI embeddings, enabling semantic search in your application. Meilisearch's auto-batching and efficient handling of embeddings make it a powerful choice for integrating semantic search into your project.
To explore further configuration options for embedders, consult the detailed documentation about the embedder setting possibilities.