This guide shows the main steps to search through a database of images using Meilisearch’s experimental multimodal embeddings.
Enable multimodal embeddings
First, enable the multimodal experimental feature:
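A minimal sketch using fetch; the instance URL, the admin key, and the multimodal flag name are assumptions, so check the /experimental-features reference for the current flag:

```js
// Enable the multimodal experimental feature.
// Host, key, and the "multimodal" flag name are assumptions for this sketch.
await fetch('http://localhost:7700/experimental-features', {
  method: 'PATCH',
  headers: {
    'Authorization': 'Bearer MEILISEARCH_ADMIN_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ multimodal: true }),
});
```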
Configure a multimodal embedder
Much like other embedders, multimodal embedders must set their source to rest and explicitly declare their url. Depending on your chosen provider, you may also have to specify apiKey.
All multimodal embedders must contain an indexingFragments field and a searchFragments field. Fragments are sets of embeddings built out of specific parts of document data.
Fragments must follow the structure defined by the REST API of your chosen provider.
indexingFragments
Use indexingFragments to tell Meilisearch how to send document data to the provider’s API when generating document embeddings.
For example, when using VoyageAI’s multimodal model, an indexing fragment might look like this:
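The following is a sketch: the value wrapper and content schema follow VoyageAI’s multimodal API as understood here, and IMAGE_FRAGMENT_NAME and TEXT_FRAGMENT_NAME are placeholder names.

```js
// indexingFragments for a movie index (a sketch; field names follow the prose below)
const indexingFragments = {
  IMAGE_FRAGMENT_NAME: {
    value: {
      content: [
        // Sends the plain URL stored in each document's poster_url field
        { type: 'image_url', image_url: '{{doc.poster_url}}' },
      ],
    },
  },
  TEXT_FRAGMENT_NAME: {
    value: {
      content: [
        // Builds a longer string out of the title and description fields
        {
          type: 'text',
          text: 'A movie titled {{doc.title}}, described as: {{doc.description}}',
        },
      ],
    },
  },
};
```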
Fragment templates reference document fields through doc. In IMAGE_FRAGMENT_NAME, that’s image_url, which outputs the plain URL string in the document field poster_url. In TEXT_FRAGMENT_NAME, text contains a longer string contextualizing two document fields, title and description.
searchFragments
Use searchFragments to tell Meilisearch how to send search query data to the chosen provider’s REST API when converting them into embeddings:
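Below is a sketch assuming the same VoyageAI content schema as above; the fragment names textSearch and imageSearch are illustrative:

```js
// searchFragments: one fragment per supported query shape (a sketch)
const searchFragments = {
  textSearch: {
    value: {
      content: [
        // Embeds the q parameter as text
        { type: 'text', text: '{{q}}' },
      ],
    },
  },
  imageSearch: {
    value: {
      content: [
        // Rebuilds a data URL from the query's media.image.mime and media.image.data
        {
          type: 'image_base64',
          image_base64: 'data:{{media.image.mime}};base64,{{media.image.data}}',
        },
      ],
    },
  },
};
```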
These fragments support two types of semantic search:
- A textual search based on the q parameter, which will be embedded as text
- An image search based on a data URL rebuilt from the image.mime and image.data fields in the media field of the query

Unlike indexing fragments, search fragments cannot reference doc; instead, they build their templates out of media and q.
Each semantic search query must match exactly one of the embedder’s search fragments, so every fragment should have at least one disambiguating field.
Complete embedder configuration
Your embedder should look similar to this example, with all fragments and embedding provider data. Because the source of this embedder is rest, you must also specify a request and a response field. These respectively instruct Meilisearch how to structure the request sent to the embeddings provider and where to find the embeddings in the provider’s response:
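A sketch of the full settings call: the index name movies, the embedder name voyage, the model voyage-multimodal-3, and the {{fragment}}, {{embedding}}, and {{..}} placeholders are assumptions to verify against the embedder reference.

```js
// Complete multimodal embedder configuration (a sketch)
await fetch('http://localhost:7700/indexes/movies/settings', {
  method: 'PATCH',
  headers: {
    'Authorization': 'Bearer MEILISEARCH_ADMIN_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    embedders: {
      voyage: {
        source: 'rest',
        url: 'https://api.voyageai.com/v1/multimodalembeddings',
        apiKey: 'VOYAGE_API_KEY',
        indexingFragments: {
          IMAGE_FRAGMENT_NAME: {
            value: { content: [{ type: 'image_url', image_url: '{{doc.poster_url}}' }] },
          },
          TEXT_FRAGMENT_NAME: {
            value: {
              content: [{
                type: 'text',
                text: 'A movie titled {{doc.title}}, described as: {{doc.description}}',
              }],
            },
          },
        },
        searchFragments: {
          textSearch: {
            value: { content: [{ type: 'text', text: '{{q}}' }] },
          },
          imageSearch: {
            value: {
              content: [{
                type: 'image_base64',
                image_base64: 'data:{{media.image.mime}};base64,{{media.image.data}}',
              }],
            },
          },
        },
        request: {
          // One entry per fragment; {{..}} marks the repeated element
          inputs: ['{{fragment}}', '{{..}}'],
          model: 'voyage-multimodal-3',
        },
        response: {
          // Where each embedding lives in the provider's response
          data: [{ embedding: '{{embedding}}' }, '{{..}}'],
        },
      },
    },
  }),
});
```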
Add documents
Once your embedder is configured, you can add documents to your index with the /documents endpoint.
During indexing, Meilisearch will automatically generate multimodal embeddings for each document using the configured indexingFragments.
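For example, assuming the movies index and document fields used above:

```js
// Add documents; Meilisearch embeds each one via the configured indexingFragments
await fetch('http://localhost:7700/indexes/movies/documents', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer MEILISEARCH_ADMIN_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify([
    {
      id: 1,
      title: 'Kung Fu Panda',
      description: 'A clumsy panda becomes an unlikely kung fu hero.',
      poster_url: 'https://example.com/posters/kung-fu-panda.jpg',
    },
  ]),
});
```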
Perform searches
The final step is to perform searches using different types of content.
Use text to search for images
Use the following search query to retrieve a mix of documents with images matching the description and documents containing the specified keywords:
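A sketch of such a hybrid query, reusing the hypothetical voyage embedder; the query text and semanticRatio value are illustrative:

```js
// Hybrid search: keyword matching plus semantic matching via the embedder
const results = await fetch('http://localhost:7700/indexes/movies/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer MEILISEARCH_SEARCH_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    q: 'a martial arts panda',
    hybrid: { embedder: 'voyage', semanticRatio: 0.5 },
  }),
}).then((r) => r.json());
```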
Use an image to search for images
You can also use an image to search for other, similar images:
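A sketch of a media-only query: the Base64 data is truncated for readability, and a semanticRatio of 1.0 makes the search purely semantic.

```js
// Image-to-image search: no q, only media
const results = await fetch('http://localhost:7700/indexes/movies/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer MEILISEARCH_SEARCH_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    media: {
      image: {
        mime: 'image/jpeg',
        data: '/9j/4AAQSkZJRg...', // Base64 image data, truncated
      },
    },
    hybrid: { embedder: 'voyage', semanticRatio: 1.0 },
  }),
}).then((r) => r.json());
```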
Convert images to Base64 on the client
To search with a user-submitted image, read it as a data URL and extract the MIME type and Base64 data, as in the sketch below. The media.image.mime and media.image.data fields in the search request correspond to the {{media.image.mime}} and {{media.image.data}} template variables used in the searchFragments configuration above.
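A browser-side sketch with the standard FileReader API; fileToMedia is a hypothetical helper name:

```js
// Read a File (e.g. from an <input type="file">) as a data URL,
// then split it into the MIME type and the raw Base64 payload.
function fileToMedia(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = () => reject(reader.error);
    reader.onload = () => {
      // Data URLs look like "data:image/jpeg;base64,/9j/4AAQ..."
      const [prefix, data] = reader.result.split(',');
      const mime = prefix.slice('data:'.length, prefix.indexOf(';'));
      resolve({ image: { mime, data } });
    };
    reader.readAsDataURL(file);
  });
}
```

Pass the resolved object as the media field of the search request shown above.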
Large images increase request payload size and embedding latency. Consider resizing images to a maximum of 1024x1024 pixels before encoding. The embedding provider handles any further resizing internally.
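If you resize in the browser, this hypothetical helper uses the standard createImageBitmap and canvas APIs to cap the longest side before encoding:

```js
// Downscale an image file so its longest side is at most maxSide pixels,
// returning a data URL ready to be split into mime and data.
async function resizeToDataUrl(file, maxSide = 1024) {
  const bitmap = await createImageBitmap(file);
  const scale = Math.min(1, maxSide / Math.max(bitmap.width, bitmap.height));
  const canvas = document.createElement('canvas');
  canvas.width = Math.round(bitmap.width * scale);
  canvas.height = Math.round(bitmap.height * scale);
  canvas.getContext('2d').drawImage(bitmap, 0, 0, canvas.width, canvas.height);
  return canvas.toDataURL('image/jpeg', 0.9); // "data:image/jpeg;base64,..."
}
```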
Conclusion
With multimodal embedders you can:
- Configure Meilisearch to embed both images and queries
- Add image documents. Meilisearch automatically generates embeddings
- Accept text or image input from users
- Run hybrid searches using a mix of textual and non-textual input, or run pure semantic searches using only non-textual input