Image search with user-provided embeddings

    This article shows you the main steps for performing multimodal searches where you can use text to search through a database of images with no associated metadata.

    Requirements

    Configure your local embedding generation pipeline

    First, set up a system that sends your images to your chosen embedding generation provider, then integrates the returned embeddings into your dataset.

    The exact procedure depends heavily on your specific setup, but should include these main steps:

    1. Choose a provider you can run locally
    2. Choose a model that supports both image and text input
    3. Send your images to the embedding generation provider
    4. Add the returned embeddings to the _vectors field of each image's document in your database

    In most cases your system should run these steps periodically or whenever you update your database.
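    As a sketch, the indexing side of such a pipeline might look like the following Python. Here embed_image is a hypothetical stand-in for a call to your local provider, and the image2text embedder name and 512-dimension vector size are placeholder assumptions:

```python
# Sketch of a local vectorization pipeline. `embed_image` stands in for
# a call to your local embedding provider (for example, a CLIP-style
# multimodal model); replace its body with a real model call.

def embed_image(path: str) -> list[float]:
    # Hypothetical placeholder: a real implementation would load the
    # image and run it through the model. 512 dimensions is assumed.
    return [0.0] * 512

def vectorize_images(image_paths: list[str], embedder_name: str) -> list[dict]:
    """Build Meilisearch documents carrying user-provided embeddings."""
    documents = []
    for doc_id, path in enumerate(image_paths):
        documents.append({
            "id": doc_id,
            "url": path,
            # User-provided embeddings go in the _vectors field,
            # keyed by the embedder name configured in the index settings.
            "_vectors": {embedder_name: embed_image(path)},
        })
    return documents

documents = vectorize_images(["img/cat.png", "img/dog.png"], "image2text")
```

    A real pipeline would wrap this in whatever scheduling your system uses, so the vectors stay in sync with your primary image store.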

    Configure a user-provided embedder

    Configure the embedders index setting, setting its source to userProvided:

    curl \
      -X PATCH 'MEILISEARCH_URL/indexes/movies/settings' \
      -H 'Content-Type: application/json' \
      --data-binary '{
        "embedders": {
          "EMBEDDER_NAME": {
            "source":  "userProvided",
            "dimensions": MODEL_DIMENSIONS
          }
        }
      }'
    

    Replace EMBEDDER_NAME with the name you wish to give your embedder. Replace MODEL_DIMENSIONS with the number of dimensions of your chosen model.

    Add documents to Meilisearch

    Next, use the /documents endpoint to upload the vectorized images.

    In most cases, you should automate this step so Meilisearch is up to date with your primary database.
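    For instance, assuming a local instance at localhost:7700 and documents shaped like the earlier example, the upload request can be built with Python's standard library (a secured deployment would also add an Authorization header with your API key):

```python
import json
import urllib.request

def build_documents_request(
    base_url: str, index_uid: str, documents: list[dict]
) -> urllib.request.Request:
    """Build a POST request for the Meilisearch /documents endpoint."""
    return urllib.request.Request(
        f"{base_url}/indexes/{index_uid}/documents",
        data=json.dumps(documents).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical document with a user-provided embedding (3 dimensions
# for brevity; use your model's real dimension count).
docs = [{"id": 0, "url": "img/cat.png", "_vectors": {"image2text": [0.1, 0.2, 0.3]}}]
request = build_documents_request("http://localhost:7700", "movies", docs)
# To actually send it: urllib.request.urlopen(request)
```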

    Set up pipeline for vectorizing queries

    Since you are using a userProvided embedder, you must also generate the embeddings for the search query. This process should be similar to generating embeddings for your images:

    1. Receive user query from your front-end
    2. Send query to your local embedding generation provider
    3. Perform search using the returned query embedding
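    Continuing the sketch, the query side mirrors the indexing side: embed the incoming text with the same model, then build the search payload. Here embed_text is again a hypothetical stand-in for your provider:

```python
# Sketch of query-side vectorization. `embed_text` stands in for a call
# to the same local multimodal model used to embed the images; both
# sides must produce vectors of the same dimension.

def embed_text(query: str) -> list[float]:
    # Hypothetical placeholder for a real model call.
    return [0.0] * 512

def build_search_payload(query: str, embedder_name: str) -> dict:
    """Build the body of a Meilisearch vector search request."""
    return {
        "vector": embed_text(query),
        "hybrid": {"embedder": embedder_name},
    }

payload = build_search_payload("a cat sleeping on a sofa", "image2text")
```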

    Vector search with user-provided embeddings

    Once you have the query's vector, pass it to the vector search parameter to perform a semantic AI-powered search:

    curl -X POST -H 'Content-Type: application/json' \
      'MEILISEARCH_URL/indexes/movies/search' \
      --data-binary '{
        "vector": VECTORIZED_QUERY,
        "hybrid": {
          "embedder": "EMBEDDER_NAME"
        }
      }'
    

    Replace VECTORIZED_QUERY with the embedding generated by your provider and EMBEDDER_NAME with the name of your embedder.

    If your images have any associated metadata, you may perform a hybrid search by also passing the user's original text query in q:

    curl -X POST -H 'Content-Type: application/json' \
      'MEILISEARCH_URL/indexes/movies/search' \
      --data-binary '{
        "vector": VECTORIZED_QUERY,
        "hybrid": {
          "embedder": "EMBEDDER_NAME"
        },
        "q": "QUERY"
      }'
    

    Conclusion

    You have seen the main steps for implementing image search with Meilisearch:

    1. Prepare a pipeline that converts your images into vectors
    2. Index the vectorized images with Meilisearch
    3. Prepare a pipeline that converts your users' queries into vectors
    4. Perform searches using the converted queries