Meilisearch 1.9
Meilisearch 1.9 brings similar documents, ranking score threshold, grouping by attribute, and improved AI search.
We're excited to unveil Meilisearch v1.9. In this article, we’ll review the most impactful changes. For an exhaustive listing, check out the changelog on GitHub.
Meilisearch 1.9 is available on Meilisearch Cloud, too—upgrade now!
New: ranking score threshold
Meilisearch 1.9 allows excluding search results with low ranking scores. When using the new rankingScoreThreshold option, Meilisearch will not return any documents whose ranking score falls below the defined threshold.
curl -X POST 'http://localhost:7700/indexes/movies/search' -H 'Content-Type: application/json' --data-binary '{ "q": "green ogre living in a swamp", "hybrid": { "semanticRatio": 0.9, "embedder": "default" }, "showRankingScore": true, "limit": 5, "rankingScoreThreshold": 0.2 }'
Using the ranking score threshold when implementing hybrid search removes irrelevant results and allows your search analytics to collect the No search results metric properly.
Excluded results do not count towards estimatedTotalHits, totalHits, or facet distribution.
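To make the threshold semantics concrete, the sketch below emulates the filtering on the client side, using hits that carry the _rankingScore field returned when showRankingScore is enabled. The function name apply_ranking_score_threshold and the sample hits are hypothetical, and this only illustrates the behavior described above; it is not how the engine implements it.

```python
def apply_ranking_score_threshold(hits, threshold):
    """Keep only hits whose _rankingScore is at or above the threshold."""
    return [hit for hit in hits if hit["_rankingScore"] >= threshold]


# Hypothetical hits, shaped like a search response with showRankingScore enabled.
hits = [
    {"id": 1, "title": "Shrek", "_rankingScore": 0.82},
    {"id": 2, "title": "Swamp Thing", "_rankingScore": 0.35},
    {"id": 3, "title": "Green Book", "_rankingScore": 0.11},
]

kept = apply_ranking_score_threshold(hits, 0.2)
print([hit["title"] for hit in kept])
```

With a threshold of 0.9, the same list would come back empty, which is the situation the No search results metric relies on.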
New: distinct attribute at search time (group by)
Meilisearch 1.9 adds the ability to define a distinct attribute at search time. When using the new distinct search parameter, Meilisearch returns only one document for each value of the specified attribute.
This feature is commonly used in ecommerce applications. Consider a products index containing multiple variants of the same product, e.g. Blue iPhone 15 and Red iPhone 15 documents that share the same product_id. The API call below will return a single iPhone 15:
curl -X POST 'http://localhost:7700/indexes/products/search' -H 'Content-Type: application/json' --data-binary '{ "q": "iphone", "distinct": "product_id" }'
When distinct is provided, Meilisearch ignores the index’s distinct attribute.
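The grouping effect can be sketched in a few lines. The helper keep_first_per_value and the sample documents below are hypothetical, and the sketch assumes that the document kept for each value is the first one in ranked order; it is an emulation of the observable result, not Meilisearch's implementation.

```python
def keep_first_per_value(ranked_hits, attribute):
    """Return the first hit for each value of `attribute`, preserving order."""
    seen = set()
    kept = []
    for hit in ranked_hits:
        value = hit[attribute]
        if value not in seen:
            seen.add(value)
            kept.append(hit)
    return kept


# Hypothetical documents, already sorted by relevancy.
ranked_hits = [
    {"id": 1, "name": "Blue iPhone 15", "product_id": "iphone-15"},
    {"id": 2, "name": "Red iPhone 15", "product_id": "iphone-15"},
    {"id": 3, "name": "iPhone 15 case", "product_id": "case-15"},
]

for hit in keep_first_per_value(ranked_hits, "product_id"):
    print(hit["name"])
```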
New: frequency matching strategy
Meilisearch 1.9 introduces a new matching strategy to prioritize results that contain occurrences of the least frequent query terms. When using the frequency matching strategy, Meilisearch will deprioritize very common words.
Let’s take the example of the "the little prince" query. In our indexed documents, the words "the" and "little" likely have a lot of occurrences. Consequently, the matching strategy will prioritize documents containing "prince" instead.
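As a toy illustration of the idea, the sketch below counts how many documents contain each query term and orders the terms from rarest to most common. The function terms_by_rarity and the sample titles are hypothetical, and the counting is a deliberate simplification rather than the engine's actual algorithm. The payload at the end shows how the strategy would be selected through the matchingStrategy search parameter.

```python
def terms_by_rarity(query, documents):
    """Order the query terms from least to most frequent in `documents`."""
    terms = query.lower().split()
    doc_words = [set(doc.lower().split()) for doc in documents]
    # Document frequency: the number of documents in which each term occurs.
    frequency = {term: sum(term in words for words in doc_words) for term in terms}
    return sorted(frequency, key=lambda term: frequency[term])


# Hypothetical indexed titles.
documents = [
    "The Little Prince",
    "The Little Mermaid",
    "The Little Engine That Could",
    "The Lion King",
]

print(terms_by_rarity("the little prince", documents))

# Search payload selecting the new strategy.
payload = {"q": "the little prince", "matchingStrategy": "frequency"}
```

Here "prince" occurs in a single title, so it ends up first in the list, ahead of "little" and "the".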
Experimental: new similar documents API
Meilisearch 1.9 introduces a new AI-powered search feature that allows searching for documents similar to an existing document.
The following API call searches for documents similar to the document whose primary key is 23 in the movies index:
curl -X POST 'http://localhost:7700/indexes/movies/similar' -H 'Content-Type: application/json' --data-binary '{ "id": "23", "embedder": "default" }'
Check out the similar documents API for more information about additional parameters.
Experimental: avoid re-generating embeddings
When importing a dump created with Meilisearch 1.9 or higher, Meilisearch will not re-generate the embeddings. This will avoid unnecessary computing when upgrading your Meilisearch database.
New: regenerate parameter
Additionally, Meilisearch 1.9 introduces a new API to give more fine-grained control over document embedding generation. Specifically, it enables embedding generation whenever a document is updated.
The document _vectors object now accepts objects in addition to arrays. The provided object accepts a regenerate boolean and an optional embeddings array.
Consider the example document below with user-provided embeddings:
{
  "id": 42,
  "_vectors": {
    // Embeddings for the `default` embedder
    // Equivalent to `regenerate: true`
    "default": [0.1, 0.2],
    // Embeddings for the `text` embedder
    "text": {
      "embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
      // Never regenerate embeddings
      "regenerate": false
    },
    "translation": {
      "embeddings": [0.1, 0.2, 0.3, 0.4],
      // Regenerate embeddings when document is updated
      "regenerate": true
    }
  }
}
While, in general, you might want to re-generate your embeddings whenever the document is updated, this facilitates migrating from user-provided embeddings to letting Meilisearch handle embedding without incurring unnecessary costs.
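Before sending documents, it can help to check that each _vectors entry has one of the two accepted shapes. The helper check_vectors_entry below is hypothetical and encodes only what this post describes: an array, or an object with a regenerate boolean and an optional embeddings array. Treating regenerate as required is an assumption of this sketch, not a statement about Meilisearch's own validation.

```python
def check_vectors_entry(entry):
    """Classify one embedder entry of a document's _vectors object."""
    if isinstance(entry, list):
        return "array"
    if isinstance(entry, dict):
        # Assumption: `regenerate` is required and must be a boolean.
        if not isinstance(entry.get("regenerate"), bool):
            raise ValueError("object form needs a boolean 'regenerate'")
        # `embeddings` is optional, but must be an array when present.
        if "embeddings" in entry and not isinstance(entry["embeddings"], list):
            raise ValueError("'embeddings' must be an array when present")
        return "object"
    raise TypeError("entry must be an array or an object")


# Hypothetical _vectors object mixing both forms.
vectors = {
    "default": [0.1, 0.2],
    "text": {"embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], "regenerate": False},
    "translation": {"regenerate": True},
}

for embedder, entry in vectors.items():
    print(embedder, check_vectors_entry(entry))
```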
Experimental: hybrid search breaking changes
As we move toward stabilizing AI-powered search features, we introduced minor breaking changes to make the APIs less error-prone.
Breaking: empty embeddings array
Following user feedback that the previous behavior was unexpected and unhelpful, providing an empty embeddings array will now tell Meilisearch that the document has no embeddings.
Before Meilisearch 1.9, an empty embeddings array was interpreted as a single embedding of dimension 0.
Breaking: removed _vectors in search results
Starting with Meilisearch 1.9, API responses to vector search and hybrid search requests will no longer include _vectors.
However, you can now use the new retrieveVectors search parameter if you want API responses to include them:
curl -X POST 'http://localhost:7700/indexes/movies/search' -H 'Content-Type: application/json' --data-binary '{ "q": "star wars", "retrieveVectors": true }'
Breaking: optimizing user-provided embeddings
Starting with Meilisearch 1.9, vector embeddings will no longer be stored as-is. Numbers will be cast to floats in a canonicalized representation to save storage and optimize performance. Simply put, the vector [3] might be stored as [3.0].
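The practical consequence is that vectors read back from Meilisearch may not be byte-for-byte identical to what was sent. The sketch below shows the integer-to-float case from the example above. It also shows a single-precision round trip, which is purely an assumption of this sketch, since the post does not state the storage width. Both helper names are hypothetical.

```python
import json
import struct


def to_float_vector(vector):
    """Cast every component to a float, so 3 becomes 3.0."""
    return [float(x) for x in vector]


def roundtrip_f32(value):
    """Assumption: emulate storing the value as a 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


print(json.dumps(to_float_vector([3])))  # prints [3.0]

# Values that are not exactly representable in 32 bits come back slightly changed.
print(roundtrip_f32(0.1) == 0.1)
```

If your application compares retrieved embeddings with the originals, comparing within a small tolerance is safer than testing for strict equality.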
Contributors shout-out
Thanks to all community members who participated in this release. Shout out to @gh2k, @writegr, and @yudrywet for their contributions to Meilisearch and @mosuka, @Soham1803, and @tkhshtsh0917 for their contributions to Charabia.
And, of course, many thanks to our SDK maintainers, thanks to whom Meilisearch is available across many languages. Special thanks to @the-sinner and @norkunas. 🫶
And that’s a wrap for v1.9! This release post highlights the most significant updates. For an exhaustive listing, read the changelog on GitHub.
Stay in the loop of everything Meilisearch by subscribing to our monthly newsletter. To learn more about Meilisearch's future and help shape it, take a look at our roadmap and participate in our Product Discussions.
For anything else, join our developer community on Discord.