What's new in v0.28
This month’s release brings you two great new features: smart crop and deterministic API keys. It also takes us a little closer to v1.0 with the stabilization of our API!
In this article, let's take a look at some of the most significant changes in Meilisearch's latest update. With v0.28, we have stabilized our API, taking the first step towards v1.0 🎉 This stabilization brings multiple changes. You can read the full changelog on GitHub, but we’ll go over the main ones here.
New feature: smart crop
Instead of considering the first search term match as the best cropping location, Meilisearch centers the crop around the largest number of unique matches, giving priority to terms that are closer to each other and follow the original query order. Meilisearch also considers context when cropping and prioritizes keeping the sentence together.
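To make the idea concrete, here is a minimal Python sketch of the "largest number of unique matches" part of smart crop. It is a toy model under stated assumptions, not Meilisearch's implementation: the function name best_crop_window is hypothetical, and the sketch ignores term proximity, query order, and sentence context, all of which the real cropper also weighs.

```python
import re


def best_crop_window(text: str, query: str, crop_length: int) -> str:
    """Return the crop_length-word window containing the most unique query terms."""
    words = re.findall(r"\w+", text)
    terms = {term.lower() for term in re.findall(r"\w+", query)}
    best_start, best_score = 0, -1
    # Slide a fixed-size window over the words and score each position.
    for start in range(max(1, len(words) - crop_length + 1)):
        window = words[start:start + crop_length]
        score = len({word.lower() for word in window} & terms)
        # Strict comparison keeps the earliest window when scores tie.
        if score > best_score:
            best_start, best_score = start, score
    return " ".join(words[best_start:best_start + crop_length])


text = "a fox ran home while the brown dog chased the quick brown fox"
print(best_crop_window(text, "quick brown fox", 3))
```

A cropper that stops at the first match would center on the early "fox"; scoring every window instead lands on the span where all three query terms appear together.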
Given the following string:
“A young elephant, whose oversized ears enable him to fly, helps save a struggling circus, but when the circus plans a new venture, Dumbo and his friends discover dark secrets beneath its shiny veneer.”
If the search query is Dumbo and the cropLength is 5, Meilisearch will now return:
“…Dumbo and his friends discover…”
Instead of:
“…new venture, Dumbo and his…”
New feature: deterministic API keys
A deterministic algorithm is an algorithm that, given a particular input, always produces the same output, with no randomness involved.
You can create a deterministic key value by specifying a uid field at creation. The uid value must follow the UUID v4 format. If you don't specify anything, Meilisearch automatically generates the uid for you.
The value of the key field is generated by hashing the master key with the uid. The same combination always results in the same key value.
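The sketch below illustrates why this scheme is deterministic. It assumes the hash is an HMAC-SHA256 keyed by the master key and computed over the hyphenated uid, then hex-encoded. That detail is not stated in this article, so check the API keys documentation before relying on it. derive_key is a hypothetical helper name, and the master key is a placeholder.

```python
import hashlib
import hmac
import uuid


def derive_key(master_key: str, uid: str) -> str:
    """Derive an API key value from a master key and a UUID v4 uid."""
    parsed = uuid.UUID(uid)  # raises ValueError if uid is not a valid UUID
    if parsed.version != 4:
        raise ValueError("uid must be a UUID v4")
    # Assumption: HMAC-SHA256(master key, hyphenated uid), hex-encoded.
    digest = hmac.new(master_key.encode(), str(parsed).encode(), hashlib.sha256)
    return digest.hexdigest()


uid = "6062abda-a5aa-4414-ac91-ecd7944c0f8d"
first = derive_key("MASTER_KEY_PLACEHOLDER", uid)
second = derive_key("MASTER_KEY_PLACEHOLDER", uid)
print(first == second)  # the same master key and uid give the same key value
```

Because nothing random enters the computation, two instances that share a master key and create a key with the same uid end up with the same key value.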
This lets you keep the same set of API keys across different Meilisearch instances. When upgrading or redeploying your Meilisearch instance, you’ll be able to keep your API keys.
As a result of these modifications, keys imported from older versions of Meilisearch will have their key and uid fields regenerated. When updating your Meilisearch instance, you will need to update your keys.
We have also added a name field to make API key retrieval more convenient. A key object should now look like this:
{
  "name": null,
  "description": "Manage documents: Products/Reviews API key",
  "key": "d0552b41536279a0ad88bd595327b96f01176a60c2243e906c52ac02375f9bc4",
  "uid": "6062abda-a5aa-4414-ac91-ecd7944c0f8d",
  "actions": [
    "documents.add",
    "documents.delete"
  ],
  "indexes": [
    "products",
    "reviews"
  ],
  "expiresAt": "2021-12-31T23:59:59Z",
  "createdAt": "2021-10-12T00:00:00Z",
  "updatedAt": "2021-10-13T15:00:00Z"
}
Other changes regarding API key management include:
- being able to retrieve, update, and delete a key by either the key or uid fields
- introducing new actions to manage API keys (keys.get, keys.create, keys.update, keys.delete)
- removing the possibility of updating the actions, indexes, or expiresAt properties of an API key after creation for security reasons
Breaking change: search nomenclature
We’re on the road to v1, which means defining a stable API. To improve clarity, we have renamed several search parameters and response fields in the /indexes/{uid}/search endpoint.
Search parameters formerly known as facetsDistribution and matches are now called facets and showMatchesPosition, respectively.
The response fields returned when using those parameters are now facetDistribution instead of facetsDistribution (note the dropped s) and _matchesPosition instead of _matchesInfo.
The response field nbHits has been renamed estimatedTotalHits. This value was often used to calculate the number of search result pages, which we strongly advise against. To learn how to paginate with Meilisearch without using nbHits, check out this fresh new guide.
For the following query:
curl -X POST 'http://localhost:7700/indexes/movies/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "q": "Shazam", "facets": ["genres"], "showMatchesPosition": true }'
You’ll get the following response:
{
  "hits": [
    {
      "id": "287947",
      "title": "Shazam!",
      "poster": "https://image.tmdb.org/t/p/w500/xnopI5Xtky18MPhK40cZAGAOVeV.jpg",
      "overview": "A boy is given the ability to become an adult superhero in times of need with a single magic word.",
      "release_date": 1553299200,
      "genres": [
        "Action",
        "Comedy",
        "Fantasy"
      ],
      "_matchesPosition": {
        "title": [
          {
            "start": 0,
            "length": 6
          }
        ]
      }
    },
    ...
  ],
  "estimatedTotalHits": 3,
  "query": "Shazam",
  "limit": 20,
  "offset": 0,
  "processingTimeMs": 4,
  "facetDistribution": {
    "genres": {
      "Action": 3,
      "Animation": 2,
      "Comedy": 1,
      "Fantasy": 1
    }
  }
}
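If you build search requests in application code, the renames described in this section can be applied mechanically. The following Python sketch shows one way to do it. The helper names migrate_search_params and migrate_search_response are hypothetical, and the mappings cover only the renames listed in this article.

```python
# Old name -> new name, as listed in this article.
PARAM_RENAMES = {
    "facetsDistribution": "facets",
    "matches": "showMatchesPosition",
}
RESPONSE_RENAMES = {
    "facetsDistribution": "facetDistribution",
    "nbHits": "estimatedTotalHits",
}


def migrate_search_params(params: dict) -> dict:
    """Rename pre-v0.28 search parameters, leaving other parameters untouched."""
    return {PARAM_RENAMES.get(name, name): value for name, value in params.items()}


def migrate_search_response(response: dict) -> dict:
    """Rename pre-v0.28 top-level response fields and the per-hit _matchesInfo."""
    migrated = {RESPONSE_RENAMES.get(name, name): value for name, value in response.items()}
    migrated["hits"] = [
        {("_matchesPosition" if key == "_matchesInfo" else key): value for key, value in hit.items()}
        for hit in response.get("hits", [])
    ]
    return migrated


old_params = {"q": "Shazam", "facetsDistribution": ["genres"], "matches": True}
print(migrate_search_params(old_params))
```

The response helper is mainly useful in tests or adapters that still hold payloads recorded against an older instance.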
Breaking change: task management
Browsing tasks
We have added a new pagination system to the /tasks endpoint.
With this change, it’s much easier to browse through the tasks, as their number can quickly grow in instances with a large number of asynchronous operations.
For each call to this endpoint, the response will return the following fields:
- limit: number of tasks returned (defaults to 20)
- from: the uid of the first task returned
- next: the uid of the next task
To view the next page of results, repeat the same query, replacing the value of from with the value of next. When the value of next is null, there are no more tasks to view.
This type of pagination system is called keyset pagination. As opposed to offset pagination used to browse indexes, documents, and keys, it has two main advantages: it prevents any inconsistencies and, since there is no need to scan and count records, it’s more efficient, which is a significant advantage when your task queue grows fast.
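The paging loop described above can be sketched in a few lines of Python. To keep the example runnable without a server, the HTTP call is replaced by a callable: fetch_page is a hypothetical function that takes a from value (or None for the first page) and returns one page. The sketch assumes each page carries its tasks in a results list next to limit, from, and next. The results field name is an assumption not stated in this article.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_tasks(fetch_page: Callable[[Optional[int]], Dict]) -> Iterator[Dict]:
    """Yield every task by following the next cursor until it is None."""
    from_uid: Optional[int] = None
    while True:
        page = fetch_page(from_uid)
        yield from page["results"]
        from_uid = page["next"]
        if from_uid is None:
            return


# In-memory stand-in for GET /tasks, newest task first.
TASKS: List[Dict] = [{"uid": uid} for uid in range(4, -1, -1)]


def fake_fetch(from_uid: Optional[int], limit: int = 2) -> Dict:
    start = 0 if from_uid is None else next(
        i for i, task in enumerate(TASKS) if task["uid"] == from_uid
    )
    chunk = TASKS[start:start + limit]
    has_more = start + limit < len(TASKS)
    return {
        "results": chunk,
        "limit": limit,
        "from": chunk[0]["uid"],
        "next": TASKS[start + limit]["uid"] if has_more else None,
    }


print([task["uid"] for task in iter_tasks(fake_fetch)])
```

In a real client, fetch_page would issue the request to /tasks with the from query parameter set and return the decoded JSON body.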
Filtering tasks
We have also made the task list filterable. You can now get tasks by status, type, or indexUid.
For example, the following command returns all tasks belonging to the index movies that succeeded:
curl -X GET 'http://localhost:7700/tasks?indexUid=movies&status=succeeded'
These modifications have led to the removal of the GET /indexes/:indexUid/tasks and GET /indexes/:indexUid/tasks/:taskUid endpoints.
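Since the per-index task routes are gone, code that used them can switch to a filtered /tasks URL. Here is a small Python sketch that builds such URLs. tasks_url is a hypothetical helper, and the filter names are the ones mentioned in this section.

```python
from typing import Optional
from urllib.parse import urlencode


def tasks_url(base_url: str, **filters: Optional[str]) -> str:
    """Build a /tasks URL, skipping filters whose value is None."""
    query = urlencode({name: value for name, value in filters.items() if value is not None})
    return f"{base_url}/tasks" + (f"?{query}" if query else "")


# Replaces the removed GET /indexes/movies/tasks route.
print(tasks_url("http://localhost:7700", indexUid="movies", status="succeeded"))
```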
Breaking change: dumps
Dump creation has always been an asynchronous operation but used a separate queue from the task queue. With v0.28, dumps have become tasks. This has resulted in a new task type called dumpCreation.
Despite being tasks and thus sharing the same queue, dumps are given priority. They’ll be processed as soon as the current task is done running. You can think of dumps as VIPs in a club: even though they arrived last, which is reflected in their taskUid, they get to skip the line.
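The "skip the line" behavior can be pictured with a toy model in Python. This is only an illustration of the ordering rule described above, not Meilisearch's scheduler: next_task is a hypothetical function, and the non-dump task types in the sample queue are placeholders.

```python
from typing import Dict, List, Optional


def next_task(queue: List[Dict]) -> Optional[Dict]:
    """Pick the task to run once the current one finishes.

    The queue is ordered by uid (oldest first). A pending dump is picked
    before any older task; otherwise the oldest task runs next.
    """
    for task in queue:
        if task["type"] == "dumpCreation":
            return task
    return queue[0] if queue else None


queue = [
    {"uid": 7, "type": "documentAdditionOrUpdate"},
    {"uid": 8, "type": "settingsUpdate"},
    {"uid": 9, "type": "dumpCreation"},  # enqueued last, highest uid
]
print(next_task(queue)["uid"])
```

Even though the dump has the highest uid, it is selected ahead of tasks 7 and 8.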
Contributors’ experience
We have worked hard on improving the contribution experience of our tokenizer: charabia. The tokenizer’s role is to split a sentence or phrase into smaller units of language, called tokens. It is a critical factor in the quality of search results. Now, it is much easier to add languages to Meilisearch. You just need to follow the instructions on CONTRIBUTING.md.
Meilisearch works perfectly with any space-separated language and has special support for Japanese and Chinese. We now support Hebrew, too, thanks to our awesome community! Other languages will still work, but the quality and relevancy of search results may vary significantly.
We would love to provide global language support. The more feedback we get from native speakers, the easier it is for us to understand how to improve performance for those languages. If you want to help us support your language, we are eager to hear from you and see how we could make progress together!
Other changes
- We have added pagination to the response of the GET /indexes and the GET /keys endpoints, and we have improved pagination for the GET /indexes/{uid}/documents endpoint
- For performance reasons, we have decided to limit the number of facet values returned per faceted attribute. This limit is customizable and defaults to 100
- You can customize the number of documents Meilisearch returns on search. The default limit is 1000 and protects the database from malicious scraping. Beware that increasing this limit can affect performance
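As an illustration of the two customizable limits, the Python sketch below builds a settings payload with the defaults mentioned above. The setting names faceting.maxValuesPerFacet and pagination.maxTotalHits are not given in this article. They are assumptions based on the index settings API, so verify them against the settings reference before use. limit_settings is a hypothetical helper.

```python
import json


def limit_settings(max_values_per_facet: int = 100, max_total_hits: int = 1000) -> dict:
    """Build an index settings payload for the facet and search result limits."""
    if max_values_per_facet < 0 or max_total_hits < 0:
        raise ValueError("limits must not be negative")
    return {
        # Assumed setting names; check the settings reference.
        "faceting": {"maxValuesPerFacet": max_values_per_facet},
        "pagination": {"maxTotalHits": max_total_hits},
    }


# Raise the search limit while keeping the default facet limit.
print(json.dumps(limit_settings(max_total_hits=5000), indent=2))
```

The resulting JSON would be sent as the body of a request to the index settings endpoint.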
We apologize in advance for any inconvenience caused by all these changes. It’s for a good cause: we are making these changes now to move towards v1.0 and avoid breaking changes later. Don’t hesitate to reach out to us if you need support or have any doubts. We are always happy to help!
Contributors
We are really grateful for this amazing community. We want to thank @0x0x1, @choznerol, @pierre-l, @ryanrussell, @Thearas, and @walterbm for their help with Meilisearch, and @matthias-wright for his help with milli. We want to send a special shout-out to @benny-n for adding Hebrew to our tokenizer.
And that’s it for v0.28! Remember to check the changelog for the full release notes, and see you next time!