Distinct attribute
The distinct attribute is a special, user-designated field. It is most commonly used to prevent Meilisearch from returning a set of several similar documents, instead forcing it to return only one.
You may set a distinct attribute in two ways: using the distinctAttribute index setting during configuration, or the distinct search parameter at search time.
Setting a distinct attribute during configuration
distinctAttribute is an index setting that configures a default distinct attribute Meilisearch applies to all searches and facet retrievals in that index.
WARNING
There can be only one distinctAttribute per index. Trying to set multiple fields as a distinctAttribute will return an error.
The value of a field configured as a distinct attribute will always be unique among returned documents: no two returned documents share the same value in that field.
When multiple documents have the same value for the distinct attribute, Meilisearch returns only the highest-ranked result after applying ranking rules. If two or more documents are equivalent in terms of ranking, Meilisearch returns the first result according to its internal_id.
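The selection rule described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Meilisearch's implementation: the score and internal_id fields are hypothetical stand-ins for the outcome of the ranking rules and for the internal identifier, and the sketch assumes that a lower internal_id comes first and that every document contains the distinct field.

```python
from typing import Any

Doc = dict[str, Any]


def rank(docs: list[Doc]) -> list[Doc]:
    """Order documents best-first.

    Assumption: a higher hypothetical `score` means a better rank, and
    ties are broken by the lower hypothetical `internal_id`.
    """
    return sorted(docs, key=lambda d: (-d["score"], d["internal_id"]))


def apply_distinct(ranked_docs: list[Doc], attribute: str) -> list[Doc]:
    """Keep only the first (best-ranked) document for each attribute value."""
    seen = set()
    kept = []
    for doc in ranked_docs:
        value = doc[attribute]  # assumes the field exists and is hashable
        if value in seen:
            continue  # a better-ranked document already holds this value
        seen.add(value)
        kept.append(doc)
    return kept


# Hypothetical documents: two variants tie on score, one ranks lower.
docs = [
    {"internal_id": 2, "product_id": "123456", "color": "blue", "score": 0.9},
    {"internal_id": 0, "product_id": "123456", "color": "brown", "score": 0.9},
    {"internal_id": 1, "product_id": "777", "color": "black", "score": 0.5},
]

hits = apply_distinct(rank(docs), "product_id")
print([(d["product_id"], d["color"]) for d in hits])
# [('123456', 'brown'), ('777', 'black')]
```

The two documents sharing product_id 123456 tie on score, so the one with the lower internal_id is kept and the other is dropped.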
Example
Suppose you have an e-commerce dataset. For an index that contains information about jackets, you may have several identical items with minor variations such as color or size.
As shown below, this dataset contains three documents representing different versions of a Lee jeans leather jacket. One of the jackets is brown, one is black, and the last one is blue.
[
  {
    "id": 1,
    "description": "Leather jacket",
    "brand": "Lee jeans",
    "color": "brown",
    "product_id": "123456"
  },
  {
    "id": 2,
    "description": "Leather jacket",
    "brand": "Lee jeans",
    "color": "black",
    "product_id": "123456"
  },
  {
    "id": 3,
    "description": "Leather jacket",
    "brand": "Lee jeans",
    "color": "blue",
    "product_id": "123456"
  }
]
By default, a search for lee leather jacket would return all three documents. This might not be desired, since displaying nearly identical variations of the same item can make results appear cluttered.
In this case, you may want to return only one document with the product_id corresponding to this Lee jeans leather jacket. To do so, you could set product_id as the distinctAttribute.
curl \
-X PUT 'http://localhost:7700/indexes/jackets/settings/distinct-attribute' \
-H 'Content-Type: application/json' \
--data-binary '"product_id"'
Once distinctAttribute is set to product_id, search requests will never return more than one document with the same product_id.
After setting the distinct attribute as shown above, querying for lee leather jacket would only return the first document found. The response would look like this:
{
  "hits": [
    {
      "id": 1,
      "description": "Leather jacket",
      "brand": "Lee jeans",
      "color": "brown",
      "product_id": "123456"
    }
  ],
  "offset": 0,
  "limit": 20,
  "estimatedTotalHits": 1,
  "processingTimeMs": 0,
  "query": "lee leather jacket"
}
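If you want to confirm on the client side that a response honors the uniqueness guarantee, a small check like the following may help. It is a sketch only: duplicated_values is a hypothetical helper name, the response body is the sample response for the lee leather jacket query, and the code assumes every hit contains the distinct field.

```python
import json
from collections import Counter


def duplicated_values(response_body: str, attribute: str) -> list:
    """Return the attribute values that appear in more than one hit."""
    hits = json.loads(response_body)["hits"]
    counts = Counter(hit[attribute] for hit in hits)
    return [value for value, n in counts.items() if n > 1]


# Sample response body for the "lee leather jacket" query.
response_body = """
{
  "hits": [
    {
      "id": 1,
      "description": "Leather jacket",
      "brand": "Lee jeans",
      "color": "brown",
      "product_id": "123456"
    }
  ],
  "offset": 0,
  "limit": 20,
  "estimatedTotalHits": 1,
  "processingTimeMs": 0,
  "query": "lee leather jacket"
}
"""

print(duplicated_values(response_body, "product_id"))  # []
```

An empty list means no product_id value occurs more than once among the hits.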
For more in-depth information on the distinct attribute, consult the API reference.
Setting a distinct attribute at search time
distinct is a search parameter you may add to any search query. It allows you to selectively use distinct attributes depending on the context. distinct takes precedence over distinctAttribute.
To use an attribute with distinct, first add it to the filterableAttributes list:
curl \
-X PUT 'http://localhost:7700/indexes/products/settings/filterable-attributes' \
-H 'Content-Type: application/json' \
--data-binary '[
"product_id",
"sku",
"url"
]'
Then use distinct in a search query, specifying one of the configured attributes:
curl \
-X POST 'http://localhost:7700/indexes/products/search' \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "white shirt",
"distinct": "sku"
}'
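For readers scripting this in Python, the sketch below builds the same search request with the standard library and models the precedence rule between the two settings. The helper names effective_distinct and build_search_request are hypothetical, the URL and index name simply mirror the curl example, and treating None as "not provided" is an assumption of this sketch. The request is constructed but deliberately not sent.

```python
import json
from typing import Optional
from urllib.request import Request


def effective_distinct(index_setting: Optional[str],
                       search_param: Optional[str]) -> Optional[str]:
    """Model of the precedence rule: the search-time value wins when given."""
    return search_param if search_param is not None else index_setting


def build_search_request(base_url: str, index: str, query: str,
                         distinct: Optional[str] = None) -> Request:
    """Build (but do not send) a POST request to the search route."""
    payload = {"q": query}
    if distinct is not None:
        payload["distinct"] = distinct  # must be a filterable attribute
    return Request(
        f"{base_url}/indexes/{index}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("http://localhost:7700", "products",
                           "white shirt", distinct="sku")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
print(effective_distinct("product_id", "sku"))  # sku

# To actually send it against a running instance, you could call
# urllib.request.urlopen(req) and parse the JSON body of the response.
```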