Typo tolerance calculations
Typo tolerance helps users find relevant results even when their search queries contain spelling mistakes or typos, for example, typing phnoe instead of phone. You can configure the typo tolerance feature for each index.
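As a rough illustration of per-index configuration, the sketch below builds (but does not send) an HTTP request that updates one index's typo tolerance settings. It assumes the settings route `PATCH /indexes/{index_uid}/settings/typo-tolerance` and the `enabled` and `minWordSizeForTypos` fields from the Meilisearch API reference; verify both against the reference for your version. The host, index name, API key, and the helper name `build_typo_tolerance_request` are hypothetical.

```python
import json
from urllib.request import Request


def build_typo_tolerance_request(host: str, index_uid: str, api_key: str) -> Request:
    """Build a settings update for one index (assumed route and field names)."""
    settings = {
        "enabled": True,
        # Assumed to mirror the default thresholds: one typo from 5 characters,
        # two typos from 9 characters.
        "minWordSizeForTypos": {"oneTypo": 5, "twoTypos": 9},
    }
    return Request(
        url=f"{host}/indexes/{index_uid}/settings/typo-tolerance",
        data=json.dumps(settings).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="PATCH",
    )


# Hypothetical host, index, and key; the request is only constructed here.
request = build_typo_tolerance_request("http://localhost:7700", "movies", "MASTER_KEY")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running instance.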
Meilisearch uses a prefix Levenshtein algorithm to determine if a word in a document could be a possible match for a query term.
The number of typos referenced above is roughly equivalent to Levenshtein distance. The Levenshtein distance between two words M and P can be thought of as "the minimum cost of transforming M into P" by performing the following elementary operations on M:
- substitution of a character (for example, kitten → sitten)
- insertion of a character (for example, siting → sitting)
- deletion of a character (for example, saturday → satuday)
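The three elementary operations can be sketched with the textbook dynamic-programming form of Levenshtein distance. This is the plain algorithm, not the prefix variant Meilisearch uses internally, and the function name `levenshtein` is illustrative only.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of substitutions, insertions, and deletions turning source into target."""
    # previous[j] is the distance between the processed prefix of source and target[:j].
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        # Turning source[:i] into the empty string takes i deletions.
        current = [i]
        for j, target_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,                                # delete source_char
                current[j - 1] + 1,                             # insert target_char
                previous[j - 1] + (source_char != target_char), # substitute, free on a match
            ))
        previous = current
    return previous[-1]


# Each pair from the list above differs by exactly one operation, so each prints 1.
for before, after in [("kitten", "sitten"), ("siting", "sitting"), ("saturday", "satuday")]:
    print(before, "->", after, levenshtein(before, after))
```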
By default, Meilisearch uses the following rules for matching documents. Note that these rules are by word and not for the whole query string.
- If the query word is between 1 and 4 characters long, no typo is allowed. Only documents that contain words that start with this query word or have the same length as it are considered valid
- If the query word is between 5 and 8 characters long, one typo is allowed. Documents that contain words that match with one typo are retained for the next steps
- If the query word contains more than 8 characters, a maximum of two typos is accepted
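The default thresholds above amount to a small lookup on the length of each query word. A minimal sketch, assuming length is counted in characters and using the hypothetical helper name `allowed_typos`:

```python
def allowed_typos(query_word: str) -> int:
    """Return the default typo budget for a single query word."""
    length = len(query_word)
    if length <= 4:
        return 0  # 1 to 4 characters: no typo allowed
    if length <= 8:
        return 1  # 5 to 8 characters: one typo allowed
    return 2      # more than 8 characters: up to two typos


# The budget applies to each word separately, not to the whole query string.
for word in "cheap phone wednesday".split():
    print(word, allowed_typos(word))
```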
This means that saturday, which is 8 characters long, falls under the second rule and matches documents containing that word with at most one typo. For example:
- saturday is accepted because it is the same word
- satuday is accepted because it contains one typo
- sutuday is not accepted because it contains two typos
- caturday is not accepted because it counts as two typos (as explained above, a typo on the first letter of a word is treated as two typos)
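The four cases above can be reproduced with a short check. This is a sketch under two assumptions: typos are counted with plain Levenshtein distance, and a mismatch on the first letter is approximated by charging one extra typo. The names `edit_distance`, `typo_count`, and `is_accepted` are hypothetical and do not reflect Meilisearch's internal implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance between a and b."""
    previous = list(range(len(b) + 1))
    for i, a_char in enumerate(a, start=1):
        current = [i]
        for j, b_char in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (a_char != b_char), # substitution
            ))
        previous = current
    return previous[-1]


def typo_count(query_word: str, document_word: str) -> int:
    """Count typos, charging a first-letter mismatch twice (assumed approximation)."""
    typos = edit_distance(query_word, document_word)
    if query_word[:1] != document_word[:1]:
        typos += 1
    return typos


def is_accepted(query_word: str, document_word: str, max_typos: int) -> bool:
    """True when the document word is within the typo budget of the query word."""
    return typo_count(query_word, document_word) <= max_typos


# saturday has 8 characters, so its default budget is one typo.
for candidate in ["saturday", "satuday", "sutuday", "caturday"]:
    print(candidate, typo_count("saturday", candidate), is_accepted("saturday", candidate, 1))
```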
Impact of typo tolerance on the typo ranking rule
The typo ranking rule sorts search results by increasing number of typos on matched query words. Documents with 0 typos will rank highest, followed by those with 1 and then 2 typos.
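Conceptually, this ordering is an ascending sort on the typo count. The sketch below assumes each hit already carries its typo count; the `Hit` type and `rank_by_typos` are hypothetical names, and the real engine combines this rule with its other ranking rules.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    document_id: int
    matched_word: str
    typos: int


def rank_by_typos(hits: List[Hit]) -> List[Hit]:
    """Order hits by increasing typo count; ties keep their incoming order."""
    # sorted() is stable, so hits with equal typo counts are not reshuffled.
    return sorted(hits, key=lambda hit: hit.typos)


hits = [
    Hit(1, "wensday", 2),
    Hit(2, "wednesday", 0),
    Hit(3, "wednsday", 1),
    Hit(4, "wednesday", 0),
]
print([hit.document_id for hit in rank_by_typos(hits)])  # prints [2, 4, 3, 1]
```

If typo tolerance is disabled, every returned hit has a typo count of 0, so a sort like this leaves the order unchanged.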
The presence or absence of the typo ranking rule has no impact on the typo tolerance setting. However, disabling the typo tolerance setting effectively also disables the typo ranking rule. This is because all returned documents will contain 0 typos.
To summarize:
- Typo tolerance affects how lenient Meilisearch is when matching documents
- The typo ranking rule affects how Meilisearch sorts its results
- Disabling typo tolerance also disables the typo ranking rule