
Optimize indexing performance by analyzing batch statistics

Indexing performance can vary significantly depending on your dataset, index settings, and hardware. The batch object provides information about the progress of asynchronous indexing operations. The progressTrace field within the batch object offers a detailed breakdown of where time is spent during the indexing process. Use this data to identify bottlenecks and improve indexing speed.

Understanding the progressTrace

progressTrace is a hierarchical trace showing each phase of indexing and how long it took. Each entry follows the structure:
"processing tasks > indexing > extracting word proximity": "33.71s"
This means:
  • The step occurred during indexing.
  • The subtask was extracting word proximity.
  • It took 33.71 seconds.
Focus on the longest-running steps and investigate which index settings or data characteristics influence them.
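To find the longest-running steps quickly, the trace can be sorted by duration. The following is a minimal sketch, not an official client. It assumes progressTrace has been loaded as a mapping of step name to duration string, that durations may carry the units s, ms, µs, or ns (this page only shows seconds), and that a parent entry includes the time of its children, so only leaf entries are ranked. The names to_seconds and slowest_steps are hypothetical.

```python
import re

# Multipliers to seconds. The unit set is an assumption; this page only
# shows values in seconds, such as "33.71s".
_UNITS = {"s": 1.0, "ms": 1e-3, "µs": 1e-6, "us": 1e-6, "ns": 1e-9}
_DURATION = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*(s|ms|µs|us|ns)\s*$")


def to_seconds(value):
    """Convert a duration string such as "33.71s" to a float in seconds."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit]


def slowest_steps(progress_trace, limit=5):
    """Return up to `limit` (step, seconds) pairs, longest first.

    Only leaf steps are ranked: an entry is skipped when another key
    extends it with " > ", on the assumption that parents include the
    time of their children.
    """
    keys = list(progress_trace)
    leaves = [
        key for key in keys
        if not any(other.startswith(key + " > ") for other in keys)
    ]
    timed = [(key, to_seconds(progress_trace[key])) for key in leaves]
    timed.sort(key=lambda item: item[1], reverse=True)
    return timed[:limit]


if __name__ == "__main__":
    # Two values are taken from this page; the parent entry is made up
    # for illustration.
    sample = {
        "processing tasks > indexing": "1800.00s",
        "processing tasks > indexing > extracting word proximity": "33.71s",
        "processing tasks > indexing > post processing facets > facet search": "1763.06s",
    }
    for step, seconds in slowest_steps(sample):
        print(f"{seconds:10.2f}s  {step}")
```

Ranking only leaf entries keeps umbrella steps such as "processing tasks > indexing" from crowding out the subtasks you can actually act on.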

Key phases and how to optimize them

computing document changes and extracting documents

Description: Meilisearch compares incoming documents to existing ones.
Optimization: No direct optimization possible. Process duration scales with the number and size of incoming documents.

extracting facets and merging facet caches

Description: Extracts and merges filterable attributes.
Optimization: Keep the number of filterable attributes to a minimum.

extracting words and merging word caches

Description: Tokenizes text and builds the inverted index.
Optimization: Ensure the searchable attributes list only includes the fields you want to be checked for query word matches.

extracting word proximity and merging word proximity

Description: Builds data structures for phrase and attribute ranking.
Optimization: Lower the precision of this operation by setting proximity precision to byAttribute.

waiting for database writes

Description: Time spent writing data to disk.
Optimization: No direct optimization possible. Either the disk is too slow or you are writing too much data in a single operation. Avoid HDDs (hard disk drives).

waiting for extractors

Description: Time spent waiting for CPU-bound extraction.
Optimization: No direct optimization possible. Indicates a CPU bottleneck. Use more cores or scale horizontally with sharding.

post processing facets > strings bulk / numbers bulk

Description: Processes equality or comparison filters.
Optimization:
- Disable unused filter features, such as comparison operators on string values.
- Reduce the number of sortable attributes.

post processing facets > facet search

Description: Builds structures for the facet search API.
Optimization: If you don’t use the facet search API, disable it.

Embeddings

Trace key: writing embeddings to database
Description: Time spent saving vector embeddings.
Optimization:
- Use embedding vectors with fewer dimensions.
- Disable embedding regeneration on document update.
- Consider enabling binary quantization.

post processing words > word prefix *

Description: Builds prefix data for autocomplete. This lets a query term match words that begin with it, instead of only exact matches.
Optimization: Disable prefix search (prefixSearch: disabled). Be aware that this can severely degrade search result relevancy.

post processing words > word fst

Description: Builds the word FST (finite state transducer).
Optimization: No direct action possible, as FST size reflects the number of different words in the database. Using documents with fewer searchable words may improve operation speed.
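Several of the optimizations above are index settings, so they can be applied together in one settings update. The sketch below only builds the HTTP request and does not send it. It is written under assumptions: only prefixSearch: disabled and the facet-search sub-route appear on this page, while the PATCH route on /indexes/INDEX_UID/settings and the keys searchableAttributes, filterableAttributes, proximityPrecision, and facetSearch are assumed from the Meilisearch settings API and should be checked against the API reference for your version. The function name build_settings_patch is hypothetical.

```python
import json
import urllib.request


def build_settings_patch(
    base_url,
    index_uid,
    *,
    searchable_attributes=None,
    filterable_attributes=None,
    proximity_by_attribute=False,
    disable_facet_search=False,
    disable_prefix_search=False,
):
    """Build (but do not send) a settings update for one index.

    Only the options that are explicitly requested end up in the body,
    so untouched settings keep their current values.
    """
    settings = {}
    if searchable_attributes is not None:
        settings["searchableAttributes"] = list(searchable_attributes)
    if filterable_attributes is not None:
        settings["filterableAttributes"] = list(filterable_attributes)
    if proximity_by_attribute:
        settings["proximityPrecision"] = "byAttribute"  # assumed key name
    if disable_facet_search:
        settings["facetSearch"] = False  # assumed key name
    if disable_prefix_search:
        settings["prefixSearch"] = "disabled"  # can hurt relevancy

    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/indexes/{index_uid}/settings",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


if __name__ == "__main__":
    # "movies" and the attribute names are placeholders.
    request = build_settings_patch(
        "http://localhost:7700",
        "movies",
        searchable_attributes=["title", "overview"],
        proximity_by_attribute=True,
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To apply it: urllib.request.urlopen(request), adding an
    # Authorization header first if your instance requires an API key.
```

Building the request separately from sending it makes it easy to review the exact payload before it changes a production index, since each of these settings triggers reindexing work of its own.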

Example analysis

If you see:
"processing tasks > indexing > post processing facets > facet search": "1763.06s"
Facet search is taking a significant share of indexing time. If your application doesn’t use the facet search API, disable the feature:
curl \
  -X PUT 'MEILISEARCH_URL/indexes/INDEX_UID/settings/facet-search' \
  -H 'Content-Type: application/json' \
  --data-binary 'false'
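The same diagnosis can be applied to any trace entry by looking up the advice for its phase. The following is a rough sketch in which the phase names and hints are condensed from the sections of this page. Matching by substring and the names PHASE_HINTS and hint_for are illustrative choices, not part of Meilisearch.

```python
from typing import Optional

# Phase names and advice are condensed from the sections of this page.
# Each tuple of phase names shares one hint.
PHASE_HINTS = [
    (("computing document changes", "extracting documents"),
     "No direct optimization; duration scales with the number and size of incoming documents."),
    (("extracting facets", "merging facet caches"),
     "Keep the number of filterable attributes to a minimum."),
    (("extracting words", "merging word caches"),
     "Limit searchable attributes to the fields that should match query words."),
    (("extracting word proximity", "merging word proximity"),
     "Set proximity precision to byAttribute."),
    (("waiting for database writes",),
     "Disk-bound: avoid HDDs or write less data per operation."),
    (("waiting for extractors",),
     "CPU-bound: use more cores or scale horizontally with sharding."),
    (("strings bulk", "numbers bulk"),
     "Disable unused filter features and reduce the number of sortable attributes."),
    (("post processing facets > facet search",),
     "Disable facet search if the facet search API is unused."),
    (("writing embeddings to database",),
     "Use fewer dimensions, disable regeneration on update, or consider binary quantization."),
    (("post processing words > word prefix",),
     "Disable prefix search, at a possible cost to relevancy."),
    (("post processing words > word fst",),
     "No direct action; fewer distinct searchable words may help."),
]


def hint_for(trace_key: str) -> Optional[str]:
    """Return the hint for the first phase named in `trace_key`, if any."""
    for phases, hint in PHASE_HINTS:
        if any(phase in trace_key for phase in phases):
            return hint
    return None


if __name__ == "__main__":
    key = "processing tasks > indexing > post processing facets > facet search"
    print(hint_for(key))
```

A lookup like this pairs naturally with a script that ranks trace entries by duration, so that each slow step is reported alongside the setting most likely to reduce it.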

Learn more