Storage
Meilisearch is in many ways a database: it stores indexed documents along with the data needed to return relevant search results.
Database location
Meilisearch creates the database the moment you first launch an instance. By default, you can find it inside a data.ms
folder located in the same directory as the meilisearch
binary.
The database location can change depending on a number of factors, such as whether you have configured a different database path with the --db-path
instance option, or if you're using an OS virtualization tool like Docker.
LMDB
Creating a database from scratch and managing it is hard work. It would make no sense to try and reinvent the wheel, so Meilisearch uses a storage engine under the hood. This allows the Meilisearch team to focus on improving search relevancy and search performance while abstracting away the complicated task of creating, reading, and updating documents on disk and in memory.
Our storage engine is called Lightning Memory-Mapped Database (LMDB for short). LMDB is a transactional key-value store written in C that was developed for OpenLDAP and has ACID properties. Though we considered other options, such as Sled and RocksDB, we chose LMDB because it provided us with the best combination of performance, stability, and features.
Memory mapping
LMDB stores its data in a memory-mapped file. All data fetched from LMDB is returned straight from the memory map, which means there is no memory allocation or memory copy during data fetches.
All documents stored on disk are automatically loaded in memory when Meilisearch asks for them. This ensures LMDB will always make the best use of the RAM available to retrieve the documents.
For the best performance, it is recommended to provide the same amount of RAM as the size the database takes on disk, so all the data structures can fit in memory.
Understanding LMDB
The choice of LMDB comes with certain pros and cons, especially regarding database size and memory usage. We summarize the most important aspects of LMDB here, but check out this blog post by LMDB's developers for more in-depth information.
Database size
When deleting documents from a Meilisearch index, you may notice disk space usage remains the same. This happens because LMDB internally marks that space as free, but does not make it available for the operating system at large. This design choice leads to better performance, as there is no need for periodic compaction operations. As a result, disk space occupied by LMDB (and thus by Meilisearch) tends to increase over time. It is not possible to calculate the precise maximum amount of space a Meilisearch instance can occupy.
Memory usage
Since LMDB is memory mapped, it is the operating system that manages the real memory allocated (or not) to Meilisearch.
Thus, if you run Meilisearch as a standalone program on a server, LMDB will use the maximum RAM it can use. In general, you should have the same amount of RAM as the space taken on disk by Meilisearch for optimal performance.
On the other hand, if you run Meilisearch along with other programs, the OS will manage memory based on everyone's needs. This makes Meilisearch's memory usage quite flexible when used in development.
TIP
Virtual Memory != Real Memory Virtual memory is the disk space a program requests from the OS. It is not the memory that the program will actually use.
Meilisearch will always demand a certain amount of space to use as a memory map. This space will be used as virtual memory, but the amount of real memory (RAM) used will be much smaller.
Measured disk usage
The following measurements were taken using movies.json an 8.6 MB JSON dataset containing 19,553 documents.
After indexing, the dataset size in LMDB is about 122MB.
Raw JSON | Meilisearch database size on disk | RAM usage | Virtual memory usage |
---|---|---|---|
9.1 MB | 224 MB | ≃ 305 MB | 205 Gb (memory map) |
This means the database is using 305 MB of RAM and 224 MB of disk space. Note that virtual memory refers only to disk space allocated by your computer for Meilisearch—it does not mean that it's actually in use by the database. See Memory Usage for more details.
WARNING
These metrics are highly dependent on the machine that is running Meilisearch. Running this test on significantly underpowered machines is likely to give different results.
It is important to note that there is no reliable way to predict the final size of a database. This is true for just about any search engine on the market—we're just the only ones saying it out loud.
Database size is affected by a large number of criteria, including settings, relevancy rules, use of facets, the number of different languages present, and more.