Data durability

Memgraph uses two mechanisms to ensure the durability of stored data and make disaster recovery possible:

  • write-ahead logging (WAL)
  • periodic snapshot creation

These mechanisms generate durability files and save them in the respective wal and snapshots folders in the data directory. The data directory stores permanent data on disk.

The default data directory path is /var/lib/memgraph but the path can be changed by modifying the --data-directory configuration flag. To learn how to modify configuration flags, head over to the Configuration page.

With Memgraph Enterprise, the data directory also holds a databases directory, which splits durability files by database name. This follows from the multi-tenant architecture of Memgraph Enterprise, where the durability files for each database are stored under /data_directory/databases/<db_name>. The databases directory exists even if you're not using the multi-tenancy feature.
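The layout described above can be sketched as a small path helper. This is only an illustration: the function name durability_dirs and the database name tenant_a are hypothetical, and the sketch assumes the wal and snapshots folders sit directly under the per-database folder, which you should confirm against your own data directory.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def durability_dirs(data_directory: str, db_name: Optional[str] = None) -> Dict[str, str]:
    """Return the assumed wal/snapshots folders for one database.

    Assumption: wal and snapshots live directly under the chosen root.
    """
    root = PurePosixPath(data_directory)
    if db_name is not None:
        # Enterprise layout from the text: <data_directory>/databases/<db_name>
        root = root / "databases" / db_name
    return {"wal": str(root / "wal"), "snapshots": str(root / "snapshots")}


# Default layout, then the per-database layout (tenant_a is a made-up name).
print(durability_dirs("/var/lib/memgraph"))
print(durability_dirs("/var/lib/memgraph", "tenant_a"))
```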

Durability files are deleted when certain events are triggered, for example, when the number of snapshots exceeds the maximum defined by the --storage-snapshot-retention-count flag (for instance, --storage-snapshot-retention-count=3).
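To make the retention rule concrete, here is a minimal sketch of how a retention count could select files for deletion. It is not Memgraph's implementation: the function name, the (file name, timestamp) representation and the file names are assumptions made for illustration.

```python
from typing import List, Tuple


def snapshots_to_delete(snapshots: List[Tuple[str, int]], retention_count: int = 3) -> List[str]:
    """Return the names of snapshots that fall outside the retention window.

    snapshots holds (file name, creation timestamp) pairs.
    """
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    # Keep the newest retention_count snapshots; everything older is removable.
    return [name for name, _ in newest_first[retention_count:]]


files = [("snap_a", 100), ("snap_b", 200), ("snap_c", 300), ("snap_d", 400)]
print(snapshots_to_delete(files))  # ['snap_a']
```

With a retention count of 3 and four snapshots on disk, only the oldest one is selected.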

To prevent the deletion of durability files, lock the data directory. To allow deletion again, unlock it.

To manage this behavior, use the following queries:

LOCK DATA DIRECTORY;
UNLOCK DATA DIRECTORY;

To show the status of the data directory, run:

DATA DIRECTORY LOCK STATUS;
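A typical reason to lock the data directory is to copy durability files without them being deleted mid-copy. The sketch below shows one possible backup routine. The run_query parameter is a hypothetical callable that sends a query string to Memgraph through whatever client you use, and the routine assumes the wal and snapshots folders sit directly in the data directory.

```python
import shutil
from pathlib import Path
from typing import Callable


def backup_data_directory(run_query: Callable[[str], object], data_dir: Path, backup_dir: Path) -> None:
    """Copy durability files while the data directory is locked."""
    run_query("LOCK DATA DIRECTORY;")
    try:
        for folder in ("snapshots", "wal"):
            source = data_dir / folder
            if source.is_dir():
                # Copy the whole folder; existing backup folders are reused.
                shutil.copytree(source, backup_dir / folder, dirs_exist_ok=True)
    finally:
        # Always unlock, even if the copy fails, so cleanup can resume.
        run_query("UNLOCK DATA DIRECTORY;")
```

The try/finally block is the important design choice here: a failed copy should not leave the directory locked indefinitely.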

To encrypt the data directory, use LUKS, as it works with Memgraph out of the box and is transparent to the application, so it shouldn't break any existing applications.

Durability mechanisms

To configure the durability mechanisms, check their respective configuration flags.

Write-ahead logging

Write-ahead logging (WAL) is a technique for providing atomicity and durability in database systems.

In the default IN_MEMORY_TRANSACTIONAL storage mode, Memgraph creates a Delta object each time data is changed. By using Deltas, Memgraph creates write-ahead logs. Each database modification is therefore recorded in a log file before being written to the DB, and in the end the log file contains all steps needed to reconstruct the DB’s most recent state.
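The idea of logging a change before applying it, and rebuilding state by replaying the log, can be illustrated with a toy key-value store. This is a generic sketch of write-ahead logging, not Memgraph's Delta or WAL format; TinyStore, replay and the JSON record layout are all made up for the example.

```python
import json
from typing import Dict, List


class TinyStore:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self) -> None:
        self.log: List[str] = []          # stands in for the WAL file
        self.data: Dict[str, object] = {}

    def set(self, key: str, value: object) -> None:
        self.log.append(json.dumps({"op": "set", "key": key, "value": value}))
        self.data[key] = value            # applied only after it is logged

    def delete(self, key: str) -> None:
        self.log.append(json.dumps({"op": "delete", "key": key}))
        self.data.pop(key, None)


def replay(log: List[str]) -> Dict[str, object]:
    """Rebuild the most recent state from the log alone."""
    state: Dict[str, object] = {}
    for line in log:
        record = json.loads(line)
        if record["op"] == "set":
            state[record["key"]] = record["value"]
        else:
            state.pop(record["key"], None)
    return state


store = TinyStore()
store.set("a", 1)
store.set("b", 2)
store.delete("a")
print(replay(store.log))  # {'b': 2}
```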

Memgraph has WAL enabled by default. To switch it on or off, use the boolean --storage-wal-enabled flag. For other WAL-related flags, check the configuration reference guide.

By default, WAL files are located at /var/lib/memgraph/wal.

Snapshots

Snapshots provide a faster way to restore the state of your database. Snapshots are created periodically, at the interval defined by the --storage-snapshot-interval-sec configuration flag, and upon exit, depending on the value of the --storage-snapshot-on-exit configuration flag. When snapshot creation is triggered, the entire data storage is written to the drive. Nodes and relationships are divided into groups called batches.

On startup, the database state is recovered from the most recent snapshot file. Memgraph can read the data and build the indexes on multiple threads, using batches as a parallelization unit: each thread will recover one batch at a time until there are no unhandled batches.

This means the same batch size might not be suitable for every dataset. A smaller dataset might require a smaller batch size to utilize a multi-threaded processor, while bigger datasets might use bigger batches to minimize the synchronization between the worker threads. Therefore, the size of batches and the number of used threads are configurable similarly to other durability-related settings.
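The trade-off between batch size and thread utilization can be seen in a small sketch of batch-parallel recovery. It uses a thread pool over plain Python tuples and is only a model of the idea: the function names and the (id, label) item format are assumptions, not Memgraph internals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Item = Tuple[int, str]  # (id, label), a stand-in for a stored node


def split_into_batches(items: List[Item], batch_size: int) -> List[List[Item]]:
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def recover_batch(batch: List[Item]) -> Dict[int, str]:
    """Rebuild the in-memory representation of one batch."""
    return {item_id: label for item_id, label in batch}


def recover(items: List[Item], batch_size: int, thread_count: int) -> Tuple[Dict[int, str], int]:
    """Recover all items in parallel; also return the number of batches."""
    batches = split_into_batches(items, batch_size)
    recovered: Dict[int, str] = {}
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # Workers take the next unhandled batch until none remain.
        for partial in pool.map(recover_batch, batches):
            recovered.update(partial)
    return recovered, len(batches)


items = [(i, f"Node{i}") for i in range(10)]
print(recover(items, batch_size=4, thread_count=2)[1])    # 3
print(recover(items, batch_size=100, thread_count=4)[1])  # 1
```

In the second call the whole dataset fits into one batch, so only one of the four threads has any work, which is why a small dataset may need a smaller batch size.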

The timestamp of the snapshot is compared with the latest update recorded in the WAL file and, if the snapshot is less recent, the state of the DB will be recovered using the WAL file.
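One way to picture this comparison is a recovery routine that starts from the snapshot and applies only the WAL entries that are newer than it. This is a simplified reading of the paragraph above, not a description of Memgraph's recovery code; the entry format (timestamp, key, value) and the function name are assumptions.

```python
from typing import Dict, List, Tuple

WalEntry = Tuple[int, str, int]  # (timestamp, key, value)


def recover_state(snapshot_timestamp: int,
                  snapshot_state: Dict[str, int],
                  wal_entries: List[WalEntry]) -> Dict[str, int]:
    """Start from the snapshot, then apply WAL entries newer than it."""
    state = dict(snapshot_state)
    for timestamp, key, value in sorted(wal_entries):
        if timestamp > snapshot_timestamp:
            # The snapshot predates this change, so replay it from the WAL.
            state[key] = value
    return state


wal = [(5, "a", 1), (12, "b", 2), (15, "a", 3)]
print(recover_state(10, {"a": 1}, wal))  # {'a': 3, 'b': 2}
```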

Memgraph has snapshot creation enabled by default. You can configure the exact snapshot creation behavior by defining the relevant flags. Alternatively, you can create a snapshot manually by running the following query:

CREATE SNAPSHOT;

By default, snapshot files are saved inside the /var/lib/memgraph/snapshots directory.

⚠️ Snapshots and WAL files are currently not compatible between Memgraph versions.

Storage modes

Memgraph can work in the IN_MEMORY_ANALYTICAL, IN_MEMORY_TRANSACTIONAL or ON_DISK_TRANSACTIONAL storage mode.

Memgraph always starts in the IN_MEMORY_TRANSACTIONAL mode, in which it uses periodic snapshots and write-ahead logging as durability mechanisms and also lets you create snapshots manually.

In the IN_MEMORY_ANALYTICAL mode, Memgraph offers neither periodic snapshots nor write-ahead logging. Users can still create a snapshot with the CREATE SNAPSHOT; Cypher query. While the snapshot is being created, other transactions are prevented from starting.

In the ON_DISK_TRANSACTIONAL mode, durability is provided by RocksDB, which keeps its own WAL files. Memgraph additionally persists the metadata used by the on-disk storage implementation.