Introduction to the Neo4j Page Cache

This article explains the Page Cache in the Neo4j database.

About Page Cache

Neo4j is a database management system (DBMS), so it persists its data on disk. However, reading from disk every time a user runs a query would be extremely slow, and write performance would suffer as well.

To improve performance, various techniques are employed, one of which is the Page Cache.

In short, the Page Cache is an in-memory cache of data that is stored on disk. Because it lives in memory, accessing it is significantly faster than going to disk for every read and write.

Incidentally, in Neo4j, both node data and relationship data are cached in the Page Cache.

Neo4j Page Cache

Eviction

The Page Cache is, after all, a cache. The data on disk is usually too large to fit entirely in memory, so when the cache fills up, existing entries that are not in use must be evicted to make room.

The number of evictions can be measured with the metric <prefix>.page_cache.evictions, which will be discussed later. By tracking it together with <prefix>.page_cache.hit_ratio, the proportion of page requests that were served from the cache, you can understand how effectively the Page Cache is being utilized.
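To make the relationship between evictions and the hit ratio concrete, here is a minimal sketch of a toy cache in Python. It uses a simple least-recently-used policy, which is an assumption made for illustration and not a description of Neo4j's actual eviction algorithm. The class name and counters are hypothetical.

```python
from collections import OrderedDict


class ToyPageCache:
    """Toy LRU cache that counts hits, faults, and evictions.

    Illustration only: Neo4j's real eviction policy is not modelled here.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> page contents
        self.hits = 0
        self.faults = 0
        self.evictions = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        # Page fault: the page has to be loaded from "disk".
        self.faults += 1
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
            self.evictions += 1
        self.pages[page_id] = f"contents of page {page_id}"
        return self.pages[page_id]

    def hit_ratio(self):
        total = self.hits + self.faults
        return self.hits / total if total else 0.0


cache = ToyPageCache(capacity=2)
for page_id in [1, 2, 1, 3, 2, 1]:
    cache.read(page_id)
print(cache.hits, cache.faults, cache.evictions, round(cache.hit_ratio(), 3))
```

With a capacity of only two pages and three distinct pages in the access pattern, most reads miss and trigger an eviction. A high eviction count combined with a low hit ratio is the pattern that suggests the cache is too small for the working set.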

Page Cache Configuration

The size of the Page Cache can be configured in neo4j.conf with dbms.memory.pagecache.size. (In Neo4j 5, this setting was renamed to server.memory.pagecache.size.)

For example, if you want to allocate 4GB of memory to the Page Cache, you can set it as follows:

dbms.memory.pagecache.size=4GB
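One rough way to choose a value is to look at how much data is actually on disk. The Python sketch below sums the file sizes under a database directory and adds some headroom for growth. The directory path, the 20% headroom, and the helper names are all assumptions for illustration. It also assumes Neo4j reads the `m` suffix as 1024-based, and it counts every file in the directory, which may overestimate what the Page Cache really needs.

```python
from pathlib import Path


def estimate_page_cache_bytes(database_dir, headroom=1.2):
    """Sum the sizes of all regular files under database_dir, scaled by headroom."""
    total = sum(p.stat().st_size for p in Path(database_dir).rglob("*") if p.is_file())
    return int(total * headroom)


def as_setting(size_bytes):
    """Format a byte count as a neo4j.conf line, rounded up to whole MiB."""
    mib = -(-size_bytes // (1024 * 1024))  # ceiling division
    return f"dbms.memory.pagecache.size={mib}m"


# Example with a fixed number instead of a real directory:
print(as_setting(4 * 1024**3))
```

For a real installation you would pass the path of the database directory to estimate_page_cache_bytes and then check the result against the memory that is actually available on the machine.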

Page Cache Metrics

When using the Page Cache, you might be concerned about how effectively the memory cache is being utilized.

In the Neo4j database, you can obtain metrics related to the Page Cache.

The main metric to check is <prefix>.page_cache.hit_ratio, which indicates the proportion of page requests that were served from the cache.

Additionally, you can use <prefix>.page_cache.evictions to see the number of evictions that occurred.

The number of I/O operations performed by the Page Cache is reported by the metric <prefix>.page_cache.iops.
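As a sketch of how these metrics might be consumed, the following Python snippet reads the most recent sample from CSV text and compares it against a threshold. It assumes the metric is exported as CSV with a header row of `t,value`. That layout, the 0.98 threshold, and the function names are assumptions for illustration, so check them against the files your installation actually produces.

```python
import csv
import io


def latest_value(csv_text, column="value"):
    """Return the last row's `column` as a float, or None if there are no rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return float(rows[-1][column]) if rows else None


def check_hit_ratio(csv_text, threshold=0.98):
    """Return a short status string for the most recent hit ratio sample."""
    ratio = latest_value(csv_text)
    if ratio is None:
        return "no samples"
    return "ok" if ratio >= threshold else "low"


# Hypothetical contents of a hit_ratio metrics file.
sample = "t,value\n1636700000,0.91\n1636700010,0.995\n"
print(check_hit_ratio(sample))
```

In practice you would read the text from the metrics file on disk, or skip CSV entirely and query whichever monitoring system your metrics are published to.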

Warm-Up Page Cache

The Page Cache is populated lazily. Data is read from disk into the Page Cache only on demand, when a query needs that data.

This means that immediately after the Neo4j database starts, the Page Cache is empty. Initial read requests are therefore served from disk until the Page Cache warms up, which hurts latency.

For applications where this is a problem, one approach is to pre-warm the Page Cache by executing dummy read queries from a separate process.
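The pre-warming approach described above can be sketched in Python as follows. The Cypher queries and the property names `name` and `since` are hypothetical placeholders, chosen so that the scan has to read node, relationship, and property records; replace them with queries that match your own data model. The second function assumes the official neo4j Python driver is installed.

```python
# Hypothetical warm-up queries; adjust labels and properties to your own data model.
WARMUP_QUERIES = [
    "MATCH (n) RETURN count(n.name)",
    "MATCH ()-[r]->() RETURN count(r.since)",
]


def warm_up(run_query, queries=WARMUP_QUERIES):
    """Run each query once, discarding results; return how many succeeded."""
    succeeded = 0
    for query in queries:
        try:
            run_query(query)
            succeeded += 1
        except Exception as exc:  # keep going so one bad query does not stop the warm-up
            print(f"warm-up query failed: {query!r}: {exc}")
    return succeeded


def warm_up_with_driver(uri, user, password):
    """Warm up through the official neo4j Python driver (pip install neo4j)."""
    from neo4j import GraphDatabase  # imported lazily so warm_up() has no dependency

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            return warm_up(lambda q: session.run(q).consume())
    finally:
        driver.close()
```

Keeping warm_up independent of the driver makes it easy to try out with any callable that executes a query, and warm_up_with_driver shows one way to plug in a real connection, for example warm_up_with_driver("bolt://localhost:7687", "neo4j", "<password>").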

NOTE: In the Enterprise Edition, Neo4j can also warm up the Page Cache automatically; this behavior is controlled by the setting dbms.memory.pagecache.warmup.enable.

Conclusion

In summary, this article gave an overview of the Page Cache in the Neo4j database, how to configure it, and the metrics it exposes.

2021-11-12