Speedb Memory Manager - Part #1

Introduction

Speedb Open Source is a modern fork of RocksDB designed for better performance and efficient resource utilization. One of those scarce resources is RAM: since Speedb/RocksDB are persistent storage engines, meaning the data does not have to reside in memory, the memory footprint is expected to be modest compared to the data size. Nevertheless, many components require memory allocations, and managing those allocations while maintaining performance and staying within the memory limit is one of the challenges of configuring this smart and complex storage engine.

To get maximum performance out of a given amount of memory, the user needs to configure different parameters for each column family and change the configuration based on the size of the databases and the workload, all while risking OOM. The outcome is that users are ultra-conservative and choose options that prevent OOM events but are less than optimal. This observation, which we have seen in many user scenarios, is the motivation for creating a memory management tool that accepts a maximum cap on the amount of memory that can be used and provides the best performance under that limitation. The main requirement for such a capability is to work as well as a manual configuration of each of the options above, assuming that configuration is legal and does not require more memory than the user intends the system to use.

“Dirty” vs “clean” memory  

The term dirty memory in storage systems describes data that must eventually be flushed to media, while clean memory holds data structures that can be recreated from the media. Dirty data is accumulated in cache so that a large enough chunk of memory can be staged at once, converting random writes into sequential writes to the media; clean data is read from the media to make read and seek operations faster. The second, very important difference is that while clean data can be dropped at any time, dirty data must be staged to media, which may take a significant amount of time, before it can be dropped from cache. In this document we review the current behavior and the changes Speedb introduced with the new dirty data manager. Clean memory management is still a work in progress.

Dirty data in Speedb

Dirty data is defined as data that must be written to the media before it can be removed. In most systems this definition covers both data that was written and changes to the metadata.

The structure of Speedb includes a mutable mem-table for each column family. When a write command arrives, it is broken down into objects and each object is placed in the associated mem-table. Eventually the mem-table is considered full; it then becomes immutable and is ready for flush, and a new mutable mem-table takes its place.
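The fill-and-flip behavior described above can be sketched roughly as follows; the names (`ColumnFamilySketch`, `MemTableSketch`) are illustrative and are not part of Speedb's API:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative sketch: writes accumulate in a mutable mem-table until it
// reaches its size limit, at which point it becomes immutable and is
// queued for flush, and a fresh mutable mem-table takes its place.
struct MemTableSketch {
  std::size_t bytes = 0;
  bool immutable = false;
};

class ColumnFamilySketch {
 public:
  explicit ColumnFamilySketch(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Place one key/value pair in the active (mutable) mem-table.
  void Put(const std::string& key, const std::string& value) {
    active_.bytes += key.size() + value.size();
    if (active_.bytes >= max_bytes_) {
      // Mem-table is full: flip it to immutable and start a fresh one.
      active_.immutable = true;
      immutables_.push_back(active_);
      active_ = MemTableSketch{};
    }
  }

  std::size_t pending_flushes() const { return immutables_.size(); }
  std::size_t active_bytes() const { return active_.bytes; }

 private:
  std::size_t max_bytes_;
  MemTableSketch active_;
  std::vector<MemTableSketch> immutables_;
};
```

A column family with a 16-byte limit, for example, flips its mem-table as soon as accumulated key/value bytes reach the limit.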

Dirty Data Manager (DDM)

The dirty data manager (AKA DDM) is a replacement for the RocksDB write buffer manager. The dirty data manager plays two roles:

  1. Initiating one or more flushes to reduce the size of the dirty data (you can read more about this in the proactive flushes documentation)
  2. In case the rate of incoming write requests exceeds the flush rate, gradually slowing down the requests to ensure that the system will not be flooded with dirty data. You can read more about the dynamic delayed write here.
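As an illustration of the second role, a gradual slowdown can be modeled as a write-rate fraction that ramps down linearly between a start-delay threshold and the hard dirty-data limit. This is a sketch of the idea, not Speedb's actual formula:

```cpp
#include <cstddef>

// Illustrative model of a gradual write slowdown: below the start
// threshold writes run at full speed; between the start threshold and
// the hard limit the allowed rate ramps down linearly; at the hard
// limit writes are fully stalled. Not Speedb's actual formula.
double AllowedWriteFraction(std::size_t dirty_bytes,
                            std::size_t start_delay_bytes,
                            std::size_t max_dirty_bytes) {
  if (dirty_bytes <= start_delay_bytes) return 1.0;  // no slowdown yet
  if (dirty_bytes >= max_dirty_bytes) return 0.0;    // full stall
  // Linear ramp-down between the two thresholds.
  double span = static_cast<double>(max_dirty_bytes - start_delay_bytes);
  return 1.0 - static_cast<double>(dirty_bytes - start_delay_bytes) / span;
}
```

The point of a gradual ramp, rather than a single hard stop, is that write latency degrades smoothly instead of collapsing into a stall the moment the limit is hit.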

The DDM is based on and enhances the implementation of the write buffer manager. Just like the write buffer manager, it tracks the size of the mem-tables in all the column families that are associated with it and manages the memory that holds the dirty data.

In addition, the new write buffer manager simplifies the way users define the memory. With the old WBM, the user had to calculate the maximum memory to allocate and divide it among the CFs, their memtables, and the memtable sizes; extending the memory allocation meant updating all of these parameters. With the new write buffer manager, you change a single parameter in order to allocate additional memory.
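For illustration, here is the arithmetic the old approach forced on the user; the numbers and the helper `PerMemtableBytes` are hypothetical:

```cpp
#include <cstddef>

// With the old WBM, a total memory budget had to be divided manually
// across column families and the memtables of each CF to derive the
// per-memtable write buffer size. Example: 4GB across 32 CFs with
// 2 memtables each yields 64MB per memtable; any change to the budget
// or the CF count meant recomputing and updating every parameter.
std::size_t PerMemtableBytes(std::size_t total_budget,
                             std::size_t num_cfs,
                             std::size_t memtables_per_cf) {
  return total_budget / (num_cfs * memtables_per_cf);
}
```

With the DDM, only the single overall cap changes; the division above no longer has to be maintained by hand.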

Speedb configures the sizes for you. By using the Speedb tuning function, all you need to do is set the memory size you would like to allocate and Speedb will do the rest.

The dirty data manager was introduced in Speedb Open Source version 2.6.

Benefits of the new dirty data manager:

  • Provides high and stable performance, with no stalls, while remaining within the memory limit and avoiding out-of-memory situations and system crashes.
  • Simplifies the configuration: a single write buffer manager is set for the entire database, making it easier to allocate extra memory when needed.
  • Eliminates the chance of write overflow.
  • Initiates as many flushes as needed.
  • Prevents parallel flushes from running when they are not required.
  • Overall performance stability: a gradual slowdown based on the amount of dirty data.

Users that may benefit from the DDM

The users that will benefit from the DDM are mainly those with a multiple-CF environment (in one or more databases).

  • Multiple CFs with similar activity currently flush together, creating a spike in system load and disk usage, which leads to performance instability.
  • In environments where CFs have different loads, the new dirty data manager ensures that the CF holding the most data is flushed first. This may improve performance, since small L0 files cause large write amplification.
  • Users with heavy write workloads, even on a single CF, will see performance stabilize when using the DDM as the trigger for delay, since this trigger is more granular.

Test 

Benchmark Configuration

The following test compares the RocksDB and Speedb write buffer managers with the following configuration:

Test 1:

  • Write buffer manager size: 4GB
  • Max memtable size: 256MB
  • 32 column families
  • Allow stall: enabled
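In code, this configuration looks roughly as follows, assuming the RocksDB-style API that Speedb inherits (the exact Speedb tuning helpers may differ):

```cpp
#include <memory>
#include <rocksdb/options.h>
#include <rocksdb/write_buffer_manager.h>

// Sketch of the benchmark configuration above. A single 4GB write
// buffer manager is shared by all 32 column families, each memtable is
// capped at 256MB, and allow_stall is enabled via the third
// WriteBufferManager constructor argument.
rocksdb::Options MakeBenchmarkOptions() {
  rocksdb::Options options;
  options.write_buffer_size = 256 << 20;  // max memtable size: 256MB
  options.write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
      4ULL << 30, /*cache=*/nullptr, /*allow_stall=*/true);
  return options;
}
```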

Benchmark Results #1 

Benchmark Results #2


Conclusion Results #1

This test shows that with Speedb’s write buffer manager, performance is stable and stall-free, while with RocksDB’s write buffer manager the observed stalls impact application performance.

Conclusion Results #2

The table shows that Speedb’s write buffer manager remained within the memory boundaries the user defined, while the RocksDB write buffer manager caused many stalls and exceeded the memory limit (reaching 6GB instead of 4GB). In heavy write workloads, this can lead to an out-of-memory scenario. The memtables also consume much less memory compared to RocksDB.

What’s next? 

Clean data manager!

With the clean data manager, users can enjoy the performance benefit of pinning over the LRU cache without running into out-of-memory conditions, and without having to deal with too many memory configurations.
