TLDR

  • Storage engines such as RocksDB and Speedb have memtables that fill up with data and require flushes to persist it to disk.
  • Uneven data distribution can cause memory management issues and inefficient flushes in busy databases.
  • Proactive flushes, a new feature in the Speedb open-source project, introduces a more proactive, global write buffer manager that initiates flushes based on heuristics.
  • Enabling proactive flushes improves application performance and makes better use of system memory compared to RocksDB's memory management.

Imagine you are running a delivery service company

You have incoming packages, outgoing packages, truck routes, drivers, etc.

At the core of the business is the fact that you are getting lots of packages into your dispatching warehouse and need to distribute them evenly between the different trucks and drivers.

But… What happens when you have lots of packages for one specific route and not that many for another?

You send out two trucks to the busy route and wait a day or two until you have enough packages for the non-busy route.

But… (again) these packages are now taking up storage space, and you still can't ship them because the trucks are not full.

Now imagine you have more and more areas. You might reach a scenario where most of your storage space is occupied by packages you just can't ship, leaving very little space for new packages.

So you are forced to ship half-empty trucks to the busy area just to handle the incoming load!

Sounds weird right?

Wouldn't an intelligent dispatcher just send some half-empty trucks to the non-busy routes and free up the warehouse space?

Of course you would, and this is exactly what we did for proactively flushing memtables 🙂

Why do RocksDB's small flushes impact your application performance?

Why are too many flushes or small flushes a problem? 

Simply put, they create a lot of small files on the filesystem.

But why are small files a problem? It's a complicated question, involving compaction, read/write amplification, bloom filters, and read flow, but we'll try to explain it in a nutshell.

Image credit: Dall-E, by OpenAI 

Firstly, having many files open isn't usually a big deal in most systems, but it's something to consider - especially when it comes to filesystem and kernel limits on open files. In practice, though, this is rarely the limiting factor.

Secondly, creating many small files leads to many compactions, which increases write amplification. You might assume that these compactions will be smaller, but that's not necessarily the case: even if you're compacting smaller files in L0, the overlapping L1 data is the same size, so each compaction still carries a significant read and write cost.

Thirdly, having too many small files can increase read amplification in some cases. If your files have not been compacted yet, reads take a considerable hit because there are more files (and filters, if you use them) to go through before moving on to the next level.
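To make this concrete, here is a minimal sketch of the stock RocksDB options that tie the number of L0 files (each produced by one flush) to compaction and write-stall behavior. The values shown are simply RocksDB's defaults, included for illustration only.

```cpp
#include <rocksdb/options.h>

// Illustrative only: the stock RocksDB knobs that connect the number of small
// L0 files (one per flush) to compaction work and write stalls.
rocksdb::Options MakeIllustrativeOptions() {
  rocksdb::Options options;

  // Once this many L0 files accumulate, an L0->L1 compaction is scheduled.
  // Many tiny flushes reach this threshold quickly, so compactions fire often,
  // and each one still has to rewrite the overlapping L1 data (write amp).
  options.level0_file_num_compaction_trigger = 4;

  // If compaction cannot keep up with the stream of L0 files, writes are first
  // slowed down...
  options.level0_slowdown_writes_trigger = 20;
  // ...and eventually stopped entirely.
  options.level0_stop_writes_trigger = 36;

  // Every un-compacted L0 file is another file (and filter) a read may have to
  // check, which is where the extra read amplification comes from.
  return options;
}
```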

Speedb’s proactive flushes - dynamic performance stabilization

After such a long preface, let's get to Speedb's new proactive flushes feature and the technical details behind it.

In Speedb / RocksDB, each column family (CF) in each database (comparable to the areas in our example) has its own memtable (the truck).

Incoming data (the packages) fills up the memtable, which sits in memory (the warehouse).

Flushes (out for delivery) are initiated if the memtable is full, or when the write buffer manager (dispatcher) triggers them.

With every new write, the CF asks the write buffer manager (WBM): should I flush?

If the answer is yes, only the specific database where the CF lives will initiate a flush.
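Here is a minimal sketch of that setup in stock RocksDB: a single write buffer manager shared by two databases, capping their total memtable memory. The database paths and the 1 GB budget are illustrative assumptions.

```cpp
#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/write_buffer_manager.h>

#include <memory>

int main() {
  // One write buffer manager (the "dispatcher") shared by every database,
  // capping the total memory used by all memtables at 1 GB.
  auto wbm = std::make_shared<rocksdb::WriteBufferManager>(1ull << 30);

  rocksdb::Options options;
  options.create_if_missing = true;
  options.write_buffer_manager = wbm;  // both DBs report memtable usage here

  rocksdb::DB* busy_db = nullptr;
  rocksdb::DB* quiet_db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/busy_db", &busy_db);
  if (s.ok()) s = rocksdb::DB::Open(options, "/tmp/quiet_db", &quiet_db);
  if (!s.ok()) return 1;

  // Writes to busy_db fill its memtables quickly; writes to quiet_db trickle
  // in. In stock RocksDB, when the WBM hits its limit, only the database that
  // received the write is asked to flush - the quiet database's half-full
  // memtables stay in memory.

  delete busy_db;
  delete quiet_db;
  return 0;
}
```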

So in an environment where data is not evenly distributed between databases, half-full memtables on a non-busy database may never be flushed - and that is a real problem.

Memory is held for no good reason, and even worse, the busy database keeps initiating small flushes because the total memory usage keeps forcing it to flush.

As with the dispatcher example, we have introduced a new concept of proactive flushes.

The proactive flushes feature creates a more aware, global WBM that initiates flushes across all databases based on heuristics - mainly, where there is old data and a valuable (large enough) amount of data to flush.

In other words, we taught the dispatcher to send half-empty trucks to non-busy areas in order to clear out storage space for the busy area to be able to store the packages until the truck is full.
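To illustrate the kind of decision described above, here is a small conceptual sketch - not Speedb's actual implementation - of how a global dispatcher might pick which memtable to flush next: prefer the one holding the oldest data, but only if it is large enough to be worth the flush. The struct, function, and `min_flush_bytes` threshold are hypothetical names used only for this illustration.

```cpp
#include <cstdint>
#include <vector>

// Conceptual sketch only - not Speedb's code. One candidate per memtable that
// a global write buffer manager could choose to flush.
struct FlushCandidate {
  int db_id;
  int cf_id;
  uint64_t oldest_entry_time;  // lower = older data
  size_t memtable_bytes;
};

// Hypothetical helper: pick the candidate with the oldest data whose memtable
// is "valuable" (large enough) to justify a flush. min_flush_bytes guards
// against the tiny, wasteful flushes discussed earlier.
const FlushCandidate* PickNextFlush(
    const std::vector<FlushCandidate>& candidates, size_t min_flush_bytes) {
  const FlushCandidate* best = nullptr;
  for (const auto& c : candidates) {
    if (c.memtable_bytes < min_flush_bytes) continue;  // not worth a flush yet
    if (best == nullptr || c.oldest_entry_time < best->oldest_entry_time) {
      best = &c;  // prefer the memtable holding the oldest data
    }
  }
  return best;  // nullptr means: nothing worth flushing right now
}
```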

The graph below shows how the overwrite performance is dramatically improved compared to RocksDB when the proactive flushes feature is enabled. The system has more memory to handle other flushes, and the flushing mechanism is much more efficient, so it enables the system to handle more operations. Speedb’s proactive flushes improve RocksDB stability and capacity.

Why upgrade your RocksDB to Speedb

Speedb's proactive flushes feature improves on the existing RocksDB flushing mechanism. It lets the application benefit from better performance over time, eliminates stalls, and makes more efficient use of the allocated memory. The feature was introduced in Speedb version 2.2.0 and is enabled by default. Speedb open source is a drop-in replacement for RocksDB, regularly rebased on RocksDB's latest versions. By upgrading to Speedb OSS, you get RocksDB's capabilities plus additional features for improved performance stability, enhanced resource utilization, and better operational usability.

Check it out and let us know if it is beneficial in your environment! If you like proactive flushes, star us on GitHub. 

https://github.com/speedb-io/speedb/releases/tag/speedb%2Fv2.2.0 
