To a large extent, customer experience is no longer a differentiator for online businesses but rather a basic expectation. Any company that does not provide a nearly flawless experience will feel the impact of customer dissatisfaction on the bottom line. Customers today are intolerant of issues, especially application stalls and long response times. With so many alternatives, users simply switch to someone else’s app and never look back.

Customer experience management is a frustrating task: unexpected problems arise constantly, causing stalls, performance degradation, broken applications, and more. Companies must make huge investments in improving the customer experience across the stack to remain competitive. However, as computing environments become increasingly complex and interwoven, it is ever more likely that an overlooked component will break down and eventually impact the performance of the entire system.

Practically any component in a system can become a bottleneck, from the storage and network layers through the CPU to the application GUI. In most cases, when the root cause is at the upper levels of the stack, the entire system might not be affected dramatically, and the problem can be fixed relatively easily. But when the root cause is buried deep in the system, finding it might not be that simple. At the same time, the deeper the root cause, the greater the impact on the system.

For example, performance hits are commonly related to I/O bottlenecks in the storage engine, also known as the data engine: the deepest and "lowest" part of the software stack, responsible for sorting and indexing data. As such, it can be seen as the weakest link in the system, since any I/O hang originating in this layer may trickle up the stack and cause significant delays. This is where errors start to surface and users begin to abandon the application.

The reason data engine I/O bottlenecks are becoming increasingly common is the ever-growing volume of data handled by modern systems. One of the main drivers of this continuous data explosion is the growth of unstructured data in the form of objects arriving from an increasing number and variety of sources: documents, audio and video files, IoT and sensor data, and more. In particular, the metadata associated with these objects is becoming a major issue, as a rapidly growing number of objects that may be only a few bytes in size can now carry metadata of about the same size, and sometimes even more.

As the volume of metadata continues to grow, the shortcomings of existing data engines become apparent. Currently available data engines are based on architectures that were not designed to support the scale of modern datasets. Most of them use Log-Structured Merge (LSM) tree-based key-value stores (KVS) to manage metadata. In an LSM tree-based KVS, key-value pairs are arranged in Sorted String Tables (SSTables). SSTables are immutable files: they are never updated in place. Instead, when new or updated data arrives, additional SSTables are created. Periodically, multiple SSTables are merged and sorted into a single SSTable in a process called compaction, which allows for faster access and retrieval of data.
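
To make these mechanics concrete, here is a minimal, illustrative sketch of the LSM write path in Python: writes land in an in-memory memtable, full memtables are flushed to immutable sorted SSTables, and compaction merges several SSTables into one. All names and sizes here are hypothetical; real engines add write-ahead logs, bloom filters, multiple levels, and far more.

```python
# Toy LSM tree: memtable -> immutable SSTables -> compaction.
# Illustrative sketch only; not how any production engine is implemented.

class ToyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}               # mutable in-memory buffer
        self.sstables = []               # immutable sorted tables, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted, immutable SSTable.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable first, then SSTables newest-first.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all SSTables into one, keeping the newest value per key.
        merged = {}
        for table in self.sstables:      # oldest first, so newer tables overwrite
            merged.update(table)
        self.sstables = [sorted(merged.items())]

db = ToyLSM()
for i in range(10):
    db.put(f"key{i}", f"value{i}")
db.compact()
print(db.get("key3"))  # value3
```

Note how a lookup may have to scan several SSTables until compaction folds them together; that read and space cost is exactly what compaction trades against extra write I/O.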

The problem with this method is that compaction involves significant I/O overhead: the same data is read and rewritten again and again as SSTables are merged, so each logical write may hit the disk many times, a phenomenon known as write amplification. As a result, an LSM tree-based KVS may experience periodic I/O surges when moving large datasets, resulting in performance bottlenecks.
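
A rough back-of-the-envelope calculation shows how quickly this adds up. In a leveled LSM tree, each key-value pair is typically rewritten about once per level it passes through, and each of those rewrites costs roughly fanout-many bytes of I/O per byte pushed down. The numbers below (a fanout of 10 and 5 levels) are illustrative assumptions, not measurements of any particular engine:

```python
# Back-of-the-envelope write amplification for a leveled LSM tree.
# Assumptions (illustrative): fanout of 10 between levels, 5 levels.

fanout = 10
levels = 5
user_gb_written = 1  # 1 GB of logical user writes

# Classic rough estimate for leveled compaction: WA ~= fanout * levels.
write_amplification = fanout * levels
disk_gb_written = user_gb_written * write_amplification

print(f"Write amplification: ~{write_amplification}x")
print(f"{user_gb_written} GB of user writes -> ~{disk_gb_written} GB of disk I/O")
```

In other words, a single gigabyte of logical writes can translate into tens of gigabytes of physical I/O, and that background traffic competes with foreground reads and writes for the same devices.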

Accordingly, one of the main design objectives of the Speedb data engine was to eliminate I/O hangs. To accomplish that, we revamped the basic components of the KVS. For example, we developed a new compaction method that dramatically reduces write amplification for large-scale LSM trees, and a new flow control mechanism that eliminates spikes in user latency. The result is a data engine that natively supports write-intensive workloads, enabling our customers to achieve new levels of performance and consistency without compromising storage capacity or agility. By addressing this critical component, businesses can ensure that the entire system doesn't grind to a halt every other week, and protect their customer experience.
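
This post does not detail Speedb's flow control algorithm, but the general idea behind write flow control in LSM engines can be sketched generically: instead of letting writes run at full speed and then stalling completely when compaction falls behind, the engine throttles incoming writes gradually in proportion to the backlog. The thresholds and function below are hypothetical, purely for illustration; they do not describe Speedb's actual mechanism.

```python
# Generic write-throttling sketch for an LSM engine (not Speedb's algorithm).
# Idea: scale the allowed write rate down smoothly as compaction backlog grows,
# instead of running at full speed and then stalling hard.

def allowed_write_rate(max_rate_mb_s, backlog_bytes,
                       soft_limit=2 * 2**30, hard_limit=8 * 2**30):
    """Return the permitted write rate given the pending-compaction backlog.

    Below soft_limit: full speed. Between the limits: linear slowdown.
    At or above hard_limit: a trickle instead of a dead stop.
    All limits are illustrative assumptions.
    """
    if backlog_bytes <= soft_limit:
        return max_rate_mb_s
    if backlog_bytes >= hard_limit:
        return max_rate_mb_s * 0.01
    # Linear interpolation between full speed and the trickle rate.
    frac = (backlog_bytes - soft_limit) / (hard_limit - soft_limit)
    return max_rate_mb_s * (1.0 - 0.99 * frac)

for backlog_gb in (1, 3, 5, 7, 9):
    rate = allowed_write_rate(500, backlog_gb * 2**30)
    print(f"backlog {backlog_gb} GB -> allow ~{rate:.0f} MB/s")
```

Smoothing the throttle this way turns a cliff (full speed followed by a dead stop) into a gentle slope, which is what keeps user-visible latency from spiking.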

