From inception as computer file systems to their evolution into non-relational columnar data stores in the cloud, databases have undergone several dramatic changes over the years. Their usage as well as their purpose has evolved, but their primary function has always followed their original conception: the storage and categorization of data. On the same note, the underlying components of the database that handle the reading and writing of data to the physical storage have largely remained the same over the years.
While following tradition and the well-trodden roads of the past doesn’t necessarily lead to disaster, it doesn’t foster innovation either. For example, do you know why the modern railroad gauge (the distance between rails) is 4 feet, 8.5 inches? As the popular story goes, this oddly specific number comes from the English, who built their railroads to the same wheel spacing used on English wagons. Those wagons, in turn, had the same wheel spacing as older Roman chariots to avoid damage on the existing roads. While all roads do lead to Rome, building upon existing designs sometimes means following an outdated structure that was never designed for the modern era.
The Crisis of Data Growth
So how does this apply to storage engines? Well, modern storage engines are built on those same principles from years ago. While the technology might have changed, a storage engine is still an embedded key-value store (KVS) that sorts and indexes data. It can be installed either between the application and the storage or, increasingly, as a software layer within the application that operates on live data while it is in transit. And, much like the railroads, following the established structures is more about familiarity than innovation.
That isn’t to say that familiarity is bad. Many of today’s NoSQL databases are built on RocksDB, an open-source log-structured merge (LSM) tree-based KVS created at Facebook in 2012. But, in the world of technology, nine years is a long time to rely on a system without change.
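To make the idea of an embedded key-value store that sorts and indexes data more concrete, here is a minimal Python sketch. It is a toy illustration only, not how RocksDB or Speedb is implemented; the class name ToyKVS and its methods are hypothetical. "Embedded" here simply means the store runs inside the application process as a library rather than as a separate server.

```python
import bisect


class ToyKVS:
    """Toy in-process key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order: the "index"
        self._values = {}  # key -> latest value

    def put(self, key, value):
        # Insert new keys at their sorted position; overwrite existing ones.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key, default=None):
        # Point lookup by key.
        return self._values.get(key, default)

    def scan(self, start, end):
        # Range scan over start <= key < end, enabled by the sorted index.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]
```

Keeping the keys sorted is what makes range scans cheap, which is one reason real storage engines invest so much in how data is ordered on disk.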
Now, metadata threatens to overwhelm our databases. Driven by the continuous increase in the size and number of objects arriving from a variety of sources, metadata sprawl is expected to accelerate. Traditional LSM tree-based data engines cannot keep up much longer. Because these engines suffer from high write amplification, developers often have to trade off capacity, scale, and performance against one another when trying to meet hyperscale data processing and management needs. Consequently, they spend their time tweaking databases, sharding, and handling other time-consuming operations instead of innovating.
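Write amplification is the ratio of data physically written to storage to the data the application asked to write. The following Python sketch shows where it comes from in a heavily simplified LSM-style model: an in-memory buffer is flushed to a sorted run, and runs are periodically merged and rewritten. The class ToyLSM, its parameters, and its "merge everything" compaction policy are assumptions made for illustration; real engines such as RocksDB use far more sophisticated strategies.

```python
class ToyLSM:
    """Toy LSM-style store that counts writes (hypothetical, illustration only)."""

    def __init__(self, memtable_limit=4, max_runs=2):
        self.memtable = {}   # in-memory write buffer
        self.runs = []       # sorted runs "on disk", oldest first
        self.memtable_limit = memtable_limit
        self.max_runs = max_runs
        self.user_writes = 0  # entries the application wrote
        self.disk_writes = 0  # entries written to "disk" (flush + compaction)

    def put(self, key, value):
        self.memtable[key] = value
        self.user_writes += 1
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the buffer out as one sorted run.
        run = sorted(self.memtable.items())
        self.disk_writes += len(run)
        self.runs.append(run)
        self.memtable = {}
        if len(self.runs) > self.max_runs:
            self._compact()

    def _compact(self):
        # Merge all runs into one; newer runs overwrite older values.
        merged = {}
        for run in self.runs:
            merged.update(run)
        new_run = sorted(merged.items())
        self.disk_writes += len(new_run)  # previously written data is rewritten
        self.runs = [new_run]

    def write_amplification(self):
        return self.disk_writes / self.user_writes
```

In this model every compaction rewrites entries that were already written once, so the ratio climbs above 1 as more data arrives. That repeated rewriting is the cost developers end up trading against capacity and performance.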
A New Approach
It’s time for businesses to change how they approach database management. A new solution was needed, one built not on the designs of the past but for our modern problems. Today we are excited to announce the launch of Speedb, the next-generation storage engine. With its new architecture, businesses will no longer be held captive by their ever-growing store of metadata.
To do this, we completely redesigned the most basic components of the KVS from the ground up. As a result, we can eliminate issues with user latency, indexing, storage, and more.
Speedb was created to handle the problems of today and tomorrow without sacrificing quality or performance. We believe we have created the ultimate storage engine, one not held back by the designs of the past and built with our new cloud-centric future in mind. While databases are here to stay, the way we collect, process, and analyze our data needs to continue to evolve. Speedb is designed to scale and adapt to any challenge, and we look forward to seeing it in action around the world.
While our applications may have evolved, modern, data-intensive workloads are still limited by the infrastructure of the past. To return to our railroad example, believe it or not, even the Space Shuttle was limited by those Roman chariots. To move its booster rockets from the factory in Utah to the launch pad, they had to be shipped by train through a tunnel designed only for the width of the train. Much like how one of the most advanced forms of travel in the world was limited by the infrastructure of the past, modern databases and workloads are also held back by outdated infrastructure. It’s time we change that.