Catena: A High-Performance Time-Series Storage Engine

There are plenty of storage engines out there, but none of them seems to offer fast and efficient time-series storage and indexing. Existing options like RRDtool and Whisper aren’t very fast, and fast options like LevelDB aren’t built specifically for time series and can lead to harsh operational issues. Instead of hacking on something like LevelDB to suit his needs, Preetam Jinka, one of the team’s brainiacs, decided to write his own storage engine.

Preetam Jinka covers the unique characteristics of time-series data, time-series indexing, and the basics of log-structured merge (LSM) trees and B-trees. After establishing some basic concepts, he explains how Catena’s design is inspired by many existing systems and why it works much better than the current alternatives.

Designing a storage engine is the easy part. Implementation is much more interesting. He covers how Catena uses advanced concurrency optimizations like lock-free lists, atomics, and precise locking. He also shows some neat tricks that take advantage of CPU caches and prefetchers in subtle ways. Combined, these allow Catena to easily store and index over 800,000 time-series points per second on an average laptop.

This webinar will help you understand the unique challenges of high-velocity time-series data in general, and VividCortex’s somewhat unique workload in particular. You’ll leave with an understanding of why commonly used technologies can’t handle even a fraction of VividCortex’s workload, and what we’re exploring as we investigate alternatives to our MySQL-backed time-series database. 

Submit the form below to access a copy of the webinar.