In an industry as fast-growing and interconnected as database technology, it’s exciting to track how innovations in one platform can have beneficial ripple effects on the systems around it. At VividCortex we frequently find opportunities to improve our monitoring solutions as the database technologies we work with (such as MySQL, Redis, and PostgreSQL) develop and ship new features.
When the platforms we monitor, already powerful and tried and true, gain new features, we work to discover how VividCortex can use those features to make our own technology even better. In the case of Generated Virtual Columns, which MySQL 5.7.8 made indexable, we found the opportunity to use a new feature to make our queries simpler and more elegant, with results that are significantly faster and more space-efficient.
Before we had access to MySQL’s Generated Virtual Columns, we were already using a table that had an ID column for metric IDs. Whenever we read from that table, we were interested in a huge number of those metric IDs. Unfortunately, because those IDs are generated from a hash, reading them wasn’t as simple as selecting a particular range. Instead, we needed to generate a huge list of IDs and put them in IN (...) clauses when we queried.
We developed a way to decrease the number of IDs generated and lighten the load of the process: instead of specifying the raw IDs themselves, often thousands of them, we found a satisfactory solution in selecting IDs that have a certain hash result (specifically, a modulo result). In other words, instead of using
SELECT * FROM table WHERE metric IN (_, _, _, ... [hundreds or thousands more])
we can use
SELECT * FROM table WHERE metric % 100 IN (1, 2, 3, 4)
In this expression, we’re only interested in metric IDs that leave a remainder of 1, 2, 3, or 4 after dividing by 100. This makes our queries much easier to build, but it also means we have an indexing problem.
On one hand, our metric ID is part of our primary key, so specifying the IDs directly in the IN clause would be very fast, as we can look up records directly in the primary key. With our modulo approach, however, MySQL has to scan through and check each and every ID, a slow process that examines far more rows than the query actually needs. This is where Generated Virtual Columns come in.
As of MySQL 5.7.8, users have had the ability to create secondary indexes on generated virtual columns. For us, that means we can add a virtual generated column for our modulo result (metric % 100), which, significantly, uses no space directly; only the index on it is stored. With this power at our fingertips, we updated one of our indexes to use the generated column, something that was previously not possible. We also updated that index to include another column that we needed for our query, so it became a covering index (read about covering indexes in Baron Schwartz’s post about exploiting MySQL index optimizations).
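The change described above can be sketched roughly as follows. This is a minimal illustration, not our actual schema: the table name metric_samples, the columns ts and value, the column name metric_mod, and the index name are all hypothetical, and the sketch assumes InnoDB on MySQL 5.7.8 or later.

```sql
-- Hypothetical table: the real schema is not shown in this post.
CREATE TABLE metric_samples (
  metric BIGINT UNSIGNED NOT NULL,   -- hash-derived metric ID
  ts     INT UNSIGNED    NOT NULL,   -- sample timestamp (assumed column)
  value  DOUBLE          NOT NULL,   -- sample value (assumed column)
  PRIMARY KEY (metric, ts)
) ENGINE=InnoDB;

-- Add a virtual generated column holding the modulo result.
-- VIRTUAL means the value is computed on read and not stored in the row.
ALTER TABLE metric_samples
  ADD COLUMN metric_mod TINYINT UNSIGNED
    GENERATED ALWAYS AS (metric % 100) VIRTUAL;

-- Index the generated column together with the other column the query needs.
-- InnoDB secondary indexes also carry the primary key columns, so this
-- index can cover a query that reads only metric and ts.
ALTER TABLE metric_samples
  ADD INDEX idx_metric_mod_ts (metric_mod, ts);

-- Filter on the generated column instead of the raw expression.
SELECT metric, ts
FROM metric_samples
WHERE metric_mod IN (1, 2, 3, 4)
  AND ts BETWEEN 1440000000 AND 1440003600;

-- Check the plan for the same query.
EXPLAIN
SELECT metric, ts
FROM metric_samples
WHERE metric_mod IN (1, 2, 3, 4)
  AND ts BETWEEN 1440000000 AND 1440003600;
```

If the index is chosen, the EXPLAIN output should name it in the key column, and a covering read is typically reported as "Using index" in the Extra column. The actual plan depends on the data and the optimizer, so it is worth checking on a realistic data set.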
Once we started exploring Generated Virtual Columns, we found it especially helpful to look at the differences in EXPLAIN plans in our Query Details page. Rather than manually experimenting with different queries and exhaustively checking latencies and other details, we had all of that information available and easily accessible in VividCortex.
The final result is that our queries got a lot simpler and more efficient, mainly due to MySQL's new virtual generated column support. MySQL became more flexible and powerful, VividCortex was able to leverage that power, and, as a result, customers who use our product will find a more streamlined solution that makes minimal demands on their systems' space and time. If you’d like to see VividCortex in action on your own systems, be sure to request a demo.