Analyzing Related Metrics With VividCortex

Posted by Baron Schwartz on Mar 5, 2015 4:43:00 AM

Last week we announced our new query and metric listing and detail pages, which provide deep drilldown into individual queries and metrics for exploration and analysis. Today I want to show you one of the new features on the Metric detail page. We have used it in a variety of scenarios for customers and for our own internal analysis. We’ll cover some of the use cases and success stories in future blog posts; here I’ll just give an overview of the capability.

To begin with, suppose you found a metric with an interesting shape and wanted to know something about it. This is very typical, by the way – you will see a bump or notch in a graph and wonder what it means, and what else bumped, notched, or behaved oddly around the same time. Here’s our metric. In this case, CPU makes a good demo:

Analyzing_Related_Metrics

Look, ma! There’s a bump in that metric! Clicking anywhere on that metric leads to the details page, where we can see it up close and personal.

Analyzing_Related_Metrics_2

At the top, you see a bigger version of the sparkline that led you here, which is great for close inspection (reminder: we offer 1-second detail for all metrics).

Just below it is a “Find Related Metrics” section on the page. This shows all of the metrics that exist during the selected time period: query metrics, MySQL metrics, operating system metrics, hardware metrics, process metrics, and so on.

They’re clustered into groups that have similar shapes. Here we’re using the default, 40 clusters. You can change that number to tune the results: more clusters give smaller, more tightly matched groups, and fewer clusters give broader ones. We use K-Means clustering. It’s performed client-side in JavaScript and is fast and efficient, so you get an interactive, instant experience. (Many thanks to math, dataviz, and graphics genius Michael Holroyd for helping us select the algorithm.)
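If you want a feel for how clustering by shape works, here is a minimal sketch in Python. It is not our production code (that runs in JavaScript in the browser), and everything in it is an assumption made for illustration: the function names, the z-score normalization so that scale doesn't matter, and a deterministic farthest-first choice of starting centroids in place of the usual random initialization.

```python
from statistics import mean, pstdev


def normalize(series):
    """Z-score a series so clustering compares shape rather than scale."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)  # a flat line has no shape to compare
    return [(x - mu) / sigma for x in series]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from its nearest centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(series_list, k, iterations=10):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    points = [normalize(s) for s in series_list]
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each series joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans(all_series, 40)` on a list of equal-length series would mirror the default of 40 clusters; each centroid is the average shape of the metrics assigned to it.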

Each cluster of metrics is represented by a folder with the number of metrics it contains, and an average of the metrics inside it. The clusters are sorted by similarity to the metric at the top of the page, so similar-shaped clusters are first. Clicking on one of the clusters reveals the metrics inside it:

Analyzing_Related_Metrics_3

As you can probably guess, the metrics inside this cluster are often CPU metrics, but there’s also network and disk activity that is closely related to the CPU activity. Each of these metrics is also linked in turn to its own details page, so you can browse fluidly from metric to metric.
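The ordering of the folders (most similar shapes first) can be sketched the same way. This is a hedged illustration, not a description of our implementation: it assumes each cluster is summarized by the point-by-point average of its normalized members, and that similarity means a small squared distance to the normalized reference metric. The names `cluster_average` and `rank_clusters` are hypothetical.

```python
from statistics import mean, pstdev


def normalize(series):
    """Z-score a series so comparisons are about shape, not scale."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def cluster_average(members):
    """Point-by-point average of the series in one cluster."""
    return [mean(col) for col in zip(*members)]


def rank_clusters(reference, clusters):
    """Return cluster names ordered from most to least similar.

    `clusters` maps a (hypothetical) cluster name to its list of series.
    """
    ref = normalize(reference)

    def distance(name):
        avg = cluster_average([normalize(s) for s in clusters[name]])
        return sum((a - b) ** 2 for a, b in zip(ref, avg))

    return sorted(clusters, key=distance)
```

With a bump-shaped reference, clusters of bumps sort to the front, flat clusters land in the middle, and notch-shaped clusters end up at the far end of the list.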

But similar-shaped metrics aren’t the whole story. Sometimes you don’t want to know what goes bump together in the night. You want to know what huddles down in fright when something else bumps. That’s easy to find, too. Scrolling down a bit, you’ll find clusters like this in the navbar:

Analyzing_Related_Metrics_4

Clicking on one of those reveals what has notches. There are various metrics with notches of various sizes, but interestingly, if you scroll down you’ll find that a number of them are query throughput metrics:

Analyzing_Related_Metrics5

So clearly, even though there is still idle CPU on this box, there is some contention for resources that is negatively impacting some queries!
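If you wanted to hunt for this kind of inverse relationship in your own data outside the app, one simple approach is to look for strongly negative correlation with the metric you care about. To be clear, this is a swapped-in technique for illustration (Pearson correlation), not the clustering the page performs, and the metric names and the -0.8 threshold below are made up.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat series has no correlation to speak of
    return cov / (var_a * var_b) ** 0.5


def inverse_metrics(reference, metrics, threshold=-0.8):
    """Return (name, r) pairs that move opposite to `reference`,
    strongest inverse relationship first."""
    scored = [(name, pearson(reference, s)) for name, s in metrics.items()]
    hits = [(name, r) for name, r in scored if r <= threshold]
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical data: a CPU bump and three other metrics over the same window.
cpu = [10, 10, 60, 60, 10, 10]
metrics = {
    "query_a.throughput": [100, 100, 40, 40, 100, 100],  # notches as CPU bumps
    "net.bytes": [1, 1, 6, 6, 1, 1],                     # bumps with CPU
    "disk.free": [5, 5, 5, 5, 5, 5],                     # flat
}
notched = inverse_metrics(cpu, metrics)
```

Here only the throughput metric comes back, because it drops exactly when CPU rises. Correlation alone doesn't prove contention, of course; it just tells you where to look next.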

I could continue showing you lots of screenshots, and all the other interesting shapes that occur around a CPU bump like this (there’s always more than just 1 or 2 patterns), but you should really experience this yourself. It’s so much better when you see it in action. You can browse all the metrics on your system, organized logically in groups that impart structure to the system’s activity. This helps you discover things you’d never have found otherwise, because really, who’s got time to look through 50,000 metrics per server? If you’re like most people you’ll say “my metrics are doing WHAT???” when you see this interface.

Analyzing_Related_Metrics_6

This is good for more than just impressing your friends. It helps you analyze problems faster than ever. It helps you form hypotheses and drill down to just the data you need to confirm or reject them. In other words, I believe VividCortex is one of the few (only?) products on the market that helps you discover the meaning of the metrics.

Now it’s your turn to discover something you’d never have known about your systems.

We love hearing from customers, and we would really appreciate your suggestions and feedback. Use the in-app messaging system to send us your comments. Happy clustering!
