Posted by Baron Schwartz on Apr 14, 2014 2:21:00 AM

Customers love our Top Queries feature, which lets them rank queries by a metric such as overall execution time or count. This is a great way to examine entire families of similar queries together. We group queries by digesting out the literals, normalizing whitespace, and so forth.
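To make the grouping idea concrete, here is a minimal sketch of digesting in Python. It is not our production code: the `digest` and `top_queries` names, the regular expressions, and the sample data are all assumptions made for illustration, and a real digester has to handle comments, `IN (...)` lists, quoted identifiers, and much more.

```python
import re
from collections import defaultdict

def digest(sql):
    """Reduce a SQL statement to a rough 'family' signature (illustrative only)."""
    s = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)   # single-quoted string literals -> ?
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)     # standalone numeric literals -> ?
    s = re.sub(r"\s+", " ", s).strip()           # normalize whitespace
    return s.lower()                             # simplification: ignores case-sensitive identifiers

def top_queries(samples):
    """Group (sql, seconds) samples by digest and rank by total execution time."""
    totals = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    for sql, seconds in samples:
        entry = totals[digest(sql)]
        entry["count"] += 1
        entry["total_time"] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1]["total_time"], reverse=True)

# Hypothetical samples, invented for this example.
samples = [
    ("SELECT * FROM t WHERE id = 1", 0.5),
    ("SELECT * FROM t\n  WHERE id = 2", 0.25),
    ("SELECT 1", 0.1),
]
ranked = top_queries(samples)
for signature, stats in ranked:
    print(signature, stats)
```

The first two statements differ only in a literal and in whitespace, so they collapse into one family and are ranked together. Sorting on `count` instead of `total_time` would rank by execution count.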

Here is a view of queries on some of our primary database servers over the last 4 days. What do you notice? I notice a strange pattern on queries 5 and 6.


Is that query getting slower each day until it resets? Or is its response time consistent while its execution count varies? We can click on the query to highlight it. When we do, the right-hand information pane fills with details about it.


Now we can see at a glance that the query is being executed at a consistent rate, but its latency and total time seem to increase and reset. We can also see how this query’s workload is distributed across hosts, and copy its full digest.
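The reasoning here is simple arithmetic: total time in an interval is the execution count multiplied by the average latency, so if the count is flat, any sawtooth in total time must come from latency. A small sketch in Python shows the check. The interval numbers and the helper names `average_latencies` and `relative_spread` are hypothetical, made up for this example rather than taken from the screenshots.

```python
def average_latencies(intervals):
    """intervals: list of (execution_count, total_seconds) per time bucket."""
    return [total / count if count else 0.0 for count, total in intervals]

def relative_spread(values):
    """(max - min) / max: 0.0 means perfectly flat, near 1.0 means large swings."""
    hi = max(values)
    return (hi - min(values)) / hi if hi else 0.0

# Hypothetical buckets: a steady execution count, with total time that grows and then resets.
intervals = [(100, 25.0), (100, 50.0), (100, 100.0), (100, 25.0)]

counts = [count for count, _ in intervals]
latencies = average_latencies(intervals)

count_spread = relative_spread(counts)
latency_spread = relative_spread(latencies)

print("count spread:", count_spread)
print("latency spread:", latency_spread)
```

A count spread near zero alongside a large latency spread points to the query itself getting slower, rather than to it being run more often.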

To get exact samples of this query’s SQL, we can click on the Samples link, which drills down into that query and shows its samples in a scatterplot.


The top of the screenshot shows the row we drilled into. When we “zoom in” in our interface, we like to bring some context along as well, so you can see where you came from as well as the detailed drill-down. Underneath that, we show individual samples of the query, plotted on a timeline, with the vertical axis in seconds of latency (execution time; response time). You can click on any of those samples to see the precise SQL below.

We’re working on expanding this view. Soon we’ll have EXPLAIN plans, profiles, notifications, errors extracted from the server’s response to the client, and more. Once that is finished, you’ll be able to see why this query has such a strange pattern: it is a full partition scan, and the partition fills up as the day passes, so each scan has more rows to read until a new partition starts.

