Surprising Behaviors of PostgreSQL Statistics Views

Posted by Baron Schwartz on Sep 16, 2016 1:19:26 PM

PostgreSQL provides a variety of statistics views that you can use to see what’s happening in the database. However, as with any complex system operating under a demanding workload, using these views may cause unwanted performance side effects, if you’re not careful. At the same time, the views have special behaviors that might surprise you if you’re not aware of them. If you’re planning to examine the PostgreSQL statistics views for any purpose, here’s what you should watch for.


How The Views Work

As the excellent documentation explains, a PostgreSQL server process that is asked for statistics creates a stats snapshot, and then “continues to use this snapshot for all statistical views and functions until the end of its current transaction.”

This is done for good reasons: it enables you to query the views several times and see a stable snapshot of their data, so the ground isn’t shifting under you as you work.

But there are at least two consequences to know about.

The Views Don’t Update

If you’re trying to monitor PostgreSQL activity over time and you don’t commit the transaction between queries to the stats views, you won’t see any fresh information. You’ll just continue to see your old snapshot.

In practice, this might take you some time to discover, especially if you happen to accidentally avoid this effect and don’t notice it at first. For example, you wouldn’t notice it if your database connection times out and re-establishes itself between queries. For another example, you might not notice it if you write a long-running monitoring cron job and you test it by hand in unrealistic ways — in that case, you might then deploy it and discover that it never catches any of the real issues you’re seeing.

The Views Cause Ever-Growing Data

Open transaction snapshots cause the server to keep the data that’s necessary for the transaction. That means the old data can’t be purged, and it starts to accumulate over time. As this happens, access to the statistics views will get progressively slower and slower. This can potentially pose a real performance threat to the server.

Here’s a graph of what this looks like in VividCortex:


As you can see, the queries against the statistics views are essentially getting linearly more expensive with time as the (very old) transaction that holds them causes the views to accumulate a lot of unpurged rows.

Solving The Issue

There are two good ways to solve the issue (assuming it is an issue for your specific use case). As the documentation states,

...If you want to see new results with each query, be sure to do the queries outside any transaction block. Alternatively, you can invoke pgstatclear_snapshot(), which will discard the current transaction’s statistics snapshot (if any). The next use of statistical information will cause a new snapshot to be fetched.

In practice, you can accomplish this by a variety of means, such as rolling back your transaction if you don’t have one you want, closing and reopening a connection, etc.

More worrisome, perhaps, is detecting this type of issue in the first place. For this, there’s no substitute for a good query-centric monitoring solution such as VividCortex, which will make the problem visually obvious as you’ve seen above. In addition, VividCortex's proprietary anomaly detection will find that type of activity and create events on it; you can subscribe to those events to be alerted about these and other anomalies in important types of queries. Don’t hesitate to request a free trial if you’d like to try these features firsthand while working with PostgreSQL statistics views.

Recent Posts

Posts by Topic

see all