This article is Part 2 of an ongoing series about "The Zen of System Performance." Part 1 is about the importance of viewing system performance from two separate but deeply interconnected perspectives. View all parts here. Baron Schwartz's presentation from MongoDB World 2017 also explores these topics — a video recording is available here.
If you work on an app’s backend, you’re probably used to looking at server-side metrics like utilization and backlog. Alas, this can give you major tunnel vision, because it can easily cause you to ignore the user’s experience. Why does this empathy matter?
Imagine you aren’t a developer for a second. You need to buy running shoes, and you’ve heard of "Pumped-Up Kicks," a hip, new online shoe retailer. You decide to check them out, but everything is sluggish. Super frustrating—and you don’t care why! You delete the app after a few failed attempts to find what you want.
A user who abandons an app after a few hiccups is actually typical. Users only care about how well the system performs for them. This realization is the key that unlocks the ideal way to think about performance server-side, too.
User-perceived performance has been repeatedly shown to have a direct impact on businesses' revenue. One cited example is a 1% revenue increase per 100 ms of latency improvement; another is a loss of 10% of users for every second a site takes to load.
A user cares about system performance for the most obvious reason: they want to use the system. While that might sound painfully simple, users' concerns are often overlooked. These oversights are most serious when they're made by service owners themselves. Service owners know that users are part of their systems, but do they understand users' true experiences? Do they consider what each individual user wants and reacts to?
In a previous article, we talked about how there are two main ways to look at system performance. Many service owners only see a system from one point of view: the server side. By missing half the picture, they miss half of what makes system performance valuable.
The Users' Concerns
In a way, users are very selfish. This isn't a moral indictment! It's a reflection of the fact that most users are unaware of each other. Even if there are hundreds of thousands of users sending requests to the same system at the same time, the system usually doesn't reveal that fact.
This isn't to say that users don't affect each other; they do, invariably. The average user is just unaware of how their own load on the system affects their fellow users, and vice versa. Users' experience of the system is limited by nature. Companies design it that way. When you're on Pumped-Up Kicks, you don't want to worry about all the other users' queries sent to the servers, or how those queries are affecting any of your own queries in the queue.
As a result, the entirety of a user's concern for performance can be boiled down to a question: How well does the system handle my request? And this can be divided even further into two sub-questions:
- Is my request handled quickly? ("I want my answer, and I want my answer fast.")
- Are all of my requests handled this way, consistently? (It only takes one negative experience to shape a user's opinion of the system they're using—even 1 slow request out of 100 can shape their appraisal. In other words, users don't care about average experience. Outliers can have an outsized impact.)
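To see why a small share of slow requests matters so much, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that each request is independently slow with the same probability; the function name and the session sizes are hypothetical, not taken from any real system.

```python
def chance_of_slow_request(slow_fraction: float, requests_per_session: int) -> float:
    """Probability that a session contains at least one slow request.

    Assumes each request is slow independently with probability
    `slow_fraction` (an illustrative simplification).
    """
    return 1.0 - (1.0 - slow_fraction) ** requests_per_session


# Hypothetical scenario: 1 request in 100 is slow.
for n in (1, 10, 50, 100):
    print(f"{n:>3} requests per session: {chance_of_slow_request(0.01, n):.1%}")
```

Under these assumptions, a user who sends 100 requests has roughly a 63% chance of hitting at least one slow one, even though 99% of individual requests are fast.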
In the eyes of users, latency—consistent latency—is the principal definition of performance, measured in seconds/request. You can also think about this as "response time" or "residence time," but in the end it's the same. Users, from their perspective outside the black box that is the "mysterious" system, just want work to be done quickly and predictably—and that's how they measure it. Who can blame them?
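The gap between average latency and the latency some users actually experience can be illustrated with a short sketch. The sample data below is made up (98 fast requests and 2 slow ones), and `percentile` is a hypothetical helper using the simple nearest-rank method; real monitoring tools may compute percentiles differently.

```python
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100.

    Returns the smallest sample such that at least pct% of all
    samples are less than or equal to it.
    """
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical latencies in seconds/request: 98 fast, 2 slow.
latencies = [0.1] * 98 + [5.0] * 2

mean_latency = statistics.mean(latencies)
median_latency = percentile(latencies, 50)
p99_latency = percentile(latencies, 99)

print(f"mean:   {mean_latency:.3f} s")
print(f"median: {median_latency:.3f} s")
print(f"p99:    {p99_latency:.3f} s")
```

With this made-up data, the mean is about 0.2 seconds and looks healthy, while the 99th percentile is 5 seconds. The average hides exactly the requests that shape a user's opinion.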
The Challenge of the Users' Perspective
The users' perspective is unique. Unlike many aspects of system performance, the challenge presented by users is not one of monitoring or measurement. (The users' key metric is simple: latency. Or, in a layperson's parlance, "Is this thing working?") The real challenge is whether service owners can understand users and their focused, individual experiences, through empathy and perspective. Service owners must ask themselves, "What can I not see with my usual metrics and measurements? What is the experience of somebody on the other side of the system?"
Systems are built to serve users, after all. On the server side, it's easy to focus only on server performance and related metrics, but it's important to keep a much bigger picture in mind. Users aren't just a piece of the system, or an obstacle for it, but an entire perspective for understanding it.
If you want to learn more about how a performance-driven mindset can influence your approach to system monitoring, watch a replay of our joint webinar with Datadog, "5 Tips for Determining the Most Impactful Metrics in Your App."