Etsy Keeps Their Global Community Engaged With A High-Performance Data Platform
Etsy is a marketplace where people around the world gather to make, sell, and buy unique goods. Etsy’s emphasis on creating a more meaningful economy by connecting independent vendors and shoppers has grown it into one of the Internet’s busiest ecommerce platforms.
As part of their DevOps culture, Etsy’s developers deploy code frequently—often dozens of times every day. In this type of environment, perhaps the most difficult systems to observe and optimize are the databases. Keeping the platform available, reliable, and fast means the databases must be high-performance around the clock, even under load and while teams deploy platform changes.
Etsy’s commitment to system performance and observability is well known in the industry. They created and open-sourced StatsD, one of the most popular components in the typical open-source monitoring stack. They share their quarterly performance reports on their blog for the world to see, analyze, and learn from. Their engineers and leaders have published books on performance. It is no exaggeration to say that Etsy’s team members are thought leaders in performance. They’ve shaped how engineers of all types think about system performance in their daily work.
Etsy's DBAs rely on VividCortex to accomplish projects like version upgrades, schema changes, and evaluations of new platforms more effectively. For example, when Etsy evaluated a hardware change—switching from traditional spinning-disk storage to SSDs—the DBAs used VividCortex to A/B test the change and analyze exactly how the system performance would change with the new hardware. Similarly, when Etsy upgraded their databases to a new version of MySQL, VividCortex helped them analyze and confirm whether and how performance would change—including finding any buried needle-in-the-haystack performance regressions in particular types of queries.
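To make the idea of a per-query A/B comparison concrete, here is a minimal sketch of how one might flag regressions between a baseline group of servers and a candidate group (new hardware or a new MySQL version). It is not VividCortex's method or Etsy's tooling: the function name `find_regressions`, the query-class labels, the 1.2x threshold, and all latency numbers are hypothetical, chosen only for illustration.

```python
from statistics import median


def find_regressions(baseline, candidate, threshold=1.2):
    """Return query classes whose median latency grew by more than `threshold`x.

    baseline, candidate: dicts mapping a query-class label (hypothetical)
    to a list of observed latencies in milliseconds.
    """
    regressions = {}
    for query_class, before in baseline.items():
        after = candidate.get(query_class)
        if not before or not after:
            continue  # class not observed on both sides; cannot compare
        before_median = median(before)
        after_median = median(after)
        if before_median > 0 and after_median / before_median > threshold:
            regressions[query_class] = (before_median, after_median)
    return regressions


# Made-up illustrative numbers, not Etsy measurements.
baseline = {
    "select_listing_by_id": [1.0, 1.2, 1.1, 0.9],
    "select_shop_orders": [4.0, 4.4, 3.8, 4.2],
}
candidate = {
    "select_listing_by_id": [0.6, 0.7, 0.5, 0.6],
    "select_shop_orders": [6.1, 5.9, 6.4, 6.0],
}

result = find_regressions(baseline, candidate)
for query_class, (before_ms, after_ms) in result.items():
    print(f"{query_class}: median {before_ms:.2f} ms -> {after_ms:.2f} ms")
```

In this toy data the overall picture improves, yet one query class gets slower. That is the needle-in-the-haystack case that aggregate averages would hide and a per-query comparison surfaces.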
And if a problem should appear in the databases, Etsy’s engineering teams need a way to find and diagnose it, fast. Etsy relies on VividCortex’s high-resolution, query-level performance metrics to surface and explain changes or problems hidden within the flurry of activity. And because VividCortex retains query performance metrics for long-term trend analysis, automatically aggregates them into top-level views, and displays them in summary dashboards, Etsy has immediate visibility into what’s going on now within all those databases, plus historical data about how it has changed over time.
With VividCortex, Etsy’s engineering teams equip themselves with high-resolution charts and metrics, intelligent SQL parsing and heuristics, and execution plan analytics. They have access to the complete picture of database performance, including information on errors, warnings, connections per second, commands, replication delay, and highly detailed query latency metrics. Demanding projects such as capacity planning exercises are more streamlined, because VividCortex is already instrumented to provide information on historical activity like disk usage or CPU, giving Jeremy Tinley, Sr. MySQL Ops Engineer, confidence in how future changes will pan out.
“There just aren’t other tools that give you what VividCortex does,” Jeremy told us. He noted that although manual monitoring methods like packet capture, pt-query-digest, and slow query logs are available, they don’t offer the same kind of visibility. Part of the reason is the overhead they impose on the systems being monitored, which used to push companies like Etsy to capture only a portion of their database traffic for analysis. “If we sample fifty percent of our traffic, what are we going to miss?” Jeremy asked rhetorically. “Fifty percent is a significant amount. Being able to instead see everything is a huge boost in confidence.” Many monitoring options might provide partial visibility into the systems that support a community and marketplace. But for Etsy? “I think the difference is that we want all the information,” Jeremy said. “That’s what VividCortex provides.”