Performance Vitals: a unified scoring system to guide performance health and prioritization

TL;DR: This post explains how we measure client performance across products and cross-functional teams at Coinbase.

By Leonardo Zizzamia, Senior Staff Software Engineer

A lot has happened since 2018, when Coinbase's web team was made up of just a handful of engineers. Back then, with a small team working on a single platform, we could rely on pre-existing open-source tools to speed up the development of our product.

In 2022, Coinbase has engineers working across four platforms: React Web, React Native, Server Side Rendering, and Chrome Extension. We had not previously standardized performance across these four platforms, which left us with several issues to address: missing or incomplete data on some platforms, lost efficiency when performance opportunities went unidentified, and inconsistent prioritization across teams.

With this in mind, we created the Performance Vitals system: a high-level scoring system that is transparent, trustworthy, and easy to understand. Summarizing an application's performance into a consistent score creates urgency and directs attention to each performance opportunity.

Google Web Vitals

When it comes to client performance, the Web developer community relies on the Core Web Vitals standard. We have adopted it and actively use it at Coinbase.

Vitals metrics are distinguished by thresholds, which categorize each performance measurement as "good", "needs improvement", or "poor".

Here is an example of where the thresholds could lie for one of the Web Vitals, Time to First Byte.
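As a minimal sketch of how such thresholds turn a single measurement into a rating, the snippet below assumes the Time to First Byte boundaries from Google's public guidance (good up to 800 ms, poor above 1800 ms). The type and function names are illustrative only and are not part of any Coinbase or Perfume.js API.

```typescript
type VitalRating = "good" | "needs improvement" | "poor";

interface VitalThreshold {
  good: number; // values at or below this are rated "good" (ms)
  poor: number; // values above this are rated "poor" (ms)
}

// Assumed TTFB boundaries, taken from Google's public Web Vitals guidance.
const TTFB_THRESHOLD: VitalThreshold = { good: 800, poor: 1800 };

// Map one measurement onto the three-band rating scale.
function rateMeasurement(valueMs: number, threshold: VitalThreshold): VitalRating {
  if (valueMs <= threshold.good) return "good";
  if (valueMs <= threshold.poor) return "needs improvement";
  return "poor";
}

console.log(rateMeasurement(450, TTFB_THRESHOLD)); // "good"
console.log(rateMeasurement(1200, TTFB_THRESHOLD)); // "needs improvement"
console.log(rateMeasurement(2500, TTFB_THRESHOLD)); // "poor"
```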

To classify the overall performance of a client product, Coinbase follows the best practice of using the 85th percentile value of all measurements for a page or screen. In other words, a site is considered to have "good" performance for a metric if at least 85% of its measurements meet the "good" threshold. This bar is 10 percentage points higher than the Google Web Vitals standard, which uses the 75th percentile, and it gives us enough headroom for potential regressions.
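The relationship between the 85th percentile and the share of "good" measurements can be sketched as follows. This is an illustration under stated assumptions, not Coinbase's implementation: the sample values are made up, the 2500 ms boundary is Google's published "good" threshold for Largest Contentful Paint, and the function names are hypothetical.

```typescript
// Share of measurements (0 to 100) that meet the "good" threshold.
function goodScorePercent(valuesMs: number[], goodThresholdMs: number): number {
  if (valuesMs.length === 0) return 0;
  const goodCount = valuesMs.filter((v) => v <= goodThresholdMs).length;
  return (goodCount * 100) / valuesMs.length;
}

// Nearest-rank percentile: the smallest value that covers p% of the samples.
function nearestRankPercentile(valuesMs: number[], p: number): number {
  if (valuesMs.length === 0) throw new Error("no measurements");
  const sorted = valuesMs.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// A page is "good" when at least 85% of its measurements are good.
function isGoodPage(valuesMs: number[], goodThresholdMs: number, requiredShare = 85): boolean {
  return goodScorePercent(valuesMs, goodThresholdMs) >= requiredShare;
}

// Made-up LCP samples in milliseconds; 2500 ms is the assumed "good" boundary.
const lcpSamplesMs = [1200, 1500, 1800, 900, 2100, 2400, 1700, 1300, 2000, 4000];

console.log(goodScorePercent(lcpSamplesMs, 2500)); // 90
console.log(nearestRankPercentile(lcpSamplesMs, 85)); // 2400
console.log(isGoodPage(lcpSamplesMs, 2500)); // true
```

Both views agree here: the 85th percentile value (2400 ms) sits under the threshold, and 90% of the samples are good.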

Perfume.js is our primary tool for capturing these metrics. It wraps the Performance Observer API and allows us to measure all Core Web Vitals. Because we are the main maintainer of this library, we took the opportunity to explore and develop new approaches to attribution and web performance measurement.

Today we are introducing a new custom metric called Navigation Total Blocking Time (NTBT). NTBT measures how long an application may be blocked from processing code during the 2-second window after a user navigates from page A to page B. The metric is the sum of the blocking time of all long tasks in the 2-second window after the method that marks the navigation is invoked.
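The NTBT definition above can be sketched as a small calculation. This is not the Perfume.js implementation: it assumes the same 50 ms long-task budget that the Web's Total Blocking Time metric uses, it attributes each task to the window by its start time (a simplification), and all names are hypothetical.

```typescript
interface LongTask {
  startTime: number; // ms, on the same clock as the navigation mark
  duration: number; // ms
}

const NTBT_WINDOW_MS = 2000; // the 2-second window described in the text
const LONG_TASK_BUDGET_MS = 50; // assumption: borrowed from the Web TBT definition

// Sum the blocking portion (time beyond the budget) of every task
// that starts inside the window following the navigation mark.
function navigationTotalBlockingTime(navigationStartMs: number, tasks: LongTask[]): number {
  const windowEndMs = navigationStartMs + NTBT_WINDOW_MS;
  return tasks
    .filter((t) => t.startTime >= navigationStartMs && t.startTime < windowEndMs)
    .reduce((total, t) => total + Math.max(0, t.duration - LONG_TASK_BUDGET_MS), 0);
}

// Made-up tasks around a navigation marked at t = 1000 ms.
const observedTasks: LongTask[] = [
  { startTime: 900, duration: 300 }, // before the navigation: ignored
  { startTime: 1100, duration: 120 }, // blocks for 70 ms
  { startTime: 1900, duration: 250 }, // blocks for 200 ms
  { startTime: 2000, duration: 40 }, // under the budget: 0 ms
  { startTime: 3500, duration: 400 }, // after the window: ignored
];

console.log(navigationTotalBlockingTime(1000, observedTasks)); // 270
```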

Below is an example of an NTBT performance marker on coinbase.com that helped a client engineer track down a long task and improve responsiveness when navigating between pages.

Perfume.js can also enrich metrics with information from the Navigator APIs, which allows us to distinguish between high-end and low-end experiences.

After adopting and extending Web Vitals, we decided to reuse this knowledge across the rest of our stack.
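As a rough illustration of how Navigator information could separate low-end from high-end experiences, the sketch below classifies a snapshot of values such as `navigator.deviceMemory`, `navigator.hardwareConcurrency`, and the `effectiveType` and `saveData` fields of `navigator.connection`. The cut-off values and the function name are assumptions chosen for the example; they are not taken from Perfume.js or from Coinbase.

```typescript
// Snapshot of values a browser may expose; every field is optional
// because support differs between browsers.
interface NavigatorInfo {
  deviceMemory?: number; // approximate RAM in GB
  hardwareConcurrency?: number; // logical CPU cores
  effectiveType?: string; // e.g. "4g", "3g", "2g", "slow-2g"
  saveData?: boolean; // user asked for reduced data usage
}

// Hypothetical cut-offs, chosen only for illustration.
const LOW_END_MAX_CORES = 4;
const LOW_END_MAX_MEMORY_GB = 4;
const SLOW_CONNECTIONS = ["slow-2g", "2g", "3g"];

function classifyExperience(info: NavigatorInfo): "low-end" | "high-end" {
  const lowEndDevice =
    (info.hardwareConcurrency !== undefined && info.hardwareConcurrency <= LOW_END_MAX_CORES) ||
    (info.deviceMemory !== undefined && info.deviceMemory <= LOW_END_MAX_MEMORY_GB);
  const constrainedNetwork =
    info.saveData === true || SLOW_CONNECTIONS.indexOf(info.effectiveType ?? "") !== -1;
  return lowEndDevice || constrainedNetwork ? "low-end" : "high-end";
}

console.log(classifyExperience({ deviceMemory: 8, hardwareConcurrency: 8, effectiveType: "4g" })); // "high-end"
console.log(classifyExperience({ deviceMemory: 2, hardwareConcurrency: 4, effectiveType: "4g" })); // "low-end"
console.log(classifyExperience({ deviceMemory: 8, hardwareConcurrency: 8, effectiveType: "3g" })); // "low-end"
```

Tagging each measurement with such a label makes it possible to compare scores for the two groups separately.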

Performance Vitals

Beyond web apps, we also build mobile apps and provide data through backend services. Reusing Web Vitals best practices, we created new metrics to support our React Native apps and our backend services. Together they are called "Performance Vitals", and they give us a complete view of the performance scores for all our applications, from downstream ("Browser & Apps") to upstream ("Backend Services").

The chart below shows how the Performance Vitals are split from downstream to upstream.

Creating Native Vitals

To evaluate performance for React Native, we created our first two Vitals: App Render Complete and Navigation Total Blocking Time.

App Render Complete (ARC): Measures the time taken from the start of the application to rendering all content. The Good threshold of 5s was established based on official research by the Android community.
Navigation Total Blocking Time (NTBT): Measures how long the application is blocked from processing code in the 2s window following a user's navigation from screen A to screen B.

To establish a mobile threshold for NTBT, we used the Total Blocking Time guidance from Web Vitals. We assumed that mobile will take longer than the 200ms that counts as a good TBT on Web. Therefore, we doubled the Web standard of 200ms to reach 400ms.

The video below shows how a product engineer can identify long tasks and measure total blocking time while navigating between screens.

This metric is used to detect potential sluggishness in a user interface. Sluggishness is usually caused by long tasks executing on the main thread and blocking rendering, or by background processes using too much processing power.

To measure performance for React Native, Coinbase created an in-house React Native Core Vitals library. The goal is to open source our findings to the community in the next quarter.

Creating Backend Vitals

As we did with Web Vitals and React Native Vitals, we extended the Vitals standard to backend services, such as GraphQL and upstream backend services.

These are the two metrics that we created first:

GraphQL Response Time (GRT): Round-trip time for the GraphQL service to respond to a request.
Upstream Response Time (URT): Round-trip time for the API Gateway to get a response from a backend service.

We considered several factors to determine the Good Score that would represent backend latency.

1. From the user's point of view, the system's response feels instant if it takes less than 1s.
2. Network cost can vary from 50ms to 500ms depending on where a user is located.
3. Based on points 1 and 2, GraphQL latency shouldn't exceed 500ms: 1s minus the worst-case network cost of 500ms. This in turn means that upstream services should respond in less than 300ms, because a GraphQL query must wait for its slowest endpoint.
We concluded that the threshold for a GRT Good Score is 500ms and the threshold for a URT Good Score is 300ms.
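The latency budget above can be written down as a short worked example. The constants come from the text; the data shape, the function names, and the sample values are assumptions made for illustration.

```typescript
const INSTANT_RESPONSE_BUDGET_MS = 1000; // a response under 1s feels instant
const WORST_CASE_NETWORK_MS = 500; // upper end of the 50ms to 500ms network cost

// Whatever remains after the worst-case network cost is the GraphQL budget.
const GRT_GOOD_MS = INSTANT_RESPONSE_BUDGET_MS - WORST_CASE_NETWORK_MS; // 500
const URT_GOOD_MS = 300; // threshold stated in the text

// Hypothetical shape for one logged GraphQL request.
interface BackendRequestSample {
  grtMs: number; // GraphQL round-trip time
  urtMs: number[]; // round-trip time of each upstream call it made
}

// A GraphQL query waits for its slowest endpoint, so that one sets the floor.
function slowestUpstreamMs(sample: BackendRequestSample): number {
  return sample.urtMs.reduce((slowest, v) => Math.max(slowest, v), 0);
}

function isGoodBackendRequest(sample: BackendRequestSample): boolean {
  return sample.grtMs <= GRT_GOOD_MS && slowestUpstreamMs(sample) < URT_GOOD_MS;
}

const fastRequest: BackendRequestSample = { grtMs: 430, urtMs: [120, 280, 90] };
const slowRequest: BackendRequestSample = { grtMs: 480, urtMs: [120, 350] };

console.log(GRT_GOOD_MS); // 500
console.log(slowestUpstreamMs(fastRequest)); // 280
console.log(isGoodBackendRequest(fastRequest)); // true
console.log(isGoodBackendRequest(slowRequest)); // false
```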

For Backend Vitals, we aim for 99.9% of measurements across all logged requests to meet the "Good" threshold.

We will continue to improve our performance and revisit our Good thresholds every year, possibly lowering them over time so that we can further reduce latency for our users.

Three essential components make up the instrumentation for Backend Vitals. First, we use our in-house analytics library to define metadata such as the product, platform, and pages. Next, we propagate this metadata to our APIs. Finally, we co-locate the backend performance metrics with the Web or React Native metadata.

Performance Vitals – discoverability and prioritization

Using the same metric and attribution system across different specialties at Coinbase makes it easy to spot areas of opportunity. It also aligns frontend and backend engineers in their performance efforts.

All Performance Vitals are based on real-time data from production applications. They can be explored through standardized filters, such as product name, platform, page, geo region, GraphQL operation, and backend service.

This level of detail is especially important for real-time anomaly detection. Each team can own the performance metrics of its product surface, which allows it to set up automated monitors that detect performance changes and alert when they occur.

If there is a performance regression, the size of the regression as a percentage is used to determine whether it is critical enough to open an investigation and address the issue immediately.
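A monitor of this kind could compute the regression percentage along the following lines. This is only a sketch: the source does not state which values are compared or where the cut-offs lie, so the 5% and 20% levels, the severity labels, and the function names are all hypothetical.

```typescript
type RegressionSeverity = "none" | "monitor" | "critical";

// Relative change of a latency-style metric (lower is better), in percent.
function regressionPercent(baselineMs: number, currentMs: number): number {
  if (baselineMs <= 0) throw new Error("baseline must be positive");
  return ((currentMs - baselineMs) * 100) / baselineMs;
}

// Hypothetical cut-offs: at or above 20% opens an investigation,
// at or above 5% is only watched.
function triageRegression(percent: number, monitorAt = 5, criticalAt = 20): RegressionSeverity {
  if (percent >= criticalAt) return "critical";
  if (percent >= monitorAt) return "monitor";
  return "none";
}

console.log(regressionPercent(400, 500)); // 25
console.log(triageRegression(regressionPercent(400, 500))); // "critical"
console.log(triageRegression(regressionPercent(400, 420))); // "monitor"
console.log(triageRegression(regressionPercent(400, 380))); // "none"
```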

Annual and Quarterly Planning

Performance Vitals are great for planning KRs. They measure performance on a scale from 0 to 100 and can be stored for up to a year. Using the same language for all performance KRs makes it easier to set shared goals across teams.

Here are some examples of how to frame your KRs:

[Year KR] Reach an NTBT Good Score of 90%, up from 70%, in Coinbase Mobile App.
[Quarter KR] Increase the LCP Good Score from 70% to 85% in Coinbase Web.

Next

Performance Vitals is all about finding a common language, whether that means standardizing filters, setting quarterly KRs, or unifying a scoring system. Whether it is a small group or a big team working on an API regression, speaking the same language helps with all types of product prioritization.

We plan to share more of our learnings in the future, including how to measure and drive impact for Critical User Journeys. We will also share more about the automation and internal processes that enable Coinbase employees to build performant products.

Performance Vitals: a unified scoring system to guide performance health and prioritization was first published on Medium's The Coinbase Blog.

—————————————————————————————————————————————————————————————–
By: Coinbase
Title: Performance Vitals: a unified scoring system to guide performance health and prioritization
Sourced From: blog.coinbase.com/performance-vitals-a-unified-scoring-system-to-guide-performance-health-and-prioritization-127f80268ee4?source=rss—-c114225aeaf7—4
Published Date: Fri, 26 Aug 2022 14:01:35 GMT
