Guide: How to Find Meaningful Data With Google Analytics Site Speed
One day in 2011, Google Analytics users awoke to a new menu item called Site Speed in their GA dashboards. The tools under Site Speed present reports on how long it takes for site pages to load by logging data from a sample of visits, a method known in the performance monitoring industry as “real user monitoring” (RUM). The tool is intended to provide a free option for GA users to understand the experience actual users have on their sites from a variety of browsers, geographies, and devices.
We at Yottaa were excited by the release of Site Speed — it was validating to know that our passion, optimizing user engagement (in part through increased speed and performance), was a priority at such an influential web company. In the effort to make the entire web a better experience for everyone, the level of attention and energy Google provides is a boon.
Unfortunately, since its inception the Site Speed feature has been a disappointment at best, and at worst has produced misleading data for its users. A few key issues hold Site Speed back, and in two years they have yet to be addressed (though, to their credit, Google reps and evangelists have repeatedly acknowledged these issues). Here are four issues that limit the effectiveness of GA Site Speed, and suggestions for how you can get the most out of the tool.
Four Issues Limiting GA Site Speed Metrics
1. The sampling rate
For each Site Speed report, GA takes samples from a pool of traffic. If a given report includes fewer than 500,000 visits, 100% of traffic can be included in the pool. The number of actual data samples collected, however, defaults to 1% of the pool. That is to say, a report using the maximum pool of visits (500,000) will be based on at most 5,000 RUM samples.
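To make the arithmetic concrete, here is a minimal sketch of how few samples the default rate yields. The function name and the example visit counts are our own illustration, and the 500,000-visit cap and 1% default are taken from the description above rather than from any GA API.

```python
def expected_samples(visits: int,
                     sample_rate_percent: float = 1.0,
                     pool_cap: int = 500_000) -> float:
    """Estimate how many RUM samples a report is built on.

    Assumes the behavior described in this article: the pool holds at
    most `pool_cap` visits, and `sample_rate_percent` of the pool is sampled.
    """
    pool = min(visits, pool_cap)
    return pool * sample_rate_percent / 100.0


# Hypothetical visit counts, chosen only for illustration.
for label, visits in [("maximum pool", 500_000),
                      ("whole site, one month", 100_000),
                      ("single product page, one month", 300)]:
    print(f"{label}: about {expected_samples(visits):.0f} samples")
```

At the default rate, a page that draws a few hundred visits in a month ends up with only a handful of samples, before any segmenting by browser or country.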
For sites that pull in massive traffic, the 1% rate is plenty for a statistically meaningful report, especially on un-segmented data over an increment like a day, week or month.
For a smaller site, however, or for the less-trafficked pages on a site, the 1% rate leads to a problematically small sample, even when the report is broad in scope. See the screenshot below.
This site receives over 100,000 unique visits/month, but this report on a randomly selected product detail page reveals a single RUM sample collected in the course of a month.
This sample size problem worsens when a user looks to drill into a report by segmenting the data, for example by country or by browser. If sample sizes are small to begin with, looking at data that’s been further segmented leads to insignificant reports.
The same site’s home page performance on a popular browser (Firefox): on most days 0 samples are collected, rendering the report insignificant.
As a result of these limitations, users looking at Site Speed for the first time are often shocked to see wild inconsistency in their reports. For instance, if a single sample happens to be collected from a user on a mobile device with a spotty 3G connection — resulting in a load time well over 60 seconds (a surprisingly common occurrence in GA Site Speed) — the average performance for that day will be alarmingly poor, despite no evident problems with the site.
For the Chrome browser, the site’s home page shows day-to-day fluctuations in average load time from 2 seconds to 31 seconds.
For most sites, the sample rate may be manageable for longer-term reports, such as plotting load time averages month over month. But one of the great strengths of GA’s reporting features, the ability to drill down into smaller segments, is not useful for novice users of Site Speed. To learn how to add code to overcome the 1% rate, see the next section.
2. GA reports use mean averages, not median
Another issue that limits the effectiveness of the data is the use of means, rather than medians, across all of GA’s Site Speed reports. Within the web performance niche (as in any industry that collects “real world” data, rather than lab testing or synthetic testing) the median is the standard for reporting averages. All web activity is subject to the vagaries of the Internet, including connection failures, browser incompatibility, on-page errors, and more. Even if the sample size is fairly substantial, presenting data as a mean is misleading when extreme outliers are as common as they are in RUM data, because a single very slow sample can pull the mean far away from what a typical visitor experiences.
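A small sketch shows how far one outlier can move the mean while the median barely notices. The load times below are invented for illustration (six ordinary page loads plus one 64-second load of the kind described earlier); they are not taken from any real GA report.

```python
from statistics import mean, median

# Hypothetical load times in seconds for one day of samples:
# six typical loads and one extreme outlier.
load_times = [2.1, 1.8, 2.4, 2.0, 2.2, 1.9, 64.0]

avg = mean(load_times)    # pulled upward by the single 64-second sample
med = median(load_times)  # the middle value of the sorted samples

print(f"mean:   {avg:.1f} s")   # roughly 10.9 s
print(f"median: {med:.1f} s")   # 2.1 s
```

If you export individual samples from GA, computing the median yourself in this way gives a figure much closer to the experience of a typical visitor.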
3. Geographic noise
In Site Speed reports, there’s no easy way to segment traffic by geography. This means even if 100% of a site’s business comes from North America, performance samples from Africa, Asia, and Europe may appear. For sites that optimize performance for a specific region using a CDN or strategic server locations, these other samples amount to statistical noise. There is a way around this, by creating separate traffic profiles in your GA account, but doing so involves time and know-how.
Poor performance in other countries factors into this US retailer’s GA Site Speed reports.
4. Mobile noise
Nor is there an easy way to segment mobile traffic and desktop traffic. This is perhaps an outgrowth of Google’s position that good performance ought to be treated as equally important for visitors on any device or platform. That position is admirably forward-thinking, and it’s a smart approach for any site owner to adopt, but in this case it comes at the exclusion of flexibility for site owners to decide.
Say a site receives less than 10% of its traffic from mobile; the GA Site Speed data for that site may, on a given day, be based on a much higher proportion of samples from mobile browsers. In that case the data would hardly be indicative of the “average” visitor to the site. Conversely, if mobile performance is particularly important to a GA user, there’s no guarantee that a meaningful number of mobile samples will be collected on a given day.
Without the ability to segment and direct focus on or away from mobile, users may find they are stuck with very little relevant data.
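To get a feel for how often a small daily sample can over-represent mobile, here is a rough sketch using the binomial distribution. It assumes each sample is independently mobile with a fixed probability equal to the site's mobile share, which is our simplification rather than a description of how GA actually picks samples; the function name and numbers are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent samples are mobile,
    when each sample is mobile with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


mobile_share = 0.10  # assumed: 10% of the site's traffic is mobile

# Chance that the day's only sample is mobile.
print(f"1 sample, all mobile:      {prob_at_least(1, 1, mobile_share):.1%}")
# Chance that at least half of four samples are mobile.
print(f"4 samples, half or more:   {prob_at_least(2, 4, mobile_share):.1%}")
# With many samples, a mobile-heavy day becomes very unlikely.
print(f"100 samples, half or more: {prob_at_least(50, 100, mobile_share):.1e}")
```

Under these assumptions, a day with a single sample has a one-in-ten chance of being entirely mobile, so over a month several daily averages could reflect mobile performance alone.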
How to Get Meaningful Performance Data with GA
Notwithstanding these issues, there’s a lot to like about GA’s Site Speed. Its interface is smooth and easy to use, especially for those accustomed to GA’s other reporting features. The data it collects is “real” data, which is far more useful than the synthetic performance tests used by some performance monitoring solutions. And most impressively, users can easily map site speed metrics against engagement metrics like conversion rate and bounce rate, enabling easy analysis of how performance affects business.
So how do you overcome the issues and get meaningful data from GA?
1.) GA Site Speed should not be the only performance data you collect. Use free testing tools like Websitetest.com and monitoring tools like Yottaa, Pingdom, Keynote, and Gomez to complete the picture. GA Site Speed data can complement this data, but it’s not comprehensive enough to provide actionable performance reports.
2.) Unless your site receives tens of thousands of visits per day, you will absolutely want to raise the default sampling rate from 1%. This official page documents the code snippet that must be added to overcome the default rate. This blog post by Paul Mestemaker includes a rule of thumb for setting the sampling rate based on traffic, as well as other helpful information on gaining meaningful insights from GA.
3.) Go beyond the averages. In some views of the Site Speed reports you can drill down to see the individual data samples, and there use your own heuristic process to glean meaning from the samples. It’s hardly the most scalable solution, but it’s better than relying on mean averages if you’re creating an actionable report on site performance.
4.) If you have the time and the know-how, create traffic profiles for geography and device so that you can quickly view Site Speed data without noisy data from other traffic segments. You could, for example, create a profile including the countries that comprise 99% of your business, excluding all others. Similarly you could create a profile that comprises all known desktop browsers, excluding all mobile browsers, and vice versa.
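For step 2, one way to pick a sampling rate is to work backward from the number of samples you want per day. The sketch below is our own back-of-the-envelope helper, not the rule of thumb from the blog post mentioned above, and the target of 100 samples per day is an arbitrary assumption. The resulting percentage is the kind of value you would pass to GA's sample rate setting (in the classic ga.js tracker this is, to our knowledge, `_setSiteSpeedSampleRate`; check the official page for your tracker version).

```python
import math


def suggested_sample_rate(visits_per_day: int,
                          target_samples_per_day: int = 100) -> int:
    """Return a whole-number sampling percentage (1 to 100) that would
    yield roughly `target_samples_per_day` samples.

    Hypothetical helper: the default target is an assumption, not a GA
    recommendation.
    """
    if visits_per_day <= 0:
        raise ValueError("visits_per_day must be positive")
    rate = math.ceil(100 * target_samples_per_day / visits_per_day)
    # Never go below GA's 1% default or above sampling every visit.
    return max(1, min(100, rate))


for visits in (50, 3_000, 1_000_000):
    print(f"{visits:>9} visits/day -> sample {suggested_sample_rate(visits)}%")
```

A site with 3,000 visits per day would need about a 4% rate to reach that target, while a very small site would need to sample every visit and would still fall short.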
That Google is focused on helping site owners improve performance is nothing but good for web users. Continuing updates like the Speed Suggestions feature released this week mean performance will play an increasing role in Google’s online initiatives. However, until a few improvements are made to Google Analytics Site Speed, it will remain a supplementary tool to more robust monitoring solutions.