Are Your Google Analytics Site Speed Metrics Accurate?
Last year we published a blog post called How to Cut Through the Noise and Find Meaningful Data With Google Analytics Site Speed. In it we covered several issues that make the standard site speed output in your GA portal noisy, including the low default sampling rate and the use of mean averages rather than medians. These issues don't mean the data is useless, but they do add a few steps to the process of getting meaningful output.
Let’s explore by example.
347 of the 601 samples for the first date range threw zeros, meaning more than half of the samples reported no data for Average Page Load Time. Errors like these are common in RUM data, due to connection timeouts, failed requests and other inconsistencies of testing on the live Internet. But having over half the samples fail will not lead to an accurate reading. What's more, the zeros are not identified as errors, so they are factored into the dashboard averages with the same weight as the real samples.
In the next date range, it's a similar story: 301 of the 601 samples are zeros. For those keeping track at home, that's 648 failed samples out of 1,202 total.
How do we fix this? Easy: delete the zeros and re-run the average calculation. It's not a perfect solution. Samples are supposed to be collected at consistent time intervals, for instance, and deleting half of them throws off that aspect of the testing methodology. But the result will certainly be more accurate than an average weighed down by hundreds of errors.
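The cleanup step is simple enough to sketch in a few lines. Here's a minimal example of the idea; the sample values are hypothetical stand-ins for exported GA page load times, not real data from this site:

```python
# Hypothetical Average Page Load Time samples exported from GA (seconds).
# Zeros are failed samples that the dashboard still counts toward the mean.
samples = [2.1, 0.0, 3.4, 0.0, 2.8, 0.0, 3.1, 2.5, 0.0, 2.9]

# Average the way the dashboard computes it: zeros included.
raw_avg = sum(samples) / len(samples)

# Average with the zero-valued (failed) samples removed.
clean = [s for s in samples if s > 0]
clean_avg = sum(clean) / len(clean)

print(f"with zeros: {raw_avg:.2f}s, without zeros: {clean_avg:.2f}s")
```

With four of ten samples reading zero, the raw mean understates the cleaned mean by well over a third, which is exactly the kind of distortion the real dashboard numbers showed.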
Here are the averages for page load time with all of the zeros deleted:
These averages show that in the more recent period, May 7-31, the site actually had a faster average page load time, not slower. In fact, it's 23% better, so if you factor in the 9.5% decline shown in the GA dashboard, that's a nearly 33% swing gained just by cleaning up the data.
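The swing figure follows directly from the two percentages quoted above:

```python
# The GA dashboard reported a 9.5% slowdown; the cleaned data showed
# a 23% improvement. The "swing" is the gap between the two readings,
# in percentage points.
dashboard_change = -9.5
cleaned_change = 23.0

swing = cleaned_change - dashboard_change
print(f"swing: {swing:.1f} percentage points")  # swing: 32.5 percentage points
```

Strictly speaking the gap is 32.5 percentage points rather than a compounded percentage, hence "nearly 33%."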
So, what have we found?
- In Google Analytics site speed, errors occur frequently, and not consistently enough to make accurate comparisons of different time frames without some scrubbing
- Those errors (predictably) can throw off the averages substantially
- A little data analysis can go a long way. Don't blindly trust your data, especially when the source is not purpose-built. (There are many solutions that focus exclusively on performance monitoring via RUM; for Google Analytics it's a value-add feature.)