How To Protect Your Website From Another Google Outage

A Google outage earlier this month caused site performance problems to ripple through the eCommerce ecosystem. On Sunday, June 2, many websites that rely on Google Cloud infrastructure or embed Google javascript (advertising, analytics, maps) saw slower site performance for several hours. The most devastating impact was felt by Shopify websites, which were down for 4 hours. That is 4 hours of lost sales those Shopify sites will never get back.

What did the Google outage look like?

Yottaa actively monitors the performance of Google services across the 1500+ eCommerce sites that use our eCommerce acceleration cloud technology. We do this so we can quickly identify performance anomalies (like this one), alert customers, and ensure sequencing rules are working correctly so customers don’t experience an issue.

On June 2, we saw a spike in response times for Google javascript calls starting at 2:57 pm. The impacted Google services were maps, advertising (Doubleclick, Google Ads, etc.), and the Google API. Below you can see the graphical depiction of “last byte” time for several Google services on our network. It shows a 30-60% increase in the amount of time required for Google to return the last byte of content requested from that service.
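To make those percentages concrete, here is a minimal TypeScript sketch of how a change in last byte time could be measured against a baseline. It is an illustration only, not Yottaa's monitoring code: the function names, the use of medians, and the 25% threshold are all assumptions chosen for the example.

```typescript
// Hypothetical helpers for comparing last byte times (in ms) to a baseline.

// Median is used here because it is less sensitive to a few slow outliers.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Percentage change of the current median relative to the baseline median.
function percentChange(baselineMs: number[], currentMs: number[]): number {
  const base = median(baselineMs);
  return ((median(currentMs) - base) / base) * 100;
}

// Flags a change in either direction; thresholdPct is an assumed default.
function isAnomaly(
  baselineMs: number[],
  currentMs: number[],
  thresholdPct: number = 25,
): boolean {
  return Math.abs(percentChange(baselineMs, currentMs)) >= thresholdPct;
}
```

With a baseline median of 200 ms and a current median of 300 ms, `percentChange` returns 50, which falls inside the 30-60% range described above. Because the check uses the absolute change, it would also flag a service that suddenly responds much faster than usual.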

Figure 1: Last Byte Time (ms) – Google advertising services

Google ad services like Doubleclick experience slow performance

Figure 2: Last Byte Time (ms) – Google API

Interestingly, we also saw an impact on one of the Google analytics tags (Classic). But it wasn’t what you might expect. Last byte times actually decreased during the outage.

Figure 3: Last Byte Time (ms) – apis.google.com, googleapis.com

Google analytics outage graph

Given that the Classic tag loaded faster during this window than it did the entire week prior, I imagine they had a “kill switch” in place. In other words, rather than letting a severity one problem slow websites when the tag doesn’t respond, Google probably responded immediately with a 500 error (i.e. a “fail fast” strategy).

Although website teams won’t see analytics data for that 4-hour window, their shoppers weren’t affected by the outage. Definitely a preferable outcome.
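The general “fail fast” pattern can be sketched in a few lines of TypeScript. To be clear, this is not Google's implementation, which we can only guess at. The names `serveTag`, `killSwitchOn`, and `renderTag` are hypothetical and exist only to illustrate the idea.

```typescript
interface TagResponse {
  status: number;
  body: string;
}

// Hypothetical request handler for a tag endpoint.
// killSwitchOn: an operator-controlled flag (assumed for this sketch).
// renderTag: the normal, potentially slow, work of building the tag.
function serveTag(killSwitchOn: boolean, renderTag: () => string): TagResponse {
  if (killSwitchOn) {
    // Fail fast: answer immediately with an error and skip the slow path,
    // so the calling page is never left waiting on this service.
    return { status: 500, body: "" };
  }
  return { status: 200, body: renderTag() };
}
```

The trade-off is the one described above: the tag returns nothing while the switch is on, but the page that requested it is not held up.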

How often do 3rd party technology outages impact site performance?

Google reported that the outage was due to a planned configuration change in the Google network. Unfortunately, the change was applied to a larger number of regions than intended. This caused network congestion and delays when too much traffic was rerouted through the remaining servers.

What does this say about your site’s risk exposure from 3rd party vendors? Google has a top-notch network team that was able to minimize the impact for many services (Shopify being the big exception). But delays like this will often last for hours and occur at the worst times. The IBM / Coremetrics issue on Black Friday in 2018 is a recent example. If this can happen to Google, IBM and Amazon all within the last two years, it can happen to any vendor service used on your site.

The typical eCommerce website uses between 30 and 40 third party javascripts to build its pages. A delay in just one javascript can slow down your entire site. eCommerce teams need to go into every day expecting delays, and take the proper precautions to ensure they don’t impact shoppers.
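A rough calculation shows why the number of scripts matters. The sketch below assumes each script is available 99.9% of the time and that scripts fail independently of one another. Both the 99.9% figure and the independence assumption are hypothetical, chosen only for illustration, and do not come from measured data.

```typescript
// Probability that at least one of `scripts` third party scripts is degraded
// at a given moment, assuming each is independently available with
// probability `perScriptAvailability` (a hypothetical input).
function chanceAnyDegraded(scripts: number, perScriptAvailability: number): number {
  return 1 - Math.pow(perScriptAvailability, scripts);
}

// Example with assumed numbers: 35 scripts, each 99.9% available.
const exampleRisk = chanceAnyDegraded(35, 0.999);
console.log((exampleRisk * 100).toFixed(1) + "%");
```

Under those assumptions, a page with 35 third party scripts has roughly a 3.4% chance that at least one of them is degraded at any given moment, about 34 times the risk of a single script.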

How to prevent 3rd party website technology problems

The best way to prevent performance problems from 3rd party services is to assume they could happen at any moment. Given that, here are steps you can take to ensure a similar outage doesn’t bring down your site this year.

1. Have a Network vs Single Provider Backup Plan – Relying on a single cloud provider is a mistake. Make sure your site and all your mission critical vendors take advantage of multiple networks. That way an outage in one won’t disrupt all traffic to your site.

2. Sequence the Loading of 3rd Party Javascript – Structure your web pages so that a slow-loading page element never delays your pages. It’s not enough to just monitor performance and then intervene manually when a problem occurs, because even a 2-minute delay between detection and resolution is 2 minutes of lost sales for your website. For example, Yottaa technology expects every 3rd party javascript to fail every time. It puts rules in place that sequence the loading of content so visitors receive an interactive page before it starts loading 3rd parties in the optimal sequence. If a 3rd party javascript never loads, it never impacts the shopper’s ability to use your site.

3. Monitor Performance at the 3rd Party Javascript Level to Detect Anomalies – Even if a 3rd party javascript error isn’t delaying your pages, it still creates a problem for your shopping experience. You put that 3rd party on your site for a reason, and now it’s delivering an error instead of the value you expected. Monitor performance of each individual 3rd party javascript on your site so you can identify anomalies, SLA violations, and persistent problems before your shoppers see them. You can get a head start on understanding which 3rd parties will most likely impact site performance by downloading the 2018 eCommerce 3rd Party Index.

4. Have an Anomaly Resolution Plan – Make sure you have a clear action plan to resolve performance anomalies when they arise, because even a series of small anomalies, when left unaddressed, can add up to impact your revenue. Over time, strive to detect and resolve smaller and smaller anomalies, so you can deliver a more consistent site experience for all your site visitors.
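The sequencing idea in step 2 can be sketched in a few lines of TypeScript. This is a simplified illustration under stated assumptions, not Yottaa's implementation: `loadInSequence`, `Loader`, `LoadResult`, and the per-script time budget are hypothetical names, and the loader is passed in as a parameter so the logic stays independent of the browser.

```typescript
// Hypothetical loader contract: start loading `url`, then call `done(ok)`.
type Loader = (url: string, done: (ok: boolean) => void) => void;

interface LoadResult {
  url: string;
  ok: boolean;       // false if the script errored or ran out of time
  elapsedMs: number; // measured with the injected clock
}

// Loads scripts one at a time. A script that fails or exceeds budgetMs is
// recorded as not ok and skipped, so it never blocks the scripts after it.
function loadInSequence(
  urls: string[],
  load: Loader,
  budgetMs: number,
  onFinished: (results: LoadResult[]) => void,
  now: () => number = Date.now,
): void {
  const results: LoadResult[] = [];

  const next = (index: number): void => {
    if (index >= urls.length) {
      onFinished(results);
      return;
    }
    const url = urls[index];
    const startedAt = now();
    let settled = false;

    const settle = (ok: boolean): void => {
      if (settled) return; // ignore a late callback after the budget expired
      settled = true;
      clearTimeout(timer);
      results.push({ url, ok, elapsedMs: now() - startedAt });
      next(index + 1);
    };

    // Assume failure once the time budget is spent.
    const timer = setTimeout(() => settle(false), budgetMs);
    load(url, settle);
  };

  next(0);
}
```

In a browser, the loader would typically create a `script` element, mark it `async`, and call `done` from its `onload` and `onerror` handlers, and `loadInSequence` would be started only once the page is interactive (for example, from the window `load` event). The recorded `elapsedMs` values could also feed the per-script monitoring described in step 3.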

All eCommerce sites need to plan for performance anomalies

The Google outage exposes the interdependencies within the eCommerce ecosystem. A problem with one vendor can ripple through and impact sites that don’t even contract with that vendor directly. Make sure you have a plan and technology to manage 3rd party vendor anomalies before they take down your site. Because you don’t want to be that one eCommerce team sifting through performance alerts and log files on a Sunday afternoon while your better prepared competitors sail by you. 
