8 Monitoring Tips to Keep Your Site Up & Performing Well

Ari Weil on Nov 21, 2012 3:01:00 PM

Holiday traffic is expected to rise dramatically (>20%) this season, and sites large and small seek the competitive edge to capture as much revenue as possible.  So if you're like many WebOps, WebDev and Marketing teams, you might be asking yourself questions like these:

  • Is my site ready for the holiday rush?
  • My site visitors from NYC on IE9 browsers are complaining about site problems, but everything looks fine to my developers…What can I do?
  • What are easy ways to keep track of site issues – user experience problems, third party tags, browser rendering slowness, network latency, server errors?
  • How do the best websites monitor and manage their site performance and availability?
  • What are the steps I can take to make my site fast and reliable, without lots of development time and money?

In this blog post, we summarize 8 best practices to help you be proactive and prevent site problems during the holiday season.  They are also captured in our new eBook (download it here).

Tip #1: Set up monitoring for key pages

Set up monitoring for the pages that matter to your business.  For example:

Web Site & Mobile Apps:

  • Home page
  • Key landing pages (e.g., User registration page)
  • Pages using different technologies (e.g., User forum, knowledge base)
  • Dynamically-generated pages
  • Pages from different origin servers

eCommerce Sites:

  • Product catalog page(s)
  • Product detail page(s)
  • Customer shopping cart page
  • Customer checkout page

SaaS Apps:

  • User dashboard page
  • Key pages behind user login
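
As a starting point, the key pages can live in a small inventory that a script walks through. The sketch below is only an illustration: the URLs are hypothetical placeholders, and the check is a plain HTTP request from the standard library rather than a real browser.

```python
from urllib.request import urlopen
from urllib.error import HTTPError

# Hypothetical URLs: replace with the pages that matter to your business.
KEY_PAGES = {
    "home": "https://www.example.com/",
    "registration": "https://www.example.com/register",
    "cart": "https://www.example.com/cart",
    "checkout": "https://www.example.com/checkout",
}

def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout and similar network errors.
        return None

def check_pages(pages, fetch=fetch_status):
    """Map each page name to the status code returned by fetch."""
    return {name: fetch(url) for name, url in pages.items()}
```

Calling `check_pages(KEY_PAGES)` returns one status per page. Passing a different `fetch` function makes it easy to try the logic without touching the network.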

Tip #2: Monitor performance & set up alerts

For the key pages outlined above, set up 24/7 monitoring and configure email and text (SMS) alerts.  Here are some common issues to watch for, and which teams may care about each:

Configuring issue definitions:

  • Critical Error:  We recommend treating network connectivity failures and page status codes >400 as critical errors (this is the default setting)
  • Error:  We recommend treating a failed HTML content check as an error
  • Warning:  We recommend treating page asset issues, user experience issues and network performance issues as warnings
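
To make the three levels concrete, here is a minimal sketch of how a single monitoring result might be classified. The dictionary keys (`connected`, `status`, `content_ok`, and so on) are assumptions made for this example, not fields from any particular monitoring product.

```python
def classify(check):
    """Map one monitoring result to 'critical', 'error', 'warning' or 'ok'.

    `check` is a hypothetical dict; the key names are assumptions.
    """
    # Critical: no network connectivity, or a page status code above 400.
    if not check["connected"] or check["status"] > 400:
        return "critical"
    # Error: the HTML content check did not pass.
    if not check["content_ok"]:
        return "error"
    # Warning: page asset, user experience or network performance issues.
    if check.get("asset_issue") or check.get("ux_issue") or check.get("network_slow"):
        return "warning"
    return "ok"
```

The checks run from most to least severe, so a page that is both unreachable and slow is reported once, as critical.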

Customizing Alert Escalations to different teams:

  • Senior Management: Consider text messaging for critical errors not resolved after a certain duration
  • Development Team: Consider email alerting for critical errors, errors and maybe warnings
  • Operations Team: Consider text messaging for critical errors and errors
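
One way to picture these escalation rules is as a small routing table. This is a sketch under stated assumptions: the team names, the channel labels and the 30-minute delay for senior management are illustrative choices, since the right duration depends on your organization.

```python
# Hypothetical routing table; the 30-minute delay is an assumed value.
ESCALATION = {
    "operations": {"channel": "sms", "severities": {"critical", "error"}},
    "development": {"channel": "email", "severities": {"critical", "error", "warning"}},
    "senior_management": {"channel": "sms", "severities": {"critical"}, "after_minutes": 30},
}

def recipients(severity, unresolved_minutes):
    """Return sorted (team, channel) pairs to notify for an open issue."""
    notify = []
    for team, rule in ESCALATION.items():
        old_enough = unresolved_minutes >= rule.get("after_minutes", 0)
        if severity in rule["severities"] and old_enough:
            notify.append((team, rule["channel"]))
    return sorted(notify)
```

With this table, a new critical error reaches operations and development right away, and senior management is added only if it is still open after 30 minutes.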

Tip #3: Use real browsers to monitor user experience

Only real browser monitoring provides visibility into your users’ experience.  With Firefox, IE and Chrome browsers from locations across the planet, you can capture your users’ experience and see exactly what your users see.  (Product plug: Yottaa's Site Monitoring service does this!)

Selecting your monitor agent:

  • Use HTTP agent for APIs and HTTP services
  • Use HTTP agent if you are only interested in service availability and network performance
  • Use browser agents if you are interested in front end as well as backend performance and availability, including monitoring third party assets/widgets
  • Pick the most popular browser for your site
  • IE9 is still the most popular browser in general, but your site may differ

Monitoring frequency:

  • For availability monitoring, try 5 min or less
  • For Operations Team, try 5 min or less
  • For Development Team, try 10 min to 30 min
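
The agent and frequency guidance above can be sketched as two tiny helpers. The function and parameter names are made up for this example. The second helper simply shows how many samples a given interval produces per day, which is useful when weighing frequency against cost.

```python
def choose_agent(is_api, need_front_end):
    """Pick 'http' or 'browser' following the selection rules above."""
    # APIs, and sites where only availability and network performance
    # matter, are covered by the lighter HTTP agent.
    if is_api or not need_front_end:
        return "http"
    # Front-end performance and third-party assets need a real browser.
    return "browser"

def checks_per_day(interval_minutes):
    """Number of samples one monitor collects in 24 hours."""
    return (24 * 60) // interval_minutes
```

For instance, `checks_per_day(5)` is 288, while a 30-minute interval yields 48 samples per day.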

Tip #4: Trend user experience

Here are some suggestions for what to measure and what to look for.

Key Metrics:

Though a couple dozen metrics offer insight into user experience and site performance, three key ones are:

  • Time to Interact
  • Time to Start Render
  • Time to Last Byte

Recommended Approach:

  1. Look at the Summary for Average Values
  2. If the average values are far outside the expected range, you may have a problem; in that case, look at the trend data
  3. Trending curves should be as flat as possible (consistent)
  4. If there are sudden spikes, these spikes are typically problems 
  5. Explore them - look at the data samples to see what happened
  6. A consistent, reliable user experience delivers confidence
  7. If the trending curve fluctuates, you may have a problem
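
The "look for spikes" step can be approximated in a few lines. This is a rough sketch, not a definitive method: it flags any sample more than twice the median, and both the factor of 2 and the sample data are assumptions chosen for illustration.

```python
from statistics import median

def find_spikes(samples, factor=2.0):
    """Return (index, value) for samples above factor times the median."""
    # The median is used as the baseline because, unlike the average,
    # it is barely moved by the spikes we are trying to find.
    baseline = median(samples)
    return [(i, v) for i, v in enumerate(samples) if v > factor * baseline]

# Hypothetical Time to Interact readings, in seconds.
time_to_interact = [2.1, 2.3, 2.0, 2.2, 7.9, 2.1, 2.4]
spikes = find_spikes(time_to_interact)  # [(4, 7.9)]
```

Each flagged index points at a data sample worth opening to see what happened.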

Tip #5: See what your users see

In addition to measuring the various user experience metrics, it's invaluable to actually SEE what the users see, using screenshots and videos of actual browser sessions.  (Product plug: Yottaa's Site Monitoring service does this!)

This is key for two reasons:

  1. Viewing the video and screenshot filmstrips gives you a more accurate sense of the actual user experience;
  2. This technique allows you to correlate the visual capture with the user experience & site performance metrics

So check out the filmstrip and consider: would you be happy as a site visitor?  Then play the filmstrip as a video to get a realistic replay of the actual user experience.

To see where in the page loading sequence problems may lurk, it can be particularly helpful to review the waterfall chart (more on that here and here).

Analyzing the Page Load Waterfall:

  • Are there any particularly slow loading assets?
  • Are there any HTTP status code errors?
  • Are there obvious gaps in the waterfall? (These typically mean that heavy client-side JavaScript execution is blocking the entire browser)
  • Stats to Think About: How many requests are there? (Too many requests can mean a slow experience) How many third party assets? (Too many of them can mean an unreliable user experience)
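
If you export the waterfall as data, the questions above can be answered with a short script. The sketch below assumes each request is a dict with `url`, `status`, `start_ms` and `duration_ms` keys. Those names and the two thresholds are assumptions for this example, and treating every other hostname as third party is a simplification that would also count your own subdomains.

```python
from urllib.parse import urlparse

def summarize_waterfall(entries, site_host, slow_ms=1000, gap_ms=500):
    """Summarize one page load from a list of request dicts."""
    ordered = sorted(entries, key=lambda e: e["start_ms"])
    slow = [e["url"] for e in ordered if e["duration_ms"] > slow_ms]
    errors = [e["url"] for e in ordered if e["status"] >= 400]
    third_party = [e["url"] for e in ordered
                   if urlparse(e["url"]).hostname != site_host]
    # A gap is a stretch longer than gap_ms with no request in flight.
    gaps = []
    busy_until = None
    for e in ordered:
        if busy_until is not None and e["start_ms"] - busy_until > gap_ms:
            gaps.append((busy_until, e["start_ms"]))
        end = e["start_ms"] + e["duration_ms"]
        busy_until = end if busy_until is None else max(busy_until, end)
    return {
        "requests": len(ordered),
        "slow": slow,
        "errors": errors,
        "third_party": third_party,
        "gaps": gaps,
    }
```

A long `gaps` list is a hint to look at client-side JavaScript, while large `requests` and `third_party` counts point at page weight and reliability.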

Tip #6: Establish performance & SLA goals

It's been said that, "You can't manage what you can't measure."  So one best practice is to explicitly define your performance goals and/or establish a service level agreement (SLA) with your (internal or external) stakeholders.
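
When setting an uptime goal, it helps to translate the percentage into minutes. The sketch below is plain arithmetic. The function names and the 30-day window are assumptions for illustration, not part of any SLA standard.

```python
def downtime_budget_minutes(uptime_percent, days=30):
    """Minutes of downtime allowed in the window at the given uptime goal."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100

def measured_uptime(results):
    """Percentage of successful checks, given a list of True/False results."""
    return 100 * sum(results) / len(results)

def meets_sla(results, uptime_percent):
    """True if the measured uptime reaches the goal."""
    return measured_uptime(results) >= uptime_percent
```

A 99.9% goal over 30 days leaves about 43 minutes of downtime, so at a 5-minute check interval a single failed check in a day (1 of 288) already puts that day below 99.9%.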

Tip #7: Capture & diagnose site & app issues

As part of your performance management strategy, you'll want to define and monitor the issues that matter to your business - with a scope that'll give you 360° visibility into all factors impacting your site's or app's performance:

  • Server issues
  • Page errors
  • Network problems
  • 3rd-party asset issues

Some tips on dealing with issues:

First, you'll want to configure issue definitions.  For example, here's one common approach:

  • Critical Error:  We recommend treating network connectivity failures and page status codes >400 as critical errors (this is the default setting)
  • Error:  We recommend treating a failed HTML content check as an error
  • Warning:  We recommend treating page asset issues, user experience issues and network performance issues as warnings

Periodically, check the Issues summary to see how many events each issue has generated.  The more events an issue generates, the higher its priority.
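
If your monitoring tool can export its events, ranking issues by event count takes only a few lines. In this sketch the `issue` key and the sample issue names are hypothetical.

```python
from collections import Counter

def prioritize(events):
    """Return (issue, count) pairs, most frequent issue first."""
    return Counter(event["issue"] for event in events).most_common()

# Hypothetical exported events.
events = [
    {"issue": "slow third-party tag"},
    {"issue": "checkout page status 500"},
    {"issue": "slow third-party tag"},
    {"issue": "slow third-party tag"},
]
ranked = prioritize(events)
```

Here the slow third-party tag, with three events, lands at the top of the list.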

Tip #8: Carry out statistically meaningful sampling

If you want confidence in your decisions, you'll need confidence in the measurements that drive those decisions.  Statistically meaningful testing lets you eliminate random noise.  And in addition to collecting a large enough sample size, another important dimension is to schedule tests across the full range of geographies, browsers, and last-mile connection types that matter.
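
To get a feel for what "a large enough sample size" buys you, here is a minimal sketch that reports an average together with its margin of error. It assumes a normal approximation with z = 1.96 for roughly 95% confidence, which is a simplification that works best with dozens of samples or more.

```python
from math import sqrt
from statistics import mean, stdev

def mean_with_margin(samples, z=1.96):
    """Return (mean, margin) for an approximate 95% confidence interval."""
    average = mean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    margin = z * stdev(samples) / sqrt(len(samples))
    return average, margin
```

The margin shrinks with the square root of the sample count, so quadrupling the number of samples roughly halves the uncertainty around the average.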

So - get started today, for free!
