Yottaa’s Web Performance Optimization Blog
The Yottaa Web Performance blog covers various web performance topics - web page performance, website speed, page load time, web acceleration, performance benchmark test, YSlow score and web performance optimization. We also touch on Cloud Computing and Start Up activities from time to time.
Google Analytics: How to segment and filter out robot traffic
Posted on March 2nd, 2011 by cweekly
Google Analytics (“GA”) is the most popular web analytics tool on the Web, largely because it is both free and excellent. In the past, we have blogged about direct UX measurement in GA, and we are able to provide our beta customers with reports and data visualizations that combine Yottaa performance metrics with business metrics from GA. It’s exciting stuff if you share our passion for web performance and analytics.
Because performance monitoring systems like ours can impact GA reports, it feels right to spend a little time helping our users and the larger GA community with explicit instructions for filtering and/or segmenting traffic coming to your websites. These directions are specific to GA, but the principles are easily applied to other web analytics systems (such as Coremetrics Analytics or Omniture SiteCatalyst).
THE PROBLEM:
First, a description of the problem: some types of traffic to your website should simply never be counted in your reports. Internal traffic (from developers and testers) is one such category. Traffic generated by search engine crawlers like Googlebot is another. Similarly, traffic from automated solutions for testing or monitoring your site, such as Yottaa Insight, Keynote, Gomez and BrowserMob, should not appear in your metrics.
Because GA is implemented using JavaScript, simple crawlers that just follow links and don’t know how to execute JavaScript are automatically ignored by GA. However, there are increasingly sophisticated bots out there, including our own Yottaa Website Performance Monitoring robots, which can’t win at Jeopardy just yet but do know how to do things like run JavaScript and accept cookies. These smart bots are tougher to distinguish from normal human users, and by default GA will include the traffic they generate. That is why you need to create custom segments and/or filters in GA. In both cases, you simply need to teach GA how to identify the bots by looking at their browser type (which comes from the “User-Agent” header). Yottaa Website Performance Monitoring bots always identify themselves with “YottaaMonitor” in the User-Agent header.
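To make the identification rule concrete, here is a minimal sketch of the same check done by hand: a case-insensitive search for the “YottaaMonitor” token in a User-Agent string. The function name and both sample User-Agent strings are made up for illustration; only the token itself comes from the text above.

```python
def is_yottaa_monitor(user_agent: str) -> bool:
    """Return True if the User-Agent string contains the YottaaMonitor token.

    The comparison is case-insensitive, mirroring the unchecked
    "case sensitive" option used in the GA steps in this post.
    """
    return "yottaamonitor" in user_agent.lower()


# Hypothetical User-Agent strings, for illustration only.
bot_ua = "Mozilla/5.0 (compatible; YottaaMonitor)"
human_ua = "Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"

for ua in (bot_ua, human_ua):
    label = "bot" if is_yottaa_monitor(ua) else "human"
    print(f"{label}: {ua}")
```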
GOOGLE ANALYTICS SEGMENTS AND FILTERS DEMYSTIFIED:
Before we dive into the step-by-step instructions, a quick overview of GA segments and filters may be helpful. Segments are merely ways of grouping users in your reports. Using segments will alter your view into your data, but will not change what’s actually being collected. Segments apply retroactively, which is to say that when you define a segment, you can then view all your historical data through the lens of the segment. So if you see traffic from Yottaa or other bots “polluting” your reports, have no fear, you can easily make it go away.
GA filters are more invasive than segments, in that they don’t just alter your view, they actually impact what is stored by GA and available for reporting. Filtering cannot be applied retroactively and only affects data collected after the filter has been created. Some GA users feel more comfortable first testing and refining custom segments to be sure they’ve got it right, then creating a filter (using the same rules or logic) once they’re sure. Alternatively, you can create a duplicate profile for the same domain and only apply your filter to one of them, thus preserving collection of all “raw” data off to the side, while leveraging the power of filters in your main reporting profile. (See https://www.google.com/support/analytics/bin/answer.py?answer=55494 for more detail on this approach.)
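The difference between the two can be sketched with a toy model. This is not how GA works internally; the `Profile` class and its methods are hypothetical and only illustrate the behavior described above: a filter decides what gets stored from the moment it is switched on, while a segment is a view over whatever is already stored.

```python
from dataclasses import dataclass, field

BOT_TOKEN = "yottaamonitor"


def is_bot(browser: str) -> bool:
    """Case-insensitive check for the bot token in the browser value."""
    return BOT_TOKEN in browser.lower()


@dataclass
class Profile:
    """Toy stand-in for a GA profile (hypothetical, for illustration)."""
    exclude_bots: bool = False            # the "filter" switch
    stored: list = field(default_factory=list)

    def collect(self, browser: str) -> None:
        # A filter acts at collection time: excluded hits are never stored.
        if self.exclude_bots and is_bot(browser):
            return
        self.stored.append(browser)

    def segment_humans(self) -> list:
        # A segment only changes the view; stored data is untouched.
        return [b for b in self.stored if not is_bot(b)]


profile = Profile()
profile.collect("Firefox")
profile.collect("YottaaMonitor")   # stored, because no filter exists yet
profile.exclude_bots = True        # filter created from here on
profile.collect("YottaaMonitor")   # dropped for good
profile.collect("Chrome")

print(profile.stored)              # the early bot hit is still stored
print(profile.segment_humans())    # the segment hides it retroactively
```

In this model the bot hit collected before the filter existed stays in the stored data, which is why a segment is still useful for cleaning up historical reports.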
Ok, without further ado, here’s what to do.
STEP-BY-STEP INSTRUCTIONS FOR CREATING A CUSTOM SEGMENT TO HIDE YOTTAA BOT TRAFFIC FROM YOUR GA REPORTS:
- Log in to GA
- Click “View Reports” for your site / profile
- In the left column under “My Customizations”, choose “Advanced Segments”:

- Choose “Create new custom segment”:

- Under Dimensions > Systems, choose “Browser”:

… and drag it to the “dimension or metric” area:

- Edit the Condition to “Does not contain” (don’t check the “case sensitive” checkbox)
- Under “Value”, type “YottaaMonitor” (without the quotes):

- Name the segment something like “Humans (no bots)”
- Click “Test Segment” (it should match on some number of visits, if you’ve been getting traffic from Yottaa monitoring bots)
- Click “Create Segment”, and you’re done.
Note: if you find unwanted traffic from other bots too, you can either (a) create an additional “and” condition and define a second rule, or (b) change the Condition to “Does not match regular expression” and define a regex value that matches multiple bot names, e.g. “.*(YottaaMonitor|OtherBotNameHere).*” (without the quotes).
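If you want to try a multi-bot pattern before typing it into GA, a rough local check might look like the sketch below. It uses Python's `re` module, whose regex dialect may differ in details from GA's, and the helper name `in_human_segment` is made up for illustration. “OtherBotNameHere” is the same placeholder used above, not a real bot name.

```python
import re

# Same pattern as in the note above; IGNORECASE mirrors leaving the
# "case sensitive" checkbox unchecked.
BOT_PATTERN = re.compile(r".*(YottaaMonitor|OtherBotNameHere).*", re.IGNORECASE)


def in_human_segment(browser: str) -> bool:
    """Emulate "Does not match regular expression" for one browser value."""
    return BOT_PATTERN.search(browser) is None


# Sample browser values, for illustration only.
for value in ("Firefox", "YottaaMonitor", "otherbotnamehere 1.0"):
    print(value, "->", "kept" if in_human_segment(value) else "excluded")
```

With `re.search` the leading and trailing `.*` are not strictly needed, but they are kept here so the pattern is identical to the one entered in GA.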
That’s it for segments. Now on to filters:
STEP-BY-STEP INSTRUCTIONS FOR CREATING A FILTER TO PREVENT YOTTAA BOT TRAFFIC FROM BEING COLLECTED IN GA:
- Log in to GA
- Click “Analytics Settings”
- Click “Filter Manager>>” (in the bottom-right corner of the page)
- Click “+ Add Filter”

- Name your filter (e.g. “Exclude YottaaMonitor”)
- Choose “Custom filter”
- Filter Type: [Exclude]
- Filter Field: select “Visitor Browser Program”
- Filter Pattern: “.*YottaaMonitor.*” (without the quotes)
- Case Sensitive: No

- Select your relevant website profile(s) from “Available Website Profiles” on the left, and choose the “Add” button to move them to the “Selected Website Profiles” area on the right
- Click “Save Changes”
ROBOTS.TXT AND BLOCKING MONITORING:
Finally, a note about the “robots.txt” Robots Exclusion Standard. Yottaa bots respect the rules of the road and will obey instructions found in robots.txt files. However, we strongly recommend against outright blocking of our bots, as doing so will prevent highly useful, free performance metrics from being collected. Filtering bot traffic out of your analytics tool of choice is simple, and allows you to continue monitoring your site in yottaa.com while keeping your analytics clean.
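To see what an outright block would mean in practice, the sketch below uses Python's standard `urllib.robotparser` to evaluate two sample robots.txt files. It assumes the robots.txt user-agent token is “YottaaMonitor”, the same string the bots send in the User-Agent header; that mapping is an assumption, and `example.com` is a placeholder.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Blocks the monitoring bot entirely (what this post recommends against).
# The "YottaaMonitor" token here is an assumption, not confirmed syntax.
BLOCKING = """\
User-agent: YottaaMonitor
Disallow: /
"""

# Leaves every well-behaved bot free to fetch all pages.
OPEN = """\
User-agent: *
Disallow:
"""

url = "http://example.com/"
print(allowed(BLOCKING, "YottaaMonitor", url))
print(allowed(OPEN, "YottaaMonitor", url))
```

With the first file, monitoring stops altogether; with the second, monitoring continues and the bot traffic is handled in the analytics tool instead.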
WHAT ABOUT YOU?
Do you have experience in adding segments and filters in GA? Do you have experience with segments and filters in other web analytics systems? Will you implement these as described above? Were we clear in our instructions? Feedback is always welcome.