Why we filter out spam bots from Google Analytics data

on 20th July 2021

(Last updated 23rd April 2024)

We recently took on a client whose previous agency was very focused on using Google Analytics to drive development decisions. It was good to have a new client that understood the importance of measuring performance with quantitative data such as Google Analytics.

There was one major issue though. Their previous agency had used ‘raw’ Google Analytics data for their insights and recommendations. Not only were they using raw data to measure the site’s performance, they had also used this raw data as a performance metric for some of their commercial partners.

This would not have been a problem if we also used raw Google Analytics data to evaluate a website's performance. But we do not, and for some very important reasons. They involve robots, or ‘bots’.

What is bot traffic and how much does it inflate site traffic?

Bot traffic is the result of any website request made by an automated process rather than a human interaction, and it can make up a staggering 40% of your website's traffic.

As well as artificially inflating a website's traffic, bots also do not behave as real humans do. Bot traffic results in:

  • Abnormally high page views
  • Abnormally high bounce rates
  • Abnormally high or low session duration
  • Spikes in traffic
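
As a rough illustration of the patterns listed above, the sketch below flags traffic sources whose aggregate numbers look automated. It assumes you have exported per-source totals from your analytics tool. The `SourceStats` fields, the function name and every threshold are hypothetical and would need tuning against your own site's baseline.

```python
from dataclasses import dataclass


@dataclass
class SourceStats:
    """Aggregated figures for one traffic source (hypothetical export format)."""
    source: str
    sessions: int
    bounces: int
    total_duration_seconds: float


def suspicious_sources(stats, min_sessions=50, max_bounce_rate=0.95,
                       min_avg_duration=2.0, max_avg_duration=3600.0):
    """Return the names of sources whose bounce rate or average session
    duration falls outside the (assumed) normal range."""
    flagged = []
    for s in stats:
        # Skip sources with too few sessions to judge fairly.
        if s.sessions < min_sessions:
            continue
        bounce_rate = s.bounces / s.sessions
        avg_duration = s.total_duration_seconds / s.sessions
        if (bounce_rate >= max_bounce_rate
                or avg_duration < min_avg_duration
                or avg_duration > max_avg_duration):
            flagged.append(s.source)
    return flagged


example = [
    SourceStats("google", 1000, 450, 90000.0),
    SourceStats("spam-referrer.example", 200, 200, 0.0),
    SourceStats("tiny-source.example", 5, 5, 0.0),
]
flagged_example = suspicious_sources(example)
```

A flagged source is only a candidate for review, not proof of bot traffic: a genuine campaign landing page can also produce a high bounce rate.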

By reviewing our client’s historic raw GA data, we realised they had a very skewed view of how real people were using the website. As well as presuming ALL their website traffic was made up of humans, they were also assuming that the metrics usually used to explore user behaviour were accurate. In reality, their traffic was vastly inflated by bots, and metrics such as dwell times, page views and bounce rates were highly misleading.

How we clean up GA data

There are a few things we always do when setting up GA for the first time or, as in this case, when we inherit a GA account.

  • We make sure there is a view for raw data. If there isn’t one, we set it up. This gives us a ‘control’ view of all data.
  • We filter out bot traffic in the view settings, and create a Google Analytics view that excludes these bots.
  • We filter out our own and our clients’ IP addresses.
  • We monitor numbers regularly, and set up custom alerts, so that if traffic spikes (or dips) overnight we can spot bot traffic quickly.
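
The last two steps can be sketched in a few lines of Python. This is a minimal illustration and not how Google Analytics implements its filters or alerts. It assumes you have request IP addresses from your own server logs (GA reports do not expose them) and a list of daily session counts. The function names, the example network and the 50% tolerance are all hypothetical.

```python
import ipaddress
from statistics import mean


def is_internal(ip, internal_networks):
    """Return True if the address belongs to one of our own or our
    clients' networks (given as single addresses or CIDR ranges)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in internal_networks)


def spike_or_dip_days(daily_sessions, window=7, tolerance=0.5):
    """Return the indices of days whose session count differs from the
    average of the preceding `window` days by more than `tolerance`
    (0.5 means 50%)."""
    alerts = []
    for i in range(window, len(daily_sessions)):
        baseline = mean(daily_sessions[i - window:i])
        if baseline == 0:
            continue  # no meaningful baseline to compare against
        change = (daily_sessions[i] - baseline) / baseline
        if abs(change) > tolerance:
            alerts.append(i)
    return alerts


# Hypothetical office network, taken from a documentation-only IP range.
office_networks = ["203.0.113.0/24"]
internal_hit = is_internal("203.0.113.10", office_networks)
external_hit = is_internal("198.51.100.1", office_networks)

# Seven ordinary days, a normal day, then a sudden spike and a sharp dip.
sessions = [100, 100, 100, 100, 100, 100, 100, 100, 400, 20]
alert_days = spike_or_dip_days(sessions)
```

Comparing each day with a trailing average rather than a fixed number keeps the alert useful as a site's normal traffic grows or shrinks over time.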

It can't be guaranteed that your website traffic will be completely bot-free. But by keeping an eye on analytics with automatic monitoring and notifications, and making the data as accurate as possible, you can be confident that your analytics give you a much more realistic picture of how real people are behaving on your website.

Below are some practical guides on how to filter spam, bot traffic and internal IP addresses out of Google Analytics:

How to filter out spam referrals from Google Analytics 4

How to exclude internal traffic from Google Analytics using an IP address filter


This article was posted in Development, SEO