Bots are a pain in the butt. I look after a number of GA4 properties, and have spent more time than I’d like to admit identifying and dealing with bots. GA4 automatically filters known bots and spiders, but for the most part that means bots that are polite enough to identify themselves. Every week, I encounter new bots that are not so polite.
Below is a series of articles I wrote describing methods I’ve found for identifying bot traffic, filtering it from GA4 reports and excluding it from GA4 altogether.
- Hunting for Bots: follow along as I describe my hunting expedition. One down, infinity to go.
- How to identify bot traffic in GA4: some of the attributes we look at to determine whether a traffic surge is in fact bot traffic. These attributes can then be used to configure filters.
- How to remove bot traffic from GA4 reports: unfortunately, there is no way to remove bot traffic from historical data in GA4. But you can filter it from reports.
- How to filter bot traffic from GA4: the only built-in functionality for filtering traffic in GA4 requires that you know the IP address(es) you want to filter. This is rarely sufficient for filtering bots. In this article, I describe how you can set the GA4 traffic_type parameter in Google Tag Manager based on identifying characteristics of a bot.
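To make the last idea a little more concrete, here is a minimal sketch of the decision logic in TypeScript. It is not the code from the article: the `VisitorSignals` shape, the two example rules and the `"bot"` value are all hypothetical placeholders, and in Google Tag Manager the equivalent logic would more likely live in a Custom JavaScript variable whose result is sent as the traffic_type parameter.

```typescript
// Hypothetical snapshot of the browser attributes a tag manager variable could read.
interface VisitorSignals {
  userAgent: string;
  screenResolution: string; // e.g. "1920x1080"
}

// Illustrative placeholder rules only. Replace them with the characteristics
// you actually observed for the bot you are chasing.
const SUSPECT_USER_AGENT = /HeadlessChrome/i;
const SUSPECT_RESOLUTIONS = new Set<string>(["800x600"]);

// Returns the value to send as traffic_type, or undefined to leave it unset.
// "bot" is an assumed label; use whatever value your GA4 data filter matches on.
function trafficTypeFor(signals: VisitorSignals): string | undefined {
  const flagged =
    SUSPECT_USER_AGENT.test(signals.userAgent) ||
    SUSPECT_RESOLUTIONS.has(signals.screenResolution);
  return flagged ? "bot" : undefined;
}
```

Returning undefined for ordinary visitors means the parameter is simply not set for them, so only the flagged hits carry a value that a filter can act on.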
If bot traffic is a really big problem on your site, there are a couple of other things you should consider:
- Cloudflare can prevent bots from reaching your site in the first place. I expect this is true of other CDNs, but I have direct experience with Cloudflare and it works well.
- If you use the GA4 BigQuery export as your reporting back end, you can remove bot traffic after the fact. This is one of several benefits of the BigQuery export data. And if you are not sure how to get started, we can help.
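
As a rough illustration of removing bot traffic after the fact, the sketch below filters event rows that have already been read out of the export. It assumes a trimmed-down row shape (`ExportedEvent` keeps only a few of the export's fields) and a made-up bot signature (`exampleRule`); in practice you would more likely express the same condition in SQL against the export tables.

```typescript
// Trimmed-down, assumed subset of the fields in a GA4 BigQuery export row.
interface ExportedEvent {
  user_pseudo_id: string;
  event_name: string;
  geo: { country: string; city: string };
  device: { category: string };
}

// A bot signature is just a predicate over a row.
type BotRule = (event: ExportedEvent) => boolean;

// Hypothetical signature: desktop hits from one (made-up) city.
const exampleRule: BotRule = (event) =>
  event.geo.city === "Exampleville" && event.device.category === "desktop";

// Keep only the rows that do not match the bot signature.
function removeBotTraffic(events: ExportedEvent[], isBot: BotRule): ExportedEvent[] {
  return events.filter((event) => !isBot(event));
}
```

Because the raw rows stay in BigQuery, the rule can be tightened or loosened later and the reports rebuilt, which is not possible with data already processed in the GA4 interface.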