If you’ve seen suspicious activity in your Google Analytics property, you know how frustrating it is. Your historical data is polluted with non-human traffic, making it hard to understand what’s really happening on your site. Your first step is to identify the signature attributes of the bot. Next, you’ll want to use those characteristics to filter out the traffic in reporting. And if the traffic is continuing to hit your website, setting up a filter so it’s not being logged in your GA4 data is a smart move. This article describes how to do that, leveraging GA4’s “Internal traffic filters” in a way you may not have realized is possible.
In Universal Analytics, we could filter traffic out of a property view based on a variety of dimensions. In GA4, our options for filtering at the property level are far more limited. GA4 lets us create “Internal traffic filters” and that’s about it. This feature is poorly named – we can actually use it to filter any traffic based on IP address, internal or otherwise.
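What is less obvious and less well-documented is that the actual filtering is based on an optional ‘traffic_type’ event parameter. This method of filtering traffic requires two steps:
1. Define an Internal traffic rule, which sets traffic_type (“internal” by default) on hits from matching IP addresses.
2. Create a Traffic filter that excludes events carrying that traffic_type value.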
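But you don’t actually need to do step 1! If you set a value for the traffic_type parameter in your Google Tag, you can create a Traffic filter without creating an Internal traffic rule. This gives you A LOT more power to exclude traffic using Google Tag Manager (GTM). At a high level, this process looks like this:
1. In GTM, create a variable that outputs a value (for example, “bot”) when a hit matches the traffic you want to exclude.
2. Set the traffic_type parameter in your Google Tag to that variable.
3. In GA4, create a Traffic filter that excludes that traffic_type value.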
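To make the traffic_type parameter concrete, here is a minimal sketch of what setting it amounts to when expressed as gtag.js configuration. In GTM you would normally do this in the Google Tag's configuration settings rather than in code. The helper name `buildConfigParams` and the measurement ID `G-XXXXXXX` are placeholders for illustration, not part of any Google API.

```javascript
// Hypothetical helper: builds the configuration parameters for a Google tag.
// When trafficType is empty (null, undefined, or ''), the parameter is left
// out entirely, so the hit is recorded as ordinary traffic.
function buildConfigParams(trafficType) {
  var params = {};
  if (trafficType) {
    params.traffic_type = trafficType;
  }
  return params;
}

// In a browser with gtag.js loaded, usage might look like this
// ('G-XXXXXXX' is a placeholder measurement ID):
//   gtag('config', 'G-XXXXXXX', buildConfigParams('bot'));
```

Leaving the parameter out when there is no value means a single tag can serve both normal and excluded traffic.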
I walk through a real-world example of this approach in my article Hunting for Bots. In that case, I used JavaScript in GTM to identify a specific browser version and screen resolution that were associated with a bot.
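A check like that might be sketched as follows. This is not the code from the Hunting for Bots article. The signature values below (browser version string and screen resolution) are made up, and the function name `getTrafficType` is hypothetical, so substitute whatever your own data points to.

```javascript
// Hypothetical bot signature. Both values are invented for illustration;
// replace them with the attributes you found in your own reports.
var BOT_SIGNATURE = {
  browserVersion: 'Chrome/100.0.0.0',
  screenResolution: '800x600'
};

// Returns 'bot' when both the browser version and the screen resolution
// match the signature, and undefined otherwise.
function getTrafficType(userAgent, screenWidth, screenHeight) {
  var resolution = screenWidth + 'x' + screenHeight;
  var matchesBrowser = userAgent.indexOf(BOT_SIGNATURE.browserVersion) !== -1;
  var matchesScreen = resolution === BOT_SIGNATURE.screenResolution;
  return (matchesBrowser && matchesScreen) ? 'bot' : undefined;
}

// Inside a GTM Custom JavaScript variable, the same logic would be wrapped
// in an anonymous function that reads the values from the browser:
//   function() {
//     return getTrafficType(navigator.userAgent, screen.width, screen.height);
//   }
```

Requiring both attributes to match keeps the rule narrow, which lowers the chance of excluding real visitors who happen to share one of them.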
I also use this technique a lot to filter out dev traffic. Developers often work on a version of a site that has a different domain name or URL structure. To exclude dev activity from GA4, I create a regex lookup variable in GTM that outputs “dev” or “staging” based on this URL pattern. Then I follow steps 2 and 3 above to exclude the traffic. Note that when the traffic_type variable has a value of null, nothing happens – no harm, no foul.
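The regex lookup itself is configured in the GTM interface, but its logic can be sketched in JavaScript. The hostname patterns and the function name `lookupTrafficType` below are assumptions for illustration. Your dev and staging URLs will differ.

```javascript
// Hypothetical patterns, checked in order. The first match wins, which
// mirrors how a lookup table is read from top to bottom.
var ENVIRONMENT_PATTERNS = [
  { pattern: /^dev\./i, output: 'dev' },
  { pattern: /^staging\./i, output: 'staging' }
];

// Returns 'dev' or 'staging' for matching hostnames, and undefined for
// everything else, so production traffic gets no traffic_type value.
function lookupTrafficType(hostname) {
  for (var i = 0; i < ENVIRONMENT_PATTERNS.length; i++) {
    if (ENVIRONMENT_PATTERNS[i].pattern.test(hostname)) {
      return ENVIRONMENT_PATTERNS[i].output;
    }
  }
  return undefined;
}
```

With these placeholder patterns, `lookupTrafficType('dev.example.com')` yields "dev", while a production hostname such as `www.example.com` yields no value at all.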