How to Filter Bot Traffic from GA4


    Prevent bot traffic from being collected by GA4

    If you’ve seen suspicious activity in your Google Analytics property, you know how frustrating it is. Your historical data is polluted with non-human traffic, making it hard to understand what’s really happening on your site. Your first step is to identify the signature attributes of the bot. Next, you’ll want to use those characteristics to filter out the traffic in reporting. And if the traffic is continuing to hit your website, setting up a filter so it’s not being logged in your GA4 data is a smart move. This article describes how to do that, leveraging GA4’s “Internal traffic filters” in a way you may not have realized is possible.

    In Universal Analytics, we had the ability to filter traffic from a property view based on a variety of dimensions. In GA4 our options for filtering at the property level are a lot more limited. GA4 lets us create ‘Internal traffic filters’ and that’s about it. This feature is poorly named – we can actually use it to filter any traffic based on IP address, internal or otherwise.

    What is less obvious and less well-documented is that the actual filtering takes place based on an optional ‘traffic_type’ parameter. This method of filtering traffic requires two steps:

    1. Add an Internal traffic rule – when you do this you specify an IP address or range of addresses. The rule sets the value of the traffic_type parameter for all incoming events that match the IP address(es).
    2. Add a Traffic filter – if you set the filter type to ‘Internal traffic’, you can label or exclude traffic based on the value of the traffic_type parameter.

    But you don’t actually need to do step 1! If you set a value for the traffic_type parameter in your Google Tag, you can create a Traffic filter without creating an Internal traffic rule. This gives you A LOT more power to exclude traffic using Google Tag Manager (GTM). At a high level, this process looks like this:

    1. Create a traffic_type variable in Tag Manager using the full capabilities of JavaScript in GTM.
    2. Add a traffic_type parameter to your GA4 Google Tag that takes the value of your variable.
    3. Add a Traffic filter in GA4 based on the value you set.
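The first two steps above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the function names `trafficTypeFor` and `withTrafficType`, the `'bot'` label, and the placeholder measurement ID are hypothetical and not part of GTM or GA4. In GTM itself, the first function's logic would live in a Custom JavaScript variable, and the second is handled by adding a traffic_type parameter to the Google Tag's settings.

```javascript
// Sketch only: the names below are hypothetical, not part of GTM or GA4.

// Step 1: compute a label. In GTM this logic would live in a Custom
// JavaScript variable; here it takes a plain boolean so it can run anywhere.
function trafficTypeFor(isUnwanted) {
  // 'bot' is an arbitrary label; it only has to match the value used in
  // the GA4 Traffic filter created in step 3.
  return isUnwanted ? 'bot' : undefined;
}

// Step 2: attach the label as the traffic_type parameter, leaving the
// parameter off entirely when there is no label.
function withTrafficType(params, trafficType) {
  var result = {};
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      result[key] = params[key];
    }
  }
  if (trafficType) {
    result.traffic_type = trafficType;
  }
  return result;
}

// In a gtag.js setup (rather than GTM), the result could be passed as the
// config object, for example:
// gtag('config', 'G-XXXXXXXXXX', withTrafficType({}, trafficTypeFor(flag)));
```

Step 3 happens in the GA4 admin interface: the Traffic filter's parameter value would be set to the same label (`bot` in this sketch).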

    I walk through a real-world example of this approach in my article Hunting for Bots. In that case, I used JavaScript in GTM to identify a specific browser version and screen resolution that were associated with a bot.
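As a rough sketch of what such a check might look like, the function below matches a browser version and a screen resolution against a signature. The signature values (`Chrome/100.0.0.0` and `800x600`) are placeholders chosen for illustration, not the ones from the Hunting for Bots article, and `BOT_SIGNATURE` and `detectBot` are hypothetical names.

```javascript
// Hypothetical signature: placeholder values, not taken from a real bot.
var BOT_SIGNATURE = {
  userAgentPattern: /Chrome\/100\.0\.0\.0/,
  screenResolution: '800x600'
};

// Returns 'bot' only when BOTH the browser version and the screen
// resolution match the signature; otherwise returns undefined so that
// no traffic_type is set for ordinary visitors.
function detectBot(userAgent, screenWidth, screenHeight) {
  var resolution = screenWidth + 'x' + screenHeight;
  var matches =
    BOT_SIGNATURE.userAgentPattern.test(userAgent) &&
    resolution === BOT_SIGNATURE.screenResolution;
  return matches ? 'bot' : undefined;
}

// In a GTM Custom JavaScript variable, the equivalent would be roughly:
// function() {
//   return detectBot(navigator.userAgent, screen.width, screen.height);
// }
```

Requiring both attributes to match keeps the filter narrow, which reduces the chance of excluding real visitors who happen to share one of them.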

    I also use this technique a lot to filter out dev traffic. Developers often work on a version of a site that has a different domain name or URL structure. To exclude dev activity from GA4, I create a regex lookup variable in GTM that outputs “dev” or “staging” based on this URL pattern. Then I follow steps 2 and 3 above to exclude the traffic. Note that when the traffic_type variable has a value of null, nothing happens – no harm, no foul.
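The same lookup can be expressed in JavaScript. This is a minimal sketch, assuming dev and staging sites live on hostnames that start with `dev.` and `staging.`; those patterns, along with the names `ENVIRONMENT_RULES` and `trafficTypeForHost`, are hypothetical and would need to be adapted to your own URL structure.

```javascript
// Hypothetical hostname patterns; adjust these to your own environments.
var ENVIRONMENT_RULES = [
  { pattern: /^dev\./i, label: 'dev' },
  { pattern: /^staging\./i, label: 'staging' }
];

// Returns the label of the first rule whose pattern matches the hostname,
// or undefined when nothing matches (production traffic gets no label).
function trafficTypeForHost(hostname) {
  for (var i = 0; i < ENVIRONMENT_RULES.length; i++) {
    if (ENVIRONMENT_RULES[i].pattern.test(hostname)) {
      return ENVIRONMENT_RULES[i].label;
    }
  }
  return undefined;
}

// In a GTM Custom JavaScript variable, the equivalent would be roughly:
// function() { return trafficTypeForHost(window.location.hostname); }
```

A Regex Table variable in GTM does the same job without code; the function form is mainly useful when the rule needs to combine the hostname with other conditions.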
