One of the concerns we keep hearing from our clients is, “Why don’t GA4 and Universal Analytics match?!?” The first thing that enters my head when I hear this question is, “Did you actually think Google Analytics was the truth?”
I wrote a blog post that gets into the mechanics of why web-based tracking is never an accurate reflection of what actually happened, but the TL;DR version is that digital analytics is good at capturing events that get sent to the analytics server. Metrics like “users” and “session duration” are fictions based on choices made by developers. Engage in a technical conversation with an analyst, and before long you are likely to hear, “it’s directionally accurate.”
That said, there are some practical reasons why you are likely to see differences between GA4 and Universal Analytics. Some are fixable and some are not. The following are the top causes of discrepancy we’ve encountered while setting up and troubleshooting hundreds of GA4 properties.
- GA4 and UA are configured differently – people often make the assumption that GA4 is the problem, but in my experience the number one cause of discrepancy is that GA4 and Universal Analytics are set up differently.
- Filters – GA4 doesn’t have views like Universal Analytics, and its data filters only cover internal traffic (defined by IP address) and developer traffic. Setting up IP filtering in GA4 is also quite confusing.
- Tag coverage – it’s fairly common for Universal Analytics tags to have been implemented directly on a site rather than through Tag Manager. We often find that the UA tags and the GA4 tags don’t cover the same set of pages.
- Different event triggers and rules – since GA4 is an entirely new platform and requires a fresh start, it’s easier to get things right. The flip side is that the new GA4 triggers and rules often don’t match what an older UA setup was doing. I’m working with one client right now whose UA setup predates the Internet, I’m pretty sure.
- 3rd-party widgets/add-ons – in my experience, chat widgets, ecommerce platforms, form plugins, etc. that have built-in GA tracking features behave a little or a lot differently between UA and GA4. For example (and I still can’t believe this), Shopify calculates revenue differently for GA4 vs. Universal Analytics with its native integration. Some tools also use the GA4 Measurement Protocol to send event data to GA4. With the Measurement Protocol, it is very easy to get session attribution wrong, which results in inflated session and user counts, as well as discrepancies in conversion attribution.
- Cookie-consent – I’ve been troubleshooting a number of cookie-consent setups recently and so far I haven’t encountered one that’s set up properly. Different consent behavior between UA and GA4 can result in significant differences.
- GA4 and UA calculate things differently – sessions and pageviews are not that different between the two, but how users are identified has changed and any metric related to engagement is totally different. These two videos cover a lot of the changes:
- GA4 vs. Universal Analytics – Key Differences
- GA4 Engagement Metrics: Session Duration and Average Engagement Time
- Thresholding – GA4 applies thresholding when it deems there is a risk of compromising personal information.
For example, dimensions such as search queries, age and gender could theoretically be used to narrow results down to individual users. I can live with thresholding when reporting on demographic data; that makes sense to me. But thresholding can also result in event counts zeroing out even when you are not using demographic dimensions.
This video has a good explanation of thresholding and a clever trick for getting around it (sometimes):
- Google Analytics 4 Events Not Showing Up in Reports
- Google help on thresholding
- Spam filtering – I’ve found that Universal Analytics does a better job of spam filtering, especially if the ‘Exclude all hits from known bots and spiders’ setting is enabled. I’ve seen situations where there is a huge traffic spike in GA4 resulting from spam, and no corresponding spike in UA. Sometimes, you can apply a report-level filter to get rid of it, if the traffic comes from a specific browser, location or other identifiable dimension.
- Cardinality – cardinality can be an issue when you have a dimension with a lot of unique values, such as Page path. Google indicates that a row limit will kick in when a dimension has more than 500 values, grouping values below row 500 into (other), but I’ve seen an (other) row show up in reports with far fewer rows than that.
- Sampling – sampled reports were an annoying problem in Universal Analytics, but I rarely run into sampling in GA4. Google says that reports only get sampled when they are based on more than 10 million events, which is pretty hard to get to for most sites. If you do run into sampling, you can just shorten your date range to make it go away.
- Estimation – data discrepancies are always annoying, but this one really takes the cake. GA4 User and Session metrics are based on estimations! This can cause discrepancies between two different GA4 reports, let alone between GA4 and UA. If you feel like making your brain hurt a little, take a gander at this excellent article on how GA4 estimation works.
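To make the Measurement Protocol point from the list above a bit more concrete, here is a minimal Python sketch of an event request that carries the visitor's existing client and session identifiers. The function name `build_mp_request` and the sample values are hypothetical, and the sketch assumes you have already captured `client_id` and `session_id` from the visitor's browser. It only builds the request and leaves sending it up to you. Without those two values, an event sent this way is likely to show up as a new user and session instead of joining the existing one.

```python
import json
from urllib.parse import urlencode

# GA4 Measurement Protocol collection endpoint.
MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def build_mp_request(measurement_id, api_secret, client_id, session_id,
                     event_name, params=None):
    """Return (url, json_body) for a single Measurement Protocol event.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    event_params = dict(params or {})
    # session_id ties the event to the session the visitor already has,
    # so GA4 does not treat the server-side hit as a separate session.
    event_params["session_id"] = str(session_id)
    # Google's docs list engagement_time_msec alongside session_id for
    # the activity to appear in reports such as Realtime.
    event_params.setdefault("engagement_time_msec", 1)

    query = urlencode({"measurement_id": measurement_id,
                       "api_secret": api_secret})
    body = {
        # client_id should match the one the browser tag is using.
        "client_id": client_id,
        "events": [{"name": event_name, "params": event_params}],
    }
    return f"{MP_ENDPOINT}?{query}", json.dumps(body)


# Example with made-up identifiers:
url, body = build_mp_request(
    "G-XXXXXXX", "my_api_secret", "1234567890.1700000000",
    1700000123, "chat_started",
)
```

If a third-party tool gives you no way to pass these identifiers through, that is a good hint about where inflated session and user counts may be coming from.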
A different issue that often gets bunched together with data discrepancies is the fact that the GA4 data source for Looker Studio works differently from the Universal Analytics data source. It can be tricky to get reports to show the data you want, which can lead to some pretty hacky workarounds. We have a lot of expertise with building GA4 dashboards, so reach out if this is a problem you are struggling with.