GA4 doesn’t have a metric that is comparable to the Page Value metric from Universal Analytics. In this post, we will recreate the metric using GA4 data in Google BigQuery.
Google defined the UA metric as:
Page Value is the average value for a page that a user visited before landing on the goal page or completing an Ecommerce transaction (or both). This value is intended to give you an idea of which page in your site contributed more to your site’s revenue. If the page wasn’t involved in an ecommerce transaction for your website in any way, then the Page Value for that page will be $0 since the page was never visited in a session where a transaction occurred.
— from How Page Value is calculated
The free integration with BigQuery is one of the things that got me excited about GA4 in the early days after it’s release. I love playing with data and it seemed like the ultimate playground. But actually making GA4 data useful in BigQuery was a much heavier lift than I expected. If you are in the same place as I was, be patient and do the work. I’m only part way to where I want to be, but I am starting to feel unbound by the constraints GA4, UA or any other tool impose on how I analyze and use website visitor data.
A couple of resources I can’t recommend too highly for learning the ins-and-outs of GA4 data in BigQuery:
This query is actually a series of subqueries that work together to produce the result we are looking for. I’ve separated out each subquery below so I can explain what is happening along the way. These subqueries won’t work by themselves, so you can copy the full query from here.
Keep in mind that analysis in BigQuery does incur costs, though in my experience they tend to be pennies per month unless I am working with very large datasets.
The ‘with’ clause below is a way to organize subqueries for readability. I could have glommed them all together into one statement with a bunch of inline subqueries – the syntax ‘with x as (query)’ does exactly the same thing. When done this way, you can reference ‘x’ as a table in subsequent queries.
The first subquery is just a sneaky trick for setting a date range in one place versus having to update it in multiple places.
with dates as (
select
'20230812' as start_date,
-- the next line gets yesterday
format_date('%Y%m%d',date_sub(current_date(), interval 1 day)) as end_date
),
The query below gets all events with associated revenue and timestamps. Note that it is getting value in USD. The ecommerce.purchase_revenue field stores revenue in the GA4 property default currency, but I chose to use event_value_in_usd instead, because it also aggregates non-ecommerce conversion values.
I am concatenating user_pseudo_id and ga_session_id to get session_id. This is because ga_session_id is really just a timestamp, so not necessarily unique to a user. Weird.
Make sure to change the table reference to your table. The syntax ‘events_*’ followed by ‘where _table_suffix …’ is because the GA4 events table is date-partitioned. Each partition reference ends in ‘_YYYYMMDD’. Using ‘_table_suffix’ in the where clause enables you to query across partitions.
p as (
select
concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) as session_id,
event_timestamp,
sum(event_value_in_usd) as event_value
from `your-gcp-project.your-ga4-dataset_123456.events_*`
where _table_suffix between (select start_date from dates) and (select end_date from dates)
and event_value_in_usd is not null
group by session_id, event_timestamp),
The next query gets all page views with timestamps.
q as (
select
concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) as session_id,
event_timestamp,
(select value.string_value from unnest(event_params) where key = 'page_location') as page_location
from `your-gcp-project.your-ga4-dataset_123456.events_*`
where _table_suffix between (select start_date from dates) and (select end_date from dates)
and event_name = 'page_view'
),
In the following subquery I am doing several odd/tricky things:
r as (
select
concat(q.session_id,p.event_timestamp) as pseudo_session_id,
p.event_value,
q.page_location,
(case when (q.event_timestamp
The next query is where I divvy up event value for each page view. I accomplish this with a window function. Window functions are too big a topic to fully explain here, but the basic idea is that a window function can do operations on multiple rows in a dataset in contrast with a standard select statement which operates row-by-row.
s as (
select
pseudo_session_id,
page_location,
event_value / (count(page_location) over (partition by pseudo_session_id)) as page_revenue
from r
where is_before_revenue_event),
We’re getting close now. This query just sums up revenue by page. At this point, this is just an accumulation of page value. We need to take into account how many times each page has been viewed overall to replicate the Universal Analytics Page Value metric. We will do that in the next query.
t as (
select
page_location,
sum(page_revenue) as page_revenue
from s
group by page_location),
This query gets a count of all page views by page_location. We’ll use this as the denominator for page_value in the last query.
u as (select
page_location,
count(page_location) as views
from q
group by page_location)
Bringing it all together, we divide the sum of value for each page by the total number of times the page was viewed to get page_value. Boom!
select
u.page_location,
u.views,
t.page_revenue/u.views as page_value
from u
left join t on u.page_location = t.page_location
order by views desc
A few things to know/keep in mind:
One of the things I love about working with GA4 data in BigQuery is that once you have figured out how to replicate a GA4 dimension or metric (or a metric from UA in this case), you can tweak your query to better suit your needs. Here are a few things I’ve done or thought about doing with this query:
So far we’ve gone over how to generate a page value metric, but you probably also want to incorporate it into your reporting. If you are proficient in BigQuery, you probably have ideas of how you can do that, but if not, Looker Studio is a great option. The BigQuery connector in Looker Studio has a CUSTOM QUERY option that allows you to execute this query directly from Looker Studio. To use it:
with dates as (
select
@DS_START_DATE as start_date,
@DS_END_DATE as end_date
),
What’s great about this approach is that you can update the dates with a standard Looker Studio date range control. But also keep in mind that this will run the query every time you view the data. If you have a high volume of GA4 events or your dashboard gets a lot of views, this method could incur some BigQuery costs.
Learn more about GA4 reporting in Looker Studio in our tutorials, including how to report on individual GA4 conversions in Looker Studio and how to create a GA4 scroll tracking report in Looker Studio.
Two Octobers upskills digital marketers of all skill levels. Check out our GA4 and Looker Studio small-group training program.
Nico loves marketing analytics, running, and analytics about running. He's Two Octobers' Head of Analytics, and loves teaching. Learn more about Nico or read more blogs he has written.
This month we cover an intriguing study on search behavior and a fascinating trove of…
Musings on "what matters is measurable", new Looker Studio features, a cool free GA4 audit…