Adswerve

Unnest Your BigQuery Google Analytics 4 Data With Adswerve’s New Open-Source Tool


October 7, 2021

In one of our previous posts, we shared info about an open-source tool from Adswerve that unnests BigQuery (BQ) Google Analytics (GA) 360 data. We developed it for Universal Analytics (UA). Now we’re excited to announce a similar tool for the next generation of GA, Google Analytics 4 (GA4)! 

Background and Problem

Like UA, GA4 data can be exported into BQ. However, now this feature is free. And While GA4 data in BQ has a simpler structure than UA data, it still contains nested and repeated fields:

  • Nested data represents a struct, which is short for ‘Data Records Structure’. 
    • A struct is a flexible container that can hold data of different types (e.g., there can be strings and numbers in one struct).
    • In BQ, a struct can be described as a column inside of a column. The parent and child columns are separated by a dot. For example: event_params.key and event_params.value.
  • Repeated data represents an array. 
    • An array holds data of the same data types (e.g., only strings, numbers, or structs).
    • In BQ, an array can be described as a row inside of a row.

GA4 data in BQ has fields that are nested and repeated, or an array of structs. See the example below. It’s from our test GA4 data:

SELECT user_properties FROM
`{project}.analytics_{ga4_property_number}.events_{date_shard}` 
WHERE event_timestamp = 1626813980436000

  • There is a column user_properties with columns nested inside of it: key and value. There are additional columns nested inside the value column.
  • This is only one row of data, but there are rows inside of this row because it is repeated data.

Schema with nested and repeated data can make it harder to query the data (it takes a learning curve to get used to it, even if you are already experienced in SQL). And it may be incompatible with other data systems.  For example, you can’t import data into a spreadsheet.

Solution 

Our GA4 Flattener solves the above problems by flattening/unnesting GA4 data or converting all nested and repeated fields to regular fields (i.e., not a struct or an array, but a string or a number etc.). As a result of flattening, your GA4 data becomes familiar to analyze by using relational join queries. For flat data, imagine a spreadsheet table.

Below is the same data as in the previous example, but this time the data has been flattened:

SELECT * FROM `{project}.analytics_{ga4_property_number}.flat_user_properties_{date_shard}`
WHERE event_id like "%1626813980436000"

GA4 Flattener keeps your original GA4 data with nested and repeated fields, but it also writes four flat tables into the same BQ dataset.

Get Our GA4 Flattener Tool

We hope that our GA4 flattener tool will help you analyze GA4 data in BQ and extract actionable insights from it. You can find the repository, description and installation instructions on github.com, specifically at https://github.com/adswerve/google_analytics_flattener_ga4. We encourage all to join the open-source community and install this product.

Please don’t hesitate to reach out with any questions.