How to connect GA4 to BigQuery

One of the crucial differences between Universal Analytics and Google Analytics 4 is the free native GA4 BigQuery integration that GA4 has. 

This integration is a step forward for Google Analytics as a product. BigQuery allows you to work with the raw data and conduct any research with the data you are interested in. 

Of course, the core element of this is to connect them right. Therefore, I wrote this article to show you how to connect Google Analytics 4 with BigQuery and answer all relevant questions.

What’s BigQuery?

BigQuery is a cloud data warehouse created by Google. In short, it’s a place where you can store your data in a tabular-like format and use SQL language to query it. You can write the data into BigQuery, and you can read this data. 

It's how your GA4 data will look in BigQuery after you connect Google Analytics 4 to BigQuery.
GA4 data in BigQuery. It’s what you get once you connect Google Analytics 4 to BigQuery.

You can also export your BigQuery data into Google Sheets or .CSV file and work with this as you would work with any data in Excel.

Why do you need to connect GA4 to BigQuery?

There are at least two benefits of connecting your GA4 account with BigQuery:

  • As I mentioned above, you can access the raw data. It means the GA4 interface does not limit you, and if you know SQL or have the data team, you can get any report you are interested in because you have access to granular data.
  • BigQuery allows you to overcome GA4 data freshness and have real-time analytics. It means you can use BigQuery and Looker Studio (previous Data Studio) to visualise your data and take actions faster than relying on just the GA4 interface. 
  • As a sequence of the first point – access to the raw data, you can build machine-learning models for your product or use your data for a better forecast. 

As a result, you can hear how many companies (including big ones, for instance, Spotify) have started to benefit from using BigQuery. It’s indeed a valuable asset for any business. Moreover, if you rely entirely on GA4, you can consider BigQuery to be able to meet with your team on Monday and overview the previous week’s performance. Otherwise, you won’t be able to do it due to GA4 data freshness.

How much does GA4&BigQuery cost?

So far, we talked about benefits; let’s talk about associated costs now.

The BigQuery and GA4 integration is free, but you must pay for BigQuery. BigQuery costs depend on two parts: a) storage and b) analysis pricing

If the first depends on the amount of data you store in BigQuery, the second refers to the cost to process queries, including SQL queries, user-defined functions, scripts, and certain data manipulation language (DML) and data definition language (DDL) statements that scan tables.

Google provides a free tier. Thus, you can store up to 10gb data every month for free, and the first 1 TB of query data processed monthly is free.

It’s also worth mentioning that if you don’t touch data for 90 days, you will pay less for the storage. It’s called long-term storage, according to BigQuery documentation.

How to use BigQuery with GA4 for free?

Although I wrote above that there are associated costs for using BigQuery and a free tier, you can store your data without adding billing information to the Google Cloud project at all.  In this case, your data will have an expiration period of 60 days (limitations). It means that your BigQuery dataset will have only data about the last 60 days. 

Also, you won’t be able to use data streaming mode in BigQuery integration since it’s allowed only for Google Cloud accounts with billing enabled. Because of that, you won’t have the real-time analytics benefit I mentioned previously in this article.

Why should you pay for BigQuery using GA4 for free?

As you see, the last 60 days’ limitation and no real-time analytics are huge limitations.  I encourage you to consider adding billing details to your account. 

The free tier will likely be enough for your monthly usage if you are a small business. However, constantly monitor your costs, and set up alerts in Google Cloud.

For instance, I pay less than $1 monthly for my website, but I also use Google Cloud Functions, Schedulers and other Google products.

How much it will cost you if you connect Google Analytics 4 with BigQuery
Showing how much it costs me monthly after I connected Google Analytics 4 with BigQuery.
Please refer to this article to learn how to reduce BigQuery costs.

How to integrate Google Analytics 4 with BigQuery?

Now, after we are on the same page, you understand the benefits of using BigQuery and which format (free or paid) you stick to. Let’s look at how to connect Google Analytics 4 with BigQuery. Let’s go through each step and pair them.

  1. Create a Google Cloud Platform account

  2. Create a new project on Google Cloud (name it your organization or website) and enable billing in your project (optional, according to what you selected in the previous chapter of this article)

  3. Open the GA4 Admin page and click on “BigQuery links”, click on “Link” button

  4. Choose your Google Cloud (BigQuery) project (that was created in the second step), select Data Location and click “Next.”

  5. Configure data streams and events you want to store in BigQuery (data streams, frequency, etc.)

  6. Review the settings provided and click the “Submit” button. 

You linked your GA4 property with BigQuery successfully. 

When will you see the first data in BigQuery?

After you link GA4 and BigQuery, you should wait at least 24 hours to see the first data. You can see the data faster by enabling the billing and using streaming integration mode.

To find your data, go to Cloud Console – > BigQuery -> SQL workspace.

GA4 Data in BigQuery
Google Analytics 4 Data in BigQuery after I connected them and waited for 24 hours

BigQuery GA4 schema: 4 tables

When you export your GA4 data to BigQuery, you will likely see four tables within your project; this depends on what you chose in your GA4 settings when you made the integration or the data your GA4 property is collecting. These tables can be grouped into two groups: the Events and the User Schema. Let’s look at them in detail.

Events Table Schema

The event schema in a Google Analytics 4 (GA4) BigQuery project defines the structure and fields of the event data that is stored in the BigQuery tables. It specifies the information captured and available for analysis in the GA4 property.

Events Daily Schema

This table is your GA4 data table that contains both event data and some user data. It might be the only data table in your project, depending on your choice of data export during the GA4 BigQuery integration setup. The data schema is named “events_numberofdays,” e.g., events_258, as seen in the image below, shows that the number of days of exported data is 258. When you click on the schema, you can see the individual tables for each day.

GA4 BigQuery Events Daily Schema
GA4 & BigQuery Events Daily Schema

The schema of a Google Analytics 4 (GA4) BigQuery event data table includes the following fields:

  • event_timestamp: The timestamp of when the event occurred.
  • event_name: The name of the event.
  • event_bundle_sequence_id: The sequence ID of the event bundle.
  • event_bundle_sequence_index: The index of the event within the bundle.
  • event_bundle_id: The ID of the event bundle.
  • user_id: The unique ID of the user.
  • user_pseudo_id: The pseudonymous ID of the user.
  • user_first_touch_timestamp: The timestamp of the user’s first touch.
  • user_ltv: The lifetime value of the user.
  • device: Information about the user’s device, such as device category, brand, and model.
  • geo: Information about the user’s geographic location, such as country, region, and city.
  • app_info: Information about the app, such as app ID, version, and name.
  • traffic_source: Information about the source of the traffic, such as source, medium, and campaign.
  • stream_id: The ID of the data stream.
  • platform: The platform on which the event occurred, such as web, iOS, or Android.
  • event_params: Additional parameters associated with the event, such as event value, currency, and items.
  • user_properties: Additional properties associated with the user, such as user type, language, and age.
  • user_first_visit_timestamp: The timestamp of the user’s first visit.
  • user_engagement_time_msec: The engagement time of the user in milliseconds.
  • event_previous_timestamp: The timestamp of the previous event.
  • event_value_in_usd: Indicates the value associated with the event in USD
  • privacy_info: The privacy_info fields store data related to a user’s consent status when consent mode is turned on.
  • event_dimensions: Contains dimensions associated with the event.
  • ecommerce: The ecommerce record stores details pertaining to ecommerce events that have been configured on a website or app.
  • items: The items record holds information about individual items that are part of an event. It is repeated for each item included in the event.
  • collected_traffic_source: The collected_traffic_source record stores the data related to the traffic source that was collected along with the event.
  • is_active_user: It indicates the activity status of a user. 

These are some of the common fields in a GA4 BigQuery event data table schema. The actual schema may vary depending on the specific configuration and events tracked in your GA4 property.

Events Intraday Schema

When you select the streaming option during the integration, real-time events data are stored in this table. So, each time you view the table, it shows the events data for that current day.

GA4 BigQuery Streaming Export Type
GA4 BigQuery Streaming Export Type

Users Data Tables Schema

The user data tables in a Google Analytics 4 (GA4) BigQuery project store information about individual users and their interactions with your app or website. They contain user data along with audience and prediction data, which are not available in the events table.

Pseudonymous Users Schema

The Pseudonymous_users table is the user table. The data schema is named “pseudonymous_user_numberofdays,” e.g., pseudonymous_user_53, as seen in the image below, shows that the number of days of exported data is 53. When you click on the schema, you can see the individual tables for each day.

GA4 & BigQuery Linking Pseudonymous Users Schema
GA4 & BigQuery Linking Pseudonymous Users Schema

The schema of a Google Analytics 4 (GA4) BigQuery pseudonymous user table includes the following fields:

  • pseudo_user_id: The pseudo-ID of the user.
  • user_info: Contains user details like last active, first active
  • stream_id: The stream id the data was exported from
  • device: Information about the user’s device, such as device category, brand, and model.
  • geo: Information about the user’s geographic location, such as country, region, and city.
  • audiences: audience data.
  • user_ltv: The lifetime value of the user
  • user_properties: Additional properties associated with the user, such as user type, language, and age.
  • predictions: The predicated user’s data.
  • privacy info: Privacy information associated with with the user
  • occurrence date: Date when the record was updated in the table

These are some of the common fields in a GA4 BigQuery user table schema. The actual schema may vary depending on the specific configuration and events tracked in your GA4 property.

Users Schema

Suppose you are collecting User ID in your GA4 property, and you selected daily export during the GA4 BigQuery integration set up under user data. In that case, this table will be populated and listed in your BigQuery project.

GA4 & BigQuery Users Schema
GA4 & BigQuery Users Schema

The schema of a Google Analytics 4 (GA4) BigQuery User table includes the following fields:

  • pseudo_user_id: The pseudo-ID of the user.
  • user_info: Contains user details like last active, first active
  • device: Information about the user’s device, such as device category, brand, and model.
  • geo: Information about the user’s geographic location, such as country, region, and city.
  • audiences: audience data.
  • user_properties: Additional properties associated with the user, such as user type, language, and age.
  • user_ltv: The lifetime value of the user
  • predictions: The predicated user’s data.
  • privacy info: Privacy information associated with with the user
  • occurrence date: Date when the record was updated in the table
  • last updated at: Date when the record change was triggered

How to visualise GA4 BigQuery data in Google Looker Studio?

To visualise GA4 BigQuery data in Google Looker Studio, you must select BigQuery as a data source. 

To do this, simply go to the Looker Studio welcome page and click on the create a data source option.

Creating Data Source in Looker Studio
Creating Data Source in Looker Studio

Click on BigQuery and select the BigQuery project you would like to connect to by identifying its project number, navigate the dataset, and then select the table you want to use.

Connecting BigQuery to Looker STudio
Connecting BigQuery to Looker Studio

You can now use Looker’s drag-and-drop interface to create charts, graphs, tables, and other visualisations based on your GA4 BigQuery data.

BigQuery&GA4 integration limits

There are several limits and considerations when integrating BigQuery with Google Analytics 4 (GA4). Here are some of the key limits to be aware of:

  1. Storage limits: BigQuery has storage limits based on the pricing tier you choose. Free-tier users have a limit of 10 GB, while paid users can store up to petabytes of data.
  1. Query limits: BigQuery has query limits based on the amount of data processed daily. Free-tier users have a limit of 1 TB per day, while paid users have higher limits based on their pricing plan.
  1. Concurrent queries: BigQuery limits the number of concurrent queries you can run. The limit varies based on your pricing plan.
  1. Data transfer costs: Moving data between GA4 and BigQuery may incur data transfer costs, depending on the amount of data transferred. Standard GA4 properties have a maximum limit of 1 million events for Daily (batch) exports, while there are no restrictions on the number of events for Streaming exports.

Refer to Google’s documentation for further details.

Do you need help with setting up GA4&BigQuery linking?

Optimise your GA4 data by linking your GA4 with BigQuery. Our agency specialises in seamless integration of GA4 with BigQuery, as well as other cloud-based data warehousing and Data Analysis solutions. With a deep understanding of data security, regulatory compliance, and the handling of sensitive data, our experienced consultants are well-equipped to meet your needs. Contact us for expert guidance and seamless integration.

Conclusions

Google Analytics 4 has an in-built integration with BigQuery, where you can store the raw website data and use it for further research and ad-hoc analyses. Also, this integration allows you to transform GA4 into real-time analytics. 

Of course, it’s not free if you want to stick to real-time, but even if you want to play with that and see how it works, you can use Google Cloud sandbox and see the raw GA4 data for the last 60 days for free.
This new feature is a game changer for many companies, and I encourage you to activate it once you set up GA4.


Frequently Asked Questions

What are the benefits of linking GA4 with BigQuery?

The main benefits of using BigQuery with GA4 are the following:
1) Access to raw website data to conduct ad-hoc analysis
2) Ability to get real-time analytics by using streaming integration
3) Opportunity to build machine-learning models using data in BigQuery to advance your product

How much does GA4&BigQuery cost?

The integration is free, and you can use BigQuery sandbox for free. However, you should add billing details if you want more than the last 60 days’ data. Then you pay for storage and analysis. The first refers to the data size, and the second refers to the cost to process queries, including SQL queries, user-defined functions, scripts, and certain data manipulation language (DML) and data definition language (DDL) statements that scan table

When should I select a streaming frequency for GA4&BigQuery integration?

If you want to have real-time analytics, you should select a streaming frequency. Keep in mind that there are associated costs with that.

How to integrate GA4 with BigQuery?

You should follow the steps below to integrate GA4 and BigQuery:
1) Create a Google Cloud Platform account
2) Create a new project on Google Cloud (name it your organisation or website)
3) Enable billing in your project (optional, according to what you selected in the previous chapter of this article)
4) Open GA4 Admin page and click on “BigQuery links” 
5) Click on “Link” button, choose your Google Cloud project (that was created in the second step) 
6) Select Data Location
7) Configure data streams and events you want to store in BigQuery
8) Select the data frequency (Daily or Streaming). 
9) Review the setup and click “Submit” button.

Leave a comment

Your email address will not be published. Required fields are marked *