IP Addresses and Google Analytics 4: What You Should Know

September 14, 2023 | Samantha Barnes

As we get closer to decisive changes around user tracking, privacy regulation has become the top concern in web analytics. Headlines about multi-million-dollar lawsuits related to data are multiplying, especially against large tech corporations like Google and Facebook. Policymakers and watchdog organizations are more focused on protecting user rights than ever before. However, it's a tricky subject, since the definition of what makes a feature compliant can be complex and is sometimes a matter of interpretation. In GA4, one specific piece of information is getting more attention: IP addresses.

Until recently, the only time users had to think about IP addresses in Google Analytics was when filtering out internal traffic. Now, IP addresses have become a primary focus for legal and information security teams and for professionals in specific verticals, like healthcare. Even though GA4 doesn't store or log IP addresses, there's concern about this gray area of compliance.

It might be tempting to shut down anything and everything that's related to IP addresses, but before making important decisions about data collection, decision-makers should have a general understanding of how GA4 handles them. The good news is that GA4 anonymizes IP addresses by default, but if you need to take extreme steps, you can change the way GA4 data is sent and remove the identifier completely.

However, keep in mind that Google Analytics has far less information about a visitor's IP address than you do. Every website collects IP addresses, so they're on the server that's powering your site right now. Ironically, one of their main uses is cybersecurity.

IP Addresses: A Summary

Think of IP addresses like phone numbers: every device connected to the internet needs one in order to connect. (To be clear, some IP addresses, those in the newer IPv6 format, also contain letters.) However, there are far more possible IP addresses than phone numbers. With so many IP addresses, it might seem like an IP address would be a more granular and accurate identifier than a phone number, but the opposite is true.
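To make the two address formats concrete, here is a minimal Python sketch using the standard library's ipaddress module. The helper name describe and the two constants are made up for this illustration, and the IPv6 address shown comes from a range reserved for documentation.

```python
import ipaddress


def describe(address: str) -> str:
    """Return a short description of an IP address string."""
    ip = ipaddress.ip_address(address)  # raises ValueError on invalid input
    return f"{ip} is IPv{ip.version}"


# Total number of possible addresses in each format.
IPV4_SPACE = 2 ** 32    # roughly 4.3 billion
IPV6_SPACE = 2 ** 128   # roughly 3.4 x 10^38

print(describe("73.67.46.24"))   # digits only
print(describe("2001:db8::1"))   # hexadecimal, so it contains letters
```

The size of the address space says nothing about how reliably an address identifies a person or a device, which is the point of the next paragraph.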

Internet service providers supply customers with an IP address that is subject to change. It is tied to an internet connection, not to a device. Sometimes it refreshes when you restart your computer or turn the router off and on. Users will have a different IP address when connected to different WiFi networks or cell phone towers (if they access the internet through a mobile service). IP addresses aren't stored or logged anywhere in GA4 (it's impossible to get IP address information from GA4's collected data), but they are used temporarily for geolocation.

IP Geolocation and Google Analytics

IP addresses contain location information. It's how networks answer the question "where is this device located?" before connecting. Accuracy depends on the level of location identification. A device's country is correct 95–99 percent of the time; region (including U.S. states and Canadian provinces) 55–80 percent of the time; and city 50–75 percent of the time.

It's helpful to think about it as the difference between precise and accurate. IP addresses may be precise and show that several devices are near each other, but not accurate, because they associate those devices with the wrong location. In GA4, location data is even less accurate by design.

What makes an IP address accurate or inaccurate? Here’s an example: 

IP addresses have sections. Below is an IP address. (You can see your own by typing "What is my IP address" into the Google search bar.)

73.67.46.24

Each section makes the address more specific. It's like phone numbers, which have a country code, an area code, a prefix, and a line number. By default, GA4 hides or “masks” the last portion. The address in the example would change to the number below:

73.67.46.0

By masking the number, the address loses accuracy. Anonymizing the IP address was a setting in the older version of Google Analytics (Universal Analytics), but it's now the default in GA4.
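The masking step can be sketched in a few lines of Python. This is only an illustration of zeroing out the last section of an address, not Google's actual implementation. The function name mask_ip and the IPv6 prefix length are assumptions made for the example.

```python
import ipaddress


def mask_ip(address: str, ipv4_prefix: int = 24, ipv6_prefix: int = 48) -> str:
    """Zero out the trailing bits of an IP address, keeping only a prefix.

    A 24-bit prefix on an IPv4 address keeps the first three sections and
    sets the last one to 0. The IPv6 prefix length is an assumption made
    for this sketch.
    """
    ip = ipaddress.ip_address(address)
    prefix = ipv4_prefix if ip.version == 4 else ipv6_prefix
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


print(mask_ip("73.67.46.24"))  # 73.67.46.0
```

Every address from 73.67.46.0 through 73.67.46.255 maps to the same masked value, which is why the masked address is a less accurate identifier.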

Controlling Location Granularity in GA4

If that's still too specific, you can take it a step further and zoom out even more. GA4 has settings that give more control over geolocation data collection.

The option to edit granularity is in the Admin section under Data Settings > Data Collection.

 

When the option for granular location information is disabled, city data is no longer available, and the highest level of accuracy is region (i.e., state or territory).
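As a rough mental model of what that setting does to a location record, consider the Python sketch below. The Location class and the apply_granularity function are hypothetical names made up for illustration; they are not part of GA4 or any Google API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Location:
    """Hypothetical location record used only for this illustration."""
    country: str
    region: Optional[str] = None
    city: Optional[str] = None


def apply_granularity(loc: Location, granular_enabled: bool) -> Location:
    """Drop the city when granular location is disabled; keep country and region."""
    if granular_enabled:
        return loc
    return replace(loc, city=None)


visitor = Location(country="United States", region="Oregon", city="Portland")
print(apply_granularity(visitor, granular_enabled=False))
```

With the setting disabled, the record still carries the country and region, so region-level reporting keeps working.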

You can set different granularity for different countries and states by selecting the gear icon for more settings.



Completely Excluding IP Addresses in GA4

One gray area is the fact that IP addresses are matched in temporary memory prior to being thrown out. This is before the data hits any processing servers. (As a note, deriving geolocation for the EU is done with servers located only within the EU.)

The IP address isn't collected and it's never logged, but the fact that it's being transmitted still might make some legal teams uneasy. There's a way around that as well, but it's the most drastic step. To remove device IP information from Google Analytics hits completely, you can route them through your own server in Google Cloud with Server-side Google Tag Manager (ssGTM).

Server-side Google Tag Manager allows you to catch the data before sending it anywhere, including Google's servers. Google Analytics will load on the page and send data to your server in Google Cloud instead. Since the data isn't coming directly from the user's device, you can redact the IP address. The setting is within the default GA4 tag template in server-side containers. With this option set to "true," users' IP addresses will be removed from all events associated with the tag.
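Conceptually, redaction on your own server amounts to dropping any IP-carrying field from an event before relaying it. The Python sketch below illustrates that idea only. It is not how the GA4 tag template is implemented, and the field names in IP_FIELDS and the sample event are assumptions chosen for the example.

```python
from typing import Any, Dict

# Field names assumed for this sketch; a real setup may use different keys.
IP_FIELDS = {"ip_override", "client_ip", "x-forwarded-for"}


def redact_ip(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event without any IP-carrying fields."""
    return {key: value for key, value in event.items()
            if key.lower() not in IP_FIELDS}


incoming = {
    "event_name": "page_view",
    "page_location": "https://example.com/",
    "ip_override": "73.67.46.24",
}
outgoing = redact_ip(incoming)
print(outgoing)  # the event that would be forwarded, with no IP field
```

The function returns a new dictionary instead of modifying the incoming one, so the original event stays available to other processing on your server.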

The IP address is not part of the network call. Below is a before and after of the IP anonymization from server-side GTM:

Before: [image]

After: [image]

Quick Tips About IP Addresses 

  • IP addresses are not tied to specific devices, but rather to an internet connection 
  • GA4 doesn't store or log IP addresses
  • By default, IP addresses are anonymized in GA4
  • There's an option to mask city-level data to make users' locations even less identifiable
  • Server-side tagging can be used so that the IP address never makes it to the temporary lookup phase on Google's servers

If there are worries about device identification or compliance, there are several paths you can take to fit your policy around IP addresses. Google is focused on user protection. Even if you haven't touched any settings in GA4, you have some measures in place by default to keep your users' information safe.

This isn't an answer to the loaded question of whether Google Analytics is fully compliant with every privacy regulation. However, details about identifiers and about which data is sent and stored can help make informed decisions. 

To learn more about privacy features in GA4, check out this in-depth blog post by Sean Power.