Remove (other) From Content Reports, Even On Huge Websites

September 24, 2014 | Alex Moore
Update October 2015: The workaround outlined in this blog post may no longer circumvent the high-cardinality restriction. Additionally, custom dimensions are now subject to restrictive character-length limits that were not enforced in the past.

 


It happens all the time. One day you notice a big, ugly surprise at the top of your Site Content > All Pages report: “This report includes a high-cardinality dimension, and some data has been grouped into (other).”

high-cardinality warning


The dreaded (other), also known as the high-cardinality limit.

(other) appears in your content reports when more than 50,000 unique pages (75,000 for Premium) are viewed on any given day. The 50,000th unique page that day will appear as “(other)”, and any further unique pages will be consolidated there.
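To get a rough sense of how close a day’s traffic is to the limit, you can count the distinct page paths in that day’s pageviews. The sketch below assumes a plain array of page path strings (a hypothetical export, not a Google Analytics API response) and uses the 50,000 figure quoted above; the function names are made up for illustration.

```javascript
// Daily unique-page limit quoted in this post for standard accounts.
var STANDARD_DAILY_LIMIT = 50000;

// Count distinct page paths in one day's pageviews.
// `pagePaths` is a hypothetical array of path strings, one per pageview.
function countUniquePaths(pagePaths) {
  return new Set(pagePaths).size;
}

// True when the day has more unique paths than the limit,
// which is when (other) would be expected to show up.
function wouldShowOther(pagePaths, limit) {
  return countUniquePaths(pagePaths) > (limit || STANDARD_DAILY_LIMIT);
}
```

Note that /about/ and /about/?sid=1 count as two different pages here, which is exactly why the fixes below matter.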

50000 pages

Stop. It may look like you have 50,000 pages in your reports. But ask yourself: do I really have 50,000 (or 75,000) totally unique pages? That is, do I have 50,000 pages with content exclusive and separate from any other page?

(If so, prepare to have your mind blown a bit further down the page.)

No way… We don’t really have 50,000 pages.

For small and medium-size websites, there are three quick things you can do to try to remove (other) from your content reports.

1. Exclude Query Parameters. Take a look at your top page paths and hunt for repeats. Perhaps you have URLs with page paths that pass along session information, or language information, or previous page information. The end result might look a bit like this:

Top page paths

The quick fix: Determine which URL parameters are meaningless from an analytics perspective. Will you ever pull a report comparing /about/ by session id, side-by-side? The answer is: most likely not. (There are better reports for doing such a thing.)

Exclude query parameters

To combine the duplicates, you’ll simply go into your Admin, under View Settings, and add a comma-separated list of the query parameters you want to exclude. Your page paths will begin to consolidate (moving forward), and your daily unique total might just slip under 50k.
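To make the effect concrete, here is a minimal sketch of what excluding query parameters does to a page path. It only approximates the setting’s behavior (names are matched exactly, for instance), and the function name and parameter names are assumptions for illustration.

```javascript
// Remove the named query parameters from a page path.
// `excluded` mirrors the comma-separated list entered in View Settings.
function excludeQueryParameters(pagePath, excluded) {
  var queryStart = pagePath.indexOf('?');
  if (queryStart === -1) {
    return pagePath; // no query string, nothing to strip
  }
  var path = pagePath.slice(0, queryStart);
  var pairs = pagePath.slice(queryStart + 1).split('&');
  var kept = [];
  for (var i = 0; i < pairs.length; i++) {
    var name = pairs[i].split('=')[0];
    // Keep non-empty pairs whose name is not on the exclusion list.
    if (pairs[i] !== '' && excluded.indexOf(name) === -1) {
      kept.push(pairs[i]);
    }
  }
  return kept.length ? path + '?' + kept.join('&') : path;
}
```

With a hypothetical exclusion list of sessionid and lang, both /about/?sessionid=123 and /about/?lang=en&sessionid=456 collapse into /about/, while a parameter you still care about (say, page) survives.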

Strip query parameters site search

2. Strip site search. Do you have a search engine within your website? In addition to tracking usage within GA, you should make sure that you are excluding those search query parameters from your page path reports.

In your View Settings, look under Site Search Settings. That one little checkbox makes all the difference. By marking “Strip query parameters out of URL” you’ll make sure that you don’t have a bunch of page paths that look like this:

non stripped site search parameters

…instead, they’ll look like this:

consolidated site search

3. Consolidate homepages and trailing slashes. Perhaps you have a web server that serves up homepages that look like / sometimes and /index.php other times. Or maybe you have page paths that sometimes look like /about and other times look like /about/. Jonathan Weber’s blog post can set you on the right (page) path.
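The kind of consolidation described in step 3 can be illustrated with a small function. This is a sketch of the target mapping only, not the method from the linked post; the function name and the default list of index files are assumptions you would adjust for your own server.

```javascript
// Collapse common duplicate forms of the same page path.
// `indexFiles` is an assumed list of default documents.
function normalizePagePath(pagePath, indexFiles) {
  indexFiles = indexFiles || ['index.php', 'index.html'];
  var queryStart = pagePath.indexOf('?');
  var path = queryStart === -1 ? pagePath : pagePath.slice(0, queryStart);
  var query = queryStart === -1 ? '' : pagePath.slice(queryStart);

  // "/index.php" becomes "/", "/about/index.php" becomes "/about/".
  var lastSlash = path.lastIndexOf('/');
  var lastSegment = path.slice(lastSlash + 1);
  if (indexFiles.indexOf(lastSegment) !== -1) {
    path = path.slice(0, lastSlash + 1);
    lastSegment = '';
  }

  // "/about" becomes "/about/", but file-like paths such as "/logo.png" are left alone.
  if (lastSegment !== '' && lastSegment.indexOf('.') === -1) {
    path += '/';
  }
  return path + query;
}
```

Whether you standardize on the trailing slash or on no trailing slash matters less than picking one form and applying it everywhere.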

Give your Analytics another day to collect, and observe the results. In the end, your page paths should look a little more like this:

Fixed URLs in Content Reports

Ummm, we have a huge website, with way more than 50,000 unique pages. What about us?

If all of the above STILL cannot bring your daily unique page paths to under 50,000 (or 75,000), I have some good news for you. Using Universal Analytics, you can, in fact, see every single unique page that has been viewed on your website.

That’s right. You can set a custom dimension, pass the value of the page path to it, and completely avoid the high-cardinality limit of 50,000 unique rows. You can have hundreds of thousands of unique page paths per day, and you can see them all without sacrificing any long-tail query parameters.

Mind. Blown.

Here’s how it all works.

Full page path

Step 1: Create a custom dimension called “Full Page Path”. Look in the Admin, in the Property Settings, under Custom Definitions, and click “Custom Dimensions”.

(This will only work if you have Universal Analytics.) You need to create a new dimension, called “Full Page Path”. You’ll set this at the “Hit” scope, because you want it to be sent with every pageview. Click the big blue Create button, and take note of the “Index” (a number from 1 to 20 on standard properties, or up to 200 on Premium). We are going to use this in a few minutes.

If you’re not using Google Tag Manager, skip to Step 2b.

Step 2a: Tag Manager method. You are now going to actually set this custom dimension’s value on each pageview of the website. To do so, inside Google Tag Manager, you will modify (or create) your Google Analytics Universal Analytics tag. Take a look at the screenshot below. The setting we’ll need is under “More Settings” and (you guessed it) “Custom Dimensions”. IMPORTANT: Make sure that your Index is set to the index from Step 1!

gtm cd url and query

Notice that we are using a macro here called {{url path and query}}. (Newer versions of Google Tag Manager call macros “variables”.) This little guy needs to be created separately, but doing so is pretty straightforward.

new macro

url path and query macro

Custom JavaScript:

function() {
    // Page path plus query string, e.g. "/about/?lang=en".
    var url = document.location.pathname + document.location.search;
    // Return the value with exactly one leading slash.
    if (url.indexOf('/') === 0) {
        return url;
    }
    return '/' + url;
}

Step 2b: JavaScript method (no Google Tag Manager): Even without Google Tag Manager, this isn’t too difficult. Find the line in your code on every page that looks like this:

ga('send', 'pageview');

You are going to want to make one tiny tweak to this line.

ga('send', 'pageview', {
  'dimension1':  '/'+(location.pathname+location.search).substr(1)
});
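If a page sends more than one hit, an alternative is to set the dimension on the tracker before sending the pageview, so later hits from the same page carry it too. The sketch below assumes the standard analytics.js snippet has already defined ga; the helper name fullPagePath is made up for illustration, and dimension1 is a placeholder index. It would replace the existing ga('send', 'pageview') line rather than sit next to it.

```javascript
// Build the value for the "Full Page Path" dimension from a location-like object.
function fullPagePath(loc) {
  var value = loc.pathname + loc.search;
  // Make sure the value starts with a slash.
  return value.charAt(0) === '/' ? value : '/' + value;
}

// In the browser: set the dimension on the tracker, then send the pageview.
// dimension1 is a placeholder; use the index from Step 1.
if (typeof ga === 'function' && typeof location !== 'undefined') {
  ga('set', 'dimension1', fullPagePath(location));
  ga('send', 'pageview');
}
```

Keeping the value-building logic in its own function also makes it easy to check against a few sample URLs before deploying.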

Important: You need to replace dimension1 with your particular index from Step 1!

This needs to be on every single page across the entire website.

Step 3: Create a custom report with your new custom dimension!

custom report full page path

Wait a day. And then…

page path custom report long tail

full page path rows

Yup. They’re all there. All 304,018 (in this example). This is a stunning workaround.

The downside: This data will likely be sampled (unless you’re using Premium). But even at a moderate sample rate, you’ll still get more content data out of this method than you would have with a majority of your page paths hidden behind “(other)”!

You can also create custom dimensions for page paths without query parameters and entire URLs! Share your own results in the comments!
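As a starting point for those extra dimensions, here is a minimal sketch that builds all three values from a location-like object. The function name, the property names, and the dimension indexes (1 through 3) are assumptions; each dimension would first need to be created in the Admin as in Step 1.

```javascript
// Build values for three hit-scoped custom dimensions from a location-like object.
function pageDimensionValues(loc) {
  return {
    fullPagePath: loc.pathname + loc.search,   // path with query parameters
    pagePathOnly: loc.pathname,                // path without query parameters
    entireUrl: loc.protocol + '//' + loc.host + loc.pathname + loc.search
  };
}

// In the browser: send all three with the pageview.
// dimension1..dimension3 are placeholders for your own indexes.
if (typeof ga === 'function' && typeof location !== 'undefined') {
  var values = pageDimensionValues(location);
  ga('send', 'pageview', {
    'dimension1': values.fullPagePath,
    'dimension2': values.pagePathOnly,
    'dimension3': values.entireUrl
  });
}
```

Keep the character-length limits mentioned in the update at the top of this post in mind, since entire URLs are the longest of the three values.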