Regular Expressions Part II: Dots
This is Part II of my journey into Regular Expressions for Google Analytics, in which I am learning them and teaching them at the same time. (They are abbreviated as RegEx or, in the plural, maybe RegExen.) I have rewritten this old post to cover only the dot, like the one at the end of this sentence. This makes the post easier to follow and creates building blocks for future posts.
Google Analytics says this about dots:
. matches any one character
This is exactly what they mean, but it is so out of context, I couldn’t wrap my head around it. (Match any one character that comes from where? I asked myself…)
They mean that you can create a RegEx like this:
.ate
and it will match hate, fate, sate, or any other four-character string that ends in ate. For that matter, it will match 8ate (no rule says the character has to be a letter). It won’t match ate by itself, because the dot needs one character to stand in for.
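If you want to try this outside of Google Analytics, here is a minimal sketch in Python. It assumes that Python's re module treats the dot the same way for these simple cases, and it uses re.search, which looks for the pattern anywhere in the text. The helper name matches and the sample words are mine, made up for illustration.

```python
import re

# The dot stands for exactly one character, so ".ate" needs
# one character (any character) followed by "ate".
pattern = re.compile(r".ate")

def matches(text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

# Sample strings chosen for illustration only.
for candidate in ["hate", "fate", "sate", "8ate", "ate"]:
    print(candidate, matches(candidate))
```

Every word in the list comes back True except ate, which has nothing in front of it for the dot to stand in for.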
This is why we don’t (usually) ask Google Analytics to match a regular expression that looks like this:
homepage.com
because the dot is a wild card that stands for any one character (right?), so this will also match homepagescom and homepage4com and homepagedcom. Instead, we need to use a backslash to turn the Regular Expression dot into a Plain Old dot. (This is a good time to read Regular Expressions Part I, backslashes, if you haven’t already.) Anyway, we would express it like this, with a backslash right before the dot: homepage\.com
And that’s why you see backslashes and dots together so often.
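Here is a small sketch of the difference, again in Python and again assuming its re module behaves like Google Analytics for this simple case. The names unescaped, escaped, and hits are hypothetical, chosen just for this example.

```python
import re

# Unescaped: the dot is a wild card for any one character.
unescaped = re.compile(r"homepage.com")

# Escaped: the backslash turns the dot into a plain old dot.
escaped = re.compile(r"homepage\.com")

def hits(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

# Sample strings chosen for illustration only.
for text in ["homepage.com", "homepagescom", "homepage4com"]:
    print(text, hits(unescaped, text), hits(escaped, text))
```

The unescaped version finds a match in all three strings, while the escaped version only finds one in homepage.com.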
Backslashes \
Dots .
Carets ^
Dollar signs $
Question marks ?
Pipes |
Parentheses ()
Square brackets [] and dashes –
Plus signs +
Stars *
Regular Expressions for Google Analytics: Now let’s Practice
Bad Greed
RegEx and Good Greed
{Braces}
Minimal Matching
Lookahead
Robbin