A guide to regular expressions (with use cases)

Maybe you’ve heard of Regex but aren’t quite sure how it can be used in SEO or whether it fits into your own strategy.

Regular expressions, or “regex,” are like an inline programming language for text search: they let you write complex search strings, partial matches and wildcards, case-insensitive searches, and other advanced statements.

You can think of them as looking for a pattern rather than a specific string of text.

Therefore, they can help you find entire sets of search results that, at first glance, seem to have little in common.

Regular expressions are a language all their own, and when you first see them, they can look quite foreign.

However, they are fairly easy to learn and can be used in JavaScript, Python, and other programming languages, making them a versatile and powerful SEO tool.

In this guide, you’ll learn about common regex operators, how to use advanced regex filters for SEO, how to use regex in Google Analytics and Google Search Console, and more.

You can also find examples of regex used in different ways in SEO.

What does regex look like?

A regular expression typically contains a combination of text that will match exactly in the search results, along with several operators that act more like wildcards to match a pattern rather than an exact text match.

This can include a single character wildcard, a match for one or more characters, or a match for zero or more characters, as well as optional characters, nested subexpressions in parentheses, and “or” functions.

By combining these different operations, you can create a complex expression that can produce very far-reaching yet very specific results.

Common regex operators

Some examples of common regex operators are:

. A wildcard match for any single character.

.* A match for zero or more characters.

.+ A match for one or more characters.

\d A match for any single numeric digit 0-9.

? Inserted after a character to make it an optional part of the expression.

| A vertical bar or “pipe” sign indicates an “or” function.

^ Used to mark the beginning of a character string.

$ Used to mark the end of a string.

( ) Used to nest a subexpression.

\ Inserted before an operator or special character to “escape” it.
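To see the operators above in action, here is a minimal Python sketch. The patterns and sample strings are invented for illustration, and `re.fullmatch` is used so that each pattern has to describe the whole sample string.

```python
import re

# Each entry: (pattern, sample text, whether the whole text should match).
# All sample strings are invented for illustration.
examples = [
    (r"s.o", "seo", True),                    # . matches any single character
    (r"seo.*", "seo", True),                  # .* matches zero or more characters
    (r"seo.+", "seo", False),                 # .+ needs at least one more character
    (r"page\d", "page7", True),               # \d matches a single digit
    (r"colou?r", "color", True),              # ? makes the preceding "u" optional
    (r"cat|dog", "dog", True),                # | means "or"
    (r"(blog|news)/\d+", "news/42", True),    # ( ) groups a subexpression
    (r"example\.com", "exampleXcom", False),  # \. is a literal dot, not a wildcard
]

results = [bool(re.fullmatch(pattern, text)) for pattern, text, _ in examples]
for (pattern, text, _), got in zip(examples, results):
    print(f"{pattern!r:22} vs {text!r:14} -> {got}")

# ^ and $ anchor a pattern to the start and end of the string.
starts_with_blog = bool(re.search(r"^/blog", "/blog/post"))
ends_with_html = bool(re.search(r"\.html$", "/guide/page.html"))
```

Other regex engines may differ in small details, so treat this as a way to build intuition rather than a reference for any one tool.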

Some programming languages like JavaScript allow “flags” to be inserted after the regex pattern itself, and these can further affect the result:

g Returns all matches instead of just the first one.

i Returns case-insensitive results.

m Enables multiline mode.

s Activates ‘dotall’ mode.

u Enables full Unicode support.

y Matches only at the specific text position given by lastIndex (‘sticky’ mode).

As you can see, these operators and flags build together into a complex logical language that gives you the power to get very specific results across large, unordered data sets.
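The flags listed above are JavaScript's. As a rough comparison, the sketch below shows how similar behavior can be requested in Python, where flags are passed as arguments rather than appended to the pattern. The sample text is invented, and the mapping is approximate: Python has no direct counterpart to the sticky flag, and its string patterns are Unicode-aware by default.

```python
import re

text = "SEO tips\nseo tools\nSeo audit"

# Rough counterpart of "g": findall returns every match, search only the first.
first = re.search(r"seo", text).group()
all_exact = re.findall(r"seo", text)

# Counterpart of "i": ignore case.
all_any_case = re.findall(r"seo", text, flags=re.IGNORECASE)

# Counterpart of "m": ^ and $ also match at each line break.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# Counterpart of "s": the dot also matches newline characters.
without_dotall = re.search(r"tips.*tools", text)
with_dotall = re.search(r"tips.*tools", text, flags=re.DOTALL)

print(all_any_case, line_starts, without_dotall is None, with_dotall is not None)
```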

How do you use regex for SEO?

Regex can be used to examine the queries that different user segments use, which queries are common for certain content areas, which queries drive traffic to specific parts of your site, and more.

In this article, Hamlet Batista demonstrated how to use regex in Python, for example to parse server log files.
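As a rough idea of what log parsing with regex can look like, here is a minimal Python sketch. It is not taken from the article referenced above: the sample line is invented, the pattern assumes the widely used "combined" access-log layout, and the names `LOG_LINE` and `parse_line` are hypothetical. Real log formats vary, so the pattern would need adjusting to your server's configuration.

```python
import re

# Invented sample line in the common "combined" access-log layout.
SAMPLE = (
    '66.249.66.1 - - [12/Mar/2021:06:25:24 +0000] '
    '"GET /blog/regex-guide HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

# Named groups make each captured field easy to read back out.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return the named fields as a dict, or None if the line does not fit."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

hit = parse_line(SAMPLE)
print(hit["path"], hit["status"], "Googlebot" in hit["agent"])
```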

And in this video, Chris Long showed you how to use Regex to extract the position, element, and name of the breadcrumbs associated with each URL on your website as part of a scalable keyword research and segmentation process.

Google encourages SEO professionals to share examples of using regex on Twitter with the hashtag #performanceregex.

Here are a few tips from SEO Twitter (you’ll find it’s a pretty quiet hashtag – add your own examples if you have any!):

Using Regex in Google Analytics

One of the most common uses of regex for SEO is in Google Analytics, where regular expressions can be used to set up filters so you only see the data you want to see.

In this sense, the expression is used to exclude results rather than generate a set of inclusive search results.

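For example, if you want to exclude data from IP addresses on your local network, you can filter out the entire range from 192.168.0.0 to 192.168.255.255. Note that 192.168.*.* is only shorthand for that range, not a valid regex; as an expression it would be written along the lines of ^192\.168\.\d{1,3}\.\d{1,3}$.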
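Written as a regular expression, that IP range needs escaped dots and digit classes rather than bare asterisks. The Python sketch below is one possible way to express and check it; the name `is_local` is made up for this example, and the pattern deliberately does not verify that each part is 255 or lower.

```python
import re

# Escaped dots match literal dots; \d{1,3} matches one to three digits.
# This sketch does not check that each number is <= 255.
LOCAL_RANGE = re.compile(r"^192\.168\.\d{1,3}\.\d{1,3}$")

def is_local(ip: str) -> bool:
    """True if the address falls in the 192.168.x.x range."""
    return LOCAL_RANGE.match(ip) is not None

for ip in ["192.168.0.1", "192.168.255.255", "192.169.0.1", "10.192.168.1"]:
    print(ip, is_local(ip))
```

The ^ and $ anchors matter here: without them, an address such as 10.192.168.1 would also be caught, because the pattern could match in the middle of the string.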

Advanced Regex SEO Filters

As a more complex example, let’s imagine you have two brands: regex247 and regex365.

You may want to filter results matching any combination of URLs containing these brand names, e.g. regex247.biz or www.regex365.org.

One way to do this is with a fairly simple ‘or’ expression:

.*regex247.*|.*regex365.*

This would remove all matching URLs from your analytics data, including subfolder paths and specific page URLs appearing on those domain names.
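A quick way to convince yourself of this is to run the expression over a handful of sample URLs. The Python sketch below uses invented URLs and treats the expression as an exclude filter; it also shows a shorter pattern that behaves the same way when the engine searches for partial matches.

```python
import re

# The "or" expression from the text, used here as an exclude filter.
BRAND_FILTER = re.compile(r".*regex247.*|.*regex365.*")

# A shorter pattern with the same effect under partial (search) matching.
BRAND_FILTER_SHORT = re.compile(r"regex(247|365)")

# Invented sample URLs.
urls = [
    "regex247.biz",
    "www.regex365.org",
    "www.regex365.org/blog/post-1",
    "example.com",
]

kept = [u for u in urls if not BRAND_FILTER.search(u)]
kept_short = [u for u in urls if not BRAND_FILTER_SHORT.search(u)]
print(kept, kept_short)
```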

A word of warning

It’s worth noting that – similar to your robots.txt file – a poorly written regex expression can easily filter out most or all of your data by including an unqualified wildcard match.

The good news is that in many SEO cases the filter is not applied to your data until the reporting phase and you can restore full visibility of your data by editing or deleting your regex expression.

You can also test regular expressions with a number of online testing tools to see if they produce the intended result – this allows you to “sandbox” your regex expressions before deploying them across your entire dataset.
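If you prefer to sandbox an expression locally, a small script can do the same job as an online tester. The sketch below is one possible approach in Python: `check_pattern` is a hypothetical helper, and the sample strings are invented. It reports strings the pattern should have matched but missed, and strings it matched but should not have.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return (missed, false_hits): samples where the pattern misbehaves."""
    compiled = re.compile(pattern)
    missed = [s for s in should_match if not compiled.search(s)]
    false_hits = [s for s in should_not_match if compiled.search(s)]
    return missed, false_hits

# An unqualified wildcard matches everything, including rows you meant to keep.
too_broad = check_pattern(r".*", ["regex247.biz"], ["example.com"])

# A more specific pattern passes both checks.
specific = check_pattern(
    r"regex(247|365)",
    ["regex247.biz", "www.regex365.org"],
    ["example.com"],
)
print(too_broad, specific)
```

Keep in mind that the tool you eventually paste the expression into may use a different regex engine than Python, so a final check there is still worthwhile.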

To create regex filters in Google Analytics, first navigate to the type of report you want to create (e.g. Behavior > Site Content > All Pages or Acquisition > All Traffic > Source/Medium).

Underneath the chart, at the top of the data table, find the search box and click advanced next to it to view the advanced filter options.

Here you can include or exclude data based on a specific dimension or metric. After choosing your dimension, select Matching RegExp from the drop-down list and then enter your expression in the text box.

“Or” and “And” in Google Analytics Regex

To create an “or” expression in Google Analytics, simply insert the pipe character (the vertical bar symbol |) between the appropriate segments of your expression.

Google Analytics regular expressions do not support ‘and’ statements within a single regular expression; however, you can simply add another filter to achieve this.

Just click Add a dimension or metric below your first regex and enter your next regex. This allows you to stack any number of expressions that will be processed as a single logical “and” statement when filtering your data.
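The logic of “or” inside one expression and “and” across stacked filters can be sketched in Python as follows. This only illustrates the logic, not how Google Analytics implements it; the dimension names, rows, and the `passes_all` helper are all invented for the example.

```python
import re

def passes_all(row, filters):
    """Every (dimension, regex) filter must match: a logical "and"."""
    return all(re.search(regex, row[dimension]) for dimension, regex in filters)

# Invented report rows with hypothetical dimension names.
rows = [
    {"page": "/blog/regex-guide", "source_medium": "google / organic"},
    {"page": "/news/update", "source_medium": "newsletter / email"},
    {"page": "/pricing", "source_medium": "bing / organic"},
]

filters = [
    ("page", r"^/(blog|news)/"),      # "or" inside a single expression
    ("source_medium", r"organic$"),   # second stacked filter adds the "and"
]

matching_pages = [row["page"] for row in rows if passes_all(row, filters)]
print(matching_pages)
```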

Using Regex in Google Search Console

In 2021, Google Search Console began supporting regex filters using RE2 syntax, allowing webmasters to include and exclude data in the UI.

For all metacharacters supported by Google Search Console, see this RE2 regex syntax reference on GitHub.

At the time of writing, there is a character limit of 4096 characters (which is usually enough…).

Examples you can use in Search Console include filtering for search queries that contain a specific brand and the variations users might enter, e.g. Facebook:

.*facebook.*|face*book.*|fb.*|fbook.*|f*book.*

Filter out users who find your site using terms with “commercial” intent:

.*(best|top|alternative|vs|versus|review*).*
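Before relying on an intent filter like this, it can help to try it on a few sample queries. The Python sketch below uses essentially the expression above on invented queries. Python's `re` module is not the RE2 engine that Search Console uses, but this pattern only relies on features the two share. The stricter variant with `\b` word boundaries is a suggestion, not part of the original example.

```python
import re

# Essentially the expression from the text; short words match inside longer ones.
INTENT = re.compile(r".*(best|top|alternative|vs|versus|review*).*")

# A stricter variant (an assumption of this sketch) using \b word boundaries.
INTENT_STRICT = re.compile(r"\b(best|top|alternatives?|vs|versus|reviews?)\b")

# Invented sample queries.
queries = [
    "best crm software",
    "hubspot vs salesforce",
    "crm reviews",
    "laptop stand",
    "crm login",
]

matched = [q for q in queries if INTENT.search(q)]
matched_strict = [q for q in queries if INTENT_STRICT.search(q)]
print(matched)
print(matched_strict)
```

In this sample, "laptop stand" is caught by the first pattern because "top" appears inside "laptop", while the word-boundary variant leaves it out.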

Related: Google Search Console adds new regex filter options

Why is regex important for SEO?

Why is all this important?

Well, it’s about taking control of your data and filtering out the parts of it that aren’t helping you improve your SEO – whether that’s specific pages or parts of your site, traffic from a specific source or medium, or data from your own network.

You can create very simple regex expressions to achieve a basic “include” or “exclude” filter, or write longer expressions that work similar to programming code to achieve complex and very specific results.

And with the right regex for each campaign, you can verify that your SEO efforts are meeting your goals, ambitions, and results – a powerful way to demonstrate a positive ROI on your future SEO investments.

Featured Image: Optura Design/Shutterstock
