Resources

Introducing the new REGEX function set in Excel

Regular expressions are now possible in Excel thanks to a new set of functions!

Excel has long possessed formula-based ways to extract and manipulate text, such as the LEN, FIND, REPLACE, SUBSTITUTE, LEFT, RIGHT, and MID functions. In more recent times, the likes of TEXTBEFORE, TEXTAFTER, and TEXTSPLIT have arrived.

However, for complex cases, it can be a nightmare having to combine several of these to achieve a certain task.

Thankfully, the REGEX functions provide you with a flexible and versatile alternative to construct special sequences to do this.

The set includes REGEXTEST, REGEXEXTRACT, and REGEXREPLACE.

REGEX is indeed short for 'regular expression'. And depending on your location, you may pronounce it as 'REJ-EX' (like I do) or 'REG-GEX', but it really doesn't really matter!

There are so many possibilities that it's impossible to cover everything, so I'm only going to touch the surface in this post, but it'll equip you with the basics so you can immediately start taking advantage of these functions.

Note: REGEXTEST, REGEXEXTRACT, and REGEXREPLACE are only available to Microsoft 365 Insiders on the Beta Channel as of July 2024.

REGEXTEST

REGEXTEST checks if the supplied text matches a regex pattern and returns TRUE or FALSE.

REGEXTEST(text,pattern,[case_sensitivity])

  • text: The cell or range containing the text string(s) to check.
  • pattern: The regular expression to match.
  • [case_sensitivity]: Determines if the match is case sensitive. (OPTIONAL)
    • 0 - Case-sensitive match (default)
    • 1 - Case-insensitive match

Example

A list of codes are displayed in A3:A12. There are ones composed of numbers, letters, and a mixture of both.

To return TRUE for the codes that contain a number and FALSE for the ones that don't, use the following formula:

=REGEXTEST(A3:A12,"[0-9]")

The range is housed in the text argument, and the regular expression is defined in pattern and must be enclosed between quotation marks ("").

The square brackets ([]) define a character class and a range specifier (-) is sandwiched in between 0 and 9 to save having to put [0123456789]. 

Using a similar concept in C3:C12, the following returns TRUE for those that have an alphabetic character:

=REGEXTEST(A3:A12,"[a-zA-Z]")

Although there are only uppercase letters in this example, using `[a-zA-Z]` ensures lowercase ones are covered as well for any future values.

REGEXEXTRACT

REGEXEXTRACT lets you extract one or more parts of supplied text that match a regex pattern. 

REGEXEXTRACT(text,pattern,[return_mode],[case_sensitivity]) 

  • text: The cell or range containing the text string(s).
  • pattern: The regular expression to match.
  • [return_mode]: What should be extracted? (OPTIONAL)
    • 0 - First match(default)
    • 1 - All matches
    • 2 - Capture groups of first match
  • [case_sensitivity]: Determines if the match is case sensitive. (OPTIONAL)
    • 0 - Case-sensitive match (default)
    • 1 - Case-insensitive match

Example

In A3:A12, a collection of text strings are present. Each contains an email address in a different position.

To extract the email addresses and display in their own cells, use the following formula:

=REGEXEXTRACT(A3:A12,"[\w\.-]+@[\w\.-]+\.\w+")

Yes, that pattern looks like a random assortment of characters, but it isn't. Let's break it down:

  • `[\w\.-]`: Matches any alphanumeric character, underscore (_), dot (.), or hyphen (-). The `\w` is shorthand for the first two, whereas the backslash in `\.` removes the dot's special meaning to ensure it's a literal character.
  • `+`: This is a quantifier that means 'one or more' occurrences of any of the aforementioned characters.
  • `@`: Matches the literal `@` character in the email address.
  • `[\w\.-]`: Already explained.
  • `+`: Already explained.
  • `\.`: Already explained.
  • `\w`: Matches any alphanumeric character or underscore (_).
  • `+`: Already explained.

REGEXREPLACE 

REGEXREPLACE searches for a regex pattern within the text you supply and replaces it with different text.

REGEXREPLACE(text,pattern,replacement,[occurrence],[case_sensitivity])

  • text: The cell or range containing the text string(s).
  • pattern: The regular expression to match.
  • replacement: Specifies what text should replace the match.
  • [occurrence]: Determines which pattern instance should be replaced. (OPTIONAL)
    • 0 - All occurrences (default)
    • n - nth occurrence from the start
    • -n - nth occurrence from the end
  • [case_sensitivity]: Determines if the match is case sensitive. (OPTIONAL)
    • 0 - Case-sensitive match (default)
    • 1 - Case-insensitive match

Example

A3:A12 contains a set of fixed length account numbers.

To mask the first four characters with asterisks, use the following formula:

=REGEXREPLACE(A3:A12,"^(\d{4})","****")

Let's break down the pattern:

  • `^`: The caret is an anchor that matches only the start of a string.
  • `()`: Groups together part of the regular expression.
  • `\d`: Matches any digit between 0 and 9. It is shorthand for [0-9].
  • `{4}`: The "****" in replacement specifies the characters to override the first four digits of each number.

Confused by those patterns?

You might have looked at the patterns for the REGEXTRACT and REGEXREPLACE examples and freaked out. Don't worry, though, as I have a secret to reveal…

No, I didn't happen to put them together by memorising the entire regular expression language. Heard of ChatGPT? I took advantage of its vast knowledge to generate them, and you should too.

For the REGEXEXTRACT example, I asked:

what is the regex for extracting the email address from "support@example.com is the best way to  get in touch with our help desk."?

It not only gave me the correct regular expression I could easily copy and paste into Excel, but it gave a breakdown as well to aid my understanding. How good is this?

Likewise, for the REGEXREPLACE example, I asked:

using google sheets' regexreplace function, what is the regex for masking the first 4 digits from the account number 93721027?

Why did I mention Google Sheets?

The great thing about regular expressions is they are cross-compatible with other platforms like Sheets and many programming languages. That's why you haven't got to wait for ChatGPT to catch up with these new Excel functions — take advantage now!

Conclusion

It's clear that Excel's new game-changing REGEX functions are going to drastically streamline certain text extraction and manipulation tasks. You've only seen a handful of examples in this post, but the possibilities are endless.

I sense there will be less of a need for the traditional text functions as time moves forward. However, let's not get ahead of ourselves — these are still in beta after all. And even when they do reach the production version, compatibility constraints mean many users will be reluctant to use them for shared workbooks. But that is nothing new for any arrival in Excel for Microsoft 365.

The key takeaway is you don't need to spend hours and hours revising regular expressions to gain encyclopedic knowledge. As I've already alluded to, ChatGPT can do much of the heavy lifting for you. It is worth trying to understand what it generates, though, so you can prune a pattern if it is unnecessarily verbose.

On that note, if you really want to gain an appreciation of how extensive the regular expression library is, ask the following:

please give me a regex cheat sheet

What will you be using regex for?

Watch the video

For more Excel tips and tricks like this, check out our Video Tutorials page.

Andrew Moss
About the Author

Andrew Moss

Andrew is a technical writer who specialises in Microsoft Excel and data analysis. He has published hundreds of articles and social media posts aimed at helping people enhance their skills and spreading good practices.

Conversation

0 Comments

As you comment, let's keep the discussion friendly and respectful to add value to the conversation.

Thank you! Your comment has been sent!
Oops! Something went wrong while submitting the form.
Commenter profile pic.
Author Name
Comment Time

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius enim in eros elementum tristique. Duis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere. uis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere.

Reply
Delete
Commenter profile pic.
Author Name
Comment Time

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius enim in eros elementum tristique. Duis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere. uis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere.

Reply
Delete

Get Tickets

Benefit from an award-winning 3-day conference plus access to exclusive networking and additional Masterclass events.

Get Tickets
Close Cookie Popup
Cookie Preferences
By clicking “Accept All”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts as outlined in our privacy policy.
Strictly Necessary (Always Active)
Cookies required to enable basic website functionality.
Cookies helping us understand how this website performs, how visitors interact with the site, and whether there may be technical issues.
Cookies used to deliver advertising that is more relevant to you and your interests.
Cookies allowing the website to remember choices you make (such as your user name, language, or the region you are in).