← Back to challenges

RegEx XX: Capturing Groups

PythonHardregexformatting

Instructions

A Capturing Group will match the characters or expressions within the parenthesis (). The matches will also be stored, and can be backreferenced by a backlash followed by its number. For example, \1 will access the first capturing group that appears in the expression.

txt1 = "foo, you are such a foo"
txt2 = "foo, you are such a bar"
txt3 = "bar, you are such a bar"
txt4 = "bar, you are such a foo"
pattern = r"(\w+),.*\1"

bool(re.match(pattern, txt1)) ➞ True
bool(re.match(pattern, txt2)) ➞ False
bool(re.match(pattern, txt3)) ➞ True
bool(re.match(pattern, txt4)) ➞ False

Capturing groups are often used along with quantifiers. Quantifiers will use the capturing group as a whole.

re.findall("(go)+", "gogogo") ➞ ["gogogo"]

Write a regular expression to match MAC-addresses. MAC-addresses consists of 6 two-digits hexadecimal numbers separated by a colon. Use a capturing group in your expression.

txt = "01:32:54:67:89:AB "
pattern = "yourregularexpressionhere"

bool(re.match(pattern, txt)) ➞ True

Warning: the function re.findall() will output any capturing groups separately. If you don't want re.findall() to output a capture group, consider using non-capturing groups. Non capturing groups are formatted (?:x), where "x" is the character or expression you want to match but not capture. Non capturing groups won't be remembered and can't be accessed later in your expression.

Notes

  • You don't need to write a function, just the pattern.
  • Do not remove import re from the code.
  • You can find all the challenges of this series in my Basic RegEx collection.
python3
Loading editor…
to run
Walks through the solution with reasoning and edge cases.