- How to use backreferences to access the captures.
- Grouped expressions are captured by default.
- The regex engine stores the text matched inside parentheses in memory for later use.
- For example,
/a(ppl)e/ matches “apple” and captures “ppl”.
- The actual data that was matched is stored, not the expression.
- As another example,
/a(ppl|ngl)e/ matches “angle” and captures “ngl”.
- The regex engine captures the matching text, even if you don’t use it.
- Backreferences refer to captured data.
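As a quick illustration of the capture notes above, here is a minimal sketch using Python's standard `re` module. Python is only one possible engine, and the variable names are purely illustrative.

```python
import re

# /a(ppl)e/ matches "apple"; group 1 holds the text matched inside the parentheses.
m1 = re.search(r"a(ppl)e", "apple")
whole1, captured1 = m1.group(0), m1.group(1)

# /a(ppl|ngl)e/ matches "angle"; the capture is the matched text "ngl",
# not the sub-expression "ppl|ngl".
m2 = re.search(r"a(ppl|ngl)e", "angle")
captured2 = m2.group(1)

print(whole1, captured1, captured2)
```

The capture is available through `group(1)` whether or not the rest of the program ever reads it.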
- Backreference Metacharacters
\1 Backreference to first capture.
\2 Backreference to second capture.
\3 Backreference to third capture.
- Most regex engines support
\1 through \9
- Some engines support
\10 through \99 (not recommended)
- It is difficult to tell a backreference of
\99 apart from a backreference of \9 followed by the literal character 9.
- Some engines use
$1 through $9 syntax instead.
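The two-digit ambiguity can be seen in practice. The sketch below assumes Python's `re` module, which writes backreferences as `\1` (not `$1`) and offers the explicit `\g<1>` form in replacement strings; other engines resolve the ambiguity differently.

```python
import re

# Goal: replace "a" with the first capture followed by a literal "0".
# \g<1> makes it explicit that the reference is to group 1.
explicit = re.sub(r"(a)", r"\g<1>0", "a")

# Written as \10, the replacement is read as a reference to group 10,
# which does not exist here, so Python reports an error.
try:
    re.sub(r"(a)", r"\10", "a")
    ambiguous_accepted = True
except re.error:
    ambiguous_accepted = False

print(explicit, ambiguous_accepted)
```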
- A backreference can be used in two ways:
- Inside the same expression, or after the match (for example, in a replacement string).
- Backreferences cannot be used inside character classes.
- Character classes and backreferences are different concepts.
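The two uses, and the character-class restriction, can be sketched as follows. This assumes Python's `re` module, and the sample strings are made up for illustration.

```python
import re

# Use 1: inside the same expression. \1 must repeat what group 1 matched.
inside = re.fullmatch(r"(\w)\1", "aa") is not None

# Use 2: after the match, here in a replacement string that swaps two captures.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# Inside a character class, Python treats \1 as a numeric character escape,
# not as a backreference, so this does not match "aa".
in_class = re.fullmatch(r"(a)[\1]", "aa")

print(inside, swapped, in_class)
```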
- As a third example,
/(apples) to \1/ matches “apples to apples”.
- As a fourth example,
/(ab):(cd):(ef):\3:\2:\1/ matches “ab:cd:ef:ef:cd:ab”.
- Three expressions are captured.
- The backreferences then require the same text again in reverse order.
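The third and fourth examples can be checked with a short sketch, again assuming Python's `re` module. The "apples to oranges" input is an added counter-example, not part of the notes.

```python
import re

# \1 must repeat the exact text captured by (apples).
same = re.fullmatch(r"(apples) to \1", "apples to apples")
different = re.fullmatch(r"(apples) to \1", "apples to oranges")

# Three captures, then the same text required again in reverse order.
mirror = re.fullmatch(r"(ab):(cd):(ef):\3:\2:\1", "ab:cd:ef:ef:cd:ab")
groups = mirror.groups()

print(same is not None, different is None, groups)
```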
- A fifth example uses alternation:
/<(i|em)>.+?<\/\1>/ matches “<i>regex</i>” or “<em>regex</em>”.
- However,
/<(i|em)>.+?<\/\1>/ does not match “<i>regex</em>”.
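A small sketch of the tag-matching example, assuming Python's `re` module. Python patterns have no `/` delimiters, so the slash in the closing tag does not need escaping here.

```python
import re

# The closing tag must reuse whichever alternative group 1 matched.
tag = re.compile(r"<(i|em)>.+?</\1>")

italic = tag.search("<i>regex</i>")
emphasis = tag.search("<em>regex</em>")
mismatched = tag.search("<i>regex</em>")  # opening and closing tags differ

print(italic.group(1), emphasis.group(1), mismatched)
```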
- A sixth example finds duplicate words:
/\b(\w+)\b\s+\1/ will highlight “the the” in the text below.
Paris in the
the spring.
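The duplicate-word search can be tried on the sample text with the sketch below, assuming Python's `re` module. The variant with a trailing `\b` is an added suggestion, not part of the notes: without it, the pattern as written also matches the start of a longer word such as “then”.

```python
import re

text = "Paris in the\nthe spring."

# \s+ also matches the line break, so the repeated word is found across lines.
dup = re.search(r"\b(\w+)\b\s+\1", text)
found, word = dup.group(0), dup.group(1)

# Without a trailing \b, "the then" is reported as a duplicate ("the the").
loose = re.search(r"\b(\w+)\b\s+\1", "the then")
# Hypothetical stricter variant: require a word boundary after the repeat.
strict = re.search(r"\b(\w+)\b\s+\1\b", "the then")

print(repr(found), word, loose is not None, strict)
```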