Regular expression syntax summary
Here are some basic elements of regular expressions in Python:
-
.
-
Match any character except a newline.
-
^
-
Match the start of the source string.
-
$
-
Match the end of the source string, or just before a newline at the very end of it.
-
*
-
Match 0 or more repetitions of the preceding expression.
-
+
-
Match 1 or more repetitions of the preceding expression.
-
[…]
-
Match any of a set of characters. Characters may be listed
individually; for example, [ace] will match the characters 'a', 'c',
or 'e'. A set may also contain character ranges; for example,
[0-9a-dA-D] will match any decimal digit, the lowercase characters
from 'a' to 'd', or the uppercase characters from 'A' to 'D'. A set
may also contain character classes such as \d or \w.
-
[^…]
-
Match any character that is not in the given set.
-
?
-
Match 0 or 1 repetitions of the preceding expression.
-
expr1|expr2
-
Matches either expr1 or expr2.
-
(…)
-
Indicates the start and end of a group.
-
\number
-
Matches the contents of the group of the same number. Groups are
numbered starting from 1.
-
\b
-
Match an empty string at the beginning or end of a word. A word is
defined as a sequence of characters matched by \w.
-
\d
-
Match any decimal digit.
-
\s
-
Match any whitespace character.
-
\w
-
Match any alphanumeric character or an underscore.
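
The elements summarized above can be tried out with Python's standard re module. The following is a minimal sketch, not a complete tour: the sample strings and variable names are made up for illustration, and the comments show the results each call is expected to give.

```python
import re

# Hypothetical sample text, chosen only to exercise the patterns below.
text = "cat hat bat 42 the the end"

# .  ^  $ : any character, start of string, end of string
first_word = re.search(r"^c.t", text).group()          # 'cat'
ends_with_end = re.search(r"end$", text) is not None   # True

# *  +  ? : 0 or more, 1 or more, 0 or 1 repetitions of the preceding 'b'
star = re.findall(r"ab*", "a ab abbb")       # ['a', 'ab', 'abbb']
plus = re.findall(r"ab+", "a ab abbb")       # ['ab', 'abbb']
optional = re.findall(r"ab?", "a ab abbb")   # ['a', 'ab', 'ab']

# [...] and [^...] : a set of characters and its negation
in_set = re.findall(r"[chb]at", text)              # ['cat', 'hat', 'bat']
not_in_set = re.findall(r"[^chb ]at", "cat rat")   # ['rat']

# expr1|expr2 : either alternative
either = re.findall(r"cat|bat", text)   # ['cat', 'bat']

# (...) and \1 : a group, and a back-reference to what group 1 matched
repeated = re.search(r"(\w+) \1", text)
doubled = repeated.group()      # 'the the'
captured = repeated.group(1)    # 'the'

# \b : word boundary, so the 'the' inside 'other' is skipped
whole_words = re.findall(r"\bthe\b", "the other the")   # ['the', 'the']

# \d  \s  \w : digits, whitespace, word characters
digits = re.findall(r"\d+", text)                  # ['42']
fields = re.split(r"\s+", "a  b\tc")               # ['a', 'b', 'c']
words = re.findall(r"\w+", "snake_case, x1!")      # ['snake_case', 'x1']
```

The patterns are written as raw strings (the r prefix) so that Python does not interpret sequences such as \b, which in an ordinary string literal is a backspace character, before the re module sees them.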