Regular expression syntax summary
Here are some basic elements of regular expressions in Python:
-
. -
Match any character except a newline.
-
^ -
Match the start of the source string.
-
$ -
Match the end of the source string, or just before a newline at the
end of the string.
-
* -
Match 0 or more repetitions of the preceding expression.
-
+ -
Match 1 or more repetitions of the preceding expression.
-
[…] -
Match any of a set of characters. Characters may be listed
individually; for example, [ace] will match the characters 'a', 'c',
or 'e'. A set may also contain character ranges; for example,
[0-9a-dA-D] will match any decimal digit, the lowercase characters
from 'a' to 'd', or the uppercase characters from 'A' to 'D'. A set
may also contain character classes such as \d or \w.
-
[^…] -
Match any character that is not in the given set.
-
? -
Match 0 or 1 repetitions of the preceding expression.
-
expr1 | expr2 -
Match either expr1 or expr2.
-
(…) -
Indicate the start and end of a group.
-
\number -
Match the same text that was matched by the group of the same number.
Groups are numbered starting from 1.
-
\b -
Match an empty string at the beginning or end of a word. A word is
defined as a sequence of characters matched by \w.
-
\d -
Match any decimal digit.
-
\s -
Match any whitespace character.
-
\w -
Match any alphanumeric character or the underscore.
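
As a rough illustration, the elements above might be combined with Python's standard re module as in the sketch below. The sample strings and variable names are invented for this example and are not part of the original summary.

```python
import re

# Hypothetical sample text, chosen only to exercise the patterns below.
text = "cat bat 42 the the end"

# "." matches any character except a newline, so "c\nt" is not found.
dot_matches = re.findall(r"c.t", "cat cot c\nt")        # ['cat', 'cot']

# A character set with \b word boundaries on both sides.
set_matches = re.findall(r"\b[bc]at\b", text)           # ['cat', 'bat']

# A negated set: runs of characters that are neither digits nor spaces.
non_digits = re.findall(r"[^0-9 ]+", text)

# \d+ matches a run of one or more decimal digits.
numbers = re.findall(r"\d+", text)                      # ['42']

# "*" allows zero or more repetitions of the preceding expression.
star_matches = re.findall(r"ab*", "a ab abbb")          # ['a', 'ab', 'abbb']

# "?" makes the preceding expression optional; "|" picks an alternative.
spellings = re.findall(r"colou?r|gr[ae]y", "color colour gray grey")

# Parentheses form group 1; \1 must match the same text again.
# \w+ is a word and \s+ is the whitespace between the two copies.
repeated = re.search(r"\b(\w+)\s+\1\b", text)
repeated_word = repeated.group(1)                       # 'the'

# "^" and "$" anchor a pattern to the start and end of the string.
starts_with_cat = re.search(r"^cat", text) is not None  # True
ends_with_end = re.search(r"end$", text) is not None    # True
```

Raw string literals (the r"..." prefix) are used so that backslash sequences such as \d and \1 reach the regular expression engine unchanged.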