Regular expressions are a notation for describing sets of character strings. When a particular string is in the set described by a regular expression, we often say that the regular expression matches the string.
The simplest regular expression is a single literal character. Except for the metacharacters like *+?()|, characters match themselves. To match a metacharacter, escape it with a backslash: \+ matches a literal plus character.
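The escaping rule can be checked with a short sketch. It uses Python's standard `re` module as a stand-in engine, which is an assumption of this example: RE2 is a separate library, but the two agree on this basic syntax. The helper name `matches` is illustrative only.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text is in the set described by pattern."""
    return re.fullmatch(pattern, text) is not None

# A literal character matches itself.
print(matches(r"a", "a"))      # True

# An unescaped + is a repetition operator, so the pattern a+ does not
# match the two-character string "a+".
print(matches(r"a+", "a+"))    # False

# Escaped with a backslash, + is a literal plus character.
print(matches(r"a\+", "a+"))   # True

# re.escape backslash-escapes the metacharacters in an arbitrary string.
print(matches(re.escape("*+?()|"), "*+?()|"))  # True
```

`re.fullmatch` is used rather than `re.search` so that "matches" means the entire string belongs to the set, which is the sense used in the text above.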
Two regular expressions can be alternated or concatenated to form a new regular expression: if e1 matches s and e2 matches t, then e1|e2 matches s or t, and e1e2 matches st.
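As a minimal sketch of these two composition rules, again using Python's `re` module as an assumed stand-in for RE2, the sub-expressions `e1` and `e2` below are hypothetical placeholders, each matching exactly one string.

```python
import re

def full(pattern: str, text: str) -> bool:
    """True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Hypothetical sub-expressions and the strings they match.
e1, s = r"ab", "ab"
e2, t = r"cd", "cd"

# Wrapping each part in a non-capturing group keeps the composition
# correct even when e1 or e2 contain operators of their own.
alternation = f"(?:{e1})|(?:{e2})"
concatenation = f"(?:{e1})(?:{e2})"

print(full(alternation, s), full(alternation, t))  # True True
print(full(alternation, s + t))                    # False
print(full(concatenation, s + t))                  # True
print(full(concatenation, s))                      # False
```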
The metacharacters *, +, and ? are repetition operators: e1* matches a sequence of zero or more (possibly different) strings, each of which matches e1; e1+ matches one or more; e1? matches zero or one.
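The phrase "possibly different" is easy to miss, so here is a small sketch of it. It assumes Python's `re` module as a stand-in engine and uses a hypothetical sub-expression `e1` that matches either "ab" or "cd".

```python
import re

def full(pattern: str, text: str) -> bool:
    """True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

e1 = r"(?:ab|cd)"   # hypothetical sub-expression matching "ab" or "cd"
star, plus, opt = e1 + "*", e1 + "+", e1 + "?"

# e1* accepts zero repetitions, and the repeated strings may differ.
print(full(star, ""), full(star, "abcdab"))   # True True

# e1+ needs at least one repetition.
print(full(plus, ""), full(plus, "cd"))       # False True

# e1? accepts zero or one repetition, never two.
print(full(opt, ""), full(opt, "ab"), full(opt, "abab"))  # True True False
```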
The operator precedence, from weakest to strongest binding, is first alternation, then concatenation, and finally the repetition operators. Explicit parentheses can be used to force different meanings, just as in arithmetic expressions. Some examples: ab|cd is equivalent to (ab)|(cd); ab* is equivalent to a(b*).
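The two equivalences can be checked mechanically. The sketch below assumes Python's `re` module as a stand-in for RE2 and uses a hypothetical helper, `agree`, that compares two patterns on a handful of sample strings. Agreement on samples is evidence, not a proof of equivalence.

```python
import re

def full(pattern: str, text: str) -> bool:
    """True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

def agree(p: str, q: str, samples: list) -> bool:
    """True when p and q accept exactly the same strings among samples."""
    return all(full(p, text) == full(q, text) for text in samples)

samples = ["", "a", "ab", "cd", "abd", "acd", "abbb", "abab"]

# Alternation binds weakest: ab|cd groups as (ab)|(cd), not a(b|c)d.
print(agree(r"ab|cd", r"(ab)|(cd)", samples))   # True
print(agree(r"ab|cd", r"a(b|c)d", samples))     # False

# Repetition binds strongest: ab* groups as a(b*), not (ab)*.
print(agree(r"ab*", r"a(b*)", samples))         # True
print(agree(r"ab*", r"(ab)*", samples))         # False
```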
The syntax described so far is most of the traditional Unix egrep regular expression syntax. This subset suffices to describe all regular languages: loosely speaking, a regular language is a set of strings that can be matched in a single pass through the text using only a fixed amount of memory. Newer regular expression facilities (notably Perl and those that have copied it) have added many new operators and escape sequences, which make the regular expressions more concise, and sometimes more cryptic, but usually not more powerful.
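To make "a single pass through the text using only a fixed amount of memory" concrete, here is a hand-written state machine for the expression ab*. The function name and the state numbering are illustrative assumptions; this is a sketch of the idea, not a description of how RE2 is implemented.

```python
def matches_ab_star(text: str) -> bool:
    """Decide whether text is in the language of ab* in one left-to-right pass."""
    # The only memory is one small integer, regardless of input length.
    # 0: nothing read yet; 1: read "a" followed by zero or more "b".
    state = 0
    for ch in text:
        if state == 0 and ch == "a":
            state = 1
        elif state == 1 and ch == "b":
            state = 1
        else:
            return False   # no transition exists, so the string is rejected
    return state == 1      # state 1 is the only accepting state

for sample in ["a", "abbb", "", "ba", "abab"]:
    print(repr(sample), matches_ab_star(sample))
```

Each character is examined exactly once and never revisited, which is what distinguishes this style of matching from a backtracking search.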
This page lists the regular expression syntax accepted by RE2. Roughly speaking, and with various caveats, this syntax is a subset of that accepted by PCRE.
It also lists some syntax accepted by PCRE, Perl, and Vim.
.
[xyz]
[^xyz]
\d
\D
[[:alpha:]]
[[:^alpha:]]
\pN
\p{Greek}
\PN
\P{Greek}
xy
x
y
x|y
x*
x+
x?
x{n,m}
n
m
x{n,}
x{n}
x*?
x+?
x??
x{n,m}?
x{n,}?
x{n}?
x{}
x{-}
x{-n}
x=
Implementation restriction: The counting forms x{n,m}, x{n,}, and x{n} reject forms that create a minimum or maximum repetition count above 1000. Unlimited repetitions are not subject to this restriction.
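A pattern can be pre-screened for this restriction before it is handed to the engine. The sketch below is a rough, hypothetical checker: the names `MAX_REPEAT` and `oversized_repeats` are made up for this example, the scan ignores escaping, character classes, and nested repetitions, and it is not part of RE2's API.

```python
import re

MAX_REPEAT = 1000  # limit stated in the restriction above

# Naive scan for the counted forms {n}, {n,} and {n,m}.
_COUNTED = re.compile(r"\{(\d+)(?:,(\d*))?\}")

def oversized_repeats(pattern: str) -> list:
    """Return the counted-repetition forms whose min or max exceeds MAX_REPEAT."""
    found = []
    for m in _COUNTED.finditer(pattern):
        # Missing or empty bounds (as in {n} and {n,}) are skipped.
        counts = [int(g) for g in m.groups() if g]
        if any(c > MAX_REPEAT for c in counts):
            found.append(m.group(0))
    return found

print(oversized_repeats("a{2,1000}"))        # []
print(oversized_repeats("a{1001}"))          # ['{1001}']
print(oversized_repeats("a{0,2000}b{3,}"))   # ['{0,2000}']
print(oversized_repeats("a*b+"))             # []
```

Unlimited forms such as `x*`, `x+`, and `x{3,}` with a small minimum pass the check, consistent with the statement that unlimited repetitions are not subject to the restriction.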
x*+
x++
x?+
x{n,m}+
x{n,}+
x{n}+
(re)
(?P<name>re)
(?<name>re)
(?'name're)
(?:re)
(?flags)
(?flags:re)
(?#text)
(?|x|y|z)
(?>re)
re
re@>
%(re)
i
^
$
s
\n
U
Flag syntax is xyz (set) or -xyz (clear) or xy-z (set xy, clear z).
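The three flag forms can be tried out with Python's `re` module, which accepts the same set/clear notation inside `(?flags:re)` groups. Treating Python as a stand-in for RE2 is an assumption of this sketch, and Python additionally requires a bare `(?flags)` to appear at the start of the pattern. The flags used are `i` (case-insensitive) and `s` (let `.` match `\n`).

```python
import re

def full(pattern: str, text: str) -> bool:
    """True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Set form: i applies only inside the group.
print(full(r"(?i:abc)def", "ABCdef"))   # True
print(full(r"(?i:abc)def", "ABCDEF"))   # False

# Clear form: i is on for the whole pattern but cleared inside the group.
print(full(r"(?i)a(?-i:b)c", "AbC"))    # True
print(full(r"(?i)a(?-i:b)c", "ABC"))    # False

# Mixed form: set s and clear i inside the group.
print(full(r"(?i)a(?s-i:b.)c", "Ab\nC"))  # True
print(full(r"(?i)a(?s-i:b.)c", "AB\nC"))  # False
```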
\z
\Z
\A
\b
\w
\W
\B
\g
\G
(?=re)
(?!re)
(?<=re)
(?<!re)
re&
re@=
re@!
re@<=
re@<!
\zs
\ze
\%^
\%$
\%V
\%#
\%'m
\%23l
\%23c
\%23v
\a
\007
\f
\014
\t
\011
\012
\r
\015
\v
\013
\*
\123
\x7F
\x{10FFFF}
\C
\Q...\E
...
\1
\010
\cK
\001
\e
\033
\g1
\g{1}
\g{+1}
\g{-1}
\g{name}
\g<name>
\g'name'
\k<name>
\k'name'
\lX
X
\ux
\L...\E
\K
$0
\N{name}
\R
\U...\E
\X
\%d123
\%xFF
\%o123
\%u1234
\%U12345678
A-Z
[:foo:]
foo
\p{Foo}
Foo
\pF
F
[\d]
[^\d]
[\D]
[^\D]
[[:name:]]
[:name:]
[^[:name:]]
[:^name:]
[\p{Name}]
\p{Name}
[^\p{Name}]
\P{Name}
[0-9]
[^0-9]
\s
[\t\n\f\r ]
\S
[^\t\n\f\r ]
[0-9A-Za-z_]
[^0-9A-Za-z_]
\h
\H
\V
[[:alnum:]]
[0-9A-Za-z]
[A-Za-z]
[[:ascii:]]
[\x00-\x7F]
[[:blank:]]
[\t ]
[[:cntrl:]]
[\x00-\x1F\x7F]
[[:digit:]]
[[:graph:]]
[!-~]
[A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]
[[:lower:]]
[a-z]
[[:print:]]
[ -~]
[ [:graph:]]
[[:punct:]]
[!-/:-@[-`{-~]
[[:space:]]
[\t\n\v\f\r ]
[[:upper:]]
[A-Z]
[[:word:]]
[[:xdigit:]]
[0-9A-Fa-f]
C
Cc
Cf
Cn
Co
Cs
L
LC
L&
Ll
Lm
Lo
Lt
Lu
M
Mc
Me
Mn
N
Nd
Nl
No
P
Pc
Pd
Pe
Pf
Pi
Po
Ps
S
Sc
Sk
Sm
So
Z
Zl
Zp
Zs
Adlam
Ahom
Anatolian_Hieroglyphs
Arabic
Armenian
Avestan
Balinese
Bamum
Bassa_Vah
Batak
Bengali
Bhaiksuki
Bopomofo
Brahmi
Braille
Buginese
Buhid
Canadian_Aboriginal
Carian
Caucasian_Albanian
Chakma
Cham
Cherokee
Chorasmian
Common
Coptic
Cuneiform
Cypriot
Cypro_Minoan
Cyrillic
Deseret
Devanagari
Dives_Akuru
Dogra
Duployan
Egyptian_Hieroglyphs
Elbasan
Elymaic
Ethiopic
Georgian
Glagolitic
Gothic
Grantha
Greek
Gujarati
Gunjala_Gondi
Gurmukhi
Han
Hangul
Hanifi_Rohingya
Hanunoo
Hatran
Hebrew
Hiragana
Imperial_Aramaic
Inherited
Inscriptional_Pahlavi
Inscriptional_Parthian
Javanese
Kaithi
Kannada
Katakana
Kawi
Kayah_Li
Kharoshthi
Khitan_Small_Script
Khmer
Khojki
Khudawadi
Lao
Latin
Lepcha
Limbu
Linear_A
Linear_B
Lisu
Lycian
Lydian
Mahajani
Makasar
Malayalam
Mandaic
Manichaean
Marchen
Masaram_Gondi
Medefaidrin
Meetei_Mayek
Mende_Kikakui
Meroitic_Cursive
Meroitic_Hieroglyphs
Miao
Modi
Mongolian
Mro
Multani
Myanmar
Nabataean
Nag_Mundari
Nandinagari
New_Tai_Lue
Newa
Nko
Nushu
Nyiakeng_Puachue_Hmong
Ogham
Ol_Chiki
Old_Hungarian
Old_Italic
Old_North_Arabian
Old_Permic
Old_Persian
Old_Sogdian
Old_South_Arabian
Old_Turkic
Old_Uyghur
Oriya
Osage
Osmanya
Pahawh_Hmong
Palmyrene
Pau_Cin_Hau
Phags_Pa
Phoenician
Psalter_Pahlavi
Rejang
Runic
Samaritan
Saurashtra
Sharada
Shavian
Siddham
SignWriting
Sinhala
Sogdian
Sora_Sompeng
Soyombo
Sundanese
Syloti_Nagri
Syriac
Tagalog
Tagbanwa
Tai_Le
Tai_Tham
Tai_Viet
Takri
Tamil
Tangsa
Tangut
Telugu
Thaana
Thai
Tibetan
Tifinagh
Tirhuta
Toto
Ugaritic
Vai
Vithkuqi
Wancho
Warang_Citi
Yezidi
Yi
Zanabazar_Square
\i
\I
\k
\F
\p
\P
[ \t]
[^ \t]
\x
\o
[0-7]
\O
\l
\L
\u
\U
\_x
\c
\m
\M
(?{code})
(??{code})
(?n)
(?+n)
+n
(?-n)
-n
(?C)
(?R)
(?0)
(?&name)
(?P=name)
(?P>name)
(?(cond)true|false)
(?(cond)true)
(*ACCEPT)
(*COMMIT)
(*F)
(*FAIL)
(*MARK)
(*PRUNE)
(*SKIP)
(*THEN)
(*ANY)
(*ANYCRLF)
(*CR)
(*CRLF)
(*LF)
(*BSR_ANYCRLF)
(*BSR_UNICODE)