4/25/22, 10:21 PM Regex Cheat Sheet \s Most engines: "whitespace character": space, tab, newline, carriage return, vertical tab a\sb\sc ab c ab a\sb\sc c \D\D\D ABC \W\W\W\W\W *-+=) \S\S\S\S Yoyo \s .NET, Python 3, JavaScript: "whitespace character": any Unicode separator \D \W \S One character that is not a digit as defined by your engine's \d One character that is not a word character as defined by your engine's \w One character that is not a whitespace character as defined by your engine's \s (direct link) Quantifiers Quantifier Legend + One or more {3} Exactly three times {2,4} Two to four times {3,} Three or more times * Zero or more times ? Once or none Example Sample Match Version \w-\w+ Version A-b1_1 \D{3} ABC \d{2,4} 156 \w{3,} regex_tutorial A*B*C* AAACC plurals? plural (direct link) More Characters Character Legend . Any character except line break . Any character except line break \. A period (special character: needs to be escaped by a \) \ Escapes a special character \ Escapes a special character Example a.c .* a\.c \.\*\+\? \$\^\/\\ \[\{\(\)\}\] Sample Match abc whatever, man. a.c .*+? $^/\ [{()}] (direct link) Logic Logic | (…) \1 \2 (?: … ) Legend Alternation / OR operand Capturing group Contents of Group 1 Contents of Group 2 Non-capturing group Example Sample Match 22|33 33 A(nt|pple) Apple (captures "pple") r(\w)g\1x regex (\d\d)\+(\d\d)=\2\+\1 12+65=65+12 A(?:nt|pple) Apple (direct link) More White-Space Character Legend \t \r \n Tab Carriage return character Line feed character \r\n Line separator on Windows \N \h \H \v Perl, PCRE (C, PHP, R…): one character that is not a line break Perl, PCRE (C, PHP, R…), Java: one horizontal whitespace character: tab or Unicode space separator One character that is not a horizontal whitespace .NET, JavaScript, Python, Ruby: vertical tab Perl, PCRE (C, PHP, R…), Java: one vertical whitespace character: line feed, carriage return, vertical tab, form feed, paragraph or line separator Perl, PCRE (C, PHP, R…), Java: any character that is not a vertical whitespace Perl, PCRE (C, PHP, R…), Java: one line break (carriage return + line feed pair, and all the characters matched by \v) \v \V \R https://www.rexegg.com/regex-quickstart.html Sample Match T\t\w{2} T ab see below see below AB AB\r\nCD CD \N+ ABC Example 2/6 4/25/22, 10:21 PM Regex Cheat Sheet (direct link) More Quantifiers Quantifier Legend + The + (one or more) is "greedy" ? Makes quantifiers "lazy" * The * (zero or more) is "greedy" ? Makes quantifiers "lazy" {2,4} Two to four times, "greedy" ? Makes quantifiers "lazy" Example Sample Match \d+ 12345 \d+? 1 in 12345 A* AAA A*? empty in AAA \w{2,4} abcd \w{2,4}? ab in abcd (direct link) Character Classes Character Legend […] One of the characters in the brackets […] One of the characters in the brackets Range indicator [x-y] One of the characters in the range from x to y […] One of the characters in the brackets [x-y] One of the characters in the range from x to y [^x] One character that is not x Example [AEIOU] T[ao]p [a-z] [A-Z]+ [AB1-5w-z] [ -~]+ [^a-z]{3} [^x-y] One of the characters not in the range from x to y [^ -~]+ [\d\D] One character that is a digit or a non-digit [\d\D]+ [\x41] Matches the character at hexadecimal position 41 in the ASCII table, i.e. A [\x41-\x45] {3} Sample Match One uppercase vowel Tap or Top One lowercase letter GREAT One of either: A,B,1,2,3,4,5,w,x,y,z Characters in the printable section of the ASCII table. A1! Characters that are not in the printable section of the ASCII table. Any characters, including new lines, which the regular dot doesn't match ABE (direct link) Anchors and Boundaries Anchor Legend ^ Start of string or start of line depending on multiline mode. (But when [^inside brackets], it means "not") $ End of string or end of line depending on multiline mode. Many engine-dependent subtleties. Beginning of string \A (all major engines except JS) Very end of the string \z Not available in Python and JS End of string or (except Python) before final line break \Z Not available in JS Beginning of String or End of Previous Match \G .NET, Java, PCRE (C, PHP, R…), Perl, Ruby Word boundary \b Most engines: position where one side only is an ASCII letter, digit or underscore Word boundary \b .NET, Java, Python 3, Ruby: position where one side only is a Unicode letter, digit or underscore \B Not a word boundary Example ^abc .* .*? the end$ \Aabc[\d\D]* Sample Match abc (line start) this is the end abc (string... ...start) the end\z this is...\n...the end the end\Z this is...\n...the end\n Bob.*\bcat\b Bob ate the cat Bob.*\b\кошка\b Bob ate the кошка c.*\Bcat\B.* copycats (direct link) POSIX Classes Character Legend [:alpha:] PCRE (C, PHP, R…): ASCII letters A-Z and a-z [:alpha:] Ruby 2: Unicode letter or ideogram [:alnum:] PCRE (C, PHP, R…): ASCII digits and letters A-Z and a-z [:alnum:] Ruby 2: Unicode digit, letter or ideogram https://www.rexegg.com/regex-quickstart.html Example [8[:alpha:]]+ [[:alpha:]\d]+ [[:alnum:]]{10} [[:alnum:]]{10} Sample Match WellDone88 кошка99 ABCDE12345 кошка90210 3/6 4/25/22, 10:21 PM Regex Cheat Sheet [:punct:] PCRE (C, PHP, R…): ASCII punctuation mark [[:punct:]]+ ?!.,:; [:punct:] Ruby: Unicode punctuation mark [[:punct:]]+ ‽,:〽⁆ (direct link) Inline Modifiers None of these are supported in JavaScript. In Ruby, beware of (?s) and (?m). Modifier (?i) (?s) Legend Example Case-insensitive mode (?i)Monday (except JavaScript) DOTALL mode (except JS and Ruby). The dot (.) matches new line characters (\r\n). Also (?s)From A.*to Z known as "single-line mode" because the dot treats the entire input as a single line (?m) Multiline mode (except Ruby and JS) ^ and $ match at the beginning and end of every line (?m) In Ruby: the same as (?s) in other engines, i.e. DOTALL mode, i.e. dot matches line breaks (?m)From A.*to Z (?x) Free-Spacing Mode mode (except JavaScript). Also known as comment mode or whitespace mode (?n) .NET, PCRE 10.30+: named capture only (?d) Java: Unix linebreaks only (?^) PCRE 10.32+: unset modifiers Sample Match monDAY From A to Z 1 2 3 From A to Z (?m)1\r\n^2$\r\n^3$ (?x) # this is a # comment abc # write on multiple abc d # lines [ ]d # spaces must be # in brackets Turns all (parentheses) into non-capture groups. To capture, use named groups. The dot and the ^ and $ anchors are only affected by \n Unsets ismnx modifiers (direct link) Lookarounds Lookaround Legend (?=…) Positive lookahead (?<=…) Positive lookbehind (?!…) Negative lookahead (?<!…) Negative lookbehind Example Sample Match (?=\d{10})\d{5} 01234 in 0123456789 (?<=\d)cat cat in 1cat (?!theatre)the\w+ theme \w{3}(?<!mon)ster Munster (direct link) Character Class Operations Class Operation Legend Example […-[…]] .NET: character class subtraction. One character that is in those on the left, [a-z-[aeiou]] but not in the subtracted class. […-[…]] .NET: character class subtraction. […&& […]] […&& […]] […&& [^…]] […&& [^…]] Java, Ruby 2+: character class intersection. One character that is both in those on the left and in the && class. Java, Ruby 2+: character class intersection. Java, Ruby 2+: character class subtraction is obtained by intersecting a class with a negated class Java, Ruby 2+: character class subtraction Sample Match Any lowercase consonant An Arabic character that is not a non-digit, i.e., an Arabic digit An non-whitespace character that is a non[\S&&[\D]] digit. [\S&&[\D]&&[^a-zA- An non-whitespace character that a nonZ]] digit and not a letter. An English lowercase letter that is not a [a-z&&[^aeiou]] vowel. [\p{InArabic}&& An Arabic character that is not a letter or a [^\p{L}\p{N}]] number [\p{IsArabic}-[\D]] (direct link) Other Syntax https://www.rexegg.com/regex-quickstart.html 4/6