10g Regular Expressions

Syntax:

source_char – any character string with a datatype of CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB
pattern (max 512 bytes)

Operator   Description
\          The backslash character can have four different meanings depending on the context. It can:
             Stand for itself
             Quote the next character
             Introduce an operator
             Do nothing
*          Matches zero or more occurrences
+          Matches one or more occurrences
?          Matches zero or one occurrence
|          Alternation operator for specifying alternative matches
^          Matches the beginning of a string
$          Matches the end of a string
.          Matches any character in the supported character set except NULL
[ ]        Bracket expression for specifying a matching list that should match any one of the expressions represented in the list
[^ ]       A non-matching list expression specifies a list that matches any character except for the expressions represented in the list
( )        Grouping expression, treated as a single sub-expression
{m}        Matches exactly m times
{m,}       Matches at least m times
{m,n}      Matches at least m times but no more than n times
\n         The back-reference expression (n is a digit between 1 and 9) matches the nth sub-expression enclosed between '(' and ')' preceding the \n
[..]       Specifies one collation element, and can be a multi-character element (for example, [.ch.] in Spanish)
[: :]      Specifies character classes (for example, [:alpha:]); it matches any character within the character class (see table below)
[==]       Specifies equivalence classes. For example, [=a=] matches all characters having base letter 'a'

(Explanations taken, in part, from Oracle on-line help)

Character classes

CHARACTER CLASS SYNTAX   MEANING
[:alnum:]                All alphanumeric characters
[:alpha:]                All alphabetic characters
[:blank:]                All blank space characters
[:cntrl:]                All control characters (nonprinting)
[:digit:]                All numeric digits
[:graph:]                All [:punct:], [:upper:], [:lower:], and [:digit:] characters
[:lower:]                All lowercase alphabetic characters
[:print:]                All printable characters
[:punct:]                All punctuation characters
[:space:]                All space characters (nonprinting), such as carriage return, newline, vertical tab, and form feed
[:upper:]                All uppercase alphabetic characters
[:xdigit:]               All valid hexadecimal characters

match_pattern: lets you change the default matching behavior

i   specifies case-insensitive matching
c   specifies case-sensitive matching
n   allows the period (.), which is the match-any-character wildcard character, to match the newline character – if you omit this parameter, the period does not match the newline character
m   treats the source string as multiple lines (Oracle interprets ^ and $ as the start and end, respectively, of any line anywhere in the source string, rather than only at the start or end of the entire source string – if you omit this parameter, Oracle treats the source string as a single line)
x   ignores whitespace characters (by default, whitespace characters match themselves)

Example:

Assume you have a table with an area code / phone number combination field. Select the records that have the exact format (123)123-4567.
    SELECT Areacode_Phone "Valid Area Code and Phone No."
    FROM   Customer_Table
    WHERE  REGEXP_LIKE (Areacode_Phone, '^\([0-9]{3}\)[[:digit:]]{3}-[0-9]{4}$');

Note:

^ anchors the match at the start of the string
The open and close parentheses need the quoting character (\) in front of them because they are "special" characters (i.e. unquoted, they form a grouping expression)
[0-9] and [[:digit:]] each look for any digit between 0 and 9 (two ways of doing exactly the same thing)
{3} and {4} look for exactly three and exactly four instances of the preceding element (i.e. a digit)
$ anchors the match at the end of the string, so nothing may follow the last four digits
The column alias is kept under 30 characters, the identifier length limit in Oracle 10g
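
To try the pattern without a real table, the sketch below runs it against a few literal values. The samples subquery and its phone column are hypothetical stand-ins for Customer_Table and Areacode_Phone, and the sample values are made up for illustration. Under these assumptions only the first value should be returned, since the others break the format with an extra space, missing parentheses, or an extra trailing digit. The second statement is a small illustration of the i option from the match_pattern table, passed as the third argument of REGEXP_LIKE.

```sql
-- Hypothetical sample data standing in for Customer_Table.Areacode_Phone
WITH samples AS (
  SELECT '(123)123-4567'  AS phone FROM dual UNION ALL  -- exact format
  SELECT '(123) 123-4567' AS phone FROM dual UNION ALL  -- space after ')'
  SELECT '123-123-4567'   AS phone FROM dual UNION ALL  -- no parentheses
  SELECT '(123)123-45678' AS phone FROM dual            -- one digit too many
)
SELECT phone
FROM   samples
WHERE  REGEXP_LIKE (phone, '^\([0-9]{3}\)[[:digit:]]{3}-[0-9]{4}$');

-- The 'i' option makes the match case-insensitive,
-- so the lowercase pattern matches the uppercase string
SELECT 'matched' AS result
FROM   dual
WHERE  REGEXP_LIKE ('ORACLE', '^oracle$', 'i');
```

Dropping the ^ and $ anchors from the first query would let the last sample through as well, because the pattern would then only need to match somewhere inside the string.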