As of Oracle 10g, the SUBSTR has been enhanced to support searches for regular expressions. The new function is named REGEXP_SUBSTR. You can use the new function to perform advanced searches against a string.
source_string: is the text to search within
pattern: is the regular expression
position: the position in the string to start searching from (default = 1)
occurrence: the occurrence to search for (default = 1)
match_parameter: can include one or more of the following modifiers to change the
default matching algorithm:
'c' - case-sensitive matching (default);
' ' – case-insensitive matching;
'n' – allow match-any-character operator;
'm' - treat source string as multiple line.
returns the string as VARCHAR2 or CLOB data in the same character set as the source_string.
Lets look at an example:
SELECT regexp_substr( '655-236-4567', '-[^-]+' ) "REGEXP_SUBSTR"
/* '- tells oracle to start the string at the '-' character. [^-] tells oracle to continue until it finds another '-' character. */
Note that if you add an extra '-' at the end of the regular expression you will get the trailing '-' as part of the returned string:
SELECT regexp_substr( '655-236-4567', '-[^-]+-' ) "REGEXP_SUBSTR"
A good understanding for Character Class Syntax is critical for effective use of the regular expression search capabilities.
Character Class Syntax Meaning
[:alnum:] All alphanumeric characters
[:alpha:] All alphabetic characters
[:blank:] All blank space characters
[:cntrl:] All control characters (nonprinting)
[:digit:] All numeric digits
[:graph:] All [:punct:], [:upper:], [:lower:], and [:digit:] characters
[:lower:] All lowercase alphabetic characters
[:print:] All printable characters
[:punct:] All punctuation characters
[:space:] All space characters (nonprinting)
[:upper:] All uppercase alphabetic characters
[:xdigit:] All valid hexadecimal characters
SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:]' ) "REGEXP_SUBSTR"
What if you want to include the punctuation and some more characters?
SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:][^,]+,' ) "REGEXP_SUBSTR"
/* This is telling Oracle to start at the punct and include all characters until u reach a comma (include the comma as well). */
What if you want to find the number in the string?
SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[[:digit:]]+' ) "REGEXP_SUBSTR"
Bookmark/Search this post with