Search Strings REGEXP_SUBSTR

Your rating: None Average: 4.3 (17 votes)

As of Oracle 10g, the SUBSTR has been enhanced to support searches for regular expressions. The new function is named REGEXP_SUBSTR. You can use the new function to perform advanced searches against a string.

Syntax:

REGEXP_SUBSTR(source_string, pattern
                            [,position
                               [, occurrence
                                   [, match_parameter]
                               ]
                             ]
                            )
Where:
source_string: is the text to search within
pattern: is the regular expression
position: the position in the string to start searching from (default = 1)
occurrence: the occurrence to search for (default = 1)
match_parameter: can include one or more of the following modifiers to change the
default matching algorithm:
'c' - case-sensitive matching (default);
' ' – case-insensitive matching;
'n' – allow match-any-character operator;
'm' - treat source string as multiple line.

REGEXP_SUBSTR returns the string as VARCHAR2 or CLOB data in the same character set as the source_string.

Lets look at an example:

SELECT regexp_substr( '655-236-4567', '-[^-]+' ) "REGEXP_SUBSTR"
FROM dual;
/* '- tells oracle to start the string at the '-' character.  [^-] tells oracle to continue until it finds another '-' character. */

REGEXP_SUBSTR
-236

Note that if you add an extra '-' at the end of the regular expression you will get the trailing '-' as part of the returned string:

SELECT regexp_substr( '655-236-4567', '-[^-]+-' ) "REGEXP_SUBSTR"
FROM dual;

REGEXP_SUBSTR
-236-

A good understanding for Character Class Syntax is critical for effective use of the regular expression search capabilities.

Character Class   Syntax Meaning
[:alnum:]             All alphanumeric characters
[:alpha:]              All alphabetic characters
[:blank:]              All blank space characters
[:cntrl:]                All control characters (nonprinting)
[:digit:]                All numeric digits
[:graph:]             All [:punct:], [:upper:], [:lower:], and [:digit:] characters
[:lower:]              All lowercase alphabetic characters
[:print:]               All printable characters
[:punct:]              All punctuation characters
[:space:]             All space characters (nonprinting)
[:upper:]             All uppercase alphabetic characters
[:xdigit:]              All valid hexadecimal characters

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:]' ) "REGEXP_SUBSTR"
FROM dual;

REGEXP_SUBSTR
:

What if you want to include the punctuation and some more characters?

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:][^,]+,' ) "REGEXP_SUBSTR"
FROM dual;
/* This is telling Oracle to start at the punct and include all characters until u reach a comma (include the comma as well). */

REGEXP_SUBSTR
: Adam,

What if you want to find the number in the string?

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[[:digit:]]+' ) "REGEXP_SUBSTR"
FROM dual;

REGEXP_SUBSTR
28

Please try to help out with unanswered topics on the forum. Chances are you have had the same issue/question some time in your IT career!

Comments

Have a question? Please ask it on the forum instead.
Guest's picture
Guest (not verified)
Re: Search Strings REGEXP_SUBSTR

Hi Lepa,
Here are some errors in the two query,please tell me the reason,thanks.
SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:]' ) "REGEXP_SUBSTR"
FROM dual;
Result:p ----why not Employee Name and Age: Adam

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:][^,]+,' ) "REGEXP_SUBSTR"
FROM dual;
Result:ployee Name and Age: Adam, ----why not Employee Name and Age: Adam,

Pavony Alejandro's picture
Pavony Alejandro (not verified)
Re: Search Strings REGEXP_SUBSTR

Hi, I have the next case.
I need your colaboration. Thanks:
I have the next select:
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,1) param from dual; -- work good, extract value: 'a'
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,2) param from dual; -- work good, extract value: 'b'
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,3) param from dual; -- work bad, extract value: 'd' ** why no extract "null" ?
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,4) param from dual; -- work bad, extract value: 'null' ** no found the 4th parameter.

Guest's picture
Guest (not verified)
Re: Search Strings REGEXP_SUBSTR

Pavony Alejandro
Look this solution:
select replace(regexp_SUBSTR('AA|BBB|C||DDDDDD||E|12|','[^|]*[|$]',1,1),'|','') FROM DUAL;
select replace(regexp_SUBSTR('AA|BBB|C||DDDDDD||E|12|','[^|]*[|$]',1,2),'|','') FROM DUAL;
select replace(regexp_SUBSTR('AA|BBB|C||DDDDDD||E|12|','[^|]*[|$]',1,3),'|','') FROM DUAL;
select replace(regexp_SUBSTR('AA|BBB|C||DDDDDD||E|12|','[^|]*[|$]',1,4),'|','') FROM DUAL;

Guest's picture
Guest (not verified)
Re: Search Strings REGEXP_SUBSTR

Hi!

Thank you for your post.

Can you give an example that includes occurence?

Suppose I have the below string:

'Adam (1), John (2), Mary (3)'

and I want it to return:

1, 2, 3

Can REGEXP_SUBSTR help me here?

Guest's picture
Guest (not verified)
Re: Search Strings REGEXP_SUBSTR

Sorry. I know numbers cxan be extracted using :digit.

Suppose you have alphabets.

Consider the below string:

'Adam (A), John (B), Mary (C)'

and I want it to return:

A, B, C

Can REGEXP_SUBSTR or REGEXP_REPLACE help me here?

ugs's picture
ugs (not verified)
Re: Search Strings REGEXP_SUBSTR

There is a sytanx mistake in the sql.
[] is missed at [:punct:]. It should be [[:punct:]]
Correct Sql:
1.SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:]' ) "REGEXP_SUBSTR"
FROM dual;
2.SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[[:punct:]][^,]+,' ) "REGEXP_SUBSTR"
FROM dual;

ugs's picture
ugs (not verified)
Re: Search Strings REGEXP_SUBSTR

sorry!
Correct Sql:
1.SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[[:punct:]'] ) "REGEXP_SUBSTR"
FROM dual;
2.SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[[:punct:]][^,]+,' ) "REGEXP_SUBSTR"
FROM dual;
..

Rich's picture
Rich (not verified)
Re: Search Strings REGEXP_SUBSTR

I am trying to use regexp_substr to return a string of text from beginning at one word and ending at another word. for example (the below text is a sample from a column in my table, column name 'comment')

FINDINGS: Today's examination is limited due to exam. There is no aortic dissection. IMPRESSION:The exam doesn't find any postiitive results. We cannot see the PE. I cannot determine the cause.

FINAL REPORT

when i use regexp_substr(comment, 'IMPRESSION: [^FINAL]+') It returns all text starting at IMPRESSION and ending after the '.' becuase the letter I is in the word FINAL. What i want is the whole string starting form IMPRESSION and ending at FINAL. Any suggestions?

Guest's picture
Guest
Re: Search Strings REGEXP_SUBSTR

can any one assists for new line character???

Guest's picture
Guest
Re: Search Strings REGEXP_SUBSTR

I'm trying to select the digits between the $ and the space before the /

Field:
Global Adjustment\Rajustement global\ 7.000330 kWh @ $0.118653 /kWh

SQL:
REGEXP_SUBSTR(BCALCLN.DESCR_ON_BILL, '[^$]*$') PRICE,

Returns:
0.118653 /kWh

How do I omit the ' /kWh' so it RETURNS ONLY 0.118653

Guest's picture
Guest
Re: Search Strings REGEXP_SUBSTR

Hi I want to split string by #`

Please help

sample string
#`FP1039~#`STRISK~#`STD~#`ST##DEFPROFPORT~#`FD-1~#`FD-1~#`FD-1~#`FD-1~#`ST~#`

Post new comment

CAPTCHA
The question below is to prevent automated spam submissions.
1 + 0 =
Solve this simple math problem and enter the result. E.g. for 1+3, enter 4.