RegexMatch (String function): Difference between revisions
Revision as of 20:55, 18 July 2011
Position after match of regex (String class)
The RegexMatch intrinsic function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the rules of regular expression matching.
Syntax
%number = string:RegexMatch( regex, [Options= string], - [CaptureList= stringlist]) Throws InvalidRegex
Syntax terms
%number | A variable to return the position of the character after the last character matched, or a zero if no characters in the method object string match the regular expression.
---|---
string | The input string, to which the regular expression regex is applied.
regex | A string that is interpreted as a regular expression and is applied to the method object string to determine whether the regex matches string.
Options | This is an optional, but nameRequired, parameter supplying a string of single-letter options, which may be specified in uppercase or lowercase, in any combination, and blank separated or not as you prefer. For more information about these options, see "Common regex options".
CaptureList |
Exceptions
RegexMatch can throw the following exceptions:
- InvalidRegex: If the regex parameter does not contain a valid regular expression. The exception object indicates the position of the character in the regex parameter where it was determined that the regular expression is invalid, and a description of the nature of the error.
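As a rough analogue only, the following Python sketch shows the same idea of an invalid pattern being reported with a position and a description. It uses Python's standard `re` module, not the Sirius regex engine, and `describe_regex_error` is a hypothetical helper name introduced for this illustration; the actual InvalidRegex exception object and its members are not shown here.

```python
import re

def describe_regex_error(pattern):
    """Return (position, message) if pattern is invalid, or None if it compiles.

    Assumption: Python's re.error stands in for InvalidRegex purely for illustration.
    """
    try:
        re.compile(pattern)
    except re.error as err:
        # err.pos is the offset in the pattern where the problem was detected;
        # err.msg is a short description of the problem.
        return err.pos, err.msg
    return None

print(describe_regex_error("[aeiou"))   # unterminated character class: reports an error
print(describe_regex_error("[aeiou]"))  # valid pattern: None
```

The point of the sketch is only that a caller can distinguish "pattern is malformed" from "pattern is fine but did not match", which is the distinction the exception provides.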
Usage notes
- It is strongly recommended that you protect your environment from regular expression processing demands on PDL and STBL space by setting, say, UTABLE LPDLST 3000 and UTABLE LSTBL 9000. See "User Language programming considerations".
- For information about additional methods that support regular expressions, see "Regex Processing".
- RegexMatch is something of a misnomer. It does not determine whether a string matches a regular expression; it determines whether a string contains a substring that matches a regular expression. RegexMatch behaves more like a matching method if the regular expression is "anchored" (begins with a caret ('^') and ends with a dollar sign ('$')), or if the C option indicates XML Schema mode.
- RegexMatch is available as of Sirius Mods Version 7.2.
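The difference between an unanchored and an anchored pattern can be sketched in Python. This is an analogue under stated assumptions, not the Sirius implementation: `regex_match_pos` is a hypothetical helper that mimics the documented return convention (the 1-based position after the last matched character, or 0 for no match) on top of Python's `re.search`.

```python
import re

def regex_match_pos(s, pattern):
    """Hypothetical stand-in for the documented return convention:
    1-based position of the character after the match, or 0 if no match."""
    m = re.search(pattern, s)
    # m.end() is the 0-based index just past the match, so add 1 for 1-based.
    return m.end() + 1 if m else 0

unanchored = regex_match_pos("xxabcxx", "abc")       # matches a substring
anchored_miss = regex_match_pos("xxabcxx", "^abc$")  # whole string must be 'abc'
anchored_hit = regex_match_pos("abc", "^abc$")

print(unanchored, anchored_miss, anchored_hit)  # prints: 6 0 4
```

With the anchors in place, a nonzero result means the entire string fit the pattern, which is the "matching method" behavior described in the note above.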
Examples
Finding the first position of one of several characters
A common programming problem is to "scan" a string, and find the first position which is one of several characters. This can be readily accomplished with RegexMatch; here is an example of this:
    %regex = '[aeiou]'; * Scan for any vowel
    %str = 'That quick brown fox'
    %i = %str:regexMatch(%regex)
    if %i then
       printText Before vowel: {%str:Left(%i - 2)}
       printText The vowel: {%str:Char(%i-1)}
       printText After vowel: {%str:Substring(%i)}
    end if
The result of the above fragment is:
    Before vowel: Th
    The vowel: a
    After vowel: t quick brown fox
Note that the position returned by RegexMatch is the position of the character after the first successful match.
In many cases, this programming problem is better performed using the StringTokenizer.
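The position arithmetic in the example above (subtracting 2 for the text before the match and 1 for the matched character) follows from the return value pointing one past the match. A minimal Python sketch of the same arithmetic, assuming a single-character match and using Python's `re` module rather than the Sirius engine; `split_at_first` is a hypothetical helper name:

```python
import re

def split_at_first(s, char_class):
    """Split s around the first character matching char_class.

    Returns (before, found, after), or None when nothing matches.
    Assumes the pattern matches exactly one character.
    """
    m = re.search(char_class, s)
    if m is None:
        return None
    i = m.end() + 1        # 1-based position after the match, like %i
    before = s[: i - 2]    # corresponds to Left(%i - 2)
    found = s[i - 2]       # the character at 1-based position %i - 1
    after = s[i - 1 :]     # corresponds to Substring(%i)
    return before, found, after

print(split_at_first("That quick brown fox", "[aeiou]"))
# prints: ('Th', 'a', 't quick brown fox')
print(split_at_first("rhythm", "[aeiou]"))
# prints: None
```

Checking for the no-match case first mirrors the `if %i then` test in the fragment: a zero return means there is no position to do arithmetic on.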
Finding the first position which isn't one of several characters
A similar programming task is finding the position of the first character which is not one of a set of characters; this is readily accomplished with RegexMatch (and is not as amenable to processing with the StringTokenizer). Here is an example:
    * Hex 5F is intended here as the caret, which negates the character class
    %regex = '[' With '5F':HexToString With 'aeiou]'; * Scan for any non-vowel
    %str = 'albatross'
    %i = %str:regexMatch(%regex)
    if %i then
       printText Before non-vowel: {%str:Left(%i - 2)}
       printText The non-vowel: {%str:Char(%i-1)}
       printText After non-vowel: {%str:Substring(%i)}
    end if
The result of the above fragment is:
    Before non-vowel: a
    The non-vowel: l
    After non-vowel: batross
Using a few other regex features
The following example tests whether the regex '\*bc?[5-8]' matches 'a*b6' (it does, and note that it matches "in the middle" of the string).
    begin
    %rc float
    %regex longstring
    %string longstring
    %regex = '\*bc?[5-8]'
    %string = 'a*b6'
    %rc = %string:regexmatch(%regex)
    if %rc then
       printText '{%regex}' matches '{%string}'
    else
       printText '{%regex}' does not match '{%string}'
    end if
    end
The regex matches the input string; the example result is:
'\*bc?[5-8]' matches 'a*b6'
This regex demonstrates the following:
- To match a string, a regex pattern must merely "fit" a substring of the string.
- Metacharacters, in this case star ('*'), must be escaped.
- An optional character ('c?') may fail to find a match, but this does not prevent the success of the overall match.
- The character class range ('[5-8]') matches the '6' in the input string.
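The points listed above can be checked against a few extra inputs. The following Python sketch uses the same pattern with Python's `re` module, which is an assumption for illustration (these particular constructs behave the same way in most regex dialects, but this is not the Sirius engine); `contains_match` is a hypothetical helper name and the extra input strings are not from the original example.

```python
import re

PATTERN = r"\*bc?[5-8]"  # escaped star, optional 'c', digit in the range 5-8

def contains_match(s):
    """True if any substring of s fits PATTERN (unanchored search)."""
    return re.search(PATTERN, s) is not None

results = {s: contains_match(s) for s in ["a*b6", "a*bc7", "a*b9", "ab6"]}
print(results)
# prints: {'a*b6': True, 'a*bc7': True, 'a*b9': False, 'ab6': False}
```

Here 'a*bc7' succeeds because the optional 'c' is present, 'a*b9' fails because 9 is outside the range, and 'ab6' fails because the escaped star requires a literal asterisk in the input.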