$RegexMatch
Latest revision as of 17:06, 21 January 2022
Whether string matches regex
Note: Many $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the $RegexMatch function is the RegexMatch String function.
This function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the "rules" of regular expression matching. (Information about the rules observed is provided in Regex rules.)
$RegexMatch accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.
Syntax
[%rc =] $RegexMatch(inStr, regex, [options], [%status])
Syntax terms
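Term | Description
---|---
%rc | A number that is either 0 (if the regular expression was invalid or no match was found) or the position of the character after the last character matched.
inStr | The input string, to which the regular expression regex is applied. This is a required argument.
regex | A string that is interpreted as a regular expression and is applied to the inStr argument to determine whether the regex matches inStr. This is a required argument.
options | An optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see Common regex options.
%status | The fourth argument is optional; if specified, it is set to an integer status value. The possible values are listed below.

Possible %status values:

Value | Meaning
---|---
1 | A successful match was obtained.
0 | No match: inStr was not matched by regex.
-1nnn | The pattern in regex is invalid. nnn, the absolute value of the returned status minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1.

Note: If you omit this argument and a negative %status value is to be returned, the run is cancelled.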
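A negative %status has the form -1nnn, where nnn (the absolute value of the status minus 1000) is the 1-based position at which the regex error was detected. The arithmetic can be sketched as follows. This is an illustrative Python helper, not SOUL code, and `decode_status` is a hypothetical name introduced only for this example.

```python
def decode_status(status: int) -> str:
    """Interpret a status value using the 1 / 0 / -1nnn convention."""
    if status == 1:
        return "match"
    if status == 0:
        return "no match"
    if status < -1000:
        # nnn = |status| - 1000 is the 1-based position of the character
        # being scanned when the error was discovered.
        position = abs(status) - 1000
        return f"invalid regex at position {position}"
    raise ValueError(f"unexpected status value: {status}")


print(decode_status(-1004))  # invalid regex at position 4
```

For a three-character regex, a status of -1004 would therefore point at the end-of-string position (length + 1).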
Usage notes
- It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, `UTABLE LPDLST 3000` and `UTABLE LSTBL 9000`. For further discussion of this, see SOUL programming considerations.
- $RegexMatch is considered Longstring-capable. Its string inputs and outputs are considered Longstrings for expression-compilation purposes, and they have standard Longstring truncation behavior: truncation by assignment results in request cancellation. For more information, see Longstrings and $functions.
- If %rc is zero, either regex did not match inStr, or there was an error in the regex. The %status argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match.
- For information about additional methods and $functions that support regular expressions, see Regex processing.
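The interplay between %rc and %status can be illustrated with a rough analogue built on Python's `re` module. This is only a sketch under stated assumptions: it is not SOUL, Python's regex rules and error positions are not guaranteed to agree with those used by $RegexMatch, and the function name `regex_match` is hypothetical.

```python
import re


def regex_match(in_str: str, regex: str) -> tuple[int, int]:
    """Return (rc, status) in the style described for $RegexMatch."""
    try:
        compiled = re.compile(regex)
    except re.error as err:
        # Assumption: convert Python's 0-based error offset to a 1-based
        # position; use end-of-string (length + 1) if no offset is reported.
        position = err.pos + 1 if err.pos is not None else len(regex) + 1
        return 0, -(1000 + position)
    found = compiled.search(in_str)
    if found is None:
        return 0, 0  # valid regex, but no match
    # Assumption: rc is the 1-based position just past the last matched character.
    return found.end() + 1, 1


print(regex_match('a\\*b6', r'\*bc?[5-8]'))  # (6, 1)
print(regex_match('a\\*b6', r'xyz'))         # (0, 0)
```

In both failure cases the first value is 0, so only the second value distinguishes "no match" (0) from an invalid regex (negative).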
Examples
The following example tests whether the regex `\*bc?[5-8]` matches the string `a\*b6`. If the return code is 0 (no match, or an invalid regex), the status variable is checked for more information.
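    Begin
    * Declare the return code, inputs, options, and status
    %rc float
    %regex Longstring
    %String Longstring
    %Options string len 10
    %status float
    * No regex options are used in this example
    %Options = ''
    %regex = '\*bc?[5-8]'
    %String = 'a\*b6'
    %rc = $RegexMatch (%String, %regex, %Options, %status)
    * A zero return code means no match or an invalid regex
    If (%rc EQ 0) then
       Print 'Status from $RegexMatch is ' %status
    Else
       Print %regex ' matches ' %String
    End If
    End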
The regex matches the input string; the example result is:
\*bc?[5-8] matches a\*b6
This regex demonstrates the following:
- To match a string, a regex pattern must merely "fit" a substring of the string.
- Metacharacters, in this case star (`*`), must be escaped.
- An optional character (`c?`) may fail to find a match, but this does not prevent the success of the overall match.
- The character class range (`[5-8]`) matches the `6` in the input string.
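The four points above can be checked with any regex engine that treats these constructs the same way. The sketch below uses Python's `re` module as a stand-in. It is not SOUL, and the variable names are chosen only for illustration.

```python
import re

pattern = re.compile(r'\*bc?[5-8]')

# The pattern only needs to fit a substring: the leading "a" is skipped.
substring_match = pattern.search('a*b6')
print(substring_match.group(0))  # *b6

# The optional "c" may be present or absent; both inputs match.
with_optional = pattern.search('a*bc7')
print(with_optional.group(0))  # *bc7

# A digit outside the range 5-8 does not satisfy the character class.
out_of_range = pattern.search('a*b4')
print(out_of_range)  # None

# Without the escape, the leading star is a quantifier with nothing to
# repeat, which Python rejects as an invalid pattern.
try:
    re.compile('*bc?[5-8]')
    unescaped_is_valid = True
except re.error:
    unescaped_is_valid = False
print(unescaped_is_valid)  # False
```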