$RegexMatch: Difference between revisions

From m204wiki
No edit summary
m (minor formatting)
Line 2: Line 2:
<span class="pageSubtitle">Whether string matches regex</span>
<span class="pageSubtitle">Whether string matches regex</span>


<p class="warn"><b>Note: </b>Most Sirius $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the $RegexMatch function is the <var>[[RegexMatch (String function)|RegexMatch]]</var> <var>String</var> function.</p>
<p class="warn"><b>Note:</b> Many $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the <var>$RegexMatch</var> function is the <var>[[RegexMatch (String function)|RegexMatch]]</var> <var>String</var> function.</p>


This function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the "rules" of regular expression matching. (Information about the rules observed is provided in [[Regex_processing#Regex_rules|Regex rules]]). The function is available as of Version 6.9 of the <var class="product">[[Sirius Mods]]</var>.  
This function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the "rules" of regular expression matching. (Information about the rules observed is provided in [[Regex_processing#Regex_rules|Regex rules]]).  


<var>$RegexMatch</var> accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.
<var>$RegexMatch</var> accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.
Line 24: Line 24:


<tr><th>options</th>
<tr><th>options</th>
<td>An optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see [[Regex processing#Common regex options|"Common regex options"]].
<td>An optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see [[Regex processing#Common regex options|Common regex options]].


<table class="syntaxTable">
<table>
<tr><th><var>I</var></th>
<tr><th><var>I</var></th>
<td>Do case-insensitive matching between <var class="term">instr</var> and <var class="term">regex</var>.</td></tr>
<td>Do case-insensitive matching between <var class="term">instr</var> and <var class="term">regex</var>.</td></tr>
<tr><th><var>S</var></th>
<tr><th><var>S</var></th>
<td>Dot-All mode: a dot (<tt>.</tt>) can match any character, including carriage return and linefeed.</td></tr>
<td>Dot-All mode: a dot (<tt>.</tt>) can match any character, including carriage return and linefeed.</td></tr>
<tr><th><var>M</var></th>
<tr><th><var>M</var></th>
<td>Multi-line mode: let anchor characters match end-of-line indicators '''wherever''' the indicator appears in the input string. <var>M</var> mode is ignored if <var>C</var> (XML Schema) mode is specified.</td></tr>
<td>Multi-line mode: let anchor characters match end-of-line indicators '''wherever''' the indicator appears in the input string. <var>M</var> mode is ignored if <var>C</var> (XML Schema) mode is specified.</td></tr>
<tr><th><var>C</var></th>
<tr><th><var>C</var></th>
<td>Do the match according to XML Schema regex rules. Each regex is implicitly anchored at the beginning and end, and no characters serve as anchors. For more information, see [[Regex processing#XML Schema mode|"XML Schema mode"]].
<td>Do the match according to XML Schema regex rules. Each regex is implicitly anchored at the beginning and end, and no characters serve as anchors. For more information, see [[Regex processing#XML Schema mode|XML Schema mode]].
</td></tr></table>
</td></tr></table>
</td></tr>
</td></tr>
Line 40: Line 43:
<tr><th>%status</th>
<tr><th>%status</th>
<td>The fourth argument is optional; if specified, it is set to an integer status value. These values are possible:
<td>The fourth argument is optional; if specified, it is set to an integer status value. These values are possible:
<table class="syntaxTable">
<table class="thJustBold">
<tr><th>1</th>
<tr><th>1</th>
<td>A successful match was obtained.</td></tr>
<td>A successful match was obtained.</td></tr>
<tr><th>0</th>
<tr><th>0</th>
<td>No match: '''inStr''' was not matched by '''regex'''.</td></tr>
<td>No match: <var class="term">inStr</var> was not matched by <var class="term">regex</var>. </td></tr>
<tr><th>-1<i>nnn</i></th>
 
<td>The pattern in '''regex''' is invalid. <i>nnn</i>, the absolute value of the return minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1. Prior to Version 7.0 of the <var class="product">Sirius Mods</var>, an invalid regex results in a <var class="term">%status</var> value of <tt>-1</tt>.
<tr><th>-1<i>nnn</i></th>
<p>
<td>The pattern in <var class="term">regex</var> is invalid. <var class="term">nnn</var>, the absolute value of the return minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1.
'''Note:''' If you omit this argument and a negative <var class="term">%status</var> value is to be returned, the run is cancelled. </p>
<p class="note"> '''Note:''' If you omit this argument and a negative <var class="term">%status</var> value is to be returned, the run is cancelled. </p></td></tr>
</td></tr></table>
</table>
</td></tr>
</td></tr>
</table>
</table>
Line 55: Line 59:
==Usage notes==
==Usage notes==
<ul>
<ul>
<li>It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, <code>UTABLE LPDLST 3000</code> and <code>UTABLE LSTBL 9000</code>. For further discussion of this, see [[Regex processing#SOUL programming considerations|SOUL programming considerations]].
<li>It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, <code>UTABLE LPDLST 3000</code> and <code>UTABLE LSTBL 9000</code>. For further discussion of this, see [[Regex processing#SOUL programming considerations|SOUL programming considerations]]. </li>
 
<li><var>$RegexMatch</var> is considered <var>Longstring</var>-capable. Its string inputs and outputs are considered <var>[[Longstrings]]</var> for expression-compilation purposes, and they have standard <var>Longstring</var> truncation behavior: truncation by assignment results in request cancellation. For more information, see [[Longstrings#Longstrings and $functions|"Longstrings and $functions"]].


<li>If <var class="term">%rc</var> is zero, either <var class="term">regex</var> did not match <var class="term">inStr</var>, or there was an error in the regex. The <var class="term">%status</var> argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match.  
<li><var>$RegexMatch</var> is considered <var>Longstring</var>-capable. Its string inputs and outputs are considered <var>[[Longstrings]]</var> for expression-compilation purposes, and they have standard <var>Longstring</var> truncation behavior: truncation by assignment results in request cancellation. For more information, see [[Longstrings#Longstrings and $functions|Longstrings and $functions]]. </li>


<li>For information about additional methods and $functions that support regular expressions, see [[Regex processing|"Regex processing"]].
<li>If <var class="term">%rc</var> is zero, either <var class="term">regex</var> did not match <var class="term">inStr</var>, or there was an error in the regex. The <var class="term">%status</var> argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match. </li>


<li><var>$RegexMatch</var> is available as of Version 6.9.
<li>For information about additional methods and $functions that support regular expressions, see [[Regex processing]]. </li>
</ul>
</ul>


Line 95: Line 97:
This regex demonstrates the following:
This regex demonstrates the following:
<ul>
<ul>
<li>To match a string, a regex pattern must merely "fit" a substring of the string.  
<li>To match a string, a regex pattern must merely "fit" a substring of the string. </li>
<li>Metacharacters, in this case star (<code>*</code>), must be escaped.  
 
<li>An optional character (<code>c?</code>) may fail to find a match, but this does not prevent the success of the overall match.  
<li>Metacharacters, in this case star (<code>*</code>), must be escaped. </li>
<li>The character class range (<code>[5-8]</code>) matches the <tt>6</tt> in the input string.
 
<li>An optional character (<code>c?</code>) may fail to find a match, but this does not prevent the success of the overall match. </li>
<li>The character class range (<code>[5-8]</code>) matches the <code>6</code> in the input string. </li>
</ul>
</ul>


==Products authorizing {{PAGENAMEE}}==  
==Products authorizing {{PAGENAMEE}}==  
<ul class="smallAndTightList">
<ul class="smallAndTightList">
<li>[[List of $functions|Sirius functions]]
<li>[[Sirius Functions]] </li>
</ul>
</ul>
<p>
</p>


[[Category:$Functions|$RegexMatch]]
[[Category:$Functions|$RegexMatch]]

Revision as of 19:51, 7 August 2018

Whether string matches regex

Note: Many $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the $RegexMatch function is the RegexMatch String function.

This function determines whether a given pattern (a regular expression, or "regex") matches within a given string according to the rules of regular expression matching. (Information about the rules observed is provided in Regex rules.)

$RegexMatch accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.

Syntax

[%rc =] $RegexMatch(inStr, regex, [options], [%status])

Syntax terms

%rc      A number that is either 0 (if the regular expression was invalid or no match was found) or the position of the character after the last character matched.
inStr    The input string, to which the regular expression regex is applied. This is a required argument.
regex    A string that is interpreted as a regular expression and is applied to the inStr argument to determine whether the regex matches inStr. This is a required argument.
options  An optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, with or without separating blanks. For more information about these options, see Common regex options.
           I  Do case-insensitive matching between inStr and regex.
           S  Dot-All mode: a dot (.) can match any character, including carriage return and linefeed.
           M  Multi-line mode: let anchor characters match end-of-line indicators wherever the indicator appears in the input string. M mode is ignored if C (XML Schema) mode is specified.
           C  Do the match according to XML Schema regex rules. Each regex is implicitly anchored at the beginning and end, and no characters serve as anchors. For more information, see XML Schema mode.
%status  The fourth argument is optional; if specified, it is set to an integer status value. These values are possible:
           1      A successful match was obtained.
           0      No match: inStr was not matched by regex.
           -1nnn  The pattern in regex is invalid. nnn, the absolute value of %status minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1.

Note: If you omit this argument and a negative %status value is to be returned, the run is cancelled.
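
As a rough illustration of the %status values described above, the following sketch passes a deliberately malformed regex and decodes the reported position. The variable names and the sample pattern a(b are hypothetical and not taken from this page, and the pattern is only assumed to be rejected as invalid. The position actually reported depends on where the scanner detects the error.

```
Begin
%rc     float
%status float
%pos    float

* The unclosed group in 'a(b' is assumed to make the regex invalid
%rc = $RegexMatch('abc', 'a(b', '', %status)
If (%status LT 0) then
* %status has the form -1nnn: drop the sign, then subtract 1000
   %pos = 0 - %status - 1000
   Print 'Invalid regex, error detected at position ' %pos
ElseIf (%status EQ 0) then
   Print 'Regex is valid but did not match'
Else
   Print 'Match found, %rc is ' %rc
End If
End
```

Because %status is supplied, an invalid regex is reported through the status value rather than cancelling the run.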

Usage notes

  • It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, UTABLE LPDLST 3000 and UTABLE LSTBL 9000. For further discussion of this, see SOUL programming considerations.
  • $RegexMatch is considered Longstring-capable. Its string inputs and outputs are considered Longstrings for expression-compilation purposes, and they have standard Longstring truncation behavior: truncation by assignment results in request cancellation. For more information, see Longstrings and $functions.
  • If %rc is zero, either regex did not match inStr, or there was an error in the regex. The %status argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match.
  • For information about additional methods and $functions that support regular expressions, see Regex processing.
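
The notes above might be combined as in the following sketch, which assumes the two UTABLE commands are issued at command level before the request and then checks both %rc and %status. The input string, the pattern, and the message text are illustrative only.

```
UTABLE LPDLST 3000
UTABLE LSTBL 9000
Begin
%rc     float
%status float

* Option I requests case-insensitive matching
%rc = $RegexMatch('Hello World', 'hello', 'I', %status)
If (%rc EQ 0) then
* A zero %rc alone does not distinguish "no match" from "invalid regex"
   Print 'No match or invalid regex, status is ' %status
Else
   Print 'Match found, %rc is ' %rc
End If
End
```

The table sizes shown are the values suggested in the first usage note. Appropriate values for a given site may differ.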

Examples

The following example tests whether the regex \*bc?[5-8] matches the string a*b6. If the return code is 0 (no match), the status variable is checked for more information.

    Begin
    %rc float
    %regex Longstring
    %String Longstring
    %Options string len 10
    %status float
    %Options = ''
    %regex = '\*bc?[5-8]'
    %String = 'a\*b6'
    %rc = $RegexMatch (%String, %regex, %Options, %status)
    If (%rc EQ 0) then
       Print 'Status from $RegexMatch is ' %status
    Else
       Print %regex ' matches ' %String
    End If
    End

The regex matches the input string; the example result is:

\*bc?[5-8] matches a\*b6

This regex demonstrates the following:

  • To match a string, a regex pattern must merely "fit" a substring of the string.
  • Metacharacters, in this case star (*), must be escaped.
  • An optional character (c?) may fail to find a match, but this does not prevent the success of the overall match.
  • The character class range ([5-8]) matches the 6 in the input string.
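
To make these points concrete, a small variation on the example is sketched below. The two input strings are hypothetical. Under the rules listed above, the first call would be expected to return a nonzero position and the second to return 0, but neither result has been verified against a running system.

```
Begin
%rc    float
%regex Longstring

%regex = '\*bc?[5-8]'
* Optional c is present and 7 lies within the 5-8 range
%rc = $RegexMatch('xx*bc7yy', %regex)
Print 'xx*bc7yy gives ' %rc
* 9 lies outside the 5-8 range, so no match is expected
%rc = $RegexMatch('a*b9', %regex)
Print 'a*b9 gives ' %rc
End
```

The optional options and %status arguments are omitted here, which is reasonable only because the regex is known to be valid. With an invalid regex and no %status argument, the run would be cancelled.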

Products authorizing $RegexMatch