$RegexMatch: Difference between revisions

From m204wiki
m (1 revision)
 
(18 intermediate revisions by 6 users not shown)
<span class="pageSubtitle">Whether string matches regex</span>
<span class="pageSubtitle">Whether string matches regex</span>


<p class="warning">Most Sirius $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the $RegexMatch function is the [[RegexMatch (String function)]].</p>
<p class="warn"><b>Note:</b> Many $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the <var>$RegexMatch</var> function is the <var>[[RegexMatch (String function)|RegexMatch]]</var> <var>String</var> function.</p>


This function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the "rules" of regular expression matching (information about the rules observed is provided in ). The function is available as of Version 6.9 of the <var class="product">[[Sirius Mods]]</var>.  
This function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the "rules" of regular expression matching. (Information about the rules observed is provided in [[Regex_processing#Regex_rules|Regex rules]].)
 
<var>$RegexMatch</var> accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.


==Syntax==
==Syntax==
<p class="syntax"><section begin="syntax" /> [%rc =] $RegexMatch(inStr, regex, [options], [%status])
<p class="syntax">[%rc =] $RegexMatch(inStr, regex, [options], [%status])
<section end="syntax" /></p>
<p class="caption">$RegexMatch Function
</p>
</p>
<p class="caption">'''%rc''', if specified, is a number that is either 0 if the regular expression was invalid or no match was found, or the position of the character '''after''' the last character matched.</p>


<var>$RegexMatch</var> accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.
===Syntax terms===
<ul>
<table class="syntaxTable">
<tr><th>%rc</th>
<td>a number that is either 0 (if the regular expression was invalid or no match was found) or the position of the character '''after''' the last character matched. </td></tr>


<li>The first argument is the input string, to which the regular expression '''regex''' is applied. This is a required argument.
<tr><th>inStr</th>
<li>The second argument is a string that is interpreted as a regular expression and is applied to the '''inStr''' argument to determine whether the regex matches '''inStr'''. This is a required argument.
<td>The input string, to which the regular expression <var class="term">regex</var> is applied. This is a required argument. </td></tr>
<li>The third argument is an optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see
<table class="syntaxTable">
<tr><th>I</th>
<td>Do case-insensitive matching between '''string''' and '''regex'''.</td></tr>
<tr><th>S</th>
<td>Dot-All mode: a dot (".") can match any character, including carriage return and linefeed.</td></tr>
<tr><th>M</th>


<tr><th>regex</th>
<td>A string that is interpreted as a regular expression and is applied to the <var class="term">inStr</var> argument to determine whether the regex matches <var class="term">inStr</var>. This is a required argument. </td></tr>


<td>Multi-line mode: let anchor characters match end-of-line indicators '''wherever''' the indicator appears in the input string. M mode is ignored if C (XML Schema) mode is specified.</td></tr>
<tr><th>options</th>
<tr><th>C</th>
<td>An optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see [[Regex processing#Common regex options|Common regex options]].
<td>Do the match according to XML Schema regex rules. Each regex is implicitly anchored at the beginning and end, and no characters serve as anchors. For more information,
</td></tr>
</td></tr></table>


<li>The fourth argument is optional; if specified, it is set to an integer status value. These values are possible:
<tr><th>%status</th>
<table class="syntaxTable">
<td>The fourth argument is optional; if specified, it is set to an integer status value. These values are possible:
<tr><th>1</th>
<table class="thJustBold">
<tr><th>1</th>
<td>A successful match was obtained.</td></tr>
<td>A successful match was obtained.</td></tr>
<tr><th>0</th>
<td>No match: '''inStr''' was not matched by '''regex'''.</td></tr>
<tr><th>-1<i>nnn</i></th>
<td>The pattern in '''regex''' is invalid. <i>nnn</i>, the absolute value of the return minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1. Prior to Version 7.0 of the <var class="product">[[Sirius Mods]]</var>, an invalid regex results in a '''status''' value of <tt>-1</tt>. <p>'''Note: ''' If you omit this argument and a negative '''status''' value is to be returned, the run is cancelled. </p>
</td></tr></table>


</ul>
<tr><th>0</th>
<td>No match: <var class="term">inStr</var> was not matched by <var class="term">regex</var>. </td></tr>


==Notes==
<tr><th>-1<i>nnn</i></th>
<td>The pattern in <var class="term">regex</var> is invalid. <var class="term">nnn</var>, the absolute value of the return minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1. 
<p class="note"> '''Note:''' If you omit this argument and a negative <var class="term">%status</var> value is to be returned, the run is cancelled. </p></td></tr>
</table>
</td></tr>
</table>


==Usage notes==
<ul>
<ul>
<li>It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, <code>UTABLE LPDLST 3000</code> and <code>UTABLE LSTBL 9000</code>. For further discussion of this, see [[Regex processing#SOUL programming considerations|SOUL programming considerations]]. </li>
<li><var>$RegexMatch</var> is considered <var>Longstring</var>-capable. Its string inputs and outputs are considered <var>[[Longstrings]]</var> for expression-compilation purposes, and they have standard <var>Longstring</var> truncation behavior: truncation by assignment results in request cancellation. For more information, see [[Longstrings#Longstrings and $functions|Longstrings and $functions]]. </li>


<li>It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, <tt>UTABLE LPDLST 3000</tt> and <tt>UTABLE LSTBL 9000</tt>. For further discussion of this,
<li>If <var class="term">%rc</var> is zero, either <var class="term">regex</var> did not match <var class="term">inStr</var>, or there was an error in the regex. The <var class="term">%status</var> argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match. </li>
<li>$RegexMatch is considered Longstring-capable. Its string inputs and outputs are considered Longstrings for expression-compilation purposes, and they have standard Longstring truncation behavior: truncation by assignment results in request cancellation. For more information,
 
<li>If '''%rc''' is zero, either '''regex''' did not match '''inStr''', or there was an error in the regex. The '''%status''' argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match.  
<li>For information about additional methods and $functions that support regular expressions, see [[Regex processing]]. </li>
<li>For information about additional methods and $functions that support regular expressions,
</ul>
</ul>


==Examples==
==Examples==
The following example tests whether the regex <code>\*bc?[5-8]</code> matches the string <code>a*b6</code>. If the return code is 0 (no match), the status variable is checked for more information.


The following example tests whether the regex <tt>\*bc?[5-8]</tt> matches the string <tt>a*b6</tt>. If the return code is 0 (no match), the status variable is checked for more information.
<p class="code">Begin
 
%rc float
<p class="code"> Begin
%regex Longstring
%rc float
%String Longstring
%regex Longstring
%Options string len 10
%String Longstring
%status float
%Options string len 10
%status float
   
   
%Options = ''
%Options = ''
%regex = '\*bc?[5-8]'
%regex = '\*bc?[5-8]'
%String = 'a\*b6'
%String = 'a\*b6'
   
   
%rc = $RegexMatch (%String, %regex, %Options, %status)
%rc = $RegexMatch (%String, %regex, %Options, %status)
If (%rc EQ 0) then
If (%rc EQ 0) then
    Print 'Status from <var>$RegexMatch</var> is ' %status
  Print 'Status from <var>$RegexMatch</var> is ' %status
Else
Else
    Print %regex ' matches ' %String
  Print %regex ' matches ' %String
End If
End If
End
End
</p>
</p>


The regex matches the input string; the example result is:
The regex matches the input string; the example result is:
<p class="code"> \*bc?[5-8] matches a\*b6
<p class="code">\*bc?[5-8] matches a\*b6
</p>
</p>


This regex demonstrates the following:
This regex demonstrates the following:
<ul>
<ul>
<li>To match a string, a regex pattern must merely "fit" a substring of the string. </li>
<li>Metacharacters, in this case star (<code>*</code>), must be escaped.  </li>


<li>To match a string, a regex pattern must merely "fit" a substring of the string.
<li>An optional character (<code>c?</code>) may fail to find a match, but this does not prevent the success of the overall match. </li>
<li>Metacharacters, in this case star (<tt>*</tt>), must be escaped.
<li>An optional character (<tt>c?</tt>) may fail to find a match, but this does not prevent the success of the overall match.  
<li>The character class range (<code>[5-8]</code>) matches the <code>6</code> in the input string. </li>
<li>The character class range (<tt>[5-8]</tt>) matches the <tt>6</tt> in the input string.
</ul>
</ul>


<var>$RegexMatch</var> is available as of Version 6.9.
==Products authorizing {{PAGENAMEE}}==
 
<ul class="smallAndTightList">
<ul class="smallAndTightList">
<li>[[Sirius functions]]
<li>[[Sirius Functions]] </li>
</ul>
</ul>
<p class="caption">Products authorizing $RegexMatch
</p>


[[Category:$Functions|$RegexMatch]]
[[Category:$Functions|$RegexMatch]]
[[Category:Regular expression processing]]

Latest revision as of 17:06, 21 January 2022

Whether string matches regex

Note: Many $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the $RegexMatch function is the RegexMatch String function.

This function determines whether a given pattern (regular expression, or "regex") matches within a given string according to the "rules" of regular expression matching. (Information about the rules observed is provided in Regex rules.)

$RegexMatch accepts two required and two optional arguments, and it returns a numeric value. It is also callable. Specifying an invalid argument results in request cancellation.

Syntax

[%rc =] $RegexMatch(inStr, regex, [options], [%status])

Syntax terms

%rc A number that is either 0 (if the regular expression was invalid or no match was found) or the position of the character after the last character matched.
inStr The input string, to which the regular expression regex is applied. This is a required argument.
regex A string that is interpreted as a regular expression and is applied to the inStr argument to determine whether the regex matches inStr. This is a required argument.
options An optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see Common regex options.
%status The fourth argument is optional; if specified, it is set to an integer status value. These values are possible:
1 A successful match was obtained.
0 No match: inStr was not matched by regex.
-1nnn The pattern in regex is invalid. nnn, the absolute value of the returned status minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1.

Note: If you omit this argument and a negative %status value is to be returned, the run is cancelled.
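
The %status values described above can be told apart with a few comparisons. The following is a minimal sketch only: the variable %errPos and the deliberately unclosed character class are illustrative assumptions, not part of the original page, and the regex is merely assumed to be rejected as invalid.

```soul
Begin
%rc float
%status float
%errPos float

* '[5-8' is deliberately malformed (hypothetical invalid regex).
* Supplying %status keeps an invalid regex from cancelling the run.
%rc = $RegexMatch('a*b6', '[5-8', '', %status)

If (%status LT 0) then
* %status has the form -1nnn: nnn is its absolute value minus 1000
   %errPos = 0 - %status - 1000
   Print 'Invalid regex; error detected at position ' %errPos
ElseIf (%status EQ 0) then
   Print 'Regex is valid, but it did not match'
Else
   Print 'Match found; %rc is ' %rc
End If
End
```

Testing %status rather than %rc distinguishes an invalid regex from a valid regex that did not match, since %rc is 0 in both cases.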

Usage notes

  • It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, UTABLE LPDLST 3000 and UTABLE LSTBL 9000. For further discussion of this, see SOUL programming considerations.
  • $RegexMatch is considered Longstring-capable. Its string inputs and outputs are considered Longstrings for expression-compilation purposes, and they have standard Longstring truncation behavior: truncation by assignment results in request cancellation. For more information, see Longstrings and $functions.
  • If %rc is zero, either regex did not match inStr, or there was an error in the regex. The %status argument returns additional information. If it is negative, it indicates an error. If it is zero, it indicates there was no error, but the regex did not match.
  • For information about additional methods and $functions that support regular expressions, see Regex processing.

Examples

The following example tests whether the regex \*bc?[5-8] matches the string a*b6. If the return code is 0 (no match), the status variable is checked for more information.

Begin
%rc float
%regex Longstring
%String Longstring
%Options string len 10
%status float

%Options = ''
%regex = '\*bc?[5-8]'
%String = 'a\*b6'

%rc = $RegexMatch (%String, %regex, %Options, %status)
If (%rc EQ 0) then
   Print 'Status from $RegexMatch is ' %status
Else
   Print %regex ' matches ' %String
End If
End

The regex matches the input string; the example result is:

\*bc?[5-8] matches a\*b6

This regex demonstrates the following:

  • To match a string, a regex pattern must merely "fit" a substring of the string.
  • Metacharacters, in this case star (*), must be escaped.
  • An optional character (c?) may fail to find a match, but this does not prevent the success of the overall match.
  • The character class range ([5-8]) matches the 6 in the input string.
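
As a variation on the example above, the following sketch passes an option string. It assumes that the I option requests case-insensitive matching, as described under Common regex options; the uppercase input string is a hypothetical value chosen for illustration.

```soul
Begin
%rc float
%status float

* Assumption: with option I, the lowercase b in the regex can match
* the uppercase B in the input string.
%rc = $RegexMatch('A*B6', '\*bc?[5-8]', 'I', %status)
If (%rc EQ 0) then
   Print 'Status from $RegexMatch is ' %status
Else
* %rc is the position of the character after the last one matched
   Print 'Match found; %rc is ' %rc
End If
End
```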

Products authorizing $RegexMatch

  • Sirius Functions