RegexSubset (Stringlist function): Difference between revisions

From m204wiki
m (1 revision)
No edit summary
 
(22 intermediate revisions by 8 users not shown)
Line 1: Line 1:
{{Template:Stringlist:RegexSubset subtitle}}
{{Template:Stringlist:RegexSubset subtitle}}
<p>This method returns a <var>Stringlist</var> that is a subset of the method <var>Stringlist</var>. The subset contains copies of all the items in the method <var>Stringlist</var> that are matched by a specified regex. Information about the regular expression matching rules observed is provided in [[Regex processing]].</p><p><var>RegexSubset</var> accepts one required and four optional arguments, and it returns a <var>Stringlist</var>.</p>
<p>This method returns a <var>Stringlist</var> that is a subset of the method <var>Stringlist</var>. The subset contains copies of all the items in the method <var>Stringlist</var> that are matched by a specified regex. Information about the regular expression matching rules observed is provided in [[Regex_processing#Regex_rules|"Regex processing rules"]].</p>


==Syntax==
==Syntax==
{{Template:Stringlist:RegexSubset syntax}}
{{Template:Stringlist:RegexSubset syntax}}
===Syntax terms===
===Syntax terms===
<table class="syntaxTable">
<table class="syntaxTable">
<tr><th>%subsetList</th>
<tr><th>%subsetList</th>
<td>A <var>Stringlist</var> that contains the <var class="term">sl</var> items matched by <var class="term">regex</var>.</td></tr>
<td>A <var>Stringlist</var> that contains the <var class="term">sl</var> items matched by <var class="term">regex</var>.</td></tr>
<tr><th>sl</th>
<tr><th>sl</th>
<td>A <var>Stringlist</var> object.</td></tr>
<td>A <var>Stringlist</var> object.</td></tr>
<tr><th>regex</th>
<tr><th>regex</th>
<td>A string that is interpreted as a regular expression and is applied to the <var class="term">sl</var> method <var>Stringlist</var> items to determine whether the <var class="term">regex</var> finds a match.</td></tr>
<td>A string that is interpreted as a regular expression and is applied to the <var class="term">sl</var> method <var>Stringlist</var> items to determine whether the <var class="term">regex</var> finds a match.</td></tr>
<tr><th>Options</th>
 
<td>The <var class="term">Options</var> argument (name required) is an optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see [[Regex processing]].
<tr><th><var>Options</var></th>
<table class="syntaxNested">
<td>This is an optional, [[Notation conventions for methods#Named parameters|name required]], parameter supplying a string of single letter options, which may be specified in uppercase or lowercase, in any combination, and blank separated or not as you prefer. For more information about these options, see [[Regex_processing#Common_regex_options|Common regex options]].
<tr><th>I</th>
<td>Do case-insensitive matching between <var class="term">sl</var> and <var class="term">regex</var>.</td></tr>
<tr><th>S</th>
<td>Dot-All mode: a dot (<code>'.'</code>) can match any character, including carriage return and linefeed.</td></tr>
<tr><th>M</th>
<td>Multi-line mode: let anchor characters match end-of-line indicators <b><i>wherever</i></b> the indicator appears in the input string. <code>'M'</code> mode is ignored if <code>'C'</code> (XML Schema) mode is specified.</td></tr>
<tr><th>C</th>
<td>Do the match according to XML Schema regex rules. Each regex is implicitly anchored at the beginning and end, and no characters serve as anchors. For more information, see [[Regex processing]].</td></tr>
</table>
</td></tr>
</td></tr>
<tr><th><b>Status</b></th>
 
<td>The <var class="term">Status</var> argument (name required) is optional; if specified, it is set to an integer code. These values are possible:
<tr><th><var>Status</var></th>
<td>The <var>Status</var> argument (name required) is optional; if specified, it is set to an integer code. These values are possible:
 
<table class="syntaxNested">
<table class="syntaxNested">
<tr><th><i>n</i></th>
<tr><th>n</th>
<td>The number of <var class="term">sl</var> items that are matched.</td></tr>
<td>The number of <var class="term">sl</var> items that are matched.</td></tr>
<tr><th>0</th>
 
<tr><th><var>0</var></th>
<td>No match: no items in <var class="term">sl</var> were matched by <var class="term">regex</var>.</td></tr>
<td>No match: no items in <var class="term">sl</var> were matched by <var class="term">regex</var>.</td></tr>
<tr><th>-2</th>
 
<td>Invalid <var class="term">StartCol</var> or <var class="term">EndCol</var> argument value.</td></tr>
<tr><th><var>-2</var></th>
<tr><th>-1<i>nnn</i></th>
<td>Invalid <var>StartCol</var> or <var>EndCol</var> argument value.</td></tr>
<td>The pattern in  <var class="term">regex</var> is invalid.  <i>nnn</i> (the absolute value of the return minus 1000) gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1. Prior to <var class="product">Sirius Mods</var> Version 7.0, an invalid <var class="term">regex</var> results in a <var class="term">Status</var> value of <code>-1</code>.</td></tr>
 
<tr><th><var>-1</var>nnn</th>
<td>The pattern in  <var class="term">regex</var> is invalid.  <i>nnn</i> (the absolute value of the return minus 1000) gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1. Prior to <var class="product">Sirius Mods</var> Version 7.0, an invalid <var class="term">regex</var> results in a <var>Status</var> value of <code>-1</code>.</td></tr>
</table>
</table>
If you omit this argument and a negative Status value is to be returned, the run is cancelled.</td></tr>
If you omit this argument and a negative Status value is to be returned, the run is cancelled.</td></tr>
<tr><th><b>StartCol</b></th>
 
<td>The <var class="term">StartCol</var> argument (name required) is an optional <var class="term">number</var> that specifies the starting column of the range of columns in which the matched string must be located. If specified, <var class="term">number</var> must be greater than or equal to 1 and less than or equal to the <var class="term">EndCol</var> argument value. If the argument is omitted, its default value is 1. If you specify a <var class="term">number</var> value that is greater than the length of a particular <var class="term">sl</var> item, the <var class="term">regex</var> is matched against the empty string for that item.</td>
<tr><th><var>StartCol</var></th>
<tr><th><b>EndCol</b></th>
<td>The <var>StartCol</var> argument (name required) is an optional <var class="term">number</var> that specifies the starting column of the range of columns in which the matched string must be located. If specified, <var class="term">number</var> must be greater than or equal to 1 and less than or equal to the <var>EndCol</var> argument value. If the argument is omitted, its default value is 1. If you specify a <var class="term">number</var> value that is greater than the length of a particular <var class="term">sl</var> item, the <var class="term">regex</var> is matched against the empty string for that item.</td>
<td>The <var class="term">EndCol</var> argument (name required) is an optional <var class="term">number</var> that specifies the ending column of the range of columns in which the matched string must be located. If specified, <var class="term">number</var> must be greater than or equal to 1, and greater than or equal to the <var class="term">StartCol</var> argument value, and less than or equal to the lesser of 6124 or the length of <var class="term">sl</var> item. If the <var class="term">EndCol</var> argument is omitted, its default value is 6124.<p>If the <var class="term">EndCol</var> argument is omitted and a <var class="term">sl</var> item exceeds 6124 bytes, the request is cancelled.</p></td>
 
<tr><th><var>EndCol</var></th>
<td>The <var>EndCol</var> argument (name required) is an optional <var class="term">number</var> that specifies the ending column of the range of columns in which the matched string must be located. If specified, <var class="term">number</var> must be greater than or equal to 1, and greater than or equal to the <var>StartCol</var> argument value, and less than or equal to the lesser of 6124 or the length of <var class="term">sl</var> item. If the <var>EndCol</var> argument is omitted, its default value is 6124.<p>If the <var>EndCol</var> argument is omitted and a <var class="term">sl</var> item exceeds 6124 bytes, the request is cancelled.</p></td>
</table>
</table>


==Usage notes==
==Usage notes==
<ul><li>All errors in <var class="term">RegexSubset</var>, including invalid argument(s) result in request cancellation.<li>It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, <code>UTABLE LPDLST 3000</code> and <code>UTABLE LSTBL 9000</code>. For further discussion of this, see [[User Language coding considerations]].
<ul>
<li>All errors in <var>RegexSubset</var>, including invalid argument(s) result in request cancellation.
 
<li>It is strongly recommended that you protect your environment from regular expression processing demands on PDL and STBL space by setting, say, <code>UTABLE LPDLST 3000</code> and <code>UTABLE LSTBL 9000</code>. See [[Regex processing#SOUL programming considerations|SOUL programming considerations]].
 
<li>The regex matching is limited to the first 6124 bytes of each item, but a matched item is copied in its entirety to the output subset.<li>Prior to copying matched items to <var class="term">%subsetList</var>, any preexisting contents of that <var>Stringlist</var> are deleted.
<li>The regex matching is limited to the first 6124 bytes of each item, but a matched item is copied in its entirety to the output subset.<li>Prior to copying matched items to <var class="term">%subsetList</var>, any preexisting contents of that <var>Stringlist</var> are deleted.
<li>For information about additional methods and $functions that support regular expressions, see [[Regex processing]].
 
<li><var class="term">RegexSubset</var> is available as of <var class="product">Sirius Mods</var> Version 6.9.</ul>
<li>For information about additional methods and $functions that support regular expressions, see [[Regex_processing|"Regex Processing"]].
 
<li><var>RegexSubset</var> is available as of <var class="product">Sirius Mods</var> Version 6.9.
</ul>


==Examples==
==Examples==
<ol><li>In the following code fragment, <var>RegexSubset</var> is applied to the method <var>Stringlist</var> <code>%sl</code> to find the <code>%sl</code> items that are matched by the regex <code>%\([a-z]*\)</code>. The regex is designed to find items that contain shared methods whose class names contain only upper and lowercase letters.
In the following code fragment, <var>RegexSubset</var> is applied to the method <var>Stringlist</var> <code>%sl</code> to find the <code>%sl</code> items that are matched by the regex <code>%\([a-z]*\)</code>. The regex is designed to find items that contain shared methods whose class names contain only upper and lowercase letters.


<p class="code"> ...
<p class="code">...
%sl = new
%sl = new
text to %sl
text to %sl
  b
    b
      %doc is object xmlDoc
    %doc is object xmlDoc
      %(daemon):getInput<var>Object</var>(%doc)
    %(daemon):getInput<var>Object</var>(%doc)
      %doc:selectSingleNode('/outer/inner'):addAttribute('foo','bar')
    %doc:selectSingleNode('/outer/inner'):addAttribute('foo','bar')
      %(daemon):return<var>Object</var>(%doc)
    %(daemon):return<var>Object</var>(%doc)
  end
    end
end text
end text


Line 76: Line 84:
   Print 'Matching item ' %i ' is: ' %sl2:Item(%i)
   Print 'Matching item ' %i ' is: ' %sl2:Item(%i)
End For
End For
  ...
</p>
</p>


Line 84: Line 91:
Matching item 1 is: %(daemon):getInputObject(%doc)
Matching item 1 is: %(daemon):getInputObject(%doc)
Matching item 2 is: %(daemon):returnObject(%doc)
Matching item 2 is: %(daemon):returnObject(%doc)
</p></ol>
</p>


==See also==
==See also==
{{Template:Stringlist:RegexSubset footer}}
{{Template:Stringlist:RegexSubset footer}}
[[Category:Regular expression processing]]

Latest revision as of 22:18, 21 January 2022

Create subset of Stringlist that matches a regex (Stringlist class)

This method returns a Stringlist that is a subset of the method Stringlist. The subset contains copies of all the items in the method Stringlist that are matched by a specified regex. Information about the regular expression matching rules observed is provided in "Regex processing rules".

Syntax

%subsetList = sl:RegexSubset( regex, [Options= string], [Status= %output], -
                              [StartCol= number], [EndCol= number])
              Throws InvalidRegex

Syntax terms

%subsetList
    A Stringlist that contains the sl items matched by regex.
sl
    A Stringlist object.
regex
    A string that is interpreted as a regular expression and is applied to the items of the method Stringlist, sl, to determine whether the regex finds a match.
Options
    An optional, name-required parameter that supplies a string of single-letter options. The options may be specified in uppercase or lowercase, in any combination, and may be separated by blanks or not. For more information about these options, see Common regex options.
Status
    The Status argument (name required) is optional; if specified, it is set to an integer code. These values are possible:
      n       The number of sl items that are matched.
      0       No match: no items in sl were matched by regex.
      -2      Invalid StartCol or EndCol argument value.
      -1nnn   The pattern in regex is invalid. nnn (the absolute value of the returned Status, minus 1000) gives the 1-based position of the character being scanned when the error was discovered. For example, a Status of -1005 means the error was discovered at position 5. The value for an error occurring at end-of-string is the length of the string + 1. Prior to Sirius Mods Version 7.0, an invalid regex results in a Status value of -1.
    If you omit this argument and a negative Status value is to be returned, the run is cancelled.
StartCol
    The StartCol argument (name required) is an optional number that specifies the starting column of the range of columns in which the matched string must be located. If specified, number must be greater than or equal to 1 and less than or equal to the EndCol argument value. If the argument is omitted, its default value is 1. If you specify a number value that is greater than the length of a particular sl item, the regex is matched against the empty string for that item.
EndCol
    The EndCol argument (name required) is an optional number that specifies the ending column of the range of columns in which the matched string must be located. If specified, number must be greater than or equal to 1, greater than or equal to the StartCol argument value, and less than or equal to the lesser of 6124 or the length of the sl item. If the EndCol argument is omitted, its default value is 6124.

    If the EndCol argument is omitted and an sl item exceeds 6124 bytes, the request is cancelled.
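
The Status values described above can be interpreted mechanically. The following is a minimal Python sketch of that interpretation, not SOUL code; the function name describe_status and its message wording are hypothetical and are introduced here only for illustration.

```python
def describe_status(status: int) -> str:
    """Interpret a RegexSubset Status value as described in the syntax terms.

    Hypothetical helper: the name and messages are illustrative assumptions.
    """
    if status > 0:
        return f"{status} item(s) matched"
    if status == 0:
        return "no items matched"
    if status == -2:
        return "invalid StartCol or EndCol value"
    if status == -1:
        # Documented behavior prior to Sirius Mods Version 7.0
        return "invalid regex (no position reported)"
    if status <= -1001:
        # -1nnn: absolute value minus 1000 is the 1-based error position
        position = abs(status) - 1000
        return f"invalid regex, error detected at position {position}"
    return f"unrecognized status {status}"


print(describe_status(-1005))
```

In this model, a pattern of length 7 that fails at end-of-string would be reported as -1008, since the documented position for that case is the length of the string + 1.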

Usage notes

  • All errors in RegexSubset, including invalid arguments, result in request cancellation.
  • It is strongly recommended that you protect your environment from regular expression processing demands on PDL and STBL space by setting, say, UTABLE LPDLST 3000 and UTABLE LSTBL 9000. See SOUL programming considerations.
  • The regex matching is limited to the first 6124 bytes of each item, but a matched item is copied in its entirety to the output subset.
  • Prior to copying matched items to %subsetList, any preexisting contents of that Stringlist are deleted.
  • For information about additional methods and $functions that support regular expressions, see "Regex Processing".
  • RegexSubset is available as of Sirius Mods Version 6.9.
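
The interaction between the column range and the copying of whole items can be modeled in a few lines. The following Python sketch is an approximation of the documented behavior, not the SOUL method itself: the function regex_subset, its parameter names, and the use of Python's re module (whose regex rules may differ from those in "Regex processing") are all assumptions. Only the I option is modeled, an invalid column range raises an exception instead of setting Status to -2, and the cancellation case for items longer than 6124 bytes is not modeled.

```python
import re

MAX_COL = 6124  # matching limit stated in this article


def regex_subset(items, pattern, start_col=1, end_col=MAX_COL, options=""):
    """Hypothetical Python model of StartCol/EndCol handling in RegexSubset."""
    if not (1 <= start_col <= end_col <= MAX_COL):
        # The article reports this condition as Status -2
        raise ValueError("invalid StartCol or EndCol")
    flags = re.IGNORECASE if "i" in options.lower() else 0
    compiled = re.compile(pattern, flags)
    subset = []
    for item in items:
        # Columns are 1-based and inclusive; a StartCol beyond the end of
        # the item leaves an empty string to match against.
        window = item[start_col - 1:end_col]
        if compiled.search(window):
            subset.append(item)  # the whole item is copied, not the window
    return subset


items = ["abcXYZ", "XYZabc", "ab"]
print(regex_subset(items, "xyz", start_col=4, end_col=6, options="I"))
print(regex_subset(items, "^$", start_col=3, end_col=3))
```

The second call illustrates the StartCol rule: the item "ab" is shorter than column 3, so the pattern is applied to the empty string, which "^$" matches.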

Examples

In the following code fragment, RegexSubset is applied to the method Stringlist %sl to find the %sl items that are matched by the regex %\([a-z]*\). The regex is designed to find items that contain shared methods whose class names contain only uppercase and lowercase letters; because the I option is specified, [a-z] also matches uppercase letters.

    ...
    %sl = new
    text to %sl
        b
        %doc is object xmlDoc
        %(daemon):getInputObject(%doc)
        %doc:selectSingleNode('/outer/inner'):addAttribute('foo','bar')
        %(daemon):returnObject(%doc)
        end
    end text
    %regex = '%\([a-z]*\)'
    %opt='i'
    %sl2 = %sl:RegexSubset (%regex, Options=%opt, Status=%st)
    If (%st EQ 0) then
       Print 'Status from RegexSubset is ' %st
    Else
       Print %regex ' matches the following items:'
    End If
    For %i from 1 to %sl2:Count
       Print 'Matching item ' %i ' is: ' %sl2:Item(%i)
    End For

This code would print the following:

    %\([a-z]*\) matches the following items:
    Matching item 1 is: %(daemon):getInputObject(%doc)
    Matching item 2 is: %(daemon):returnObject(%doc)
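
As a rough cross-check of this example, the same pattern can be tried with Python's re module. This is only an analogy: it assumes that this particular pattern behaves the same under Python's regex rules as under the rules described in "Regex processing", and the list items stands in for the contents of %sl.

```python
import re

# Stand-in for the items placed in %sl by the "text to %sl" block
items = [
    "b",
    "%doc is object xmlDoc",
    "%(daemon):getInputObject(%doc)",
    "%doc:selectSingleNode('/outer/inner'):addAttribute('foo','bar')",
    "%(daemon):returnObject(%doc)",
    "end",
]

# re.IGNORECASE plays the role of the I option
pattern = re.compile(r"%\([a-z]*\)", re.IGNORECASE)
matches = [item for item in items if pattern.search(item)]

for number, item in enumerate(matches, start=1):
    print(f"Matching item {number} is: {item}")
```

Only the two items that contain %(daemon) are selected; an item such as %doc is object xmlDoc is skipped because its percent sign is not followed by a parenthesized name.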

See also