RegexSubset (Stringlist function)

From m204wiki

Revision as of 15:24, 17 January 2011

Create subset of Stringlist that matches a regex (Stringlist class)


RegexSubset is a member of the Stringlist class.

This method returns a Stringlist that is a subset of the method Stringlist. The subset contains copies of all the items in the method Stringlist that are matched by a specified regex. Information about the regular expression matching rules observed is provided in Regex processing. The method is available as of Version 6.9 of the Sirius Mods. RegexSubset accepts one required and four optional arguments, and it returns a Stringlist. Specifying an invalid argument results in request cancellation.

Syntax

%subset = %sl:RegexSubset( regex, [Options= string], [Status= %output], - [StartCol= number], [EndCol= number]) Throws InvalidRegex

Syntax terms

%subset A Stringlist that contains the %sl items matched by regex.
%sl A Stringlist object.
regex A string that is interpreted as a regular expression and is applied to the %sl method Stringlist items to determine whether the regex finds a match.
StartCol= num1 The StartCol argument (name required) is an optional number that specifies the starting column of the range of columns in which the matched string must be located. If specified, num1 must be greater than or equal to 1 and less than or equal to the EndCol argument value. If the argument is omitted, its default value is 1. If you specify a num1 value that is greater than the length of a particular %sl item, regex is matched against the empty string for that item.
EndCol= num2 The EndCol argument (name required) is an optional number that specifies the ending column of the range of columns in which the matched string must be located. If specified, num2 must be greater than or equal to 1, greater than or equal to the StartCol argument value, and less than or equal to the lesser of 6124 or the length of the %sl item. If the EndCol argument is omitted, its default value is 6124.

If the EndCol argument is omitted and a %sl item exceeds 6124 bytes, the run is cancelled.

Options= string The Options argument (name required) is an optional string of options. The options are single letters, which may be specified in uppercase or lowercase, in any combination, and separated by blanks or not separated. For more information about these options, see Regex processing.
I Do case-insensitive matching between each %sl item and regex.
S Dot-All mode: a dot (.) can match any character, including carriage return and linefeed.
M Multi-line mode: let anchor characters match end-of-line indicators wherever the indicator appears in the input string. M mode is ignored if C (XML Schema) mode is specified.
C Do the match according to XML Schema regex rules. Each regex is implicitly anchored at the beginning and end, and no characters serve as anchors. For more information, see Regex processing.
Status= num The Status argument (name required) is optional; if specified, it is set to an integer code. These values are possible:
n The number of %sl items that are matched.
0 No match: no items in %sl were matched by regex.
-2 Invalid StartCol or EndCol argument value.
-1nnn The pattern in regex is invalid. nnn, the absolute value of the return minus 1000, gives the 1-based position of the character being scanned when the error was discovered. The value for an error occurring at end-of-string is the length of the string + 1. Prior to Version 7.0 of the Sirius Mods, an invalid regex results in a Status value of -1.

If you omit this argument and a negative Status value is to be returned, the run is cancelled.
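The Status codes listed above could be handled along the following lines. This is only a sketch in SOUL: the names %hits and %pos are hypothetical, and %sl, %regex, and %st are assumed to be declared and set elsewhere in the request.

```
%hits = %sl:RegexSubset(%regex, Status=%st)

* Assumption: %st and %pos are declared as Float
If %st ge 0 then
   Print %st ' item(s) matched'
ElseIf %st eq -2 then
   Print 'Invalid StartCol or EndCol value'
ElseIf %st le -1000 then
   %pos = -1000 - %st
   Print 'Invalid regex at position ' %pos
Else
   Print 'Invalid regex'
End If
```

The %pos calculation follows the -1nnn rule: a Status of -1005, for instance, gives position 5. The final Else branch covers the plain -1 returned for an invalid regex prior to Version 7.0 of the Sirius Mods.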

Usage notes

  • It is strongly recommended that you protect your environment from regex processing demands on PDL and STBL space by setting, say, UTABLE LPDLST 3000 and UTABLE LSTBL 9000. For further discussion of this, see User Language coding considerations.
  • The regex matching is limited to the first 6124 bytes of each item, but a matched item is copied in its entirety to the output subset.
  • Prior to copying matched items to %subset, any preexisting contents of that Stringlist are deleted.
  • For information about additional methods and $functions that support regular expressions, see Regex processing.

Examples

In the following code fragment, RegexSubset is applied to the method Stringlist %sl to find the %sl items that are matched by the regex %\([a-z]*\). The regex is designed to find items that contain shared methods whose class names contain only upper and lowercase letters.

...
%sl = new
text to %sl
b
%doc is object xmlDoc
%(daemon):getInputObject(%doc)
%doc:selectSingleNode('/outer/inner'):addAttribute('foo','bar')
%(daemon):returnObject(%doc)
end
end text

%regex = '%\([a-z]*\)'
%opt = 'i'
%sl2 = %sl:RegexSubset(%regex, Options=%opt, Status=%st)

If (%st EQ 0) then
   Print 'Status from RegexSubset is ' %st
Else
   Print %regex ' matches the following items:'
End If
For %i from 1 to %sl2:Count
   Print 'Matching item ' %i ' is: ' %sl2:Item(%i)
End For
...

This code would print the following:

%\([a-z]*\) matches the following items:
Matching item 1 is: %(daemon):getInputObject(%doc)
Matching item 2 is: %(daemon):returnObject(%doc)
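
As a further illustration, the StartCol and EndCol arguments might be used to confine matching to a fixed column range. The fragment below is a sketch only: the sample data and the name %hits are invented for illustration, and the behavior described afterward is inferred from the argument descriptions under Syntax terms rather than taken from a verified run.

```
%sl is object Stringlist
%hits is object Stringlist
%st is float

%sl = new
%sl:Add('0001 ALPHA first record')
%sl:Add('0002 BETA  second record')
%sl:Add('0003 GAMMA mentions ALPHA later')

* Only a match located within columns 6 through 10 qualifies
%hits = %sl:RegexSubset('ALPHA', StartCol=6, EndCol=10, Status=%st)
Print 'Items matched: ' %st
```

Under the column rules described above, only the first item should qualify, because the third item contains ALPHA outside columns 6 through 10. As the usage notes state, a matched item is copied to %hits in its entirety, not just the matched columns.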