ParseLines (Stringlist subroutine)
Latest revision as of 17:48, 27 August 2014
Parse delimited string, appending to this Stringlist (Stringlist class)
This method is a subroutine used to parse a delimited string into a Stringlist object. Generally, as the name of the method suggests, the string would contain lines delimited by line-end characters.
Syntax
sl:ParseLines( string, [delims], [StripTrailingNull= boolean])
Syntax terms
Term | Description |
---|---|
sl | A Stringlist object; the parsed lines are appended as items to sl. |
string | The string to be parsed. |
delims | An optional list of one or more delimiters to be used as line delimiters. If specified, the first character in the delimiter list is the separator character for the delimiter list itself, and the remainder are the delimiters used to separate the string being parsed. If this argument is not specified, the delimiter list contains X'0D', X'25', and X'0D25' (the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters, respectively), which should handle any line-end delimited data from an ASCII host that has been translated to EBCDIC. |
StripTrailingNull | This name-required argument is a Boolean value that indicates whether a trailing null line should be stripped. It is optional and defaults to True, which results in a trailing null line being stripped. |
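The delims convention described above (first character separates the entries, the rest are the delimiters) can be sketched in Python. This is only an illustrative model, not SOUL code and not the ParseLines implementation; the function name split_delims and its handling of empty entries are assumptions made for the example.

```python
# Illustrative Python model only; not SOUL and not the ParseLines implementation.
from typing import List


def split_delims(delims: str) -> List[str]:
    """Decode a delimiter list whose first character separates its entries."""
    if not delims:
        raise ValueError("delimiter list must not be empty")
    separator, rest = delims[0], delims[1:]
    # Assumption: empty entries (for example from doubled separators) are ignored.
    return [entry for entry in rest.split(separator) if entry]


print(split_delims(" /"))          # ['/']
print(split_delims(" /+/ +/ +"))   # ['/+/', '+/', '+']
print(split_delims("|,|;"))        # [',', ';']
```

Because the separator is chosen by the caller, any character that does not occur inside a delimiter can be used, as the third call shows.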
Usage notes
- All errors in ParseLines result in request cancellation.
- The default for trailing null-line stripping for ParseLines is based on the fact that most ASCII applications end the last line of a file with the line-end character(s). Not stripping the trailing null line would therefore create a Stringlist item where no line was intended, which would likely be an annoyance. Similarly, the Createlines method ends the output string with the line delimiter by default.
Still, the default trailing null-line handling is not necessarily the most "correct" in some absolute sense. Specifically, if ParseLines is used to load a Stringlist and the original input did not end with a delimiter, a subsequent Createlines with the same delimiter character produces output that differs from the input to ParseLines: a delimiter is added. Using False for both the ParseLines StripTrailingNull parameter and the Createlines AddTrailingDelimiter parameter ensures reversibility of these two methods.
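The reversibility point can be illustrated with a small Python model. It is a sketch under stated assumptions, not SOUL: parse_lines and create_lines are hypothetical stand-ins for ParseLines and Createlines, only a single delimiter is modeled, and edge cases such as empty input are not claimed to match the real methods.

```python
# Illustrative Python model only; not SOUL. Single-delimiter case for brevity.
from typing import List


def parse_lines(text: str, delim: str, strip_trailing_null: bool = True) -> List[str]:
    """Hypothetical stand-in for ParseLines with one delimiter."""
    items = text.split(delim)
    # A delimiter at the very end leaves one empty ("null") final item.
    if strip_trailing_null and items and items[-1] == "":
        items.pop()
    return items


def create_lines(items: List[str], delim: str, add_trailing_delimiter: bool = True) -> str:
    """Hypothetical stand-in for Createlines with one delimiter."""
    text = delim.join(items)
    return text + delim if add_trailing_delimiter else text


for original in ("a/b/c", "a/b/c/"):
    defaults = create_lines(parse_lines(original, "/"), "/")
    exact = create_lines(parse_lines(original, "/", False), "/", False)
    print(repr(original), "->", repr(defaults), "|", repr(exact))
# 'a/b/c' -> 'a/b/c/' | 'a/b/c'
# 'a/b/c/' -> 'a/b/c/' | 'a/b/c/'
```

With the defaults, both inputs come back as 'a/b/c/'; with both options set to False, each input is reproduced exactly.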
Examples
- The following loads the names of three popular sandwich contents onto separate Stringlist items:
%contents is object stringList
%contents = new
...
%contents:parseLines('Bacon/Lettuce/Tomato', ' /')
The space before the slash character is the delimiter-set delimiter character, that is, the character used to separate multiple delimiters, should there be more than one. As this example illustrates, the ParseLines method can be used for general-purpose parsing, although this was not really its intent and it is somewhat limited for this purpose. Its main use is expected to be in parsing text received into a string via $Web_Input_Content, $Web_File_Content, or $Web_Output_Content. Accordingly, the default delimiter set for the ParseLines method is X'0D', X'25', and X'0D25' (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters respectively).
- In the following example, the contents of the file being uploaded via form field MYFILE are moved to a Stringlist:
%contents is object stringList
%myfile is longstring
...
%myfile = $web_file_content('MYFILE', ,'text')
...
%contents = new
%contents:parseLines(%myfile)
It would be possible (though probably awkward) to process the file in its native ASCII format, although the ASCII line-feed character is different from the EBCDIC line-feed character (interestingly, carriage return is the same), so a different delimiter set would have to be used:
%asciiLinend is string len 32 static -
   initial('/' with $x2c('0D') with -
           '/' with $x2c('0A') with -
           '/' with $x2c('0D0A'))
%contents is object stringList
%myfile is longstring
...
%myfile = $web_file_content('MYFILE')
...
%contents = new
%contents:parseLines(%myfile, %asciiLinend)
- As with the default delimiter set, one delimiter can sometimes be contained in another. That is, for example, the carriage-return and line-feed are part of the carriage-return/line-feed delimiter. In such cases, the longest matching delimiter is used regardless of the order in which the delimiters are specified. For example, consider the following statement:
%list:parseLines('ab/+/cd+/ef+gh', ' /+/ +/ +')
The resulting Stringlist would contain items 'ab', 'cd', 'ef', and 'gh', respectively. The character used to separate the delimiters is irrelevant, other than the obvious fact that it can't appear in any of the delimiters.
- The default behavior of ParseLines is to strip a trailing null line. The following would add three items to the Stringlist: 'a', 'b', and 'c'.
%list:parseLines('a/b/c/', ' /')
This would produce results indistinguishable from:
%list:ParseLines('a/b/c', ' /')
If the distinction between these two cases is important, the StripTrailingNull argument to ParseLines should be set to False:
%list:ParseLines('a/b/c/', ' /', stripTrailingNull=false)
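The longest-match rule and the trailing null-line option from the examples above can be modeled together in Python. This is a hedged sketch, not SOUL and not the actual ParseLines implementation: the function parse_lines is hypothetical, and it swaps in Python's re.split, with alternatives ordered longest first, as one way to obtain longest-match behavior.

```python
# Illustrative Python model only; not SOUL and not the ParseLines implementation.
import re
from typing import List


def parse_lines(text: str, delims: str, strip_trailing_null: bool = True) -> List[str]:
    """Hypothetical model: split text on a delimiter list, longest match first."""
    # The first character of delims separates the delimiter entries.
    entries = [d for d in delims[1:].split(delims[0]) if d]
    if not entries:
        raise ValueError("delimiter list contains no delimiters")
    # Try longer delimiters first so that, for example, CR/LF wins over CR alone,
    # whatever order the caller listed them in.
    entries.sort(key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in entries)
    items = re.split(pattern, text)
    # A delimiter at the very end leaves one empty ("null") final item.
    if strip_trailing_null and items and items[-1] == "":
        items.pop()
    return items


print(parse_lines("ab/+/cd+/ef+gh", " /+/ +/ +"))   # ['ab', 'cd', 'ef', 'gh']
print(parse_lines("ab/+/cd+/ef+gh", " + +/ /+/"))   # ['ab', 'cd', 'ef', 'gh']
print(parse_lines("a/b/c/", " /"))                  # ['a', 'b', 'c']
print(parse_lines("a/b/c/", " /", False))           # ['a', 'b', 'c', '']
```

The first two calls differ only in the order of the delimiters and give the same items. The last call shows the extra null item that distinguishes 'a/b/c/' from 'a/b/c' when stripping is turned off.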
See also
- The intrinsic String class ParseLines function