ParseLines (Stringlist subroutine): Difference between revisions

From m204wiki
m (1 revision)
mNo edit summary
 
(19 intermediate revisions by 6 users not shown)
Line 5: Line 5:
==Syntax==
==Syntax==
{{Template:Stringlist:ParseLines syntax}}
{{Template:Stringlist:ParseLines syntax}}
===Syntax terms===
===Syntax terms===
<table class="syntaxTable">
<table class="syntaxTable">
<tr><th>sl</th>
<tr><th>sl</th>
<td>A <var>Stringlist</var> object. </td></tr>
<td>A <var>Stringlist</var> object; the parsed lines are appended as items to <var class="term">sl</var>. </td></tr>
 
<tr><th>string</th>
<tr><th>string</th>
<td>The string to be parsed. </td></tr>
<td>The string to be parsed. </td></tr>
<tr><th>delims</th>
<tr><th>delims</th>
<td>A list of one or more delimiters to be used as line delimiters. The first character in the delimiter list is the separator character for the delimiter list itself. If this optional argument is not specified, the delimiter list used contains X'0D', X'25', and X'0D25' (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters), and it should handle any line-end delimited data from an ASCII host that is been translated to ASCII. </td></tr>
<td>An optional list of one or more delimiters to be used as line delimiters. If specified, the first character in the delimiter list is the separator character for the delimiter list itself, and the remainder are the delimiters used to split the <var class="term">string</var> being parsed. If this argument is not specified, the delimiter list contains <code>X'0D'</code>, <code>X'25'</code>, and <code>X'0D25'</code> (the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters, respectively), which should handle any line-end delimited data from an ASCII host that has been translated to EBCDIC. </td></tr>
<tr><th><b>StripTrailingNull=</b> boolean</th>
 
<td>This name required argument (<tt>.StripTrailingNull</tt>) is a boolean value that indicates whether a trailing null line should be stripped.<tt>.StripTrailingNull</tt> is an optional argument that defaults to <tt>.true</tt>, which results in a trailing null line being stripped.</td></tr>
<tr><th><var>StripTrailingNull</var></th>
<td>This name-required argument, <var>StripTrailingNull</var>, is a <var>boolean</var> value that indicates whether a trailing null line should be stripped. <var>StripTrailingNull</var> is optional and defaults to <var>true</var>, which results in a trailing null line being stripped.</td></tr>
</table>
</table>


Line 20: Line 24:
<ul>
<ul>
<li>All errors in <var>ParseLines</var> result in request cancellation.
<li>All errors in <var>ParseLines</var> result in request cancellation.
<li>The default for trailing null-line stripping for <var>ParseLines</var> is based on the fact that most ASCII applications end the last line of a file with the line-end character(s). Not stripping the trailing null line would therefore create a <var>Stringlist</var> item where no line was intended, which would likely be an annoyance. Similarly, the <var>[[CreateLines (Stringlist function)|CreateLines]]</var> method ends its output string with the line delimiter by default.
<p>
Still, the default trailing null-line handling is not necessarily the most "correct" in some absolute sense. Specifically, if <var>ParseLines</var> is used to load a <var>Stringlist</var> and the original input did not end with a delimiter, a subsequent <var>CreateLines</var> with the same delimiter character produces output that differs from the input to <var>ParseLines</var>: a delimiter is added. Using <var>False</var> for both the <var>ParseLines</var> <var>StripTrailingNull</var> parameter and the <var>CreateLines</var> <var>AddTrailingDelimiter</var> parameter ensures reversibility of these two methods.</p>
</ul>
==Examples==
<ol>
<li>The following loads the names of three popular sandwich contents onto separate <var>Stringlist</var> items:
<li>The following loads the names of three popular sandwich contents onto separate <var>Stringlist</var> items:


<p class="code">%contents is object stringList
<p class="code">%contents is object stringList
%contents = new
...
...
%contentsarseLines('Bacon/Lettuce/Tomato', ' /')
%contents:parseLines('Bacon/Lettuce/Tomato', ' /')
</p>
</p>


The space before the slash character is the delimiter-set delimiter character, that is, the character used to separate multiple delimiters, should there be more than one. As this example illustrates, the <var>ParseLines</var> method can be used for general-purpose parsing, though this was not really its intent and it is somewhat limited for this purpose. Its main use is expected to be in parsing text received into a longstring via $web_input_content, $web_file_content, or $web_output_content. As such, the default delimiter set for the <var>ParseLines</var> method is X'0D', X'25', and X'0D25' (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters).
The space before the slash character is the delimiter-set delimiter character, that is, the character used to separate multiple delimiters, should there be more than one. As this example illustrates, the <var>ParseLines</var> method can be used for general-purpose parsing, although this was not really its intent and it is somewhat limited for this purpose. Its main use is expected to be in parsing text received into a string via <var>[[$Web_Input_Content]]</var>, <var>[[$Web_File_Content]]</var>, or <var>[[$Web_Output_Content]]</var>. Accordingly, the default delimiter set for the <var>ParseLines</var> method is <code>X'0D'</code>, <code>X'25'</code>, and <code>X'0D25'</code> (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters respectively).
<li>In the following example, the contents of the file being uploaded via form field <tt>.MYFILE</tt> are moved to a <var>Stringlist</var>:
<li>In the following example, the contents of the file being uploaded via form field <code>MYFILE</code> are moved to a <var>Stringlist</var>:


<p class="code">%contents is object stringList
<p class="code">%contents is object stringList
Line 36: Line 49:
...
...
%contents = new
%contents = new
%contentsarseLines(%myfile)
%contents:parseLines(%myfile)
</p>
</p>


It would be possible (though probably awkward) to process the file in its native ASCII format, though the ASCII line-feed character is different from the EBCDIC line-feed character (interestingly, carriage return is the same), so a different delimiter set would have to be used:
It would be possible (though probably awkward) to process the file in its native ASCII format, although the ASCII line-feed character is different from the EBCDIC line-feed character (interestingly, carriage return is the same), so a different delimiter set would have to be used:


<p class="code">%asciiLinend is string len 32 static -
<p class="code">%asciiLinend is string len 32 static -
      initial('/' with $x2c('0D') with -
initial('/' with $x2c('0D') with -
      '/' with $x2c('0A') with -
        '/' with $x2c('0A') with -
      '/' with $x2c('0A0D'))
        '/' with $x2c('0D0A'))
%contents is object stringList
%contents is object stringList
%myfile is longstring
%myfile is longstring
Line 51: Line 64:
...
...
%contents = new
%contents = new
%contentsarseLines(%myfile, %asciiLinend)
%contents:parseLines(%myfile, %asciiLinend)
</p>
</p>


<li>As with the default delimiter set, one delimiter can sometimes be contained in another. That is, for example, the carriage-return and line-feed are part of the carriage-return/line-feed delimiter. In such cases, the longest matching delimiter is used.
<li>As with the default delimiter set, one delimiter can sometimes be contained in another. That is, for example, the carriage-return and line-feed are part of the carriage-return/line-feed delimiter. In such cases, the longest matching delimiter is used regardless of the order in which the delimiters are specified. For example, consider the following statement:
For example, consider the following statement:


<p class="code">%listarseLines('ab/+/cd+/ef+gh', ' /+/ +/ +')
<p class="code">%list:parseLines('ab/+/cd+/ef+gh', ' /+/ +/ +')
</p>
</p>


The resulting <var>Stringlist</var> would contain items "ab", "cd", "ef", and "gh", respectively. This is regardless of the order in which the delimiters are specified. The character used to separate the delimiters is irrelevant, other than the obvious fact that it can't appear in any of the delimiters.
The resulting <var>Stringlist</var> would contain items <code>'ab'</code>, <code>'cd'</code>, <code>'ef'</code>, and <code>'gh'</code>, respectively. The character used to separate the delimiters is irrelevant, other than the obvious fact that it can't appear in any of the delimiters.
<li>The default behavior of <var>ParseLines</var> is to strip a trailing null line. The following would add three items to the <var>Stringlist</var>: "a", "b", and "c".
<li>The default behavior of <var>ParseLines</var> is to strip a trailing null line. The following would add three items to the <var>Stringlist</var>: <code>'a'</code>, <code>'b'</code>, and <code>'c'</code>.


<p class="code">%listarseLines('a/b/c/', ' /')
<p class="code">%list:parseLines('a/b/c/', ' /')
</p>
</p>


This would produce results indistinguishable from:
This would produce results indistinguishable from:


<p class="code">%listarseLines('a/b/c', ' /')
<p class="code">%list:ParseLines('a/b/c', ' /')
</p>
</p>


If the distinction between these two cases is important, the<tt>.StripTrailingNull</tt> argument to <var>ParseLines</var> should be set to <tt>.false</tt>:
If the distinction between these two cases is important, the <var>StripTrailingNull</var> argument to <var>ParseLines</var> should be set to <var>False</var>:


<p class="code">%listarseLines('a/b/c/', ' /', stripTrailingNull=false)
<p class="code">%list:ParseLines('a/b/c/', ' /', stripTrailingNull=false)
</p>
</p></ol>


The default for trailing null-line stripping for <var>ParseLines</var> is based on the fact that most ASCII applications end the last line of a file with the line-end character(s). As such, not stripping the trailing null-line would create a <var>Stringlist</var> item where a line was not intended, so it would likely be a bit of an annoyance. Similarly, the [[CreateLines (Stringlist function)]] also ends the output string with the line delimiter by default. Still, the default trailing null-line handling is necessarily the most "correct" in some absolute sense. Specifically, if a <var>ParseLines</var> operation is used to load a <var>Stringlist</var>, the result of a CreateLines with the same delimiter character would produce an output different from the input to <var>ParseLines</var>, if the original input did not end with a delimiter -- a delimiter would be added. Using <tt>.false</tt> for both the <var>ParseLines</var> null-line-stripping parameter and the CreateLines add-terminating-delimiter parameter would ensure reversability of these two methods.
==See also==
<li>The <var>ParseLines</var> method is available in <var class=product>Sirius Mods</var> Version 6.7 and later.</ul>
<ul>
<li>The intrinsic <var>String</var> class <var>[[ParseLines (String function)|ParseLines]]</var> function
</ul>


==See also==
{{Template:Stringlist:ParseLines footer}}
{{Template:Stringlist:ParseLines footer}}

Latest revision as of 17:48, 27 August 2014

Parse delimited string, appending to this Stringlist (Stringlist class)


This method is a subroutine that parses a delimited string into a Stringlist object. Generally, as the name of the method suggests, the string contains lines delimited by line-end characters.

Syntax

sl:ParseLines( string, [delims], [StripTrailingNull= boolean])

Syntax terms

sl A Stringlist object; the parsed lines are appended as items to sl.
string The string to be parsed.
delims An optional list of one or more delimiters to be used as line delimiters. If specified, the first character in the delimiter list is the separator character for the delimiter list itself, and the remainder are the delimiters used to split the string being parsed. If this argument is not specified, the delimiter list contains X'0D', X'25', and X'0D25' (the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters, respectively), which should handle any line-end delimited data from an ASCII host that has been translated to EBCDIC.
StripTrailingNull This name-required argument, StripTrailingNull, is a boolean value that indicates whether a trailing null line should be stripped. StripTrailingNull is optional and defaults to true, which results in a trailing null line being stripped.
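
The way the delims argument encodes its own separator can be easier to see in code. The following is a minimal Python sketch of that rule, not SOUL and not the Model 204 implementation. The function name split_delimiter_list is hypothetical, and the handling of empty input and empty entries is an assumption, since the description above does not cover those cases.

```python
# Hypothetical model of how a ParseLines-style "delims" argument is read.
# The default set uses the EBCDIC code points named in the text above.
DEFAULT_DELIMITERS = ["\x0d", "\x25", "\x0d\x25"]  # CR, LF, CR/LF


def split_delimiter_list(delims=None):
    """Return the individual delimiters encoded in a delimiter list.

    The first character of delims is the separator for the list itself;
    everything after it is split on that separator.
    """
    # Assumption: a missing or empty argument selects the default set.
    if not delims:
        return list(DEFAULT_DELIMITERS)
    separator, rest = delims[0], delims[1:]
    # Assumption: empty entries (for example from a doubled separator)
    # are ignored.
    return [entry for entry in rest.split(separator) if entry]


print(split_delimiter_list(" /"))         # one delimiter: '/'
print(split_delimiter_list(" /+/ +/ +"))  # three delimiters: '/+/', '+/', '+'
```

Because the separator is taken from the argument itself, any character can be used as long as it does not occur inside one of the delimiters.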

Usage notes

  • All errors in ParseLines result in request cancellation.
  • The default for trailing null-line stripping for ParseLines is based on the fact that most ASCII applications end the last line of a file with the line-end character(s). Not stripping the trailing null line would therefore create a Stringlist item where no line was intended, which would likely be an annoyance. Similarly, the CreateLines method ends its output string with the line delimiter by default.

    Still, the default trailing null-line handling is not necessarily the most "correct" in some absolute sense. Specifically, if ParseLines is used to load a Stringlist and the original input did not end with a delimiter, a subsequent CreateLines with the same delimiter character produces output that differs from the input to ParseLines: a delimiter is added. Using False for both the ParseLines StripTrailingNull parameter and the CreateLines AddTrailingDelimiter parameter ensures reversibility of these two methods.
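
The round-trip point in the last note can be illustrated with a small Python model. This is a sketch under stated assumptions, not the Model 204 code: parse_lines and create_lines are hypothetical stand-ins for ParseLines and CreateLines, restricted to a single delimiter, and their keyword arguments only mirror the StripTrailingNull and AddTrailingDelimiter parameters named above.

```python
# Hypothetical single-delimiter models of ParseLines and CreateLines.

def parse_lines(string, delim, strip_trailing_null=True):
    """Split string on delim, optionally dropping a trailing empty item."""
    items = string.split(delim)
    if strip_trailing_null and items and items[-1] == "":
        items.pop()
    return items


def create_lines(items, delim, add_trailing_delimiter=True):
    """Join items with delim, optionally appending a final delimiter."""
    text = delim.join(items)
    if add_trailing_delimiter:
        text += delim
    return text


def round_trip(string, delim, defaults=True):
    """Parse and re-create string, with both flags at their defaults or both False."""
    items = parse_lines(string, delim, strip_trailing_null=defaults)
    return create_lines(items, delim, add_trailing_delimiter=defaults)


# With the defaults, input lacking a final delimiter comes back with one.
print(round_trip("a/b/c", "/"))                   # a/b/c/
# With both flags False, either form of input is reproduced exactly.
print(round_trip("a/b/c", "/", defaults=False))   # a/b/c
print(round_trip("a/b/c/", "/", defaults=False))  # a/b/c/
```

In this model the defaults map both 'a/b/c' and 'a/b/c/' to the same three items, which is why the original form cannot be recovered unless both flags are turned off.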

Examples

  1. The following loads the names of three popular sandwich contents onto separate Stringlist items:

    %contents is object stringList
    %contents = new
    ...
    %contents:parseLines('Bacon/Lettuce/Tomato', ' /')

    The space before the slash character is the delimiter-set delimiter character, that is, the character used to separate multiple delimiters, should there be more than one. As this example illustrates, the ParseLines method can be used for general-purpose parsing, although this was not really its intent and it is somewhat limited for this purpose. Its main use is expected to be in parsing text received into a string via $Web_Input_Content, $Web_File_Content, or $Web_Output_Content. Accordingly, the default delimiter set for the ParseLines method is X'0D', X'25', and X'0D25' (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters respectively).

  2. In the following example, the contents of the file being uploaded via form field MYFILE are moved to a Stringlist:

    %contents is object stringList
    %myfile is longstring
    ...
    %myfile = $web_file_content('MYFILE', ,'text')
    ...
    %contents = new
    %contents:parseLines(%myfile)

    It would be possible (though probably awkward) to process the file in its native ASCII format. However, the ASCII line-feed character (X'0A') differs from the EBCDIC line-feed character (X'25'), while carriage return is X'0D' in both, so a different delimiter set would have to be used:

    * ASCII CR, LF, and CR/LF, each preceded by the '/' list separator
    %asciiLinend is string len 32 static -
          initial('/' with $x2c('0D') with -
          '/' with $x2c('0A') with -
          '/' with $x2c('0D0A'))
    %contents is object stringList
    %myfile is longstring
    ...
    %myfile = $web_file_content('MYFILE')
    ...
    %contents = new
    %contents:parseLines(%myfile, %asciiLinend)

  3. As with the default delimiter set, one delimiter can sometimes be contained in another. That is, for example, the carriage-return and line-feed are part of the carriage-return/line-feed delimiter. In such cases, the longest matching delimiter is used regardless of the order in which the delimiters are specified. For example, consider the following statement:

    %list:parseLines('ab/+/cd+/ef+gh', ' /+/ +/ +')

    The resulting Stringlist would contain items 'ab', 'cd', 'ef', and 'gh', respectively. The character used to separate the delimiters is irrelevant, other than the obvious fact that it can't appear in any of the delimiters.

  4. The default behavior of ParseLines is to strip a trailing null line. The following would add three items to the Stringlist: 'a', 'b', and 'c'.

    %list:parseLines('a/b/c/', ' /')

    This would produce results indistinguishable from:

    %list:ParseLines('a/b/c', ' /')

    If the distinction between these two cases is important, the StripTrailingNull argument to ParseLines should be set to False:

    %list:ParseLines('a/b/c/', ' /', stripTrailingNull=false)
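
The longest-match rule from example 3 and the trailing null-line handling from example 4 can be modeled together in a few lines of Python. This is only a sketch of the behavior described on this page, not the actual ParseLines implementation: the function parse_lines is hypothetical, it takes the delimiters as an already-split list instead of a delimiter-list string, and skipping empty delimiters is an assumption.

```python
# Hypothetical model of ParseLines: longest matching delimiter wins,
# with optional stripping of a trailing null (empty) line.

def parse_lines(string, delimiters, strip_trailing_null=True):
    """Split string at delimiters, preferring the longest match at each position."""
    # Longest first, so '/+/' is tried before '+/' and '+' no matter how
    # the caller ordered the list. Assumption: empty delimiters are skipped.
    ordered = sorted((d for d in delimiters if d), key=len, reverse=True)
    items = []
    start = 0  # start of the item currently being collected
    i = 0
    while i < len(string):
        match = next((d for d in ordered if string.startswith(d, i)), None)
        if match is None:
            i += 1
            continue
        items.append(string[start:i])
        i += len(match)
        start = i
    items.append(string[start:])
    if strip_trailing_null and items[-1] == "":
        items.pop()
    return items


# Example 3: the same items result for any ordering of the delimiters.
print(parse_lines("ab/+/cd+/ef+gh", ["/+/", "+/", "+"]))  # ['ab', 'cd', 'ef', 'gh']
print(parse_lines("ab/+/cd+/ef+gh", ["+", "+/", "/+/"]))  # ['ab', 'cd', 'ef', 'gh']

# Example 4: the trailing empty item appears only when stripping is off.
print(parse_lines("a/b/c/", ["/"]))                             # ['a', 'b', 'c']
print(parse_lines("a/b/c/", ["/"], strip_trailing_null=False))  # ['a', 'b', 'c', '']
```

Sorting by length is one simple way to get order-independent results; a first-listed-wins approach would instead split 'cd+/ef' at '+' and leave a stray '/' at the start of the next item.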

See also