ParseLines (Stringlist subroutine): Difference between revisions

From m204wiki

Revision as of 14:54, 24 November 2010

Parse delimited string into Stringlist

This subroutine parses a delimited string into a Stringlist object. Generally, as the name of the method suggests, the string would contain lines delimited by line-end characters.

ParseLines is a member of the Stringlist class.

ParseLines Syntax

%sl:ParseLines( string, [delims] [, stripTrailingNull=boolean] )

Syntax Terms

%sl
A Stringlist object.
string
The string to be parsed.
delims
A list of one or more delimiters to be used as line delimiters. The first character in the delimiter list is the separator character for the delimiter list itself. If this optional argument is not specified, the delimiter list used contains X'0D', X'25', and X'0D25' (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters). This default should handle any line-end delimited data from an ASCII host that has been translated to EBCDIC.
StripTrailingNull= boolean
This name-required argument (StripTrailingNull) is a boolean value that indicates whether a trailing null line should be stripped. StripTrailingNull is an optional argument that defaults to true, which results in a trailing null line being stripped.
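
The delimiter-list format described for delims can be illustrated outside Model 204. The following Python sketch is only a model of that description, not the Sirius implementation: the function name split_delimiter_list is hypothetical, error handling is not modelled, and Python's cp037 codec is used merely as a stand-in for whichever EBCDIC code page a site actually uses.

```python
def split_delimiter_list(spec: str) -> list[str]:
    """Split a delimiter-list string whose first character is its own separator."""
    if not spec:
        raise ValueError("delimiter list must not be empty")
    separator, body = spec[0], spec[1:]
    return body.split(separator)


# Delimiter lists taken from the examples on this page.
print(split_delimiter_list(" /"))         # ['/']
print(split_delimiter_list(" /+/ +/ +"))  # ['/+/', '+/', '+']

# The default delimiters written as EBCDIC bytes: CR, LF, and CR/LF.
DEFAULT_DELIMS = [b"\x0d", b"\x25", b"\x0d\x25"]

# cp037 is an assumption standing in for the site's EBCDIC code page.
assert "\r".encode("cp037") == b"\x0d"  # same code point as ASCII CR
assert "\n".encode("cp037") == b"\x25"  # differs from ASCII LF (0x0A)
```

Because the separator is simply whatever character comes first, any character that does not occur inside a delimiter can be used for it.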

Usage Notes

  • All errors in ParseLines result in request cancellation.
  • The following loads the names of three popular sandwich contents onto separate Stringlist items:
    %contents is object stringList
    ...
    %contents:ParseLines('Bacon/Lettuce/Tomato', ' /')
    

    The space before the slash character is the delimiter-set delimiter character, that is, the character used to separate multiple delimiters, should there be more than one. As this example illustrates, the ParseLines method can be used for general-purpose parsing, though this was not really its intent and it is somewhat limited for this purpose. Its main use is expected to be in parsing text received into a longstring via $web_input_content, $web_file_content, or $web_output_content. As such, the default delimiter set for the ParseLines method is X'0D', X'25', and X'0D25' (which are the EBCDIC carriage-return, line-feed, and carriage-return/line-feed characters).

  • In the following example, the contents of the file being uploaded via form field MYFILE are moved to a Stringlist:
    %contents is object stringList
    %myfile is longstring
    ...
    %myfile = $web_file_content('MYFILE', ,'text')
    ...
    %contents = new
    %contents:ParseLines(%myfile)
    

    It would be possible (though probably awkward) to process the file in its native ASCII format. However, the ASCII line-feed character (X'0A') differs from the EBCDIC line-feed character (X'25'), although carriage return (X'0D') is the same in both, so a different delimiter set would have to be used:

    %asciiLinend is string len 32 static -
          initial('/' with $x2c('0D') with -
          '/' with $x2c('0A') with -
          '/' with $x2c('0D0A'))
    %contents is object stringList
    %myfile is longstring
    ...
    %myfile = $web_file_content('MYFILE')
    ...
    %contents = new
    %contents:ParseLines(%myfile, %asciiLinend)
    
  • As with the default delimiter set, one delimiter can sometimes be contained in another. That is, for example, the carriage-return and line-feed are part of the carriage-return/line-feed delimiter. In such cases, the longest matching delimiter is used. For example, consider the following statement:
    %list:ParseLines('ab/+/cd+/ef+gh', ' /+/ +/ +')
    

    The resulting Stringlist would contain items "ab", "cd", "ef", and "gh", respectively. This is regardless of the order in which the delimiters are specified. The character used to separate the delimiters is irrelevant, other than the obvious fact that it can't appear in any of the delimiters.

  • The default behavior of ParseLines is to strip a trailing null line. The following would add three items to the Stringlist: "a", "b", and "c".
    %list:ParseLines('a/b/c/', ' /')
    

    This would produce results indistinguishable from:

    %list:ParseLines('a/b/c', ' /')
    

    If the distinction between these two cases is important, the StripTrailingNull argument to ParseLines should be set to false:

    %list:ParseLines('a/b/c/', ' /', stripTrailingNull=false)
    

    The default for trailing null-line stripping for ParseLines is based on the fact that most ASCII applications end the last line of a file with the line-end character(s). As such, not stripping the trailing null line would create a Stringlist item where a line was not intended, which would likely be an annoyance. Similarly, the CreateLines (Stringlist function) ends the output string with the line delimiter by default. Still, the default trailing null-line handling is not necessarily the most "correct" in some absolute sense. Specifically, if a ParseLines operation is used to load a Stringlist and the original input did not end with a delimiter, a CreateLines with the same delimiter character would produce output different from the input to ParseLines, since a delimiter would be added. Using false for both the ParseLines null-line-stripping parameter and the CreateLines add-terminating-delimiter parameter would ensure reversibility of these two methods.

  • The ParseLines method is available in Sirius Mods Version 6.7 and later.
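
The longest-match rule and the trailing-null handling described in the usage notes above can be modelled in a few lines of Python. This is a sketch of the documented behaviour only, not the Sirius implementation. The names parse_lines, create_lines, strip_trailing_null, and add_trailing_delim are hypothetical, and delimiters are passed as an already-split list. Behaviour for cases this page does not describe (such as empty input) is an assumption.

```python
def parse_lines(text, delims, strip_trailing_null=True):
    """Split text at any delimiter, preferring the longest one that matches."""
    # Try longer delimiters first, so CR/LF wins over CR or LF alone,
    # regardless of the order in which the delimiters were listed.
    ordered = sorted((d for d in delims if d), key=len, reverse=True)
    items, start, pos = [], 0, 0
    while pos < len(text):
        match = next((d for d in ordered if text.startswith(d, pos)), None)
        if match is None:
            pos += 1
            continue
        items.append(text[start:pos])
        pos += len(match)
        start = pos
    items.append(text[start:])
    if strip_trailing_null and items[-1] == "":
        items.pop()  # drop the null line left by a terminating delimiter
    return items


def create_lines(items, delim, add_trailing_delim=True):
    """Model of joining items back into one string (hypothetical parameter name)."""
    return delim.join(items) + (delim if add_trailing_delim else "")


print(parse_lines("ab/+/cd+/ef+gh", ["+", "+/", "/+/"]))  # ['ab', 'cd', 'ef', 'gh']
print(parse_lines("a/b/c/", ["/"]))                       # ['a', 'b', 'c']
print(parse_lines("a/b/c/", ["/"], strip_trailing_null=False))  # ['a', 'b', 'c', '']

# With the defaults, input lacking a final delimiter does not round-trip.
print(create_lines(parse_lines("a/b/c", ["/"]), "/"))     # a/b/c/
```

In this model, turning off both the stripping and the added terminator makes the two operations inverses of each other for single-delimiter input, which mirrors the reversibility point made above.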