New (StringTokenizer constructor)

From m204wiki
Revision as of 02:32, 16 December 2010 by 198.242.244.228 (talk)

Create new StringTokenizer object

New is a member of the StringTokenizer class.

This method returns a new instance of a StringTokenizer object. It has three optional arguments that let you specify the delimiter characters that determine the tokens in the string that is being tokenized.

Syntax

  %tok = New( [TokenChars=chars]      -
            [, Spaces=chars]          -
            [, Quotes=chars] )

Syntax terms

%tok
A StringTokenizer object variable to contain the new object instance.
TokenChars= chars
This name-required string argument (TokenChars) is a set of single-character token-delimiters (delimiters that are also tokens) that may be separated by whitespace characters. TokenChars is optional and defaults to a null string.
Spaces= chars
This name-required string argument (Spaces) is a set of “whitespace” characters, that is, characters that separate tokens. Each of these characters is a “non-token delimiter”: a delimiter that is not itself a token. Spaces is optional and defaults to a single blank character.
Quotes= chars
This name-required string argument (Quotes) is a set of quotation characters. The text between each disjoint pair of identical quotation characters (a “quoted region”) is treated as a single token, and any delimiter characters (from the Quotes, Spaces, or TokenChars sets) within a quoted region are treated as non-delimiters. Quotes is optional and defaults to a null string.

Usage Notes

  • A character may belong to at most one of the Spaces, Quotes, or TokenChars sets of characters.
  • If you specify Spaces, Quotes, or TokenChars, each character in the argument string is a member of that set; that is, you may not put separators between the characters. No character may repeat (except for apostrophe, which may be doubled).
  • A quoted region is not affected by the TokensToLower and TokensToUpper properties.
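
The first two rules above can be checked mechanically. The following is a minimal Python sketch, not SOUL and not part of the StringTokenizer class: the function name check_delimiter_sets and its return format are invented for illustration. It does not model the apostrophe exception, and it does not claim to show how the real constructor reports a violation.

```python
def check_delimiter_sets(spaces=" ", quotes="", token_chars=""):
    """Return a list of rule violations; an empty list means none were found.

    Hypothetical helper: it only models the two rules stated in the usage
    notes (no repeats within a set, no character shared between sets).
    """
    problems = []
    sets = {"Spaces": spaces, "Quotes": quotes, "TokenChars": token_chars}

    # Rule: no character may repeat within one set.
    for name, chars in sets.items():
        repeated = sorted({c for c in chars if chars.count(c) > 1})
        if repeated:
            problems.append(f"{name} repeats {repeated}")

    # Rule: a character may belong to at most one set.
    names = list(sets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = sorted(set(sets[first]) & set(sets[second]))
            if shared:
                problems.append(f"{first} and {second} share {shared}")

    return problems


# Three disjoint sets with no repeats: no violations are reported.
print(check_delimiter_sets(spaces=" ;", token_chars="=_", quotes='"'))
# The semicolon appears in both Spaces and TokenChars: one violation.
print(check_delimiter_sets(spaces=" ;", token_chars="=;"))
```

Each argument is treated as a bare run of characters, which matches the second rule: a comma or blank typed between characters would itself count as a member of the set.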

Examples

The following request uses all three kinds of delimiters:

    begin
    %tok is object stringtokenizer

    %tok = new(spaces=' ;', tokenchars='=_', quotes='"')
    %tok:string = '--=_alternative 0016B5A2CA2574DD_=;  -
      Content-Type: text/plain; charset="US-ASCII"'

    repeat while not %tok:atEnd
       printText {~} is {%tok:nextToken}
    end repeat
    end

The result is:

    %tok:nextToken is --
    %tok:nextToken is =
    %tok:nextToken is _
    %tok:nextToken is alternative
    %tok:nextToken is 0016B5A2CA2574DD
    %tok:nextToken is _
    %tok:nextToken is =
    %tok:nextToken is Content-Type:
    %tok:nextToken is text/plain
    %tok:nextToken is charset
    %tok:nextToken is =
    %tok:nextToken is US-ASCII
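
To see why the string splits this way, the delimiter rules can be modeled in a few lines of Python. This is a rough sketch under stated assumptions, not the StringTokenizer implementation: the function tokenize is invented for illustration, the two continued source lines are joined into one string, and the handling of an unterminated quoted region or a quotation character in the middle of a token is a guess that the description above does not settle.

```python
def tokenize(text, spaces=" ", token_chars="", quotes=""):
    """Toy model of the Spaces / TokenChars / Quotes rules (illustrative only)."""
    tokens = []
    current = ""  # characters of the ordinary token being collected
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in quotes:
            # Assumption: a quotation character ends any token in progress.
            if current:
                tokens.append(current)
                current = ""
            # The quoted region runs to the next identical quotation character;
            # delimiters inside it are ordinary text.
            end = text.find(ch, i + 1)
            if end == -1:
                end = len(text)  # assumption: unterminated region runs to the end
            tokens.append(text[i + 1:end])
            i = end + 1
            continue
        if ch in spaces or ch in token_chars:
            # Any delimiter ends the token in progress.
            if current:
                tokens.append(current)
                current = ""
            # Only a TokenChars delimiter is itself returned as a token.
            if ch in token_chars:
                tokens.append(ch)
        else:
            current += ch
        i += 1
    if current:
        tokens.append(current)
    return tokens


sample = '--=_alternative 0016B5A2CA2574DD_=;  Content-Type: text/plain; charset="US-ASCII"'
for token in tokenize(sample, spaces=" ;", token_chars="=_", quotes='"'):
    print(token)
```

With the same three sets as the request above, the model prints the twelve tokens in the same order as the result listing: the semicolons and blanks disappear, each = and _ comes back as its own token, and the quoted US-ASCII is returned without its quotation marks.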