New (StringTokenizer constructor): Difference between revisions

From m204wiki

Revision as of 19:54, 6 February 2011

Create new StringTokenizer object (StringTokenizer class)


This method returns a new instance of a StringTokenizer object. It has three optional arguments that let you specify the delimiter characters that determine the tokens in the string that is being tokenized.

Syntax

    %stringTokenizer = [%(StringTokenizer):]New[( [TokenChars= string], -
                                                 [Spaces= string], -
                                                 [Quotes= string], -
                                                 [Separators= string])]

Syntax terms

%stringTokenizer
A StringTokenizer object variable to contain the new object instance.
TokenChars= string
This name-required string argument (TokenChars) is a set of single-character token-delimiters (delimiters that are also tokens) that may be separated by whitespace characters. TokenChars is an optional argument that defaults to a null string.
Spaces= string
This name-required string argument (Spaces) is a set of "whitespace" characters, that is, characters that separate tokens. Each of these characters is a "non-token delimiter," a delimiter that is not itself a token. Spaces is an optional argument that defaults to a blank character.
Quotes= string
This name-required string argument (Quotes) is a set of quotation characters. The text between each disjoint pair of identical quotation characters (a "quoted region") is treated as a single token, and any delimiter characters (Quote, Space, or TokenChar) within a quoted region are treated as non-delimiters. Quotes is an optional argument that defaults to a null string.
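
The three delimiter classes described above can be modeled outside Model 204. The following Python sketch is only an illustration of the stated rules, not the StringTokenizer implementation: the function name `tokenize` and its parameter names are hypothetical, and two behaviors the text does not specify are assumptions (a quote that follows ordinary characters ends the pending token, and an unterminated quote runs to the end of the string).

```python
def tokenize(text, token_chars="", spaces=" ", quotes=""):
    """Split text using the TokenChars / Spaces / Quotes rules (illustrative model)."""
    tokens = []
    current = []  # characters of the ordinary token being built

    def flush():
        # Emit the pending ordinary token, if any.
        if current:
            tokens.append("".join(current))
            current.clear()

    i = 0
    while i < len(text):
        ch = text[i]
        if ch in quotes:
            # Quoted region: text up to the next identical quote is one token,
            # and delimiters inside it are treated as ordinary characters.
            flush()  # assumption: a quote ends any pending token
            end = text.find(ch, i + 1)
            if end == -1:
                end = len(text)  # assumption: unterminated quote runs to the end
            tokens.append(text[i + 1:end])
            i = end + 1
            continue
        if ch in spaces:
            flush()  # non-token delimiter: separates tokens, is not returned
        elif ch in token_chars:
            flush()
            tokens.append(ch)  # token delimiter: separates tokens and is itself a token
        else:
            current.append(ch)
        i += 1
    flush()
    return tokens


if __name__ == "__main__":
    sample = 'a=b; c="d e"'
    print(tokenize(sample, token_chars="=", spaces=" ;", quotes='"'))
```

With the defaults (no TokenChars, a blank as the only Spaces character, no Quotes), the model simply splits on blanks.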

Usage notes

  • A character may belong to at most one of the Spaces, Quotes, or TokenChars sets of characters.
  • If you are specifying Spaces, Quotes, or TokenChars, each character in the string is a delimiter of that type (that is, you may not insert separators between the characters), and no character may repeat (except for apostrophe, which may be doubled).
  • A quoted region is not affected by the TokensToLower and TokensToUpper properties.
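
The first two usage notes amount to a consistency check on the three delimiter strings. Below is a minimal Python sketch of such a check, written only to make the rules concrete: `check_delimiter_sets` is a hypothetical helper, not part of the StringTokenizer class, and the way it exempts the apostrophe from the no-repeat rule is an assumption about how that exception applies.

```python
from itertools import combinations


def check_delimiter_sets(token_chars="", spaces=" ", quotes=""):
    """Return a list of rule violations; an empty list means the sets look valid."""
    sets = {"TokenChars": token_chars, "Spaces": spaces, "Quotes": quotes}
    problems = []

    # Rule: no character may repeat within one set
    # (assumption: a repeated apostrophe is tolerated, per the stated exception).
    for name, chars in sets.items():
        repeated = sorted({c for c in chars if chars.count(c) > 1 and c != "'"})
        if repeated:
            problems.append(f"{name} repeats {repeated}")

    # Rule: a character may belong to at most one of the sets.
    for (name_a, a), (name_b, b) in combinations(sets.items(), 2):
        shared = sorted(set(a) & set(b))
        if shared:
            problems.append(f"{name_a} and {name_b} share {shared}")

    return problems


if __name__ == "__main__":
    print(check_delimiter_sets(token_chars="=_", spaces=" ;", quotes='"'))
    print(check_delimiter_sets(token_chars="=;", spaces=" ;"))
```

The first call uses the delimiter sets from the example request and reports no problems; the second reports that the semicolon appears in both TokenChars and Spaces.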

Examples

In the following request, a variety of delimiters is on display:

    begin
    %tok is object stringtokenizer

    %tok = new(spaces=' ;', tokenchars='=_', quotes='"')
    %tok:string = '--=_alternative 0016B5A2CA2574DD_=;  -
      Content-Type: text/plain; charset="US-ASCII"'

    repeat while not %tok:atEnd
       printText {~} is {%tok:nextToken}
    end repeat
    end

The result is:

    %tok:nextToken is --
    %tok:nextToken is =
    %tok:nextToken is _
    %tok:nextToken is alternative
    %tok:nextToken is 0016B5A2CA2574DD
    %tok:nextToken is _
    %tok:nextToken is =
    %tok:nextToken is Content-Type:
    %tok:nextToken is text/plain
    %tok:nextToken is charset
    %tok:nextToken is =
    %tok:nextToken is US-ASCII

See also