New (StringTokenizer constructor): Difference between revisions
Revision as of 19:54, 6 February 2011
Create new StringTokenizer object (StringTokenizer class)
This method returns a new instance of a StringTokenizer object.
It has three optional arguments that let you specify the delimiter characters
that determine the tokens in the string that is being tokenized.
Syntax
    %stringTokenizer = [%(StringTokenizer):]New[( [TokenChars= string], -
                                                  [Spaces= string], -
                                                  [Quotes= string], -
                                                  [Separators= string])]
Syntax terms
- %stringTokenizer
- A StringTokenizer object variable to contain the new object instance.
- TokenChars= chars
- This name-required string argument (TokenChars) is a set of single-character token delimiters (delimiters that are also tokens), which may be separated by whitespace characters. TokenChars is an optional argument that defaults to a null string.
- Spaces= chars
- This name-required string argument (Spaces) is a set of "whitespace" characters, that is, characters that separate tokens. Each of these characters is a "non-token delimiter": a delimiter that is not itself a token. Spaces is an optional argument that defaults to a blank character.
- Quotes= chars
- This name-required string argument (Quotes) is a set of quotation characters. The text between each disjoint pair of identical quotation characters (a "quoted region") is treated as a single token, and any delimiter characters (Quote, Space, or TokenChar) within a quoted region are treated as non-delimiters. Quotes is an optional argument that defaults to a null string.
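As a minimal sketch of the defaults described above, the following request creates a tokenizer with no arguments. It assumes that New may be called without parentheses, as the brackets in the syntax suggest, and it uses only the String, AtEnd, and NextToken members shown elsewhere on this page. The input string is made up for illustration.

    begin
       %tok is object stringTokenizer
       * No arguments: Spaces should default to a blank, and
       * TokenChars and Quotes to a null string (per the terms above).
       %tok = new
       %tok:string = 'one two three'
       repeat while not %tok:atEnd
          printText {~} is {%tok:nextToken}
       end repeat
    end

If the defaults behave as documented, the blank is the only delimiter here, so each blank-separated word should be returned as its own token.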
Usage notes
- A character may belong to at most one of the Spaces, Quotes, or TokenChars sets of characters.
- When you specify Spaces, Quotes, or TokenChars, every character in the argument string is taken as a member of that set, so you may not insert separators between the characters. No character may repeat (except for apostrophe, which may be doubled).
- A quoted region is not affected by the TokensToLower and TokensToUpper properties.
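The notes above can be illustrated with a small sketch in which each character belongs to exactly one set: blank and comma are Spaces, and the double quotation mark is the only Quotes character. The input string is hypothetical, and the request relies only on members that appear in the example on this page.

    begin
       %tok is object stringTokenizer
       * Blank and comma are non-token delimiters; the double
       * quotation mark opens and closes a quoted region.
       %tok = new(spaces=' ,', quotes='"')
       %tok:string = 'alpha, "beta, gamma" delta'
       repeat while not %tok:atEnd
          printText {~} is {%tok:nextToken}
       end repeat
    end

Under the rules described for Quotes, the text between the quotation marks should come back as a single token, with its comma and blank treated as ordinary characters rather than as delimiters.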
Examples
In the following request, a variety of delimiters is on display:
    begin
       %tok is object stringtokenizer
       %tok = new(spaces=' ;', tokenchars='=_', quotes='"')
       %tok:string = '--=_alternative 0016B5A2CA2574DD_=; -
    Content-Type: text/plain; charset="US-ASCII"'
       repeat while not %tok:atEnd
          printText {~} is {%tok:nextToken}
       end repeat
    end
The result is:
    %tok:nextToken is --
    %tok:nextToken is =
    %tok:nextToken is _
    %tok:nextToken is alternative
    %tok:nextToken is 0016B5A2CA2574DD
    %tok:nextToken is _
    %tok:nextToken is =
    %tok:nextToken is Content-Type:
    %tok:nextToken is text/plain
    %tok:nextToken is charset
    %tok:nextToken is =
    %tok:nextToken is US-ASCII