New (StringTokenizer constructor)
Latest revision as of 17:53, 10 October 2014
Create new StringTokenizer object (StringTokenizer class)
This method returns a new instance of a StringTokenizer object.
It has optional arguments that let you specify the delimiter characters
that determine the tokens in the string that is being tokenized.
Syntax
    %stringTokenizer = [%(StringTokenizer):]New[( [TokenChars= string], -
                                                  [Spaces= string], -
                                                  [Quotes= string], -
                                                  [Separators= string])]
Syntax terms
Term | Description |
---|---|
%stringTokenizer | A StringTokenizer object expression to contain the new object instance. |
[%(StringTokenizer):] | The optional class name in parentheses denotes a Constructor. See Usage notes, below, for more information about invoking a StringTokenizer Constructor. |
TokenChars | This name required string argument is a set of single-character token-delimiters (delimiters that are also tokens) that may be separated by whitespace characters. TokenChars is an optional argument that defaults to a null string. |
Spaces | This name required string argument is a set of "whitespace" characters, that is, characters that separate tokens. Each of these characters is a "non-token delimiter," a delimiter that is not itself a token. Spaces is an optional argument that defaults to a blank character. |
Quotes | This name required string argument is a set of quotation characters. The text between each disjoint pair of identical quotation characters (a "quoted region") is treated as a single token, and any delimiter characters (Quotes, Spaces, or TokenChars) within a quoted region are treated as non-delimiters. Quotes is an optional argument that defaults to a null string. |
Separators | This name required string argument is a set of single characters that separate tokens. Each of these characters is a "non-token delimiter," a delimiter that is not itself a token. Unlike Spaces characters, consecutive Separators characters are not compressed into a single separator. Separators is an optional argument that defaults to a null string. It is available as of Sirius Mods version 7.8. |
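The difference between Spaces and Separators can be illustrated outside SOUL. The following is a minimal Python sketch, not the Sirius implementation. The function names are hypothetical, and the sketch assumes that two adjacent separator characters delimit an empty token, which is one reading of "do not compress":

```python
def split_on_spaces(text, spaces=" "):
    """Model of Spaces: runs of whitespace characters act as one delimiter."""
    tokens, current = [], ""
    for ch in text:
        if ch in spaces:
            # A run of spaces ends the current token at most once.
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


def split_on_separators(text, separators):
    """Model of Separators: every separator ends a token, even an empty one.

    Assumption: adjacent separators produce an empty token between them.
    """
    tokens, current = [], ""
    for ch in text:
        if ch in separators:
            tokens.append(current)
            current = ""
        else:
            current += ch
    tokens.append(current)
    return tokens


print(split_on_spaces("a  b c"))          # two blanks between a and b
print(split_on_separators("a,,b", ","))   # two commas between a and b
```

Under these assumptions, the doubled blank yields no extra token, while the doubled comma yields an empty token between `a` and `b`.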
Usage notes
- As described in Using New or other Constructors, New can be invoked with no object, with an explicit class name, or with an object variable in the class, even if that object is Null:
    %stringTokenizer = new
    %stringTokenizer = %(StringTokenizer):new
    %stringTokenizer = %stringTokenizer:new
- The New parameters are also StringTokenizer properties: Spaces, Quotes, TokenChars, and Separators.
- A character may belong to at most one of the Spaces, Quotes, TokenChars, or Separators sets of characters.
- If you are specifying Spaces, Quotes, TokenChars, or Separators, each character in the string is a quotation character — that is, you may not separate characters — and no character may repeat (except for apostrophe, which may be doubled).
- A quoted region is not affected by the TokensToLower and TokensToUpper properties.
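The rule that a character may belong to at most one delimiter set can be checked before a tokenizer is constructed. The following Python sketch is only an illustration of that rule: the helper name `find_overlaps` and its keyword arguments are hypothetical, and it does not model the apostrophe exception or any error the real constructor may issue.

```python
from itertools import combinations


def find_overlaps(**delimiter_sets):
    """Return the characters shared by any two named delimiter sets.

    Keys of the result are (name_a, name_b) pairs; values are the shared
    characters. An empty result means the sets are disjoint.
    """
    overlaps = {}
    for (name_a, chars_a), (name_b, chars_b) in combinations(
        delimiter_sets.items(), 2
    ):
        shared = set(chars_a) & set(chars_b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Disjoint sets: nothing is reported.
print(find_overlaps(spaces=" ;", token_chars="=_", quotes='"'))

# ';' appears in two sets, so the pair is reported.
print(find_overlaps(spaces=" ;", separators=";,"))
```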
Examples
In the following request, a variety of delimiters is on display:
    begin
       %tok is object stringtokenizer
       %tok = new(spaces=' ;', tokenchars='=_', quotes='"')
       %tok:string = '--=_alternative 0016B5A2CA2574DD_=; -
    Content-Type: text/plain; charset="US-ASCII"'
       repeat while not %tok:atEnd
          printText {~} is {%tok:nextToken}
       end repeat
    end
The result is:
    %tok:nextToken is --
    %tok:nextToken is =
    %tok:nextToken is _
    %tok:nextToken is alternative
    %tok:nextToken is 0016B5A2CA2574DD
    %tok:nextToken is _
    %tok:nextToken is =
    %tok:nextToken is Content-Type:
    %tok:nextToken is text/plain
    %tok:nextToken is charset
    %tok:nextToken is =
    %tok:nextToken is US-ASCII
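To make the roles of the three delimiter sets in this example concrete, the following Python sketch models them and reproduces the token sequence shown above. It is a simplified model written for illustration, not the StringTokenizer implementation: the `tokenize` function is hypothetical, Separators are omitted, and the handling of a quotation character in the middle of a token or of an unterminated quoted region is an assumption.

```python
def tokenize(text, spaces=" ", token_chars="", quotes=""):
    """Split text using whitespace, token-delimiter, and quotation sets."""
    tokens = []
    current = ""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in quotes:
            # Quoted region: everything up to the next identical quotation
            # character is one token, delimiters included.
            if current:
                tokens.append(current)
                current = ""
            end = text.find(ch, i + 1)
            if end == -1:
                end = len(text)  # assumption: unterminated quote runs to end
            tokens.append(text[i + 1:end])
            i = end + 1
            continue
        if ch in spaces:
            # Non-token delimiter: ends the current token, is not returned.
            if current:
                tokens.append(current)
                current = ""
        elif ch in token_chars:
            # Token delimiter: ends the current token and is itself a token.
            if current:
                tokens.append(current)
                current = ""
            tokens.append(ch)
        else:
            current += ch
        i += 1
    if current:
        tokens.append(current)
    return tokens


source = ('--=_alternative 0016B5A2CA2574DD_=; '
          'Content-Type: text/plain; charset="US-ASCII"')
for token in tokenize(source, spaces=" ;", token_chars="=_", quotes='"'):
    print(token)
```

In this model the semicolon and blank never appear in the output because they are in the whitespace set, while `=` and `_` both end a token and are returned as tokens themselves.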