StringTokenizer class
The StringTokenizer class divides an input string (the String property of the StringTokenizer object) into substrings called tokens. Tokens are separated by either of two types of delimiters:
- Delimiters that are not tokens themselves (analogs of whitespace and punctuation characters in the English language)
- Delimiters that are also tokens themselves (token-characters that may be of interest or significance, such as operators in an arithmetic expression)
A token is thus either a sequence of consecutive characters that are not delimiters, or a token-character delimiter. The delimiters are user-definable: they are specified per StringTokenizer object at creation time and can be modified thereafter.
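The distinction between the two delimiter types can be illustrated outside of the class itself. The following is a minimal Python sketch, not the StringTokenizer API: the function name split_tokens and its parameters spaces and token_chars are hypothetical, chosen only for this illustration, and the sketch assumes every delimiter is a single character.

```python
def split_tokens(text, spaces=" ", token_chars=""):
    """Split text using two kinds of single-character delimiters.

    spaces      -- delimiters that separate tokens but are discarded
    token_chars -- delimiters that separate tokens and are also tokens
    (Both parameter names are hypothetical, for illustration only.)
    """
    tokens = []
    current = []  # characters of the token being accumulated
    for ch in text:
        if ch in spaces or ch in token_chars:
            # Any delimiter ends the token accumulated so far.
            if current:
                tokens.append("".join(current))
                current = []
            # A token-character delimiter is itself returned as a token.
            if ch in token_chars:
                tokens.append(ch)
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


print(split_tokens("total = price*qty + 1", spaces=" ", token_chars="=*+"))
# ['total', '=', 'price', '*', 'qty', '+', '1']
```

In this sketch the blanks vanish from the result, while "=", "*", and "+" both end the preceding token and appear as tokens in their own right.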
StringTokenizer operations maintain two positions within the input string:
- The location of the most recent token's first character (the "current token position")
- The location from which to begin parsing for the next token (the "tokenizing position")
You can explicitly modify these positions individually.
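As a rough illustration of how two such positions might interact, here is a minimal Python sketch. It is not the StringTokenizer implementation: the class TwoPositionTokenizer and its attribute names are hypothetical, only blank delimiters are handled, and positions are 0-based as is usual in Python, which may differ from the numbering the class itself uses.

```python
class TwoPositionTokenizer:
    """Hypothetical illustration of a tokenizer that tracks two positions."""

    def __init__(self, text, spaces=" "):
        self.text = text
        self.spaces = spaces
        self.current_token_position = None  # start of the most recent token
        self.next_position = 0              # where the next scan begins

    def next_token(self):
        """Return the next token, or None if no token remains."""
        n = len(self.text)
        i = self.next_position
        # Skip non-token delimiters in front of the token.
        while i < n and self.text[i] in self.spaces:
            i += 1
        if i >= n:
            self.next_position = n
            return None
        start = i
        # Consume characters up to the next delimiter or the end of the string.
        while i < n and self.text[i] not in self.spaces:
            i += 1
        self.current_token_position = start
        self.next_position = i
        return self.text[start:i]


tok = TwoPositionTokenizer("alpha beta gamma")
first = tok.next_token()    # "alpha"
second = tok.next_token()   # "beta"; current_token_position is 6, next_position is 10

# Moving the tokenizing position back to the current token rescans that token.
tok.next_position = tok.current_token_position
again = tok.next_token()    # "beta" once more
```

Because the two positions are stored separately, changing one (here, the tokenizing position) does not disturb the other until the next token is actually parsed.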
The methods in this class are listed in StringTokenizer methods. In the method templates, %tok is used to represent the object to which the method is being applied, sometimes called the "method object".
Example
To take the simplest path through an input string, you "walk" forward (left to right) from the beginning of the string in token-sized steps, moving from one whole token to the next. The following simple example does this for three tokens separated by blank, non-token delimiters:
    %tok = new
    %tok:string = 'a tokenization example'
    %tok:nextToken
    %tok:nextToken
    %tok:nextToken
Each of the NextToken method calls above returns a token: respectively, "a", "tokenization", and "example". The StringTokenizer class also has methods that let you take character-sized steps forward in the string, as well as methods that let you modify the position markers and thereby select tokens or sub-tokens in the order you require. You can also locate specified tokens, and you can return substrings that are the characters in the entire string that precede a position or that follow a position.
Many of the method examples make use of the PrintText statement, which is new as of version 7.2 of the Sirius Mods.
The StringTokenizer class is new as of Sirius Mods version 7.3.