$UnSpace
Revision as of 23:44, 18 October 2012
Normalize spaces and quotes
Most Sirius $functions have been deprecated in favor of Object Oriented methods. The OO equivalent for the $UnSpace function is Unspace (String function).
This function normalizes a string by removing leading and trailing "space" characters and collapsing other sequences of "unquoted" spaces to single spaces. Normalization can also "undouble" quoted quote characters.
The $UnSpace function accepts three arguments and returns a string: the value of the first argument, normalized according to the rules given in the description of the returned value.
The first argument is the string to be normalized.
The second argument is a string which is the set of space characters. The first character of this string is the replacement space character. The default set of space characters is the blank character.
The third argument is a string which is the set of quote characters. If a quote character occurs twice in succession in argument 3, this indicates that the quote character should be "undoubled" when it occurs within a string quoted by that same character. No quote character may occur as a space character. The default is that there are no quote characters.
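The way arguments 2 and 3 encode their settings can be illustrated with a small Python model. This is only a sketch of the rules stated above, not the Sirius implementation: the helper name `interpret_args`, the left-to-right reading of doubled characters, the treatment of an empty space set as the default, and the choice to raise an error when a quote character is also a space character are all assumptions made for illustration.

```python
def interpret_args(spaces=" ", quotes=""):
    """Hypothetical helper: decode the space set and quote set arguments."""
    spaces = spaces or " "          # assumption: empty string means the default
    replacement = spaces[0]         # first character is the replacement space
    space_set = set(spaces)

    # A quote character written twice in succession requests undoubling.
    undouble = {}
    i = 0
    while i < len(quotes):
        ch = quotes[i]
        doubled = i + 1 < len(quotes) and quotes[i + 1] == ch
        undouble[ch] = doubled
        i += 2 if doubled else 1

    # The description says no quote character may occur as a space character;
    # how the real function reports this is not stated, so an exception is assumed.
    if space_set & set(undouble):
        raise ValueError("a quote character may not also be a space character")
    return replacement, space_set, undouble


print(interpret_args(".", "$$//")[2])   # {'$': True, '/': True}
print(interpret_args(" ", '"')[2])      # {'"': False}
```

In this model, `'$$//'` declares two quote characters that are both undoubled, while `'"'` declares one quote character that is not.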
The returned value is the string value of argument 1, normalized as follows:
- Leading space characters are removed. If there is no unmatched quote, trailing space characters are also removed.
- Within a quoted substring, spaces are neither collapsed nor replaced. If the quote character introducing the substring is specified as two occurrences in argument 3, each pair of consecutive occurrences of that character in the substring is replaced by a single occurrence of that character. If a quote character is not matched, the rest of the input is quoted.
- Outside quoted substrings, each sequence of length one or more of any mixture of space characters is replaced by a single occurrence of the replacement space character.
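The three rules above can be modeled as a single left-to-right scan. The Python function below is a minimal sketch written from this description alone; it is not the Sirius code, and the name `unspace`, its keyword arguments, and its error handling are assumptions. It is meant only to make the interaction between space collapsing, quoting, and undoubling concrete.

```python
def unspace(text, spaces=" ", quotes=""):
    """Sketch of the documented normalization rules (not the real $UnSpace)."""
    spaces = spaces or " "                    # assumption: empty means default
    replacement, space_set = spaces[0], set(spaces)

    # Decode argument 3: a doubled character enables undoubling for that quote.
    undouble, i = {}, 0
    while i < len(quotes):
        ch = quotes[i]
        doubled = i + 1 < len(quotes) and quotes[i + 1] == ch
        undouble[ch] = doubled
        i += 2 if doubled else 1
    if space_set & set(undouble):
        raise ValueError("a quote character may not also be a space character")

    out = []
    pending_space = False                     # unquoted space run not yet emitted
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in space_set:
            pending_space = True              # collapse the whole run
            i += 1
            continue
        # Emit one replacement space, but never at the very start.
        if pending_space and out:
            out.append(replacement)
        pending_space = False
        out.append(ch)
        i += 1
        if ch in undouble:
            # Copy the quoted substring verbatim up to the closing quote;
            # an unmatched quote quotes the rest of the input.
            while i < n:
                c = text[i]
                if c == ch:
                    if undouble[ch] and i + 1 < n and text[i + 1] == ch:
                        out.append(ch)        # undouble a quoted quote
                        i += 2
                        continue
                    out.append(ch)            # closing quote
                    i += 1
                    break
                out.append(c)
                i += 1
    # A space run still pending here is trailing and is dropped.
    return "".join(out)


print(unspace(' ASSSBSSXC ', 'XS '))                       # AXBXC
print(unspace('".A..."..B...C.."..', '.', '"'))            # ".A...".B.C."..
print(unspace('./A$$B/..$C//D$../E//F.', '.', '$$//'))     # /A$$B/.$C//D$./E/F.
```

Trailing spaces are dropped simply because a pending space run is only emitted when another character follows it. Inside an unmatched quote no run is ever pending, which is why trailing spaces survive in that case.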
Syntax
<section begin="syntax" /> %OUT = $UnSpace(input, spaces, quotes) <section end="syntax" />
Several examples follow, each one showing an invocation of $UnSpace and the corresponding result.
With only one argument, $UnSpace simply removes leading and trailing blank sequences and collapses other blank sequences:
$UnSpace(' A B C ') -> A B C
$UnSpace can be used to make all space characters the same; here the space characters include the letters "X" and "S" and the blank character; they are all replaced by the letter "X":
$UnSpace(' ASSSBSSXC ', 'XS ') -> AXBXC
Within a quoted substring, spaces are not changed:
$UnSpace('."XAX"XBX', '.X', '"') -> "XAX".B
An unmatched quote (which is therefore, by definition, a trailing unmatched quote) causes the remainder of the string to be quoted:
$UnSpace('".A..."..B...C.."..', '.', '"') -> ".A...".B.C."..
Doubling a quote character in argument 3 causes undoubling of quotes within a string quoted by that character:
$UnSpace('" "" "', , '""') -> " " "
The following two inputs produce the same output; the first is a null quoted substring followed by the letter A and a final unmatched quote; the second is a quoted string whose first pair of characters is a doubled quote:
$UnSpace('""A"', , '""') -> ""A"
$UnSpace('"""A"', , '""') -> ""A"
If a quote character is not doubled in argument 3, undoubling for it is not performed:
$UnSpace('" "" "', , '"') -> " "" "
Multiple quote characters can be specified; a quote character inside a substring quoted by a different character is not undoubled:
$UnSpace('./A$$B/..$C//D$../E//F.', '.', '$$//') -> /A$$B/.$C//D$./E/F.