U (String function): Difference between revisions
Revision as of 20:54, 9 March 2012
Convert EBCDIC string to Unicode constant, including character encoding (String class)
The U intrinsic method converts an EBCDIC string to a Unicode string. In doing so, it replaces any XML-style hexadecimal character references and XHTML entity references in the string with the Unicode characters they represent. Because the method behaves in practice like a Unicode constant, it is also included in the list of Constant methods.
Syntax
%unicode = string:U
Syntax terms
%unicode | A Unicode variable to receive the Unicode string represented by the method object string. |
---|---|
string | A constant character string value, which may include an XML-style hexadecimal character reference or an XHTML entity reference. That is, string may contain an ampersand (&) in the following cases:
¬ ) is not allowed. |
Usage notes
- U is a compile-time-only equivalent of the EbcdicToUnicode method of the intrinsic String class (with its CharacterDecode argument implicitly set to True).
- Using the U method (or EbcdicToUnicode) is necessary for converting to type Unicode if the string you want to convert may contain a hexadecimal character reference. Without such a conversion, the reference cannot be meaningfully assigned to a Unicode variable. Keyboard-available characters, by contrast, can be assigned directly to a Unicode variable, with no character reference and no conversion by U.
- The U method is available as of Sirius Mods Version 7.3.
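Since U is described above as the compile-time equivalent of EbcdicToUnicode with CharacterDecode set to True, the same conversion can presumably be applied at run time to a string that is not a constant. The following is a minimal sketch that is not taken from the original page: the variable names %s and %tm and the string length are illustrative, and the call form EbcdicToUnicode(CharacterDecode=True) is an assumption based on the argument name mentioned in the first note.

```
begin
* Illustrative sketch only; %s and %tm are hypothetical names.
* The character reference is held in an ordinary EBCDIC string.
%s String Len 16
%tm Unicode
%s = '&#x2122;'
* Assumed run-time counterpart of the U constant method.
%tm = %s:EbcdicToUnicode(CharacterDecode=True)
print %tm:UnicodeToUtf16:StringToHex
end
```

If the equivalence holds as described, this should display the same UTF-16 hexadecimal value that the U constant form of the same character reference produces.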
Examples
- The following Print statement displays a plus sign (+):
    %p Unicode Initial('+')
    print %p
- The following Print statement displays a copyright sign (©):
    %copy Unicode Initial('©':U)
    print %copy
- The following Print statement displays 2122:
    %tm Unicode Initial('&#x2122;':U)
    print %tm:UnicodeToUtf16:StringToHex
  Simply specifying print %tm in this example would attempt to convert to EBCDIC; because the Unicode trademark character does not translate to a valid EBCDIC character, the Print output would instead use a character reference: &#x2122;.
See also
- You can find the list of XHTML entities on the Internet at the following URL: