U (String function)
Convert EBCDIC string to Unicode constant, including character encoding (String class)
The U intrinsic method converts an EBCDIC string constant to a Unicode string. The string may include XML-style hexadecimal character references, XHTML entity references, and '&amp;' references, each of which is replaced by the Unicode character it represents. Because the method acts in practice like a Unicode constant, it is also documented with the "Constant methods".
Syntax
%unicode = string:U
Syntax terms
%unicode | A Unicode string variable to receive the Unicode encoding of the method object string. |
---|---|
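string | A constant character string value, which may include an XML-style hexadecimal character reference or an XHTML entity reference. That is, string may contain an ampersand (&) in the following cases: as the start of a hexadecimal character reference, as the start of an XHTML entity reference, or as the start of an '&amp;' reference. |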
Usage notes
- U is a compile-time-only equivalent of the EbcdicToUnicode method of the intrinsic String class (with its CharacterDecode argument implicitly set to 'True').
- Using the U method (or EbcdicToUnicode) is necessary for converting to type Unicode if the string you want to convert may contain a hexadecimal character reference. Such a reference cannot be meaningfully assigned to a Unicode variable otherwise, whereas keyboard-available characters can simply be assigned directly to a Unicode variable without character reference and without conversion by U.
- The U method is available as of "Sirius Mods" Version 7.3.
Examples
- The first Print statement below displays a plus sign (+):
    %p Unicode Initial('+')
    print %p
- The second Print displays a copyright sign (©):
    %copy Unicode Initial('&copy;':U)
    print %copy
- The third displays 2122:
    %tm Unicode Initial('&#x2122;':U)
    print %tm:UnicodeToUtf16:StringToHex
Note
Simply specifying 'print %tm' in the previous example (or its equivalent, 'print %tm:UnicodeToEbcdic') would attempt to translate to EBCDIC and fail, because the Unicode trademark character does not translate to a valid EBCDIC character. But the UnicodeToUtf16 method can convert the Unicode variable to a byte-stream string, which the StringToHex method converts to its hex representation.
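For readers who want to experiment outside of SOUL, the behavior described above can be approximated in Python. This is only a rough sketch, not the method's implementation. The function name `u` is hypothetical, code page 037 is an assumed stand-in for the EBCDIC code page in use, and `html.unescape` accepts more reference forms than those documented for U.

```python
import html


def u(ebcdic: bytes) -> str:
    """Hypothetical stand-in for the U method: EBCDIC bytes in, Unicode out."""
    # Assumption: code page 037 represents the EBCDIC code page in use.
    text = ebcdic.decode("cp037")
    # Resolve references such as &#x2122;, &copy; and &amp;.
    return html.unescape(text)


plus = u("&#x2B;".encode("cp037"))
copy_sign = u("&copy;".encode("cp037"))
tm = u("&#x2122;".encode("cp037"))

# Show the resulting code points rather than the characters themselves.
print(f"U+{ord(plus):04X}")
print(f"U+{ord(copy_sign):04X}")

# Analogue of %tm:UnicodeToUtf16:StringToHex
print(tm.encode("utf-16-be").hex().upper())

# Analogue of %tm:UnicodeToEbcdic: the trademark sign has no cp037 byte,
# so the translation back to EBCDIC fails.
try:
    tm.encode("cp037")
    translatable = True
except UnicodeEncodeError:
    translatable = False
print(translatable)
```

In this sketch the copyright sign translates back to the assumed EBCDIC code page, while the trademark sign does not, which mirrors why the third example prints the UTF-16 hex value instead of the character.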
See also
- You can find the list of XHTML entities on the Internet at the following URL: