U (String function): Difference between revisions

Revision as of 07:05, 2 February 2011

Convert EBCDIC string to Unicode constant, including character encoding (String class)

The U intrinsic method converts an EBCDIC string to a Unicode string. It decodes XML-style hexadecimal character references, XHTML entity references, and '&amp;' references into the Unicode characters they represent. Because the method behaves in use like a Unicode constant, it is also documented with the Constant methods.

Syntax

%unicode = string:U

Syntax terms

%unicode A Unicode string variable to receive the Unicode encoding of the method object string.

string

A constant character string value, which may include an XML-style hexadecimal character reference or an XHTML entity reference. That is, string may contain an ampersand (&) in the following cases:

As the substring '&amp;'. This substring is converted to a single ampersand ('&') character.
At the start of a hexadecimal character reference (for example, the eight characters '&#x201C;' for the Unicode "Left double quotation mark" '“'). The character reference is converted to the referenced character.
As of Sirius Mods version 7.6, at the start of an XHTML entity reference (for example, the six characters '&nbsp;' for the "non-breaking-space" character). The entity reference is converted to the referenced character.

A decimal character reference (for example, &#172;) is not allowed.
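
The decoding rules above can be modeled outside of User Language. The following Python sketch is only an illustration, not the Sirius Mods implementation: the function name decode_references is hypothetical, and it assumes that Python's html.entities table is an acceptable stand-in for the XHTML entity list and that a stray ampersand is an error.

```python
import re
from html.entities import name2codepoint

# Matches every '&'. Group 1 is set only when the ampersand starts a
# complete reference: hexadecimal (#x...), decimal (#...), or a named entity.
_REFERENCE = re.compile(r"&(?:(#x[0-9A-Fa-f]{1,6}|#[0-9]+|[A-Za-z][A-Za-z0-9]*);)?")


def decode_references(text: str) -> str:
    """Hypothetical model of the reference decoding described for U."""

    def replace(match: re.Match) -> str:
        body = match.group(1)
        if body is None:
            # An ampersand that does not start a complete reference.
            raise ValueError("stray ampersand at offset %d" % match.start())
        if body.startswith("#x"):
            # Hexadecimal character reference, e.g. &#x201C;
            return chr(int(body[2:], 16))
        if body.startswith("#"):
            # Decimal character references are not allowed.
            raise ValueError("decimal reference not allowed: &%s;" % body)
        if body in name2codepoint:
            # Named entity, e.g. &amp; &nbsp; &copy;
            return chr(name2codepoint[body])
        raise ValueError("unknown entity reference: &%s;" % body)

    return _REFERENCE.sub(replace, text)


if __name__ == "__main__":
    print(decode_references("A&amp;B"))                 # A&B
    print(hex(ord(decode_references("&#x201C;"))))      # 0x201c
```

Because the replacement text is never rescanned, '&amp;#x201C;' in this model yields the literal eight characters '&#x201C;' rather than a quotation mark.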

Usage notes

U is a compile-time-only equivalent of the EbcdicToUnicode method of the intrinsic String class (with its CharacterDecode argument implicitly set to 'True').
Use the U method (or EbcdicToUnicode) to convert to type Unicode when the string you want to convert may contain a hexadecimal character reference. Such a reference cannot otherwise be meaningfully assigned to a Unicode variable. By contrast, keyboard-available characters can be assigned directly to a Unicode variable, with no character reference and no conversion by U.
You can find the list of XHTML entities on the Internet at the following URL:
http://www.w3.org/TR/xhtml1/dtds.html#h-A2
The U method is available as of Sirius Mods version 7.3.

Examples

The first Print statement below displays a plus sign (+); the second Print displays a copyright sign (©); the third displays '2122':
%p Unicode Initial('+')
Print %p
Entity for copyright symbol:
%copy Unicode Initial('&copy;':U)
Print %copy
Constant for trademark symbol:
%tm Unicode Initial('&#x2122;':U)
Print %tm:UnicodeToUtf16:StringToHex

Note

Simply specifying 'Print %tm' in the third Print statement above (or its equivalent, 'Print %tm:UnicodeToEbcdic') would attempt to translate to EBCDIC and fail, because the Unicode trademark character does not translate to a valid EBCDIC character. The UnicodeToUtf16 method, however, can convert the Unicode variable to a byte-stream string, which the StringToHex method then converts to its hex representation.
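
The behavior in this note can be reproduced in a rough way with Python's built-in codecs. This is a sketch under stated assumptions: the cp037 codec stands in for whatever EBCDIC code page the Model 204 installation uses, big-endian UTF-16 is assumed for the byte stream, and the helper names are hypothetical.

```python
def can_encode_ebcdic(text: str, codec: str = "cp037") -> bool:
    """Report whether every character has a mapping in the EBCDIC codec.

    cp037 is an assumption; the actual code page may differ per site.
    """
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def utf16_hex(text: str) -> str:
    """Encode as big-endian UTF-16 and return the bytes as hex digits."""
    return text.encode("utf-16-be").hex().upper()


if __name__ == "__main__":
    trademark = "\u2122"
    print(can_encode_ebcdic("+"))        # True
    print(can_encode_ebcdic(trademark))  # False
    print(utf16_hex(trademark))          # 2122
```

The trademark character has no mapping in the assumed code page, so a direct translation fails, while its UTF-16 byte stream is simply the two bytes X'2122'.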

@@ Line 1: / Line 1: @@
 {{Template:String:U subtitle}}
-This [[Intrinsic classes|intrinsic]] function converts an EBCDIC string, which may include XML character and entity references, to a Unicode string. The function also converts XML style hexadecimal character references, XHTML entity references, and ''''&amp;amp;'''' references to the represented Unicode character. Since in use the method acts like a Unicode constant, it is also documented with the [[Constant methods]].
+The <var>U</var> [[Intrinsic classes|intrinsic]] method converts an EBCDIC string, which may include XML character and entity references, to a Unicode string. The function also converts XML style hexadecimal character references, XHTML entity references, and <b>'&amp;amp;'</b> references to the represented Unicode character. Since in use the method acts like a Unicode constant, it is also documented with the [[Constant_methods]].
-The <var>U</var> method is available as of <var class=product>Sirius Mods</var> version 7.3.
 ==Syntax==
 {{Template:String:U syntax}}
@@ Line 9: / Line 8: @@
 <table class="syntaxTable">
 <tr><th>%unicode</th>
-<td>A <var>U</var>nicode string variable to receive the <var>U</var>nicode encoding of the method object string.                    </td></tr>
+<td>A Unicode string variable to receive the Unicode encoding of the method object <var class="term">string</var>.</td></tr>
 <tr><th>string</th>
-<td>A constant character string value, which may include an XML-style hexadecimal character reference or an XHTML entity reference. That is, ''string'' may contain an ampersand (&amp;) in the following cases:       *As the substring ''''&amp;amp;''''. This substring is converted to a single ampersand character.             *At the start of a hexadecimal character reference (for example, the eight characters '&amp;#x201C;' for the <var>U</var>nicode "Left double quotation mark"). The character reference is converted to the referenced character.                                             *As of <var class=product>Sirius Mods</var> version 7.6, an XHTML entity reference (for example, the six characters '&amp;nbsp;' for the "non-breaking-space" character). The entity reference is converted to the referenced character.                                                                                                                                                              A decimal character reference (for example, &amp;#172;) is ''not'' allowed.</td></tr>
+<td>A constant character string value, which may include an XML-style hexadecimal character reference or an XHTML entity reference. That is, ''string'' may contain an ampersand (&amp;) in the following cases:<ul><li>As the substring ''''&amp;amp;''''. This substring is converted to a single ampersand (<code>'&'</code>) character.<li>At the start of a hexadecimal character reference (for example, the eight characters <code>'&amp;#x201C;'</code> for the Unicode "Left double quotation mark" <code>'&#x201C;'</code>). The character reference is converted to the referenced character.<li>As of <var class=product>Sirius Mods</var> version 7.6, an XHTML entity reference (for example, the six characters '&amp;nbsp;' for the "non-breaking-space" character). The entity reference is converted to the referenced character.</ul>A decimal character reference (for example, &amp;#172;) is ''not'' allowed.</td></tr>
 </table>
-==<var>U</var>sage notes==
+==Usage notes==
-*The <var>U</var> method is a compile-time-only equivalent of the [[EbcdicToUnicode (String function)|EbcdicToUnicode]] method of the intrinsic <var>String</var> class (with its CharacterDecode argument implicitly set to ''''True'''').
+<ul><li><var>U</var> is a compile-time-only equivalent of the [[EbcdicToUnicode (String function)|EbcdicToUnicode]] method of the intrinsic <var>String</var> class (with its CharacterDecode argument implicitly set to <code>'True'</code>).
-*<var>U</var>sing the <var>U</var> method (or EbcdicTo<var>U</var>nicode) is necessary for converting to type <var>U</var>nicode if the string you want to convert may contain a hexadecimal character reference. Such a reference cannot be meaningfully assigned to a <var>U</var>nicode variable otherwise, whereas keyboard-available characters can simply be assigned directly to a <var>U</var>nicode variable without character reference and without conversion by <var>U</var>.
+<li>Using the <var>U</var> method (or EbcdicToUnicode) is necessary for converting to type Unicode if the string you want to convert may contain a hexadecimal character reference. Such a reference cannot be meaningfully assigned to a Unicode variable otherwise, whereas keyboard-available characters can simply be assigned directly to a Unicode variable without character reference and without conversion by <var>U</var>.
-*You can find the list of XHTML entities on the Internet at the following <var>U</var>RL:
+<li>You can find the list of XHTML entities on the Internet at the following <var>U</var>RL:
 <p class="code">http://www.w3.org/TR/xhtml1/dtds.html#h-A2
 </p>
-===Example===
+<li>The <var>U</var> method is available as of <var class="product">[[Sirius Mods]]</var> version 7.3.</ul>
-The first Print statement below displays a plus sign (+); the second Print displays a copyright sign (&copy;); the third displays ''''2122'''':
+==Examples==
-<p class="code">%p <var>U</var>nicode Initial('+')
+<ol><li>The first Print statement below displays a plus sign (+); the second Print displays a copyright sign (&copy;); the third displays ''''2122'''':
+<p class="code">%p Unicode Initial('+')
 Print %p
+</p>
+<li>Entity for copyright symbol:
+<p class="code">%copy Unicode Initial('&amp;copy;':U)
+Print %copy
+</p><li>Constant for trademark symbol:
+<p class="code">%tm Unicode Initial('&amp;#x2122;':U)
+Print %tm:UnicodeToUtf16:StringToHex
+</p></ol>
-* Entity for copyright symbol:
+===Note===
-%copy <var>U</var>nicode Initial('&amp;copy;':<var>U</var>)
+Simply specifying ''''Print %tm''''  in the the third Print statement above (or its equivalent ''''Print %tm:UnicodeToEbcdic'''') would attempt to translate to EBCDIC and fail because the Unicode trademark character does not translate to a valid EBCDIC character. But the UnicodeToUtf16 method can convert the Unicode variable to a byte-stream string, which the StringToHex method converts to its hex representation.
-Print %copy
-* Constant for trademark symbol:
-%tm <var>U</var>nicode Initial('&amp;#x2122;':<var>U</var>)
-Print %tm:<var>U</var>nicodeTo<var>U</var>tf16:<var>String</var>ToHex
-</p>
-====Note====
-Simply specifying ''''Print %tm''''  in the the third Print statement above (or its equivalent ''''Print %tm:<var>U</var>nicodeToEbcdic'''') would
-attempt to translate to EBCDIC and fail because the <var>U</var>nicode trademark character does not translate to a valid EBCDIC character. But the <var>U</var>nicodeTo<var>U</var>tf16 method can convert the <var>U</var>nicode variable to a byte-stream <var>Longstring</var>, which the <var>String</var>ToHex method converts to its hex representation.
 ==See also==
 {{Template:String:U footer}}

Float class String class Unicode class	List of Float methods List of String methods List of Unicode methods List of Intrinsic methods	Float methods syntax String methods syntax Unicode methods syntax
Notation conventions for methods
