EbcdicToUnicode (String function): Difference between revisions

Revision as of 23:21, 3 February 2011

Convert EBCDIC string to Unicode (String class)

EbcdicToUnicode is an intrinsic function that converts an EBCDIC string to Unicode using the current Unicode tables. Options are available to control:

whether XML-style hexadecimal character references, XHTML entity references, and '&amp;' references are converted to the Unicode characters they represent;
how untranslatable EBCDIC characters are handled.

Syntax

%unicode = string:EbcdicToUnicode[( [CharacterDecode= boolean], [Untranslatable= unicode])] Throws CharacterTranslationException

Syntax terms

%unicode	A string variable to receive the method object `string` translated to `Unicode`.
string	An EBCDIC character string.
CharacterDecode	The optional, but `nameRequired`, argument `CharacterDecode` is a `Boolean enumeration`:
  If its value is `True`, an ampersand (`&`) in the input EBCDIC string is allowed *only* as the beginning of one of these types of character or entity reference:
    The substring '&amp;'. This substring is converted to a single '&' character.
    A hexadecimal character reference (for example, the eight characters '&#x201C;' for the `Unicode` Left double quotation mark `'“'`). The character reference is converted to the referenced character.
    As of `Sirius Mods` version 7.6, an XHTML entity reference (for example, the six characters '&copy;' for the copyright character `'©'`). The entity reference is converted to the referenced character.
  A decimal character reference (for example, '&#172;') is *not* allowed.
  If its value is `False`, the default, an ampersand is treated only as a normal character.
Untranslatable	The optional, but `nameRequired`, argument `Untranslatable` is a single character or a null string that specifies how to handle EBCDIC input characters that are not translatable to `Unicode`:
  If the value is a single `Unicode` character, any untranslatable EBCDIC characters are replaced with that `Unicode` character.
  If the value is the null string, any untranslatable EBCDIC characters are removed from the input string.
  If the argument is omitted and an EBCDIC character is encountered that is not translatable to `Unicode`, a `CharacterTranslationException` exception is thrown.
  The `Untranslatable` parameter is available as of `Sirius Mods` version 7.5. It provides the functionality formerly provided by the `EbcdicTranslateNonUnicode` and the `EbcdicRemoveNonUnicode` methods, which are invalid as of `Sirius Mods` 7.5.
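The `CharacterDecode=True` rules above can be modeled outside of SOUL. The following is a minimal Python sketch, not the Sirius Mods implementation: the function name `decode_references` is hypothetical, it works on text that has already been decoded, and it assumes Python's `html.entities.name2codepoint` table is an acceptable stand-in for the XHTML entity set.

```python
import re
from html.entities import name2codepoint

# Accept only "&#xHHHH;" (hexadecimal reference) or "&name;" (entity reference).
# Decimal references such as "&#172;" deliberately do not match.
_REFERENCE = re.compile(r"&(?:#x([0-9A-Fa-f]+)|([A-Za-z][A-Za-z0-9]*));")


def decode_references(text: str) -> str:
    """Hypothetical model of CharacterDecode=True: every '&' must start a valid reference."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            out.append(text[i])
            i += 1
            continue
        match = _REFERENCE.match(text, i)
        if match is None:
            raise ValueError(f"invalid ampersand reference at position {i + 1}")
        hex_digits, name = match.groups()
        if hex_digits is not None:
            out.append(chr(int(hex_digits, 16)))
        elif name in name2codepoint:
            # Includes "amp", so "&amp;" becomes a single "&".
            out.append(chr(name2codepoint[name]))
        else:
            raise ValueError(f"unknown entity reference &{name};")
        i = match.end()
    return "".join(out)
```

In this model, `decode_references("&amp;")` yields a single ampersand, while a decimal reference or a bare ampersand raises `ValueError`, mirroring the "allowed only as the beginning of" wording above.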

Exceptions

EbcdicToUnicode can throw the following exception:

CharacterTranslationException: If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.
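The three behaviors selected by the `Untranslatable` argument (replace, remove, or throw) can be sketched in Python. This is an illustrative model only: `ebcdic_to_unicode`, `CharacterTranslationError`, and the `UNTRANSLATABLE` set are hypothetical names, Python's `cp037` codec stands in for the current Unicode tables, and byte X'FF' is treated as untranslatable only because the example on this page reports it that way.

```python
# Assumption: only X'FF' is untranslatable, as in the example on this page.
UNTRANSLATABLE = {0xFF}


class CharacterTranslationError(Exception):
    """Hypothetical stand-in for CharacterTranslationException."""


def ebcdic_to_unicode(data: bytes, untranslatable=None) -> str:
    """Model of the Untranslatable argument.

    untranslatable=None  -> raise on the first untranslatable byte
    untranslatable=""    -> drop untranslatable bytes
    untranslatable="?"   -> substitute the given single character
    """
    if untranslatable is not None and len(untranslatable) > 1:
        raise ValueError("untranslatable must be one character or the empty string")
    out = []
    for position, byte in enumerate(data, start=1):
        if byte in UNTRANSLATABLE:
            if untranslatable is None:
                raise CharacterTranslationError(
                    f"EBCDIC character X'{byte:02X}' without valid "
                    f"translation to Unicode at byte position {position}"
                )
            out.append(untranslatable)
        else:
            # cp037 is an assumed code page, not necessarily the site's table.
            out.append(bytes([byte]).decode("cp037"))
    return "".join(out)
```

With the input `bytes.fromhex("F1FFF2")`, this model returns `"1?2"` for `untranslatable="?"`, returns `"12"` for the empty string, and raises when the argument is omitted.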

Usage notes

Using EbcdicToUnicode (or the U function) is necessary if the string you want to convert to Unicode may contain a hexadecimal character reference. Such a reference cannot be meaningfully assigned to a Unicode variable otherwise.
EbcdicToUnicode is available as of Sirius Mods version 7.3.

Examples

The following fragment shows four calls of EbcdicToUnicode: respectively against translatable EBCDIC characters, a string with a character reference, a string with an entity reference, and a string with an EBCDIC character that cannot be translated to Unicode. The X constant function is used in the example.
    %e string Len 20
    %u unicode
    %e = '12'
    %u = %e:EbcdicToUnicode
    Print %u
    Print %u:UnicodeToUtf16:StringToHex
    %e = '1&#x2122;2'
    %u = %e:EbcdicToUnicode(CharacterDecode=True)
    Print %u:UnicodeToUtf16:StringToHex
    %e = '&copy;'
    %u = %e:EbcdicToUnicode(CharacterDecode=True)
    Print %u
    %e = 'F1FFF2':X
    %u = %e:EbcdicToUnicode

The result of the above fragment is:

    12
    00310032
    003121220032
    ©
    CANCELLING REQUEST: MSIR.0751: Class STRING, function EBCDICTOUNICODE: CHARACTER TRANSLATIONEXCEPTION exception: EBCDIC character X'FF' without valid translation to Unicode at byte position 2 ...

Note: The initial Print %u statement in the example above is not very revealing because it is equivalent to specifying Print %u:UnicodeToEbcdic; a Unicode string is implicitly converted to EBCDIC when it is used in an EBCDIC context like a Print statement. UnicodeToUtf16, however, converts the Unicode variable to a byte-stream string, which StringToHex converts to its hex representation.
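The hex values in the result can be checked independently. The sketch below is a Python analogue, not SOUL: the helper `utf16_hex` is a hypothetical name, and it assumes the byte stream shown above is big-endian UTF-16 without a byte order mark, which is consistent with the displayed output.

```python
def utf16_hex(text: str) -> str:
    """Hex of the big-endian UTF-16 encoding of text, without a byte order mark."""
    return text.encode("utf-16-be").hex().upper()


print(utf16_hex("12"))        # 00310032
print(utf16_hex("1\u21222"))  # 003121220032

# For comparison, the same two digits as EBCDIC bytes (assuming code page 037).
print("12".encode("cp037").hex().upper())  # F1F2
```

The last line illustrates why printing the Unicode value directly is not revealing: the digits look the same on output, and only the hex view distinguishes the two-byte Unicode code units from the one-byte EBCDIC codes.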

See also

http://www.w3.org/TR/xhtml1/dtds.html#h-A2
More information is available about Unicode.
The EbcdicToAscii method converts an EBCDIC string to ASCII.
