EbcdicToUnicode (String function)
Revision as of 22:25, 10 June 2012
Convert EBCDIC string to Unicode (String class)
EbcdicToUnicode is an intrinsic function that converts an EBCDIC string to Unicode using the current Unicode tables. Options are available to control:
- The conversion of XML-style hexadecimal character references, XHTML entity references, and &amp; references to the Unicode characters they represent
- How to handle untranslatable EBCDIC characters
Syntax
    %unicode = string:EbcdicToUnicode[( [CharacterDecode= boolean], -
                                        [Untranslatable= unicode])]
       Throws CharacterTranslationException
Syntax terms
%unicode | A Unicode variable to receive the method object string translated to Unicode. |
---|---|
string | An EBCDIC character string. |
CharacterDecode | The optional, but name required, argument CharacterDecode is a Boolean enumeration. If True, character references and entity references in the input string are converted to the Unicode characters they represent. |
Untranslatable | The optional, but name required, argument Untranslatable is a single character or a null string that specifies how to handle EBCDIC input characters that are not translatable to Unicode. If the value is a single Unicode character, any untranslatable EBCDIC characters are replaced with that Unicode character. The Untranslatable parameter is available as of Sirius Mods version 7.5. It provides the functionality formerly provided by the EbcdicTranslateNonUnicode and EbcdicRemoveNonUnicode methods, which are invalid as of Sirius Mods 7.5. |
Exceptions
EbcdicToUnicode can throw the following exception:
- CharacterTranslationException: If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.
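How a replacement character interacts with the exception can be sketched outside SOUL. The Python fragment below is only an illustration, not the Sirius implementation: the ten-entry table `EBCDIC_DIGITS`, the class `CharacterTranslationError`, and the function `ebcdic_to_unicode` are hypothetical names invented here, and the table is kept deliberately small so that X'FF' has no translation. The null-string case of Untranslatable is not modeled.

```python
# Hypothetical, deliberately tiny EBCDIC-to-Unicode table (digits only).
# A real table covers a whole code page; this one is an assumption made
# so that X'FF' has no translation.
EBCDIC_DIGITS = {0xF0 + d: str(d) for d in range(10)}


class CharacterTranslationError(Exception):
    """Stand-in for CharacterTranslationException (hypothetical)."""

    def __init__(self, byte, position):
        super().__init__(
            f"EBCDIC character X'{byte:02X}' without valid translation "
            f"at byte position {position}"
        )
        self.byte = byte
        self.position = position  # 1-based byte position


def ebcdic_to_unicode(data, untranslatable=None):
    """Translate bytes using the toy table above.

    untranslatable=None  -> raise on the first untranslatable byte
    untranslatable="?"   -> substitute that single character instead
    """
    if untranslatable is not None and len(untranslatable) != 1:
        raise ValueError("this sketch only models a single replacement character")
    out = []
    for position, byte in enumerate(data, start=1):
        char = EBCDIC_DIGITS.get(byte)
        if char is None:
            if untranslatable is None:
                raise CharacterTranslationError(byte, position)
            char = untranslatable
        out.append(char)
    return "".join(out)


print(ebcdic_to_unicode(bytes.fromhex("F1FFF2"), untranslatable="?"))  # 1?2
```

Calling `ebcdic_to_unicode(bytes.fromhex("F1FFF2"))` without a replacement character raises the stand-in exception with `position` equal to 2, which mirrors the kind of location information the exception object is described as carrying.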
Usage notes
- Using EbcdicToUnicode with CharacterDecode=True (or using the U function) is necessary if the string you want to convert to Unicode may contain a hexadecimal character reference or an XHTML entity reference that you want converted to the corresponding Unicode character.
- EbcdicToUnicode is available as of Sirius Mods version 7.3.
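As a rough analogy for what reference decoding does, the Python sketch below decodes EBCDIC bytes and then resolves any character or entity references. It rests on assumptions: the `cp037` codec stands in for whatever Unicode tables are current at a given site, the function name `ebcdic_to_text_decoded` is hypothetical, and `html.unescape` accepts a broader set of references (HTML5 names, decimal references) than the ones described here.

```python
import html


def ebcdic_to_text_decoded(data, character_decode=False):
    """Decode EBCDIC bytes, optionally resolving references afterwards.

    cp037 is an assumed code page, chosen only for illustration.
    """
    text = data.decode("cp037")
    if character_decode:
        # Resolves hexadecimal character references, named entities, and &amp;
        text = html.unescape(text)
    return text


sample = "1&#x2122;2".encode("cp037")          # EBCDIC bytes holding a reference
print(ebcdic_to_text_decoded(sample))           # reference left as literal text
print(ebcdic_to_text_decoded(sample, True))     # reference replaced by U+2122
```

Without decoding, the reference survives as ordinary characters; with decoding, it collapses to the single character it names.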
Examples
The following fragment shows four calls of EbcdicToUnicode: respectively against translatable EBCDIC characters, a string with a character reference, a string with an entity reference, and a string with an EBCDIC character that cannot be translated to Unicode. The X constant function is used in the example.
    %e string Len 20
    %u unicode
    %e = '12'
    %u = %e:EbcdicToUnicode
    Print %u
    Print %u:UnicodeToUtf16:StringToHex
    %e = '1&#x2122;2'
    %u = %e:EbcdicToUnicode(CharacterDecode=True)
    Print %u:UnicodeToUtf16:StringToHex
    %e = '&copy;'
    %u = %e:EbcdicToUnicode(CharacterDecode=True)
    Print %u
    %e = 'F1FFF2':X
    %u = %e:EbcdicToUnicode
The result of the above fragment is:
    12
    00310032
    003121220032
    ©
    CANCELLING REQUEST: MSIR.0751: Class STRING, function EBCDICTOUNICODE: CHARACTER TRANSLATIONEXCEPTION exception: EBCDIC character X'FF' without valid translation to Unicode at byte position 2 ...
Note: The initial Print %u statement in the example above is not very revealing, because it is equivalent to specifying Print %u:UnicodeToEbcdic: a Unicode string is implicitly converted to EBCDIC when it is used in an EBCDIC context like a Print statement. UnicodeToUtf16, however, converts the Unicode variable to a byte-stream string, which StringToHex converts to its hex representation.
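The hex values in the result can be reproduced independently. The short Python sketch below assumes big-endian UTF-16 with no byte order mark, which is what the output shown above is consistent with; the helper name `utf16_hex` is hypothetical.

```python
def utf16_hex(text):
    """Return the big-endian UTF-16 encoding of text as uppercase hex."""
    return text.encode("utf-16-be").hex().upper()


print(utf16_hex("12"))         # 00310032
print(utf16_hex("1\u21222"))   # 003121220032 (U+2122 is the trademark sign)
```

Each character in these strings occupies two bytes, so the second result shows the decoded reference as the single code unit 2122 between the codes for "1" and "2".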
See also
- U is a compile-time-only equivalent of the EbcdicToUnicode method (with the CharacterDecode argument implicitly set to True).
- You can find the list of XHTML entities on the Internet at the following URL:
- More information is available about Unicode.
- The EbcdicToAscii method converts an EBCDIC string to ASCII.