UrlDecodeUnicode and FormUrlDecodeUnicode (String functions)

Decode URL-encoded characters to Unicode (String class)

[Introduced in Sirius Mods 7.9]

UrlDecodeUnicode and FormUrlDecodeUnicode decode URL-encoded (percent-encoded) data to a Unicode string.

Syntax

%unicode = string:UrlDecodeUnicode[( [AllowUntranslatable= boolean])] Throws CharacterTranslationException

%unicode = string:FormUrlDecodeUnicode[( [AllowUntranslatable= boolean])] Throws CharacterTranslationException

Syntax terms

%unicode The Unicode variable that results when string is decoded.
string The UTF-8 encoded string that contains a URL-encoded representation of a Unicode string.
AllowUntranslatable This optional, name-required argument indicates whether Unicode characters that cannot be translated back to EBCDIC are to be allowed. If boolean is False, a Unicode character that cannot be translated back to EBCDIC results in request cancellation.

Exceptions

UrlDecodeUnicode and FormUrlDecodeUnicode can throw the following exception:

CharacterTranslationException
If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.

Usage notes

  • If an EBCDIC version of URL-encoded data needs to be converted to Unicode, the EBCDIC string must be:
    1. Assigned to a Unicode variable (using either implicit EBCDIC-to-Unicode translation or doing it explicitly with EbcdicToUnicode)
    2. Converted to UTF-8 using UnicodeToUtf8
    3. Translated to Unicode using UrlDecodeUnicode

    An example below illustrates this process.

  • The difference between FormUrlDecodeUnicode and UrlDecodeUnicode is that FormUrlDecodeUnicode converts plus signs (+) to spaces, while UrlDecodeUnicode expects spaces to be percent-encoded as %20. UrlDecodeUnicode throws a CharacterTranslationException if it encounters a plus character. Typically, form URL encoding is used only in HTML form posts, when the form is posted with the application/x-www-form-urlencoded content type.
  • URL encoding is mostly used in web applications or for encoding a URI, possibly in an XML namespace declaration.
  • The inverses of UrlDecodeUnicode and FormUrlDecodeUnicode are UnicodeUrlEncode and UnicodeFormUrlEncode, respectively.
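
The plus-versus-%20 distinction described above is not specific to SOUL. As a rough illustration only, the following Python sketch uses the standard library's urllib.parse functions as stand-ins: unquote_plus behaves like form URL decoding, and unquote behaves like plain URL decoding. Note one assumption-breaking difference: Python's unquote leaves a plus sign untouched, whereas UrlDecodeUnicode is documented above as throwing an exception when it sees one. The sample string is made up for this sketch.

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical sample: one '+' and one '%20', plus UTF-8 bytes for the pi sign.
encoded = "apple+pie%20%CF%80"

# Plain URL decoding: only percent escapes are decoded; '+' stays a plus sign.
plain = unquote(encoded)

# Form URL decoding: '+' is turned into a space, then escapes are decoded.
form = unquote_plus(encoded)

print(plain)  # apple+pie π
print(form)   # apple pie π
```

The percent escapes %CF%80 are the UTF-8 bytes of π, which is why the decoded result is a Unicode string rather than a byte-for-byte copy of the input.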

Examples

URL decoding an EBCDIC string

The following example URL-decodes a URL-encoded EBCDIC string to Unicode:

b
%ebcdic is longstring
%unicode is unicode
%ebcdic = 'I%20like%20apple%20%CF%80%20and%20eat%20it%204%20%C3%97%20a%20day.'
%unicode = %ebcdic:ebcdicToUnicode:unicodeToUtf8:urlDecodeUnicode
printText {~} = {%unicode}
end

and outputs:

%unicode = I like apple π and eat it 4 × a day.

Catching an exception when URL decoding

The following example catches a URL decoding exception. The supposedly URL-encoded string contains %2x, which is not a valid percent-encoded hexadecimal value, so an exception is thrown:

begin
%url is longstring
%u is unicode
%err is object characterTranslationException
%url = 'What%20the%20deuce%2xis%20going%20on%20here?':u:unicodeToUtf8
printText ************** {~}="{%url:utf8ToUnicode}"
try
   %u = %url:urlDecodeUnicode
   printText It worked! {~=%u}
catch CharacterTranslationException to %err
   printText Caught CharacterTranslationException:
   printText Reason: {%err:reason}
   printText HexValue: {%err:hexValue}
   printText Position: {%err:bytePosition}
   printText Problem: {%err:description}
end try
end

This outputs:

************** %url:utf8ToUnicode="What%20the%20deuce%2xis%20going%20on%20here?"
Caught CharacterTranslationException:
Reason: InvalidUrlEncoding
HexValue: 78
Position: 21
Problem: invalid URL encoding X'78' at byte position 21: Hexadecimal digit expected
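
To see why the reported position is 21 and the hex value is 78, it may help to walk the bytes by hand. The Python sketch below is not how UrlDecodeUnicode is implemented; it is a hypothetical strict decoder (the name strict_url_decode and its error text are assumptions made for this illustration) that rejects any percent sign not followed by two hexadecimal digits and reports the offending byte with a 1-based position. It does not model the plus-sign check or EBCDIC translatability.

```python
def strict_url_decode(data: bytes) -> str:
    """Percent-decode UTF-8 bytes, rejecting malformed escapes (illustrative only)."""
    hex_digits = b"0123456789abcdefABCDEF"
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x25:  # '%' starts an escape: two hex digits must follow
            for j in (i + 1, i + 2):
                if j >= len(data):
                    raise ValueError(f"truncated escape at byte position {i + 1}")
                if data[j] not in hex_digits:
                    # Report the bad byte in hex and its 1-based position
                    raise ValueError(
                        f"invalid URL encoding X'{data[j]:02X}' at byte position {j + 1}"
                    )
            out.append(int(data[i + 1:i + 3].decode("ascii"), 16))
            i += 3
        else:
            out.append(data[i])  # ordinary byte: copy through unchanged
            i += 1
    return out.decode("utf-8")


try:
    strict_url_decode(b"What%20the%20deuce%2xis%20going%20on%20here?")
except ValueError as err:
    message = str(err)
    print(message)  # invalid URL encoding X'78' at byte position 21
```

Counting from 1, the percent sign of %2x is byte 19, the 2 is byte 20, and the x (ASCII X'78') is byte 21, which matches the Position and HexValue shown in the output above.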

See also