Serial (XmlDoc/XmlNode function): Difference between revisions

From m204wiki
m (1 revision)
m (xpath arg)
 
(65 intermediate revisions by 8 users not shown)
Line 1: Line 1:
<span style="font-size:120%; color:black"><b>Serialize selected subtree as string</b></span>
{{Template:XmlDoc/XmlNode:Serial subtitle}}
[[Category:XmlDoc methods|Serial function]]
<var>Serial</var> converts an <var>XmlDoc</var> subtree to the UTF-8 or EBCDIC text string representation of the subtree. This process is called '''serialization''', because the text representation of a document is called the '''serial''' form.
[[Category:XmlNode methods|Serial function]]
[[Category:XmlDoc API methods]]
[[Category:System methods]]
<!--DPL?? Category:XmlDoc methods|Serial function: Serialize selected subtree as string-->
<!--DPL?? Category:XmlNode methods|Serial function: Serialize selected subtree as string-->
<!--DPL?? Category:XmlDoc API methods|Serial (XmlDoc/XmlNode function): Serialize selected subtree as string-->
<!--DPL?? Category:System methods|Serial (XmlDoc/XmlNode function): Serialize selected subtree as string-->
<p>
Serial is a member of the [[XmlDoc class|XmlDoc]] and [[XmlNode class|XmlNode]] classes.
</p>


This function converts an XmlDoc subtree to the UTF-8 or EBCDIC
==Syntax==
text string representation of the subtree.
{{Template:XmlDoc/XmlNode:Serial syntax}}
(This process is called '''serialization''',
===Syntax terms===
because the text representation of a document is called the '''serial'''
<table class="syntaxTable">
form.)
<tr><th>%string</th>
===Syntax===
<td>A string variable for the string serialization of the subtree, encoded either in UTF-8 or, if the <var>EBCDIC</var> option (see below) is used, in EBCDIC.</td></tr>
  %utfOrEbcd = nr:Serial( [Xpath], [options]            -
                        [,&nbsp;AddTrailingDelimiter=bool] )


====Syntax Terms====
<tr><th>nr</th>
<dl>
<td>An <var>XmlDoc</var> or <var>XmlNode</var>, used as the context node for the <var class="term">xpath</var> expression. If an <var>XmlDoc</var>, the <var>Root</var> node is the context node.</td></tr>
<dt><i><b>%utfOrEbcd</b></i>
<dd>A string variable for the string serialization of the subtree,
encoded either in UTF-8 or,
if the <tt>EBCDIC</tt> option (see below) is used, in EBCDIC.
<dt>nr
<dd>An XmlDoc or XmlNode, used as the context node for the <i>XPath</i>
expression.
If an XmlDoc, the Root node is the context node.
<dt>XPath
<dd>A Unicode string that is an XPath expression that results in a nodelist,
the head of which is the top of the subtree to serialize.
This is an optional argument; its default is a period (.), that is, the node
referenced by the method object (<i>nr</i>).


Prior to ''Sirius Mods'' version 7.6, this is an EBCDIC string.
<tr><th>xpath</th>
<dt><i><b>options</b></i>
<td>A <var>Unicode</var> string that is an [[XPath#XPath_syntax|XPath expression]] that results in a nodelist, the head of which is the top of the subtree to serialize. Any other nodes in the nodelist are ignored.
<dd>A blank delimited string that can contain one or more of the following:
<p>
This is an optional argument; its default is a period (<tt>.</tt>), that is, the node referenced by the method object (<var class="term">nr</var>). </p>
<p>
Prior to <var class="product">[[Sirius Mods]]</var> Version 7.6, this is an EBCDIC string.</p></td></tr>
 
<tr><th><div id="options"></div>options</th>
<td>A blank delimited string that can contain one or more of the following options (but no repeats).
<p class="note">'''Note:''' These options are described in greater detail in [[XmlDoc API serialization options]]. </p>
<ul>
<ul>
<li><b>CharacterEncodeAll</b><br>
If the <var>EBCDIC</var> option is specified, use character encoding in all contexts (that is, not only in <var>Attribute</var> or <var>Element</var> values) to display Unicode characters that do not translate to EBCDIC. This option is available starting with version 8.0 of the <var class="product">Sirius Mods</var>.
<li><b>EBCDIC</b>
<li><b>EBCDIC</b>
<br>Produces serialized output in EBCDIC text rather than the default encoding, UTF-8.


This indicates that the serialization should be in EBCDIC rather
than UTF-8.
UTF-8 encoding is provided by default.
Selecting <tt>EBCDIC</tt> under ''Sirius Mods'' 7.6 or higher causes
conversion via the Unicode tables of the subtree content, which is
stored in Unicode.
For more information about the Unicode tables, see [[??]] refid=u80..
Serializing to UTF-8 involves no translation: the stored Unicode characters
are merely encoded as UTF-8.
<li><b>ExclCanonical</b>
<li><b>ExclCanonical</b>
<br>Produces serialized output in exclusive XML canonical form, as defined in the W3C "Exclusive XML Canonicalization" specification  (http://www.w3.org/tr/xml-exc-c14n).


This indicates that the output of the serialization will be in exclusive XML
<li><b>Indent <i>n</i></b>
canonical form, as defined in the W3C &ldquo;Exclusive XML Canonicalization&rdquo;
<br>Inserts space characters and line-ends into the serialized string such that if the string is broken at the line-ends and displayed as a tree, the display of each lower level in the subtree is indented ''n'' spaces from the previous level's starting point. You must also specify <code>CR</code>, <code>LF</code>, or <code>CRLF</code> (see below). </li>
specification (http://www.w3.org/tr/xml-exc-c14n),
which is an extension of the &ldquo;XML Canonicalization&rdquo; specification
(http://www.w3.org/TR/xml-c14n).
These specifications constrain serializations to facilitate
processing such as digital signatures.
 
This option, added in ''Sirius Mods'' version 7.0, is described in greater detail in the
&ldquo;Canonicalization&rdquo; list item
in the &ldquo;Usage Notes&rdquo; section, below.
 
Specifying any of the Serial method CR, LF, CRLF, or Indent options
when you also specify ExclCanonical is allowed.
Although the resulting output will not be completely canonical, it may be
what you require for the purposes of a digital signature, for example.
The formatting addressed by those options is defined in
the Exclusive Canonicalization specification and covered by the ExclCanonical
option.
 
Similarly, the effect of the XmlDecl option contradicts the Exclusive
Canonicalization specification.
If you do specify the XmlDecl and ExclCanonical options together, however,
the serialized XML Declaration is followed by a linefeed character.
<br>
<li><b>Indent</b> <i><b>n</b></i>


Inserts space characters (and line-ends, as described for the next option)
<li><b>CR</b> (carriage-return), <b>LF</b> (linefeed), or <b>CRLF</b> (carriage-return followed by a linefeed)
into the serialized string such that
<br>Inserts one of these line-end characters to provide line breaks in the serialized output. </li>
if the string is broken at the line-ends and displayed as a tree,
the display of each lower level in the subtree
is indented ''n'' spaces from the previous level's starting point.


If serialized output with an Indent value of 2 is displayed as a tree,
<div id="noempty"></div>
the spacing is as in the following:
<pre>
    <top>
      <leaf1 xx="yy">value</leaf1>
      <sub>
        <leaf2>value</leaf2>
      </sub>
    </top>
</pre>
 
One of the line-end options, below, must also be specified.
 
<i>n</i> is a non-negative integer, and its maximum value
(as of ''Sirius Mods'' version 7.0) is 254.
<li>One of the '''line-end options''' below, to provide line breaks
in the output after any of the following is serialized:
<ul>
<li>An element start-tag, if it has any non-text node children
<li>An element end tag
<li>An empty element tag
<li>A processing instruction (PI)
<li>A comment
<li>A text node, if it has any siblings
</ul>
<dl>
<dt>CR
<dd>Insert a carriage-return character as the line-end sequence in the above cases.
<dt>LF
<dd>Insert a linefeed character as the line-end sequence in the above cases.
<dt>CRLF
<dd>Insert a carriage-return character followed by a
linefeed character as the line-end sequence in the above cases.
</dl>
'''Note:'''
If one of these line-end options is specified and
an <tt>AddTrailingDelimiter=false</tt> argument is also specified, no
line-end character is added at the end of the serialized subtree.
<br>
<li><b>NoEmptyElt</b>
<li><b>NoEmptyElt</b>
<br>This deprecated option serializes all empty elements with a start tag followed by an end tag. The default is to serialize an empty element with an empty element tag (as in <code><middleName/></code>).
<p>
<var>NoEmptyElt</var> is deprecated in order to deter users from using it to serialize HTML: The recommended approach for HTML is shown on the <var>[[NoEmptyElement (XmlNode property)#browserExample|NoEmptyElement]]</var> property page &mdash; some tags (<code>&#x3c;div></code>) <b>require</b> separate start and end tags, while other tags (<code>&#x3c;br></code>) <b>do not allow</b> separate start and end tags. </p></li>


Deprecated as of ''Sirius Mods'' version 7.0,
this option ensures that all empty elements are serialized with a start tag
followed by an end tag.
For example:
<pre>
    <middleName></middleName>
</pre>
If NoEmptyElt is not specified, the default is to serialize an empty
element with an empty element tag; using the same example as above,
this would be:
<pre>
    <middleName/>
</pre>
The ExclCanonical option provides the same empty element serialization
as NoEmptyElt.
<li><b>OmitNullElement</b>
<li><b>OmitNullElement</b>
<br>An <var>Element</var> node that has no children and no <var>Attributes</var> will not be serialized, unless it is the top level <var>Element</var> in the subtree being serialized. </li>


An Element node that has no children and
no Attributes will not be serialized, unless it is the top level Element in
the subtree being serialized.
The serialization of a child-less and Attribute-less Element is omitted,
even if the Element's serialization would contain
Namespace declarations in its start tag.
If an Element node has no Attributes, but has (only) Element children
(one or more), and all of its children are Attribute-less and child-less,
then that parent Element is serialized, even though its content in the
serialization is empty.
That parent is serialized with a start tag and an end tag
(and an inserted line separator, if called for by the serializing
method's parameter options).
For example, if the Serial method display of a particular XmlDoc in tree format
is the following when OmitNullElement is ''not'' specified:
<pre>
    <top>
      <middle>
          <empty/>
          <p:empty2 xmlns:p="uri:stuff"/>
      </middle>
    </top>
</pre>
Here is the display of the XmlDoc with the OmitNullElement option specified:
<pre>
    <top>
      <middle>
      </middle>
    </top>
</pre>
But if you attempt to display only the <tt>empty</tt> subtree of %d
using OmitNullElement, the <tt>empty</tt> node
is not suppressed, and the result is:
<pre>
    <empty/>
</pre>
The OmitNullElement option is available as of ''Sirius Mods'' version 7.3.
<li><b>SortCanonical</b>
<li><b>SortCanonical</b>
<br>This deprecated option serializes namespace declarations and attributes in sorted order (from lowest to highest with Unicode code ordering).  It is superseded by the <var>ExclCanonical</var> option. </li>


Deprecated as of ''Sirius Mods'' version 7.0, SortCanonical serializes
<li><b>UTF-8</b>
namespace declarations (based on the prefix being declared) and
<br>Produces serialized output in UTF-8. This is the default. </li>
attributes (based on the namespace URI followed by the local name)
in sorted order.
This can be useful, for instance, when using Serial to
serialize a portion of an XML document for a signature.


The sort order for namespace declarations and attributes is from
lowest to highest, and it uses the Unicode code ordering (for example,
numbers are lower than letters).
Added in ''Sirius Mods'' version 6.9, this option is superseded by the ExclCanonical
option.
<br>
<li><b>WithComments</b>
<li><b>WithComments</b>
<br>Includes in the serialized output all <var>Comment</var> nodes in the specified subtree.
<p class="note">'''Note:''' Specifying <var>WithComments</var> without specifying <var>ExclCanonical</var> has no effect. Specifying <var>ExclCanonical</var> without specifying <var>WithComments</var> suppresses all <var>Comment</var> nodes from the result.</p> </li>


This indicates that all Comment nodes
in the specified subtree are to be included in the serialized output.
'''Note:'''
This option, added in ''Sirius Mods'' version 7.0, is only a supplement
to the ExclCanonical option:
specifying WithComments without specifying ExclCanonical has no effect.
Specifying ExclCanonical without specifying WithComments causes all Comment
nodes to be suppressed from the result.
<li><b>XmlDecl</b>
<li><b>XmlDecl</b>
 
<br>Ensures that the serialization will contain the "XML Declaration" (<code><?xml version=...?></code>), if the value of the <var>[[Version (XmlDoc property)|Version]]</var> property is a non-null string, if the <var>XmlDoc</var> is not empty, and if the top of the subtree being serialized is the <var>Root</var> node. </li>
This indicates that the serialized XmlDoc will contain the &ldquo;XML
Declaration&rdquo; (<?xml version=...?>), if the value of the Version property
([[Version (XmlDoc property)|Version]]) is a non-null string, and if the
XmlDoc is not empty.
 
XmlDecl may only be specified if the top of the subtree being serialized is the
Root node.
 
The XmlDecl option is new in ''Sirius Mods'' version 6.7.
</ul>
</ul>
<dt>AddTrailingDelimiter=<i><b>bool</b></i>
</td></tr>
<dd>This Boolean, name required parameter determines whether a final line-end
character is added to the serialization when one of
the Serial method line-end options (<tt>LF</tt>, <tt>CR</tt>, or
<tt>CRLF</tt>) is specified.


The default value of AddTrailingDelimiter is <tt>True</tt>,
<tr><th><var>AddTrailingDelimiter</var></th>
so Serial specified with a line-end option adds a trailing line-end by default.
<td>This [[Methods#Named parameters|name required]] parameter is a <var>[[Enumerations#Using Boolean enumerations|Boolean]]</var> value, which determines whether a final line-end character is added to the serialization when one of the <var>Serial</var> method line-end options (<var>LF</var>, <var>CR</var>, or <var>CRLF</var>) is specified.
If AddTrailingDelimiter is
<p>
specified as <tt>False</tt>, no final line-end character is added.
The default value of <var>AddTrailingDelimiter</var> is <var>True</var>, so <var>Serial</var> specified with a line-end option adds a trailing line-end by default. If <var>AddTrailingDelimiter</var> is <var>False</var>, no final line-end character is added.</p>
 
<p>
Specifying the AddTrailingDelimiter argument without also specifying one of the
Specifying the <var>AddTrailingDelimiter</var> argument without also specifying one of the line-end options has no effect on the resulting serialization.</p>
line-end options has no effect on the resulting serialization.
<p>
 
<var>AddTrailingDelimiter</var> was introduced as of <var class="product">Sirius Mods</var> Version 7.0. It may be useful if a digital signature must be created which includes line-end characters between XML tags, but the <var>XmlDoc</var> does not contain those line-end <var>Text</var> nodes.</p></td></tr>
AddTrailingDelimiter is new as of ''Sirius Mods'' version 7.0.
</table>
It may be useful if a digital signature must be created which includes line-end
characters between XML tags, but the XmlDoc does not contain those line-end Text
nodes.
</dl>
===Usage Notes===
<ul>
<li>To obtain a Longstring that is the UTF-8
serialization of an entire XmlDoc, including the &ldquo;XML declaration,&rdquo;
use [[Xml (XmlDoc function)|Xml]].
<li>The ''options'' argument values may be specified in any case.
For example,
<tt>XmlDecl</tt> and <tt>xmldecl</tt> are interchangeable.
<li>Line-end/whitespace characters:
<ul>
<li>Using one of the line-end character options (CR, LF, CRLF) produces output
that is similar to the BothCompact option of the Print method
([[Print (XmlDoc/XmlNode subroutine)|Print]]).
<li>If one of the line-end
(<tt>CR</tt>, <tt>LF</tt>, <tt>CRLF</tt>) options
is specified, and an element to be serialized has the
<tt>xml:space="preserve"</tt> attribute, then
within the serialization of that element and its descendants, no line-end
characters are inserted.
 
In addition, the <tt>xml:space="default"</tt> attribute has no effect
under these options:
specified by itself, it does not influence serialization, nor
does it cause resumption of readability line-ends or indents if they were
suspended by a containing <tt>xml:space="preserve"</tt>.
<li>As of version 6.7, the Serial method uses the hexadecimal
character references specified in the XML Canonicalization specification
(http://www.w3.org/TR/xml-c14n) to
display the following whitespace characters:
<ul>
<li>For Attribute nodes: tab, carriage return (CR), and linefeed (LF)
<li>For Text nodes: carriage return
</ul>
 
Since the character references are not subject to the standard XML whitespace
normalization ([[??]] refid=normwhi.),
a serialized document (or subtree) that is then deserialized
will retain this whitespace.
 
These character references are used:
<dl>
<dt>tab
<dd>&amp;#x9;
<dt>CR
<dd>&amp;#xD;
<dt>LF
<dd>&amp;#xA;
</dl>
 
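The EBCDIC and corresponding ASCII encodings of these characters are:
<table>
<tr><th>Character</th><th>EBCDIC</th><th>ASCII</th></tr>
<tr><td>tab</td><td>X'05'</td><td>X'09'</td></tr>
<tr><td>CR</td><td>X'0D'</td><td>X'0D'</td></tr>
<tr><td>LF</td><td>X'25'</td><td>X'0A'</td></tr>
</table>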
</ul>
<li>As of ''Sirius Mods'' version 7.6, Attribute values are always serialized
within double-quotation-mark (<tt>"</tt>) delimiters,
and a double-quotation mark character in an attribute value is serialized
as <tt>&amp;quot;</tt>.
Prior to version 7.6, this convention was not strictly observed.
<li>Canonicalization:
 
Canonicalization refers to
a particular serialization of an XML document that is
unique, yet still a logically equivalent representation
of the document.
Exclusive canonicalization is canonicalization augmented by rules for
preserving or excluding the namespace context (declaration) of nodes when
''only a portion of an XML document'' is serialized.
 
Therefore, if a portion (subtree) of an XML document is exclusively
canonicalized, it is
serialized uniquely and is &ldquo;substantially independent of its XML context&rdquo;
(that is, contains all essential and no extraneous information from its
ancestor nodes).
This independence makes the subtree suitable for working with digital signatures.
 
Some of the many requirements for canonicalization are provided automatically
by specifying the Serial method with no options specified.
For example, UTF-8 encoding and exclusion of the XML declaration, if any,
are provided by default by Serial.
Specifying <tt>ExclCanonical</tt>, which is new as of ''Sirius Mods'' version 7.0,
adds the following features to the no-option default:
<ul>
<li>Sorting of namespace declarations (based on the prefix
being declared) and of attributes (based on the namespace URI followed by the
local name).
The sort order is from lowest
to highest, and it uses the Unicode code ordering (for example, numbers
are lower than letters).
<li>For empty elements, serialization with both a start tag and an end tag,
instead of using a single &ldquo;empty element tag.&rdquo;
<li>The suppression of any Comment nodes that may be present in the subtree.
Comment nodes are suppressed unless the <tt>WithComments</tt> option is
specified along with ExclCanonical.
 
For an example, see the &ldquo;PIs and Comments&rdquo; item in the &ldquo;Examples&rdquo; section.
<li>Special namespace declaration handling: A namespace declaration is produced
only if it is utilized by an element or attribute in the subtree.
The declaration is produced in the
start-tag of an element that uses it (or has an attribute using
it), unless the parent of the element is in the subtree and the
declaration is in scope at the parent.
 
For examples, see the first two <tt>ExclCanonical</tt> examples in the &ldquo;Examples&rdquo; section.
<li>''Attribute values'' are always serialized within
double-quotation-mark (<tt>"</tt>) delimiters,
and a double-quotation mark character in an attribute value is serialized
as <tt>&amp;quot;</tt>.


With or without the ExclCanonical option,
==Usage notes==
these special characters in attribute values are serialized
as entity and hexadecimal character references:
<ul>
<ul>
<li>The ampersand (&) is serialized as <tt>&amp;amp;</tt>
<li>The <var class="term">options</var> argument values may be specified in any case. For example, <code>XmlDecl</code> and <code>xmldecl</code> are interchangeable.
<li>The less-than symbol (<) is serialized as <tt>&amp;lt;</tt>
<li>As of <var class="product">Sirius Mods</var> Version 7.6, <var>Attribute</var> values are always serialized within double-quotation-mark (<tt>"</tt>) delimiters, and a double-quotation mark character in an <var>Attribute</var> value is serialized as <code>&amp;quot;</code>. Prior to <var class="product">Sirius Mods</var> Version 7.6, this convention was not strictly observed.
<li>The carriage return (CR) character is serialized as <tt>&amp;#xD;</tt>
<li>The linefeed (LF) character is serialized as <tt>&amp;#xA;</tt>
<li>The tab character is serialized as <tt>&amp;#x9;</tt>
</ul>
</ul>


For examples, see the &ldquo;Character references&rdquo; item in the &ldquo;Examples&rdquo; section.
==Examples==
 
<li>Within ''Text nodes'', the following characters are
serialized as entity and hexadecimal character references:
 
If you specify Serial with no options:
<ul>
<li>The less-than symbol (<) is serialized as <tt>&amp;lt;</tt>
<li>The ampersand (&) is serialized as <tt>&amp;amp;</tt>
<li>The carriage return (CR) character is serialized as <tt>&amp;#xD;</tt>
</ul>
 
If you specify the ExclCanonical option, the following is ''also'' true:
<ul>
<li>The greater-than symbol (>) is serialized as <tt>&amp;gt;</tt>
</ul>
 
For examples, see the &ldquo;Character references&rdquo; item in the &ldquo;Examples&rdquo; section.
<li>If serializing the Root of an XmlDoc, a linefeed character
is inserted ''between'' the children of the Root.
This character is represented exactly
by <tt>X'25'</tt> if the <tt>EBCDIC</tt> option of Serial is used; otherwise
it is represented by <tt>X'0A'</tt>.
'''Note:'''
No linefeed is inserted if the XmlDoc has one PI or Comment node and
does not have an Element node.
In this case (which is allowed by [[Janus SOAP]]), the XML document is not well-formed
and therefore the canonicalization specifications ignore it.
<li>If the subtree to be serialized is a single node that is either of these:
<ul>
<li>A PI child of the Root
<li>A single node that is a Comment child of the Root and
the <tt>WithComments</tt> option is specified
</ul>
 
Then a linefeed character is added after the PI or Comment if
there is a following Element sibling, or is added before the PI or Comment
if there is a preceding Element sibling.
'''Note:'''
No linefeed is inserted if the XmlDoc does not have an Element node.
In this case (which is allowed by [[Janus SOAP]]), the XML document is not well-formed
and therefore the canonicalization specifications ignore it.
</ul>
 
Qualifications/exceptions:
<ul>
<li>The canonicalization specifications, especially
exclusive canonicalization, include references to the
serialization of a ''subset'' of a document.
The ExclCanonical option is based not on a subset but on a ''subtree''.
<li>Although the ExclCanonical and SortCanonical options use
the &ldquo;Unicode&rdquo; sort sequence,
this is currently limited to Unicode values less than 256 (as
of version &NUNCVSN. of [[Janus SOAP]]),
so it is accomplished with an 8-bit EBCDIC to 8-bit
Unicode table, which is (for all intents and purposes) merely an EBCDIC-to-ASCII
translation.
<li>The specifications support an argument to canonicalization that
is a list of namespace declarations that are to be &ldquo;forced&rdquo; into the
serialization.
The ExclCanonical option does not provide this support.
</ul>
 
A series of examples of the effects of the ExclCanonical option
begins with the second item in the &ldquo;Examples&rdquo; section, below.
</ul>
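As an illustrative sketch only, the line-end behavior described in the notes above might be exercised as follows. The document content and the variable names <code>%doc</code>, <code>%sl</code>, and <code>%s</code> are assumptions made for this illustration. The request serializes with <code>Indent 2</code> and <code>LF</code>, suppresses the trailing line-end with <code>AddTrailingDelimiter=false</code>, and splits the result into a <var>Stringlist</var> for display:

```soul
begin
  * Illustrative names and content; not taken from a production request
  %doc is object xmlDoc
  %sl is object stringlist
  %s is longstring
  %doc = new
  %sl = new
  text to %sl
  <top><a><b>05</b></a><c><d att="val"/></c></top>
  end text
  call %doc:loadXml(%sl)
  * EBCDIC makes the string printable; LF supplies the line breaks Indent requires
  %s = %doc:serial(, 'ebcdic indent 2 lf', addTrailingDelimiter=false)
  * Break the string at its line-ends and print one item per line
  %sl = new
  %sl:parseLines(%s)
  %sl:print
end
```

Dropping the <code>ebcdic</code> option would leave the result in the default UTF-8 encoding, which is probably the better choice when the string feeds a digital signature rather than a display.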
===Examples===
<ol>
<ol>
<li>In the following example, the Serial method EBCDIC
<li>In the following example, the <var>Serial</var> method EBCDIC formatting of a document is shown. A <var>Print</var> statement display of the default UTF-8 formatting of <var>Serial</var> is a string that is not readily decipherable.
formatting of a document is shown.
<p class="code">begin
A Print statement display of the default UTF-8 formatting of Serial is a
  %doc is object xmlDoc
string that is not readily decipherable.
  %sl is object [[Stringlist_class|stringlist]]
<pre>
  %doc = new
    Begin
  %sl = new
    %doc is Object XmlDoc
    %sl is Object Stringlist
    %doc = New
    %sl = New


    text to %sl
  text to %sl
    <top>
  <nowiki><top>
       <a>
       <a>
        <b>05</b>
        <b>05</b>
       </a>
       </a>
       <c>
       <c>
        <d att="val"/>
        <d att="val"/>
       </c>
       </c>
    </top>
  </top></nowiki>
    end text
  end text
 
 
    Call %doc:LoadXml(%sl)
  [[Notation_conventions_for_methods#Callable_methods|Call]] %doc:[[LoadXml_(XmlDoc/XmlNode_function)|loadXml]](%sl)
    Print 'Serial method output follows:'
  print 'Serial method output follows:'
    Print %doc:Serial('top', 'ebcdic')
  print %doc:serial('top', 'ebcdic')
    End
end
</pre>
</p>
 
The example results follow:
The example results follow:
<pre>
<p class="output">Serial method output follows:
    Serial method output follows:
<nowiki><top><a><b>05</b></a><c><d att="val"/></c></top></nowiki>
    <top><a><b>05</b></a><c><d att="val"/></c></top>
</p>
</pre>
<li>In the following fragment, the <var>Serial</var> method EBCDIC formatting of a document with untranslatable Unicode is shown.
<li>This and the remaining examples show various aspects of
<p class="code">%doc2:[[AddElement_(XmlDoc/XmlNode_function)|addElement]]('circumference', -
the <tt>ExclCanonical</tt> option.
  '2 * &amp;#x3C0; * r':U)
The examples use the <tt>EBCDIC</tt> option to display
print %doc2:serial(, 'ebcdic')
the result.
</p>
If using ExclCanonical for digital signature processing, you probably
The result follows (the Unicode codepoint for the Greek letter &#x03c0; has the hexadecimal value 03C0):
should omit the EBCDIC option and use the default encoding, UTF-8.
<p class="output"><circumference>2 * &amp;#x03C0; * r</circumference>
 
</p>
Under exclusive canonicalization, a namespace is not serialized if it is not
</ol>
necessary.
In this example, the subtree to be serialized is the <tt>a</tt> element
in the request code that follows:
<pre>
    Begin
    %doc is Object XmlDoc
    %doc = New
    %l is longstring
    %sl is object stringlist
    %sl = New
    text to %sl
    <top>
      <a xmlns:p3="urn:p3" xmlns:p2="urn:p2" xmlns:p1="urn:p1">
            <p1:b/>
            <p2:b/>
      </a>
    </top>
    end text
 
    Call %doc:LoadXml(%sl)
    Print 'Exclcan via ParseLines:'
    %sl = New
    %l=%doc:Serial('top/a', 'EBCDIC exclcanonical indent 2 lf')
    %sl:Parselines(%l)
    %sl:Print
    End
</pre>
 
The exclusive canonical serialization (displayed, after being parsed from string
to Stringlist, with line breaks and indent for the sake of clarity)
omits the declaration for <tt>p3</tt>,
because it is not utilized in the serialized subtree:
<pre>
    <a>
      <p1:b xmlns:p1="urn:p1"></p1:b>
      <p2:b xmlns:p2="urn:p2"></p2:b>
    </a>
</pre>


An element '''utilizes''' an in-scope
==Request-cancellation errors==
namespace declaration in either of these cases:
This list is not exhaustive: it does <i>not</i> include all the errors that are request  cancelling.
<ul>
<ul>
<li>The element is prefixed and the declaration is of that prefix.
<li><var class="term">xpath</var> argument is invalid.
<li>The element is unprefixed and it is a default namespace declaration.
<li>Result of <var class="term">xpath</var> is empty.
<li>An <var class="term">options</var> setting is invalid.
<li>Insufficient free space exists in CCATEMP.
</ul>
</ul>


An attribute '''utilizes''' an in-scope
==See also==
namespace declaration if
<ul>
the attribute is prefixed and the declaration is of that prefix.
<li>The subroutine that serializes an <var>XmlDoc</var> and sends it as a Web response is <var>[[WebSend (XmlDoc subroutine)|WebSend]]</var>. </li>


In the preceding example, there was no alternative to removing the non-utilized
<li>Additional serializing methods include:
declaration for <tt>p3</tt>, but if it were utilized by a
descendant element &ldquo;lower&rdquo; in the document tree, it would be
moved to that element.
 
Another application of the utilization rule is shown in the next example.
 
<li>Under exclusive canonicalization, namespaces are imported to where
they are needed.
 
Using the same type of request as in the preceding example,
the <tt>w</tt> element is the subtree to serialize (display form):
<pre>
    <a xmlns:p3="urn:p3" xmlns:p2="urn:p2" xmlns:p1="urn:p1">
    <w>
      <p1:b/>
      <p2:b/>
    </w>
    </a>
</pre>
 
Exclusive canonical serialization (display form), which gets required namespace
declarations from an ancestor of the serialized subtree:
<pre>
    <w>
      <p1:b xmlns:p1="urn:p1"></p1:b>
      <p2:b xmlns:p2="urn:p2"></p2:b>
    </w>
</pre>
<li>PIs and Comments
 
Using the same type of request as in the first <tt>ExclCanonical</tt> example above,
this is the subtree to be serialized (display form):
<pre>
    <a>
      <!-- Comment 1 -->
      <w>
        <?pi-without-data?>
      </w>
    </a>
</pre>
 
Exclusive canonical serialization (display form),
which omits the Comment node:
<pre>
    <a>
      <w>
        <?pi-without-data?>
      </w>
    </a>
</pre>
'''Note:'''
To include the Comment node, specify also the <tt>WithComments</tt>
option of Serial.
 
<li>Character references
 
Using the same type of request as in the first <tt>ExclCanonical</tt> example above,
this is the subtree to serialize (display form):
<pre>
    <doc>
      <comp>val>"0" val&amp;lt;"10"</comp>
      <comp expr='val>"0"'></comp>
      <norm attr=' &amp;apos;  &amp;#xD;&amp;#xA;&amp;#x9; &amp;apos; '/>
      <white>&amp;#x9;&amp;#xD;&amp;#xA;</white>
    </doc>
</pre>
 
This is the result from Serial method ''with no options'' specified
(display form, and the <tt><white></tt> element has a line that wraps
to emphasize the non-visible linefeed character it contains):
<pre>
    <doc>
      <comp>val>"0" val&amp;lt;"10"</comp>
      <comp expr='val>"0"'></comp>
      <norm attr=" '  &amp;#xD;&amp;#xA;&amp;#x9; ' "/>
      <white> &amp;#xD;
    </white>
    </doc>
</pre>
 
The exclusive canonical serialization follows (display form,
wrapped <tt><white></tt> element line has no indent).
<pre>
    <doc>
      <comp>val&amp;gt;"0" val&amp;lt;"10"</comp>
      <comp expr="val>&amp;quot;0&amp;quot;"></comp>
      <norm attr=" '  &amp;#xD;&amp;#xA;&amp;#x9; ' "></norm>
      <white> &amp;#xD;
    </white>
    </doc>
</pre>
 
The differences from no-option Serial include:
<ul>
<ul>
<li>The greater-than symbol (>) within a text node is serialized
<li><var>[[Xml (XmlDoc function)|Xml]]</var>
as <tt>&amp;gt;</tt>.
<li><var>[[Print (XmlDoc/XmlNode subroutine)|Print]]</var>
<li>Attribute values are enclosed in double-quotation marks (<tt>"</tt>).
<li><var>[[AddXml (HttpRequest subroutine)|AddXml]]</var> (<var>[[HttpRequest class|HttpRequest]]</var> class)
<li>A double-quotation mark in an attribute value is serialized
</ul> </li>
as <tt>&amp;quot;</tt>.
<li>An empty element is serialized with two tags (a start tag
followed by an end tag), not with a single empty-element tag.
</ul>
</ol>


===Request-Cancellation Errors===
<li>See the description of [[XmlDoc_API_serialization_options#EBCDIC_serialization_of_untranslatable_Unicode_characters|Unicode to EBCDIC conversion]] performed by <var>Serial</var> with the <code>EBCDIC</code> option. </li>
<ul>
<li><i>XPath</i> is invalid.
<li>Result of <i>XPath</i> is empty.
<li><i>Options</i> are invalid.
<li>Insufficient free space exists in CCATEMP.
</ul>


<li>To deserialize a string, use <var>[[LoadXml (XmlDoc/XmlNode function)|LoadXml]]</var> or <var>[[WebReceive (XmlDoc function)|WebReceive]]</var>. </li>


===See Also===
<li>For more information about using XPath expressions, see [[XPath]]. </li>


<ul>
<li>For additional discussion about serialization, see [[XmlDoc API#Transport: receiving and sending XML|Transport: receiving and sending XML]].</li>
<li>The subroutine that serializes an XmlDoc and sends it as a
Web response is [[WebSend (XmlDoc subroutine)|WebSend]], described below.
<li>Additional serializing methods include:
<ul>
<li>[[Xml (XmlDoc function)|Xml]]
<li>[[Print (XmlDoc/XmlNode subroutine)|Print]]
<li>AddXml (HttpRequest class, described in the [[Janus Sockets]]R.)
</ul>
<li>To deserialize a string,
use [[LoadXml (XmlDoc/XmlNode function)|LoadXml]]
or [[WebReceive (XmlDoc function)|WebReceive]].
<li>For more information about using XPath expressions, see [[XPath]].
</ul>
</ul>
{{Template:XmlDoc/XmlNode:Serial footer}}

Latest revision as of 17:58, 18 February 2015

Serialize selected subtree as string (XmlDoc and XmlNode classes)

Serial converts an XmlDoc subtree to the UTF-8 or EBCDIC text string representation of the subtree. This process is called serialization, because the text representation of a document is called the serial form.

Syntax

%string = nr:Serial[( [xpath], [options], [AddTrailingDelimiter= boolean])] Throws XPathError

Syntax terms

%string A string variable for the string serialization of the subtree, encoded either in UTF-8 or, if the EBCDIC option (see below) is used, in EBCDIC.
nr An XmlDoc or XmlNode, used as the context node for the xpath expression. If an XmlDoc, the Root node is the context node.
xpath A Unicode string that is an XPath expression that results in a nodelist, the head of which is the top of the subtree to serialize. Any other nodes in the nodelist are ignored.

This is an optional argument; its default is a period (.), that is, the node referenced by the method object (nr).

Prior to Sirius Mods Version 7.6, this is an EBCDIC string.

options
A blank-delimited string that can contain one or more of the following options (with no option repeated).

Note: These options are described in greater detail in XmlDoc API serialization options.

  • CharacterEncodeAll
    If the EBCDIC option is specified, use character encoding in all contexts (that is, not only in Attribute or Element values) to display Unicode characters that do not translate to EBCDIC. This option is available starting with version 8.0 of the Sirius Mods.
  • EBCDIC
    Produces serialized output in EBCDIC text rather than the default encoding, UTF-8.
  • ExclCanonical
    Produces serialized output in exclusive XML canonical form, as defined in the W3C "Exclusive XML Canonicalization" specification (http://www.w3.org/tr/xml-exc-c14n).
  • Indent n
    Inserts space characters and line-ends into the serialized string such that if the string is broken at the line-ends and displayed as a tree, the display of each lower level in the subtree is indented n spaces from the previous level's starting point. You must also specify CR, LF, or CRLF (see below).
  • CR (carriage-return), LF (linefeed), or CRLF (carriage-return followed by a linefeed)
    Inserts one of these line-end characters to provide line breaks in the serialized output.
  • NoEmptyElt
    This deprecated option serializes all empty elements with a start tag followed by an end tag. The default is to serialize an empty element with an empty element tag (as in <middleName/>).

    NoEmptyElt is deprecated in order to deter users from using it to serialize HTML: The recommended approach for HTML is shown on the NoEmptyElement property page — some tags (<div>) require separate start and end tags, while other tags (<br>) do not allow separate start and end tags.

  • OmitNullElement
    An Element node that has no children and no Attributes will not be serialized, unless it is the top level Element in the subtree being serialized.
  • SortCanonical
    This deprecated option serializes namespace declarations and attributes in sorted order (from lowest to highest with Unicode code ordering). It is superseded by the ExclCanonical option.
  • UTF-8
    Produces serialized output in UTF-8. This is the default.
  • WithComments
    Includes in the serialized output all Comment nodes in the specified subtree.

    Note: Specifying WithComments without specifying ExclCanonical has no effect. Specifying ExclCanonical without specifying WithComments suppresses all Comment nodes from the result.

  • XmlDecl
    Ensures that the serialization will contain the "XML Declaration" (<?xml version=...?>), if the value of the Version property is a non-null string, if the XmlDoc is not empty, and if the top of the subtree being serialized is the Root node.
AddTrailingDelimiter This name-required parameter is a Boolean value that determines whether a final line-end character is added to the serialization when one of the Serial method line-end options (LF, CR, or CRLF) is specified.

The default value of AddTrailingDelimiter is True, so Serial specified with a line-end option adds a trailing line-end by default. If AddTrailingDelimiter is False, no final line-end character is added.

Specifying the AddTrailingDelimiter argument without also specifying one of the line-end options has no effect on the resulting serialization.

AddTrailingDelimiter was introduced as of Sirius Mods Version 7.0. It may be useful if a digital signature must be created which includes line-end characters between XML tags, but the XmlDoc does not contain those line-end Text nodes.
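
The effect of AddTrailingDelimiter can be sketched as follows. This is only an illustration, assuming an already populated XmlDoc in %doc; the variable names %withEnd and %noEnd are hypothetical and are not part of the method's documentation.

```soul
%withEnd is longstring
%noEnd is longstring

* Default (AddTrailingDelimiter=True): the LF option also
* appends a final linefeed to the serialization
%withEnd = %doc:serial(, 'ebcdic lf')

* Same options, but suppress the final linefeed
%noEnd = %doc:serial(, 'ebcdic lf', addTrailingDelimiter=false)
```

Under these assumptions, %withEnd should be one character longer than %noEnd, because the only difference between the two calls is the trailing linefeed.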

Usage notes

  • The options argument values may be specified in any case. For example, XmlDecl and xmldecl are interchangeable.
  • As of Sirius Mods Version 7.6, Attribute values are always serialized within double-quotation-mark (") delimiters, and a double-quotation mark character in an Attribute value is serialized as &quot;. Prior to Sirius Mods Version 7.6, this convention was not strictly observed.
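
A short sketch of the first note, assuming %doc is a loaded XmlDoc (the variable names %s1 and %s2 are illustrative only): the two calls below differ only in the case of the option names and should produce identical strings.

```soul
%s1 is longstring
%s2 is longstring

* Option names may be given in any case
%s1 = %doc:serial(, 'EBCDIC XmlDecl')
%s2 = %doc:serial(, 'ebcdic xmldecl')

if %s1 eq %s2 then
   print 'Same serialization'
end if
```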

Examples

  1. The following example shows the Serial method's EBCDIC formatting of a document. The EBCDIC option is used because a Print statement display of Serial's default UTF-8 output is a string that is not readily decipherable.

    begin
      %doc is object xmlDoc
      %sl is object stringlist
      %doc = new
      %sl = new
      text to %sl
        <top>
          <a>
            <b>05</b>
          </a>
          <c>
            <d att="val"/>
          </c>
        </top>
      end text
      Call %doc:loadXml(%sl)
      print 'Serial method output follows:'
      print %doc:serial('top', 'ebcdic')
    end

    The example results follow:

    Serial method output follows:
    <top><a><b>05</b></a><c><d att="val"/></c></top>

  2. In the following fragment, the Serial method EBCDIC formatting of a document with untranslatable Unicode is shown.

    %doc2:addElement('circumference', -
                     '2 * &#x3C0; * r':U)
    print %doc2:serial(, 'ebcdic')

    The result follows (the Unicode codepoint for the Greek letter π has the hexadecimal value 03C0):

    <circumference>2 * &#x03C0; * r</circumference>

Request-cancellation errors

This list is not exhaustive; other errors can also cancel the request.

  • xpath argument is invalid.
  • Result of xpath is empty.
  • An options setting is invalid.
  • Insufficient free space exists in CCATEMP.
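
The syntax shown earlier indicates that Serial can throw an XPathError exception. As a hedged sketch only (it assumes the standard SOUL Try/Catch block, a loaded XmlDoc in %doc, and a deliberately malformed, made-up XPath expression), an application might trap that exception instead of letting the request be cancelled:

```soul
%s is longstring

try
   * '//[' is an intentionally invalid XPath expression
   %s = %doc:serial('//[', 'ebcdic')
   print %s
catch xpathError
   print 'Serial failed: problem with the XPath argument'
end try
```

This page does not spell out which of the listed conditions raise XPathError and which cancel the request outright, so treat the Catch block as covering XPath-related failures only. An invalid options string or a CCATEMP shortage would not be expected to reach it.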

See also