XPath: Difference between revisions
(38 intermediate revisions by 2 users not shown) | |||
Line 2: | Line 2: | ||
This article has information to help you use | This article has information to help you use | ||
XPath arguments to various [[XmlDoc API]] methods. | XPath arguments to various [[XmlDoc API]] methods. | ||
Most of the information is taken from the [http://www.w3.org/TR/xpath XPath 1 standard], which is the | Most of the information is taken from the [http://www.w3.org/TR/xpath XPath 1 standard], which is the authoritative reference. | ||
authoritative reference. | |||
References to Version 2 of the XPath standard in this manual | References to Version 2 of the XPath standard in this manual | ||
are to the [http://www.w3.org/TR/xpath20 XPath 2 standard], which became a W3C Recommendation | are to the [http://www.w3.org/TR/xpath20 XPath 2 standard], which became a W3C Recommendation on January 23, 2007. | ||
on January 23, 2007. | |||
The five sections in this | The five sections in this topic explain, respectively, the following: | ||
<ol> | <ol> | ||
<li>How to understand the | <li>How to understand the components of an XPath expression, which is composed of '''steps'''. </li> | ||
components of an XPath expression, which is composed of '''steps'''. | <li>The syntax of XPath. </li> | ||
<li>The syntax of XPath. | <li>Some subtle aspects of XPath. </li> | ||
<li>Some subtle aspects of XPath. | <li>Specific XPath axis combinations to avoid. </li> | ||
<li>Specific XPath axis combinations to avoid. | <li>The subset of XPath supported by the current version of the XmlDoc API. </li> | ||
<li>The subset of XPath supported by the current version of the | |||
</ol> | </ol> | ||
==XPath operation== | |||
The purpose of XPath is to select a subset of nodes from a document. | The purpose of XPath is to select a subset of nodes from a document. | ||
This selection is done using an | This selection is done using an <b>expression</b>, as described by | ||
the <tt>PathExpr</tt> production (these syntax productions for XPath | the <tt>PathExpr</tt> production (these syntax productions for XPath | ||
are shown in [[#XPath syntax|XPath syntax]]). | are shown in [[#XPath syntax|XPath syntax]]). | ||
The simple form (that is, without parentheses) of a PathExpr expression | The simple form (that is, without parentheses) of a PathExpr expression | ||
is called a | is called a <b>Location Path</b> (the <tt>LocPath</tt> production | ||
(<tt>[1]</tt>) in the syntax). | (<tt>[1]</tt>) in the syntax). | ||
A Location Path consists of a series of | A Location Path consists of a series of <b>Steps</b> (<tt>Step</tt> ([4]) | ||
production). | production). | ||
Each Step operates by taking an input set of nodes from the preceding | Each Step operates by taking an input set of nodes from the preceding | ||
Line 34: | Line 32: | ||
An example XPath expression is: | An example XPath expression is: | ||
< | <p class="code">pitm[2]/partnum | ||
</p> | |||
</ | This expression contains two Steps (the slash symbol ( '''/''' ) is used to | ||
This expression contains two Steps (the slash symbol ( / ) is used to | |||
separate the Steps in a Location Path). | separate the Steps in a Location Path). | ||
Often a Step will start with an element name, which selects all the child | Often a Step will start with an element name, which selects all the child | ||
elements with that name. | elements with that name. | ||
In the above example, < | In the above example, <code>partnum</code> children of <code>pitm</code> | ||
elements are selected. | elements are selected. | ||
These | These <b>child</b> relationships are one kind of relationship between the | ||
input to a Step and the first part of the algorithm; | input to a Step and the first part of the algorithm; | ||
the kinds of relationships, or | the kinds of relationships, or <b>Axes</b>, are shown | ||
in the <tt>AxisName</tt> ([6]) production. | in the <tt>AxisName</tt> ([6]) production. | ||
The element names in the above example | The element names in the above example | ||
are a form of | are a form of <b>NodeTest</b>, described by the <tt>NodeTest</tt> | ||
([7]) production. | ([7]) production. | ||
The NodeTest is used to restrict the set of nodes. | The NodeTest is used to restrict the set of nodes. | ||
Square brackets ('''[''' | Square brackets ('''[''' ''']''') | ||
in a Step surround another form of restriction, which is called | in a Step surround another form of restriction, which is called | ||
a | a <b>Predicate</b>, given by the production ([8]) with the same name. | ||
A Predicate is a much more open-ended type of restriction, allowing various functions | A Predicate is a much more open-ended type of restriction, allowing various functions | ||
and operations, including Booleans. | and operations, including Booleans. | ||
Line 70: | Line 67: | ||
the following Predicate, if any. | the following Predicate, if any. | ||
<li>The final filtered sets are combined (using set union), and the result | <li>The final filtered sets are combined (using set union), and the result | ||
is the set of nodes | is the set of nodes that becomes input to the following Step. | ||
<li>The result of the final Step is the result of the Location Path. | <li>The result of the final Step is the result of the Location Path. | ||
</ol> | </ol> | ||
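The Step-by-Step evaluation described above can be sketched in Python. This is an illustration only, not a description of how the XmlDoc API is implemented: the helper names <code>child_axis</code> and <code>evaluate_step</code> are hypothetical, the sample <code>order</code> document is invented, and the tree model is Python's standard <code>xml.etree.ElementTree</code> rather than an XmlDoc.

```python
import xml.etree.ElementTree as ET

def child_axis(node):
    """Axis: generate the children of the context node, in document order."""
    return list(node)

def evaluate_step(input_nodes, axis, node_test, predicates):
    """Apply one Step to every node of the input set and union the results."""
    result = []
    for context in input_nodes:
        # The Axis generates candidates; the NodeTest restricts them.
        candidates = [n for n in axis(context) if node_test(n)]
        # Each Predicate filters the output of the one before it.
        for predicate in predicates:
            size = len(candidates)
            candidates = [n for pos, n in enumerate(candidates, start=1)
                          if predicate(n, pos, size)]
        # Set union: add each surviving node only once.
        for n in candidates:
            if not any(n is seen for seen in result):
                result.append(n)
    return result

# Invented sample document for the pitm[2]/partnum example
doc = ET.fromstring(
    "<order>"
    "<pitm><partnum>A1</partnum></pitm>"
    "<pitm><partnum>B2</partnum></pitm>"
    "<pitm><partnum>C3</partnum></pitm>"
    "</order>")

# pitm[2]/partnum, evaluated one Step at a time from the <order> element
second_pitm = evaluate_step([doc], child_axis,
                            lambda n: n.tag == "pitm",
                            [lambda n, pos, size: pos == 2])
partnums = evaluate_step(second_pitm, child_axis,
                         lambda n: n.tag == "partnum", [])
step_result = [n.text for n in partnums]

# The same path handed to ElementTree's own (limited) XPath support
library_result = [n.text for n in doc.findall("pitm[2]/partnum")]
```

Both <code>step_result</code> and <code>library_result</code> hold the part number of the second <code>pitm</code> element, which shows that the output of the first Step is the input of the second.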
===Axes=== | |||
The various forms of the <tt>AxisName</tt> ([6]) production generate | The various forms of the <tt>AxisName</tt> ([6]) production generate | ||
nodes based on a context node using the simple tree relationships | nodes based on a context node using the simple tree relationships | ||
described by the name. | described by the name. | ||
For example, < | For example, <code>attribute::</code> (abbreviated as <tt>@</tt>, the at sign) | ||
generates the set of all attributes of a context node. | generates the set of all attributes of a context node. | ||
Line 86: | Line 84: | ||
its parent, and so on, to and including the Root node of the | its parent, and so on, to and including the Root node of the | ||
XmlDoc. | XmlDoc. | ||
<dt>ancestor-or-self | <dt>ancestor-or-self | ||
<dd>The same contents as <tt>ancestor</tt>, except that it also | <dd>The same contents as <tt>ancestor</tt>, except that it also | ||
includes the context node. | includes the context node. | ||
<dt>attribute | <dt>attribute | ||
<dd>Contains the attributes of the context node, which must be an element. | <dd>Contains the attributes of the context node, which must be an element. | ||
<dt>child | <dt>child | ||
<dd>Contains the children of the context node. | <dd>Contains the children of the context node. | ||
<dt>descendant | <dt>descendant | ||
<dd>Contains the context node children, their | <dd>Contains the context node children, their | ||
Line 98: | Line 100: | ||
Since the children of a node do not include its attributes, | Since the children of a node do not include its attributes, | ||
this axis does not include any attributes (so, this is not equivalent to | this axis does not include any attributes (so, this is not equivalent to | ||
a | a "sub-tree"). | ||
<dt>descendant-or-self | <dt>descendant-or-self | ||
<dd>The same contents as <tt>descendant</tt>, except that it also | <dd>The same contents as <tt>descendant</tt>, except that it also | ||
includes the context node. | includes the context node. | ||
Specified by <tt>//</tt>, an | Specified by <tt>'''//'''</tt>, an | ||
abbreviation for the step consisting of <tt>descendant-or-self::node()</tt>, | abbreviation for the step consisting of <tt>descendant-or-self::node()</tt>, | ||
this axis can be used, for example, to locate an element by its name, | this axis can be used, for example, to locate an element by its name, | ||
if the | if the "path" to it is not known: | ||
< | <p class="code">//foo | ||
</p> | |||
</ | |||
<dt>following | <dt>following | ||
<dd>Contains all the nodes that are after the context node in document order, | <dd>Contains all the nodes that are after the context node in document order, | ||
excluding any descendants and excluding attribute nodes. | excluding any descendants and excluding attribute nodes. | ||
<dt>following-sibling | <dt>following-sibling | ||
<dd>Contains all the siblings of the context node that are | <dd>Contains all the siblings of the context node that are | ||
positioned after that node. | positioned after that node. | ||
Contains no siblings if the context node is an attribute. | Contains no siblings if the context node is an attribute. | ||
<dt>parent | <dt>parent | ||
<dd>Contains the parent of the context node. | <dd>Contains the parent of the context node. | ||
Each node has only one parent, except the root node, which has no parent. | Each node has only one parent, except the root node, which has no parent. | ||
<dt>preceding-sibling | <dt>preceding-sibling | ||
<dd>Contains all the siblings of the context node that are | <dd>Contains all the siblings of the context node that are | ||
positioned before that node. | positioned before that node. | ||
Contains no siblings if the context node is an attribute. | Contains no siblings if the context node is an attribute. | ||
<dt>self | <dt>self | ||
<dd>Contains the context node itself. | <dd>Contains the context node itself. | ||
</dl> | </dl> | ||
The following XPath axes are '''not''' | The following XPath axes are '''not''' supported in the current version: | ||
supported in the current version: | |||
<dl> | <dl> | ||
<dt>namespace | <dt>namespace | ||
Line 136: | Line 142: | ||
declarations by using certain XmlDoc API methods, for example, the [[URI (XmlNode function)|URI]] | declarations by using certain XmlDoc API methods, for example, the [[URI (XmlNode function)|URI]] | ||
function on an XmlNode. | function on an XmlNode. | ||
<dt>preceding | <dt>preceding | ||
<dd>Support for this axis may be added in a later version. | <dd>Support for this axis may be added in a later version. | ||
</dl> | </dl> | ||
===NodeTests=== | |||
The various forms of the <tt>NodeTest</tt> ([7]) production filter | The various forms of the <tt>NodeTest</tt> ([7]) production filter | ||
nodes as follows: | nodes as follows: | ||
<dl> | <dl> | ||
<dt>NodeType '(' ')' | <dt>NodeType '('<tt> </tt>')' | ||
<dd>This selects any node that has the respective node type | <dd>This selects any node that has the respective node type; for example, | ||
< | <code>comment()</code> selects all Comment nodes in a node set. | ||
<dt>'processing-instruction' '(' Lit ')' | |||
<dd>This selects any Processing Instruction node if the target name is | <dt>'processing-instruction'<tt> </tt>'(' Lit ')' | ||
<dd>This selects any Processing Instruction node, if the target name is | |||
equal to the value of <tt>Lit</tt>. | equal to the value of <tt>Lit</tt>. | ||
<dt>'*' | NCName ':' '*' | QName | |||
<dt>'<tt>*</tt>' | NCName '<tt>:</tt>' '<tt>*</tt>' | QName | |||
<dd>These forms test the '''name''' of a node, after restricting the type of node | <dd>These forms test the '''name''' of a node, after restricting the type of node | ||
to the | to the "principal node type" of the Axis, as follows: | ||
<ul> | <ul> | ||
<li>Name tests in the <tt>attribute::</tt> Axis restrict to Attribute nodes. | <li>Name tests in the <tt>attribute::</tt> Axis restrict to Attribute nodes. | ||
Line 159: | Line 169: | ||
The name tests then filter the resulting nodes as follows: | The name tests then filter the resulting nodes as follows: | ||
<dl> | <dl> | ||
<dt>'*' | <dt>'<tt>*</tt>' | ||
<dd>This selects a node of the selected type regardless of the node's name. | <dd>This selects a node of the selected type regardless of the node's name. | ||
The | The "selected type" is the principal node | ||
type of the node subset selected by the preceding Axis. | type of the node subset selected by the preceding Axis. | ||
The default Axis type is <tt>child</tt>, | The default Axis type is <tt>child</tt>, so the default node type is <tt>Element</tt>. | ||
so the default node type is <tt>Element</tt>. | |||
<dt>NCName ':' '*' | <dt>NCName '<tt>:</tt>' '<tt>*</tt>' | ||
<dd>This selects a node of the selected type if the node has an associated | <dd>This selects a node of the selected type if the node has an associated | ||
namespace equal to the URI associated with <tt>NCName</tt>. | namespace equal to the URI associated with <tt>NCName</tt>. | ||
<dt>QName | <dt>QName | ||
<dd>This selects a node of the selected type if the node has the same name | <dd>This selects a node of the selected type if the node has the same name | ||
Line 176: | Line 187: | ||
discussed in | discussed in | ||
[[XML processing in Janus SOAP#Names and namespaces|Names and namespaces]]. | [[XML processing in Janus SOAP#Names and namespaces|Names and namespaces]]. | ||
===Predicates=== | |||
Each nodeSet that is the result of NodeTest filtering is input to the | Each nodeSet that is the result of NodeTest filtering is input to the | ||
series of Predicates in the Step. | series of Predicates in the Step. | ||
Line 186: | Line 198: | ||
There are a variety of Predicates, and except for a numeric Predicate, | There are a variety of Predicates, and except for a numeric Predicate, | ||
a Predicate selects a node if the value of the Predicate, converted to | a Predicate selects a node if the value of the Predicate, converted to | ||
a Boolean, is < | a Boolean, is <code>true</code>. | ||
Within a single step: | Within a single step: | ||
<ul> | <ul> | ||
<li>Multiple Predicates are allowed | <li>Multiple Predicates are allowed. | ||
<li>Location paths within Predicates can themselves contain Predicates | <li>Location paths within Predicates can themselves contain Predicates. | ||
</ul> | </ul> | ||
Common forms of predicates use Location Path expressions, | Common forms of predicates use Location Path expressions, | ||
XPath [[#Functions|functions]], | XPath [[#Functions|functions]], or a combination of these. | ||
or a combination of these. | For example, the following path selects all <code>contact</code> | ||
For example, the following path selects all < | children of the second <code>cust</code> element that have | ||
children of the second < | a <code>fax</code> child element: | ||
a < | <p class="code">/active/cust[position()=2]/contact[fax] | ||
< | </p> | ||
</ | |||
Location Paths (described in [[#XPath operation|XPath operation]]) | Location Paths (described in [[#XPath operation|XPath operation]]) | ||
Line 210: | Line 219: | ||
including multi-step and absolute expressions. | including multi-step and absolute expressions. | ||
You can also use a location path or the number( | You can also use a location path or the number( ) function, | ||
followed by a comparison | followed by a comparison | ||
operator and literal, as a predicate of an XPath expression, | operator and literal, as a predicate of an XPath expression, | ||
Line 217: | Line 226: | ||
For a description of the XPath predicates currently supported in the XmlDoc API, | For a description of the XPath predicates currently supported in the XmlDoc API, | ||
see [[#Predicates supported in the current version|Predicates supported in the current version]]. | see [[#Predicates supported in the current version|Predicates supported in the current version]]. | ||
===Functions=== | |||
There are many functions defined for XPath; this section merely gives | There are many functions defined for XPath; this section merely gives | ||
a sample of some of them. | a sample of some of them. | ||
Furthermore, as of the current version, many of these are | Furthermore, as of the current version, many of these are not supported. | ||
not supported. | See [[#XPath functions supported in the current version |XPath functions supported in the current version ]] for a list of the XPath functions currently supported. In the following list, the name of each supported function links to a section that may contain additional notes on using the function in the XmlDoc API. | ||
See [[#XPath functions supported in the current version |XPath functions supported in the current version ]] for a list of the XPath functions currently supported. | |||
In the XmlDoc API, XPath functions are used only in predicates. | In the XmlDoc API, XPath functions are used only in predicates. | ||
Line 228: | Line 237: | ||
Here are some XPath functions that return a '''numeric result''': | Here are some XPath functions that return a '''numeric result''': | ||
<dl> | <dl> | ||
<dt>last( | <dt>last( ) | ||
<dd>Returns the size (number of nodes) of the set that the predicate | <dd>Returns the size (number of nodes) of the set that the predicate | ||
is filtering | is filtering. | ||
<dt>position( | |||
Not supported in the XmlDoc API. | |||
<dt>[[#pos|position]]( ) | |||
<dd>Returns the position of the node in the set that the predicate | <dd>Returns the position of the node in the set that the predicate | ||
is filtering | is filtering | ||
Line 247: | Line 259: | ||
All other axes use document order to arrange the nodes in a node set. | All other axes use document order to arrange the nodes in a node set. | ||
'''Note:''' | <p class="note">'''Note:''' | ||
If a PathExpr is parenthesized and followed by Predicates, and if | If a PathExpr is parenthesized and followed by Predicates, and if | ||
position( | position( ) is used in those predicates, the axis in effect is the | ||
<tt>child</tt> axis. | <tt>child</tt> axis. </p> | ||
<dt>count(nodeSet) | |||
<dt>count(<i>nodeSet</i>) | |||
<dd>Returns the number of nodes in the argument nodeSet. | <dd>Returns the number of nodes in the argument nodeSet. | ||
Notice that this function uses a <tt>nodeSet</tt> type argument. | Notice that this function uses a <tt>nodeSet</tt> type argument. | ||
As in this example, a <tt>LocPath</tt> can be passed: | As in this example, a <tt>LocPath</tt> can be passed: | ||
< | <p class="code">/book/chapter[count(section) >= 3] | ||
</p> | |||
</ | |||
This expression will select all chapters that have three or more sections. | This expression will select all chapters that have three or more sections. | ||
<dt>number(object?) | |||
Not supported in the XmlDoc API. | |||
<dt>[[#num|number]](<i>object</i>?) | |||
<dd>Returns the numeric value of the argument (after stripping leading | <dd>Returns the numeric value of the argument (after stripping leading | ||
and trailing blanks if it is a string object). | and trailing blanks if it is a string object). | ||
If the argument cannot be converted to a number, it returns the | If the argument cannot be converted to a number, it returns the | ||
special value | special value "NaN" ("Not A Number"), which is not equal | ||
to any other object (including another expression whose result is NaN). | to any other object (including another expression whose result is NaN). | ||
The default argument is the context node (of the | The default argument is the context node (of the "containing" expression, | ||
which, for our purposes, is the context node of the predicate containing the | which, for our purposes, is the context node of the predicate containing the | ||
number( | number( ) function). | ||
If the argument is a nodeSet, the value of the | If the argument is a nodeSet, the value of the | ||
first node (in document order) of the argument nodeSet is converted to | first node (in document order) of the argument nodeSet is converted to | ||
Line 277: | Line 292: | ||
Here are some XPath functions that return a '''string result''': | Here are some XPath functions that return a '''string result''': | ||
<dl> | <dl> | ||
<dt>string(object?) | <dt>string(<i>object</i>?) | ||
<dd> | <dd> | ||
This function, like several other XPath functions, allows different | This function, like several other XPath functions, allows different | ||
kinds of arguments: for example, it can be used to convert a number | kinds of arguments: for example, it can be used to convert a number | ||
to a string. | to a string. | ||
The string( | The string( ) function is implicitly used when a comparison is made between | ||
a node and a string value, for example: | a node and a string value, for example: | ||
< | <p class="code">/book/chapter[@title = 'Introduction'] | ||
</p> | |||
</ | |||
In this case, each node in the | In this case, each node in the | ||
node set that is the result of the < | node set that is the result of the <code>@title</code> | ||
PathExpr is converted using string( | PathExpr is converted using string( ), then compared to the string | ||
literal < | literal <code>Introduction</code>. | ||
The default argument | The default argument | ||
of the string( | of the string( ) function is the context node of the expression. | ||
The string( | The string( ) function, when given a node set argument, uses the string value | ||
of the first node, in document order, of that node set. | of the first node, in document order, of that node set. | ||
<dt>substring(string, number, number?) | |||
Not supported in the XmlDoc API. | |||
<dt>substring(<i>string</i>, <i>number</i>, <i>number</i>?) | |||
<dd>Returns the substring of the first argument, starting at the position | <dd>Returns the substring of the first argument, starting at the position | ||
specified by the second argument, and for the number of | specified by the second argument, and for the number of | ||
Line 304: | Line 321: | ||
As with most XPath expressions, conversions are freely done, so if the | As with most XPath expressions, conversions are freely done, so if the | ||
first argument is not a manifest string type, it is converted to one | first argument is not a manifest string type, it is converted to one | ||
using the string( | using the string( ) function. | ||
Not supported in the XmlDoc API. | |||
</dl> | </dl> | ||
This XPath function returns a '''boolean result''': | This XPath function returns a '''boolean result''': | ||
<dl> | <dl> | ||
<dt>not(boolean) | <dt>[[#not|not]](<i>boolean</i>) | ||
<dd>Returns < | <dd>Returns <code>true</code> if its argument is false, and <code>false</code> | ||
otherwise. | otherwise. | ||
For example, < | For example, <code>not(paragraph[3])</code> selects a node if it has no | ||
third < | third <code>paragraph</code> child. | ||
</dl> | </dl> | ||
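The effect of the sample functions above can be imitated in Python. This is only a sketch under stated assumptions: it uses the standard <code>xml.etree.ElementTree</code> module, which supports a small XPath subset and has no <code>count( )</code>, <code>not( )</code>, or <code>number( )</code>, so those are emulated with ordinary Python. The sample <code>book</code> document is invented, and nothing here implies that the XmlDoc API supports these functions.

```python
import xml.etree.ElementTree as ET

# Invented sample document
book = ET.fromstring(
    "<book>"
    "<chapter title='Introduction'><section/><section/></chapter>"
    "<chapter title='Axes'><section/><section/><section/></chapter>"
    "</book>")

# /book/chapter[count(section) >= 3]
# ElementTree has no count(), so count the section children directly.
three_or_more = [c.get("title") for c in book.findall("chapter")
                 if len(c.findall("section")) >= 3]

# /book/chapter[@title = 'Introduction']
# The attribute's string value is compared to the literal.
intro = [c.get("title")
         for c in book.findall("chapter[@title='Introduction']")]

# chapter[not(section[3])]: not() is true when the node set section[3] is empty
no_third_section = [c.get("title") for c in book.findall("chapter")
                    if c.find("section[3]") is None]

# number() of a non-numeric string is NaN, which equals nothing, itself included
nan = float("nan")
nan_equals_itself = (nan == nan)
```

Only the <code>Axes</code> chapter has three sections, so it is the only chapter in <code>three_or_more</code> and the only one missing from <code>no_third_section</code>.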
==XPath syntax== | |||
This section contains a version of the XPath syntax. | This section contains a version of the XPath syntax. | ||
See | See | ||
Line 337: | Line 358: | ||
</ul> | </ul> | ||
For a | For a "cross-reference" to the productions as contained in the XPath | ||
Recommendation, see [[#XPath syntax cross-reference|XPath syntax cross-reference]]. | Recommendation, see [[#XPath syntax cross-reference|XPath syntax cross-reference]]. | ||
See also [[#XPath supported in the current version|XPath supported in the current version]]. | See also [[#XPath supported in the current version|XPath supported in the current version]]. | ||
< | <p class="code">[A19] PathExpr ::= LocPath | ||
[A19] PathExpr ::= LocPath | |||
| PrimaryExpr Predicate+ | | PrimaryExpr Predicate+ | ||
| PrimaryExpr Predicate* '/' RelativeLocPath | | PrimaryExpr Predicate* '/' RelativeLocPath | ||
Line 357: | Line 377: | ||
[1] LocPath ::= RelativeLocPath | [1] LocPath ::= RelativeLocPath | ||
| AbsoluteLocPath | | AbsoluteLocPath | ||
[2] AbsoluteLocPath ::= '/' RelativeLocPath? | [2] AbsoluteLocPath ::= '/' RelativeLocPath? | ||
| '//' RelativeLocPath | | '//' RelativeLocPath | ||
[3] RelativeLocPath ::= Step ('/' Step)* | [3] RelativeLocPath ::= Step ('/' Step)* | ||
| RelativeLocPath '//' Step | | RelativeLocPath '//' Step | ||
Line 384: | Line 406: | ||
[21] Expr ::= EqExpr ( ('and' | 'or') EqExpr )* | [21] Expr ::= EqExpr ( ('and' | 'or') EqExpr )* | ||
[23] EqExpr ::= RelExpr ( ('=' | '!=') RelExpr )* | [23] EqExpr ::= RelExpr ( ('=' | '!=') RelExpr )* | ||
[24] RelExpr ::= NumExpr | [24] RelExpr ::= NumExpr | ||
( ('<' | '>' | '<=' | '>=') NumExpr )* | ( ('<' | '>' | '<=' | '>=') NumExpr )* | ||
Line 391: | Line 415: | ||
( ('+' | '-' | '*' | 'div' | 'mod') UnaryExpr )* | ( ('+' | '-' | '*' | 'div' | 'mod') UnaryExpr )* | ||
[29] Lit ::= '"' [^"]* '"' | [29] Lit ::= '"' [^"]* '"' | "'" [^']* "'" | ||
[30] Number ::= [0-9]+ ('.' [0-9]*)? | [30] Number ::= [0-9]+ ('.' [0-9]*)? | ||
[35] FunctionName ::= QName - NodeType | [35] FunctionName ::= QName - NodeType | ||
[36] Variable ::= '$' QName | [36] Variable ::= '$' QName | ||
[38] NodeType ::= 'comment' | 'text' | 'processing-instruction' | 'node' | |||
</p> | |||
===XPath syntax notes=== | |||
For information about XPath functions, see [[#Functions|Functions]]. | For information about XPath functions, see [[#Functions|Functions]]. | ||
<ul> | <ul> | ||
<li>In [A19], [2], and [3], the double slash (<tt>//</tt>) is an | <li>In [A19], [2], and [3], the double slash (<tt>//</tt>) is an | ||
abbreviation for | abbreviation for | ||
< | <p class="code">/descendant-or-self::node()/ | ||
</p> | |||
</ | |||
<li>When an at sign (<tt>@</tt>) is used in a Step ([4]), it is an | <li>When an at sign (<tt>@</tt>) is used in a Step ([4]), it is an | ||
abbreviation for | abbreviation for | ||
< | <p class="code">attribute:: | ||
</p> | |||
</ | |||
<li>The syntax for Step ([4]) notes that it may begin directly with a | <li>The syntax for Step ([4]) notes that it may begin directly with a | ||
NodeTest. | NodeTest. | ||
In that case, <tt>child::</tt> is implied before the NodeTest. | In that case, <tt>child::</tt> is implied before the NodeTest. | ||
<li>The syntax for QName and NCName is given in [[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]]. | <li>The syntax for QName and NCName is given in [[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]]. | ||
<li>The precedence of expressions from | <li>The precedence of expressions from | ||
<tt>Expr</tt> ([21]), <tt>EqExpr</tt> ([23]), | <tt>Expr</tt> ([21]), <tt>EqExpr</tt> ([23]), | ||
Line 434: | Line 462: | ||
The operators are all left-associative. | The operators are all left-associative. | ||
For example, | For example, | ||
< | <code>3 > 2 > 1</code> is the same as | ||
< | <code>(3 > 2) > 1</code>, which evaluates to false. | ||
<li>The only forms of PrimaryExpr ([B15]) that can create node sets, | <li>The only forms of PrimaryExpr ([B15]) that can create node sets, | ||
and so be used in a PathExpr ([A19]), are the id(< | and so be used in a PathExpr ([A19]), are the <tt>id( )</tt> function and | ||
parenthesized LocPath ([1]) (or parenthesized unions of them, with | parenthesized LocPath ([1]) (or parenthesized unions of them, with <code>|</code> ([C27])). | ||
</ul> | </ul> | ||
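The left-associative reading of <code>3 > 2 > 1</code> can be checked with a short Python sketch. The helper <code>xpath_greater_than</code> is hypothetical and models only the XPath 1 rule that relational operators convert both operands to numbers, with the Boolean <code>true</code> becoming 1 and <code>false</code> becoming 0.

```python
def xpath_greater_than(left, right):
    """XPath '>' converts both operands to numbers; booleans become 1 or 0."""
    return float(left) > float(right)

# XPath reads 3 > 2 > 1 as (3 > 2) > 1
inner = xpath_greater_than(3, 2)              # the Boolean true
xpath_result = xpath_greater_than(inner, 1)   # true becomes 1, and 1 > 1 fails

# Python chains the same text as (3 > 2) and (2 > 1), so its answer differs
python_result = 3 > 2 > 1
```

The contrast with Python's chained comparison is a reminder that an expression copied from a host language into an XPath predicate may not keep its meaning.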
See also the notes in [[#XPath supported in the current version|XPath supported in the current version]]. | See also the notes in [[#XPath supported in the current version|XPath supported in the current version]]. | ||
===XPath syntax cross-reference=== | |||
Here is a listing of all first productions contained in various | Here is a listing of all first productions contained in various | ||
numbered sections and unnumbered subsections from the | numbered sections and unnumbered subsections from the | ||
Line 477: | Line 507: | ||
[39] ExprWhitespace (last production) | [39] ExprWhitespace (last production) | ||
</dl> | </dl> | ||
==Some notes on XPath usage== | |||
The following subsections describe some subtle issues in XPath, | The following subsections describe some subtle issues in XPath, | ||
which the XmlDoc API implements exactly as specified in the recommendations. | which the XmlDoc API implements exactly as specified in the recommendations. | ||
== | |||
<blockquote class="note" id="entities"> | |||
<p><b>Note:</b> Predicates in XPath expressions, which are widely used, are enclosed within square bracket characters (<tt>[</tt> <tt>]</tt>). However, simply typing square bracket characters in your XPath expressions risks [[Unicode#Consistent XPath predicate errors .E2.80.94 wrong codepage?|invalid character errors]] arising from a mismatch between a 3270-keyboard codepage and the codepage set for the Online with the <var>[[UNICODE command|UNICODE]]</var> command. </p> | |||
<p> | |||
Model 204 7.6 maintenance added [[XML processing in Janus SOAP#Entity references|XHTML entities]] for left and right square brackets in response to this problem. These entities are used in some of the [[#Predicates supported in the current version|predicate examples]] shown below. </p> | ||
</blockquote> | |||
==="//" and "."=== | |||
As mentioned in [[#Performance considerations: Document order, certain axes|Performance considerations: Document order, certain axes]], the <tt>descendant-or-self</tt> | As mentioned in [[#Performance considerations: Document order, certain axes|Performance considerations: Document order, certain axes]], the <tt>descendant-or-self</tt> | ||
axis (commonly appearing in an XPath expression with the <tt>//</tt> | axis (commonly appearing in an XPath expression with the <tt>//</tt> | ||
Line 494: | Line 532: | ||
<li>An XPath expression that begins with <tt>//</tt> is an absolute | <li>An XPath expression that begins with <tt>//</tt> is an absolute | ||
expression; if you want to use it at the start of a relative XPath | expression; if you want to use it at the start of a relative XPath | ||
expression, make use of the < | expression, make use of the <code>.</code> step: | ||
< | <p class="code">%chapter = %doc:SelectSingleNode('/book/chapter[title="Aerobics"]') | ||
%sections = %chapter:SelectNodes('.//section')
</p> | |||
</ | |||
</ul> | </ul> | ||
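The difference between a search that starts at a selected node and one that covers the whole document can be seen in a small Python sketch. It assumes the standard <code>xml.etree.ElementTree</code> module instead of the XmlDoc API, and the sample <code>book</code> document with its <code>id</code> attributes is invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample document
book = ET.fromstring(
    "<book>"
    "<chapter><title>Aerobics</title>"
    "<section id='a1'/><part><section id='a2'/></part></chapter>"
    "<chapter><title>Yoga</title><section id='y1'/></chapter>"
    "</book>")

# Counterpart of /book/chapter[title="Aerobics"], relative to the <book> root
chapter = book.find("chapter[title='Aerobics']")

# .//section searches only beneath the selected chapter, at any depth
in_chapter = [s.get("id") for s in chapter.findall(".//section")]

# The same search from the top of the document sees every section
in_document = [s.get("id") for s in book.findall(".//section")]
```

The section nested inside <code>part</code> is found because <code>//</code> descends to any depth, while the section in the other chapter is excluded because the leading <code>.</code> anchors the search at the selected chapter.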
===Attributes: not children, and excluded from most axes=== | |||
One subtle point to observe is that attributes are not children of | One subtle point to observe is that attributes are not children of | ||
their parents! | their parents! | ||
As stated in the XPath Recommendation: | As stated in the XPath Recommendation: | ||
<ul> | <ul class="nobul"> | ||
<li>2.2 Axes | <li>2.2 Axes | ||
<li>. . . | <li>. . . | ||
<ul> | <ul> | ||
<li>the descendant axis contains the descendants of the context node; a descendant | <li style="list-style-type:square;">the descendant axis contains the descendants of the context node; a descendant | ||
is a child or a child of a child and so on; thus the descendant axis never | is a child or a child of a child and so on; thus the descendant axis never | ||
contains attribute or namespace nodes | contains attribute or namespace nodes | ||
Line 529: | Line 567: | ||
the default axis, an Attribute node will never be the result of a step | the default axis, an Attribute node will never be the result of a step | ||
which does not explicitly include one of the above (abbreviated or unabbreviated) axes. | which does not explicitly include one of the above (abbreviated or unabbreviated) axes. | ||
===Order of nodes: node sets versus nodelists=== | |||
The order of nodes in an XML document is the order in which the | The order of nodes in an XML document is the order in which the | ||
node (or its start-tag, in the case of an Element node) first occurs in | node (or its start-tag, in the case of an Element node) first occurs in | ||
Line 542: | Line 581: | ||
In XPath, however, document order is not always used. | In XPath, however, document order is not always used. | ||
In an XPath expression step, the axis implies an order. | In an XPath expression step, the axis implies an order. | ||
This order is important for the <tt>position()</tt> and | This order is important for the <tt>position( )</tt> and | ||
<tt>last()</tt> predicate functions, which | <tt>last( )</tt> predicate functions, which | ||
filter a node based on its order in a node set. | filter a node based on its order in a node set. | ||
This order is the same as document order, except for | This order is the same as document order, except for | ||
Line 555: | Line 594: | ||
For example, consider the following document: | For example, consider the following document: | ||
< | <p class="code"><top> | ||
<a/> | |||
<b/> | |||
<c/> | |||
<d/> | |||
<e/> | |||
</top> | |||
</p> | |||
</ | |||
Using XPath to select the first of the | Using XPath to select the first of the "following" siblings | ||
(< | (<code>/*/c/following-sibling::*[1]</code>) | ||
yields the equivalent of the first element in document order: < | yields the equivalent of the first element in document order: <code>d</code>. | ||
However, selecting the first of the preceding siblings | However, selecting the first of the preceding siblings | ||
(< | (<code>/*/c/preceding-sibling::*[1]</code>) | ||
yields the first element in ''reverse'' document order: < | yields the first element in ''reverse'' document order: <code>b</code>. | ||
This reverse ordering | This reverse ordering | ||
is apparent in some contexts but not in others. | is apparent in some contexts but not in others. | ||
For example, the XPath expression in the following statement is used | For example, the XPath expression in the following statement is used | ||
against the document above (call it < | against the document above (call it <code>%doc</code>) to select | ||
nodes into an XmlNodelist: | nodes into an XmlNodelist: | ||
< | <p class="code">%nodelist = %doc:SelectNodes('/*/c/preceding-sibling::*') | ||
</p> | |||
</ | |||
The XmlDoc API (re)arranges the two found nodes in < | The XmlDoc API (re)arranges the two found nodes in <code>%nodelist</code> into document | ||
order: < | order: <code>a</code> and <code>b</code>, in that order, and | ||
the following statement selects < | the following statement selects <code>b</code>, the second of the nodes: | ||
< | <p class="code">Print %nodelist:Item(2):LocalName | ||
</p> | |||
</ | |||
Yet, if you use the following statement (instead of the previous two) | Yet, if you use the following statement (instead of the previous two) | ||
in an attempt to directly select | in an attempt to directly select | ||
the | the "second" preceding sibling, the result is node <code>a</code>: | ||
< | <p class="code">Print %doc:LocalName('/*/c/preceding-sibling::*[2]') | ||
</p> | |||
</ | |||
The Item method, above, selects the second node in the set of | The Item method, above, selects the second node in the set of | ||
document-ordered nodes in the XmlNodelist. | document-ordered nodes in the XmlNodelist. | ||
The position( | The position( ) function above (the <code>2</code> within the brackets) | ||
selects the second node in the | selects the second node in the | ||
set of reverse-document-ordered nodes passed from the preceding-sibling axis | set of reverse-document-ordered nodes passed from the preceding-sibling axis | ||
after filtering by the <code>*</code> NodeTest. | after filtering by the <code>*</code> NodeTest. | ||
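The difference between axis order and nodelist order can be modeled in a few lines of Python. This is only an illustrative sketch using the standard <code>xml.etree.ElementTree</code> module, not the XmlDoc API; the helper names (<code>following_sibling</code>, <code>preceding_sibling</code>, <code>position_predicate</code>, <code>as_nodelist</code>) are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# The sample document from the text above.
doc = ET.fromstring("<top><a/><b/><c/><d/><e/></top>")
siblings = list(doc)                      # children of <top>, in document order
c_index = [el.tag for el in siblings].index("c")

def following_sibling(index):
    """Forward axis: siblings after the context node, in document order."""
    return siblings[index + 1:]

def preceding_sibling(index):
    """Reverse axis: siblings before the context node, nearest first."""
    return siblings[:index][::-1]

def position_predicate(axis_nodes, n):
    """Model of the predicate [n]: the n-th node (1-based) in axis order."""
    return axis_nodes[n - 1]

def as_nodelist(axis_nodes):
    """Model of a nodelist: the same nodes, rearranged into document order."""
    return sorted(axis_nodes, key=siblings.index)

first_following = position_predicate(following_sibling(c_index), 1).tag
first_preceding = position_predicate(preceding_sibling(c_index), 1).tag
second_preceding = position_predicate(preceding_sibling(c_index), 2).tag
nodelist_item_2 = as_nodelist(preceding_sibling(c_index))[1].tag

print(first_following, first_preceding, second_preceding, nodelist_item_2)
```

In this model the position predicate counts along the axis (so the second preceding sibling of <code>c</code> is <code>a</code>), while the nodelist counts in document order (so its second item is <code>b</code>), which matches the behavior described above.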
==Performance considerations: Document order, certain axes== | |||
This section discusses the performance implications of evaluating certain | This section discusses the performance implications of evaluating certain | ||
XPath expressions. | XPath expressions. | ||
Line 626: | Line 662: | ||
of methods like SelectNodes and UnionSelected in the XmlNodelist class | of methods like SelectNodes and UnionSelected in the XmlNodelist class | ||
designate a set of nodes. | designate a set of nodes. | ||
In addition to these | In addition to these "set-valued" methods, XPath expressions can be | ||
used in many XmlDoc API methods | used in many XmlDoc API methods | ||
(SelectSingleNode, Value, DeleteSubtree, QName, and more) | (<var>SelectSingleNode</var>, <var>Value</var>, <var>DeleteSubtree</var>, <var>QName</var>, and more) to operate on a single node that satisfies the expression. | ||
to operate on a single node that satisfies the expression. | For a simple XPath expression (described above), the "single node" methods | ||
For a simple XPath expression (described above), the | |||
may be able to determine the desired node by scanning | may be able to determine the desired node by scanning | ||
fewer nodes, giving better performance than the set-valued methods for the | fewer nodes, giving better performance than the set-valued methods for the | ||
Line 643: | Line 678: | ||
In other words, given an XPath expression <i>expr</i> that uses any | In other words, given an XPath expression <i>expr</i> that uses any | ||
of the axis cases described below in | of the axis cases described below in [[#The extra-processing expressions|The extra-processing expressions]] and given any | ||
and given any | |||
single-node selection method <i>XMeth</i>, this expression: | single-node selection method <i>XMeth</i>, this expression: | ||
<!--?? %obj:<i><b>XMeth</b></i>(<i><b>expr</b></i>) --> | <!--?? %obj:<i><b>XMeth</b></i>(<i><b>expr</b></i>) --> | ||
Line 659: | Line 693: | ||
For example, consider the following document: | For example, consider the following document: | ||
< | <p class="code"><top> | ||
<a> | |||
<b x="1"/> | |||
</a> | |||
<b x="2"/> | |||
</top> | |||
</p> | |||
</ | |||
When, say, < | When, say, <code>Value('/*/*/b/@x')</code> is evaluated, the document search | ||
ends when the first match is found (and the Value method returns < | ends when the first match is found (and the Value method returns <code>1</code>). | ||
But when < | But when <code>Value('//b/@x')</code> is evaluated, the document search first | ||
finds the match < | finds the match <code>x=2</code>, then it continues searching the entire | ||
document for all matches, to ensure that the match which is lowest in | document for all matches, to ensure that the match which is lowest in | ||
document order (< | document order (<code>x=1</code>) is the result. | ||
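The following Python sketch illustrates why the second form costs more. It is a conceptual model only: the traversal used for <code>//b/@x</code> (examining each element's children before descending) is an assumption chosen so that <code>x=2</code> is found first, as in the description above, and is not a statement about the actual XmlDoc API algorithm. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import deque

doc = ET.fromstring('<top><a><b x="1"/></a><b x="2"/></top>')

# Document-order index of every element (ElementTree iterates in document order).
doc_order = {el: i for i, el in enumerate(doc.iter())}

def fixed_depth_value(root):
    """Model of Value('/*/*/b/@x'): stop at the first match."""
    examined = 0
    for child in root:                # second step: any child of <top>
        for grandchild in child:      # third step: <b> children
            examined += 1
            if grandchild.tag == "b" and "x" in grandchild.attrib:
                return grandchild.get("x"), examined
    return None, examined

def descendant_value(root):
    """Model of Value('//b/@x'): collect every match, then take the
    one that is lowest in document order."""
    matches, examined = [], 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in node:
            examined += 1
            if child.tag == "b" and "x" in child.attrib:
                matches.append(child)
            queue.append(child)
    first = min(matches, key=doc_order.get)
    return first.get("x"), [m.get("x") for m in matches], examined

fixed_value, fixed_examined = fixed_depth_value(doc)
desc_value, discovery_order, desc_examined = descendant_value(doc)
print(fixed_value, fixed_examined)                   # early exit
print(desc_value, discovery_order, desc_examined)    # full scan
```

Both searches return the same value, but the modeled descendant search cannot stop at its first hit, because that hit is not guaranteed to be the first match in document order.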
The performance implications of the expressions that involve extra processing | The performance implications of the expressions that involve extra processing | ||
Line 682: | Line 715: | ||
may be selected in an order (due to the selection algorithm) that differs | may be selected in an order (due to the selection algorithm) that differs | ||
from document order. | from document order. | ||
===The extra-processing expressions=== | |||
Extra processing can occur in the following cases: | Extra processing can occur in the following cases: | ||
<ol> | <ol> | ||
Line 717: | Line 750: | ||
</ul> | </ul> | ||
</ol> | </ol> | ||
<div id="ordScan"></div> | <div id="ordScan"></div> | ||
In addition to the cost of the actual XPath search performed with the above | In addition to the cost of the actual XPath search performed with the above | ||
expressions, | expressions, they can incur an additional "one-time" cost for XPath evaluation. | ||
they can incur an additional "one-time" cost for XPath evaluation. | |||
If the document has been modified in such a way that the internal order of | If the document has been modified in such a way that the internal order of | ||
the nodes cannot be guaranteed to be the same as document order (this will | the nodes cannot be guaranteed to be the same as document order (this will | ||
always happen with any of the XML < | always happen with any of the XML <var>Insert..Before</var> methods, and | ||
usually will happen | usually will happen | ||
with any of the < | with any of the <var>Add..</var> methods), then the entire document | ||
(not only the subtree | (not only the subtree | ||
being searched) must be scanned so that the order is adjusted. | being searched) must be scanned so that the order is adjusted. | ||
Line 732: | Line 764: | ||
a full scan. This adjustment "fixes up" the <var>XmlDoc</var> and it will remain "fixed up" for subsequent XPath searches, until such time as the <var>XmlDoc</var> is subsequently updated in such a way that the internal order does not guarantee document order. | a full scan. This adjustment "fixes up" the <var>XmlDoc</var> and it will remain "fixed up" for subsequent XPath searches, until such time as the <var>XmlDoc</var> is subsequently updated in such a way that the internal order does not guarantee document order. | ||
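The "one-time" nature of this cost can be pictured as a validity flag on the document. The Python class below is a toy model under that assumption; the class, its attributes, and its methods are hypothetical and do not correspond to any XmlDoc API internals.

```python
class OrderedDoc:
    """Toy model: nodes are stored in arrival order, and a flag records
    whether that internal order still equals document order."""

    def __init__(self, names):
        self.nodes = list(names)        # internal (storage) order
        self.doc_order = list(names)    # true document order
        self.order_guaranteed = True
        self.full_scans = 0

    def insert_before(self, new, existing):
        # Stored at the end internally, but earlier in document order.
        self.nodes.append(new)
        self.doc_order.insert(self.doc_order.index(existing), new)
        self.order_guaranteed = False

    def extra_processing_search(self, wanted):
        if not self.order_guaranteed:
            # One-time cost: scan the whole document to adjust the order.
            self.nodes = list(self.doc_order)
            self.full_scans += 1
            self.order_guaranteed = True
        return [n for n in self.nodes if n in wanted]

d = OrderedDoc(["a", "c"])
d.insert_before("b", "c")
first = d.extra_processing_search({"b", "c"})
second = d.extra_processing_search({"b", "c"})
scans_after_two_searches = d.full_scans
d.insert_before("x", "a")
third = d.extra_processing_search({"x", "a"})
print(first, second, scans_after_two_searches, third, d.full_scans)
```

In the model, two searches after one update trigger only one full scan; a later update invalidates the order again and the next such search pays the cost once more.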
'''Note:''' | <blockquote class="note"> | ||
<p>'''Note:''' </p> | |||
<ol> | <ol> | ||
<li>One important exception to the above rules is | <li>One important exception to the above rules is | ||
Line 739: | Line 772: | ||
by the <tt>child</tt> axis without any predicate. | by the <tt>child</tt> axis without any predicate. | ||
An example of the usual way to specify this is: | An example of the usual way to specify this is: | ||
< | <p class="code">//chapter | ||
</p> | |||
</ | |||
In this case, the internal node selection algorithm operates in document | In this case, the internal node selection algorithm operates in document | ||
order, and no extra processing is incurred. | order, and no extra processing is incurred. | ||
<p> | |||
Even with this special case, it is better to avoid the <tt>descendant-or-self</tt> | Even with this special case, it is better to avoid the <tt>descendant-or-self</tt> | ||
step (specified explicitly or by using < | step (specified explicitly or by using <code>//</code>) if your | ||
document structure lends itself to explicitly specifying the | document structure lends itself to explicitly specifying the | ||
"intermediate" elements with <code>*</code> (or even better, with their names) | |||
(or even better, with their names) | that should be matched. </p> | ||
that should be matched. | |||
<li>The considerations described in this section only apply to the | <li>The considerations described in this section only apply to the "outer" XPath | ||
expression; they do not apply to any expression within a predicate. | expression; they do not apply to any expression within a predicate. | ||
Although it is still better, for the sake of efficiency, to prune the search | Although it is still better, for the sake of efficiency, to prune the search | ||
by explicitly | by explicitly | ||
specifying | specifying "intermediate" elements rather than using <code>//</code>, | ||
there is no efficiency concern due to the internal order of node selection | there is no efficiency concern due to the internal order of node selection | ||
with an XPath predicate such as the following: | with an XPath predicate such as the following: | ||
< | <p class="code">Print %d:Value('/book/chapter' With '[.//credit/details/@auth="Dave"]') | ||
</p> | |||
</ | |||
<li>'''In conclusion''', except when you '''must''' use | <li>'''In conclusion''', except when you '''must''' use | ||
the | the "<tt>//chapter</tt>" exception discussed in Note 1, above, | ||
''avoid these extra-processing axes and axis combinations'' | ''avoid these extra-processing axes and axis combinations'' | ||
(especially in outer XPath expressions) if your documents | (especially in outer XPath expressions) if your documents | ||
are relatively large and performance is a consideration. | are relatively large and performance is a consideration. | ||
</ol> | </ol> | ||
</blockquote> | |||
==XPath supported in the current version== | |||
This section contains a condensed excerpt of the XPath syntax, showing | This section contains a condensed excerpt of the XPath syntax, showing | ||
only those parts of XPath used in the current version. | only those parts of XPath used in the current version. | ||
Line 841: | Line 873: | ||
</pre> | </pre> | ||
===Explanatory notes=== | |||
The following notes are intended only to explain the above syntax; they | The following notes are intended only to explain the above syntax; they | ||
do not present any limitations on XPath support in the XmlDoc API: | do not present any limitations on XPath support in the XmlDoc API: | ||
Line 848: | Line 879: | ||
<li>When the at sign (<tt>@</tt>) is used in a Step ([4]), it is an | <li>When the at sign (<tt>@</tt>) is used in a Step ([4]), it is an | ||
abbreviation for | abbreviation for | ||
< | <p class="code">attribute:: | ||
</p> | |||
</ | |||
<li>The syntax for Step ([4]) notes that it may begin directly with a | <li>The syntax for Step ([4]) notes that it may begin directly with a | ||
NodeTest. | NodeTest. | ||
In that case, <tt>child::</tt> is implied before the NodeTest. | In that case, <tt>child::</tt> is implied before the NodeTest. | ||
<li>The syntax for QName is given in | <li>The syntax for QName is given in | ||
[[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]]. | [[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]]. | ||
<li>A node is selected by a PositiveInteger predicate if | <li>A node is selected by a PositiveInteger predicate if | ||
the position of the node, in the set which the predicate | the position of the node, in the set which the predicate | ||
is filtering, is equal to that PositiveInteger. | is filtering, is equal to that PositiveInteger. | ||
<li>A node is selected by an | <li>A node is selected by an | ||
Existence test if the result of the <tt>PathExpr</tt>, using that node as | Existence test if the result of the <tt>PathExpr</tt>, using that node as | ||
the context node, is non-empty. | the context node, is non-empty. | ||
<li>A node is selected by a | <li>A node is selected by a | ||
Comparison if any node in the result of the <tt>PathExpr</tt>, using | Comparison if any node in the result of the <tt>PathExpr</tt>, using | ||
Line 867: | Line 902: | ||
holds the specified relationship to the <tt>Lit</tt>. | holds the specified relationship to the <tt>Lit</tt>. | ||
</ul> | </ul> | ||
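The two abbreviations described in these notes (<code>@</code> for <code>attribute::</code>, and an implied <code>child::</code> before a bare NodeTest) can be sketched as a small Python helper. The function <code>expand_step</code> is hypothetical and handles only a single step; it does not parse whole paths or other abbreviations.

```python
def expand_step(step):
    """Rewrite one abbreviated XPath step in unabbreviated form."""
    if step.startswith("@"):
        # The at sign abbreviates the attribute axis.
        return "attribute::" + step[1:]
    # Look for an explicit axis only before any predicate.
    if "::" in step.split("[", 1)[0]:
        return step
    # A step that begins directly with a NodeTest implies the child axis.
    return "child::" + step

print(expand_step("@x"))
print(expand_step("chapter[2]"))
print(expand_step("descendant::b"))
```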
The following notes | ===<b id="xpathRestr"></b>Restrictions and limitations=== | ||
concern limitations on XPath support in the XmlDoc API: | The following notes concern limitations on XPath support in the XmlDoc API: | ||
<ul> | <ul> | ||
<li>One way to summarize the XPath productions that are ''not'' supported | <li>One way to summarize the XPath productions that are ''not'' supported | ||
Line 876: | Line 910: | ||
is to list the XPath operators that | is to list the XPath operators that | ||
are ''not'' supported, as shown in the following table: | are ''not'' supported, as shown in the following table: | ||
< | |||
Unsupported operators | <table class="thJustBold"> | ||
+ | <tr class="head"><th>Unsupported operators</th><th>Meaning</th><th>Comments</th></tr> | ||
| | <tr><th> + - * div mod</th><td>arithmetic</td><td></td></tr> | ||
</ | <tr><th> | </th><td> union </td><td>The <var>UnionSelected</var> method can be used to form the [[UnionSelected (XmlNodelist function)#exmp1|union of two nodesets]].</td></tr> | ||
'''Note:''' | </table> | ||
<p class="note">'''Note:''' | |||
Parentheses are allowed for grouping | |||
within Boolean expressions within predicates, but this is the only place | within Boolean expressions within predicates, but this is the only place | ||
they are supported. | they are supported. </p> | ||
<li>As of the current version, XPath function | <li>As of the current version, XPath function | ||
support is limited enough that it can be shown in the syntax above. | support is limited enough that it can be shown in the syntax above. | ||
Line 890: | Line 927: | ||
supported functions and for any differences between the | supported functions and for any differences between the | ||
XmlDoc API implementation and the XPath 1 definition of a function. | XmlDoc API implementation and the XPath 1 definition of a function. | ||
<li>The Boolean operators (<tt>and</tt>, <tt>or</tt>) and | <li>The Boolean operators (<tt>and</tt>, <tt>or</tt>) and | ||
relational operators (<tt>= != < <= > >=</tt>) are supported | relational operators (<tt>= != < <= > >=</tt>) are supported | ||
only in (some) predicates (see [[#Comparison tests in predicates|Comparison tests in predicates]]). | only in (some) predicates (see [[#Comparison tests in predicates|Comparison tests in predicates]]). | ||
<li>XPath variables ($<var_name>) are not supported. | |||
<li>The numeric | <li>XPath variables ($<i>var_name</i>) are not supported. | ||
<li>The numeric constants <tt>+/- infinity</tt> are ''not'' supported. | |||
<li>The size of an XPath expression is limited to approximately | <li>The size of an XPath expression is limited to approximately | ||
26 steps, if each has an NCName NodeTest. | 26 steps, if each has an NCName NodeTest. | ||
<li>In the XmlDoc API, a numeric value (either a literal or a node value) | <li>In the XmlDoc API, a numeric value (either a literal or a node value) | ||
may be of any form available in | may be of any form available in SOUL. | ||
In particular, | In particular, "E-format" literals, such as <code>1.003E-5</code> | ||
(even though they are not very common in XML documents) may be specified. | (even though they are not very common in XML documents) may be specified. | ||
The same form of numbers is available in XPath 2. | The same form of numbers is available in XPath 2. | ||
XPath 1 only allows decimal numbers; it does not allow E-format literals nor | XPath 1 only allows decimal numbers; it does not allow E-format literals nor node values. | ||
node values. | |||
<li>The precision used in the XmlDoc API XPath support is that provided by | <li>The precision used in the XmlDoc API XPath support is that provided by SOUL — namely, 15 decimal digits. | ||
<li>If the XPath support in the XmlDoc API attempts to convert a long string | <li>If the XPath support in the XmlDoc API attempts to convert a long string | ||
(that is, longer than 255 bytes) or a number whose absolute value is | (that is, longer than 255 bytes) or a number whose absolute value is | ||
beyond the capabilities of <var class="product"> | beyond the capabilities of <var class="product">SOUL</var> (maximum absolute value approximately 7.237E75), the request is cancelled. | ||
7.237E75), the request is cancelled. | |||
</ul> | </ul> | ||
===Predicates supported in the current version=== | |||
The XmlDoc API supports these predicates: | The XmlDoc API supports these predicates: | ||
<ul> | <ul> | ||
<li>Two types of | <li>Two types of <b>Location-Path-expression predicates</b>: | ||
<ul> | <ul> | ||
<li>A Location Path (that is, production [1], LocPath in [[#XPath supported in the current version|XPath supported in the current version]]) | <li>A Location Path (that is, production [1], LocPath in [[#XPath supported in the current version|XPath supported in the current version]]) used as an existence test | ||
used as an existence test | <p> | ||
< | |||
If the nodeSet that results from the Location Path is non-empty, | If the nodeSet that results from the Location Path is non-empty, | ||
the predicate evaluates as true. | the predicate evaluates as true. | ||
The '''usual''' (but not only) purpose of this predicate is to select | The '''usual''' (but not only) purpose of this predicate is to select | ||
a node if it has at least one attribute or element with a given name. | a node if it has at least one attribute or element with a given name. </p> | ||
<p> | |||
For example, the following expression selects (as the XPath argument to the <var>[[SelectNodes (XmlDoc/XmlNode function)|SelectNodes]]</var> method, for example) all <code>contact</code> | |||
children of <code>cust</code> elements, if the | |||
<code>cust</code> element has an <code>invoice</code> child element and | |||
<code>contact</code> has a <code>fax</code> child element: | |||
</p> | |||
<p class="code" id="brackXmp">'/active/cust&lsqb;invoice&rsqb;/contact&lsqb;fax&rsqb;'<b>:u</b> </p> | |||
<blockquote class="note"> | |||
| | <p><b>Note:</b> This example substitutes the <code>&lsqb;</code> and <code>&rsqb;</code> entities for left and right square-bracket characters from the keyboard, for the reason explained at the [[#entities|beginning of this section]]. Notice also the accompanying use of the <var>[[U (String function)|U]]</var> method. The value of the expression is: </p> | ||
<p class="code">/active/cust[invoice]/contact[fax] | |||
</p> | |||
<p> | |||
Most of the remaining predicate examples below omit the entities in order to more clearly display the XPath grammar. </p> | |||
</blockquote> | |||
</li> | |||
<li>A Location Path expression with a comparison operator and literal | <li>A Location Path expression with a comparison operator and literal | ||
< | <p> | ||
For example, < | For example, <code>@price > 200</code> | ||
selects a node if the numeric value of the node's < | selects a node if the numeric value of the node's <code>price</code> | ||
Attribute is greater than 200. | Attribute is greater than 200. </p> | ||
<p> | |||
See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion. | See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion. </p></li> | ||
</ul> | </ul> | ||
<li>These types of | <li>These types of <b>function predicates</b>: | ||
<ul> | <ul> | ||
<li>A | <li>A "simple" position test using a numeric literal <i>n</i> | ||
< | <p> | ||
This test is | This test is equivalent to the implicit use of the position( ) function in the | ||
equivalent to the implicit use of the position( | predicate term <code>position()=<i>n</i></code>. </p> | ||
predicate term < | <p> | ||
For example: </p> | |||
For example: | <p class="code">/book/chapter[2]/section[9]/paragraph[3] | ||
< | </p></li> | ||
<li>The number( ) function with a location path argument, followed by a comparison to a numeric literal | |||
<p> | |||
For example, <code>number(@size) > 30</code> | |||
selects a node if the numeric value of the node's <code>size</code> | |||
Attribute is greater than 30. </p> | |||
<p> | |||
This predicate differs from the similar Location Path example above | This predicate differs from the similar Location Path example above | ||
(< | (<code>@price > 200</code>) primarily in that it allows the Attribute value to be non-numeric. | ||
primarily in that it allows the Attribute value to be non-numeric. | The previous Location Path example cancels the request if a numeric comparison is performed | ||
The previous Location Path | with a <code>price</code> Attribute whose value is non-numeric. </p> | ||
example cancels the request if a numeric comparison is performed | <p> | ||
with a < | See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion. </p></li> | ||
<li>The position( ) function, followed by a comparison operator, | |||
followed by an integer literal, which may be negative or zero. | |||
<p> | |||
For example: </p> | |||
<p class="code">/book/chapter[position()>1]/section[2] | |||
</p></li> | |||
<li>The not( ) function, which returns the opposite boolean value | |||
<li>The not( | |||
of its boolean argument. | of its boolean argument. | ||
<p></p> | <p> | ||
For example: </p> | |||
< | <p class="code">/book/chapter[2]/section[9][not(paragraph[3])] | ||
</p></li> | |||
</ | |||
</ul> | </ul> | ||
<li | |||
< | <li><b>Nested predicates</b>. | ||
For example, this statement | <p> | ||
selects | For example, this statement selects each <code>Chapter</code> whose first <code>Section</code> has a <code>Racy</code> attribute: </p> | ||
< | <p class="code">%lis = %bk:SelectNodes('Chapter&lsqb;Section&lsqb;1 and @Racy&rsqb;&rsqb;'<b>:u</b>) | ||
| | </p> | ||
</ | <blockquote> | ||
<li>< | <p><b>Note:</b> As in an [[#brackXmp|earlier example]], the <code>&lsqb;</code> and <code>&rsqb;</code> entities in this example require Model 204 7.6 or higher. The value of the XPath argument is: </p> | ||
< | <p class="code">Chapter[Section[1 and @Racy]] </p> | ||
</blockquote> | |||
</li> | |||
<li><b>Multiple predicates in a single step</b>. | |||
<p> | |||
For example, using the <tt>position()</tt> function to filter based | For example, using the <tt>position()</tt> function to filter based | ||
on the position of nodes from the preceding predicate, rather than | on the position of nodes from the preceding predicate, rather than | ||
from the step's NodeTest: | from the step's NodeTest: </p> | ||
< | <p class="code">/book/chapter[author="Alex"] [2] | ||
</p> | |||
</ | <p> | ||
The preceding two-predicate step | The preceding two-predicate step selects the second <code>chapter</code> child that is authored by <code>Alex</code>, while the following expression | ||
selects the second < | selects the second <code>chapter</code> child of the <code>book</code>, | ||
while the following expression | if its author is <code>Alex</code>: </p> | ||
selects the second < | <p class="code">/book/chapter[author="Alex" and 2] | ||
if its author is Alex: | </p> | ||
< | <p> | ||
Parentheses for grouping in Boolean expressions are supported. | |||
</ | For example: </p> | ||
<p class="code">chapter[@type="methods" and | |||
(@class="Stringlist" or @class="Daemon")] | |||
</p></li> | |||
<li><b>Combination predicates</b>. | |||
<p> | |||
<li | |||
< | |||
Predicates may combine any of these supported | Predicates may combine any of these supported | ||
functions and supported Location Path expressions | functions and supported Location Path expressions | ||
using the <tt>and</tt> and <tt>or</tt> Boolean operators. | using the <tt>and</tt> and <tt>or</tt> Boolean operators. </p> | ||
<p> | |||
For example: | For example: </p> | ||
< | <p class="code">/active/cust[invoice and position()>1] | ||
</p></li> | |||
</ | |||
</ul> | </ul> | ||
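The difference between two successive predicates and a single predicate that combines a comparison with a position, as described in the list above, can be modeled with plain Python lists. This sketch follows the behavior stated in this section for the XmlDoc API; the sample data and function names are invented for illustration.

```python
# Hypothetical <chapter> children of <book>, in document order.
chapters = [
    {"title": "One", "author": "Kim"},
    {"title": "Two", "author": "Alex"},
    {"title": "Three", "author": "Alex"},
]

def successive_predicates(nodes):
    """Model of /book/chapter[author="Alex"] [2]:
    the position is counted among the nodes kept by the first predicate."""
    by_alex = [c for c in nodes if c["author"] == "Alex"]
    return by_alex[1:2]

def combined_predicate(nodes):
    """Model of /book/chapter[author="Alex" and 2]:
    the position is counted among all chapter children."""
    return [c for pos, c in enumerate(nodes, start=1)
            if c["author"] == "Alex" and pos == 2]

print([c["title"] for c in successive_predicates(chapters)])
print([c["title"] for c in combined_predicate(chapters)])
```

With this data, the two-predicate form selects the second chapter written by Alex, while the combined form selects the second chapter only because its author happens to be Alex.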
===XPath functions supported in the current version=== | |||
The following XPath functions are supported in the current version, | The following XPath functions are supported in the current version, | ||
and | and [[#Functions|Functions]] gives their XPath 1 definitions. | ||
Any differences between the XPath 1 definition and the XmlDoc API implementation | Any differences between the XPath 1 definition and the XmlDoc API implementation | ||
are shown below. | are shown below. | ||
'''Note:''' | <p class="note">'''Note:''' | ||
In discussing XPath functions, the name of the function followed | In discussing XPath functions, the name of the function followed | ||
by an empty pair of parentheses (for example, < | by an empty pair of parentheses (for example, <code>number()</code>) | ||
is sometimes used to name the function, whether or not | is sometimes used to name the function, whether or not | ||
the particular function being discussed takes arguments. | the particular function being discussed takes arguments. </p> | ||
<ul> | <ul> | ||
<li>position( | <li id="pos">position( ) | ||
< | <p>Performs as specified in the XPath standard.</p> | ||
<li id="not">not(<i>bool</i>) | |||
<p> | |||
The function argument is a Boolean expression, and the function result | The function argument is a Boolean expression, and the function result | ||
is < | is <code>true</code> if the value of the argument is <code>false</code>, | ||
and it is < | and it is <code>false</code> otherwise. </p> | ||
<p> | |||
Notes: | Notes: </p> | ||
<ul> | <ul> | ||
| | <li><p>Performs as specified in the XPath standard.</p> | ||
<li>The result of the <tt>not()</tt> function applied to a comparison | <li>The result of the <tt>not()</tt> function applied to a comparison | ||
expression is different than the result of the same expression with | expression is different than the result of the same expression with | ||
Line 1,063: | Line 1,099: | ||
For example, | For example, | ||
this statement selects children that have the value of the status | this statement selects children that have the value of the status | ||
attribute equal to | attribute equal to "pending": | ||
< | <p class="code">%lis = %nod:SelectNodes('*[@status="pending"]') | ||
</p> | |||
</ | |||
This statement selects children that have the value of the status | This statement selects children that have the value of the status | ||
attribute equal to something other than “pending”: | attribute equal to something other than “pending”: | ||
< | <p class="code">%lis = %nod:SelectNodes('*[@status!="pending"]') | ||
</p> | |||
</ | |||
This statement selects children that have the value of the status | This statement selects children that have the value of the status | ||
Line 1,078: | Line 1,112: | ||
or that have no status attribute: | or that have no status attribute: | ||
< | <p class="code">%lis = %nod:SelectNodes('*[not(@status="pending")]') | ||
</p> | |||
</ | |||
</ul> | </ul> | ||
The XmlDoc API number( | <li id="num">number(<i>nodeset?</i>) | ||
<p> | |||
The XmlDoc API number( ) function differs from the XPath 1 definition as follows:</p> | |||
<ol> | <ol> | ||
<li>XPath 1 allows a variety of argument types (for example, a string literal); | <li>XPath 1 allows a variety of argument types (for example, a string literal); | ||
the XmlDoc API allows only a nodeSet argument. | the XmlDoc API allows only a nodeSet argument. | ||
<li>In XPath 1, if a nodeSet argument to number( | |||
<li>In XPath 1, if a nodeSet argument to number( ) | |||
contains more than one node, the first node | contains more than one node, the first node | ||
(in document order) is converted to a number and returned. | (in document order) is converted to a number and returned. | ||
< | <p> | ||
In the XmlDoc API, if the argument result contains more than one node, the | In the XmlDoc API, if the argument result contains more than one node, the | ||
request is cancelled, which is consistent with the XPath 2 standard. | request is cancelled, which is consistent with the XPath 2 standard. </p> | ||
<li>The definition of a numeric value for number( | |||
<li>The definition of a numeric value for number( ) (after | |||
stripping leading and trailing whitespace) is the same | stripping leading and trailing whitespace) is the same | ||
as the NumericLit production ([30]) in [[#XPath supported in the current version|XPath supported in the current version]]. | as the NumericLit production ([30]) in [[#XPath supported in the current version|XPath supported in the current version]]. | ||
This is consistent with the XPath 2 standard, and is | This is consistent with the XPath 2 standard, and is | ||
an extension of the XPath 1 definition of number( | an extension of the XPath 1 definition of number( ), which only accepts | ||
numbers of the form DecimalNumber ([30d]). | numbers of the form DecimalNumber ([30d]). | ||
</ol> | </ol> | ||
</ul> | </ul> | ||
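The distinction drawn in the <tt>not()</tt> notes above, between <code>@status!="pending"</code> and <code>not(@status="pending")</code>, comes down to how a missing attribute is treated. The Python sketch below models child elements as dictionaries of attributes; the data and variable names are hypothetical.

```python
# Three hypothetical child elements: two with a status attribute, one without.
children = [
    {"name": "first", "status": "pending"},
    {"name": "second", "status": "done"},
    {"name": "third"},                       # no status attribute
]

# *[@status="pending"]: a status attribute exists and equals "pending".
equal = [c["name"] for c in children
         if "status" in c and c["status"] == "pending"]

# *[@status!="pending"]: a status attribute exists and differs from "pending".
not_equal = [c["name"] for c in children
             if "status" in c and c["status"] != "pending"]

# *[not(@status="pending")]: the equality test fails, which also covers
# children that have no status attribute at all.
negated = [c["name"] for c in children
           if not ("status" in c and c["status"] == "pending")]

print(equal, not_equal, negated)
```

Only the negated form selects the child without a <code>status</code> attribute, because a comparison against an empty node set is never true.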
===Comparison tests in predicates=== | |||
In the XPath standard (XPath 1 and XPath 2), either operand | In the XPath standard (XPath 1 and XPath 2), either operand | ||
in a comparison test in a predicate may be any form of XPath expression. | in a comparison test in a predicate may be any form of XPath expression. | ||
The predicate evaluates as true if the comparison is true of at least one | The predicate evaluates as true if the comparison is true of at least one | ||
node in the resulting nodeSet, | node in the resulting nodeSet, | ||
and typically the | and typically the purpose of the predicate is to select the nodes | ||
purpose of the predicate is to select the nodes | |||
for which the comparison is true. | for which the comparison is true. | ||
For example, the | For example, the following expression selects all <code>item</code> children | ||
following expression selects all < | of <code>order</code> elements that have a <code>price</code> attribute | ||
of < | |||
whose value is greater than 9.99: | whose value is greater than 9.99: | ||
< | <p class="code">order/item[@price > 9.99] | ||
</p> | |||
</ | |||
XmlDoc API predicate comparisons differ from XPath 1 comparisons: | XmlDoc API predicate comparisons differ from XPath 1 comparisons: | ||
<ul> | <ul> | ||
Line 1,123: | Line 1,157: | ||
in the XPath standard. | in the XPath standard. | ||
As is explained in the syntax discussion below, you can use only a Location | As is explained in the syntax discussion below, you can use only a Location | ||
Path or the number( | Path or the number( ) function, | ||
followed by a comparison operator, followed by a literal. | followed by a comparison operator, followed by a literal. | ||
<li>The XmlDoc API uses XPath 2 comparisons. | <li>The XmlDoc API uses XPath 2 comparisons. | ||
The XPath 1 standard does not provide for exception conditions | The XPath 1 standard does not provide for exception conditions | ||
and it does not provide for ordered string comparisons. | and it does not provide for ordered string comparisons. | ||
This is also true for Microsoft .Net, which follows the XPath 1 standard. | This is also true for Microsoft .Net, which follows the XPath 1 standard. | ||
<p> | |||
The XmlDoc API follows the XPath 2 standard by providing for exceptions | The XmlDoc API follows the XPath 2 standard by providing for exceptions | ||
(implemented as request cancellation conditions) and providing | (implemented as request cancellation conditions) and providing | ||
for ordered string comparisons. | for ordered string comparisons. </p> | ||
</ul> | </ul> | ||
As of the current version, the only forms of comparisons are these: | As of the current version, the only forms of comparisons are these four: | ||
<p class="syntax">position() <span class="term">relOp integer</span> | |||
<span class="term">LocPath relOp "stringLiteral"</span> | |||
<span class="term">LocPath relOp numericLiteral</span> | |||
number(<span class="term">[LocPath]</span>) <span class="term">relOp numericLiteral</span> | |||
</p> | |||
Where: | Where: | ||
<dl> | <dl> | ||
<dt>position( | <dt>position( ) | ||
<dd>This function is discussed in [[#Predicates supported in the current version|Predicates supported in the current version]]. | <dd>This function is discussed in [[#Predicates supported in the current version|Predicates supported in the current version]]. | ||
<dt><i | |||
<dt><i>relOp</i> | |||
<dd>One of the comparisons: <tt>=</tt>, <tt>!=</tt>, <tt><</tt>, | <dd>One of the comparisons: <tt>=</tt>, <tt>!=</tt>, <tt><</tt>, | ||
<tt><=</tt>, <tt>></tt>, <tt>>=</tt> | <tt><=</tt>, <tt>></tt>, <tt>>=</tt> | ||
<dt><i | |||
<dd>An integer, whose precision is limited to 15 decimal digits (as in | <dt><i>integer</i> | ||
<dt><i | <dd>An integer, whose precision is limited to 15 decimal digits (as in SOUL). | ||
<dt><i>LocPath</i> | |||
<dd>An XPath location expression | <dd>An XPath location expression | ||
(that is, production [1], LocPath in [[#XPath supported in the current version|XPath supported in the current version]]). | (that is, production [1], LocPath, in [[#XPath supported in the current version|XPath supported in the current version]]). | ||
As of the current version, such an expression still has the limitation that it | As of the current version, such an expression still has the limitation that it | ||
may not contain a predicate. | may not contain a predicate. | ||
<dt><i | |||
<dt><i>"stringLiteral"</i> | |||
<dd>A quoted string literal value, which must not exceed 255 bytes. | <dd>A quoted string literal value, which must not exceed 255 bytes. | ||
<p> | |||
The XmlDoc API has always supported ordered string comparisons, | The XmlDoc API has always supported ordered string comparisons, | ||
but the XPath 1 standard does not. | but the XPath 1 standard does not. | ||
For more information about these comparisons, | For more information about these comparisons, | ||
see [[#Ordered string comparisons|Ordered string comparisons]]. | see [[#Ordered string comparisons|Ordered string comparisons]]. </p> | ||
<dt><i | |||
<dt><i>numericLiteral</i> | |||
<dd>A numeric literal value, | <dd>A numeric literal value, | ||
whose precision is limited to 15 decimal digits (as in | whose precision is limited to 15 decimal digits (as in SOUL). | ||
For additional format and size limitations, see | For additional format and size limitations, see [[#Restrictions and limitations|Restrictions and limitations]]. | ||
<p> | |||
For more information about support for comparisons of a Location Path | For more information about support for comparisons of a Location Path | ||
to a numeric literal, see [[#Direct numeric comparison|Direct numeric comparison]]. | to a numeric literal, see [[#Direct numeric comparison|Direct numeric comparison]].</p> | ||
<p> | |||
Numeric literals in predicate comparisons are supported in the XmlDoc API | Numeric literals in predicate comparisons are supported in the XmlDoc API.</p> | ||
A comparison using the number( | <dt>number([<i>LocPath</i>]) | ||
<dd>The number( ) function with an optional Location Path argument. | |||
<p> | |||
A comparison using the number( ) function is very similar to | |||
comparison of a Location Path to a numeric literal ([[#Direct numeric comparison|Direct numeric comparison]]). | comparison of a Location Path to a numeric literal ([[#Direct numeric comparison|Direct numeric comparison]]). | ||
Comparing the result of number( | Comparing the result of number( ) to a literal gives a result | ||
according to their relative values and to the comparison operator. | according to their relative values and to the comparison operator. | ||
For example, < | For example, <code>shirt[number(@size) > 30]</code> | ||
selects nodes that have a size greater than 30. | selects nodes that have a size greater than 30. </p> | ||
<p> | |||
The significant difference between using a Location Path and using a | The significant difference between using a Location Path and using a | ||
number( | number( ) function in a numeric comparison is that the request is | ||
cancelled in the former case if a node in the comparison is non-numeric. | cancelled in the former case if a node in the comparison is non-numeric. | ||
This difference is discussed briefly below and in greater detail | This difference is discussed briefly below and in greater detail | ||
in | in [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]. </p> | ||
<p> | |||
These are the effects of the function's ''LocPath'' argument | These are the effects of the function's ''LocPath'' argument | ||
(they are consistent with the XPath 2 standard): | (they are consistent with the XPath 2 standard): </p> | ||
<ul> | <ul> | ||
<li>If the result of the ''LocPath'' argument is a single node, | <li>If the result of the ''LocPath'' argument is a single node, number( ) | ||
number( | |||
converts the value of the node, after stripping leading and trailing whitespace, | converts the value of the node, after stripping leading and trailing whitespace, | ||
to a number, or to the special value <tt>NaN</tt> | to a number, or to the special value <tt>NaN</tt> | ||
( | ("Not a Number") if the stripped value of the node is not numeric. | ||
<li>If the result of the argument has more than one node, the request | <li>If the result of the argument has more than one node, the request | ||
is cancelled. | is cancelled. | ||
For further details, see [[#number( | For further details, see [[#number( ) comparisons that cause request cancellation|number( ) comparisons that cause request cancellation]]. | ||
<li>If the ''LocPath'' argument is the empty nodeSet, the result of the | <li>If the ''LocPath'' argument is the empty nodeSet, the result of the | ||
number( | number( ) function is <tt>NaN</tt>. | ||
<p> | |||
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]] for examples. | See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]] for examples. </p> | ||
'''Note:''' | <p class="note">'''Note:''' | ||
In XPath 1 (which has no exception conditions), | In XPath 1 (which has no exception conditions), | ||
if there is more than one node in the nodeSet argument, | if there is more than one node in the nodeSet argument, | ||
the value of the first node (in document order) is used. | the value of the first node (in document order) is used. </p> | ||
<li>If you omit the ''LocPath'' argument, | <li>If you omit the ''LocPath'' argument, | ||
the default argument | the default argument "." (the context node) is used; that is, | ||
the node that is being filtered by the predicate gets | the node that is being filtered by the predicate gets | ||
converted to a numeric value. | converted to a numeric value. | ||
<p> | |||
For example: in the following XPath expression, the number( | For example: in the following XPath expression, the number( ) | ||
function converts the value of the < | function converts the value of the <code>size</code> Attribute | ||
to a number: | to a number:</p> | ||
< | <p class="code">/*/shirt/@size[number() > 10] | ||
</p> | |||
</ | |||
</ul> | </ul> | ||
If the number( | If the number( ) result that is | ||
compared to a literal is <tt>NaN</tt>, the comparison is always false (or, | compared to a literal is <tt>NaN</tt>, the comparison is always false (or, | ||
in the case of the <tt>!=</tt> operator, is always true). | in the case of the <tt>!=</tt> operator, is always true). | ||
This is important to note, because it means number( | This is important to note, because it means number( ) can be used to | ||
avoid the request cancellation to which numeric comparisons are subject | avoid the request cancellation to which numeric comparisons are subject | ||
if the nodes evaluated by a predicate may be non-numeric. | if the nodes evaluated by a predicate may be non-numeric. | ||
For further discussion, see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]. | For further discussion, see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]. | ||
If you are using number( | If you are using number( ) to avoid request cancellation for a numeric | ||
comparison because the nodes evaluated by a predicate may be non-numeric, | comparison because the nodes evaluated by a predicate may be non-numeric, | ||
and you are using the | and you are using the "not equals" comparison (<tt>!=</tt>), | ||
remember that an empty nodeSet argument will give a true comparison result | remember that an empty nodeSet argument will give a true comparison result | ||
(as a consequence of the rule for comparing <tt>NaN</tt>). | (as a consequence of the rule for comparing <tt>NaN</tt>). | ||
You can filter out the nodes included by empty nodeSet comparisons | You can filter out the nodes included by empty nodeSet comparisons | ||
by expanding the | by expanding the Location Path expression from <code>number(<i>LocPath</i>)</code> to | ||
< | <code><i>LocPath</i></code> <b>and</b> <code>number(<i>LocPath</i>)</code>, as described in | ||
[[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]]. | |||
<p class="note">'''Note:''' | |||
'''Note:''' | |||
In the XmlDoc API, | In the XmlDoc API, | ||
the number( | the number( ) function must be immediately followed by a comparison | ||
operator and a numeric literal. | operator and a numeric literal. | ||
This limitation is not required by XPath 1 or XPath 2. | This limitation is not required by XPath 1 or XPath 2. </p> | ||
</dl> | </dl> | ||
If an XPath expression contains a | ====Ordered string comparisons==== | ||
If an XPath expression contains a Location Path subexpression and | |||
a quoted string with an ordered comparison (that is, | a quoted string with an ordered comparison (that is, | ||
a comparison other than | a comparison other than "=" and "!="), the result is based | ||
on a byte-by-byte ordered comparison between each item of | on a byte-by-byte ordered comparison between each item of | ||
the nodeSet result of the subexpression and the literal string value. | the nodeSet result of the subexpression and the literal string value. | ||
Consider the following example: | Consider the following example: | ||
< | <p class="code">%nlis = %nod:SelectNodes('order[@date>"2007-01-01"]') | ||
</p> | |||
</ | If the value of the <code>date</code> Attribute node of an <code>order</code> | ||
If the value of the < | |||
Element child is, | Element child is, | ||
for example, | for example, <code>2007-05-17</code>, that order Element node will be included in | ||
the result. | the result. | ||
Line 1,272: | Line 1,311: | ||
Note, however, that | Note, however, that | ||
most practical ordered comparisons involve numeric values, which are | most practical ordered comparisons involve numeric values, which are | ||
supported | supported. | ||
In XPath 1, any ordered comparison is done by first converting each | In XPath 1, any ordered comparison is done by first converting each | ||
Line 1,280: | Line 1,319: | ||
Therefore, the XPath 1 result of the above example would always be | Therefore, the XPath 1 result of the above example would always be | ||
empty, because the literal is a non-numeric value. | empty, because the literal is a non-numeric value. | ||
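The contrast between the two comparison rules can be sketched in a few lines of Python. This is only an illustrative model, not XmlDoc API code: the function names are hypothetical, and encoding the strings as UTF-8 before the byte-by-byte comparison is an assumption made for the sketch.

```python
import math

def bytewise_greater(node_value: str, literal: str) -> bool:
    """Ordered string comparison done byte by byte (UTF-8 is an assumption)."""
    return node_value.encode("utf-8") > literal.encode("utf-8")

def xpath1_greater(node_value: str, literal: str) -> bool:
    """XPath 1 style: both operands are converted to numbers first."""
    def to_number(text: str) -> float:
        try:
            return float(text.strip())
        except ValueError:
            return math.nan  # non-numeric text becomes NaN
    return to_number(node_value) > to_number(literal)

# Hypothetical date Attribute values of two order Elements.
dates = ["2006-12-31", "2007-05-17"]
bytewise = [d for d in dates if bytewise_greater(d, "2007-01-01")]
xpath1 = [d for d in dates if xpath1_greater(d, "2007-01-01")]
print(bytewise, xpath1)
```

In this model only the 2007-05-17 order passes the byte-wise test, while the XPath 1 style comparison selects nothing, because any comparison involving NaN is false.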
====Direct numeric comparison==== | |||
If an XPath expression contains a Location Path subexpression | If an XPath expression contains a Location Path subexpression | ||
compared to a literal numeric value, the result is true if any node in | compared to a literal numeric value, the result is true if any node in | ||
the subexpression result, converted to a numeric value, has the | the subexpression result, converted to a numeric value, has the | ||
specified relationship ( | specified relationship ("=", "<", etc.) to the literal value. | ||
value. | |||
In the following example, an < | In the following example, an <code>order</code> child is in the result if | ||
it has an < | it has an <code>item</code> child whose <code>price</code> Attribute node is greater | ||
than 99.99: | than 99.99: | ||
< | <p class="code">%nlis = %nod:SelectNodes('order[item/@price > 99.99]') | ||
</p> | |||
</ | |||
For more examples, | For more examples, see [[#xpncmp|Successful direct numeric comparisons]] below. | ||
see | |||
If any node value used in the direct numeric comparison is non-numeric, | If any node value used in the direct numeric comparison is non-numeric, | ||
the request is cancelled. | the request is cancelled. | ||
For examples, see | For examples, see [[#xpncan|Direct numeric comparisons that cause request cancellation]] | ||
below. | below. | ||
The discussion that follows makes references to | |||
the following <code>Clothes</code> document: | |||
<p class="code"><Clothes> | |||
<shirt size="32" type="dress" sku="100"/> | |||
<shirt size="33" type="sport" sku="101"/> | |||
<shirt size="M" type="sport" sku="102"/> | |||
<shirt size="34" type="frilly" sku="103"/> | |||
</Clothes> | |||
</p> | |||
<ol> | <ol> | ||
<div id="xpncmp"></div> | <div id="xpncmp"></div> | ||
Line 1,322: | Line 1,357: | ||
the comparison is true if the numeric value, after stripping | the comparison is true if the numeric value, after stripping | ||
leading and trailing whitespace, of any of the nodes | leading and trailing whitespace, of any of the nodes | ||
in the | in the Location Path result has the specified relationship | ||
to the numeric literal. | to the numeric literal. | ||
If none of the nodes has the relationship (which includes the | If none of the nodes has the relationship (which includes the | ||
case that the Location Path result is empty), the result of the | case that the Location Path result is empty), the result of the | ||
comparison is false. | comparison is false. | ||
<p> | |||
For example, | For example, | ||
using the Clothes document described above, | using the <code>Clothes</code> document described above, | ||
the following statement prints < | the following statement prints <code>sku="100"</code>. </p> | ||
< | <p class="code">%doc:Print('/*/shirt[@size<040]/@sku') | ||
</p> | |||
</ | <p> | ||
Note that the ''numeric'' value of the node and the | Note that the ''numeric'' value of the node and the | ||
''numeric'' value of the literal are compared, | ''numeric'' value of the literal are compared, | ||
so the leading zero in < | so the leading zero in <code>040</code> here is ignored. | ||
An equivalent comparison could be < | An equivalent comparison could be <code>@size<040.00</code>, etc. | ||
The following statement prints < | The following statement prints <code>sku="101"</code>: </p> | ||
< | <p class="code">%doc:Print('/*/shirt[@size<40 and @type="sport"]/@sku') | ||
</p> | |||
</ | <p> | ||
The following statement prints <code>sku="103"</code>; | |||
the comparison of the <code>size</code> Attribute is processed | |||
for only one Element, which has a numeric <code>size</code>. | |||
As discussed below, this request would fail if the order | |||
of the attribute subexpressions were reversed. </p> | |||
<p class="code">%doc:Print('/*/shirt[@type="frilly" and @size<40]/@sku') | |||
</p> | |||
<div id="xpncan"></div> | <div id="xpncan"></div> | ||
<li>Direct numeric comparisons that cause request cancellation | <li>Direct numeric comparisons that cause request cancellation | ||
Line 1,358: | Line 1,391: | ||
a value that is non-numeric after stripping leading and trailing | a value that is non-numeric after stripping leading and trailing | ||
whitespace, the request is cancelled. | whitespace, the request is cancelled. | ||
<p> | |||
For example, | For example, using the <code>Clothes</code> document described above, | ||
using the Clothes document described above, | |||
the following statement causes the request to be cancelled when the size | the following statement causes the request to be cancelled when the size | ||
attribute ( | attribute (<code>M</code>) of the second sport shirt is compared to the number 40 | ||
(note the difference between this XPath expression and the last one | (note the difference between this XPath expression and the last one | ||
in | in [[#xpncmp|Successful direct numeric comparisons]] above): </p> | ||
< | <p class="code">%doc:Print('/*/shirt[@size<40 and @type="frilly"]/@sku') | ||
</p> | |||
</ | <p> | ||
The following statement causes the request to be cancelled (at the same Element), | The following statement causes the request to be cancelled (at the same Element), | ||
because SelectNodes continues after the first selection, unlike the | because SelectNodes continues after the first selection, unlike the | ||
Print example with the same XPath expression | Print example with the same XPath expression | ||
in | in [[#xpncmp|Successful direct numeric comparisons]] above: </p> | ||
< | <p class="code">%sh = %doc:SelectNodes('/*/shirt[@size<40 and @type="sport"]/@sku') | ||
</p> | |||
<p> | |||
</ | |||
If you want the request cancellation to be avoided in statements like these, | If you want the request cancellation to be avoided in statements like these, | ||
consider using the number( | consider using the number( ) function (see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]).</p> | ||
</ol> | </ol> | ||
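The interaction between operand order, non-numeric values, and the difference between Print and SelectNodes can be modeled roughly as follows. This is a sketch under stated assumptions, not the XmlDoc API: it assumes that <code>and</code> is evaluated left to right and stops at the first false operand (as the examples above imply), and every name in it (<code>RequestCancelled</code>, <code>size_lt</code>, <code>first_sku</code>, <code>all_skus</code>) is hypothetical.

```python
class RequestCancelled(Exception):
    """Stand-in for the request cancellation described in the text."""

# Attribute values of the four shirt Elements in the Clothes document.
SHIRTS = [
    {"size": "32", "type": "dress", "sku": "100"},
    {"size": "33", "type": "sport", "sku": "101"},
    {"size": "M", "type": "sport", "sku": "102"},
    {"size": "34", "type": "frilly", "sku": "103"},
]

def size_lt(shirt, limit):
    """Direct numeric comparison: a non-numeric value cancels the request."""
    try:
        return float(shirt["size"].strip()) < limit
    except ValueError:
        raise RequestCancelled(f"non-numeric size {shirt['size']!r}") from None

def first_sku(predicate):
    """Like the Print examples: stop at the first selected Element."""
    for shirt in SHIRTS:
        if predicate(shirt):
            return shirt["sku"]
    return None

def all_skus(predicate):
    """Like SelectNodes: evaluate the predicate for every Element."""
    return [shirt["sku"] for shirt in SHIRTS if predicate(shirt)]

def size_then_sport(shirt):      # [@size<40 and @type="sport"]
    return size_lt(shirt, 40) and shirt["type"] == "sport"

def frilly_then_size(shirt):     # [@type="frilly" and @size<40]
    return shirt["type"] == "frilly" and size_lt(shirt, 40)

def size_then_frilly(shirt):     # [@size<40 and @type="frilly"]
    return size_lt(shirt, 40) and shirt["type"] == "frilly"

print(first_sku(size_then_sport), first_sku(frilly_then_size))
```

In this model, <code>first_sku(size_then_frilly)</code> and <code>all_skus(size_then_sport)</code> both raise <code>RequestCancelled</code> when they reach the shirt whose size is <code>M</code>, which mirrors the two cancelled statements above.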
====number(LocPath) comparisons for non-numeric data==== | |||
The number( | The number( ) function can often be used to avoid request cancellation | ||
due to the presence of non-numeric data in a direct numeric comparison. | due to the presence of non-numeric data in a direct numeric comparison. | ||
For example, | For example, both statements from | ||
both statements from | [[#xpncan|Direct numeric comparisons that cause request cancellation]] | ||
above can avoid request cancellation | |||
above | |||
can avoid request cancellation | |||
(''when used with the particular document discussed in that section'') | (''when used with the particular document discussed in that section'') | ||
if the | if the Location Path <code>@size</code> is "converted" | ||
using the number( | using the number( ) function. | ||
The following statement prints < | The following statement prints <code>sku="103"</code>, even though the <code>size</code> | ||
Attribute equal to < | Attribute equal to <code>M</code> is processed before the selected Element: | ||
< | <p class="code">%doc:Print('/*/shirt[number(@size)<40 and @type="frilly"]/@sku') | ||
</p> | |||
</ | |||
Similarly, the following statement succeeds even though the < | Similarly, the following statement succeeds even though the <code>size</code> | ||
Attribute equal to < | Attribute equal to <code>M</code> is processed (and not selected): | ||
< | <p class="code">%sh = %doc:SelectNodes('/*/shirt[number(@size)<40 and @type="sport"]/@sku') | ||
</p> | |||
</ | |||
Comparisons with the number( ) function are always false for a non-numeric node value, unless the comparison is <code>!=</code>. | |||
false for a non-numeric node value, unless the comparison is | The following statements both print <code>None found</code>: | ||
<p class="code">Print %doc:ValueDefault( - | |||
The following statements both print < | '/*/shirt[number(@size) < 40 and @sku="102"]/@size', 'None found') | ||
< | |||
Print %doc:ValueDefault( - | |||
'/*/shirt[number(@size) >= 40 and @sku="102"]/@size', 'None found') | |||
</p> | |||
</ | |||
The following, however, prints < | The following, however, prints <code>M</code>, the size of the shirt with | ||
this SKU | this SKU. Its size is neither less than 40 nor greater than or | ||
equal to 40, but it is not equal to 40 (nor any other number): | equal to 40, but it is also not equal to 40 (or to any other number): | ||
< | <p class="code">Print %doc:ValueDefault( - | ||
'/*/shirt[number(@size) != 40 and @sku="102"]/@size', 'None found') | |||
</p> | |||
</ | <blockquote class="note"> | ||
'''Note:''' | <p>'''Note:''' | ||
Before substituting the number( | Before substituting the number( ) function into a direct numeric comparison, | ||
you should be aware of two differences | you should be aware of two differences | ||
between direct numeric comparison and the use of number( | between direct numeric comparison and the use of number( ): </p> | ||
<ul> | <ul> | ||
<li>Although number( | <li>Although number( ) can be used to prevent non-numeric nodes from causing | ||
request cancellation, a different number( | request cancellation, a different number( ) condition can cause cancellation: | ||
if the argument nodeSet contains more than one node. | if the argument nodeSet contains more than one node. | ||
See [[#number( | See [[#number( ) comparisons that cause request cancellation|number( ) comparisons that cause request cancellation]]. </li> | ||
<li>The result of the | |||
of number( | <li>The result of the <code>!=</code> comparison is <code>true</code> if the value | ||
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]], which follows. | of the number( ) argument nodeSet is empty. | ||
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]], which follows. </li> | |||
</ul> | </ul> | ||
</blockquote> | |||
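The rule that a <tt>NaN</tt> operand makes every comparison false except <code>!=</code> matches ordinary floating-point behavior, which the following Python sketch tabulates for the shirt whose size is <code>M</code>. The helper <code>to_number</code> is a hypothetical stand-in for number( ) applied to a single node, and Python's <code>float</code> accepts a few spellings that XPath does not, so treat it as an approximation.

```python
import math
import operator

# The six relOp comparison operators, mapped to Python equivalents.
REL_OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

def to_number(text: str) -> float:
    """Approximate number() on one node: strip whitespace, else NaN."""
    try:
        return float(text.strip())
    except ValueError:
        return math.nan

size = to_number("M")  # non-numeric, so this is NaN
results = {symbol: compare(size, 40) for symbol, compare in REL_OPS.items()}
print(results)
```

Only the <code>!=</code> entry is true, which is why the <code>!=</code> statement above prints <code>M</code> while the <code>&lt;</code> and <code>&gt;=</code> statements print <code>None found</code>.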
When a predicate contains the number( | |||
the | ====number(LocPath) != n, LocPath result is empty node-set==== | ||
When a predicate contains the number( ) function followed by | |||
the <code>!=</code> comparison, if the nodeSet result is empty, the result of | |||
the comparison is true; if any other comparison operator is used, the result is | the comparison is true; if any other comparison operator is used, the result is | ||
false. | false. | ||
For example, consider this document: | For example, consider this document: | ||
< | <p class="code"><t> | ||
<w a="1"/> | |||
<x a="PI"/> | |||
<y a="e" b="1"/> | |||
<z a="e" b="2"/> | |||
</t> | |||
</p> | |||
</ | |||
If you are using a numeric comparison to search for Attribute < | If you are using a numeric comparison to search for Attribute <code>a</code>, | ||
you should use the number( | you should use the number( ) function to avoid request cancellation, | ||
because < | because <code>a</code> has non-numeric values. | ||
The following statement sets the result nodelist to the | The following statement sets the result nodelist to the | ||
Element < | Element <code>w</code>: | ||
< | <p class="code">%nlis = %doc:SelectNodes('/t/*[number(@a) = 1]') | ||
</p> | |||
</ | |||
The following statement sets the result nodelist to the | The following statement sets the result nodelist to the | ||
Elements < | Elements <code>x</code>, <code>y</code>, and <code>z</code>: | ||
< | <p class="code">%nlis = %doc:SelectNodes('/t/*[number(@a) != 1]') | ||
</p> | |||
</ | |||
The < | The <code>b</code> Attribute, however, does not have any non-numeric | ||
values, so it can be used without number( | values, so it can be used without number( ). | ||
Each of the following two statements sets the result nodelist to the | Each of the following two statements sets the result nodelist to the | ||
Element < | Element <code>y</code>: | ||
< | <p class="code">%nlis = %doc:SelectNodes('/t/*[@b = 1]') | ||
%nlis = %doc:SelectNodes('/t/*[number(@b) = 1]') | |||
</p> | |||
</ | |||
However, the following two statements differ in their result: | However, the following two statements differ in their result: | ||
< | <p class="code">%nlis = %doc:SelectNodes('/t/*[@b != 1]') | ||
%nlis = %doc:SelectNodes('/t/*[number(@b) != 1]') | |||
</p> | |||
</ | |||
The first sets the result nodelist to the | The first sets the result nodelist to the | ||
Element < | Element <code>z</code>, while the second includes | ||
Elements < | Elements <code>w</code> and <code>x</code> as well as the Element <code>z</code>. | ||
Since they do not contain the < | Since they do not contain the <code>b</code> | ||
Attribute, the result of < | Attribute, the result of <code>number(@b)!=1</code> at elements <code>w</code> | ||
and < | and <code>x</code> is <code>true</code>. | ||
To make number( ) behave like a direct comparison in this | ||
respect, combine the Location Path itself with the number( ) comparison | ||
by using <b>and</b> in the predicate. | ||
So, for example, the following sets the result nodelist to the | |||
Element <code>z</code>, just like the direct comparison approach: | |||
<p class="code">%nlis = %doc:SelectNodes('/t/*[@b and number(@b) != 1]') | |||
</p> | |||
<p class="note">'''Note:''' | |||
The other way in which number( ) differs from direct comparison | |||
is described in [[#number( ) comparisons that cause request cancellation|number( ) comparisons that cause request cancellation]].</p> | |||
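The effect of the empty nodeSet on <code>!=</code>, and of guarding the comparison with the Location Path, can be reproduced with a small Python model of the sample document. This is an illustrative sketch only: the dictionary layout and the helper names <code>number</code> and <code>select</code> are assumptions, not part of the XmlDoc API.

```python
import math

# Element name -> attributes, mirroring the sample <t> document.
DOC = {
    "w": {"a": "1"},
    "x": {"a": "PI"},
    "y": {"a": "e", "b": "1"},
    "z": {"a": "e", "b": "2"},
}

def number(attrs, name):
    """Model of number(@name): NaN if the Attribute is absent or non-numeric."""
    value = attrs.get(name)
    if value is None:
        return math.nan  # empty nodeSet
    try:
        return float(value.strip())
    except ValueError:
        return math.nan

def select(predicate):
    """Return the names of the Elements for which the predicate is true."""
    return [name for name, attrs in DOC.items() if predicate(attrs)]

unguarded = select(lambda attrs: number(attrs, "b") != 1)               # [number(@b) != 1]
guarded = select(lambda attrs: "b" in attrs and number(attrs, "b") != 1)  # [@b and number(@b) != 1]
print(unguarded, guarded)
```

The unguarded predicate also selects <code>w</code> and <code>x</code>, which have no <code>b</code> Attribute; the guarded form selects only <code>z</code>.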
====number( ) comparisons that cause request cancellation==== | |||
When a predicate contains the number( ) function, | |||
When a predicate contains the number( | |||
the request is cancelled if the value of | the request is cancelled if the value of | ||
the nodeSet argument to the number( | the nodeSet argument to the number( ) function has more than one node. | ||
For example, consider this document: | For example, consider this document: | ||
< | <p class="code"><t> | ||
<x a="1" b="2"/> | |||
<y b="pi" a="3.14159265"/> | |||
</t> | |||
</p> | |||
</ | |||
If you are searching for all Elements that have any Attribute | If you are searching for all Elements that have any Attribute | ||
greater than 1, you can use the | greater than 1, you can use the Location Path <code>@*</code> as | ||
a wildcard comparison for any Attribute. | a wildcard comparison for any Attribute. | ||
However, you cannot use direct comparison, because some of the | However, you cannot use direct comparison, because some of the | ||
attributes are non-numeric. | attributes are non-numeric. | ||
So, you might try to use < | So, you might try to use <code>number(@*)</code>, as in the | ||
following example: | following example: | ||
< | <p class="code">%nlis = %doc:SelectNodes('/t/*[number(@*) > 1]') | ||
</p> | |||
</ | |||
However, this will cause a request cancellation, because the value of | However, this will cause a request cancellation, because the value of | ||
< | <code>@*</code> contains more than one node. | ||
In such situations, | In such situations, | ||
you must decide which node is to be converted | you must decide which node is to be converted | ||
to a number for the comparison. | to a number for the comparison. | ||
In this case, you probably want to use: | In this case, you probably want to use: | ||
< | <p class="code">%nlis = %doc:SelectNodes( - | ||
'/t/*[number(@a) > 1 or number(@b) > 1]') | |||
</p> | |||
</ | This will set the result nodelist to Elements <code>x</code> | ||
This will set the result nodelist to Elements < | and <code>y</code>. | ||
and < | |||
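The multi-node cancellation rule and the <code>or</code> rewrite can be checked against a small Python model of the same document. As before, this is a hedged sketch: <code>RequestCancelled</code>, <code>number</code>, <code>attr</code>, and <code>select</code> are hypothetical names chosen for illustration.

```python
import math

class RequestCancelled(Exception):
    """Stand-in for the request cancellation described above."""

# Element name -> attributes, mirroring the sample <t> document.
DOC = {
    "x": {"a": "1", "b": "2"},
    "y": {"b": "pi", "a": "3.14159265"},
}

def number(values):
    """Model of number(LocPath) over the values of the selected nodes."""
    if len(values) > 1:
        raise RequestCancelled("number() argument has more than one node")
    if not values:
        return math.nan
    try:
        return float(values[0].strip())
    except ValueError:
        return math.nan

def attr(attrs, name):
    """Model of @name: a list holding zero or one Attribute value."""
    return [attrs[name]] if name in attrs else []

def select(predicate):
    return [name for name, attrs in DOC.items() if predicate(attrs)]

try:
    wildcard = select(lambda attrs: number(list(attrs.values())) > 1)  # number(@*) > 1
except RequestCancelled:
    wildcard = "cancelled"

either = select(
    lambda attrs: number(attr(attrs, "a")) > 1 or number(attr(attrs, "b")) > 1
)
print(wildcard, either)
```

The wildcard form is cancelled at the first Element, because <code>@*</code> yields two nodes there, while the explicit <code>or</code> form selects both <code>x</code> and <code>y</code>.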
[[Category: Janus SOAP]] | [[Category: Janus SOAP]] | ||
[[Category:Overviews]] | [[Category:Overviews]] |
Latest revision as of 14:20, 29 May 2019
This article has information to help you use XPath arguments to various XmlDoc API methods. Most of the information is taken from the XPath 1 standard, which is the authoritative reference.
References to Version 2 of the XPath standard in this manual are to the XPath 2 standard, which became a W3C Recommendation on January 23, 2007.
The five sections in this topic explain, respectively, the following:
- How to understand the components of an XPath expression, which is composed of steps.
- The syntax of XPath.
- Some subtle aspects of XPath.
- Specific XPath axis combinations to avoid.
- The subset of XPath supported by the current version of the XmlDoc API.
XPath operation
The purpose of XPath is to select a subset of nodes from a document. This selection is done using an expression, as described by the PathExpr production (these syntax productions for XPath are shown in XPath syntax). The simple form (that is, without parentheses) of a PathExpr expression is called a Location Path (the LocPath production ([1]) in the syntax).
A Location Path consists of a series of Steps (Step ([4]) production). Each Step operates by taking an input set of nodes from the preceding step, and creating an output set of nodes. The output of the last step is the set of nodes selected by the XPath expression.
An example XPath expression is:
pitm[2]/partnum
This expression contains two Steps (the slash symbol ( / ) is used to separate the Steps in a Location Path).
Often a Step will start with an element name, which selects all the child elements with that name. In the above example, partnum children of pitm elements are selected. These child relationships are one kind of relationship between the input to a Step and the first part of the algorithm; the kinds of relationships, or Axes, are shown in the AxisName ([6]) production.
The element names in the above example are a form of NodeTest, described by the NodeTest ([7]) production. The NodeTest is used to restrict the set of nodes.
Square brackets ([ ]) in a Step surround another form of restriction, which is called a Predicate, given by the production ([8]) with the same name. A Predicate is a much more open-ended type of restriction, allowing various functions and operations, including Booleans.
The operation of a Step is as follows:
- A Step consists of an Axis, NodeTest, and zero or more Predicates.
- The input to a Step is a set of context nodes.
- The Axis produces sets of nodes, one set for each context node.
- Each of these sets is filtered by the NodeTest.
- Each of the resulting sets is filtered by the first Predicate.
- Each of the sets which are output by a Predicate is filtered by the following Predicate, if any.
- The final filtered sets are combined (using set union), and the result is the set of nodes that becomes input to the following Step.
- The result of the final Step is the result of the Location Path.
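The Step algorithm listed above can be sketched in Python for the example expression pitm[2]/partnum. This is a simplified model under stated assumptions: only the child axis is implemented, predicates are plain functions of a node and its position, the sample document is invented for illustration, and the union keeps nodes in the order they are produced rather than enforcing document order.

```python
import xml.etree.ElementTree as ET

def child_axis(node):
    """The child Axis: the child elements of one context node."""
    return list(node)

def apply_step(context_nodes, axis, node_test, predicates=()):
    """Evaluate one Step: Axis, then NodeTest, then each Predicate in turn."""
    result = []
    for context in context_nodes:
        # The Axis produces one set per context node; the NodeTest filters it.
        candidates = [n for n in axis(context) if node_test(n)]
        # Each Predicate filters the output of the previous one.
        for predicate in predicates:
            candidates = [
                n for pos, n in enumerate(candidates, start=1) if predicate(n, pos)
            ]
        # Combine the filtered sets using set union (no duplicate nodes).
        for n in candidates:
            if not any(n is seen for seen in result):
                result.append(n)
    return result

# Hypothetical document with two pitm elements.
doc = ET.fromstring(
    "<order>"
    "<pitm><partnum>A1</partnum></pitm>"
    "<pitm><partnum>B2</partnum></pitm>"
    "</order>"
)

# pitm[2]/partnum, evaluated with the order element as the context node.
step1 = apply_step([doc], child_axis, lambda n: n.tag == "pitm",
                   [lambda n, pos: pos == 2])
step2 = apply_step(step1, child_axis, lambda n: n.tag == "partnum")
print([n.text for n in step2])
```

The output of the first Step (the second pitm element) becomes the context for the second Step, whose output is the result of the Location Path.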
Axes
The various forms of the AxisName ([6]) production generate nodes based on a context node using the simple tree relationships described by the name. For example, attribute:: (abbreviated as @, the at sign) generates the set of all attributes of a context node.
The XmlDoc API supports the following axes (be sure to also read Performance considerations: Document order, certain axes):
- ancestor
- Contains the parent of the context node, its parent, and so on, to and including the Root node of the XmlDoc.
- ancestor-or-self
- The same contents as ancestor, except that it also includes the context node.
- attribute
- Contains the attributes of the context node, which must be an element.
- child
- Contains the children of the context node.
- descendant
- Contains the children of the context node, their children, and so on. Since the children of a node do not include its attributes, this axis does not include any attributes (so, this is not equivalent to a "sub-tree").
- descendant-or-self
- The same contents as descendant, except that it also includes the context node. Specified by //, an abbreviation for the step consisting of descendant-or-self::node(), this axis can be used, for example, to locate an element by its name, if the "path" to it is not known:
//foo
- following
- Contains all the nodes that are after the context node in document order, excluding any descendants and excluding attribute nodes.
- following-sibling
- Contains all the siblings of the context node that are positioned after that node. Contains no siblings if the context node is an attribute.
- parent
- Contains the parent of the context node. Each node has only one parent, except the root node, which has no parent.
- preceding-sibling
- Contains all the siblings of the context node that are positioned before that node. Contains no siblings if the context node is an attribute.
- self
- Contains the context node itself.
The following XPath axes are not supported in the current version:
- namespace
- In keeping with the XPath 2 recommendation, Sirius does not plan to support this axis at any time. You can obtain the information provided by namespace declarations by using certain XmlDoc API methods, for example, the URI function on an XmlNode.
- preceding
- Support for this axis may be added in a later version.
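As a rough illustration of how several of these axes relate to the tree, the following Python sketch computes them with the standard xml.etree.ElementTree module. It is not the XmlDoc API: the helper names (parent_of, ancestors, descendants, following_siblings) are hypothetical, and ElementTree models only elements, so the Root node and attribute nodes do not appear in the results.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<top><a><x/></a><b/><c/></top>")
# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}


def ancestors(node):
    """ancestor axis: parent, grandparent, and so on (elements only here)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def descendants(node):
    """descendant axis: children, their children, and so on; no attributes."""
    return [n for n in node.iter() if n is not node]


def following_siblings(node):
    """following-sibling axis: later children of the same parent."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]


x = doc.find("a/x")
a = doc.find("a")
ancestor_names = [n.tag for n in ancestors(x)]            # nearest first
descendant_names = [n.tag for n in descendants(doc)]      # document order
following_names = [n.tag for n in following_siblings(a)]  # document order
```

The ancestor list comes out nearest-first, which matches the reverse document order that the ancestor axis uses for position tests.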
NodeTests
The various forms of the NodeTest ([7]) production filter nodes as follows:
- NodeType '(' ')'
- This selects any node that has the respective node type; for example, comment() selects all Comment nodes in a node set.
- 'processing-instruction' '(' Lit ')'
- This selects any Processing Instruction node, if the target name is equal to the value of Lit.
- '*' | NCName ':' '*' | QName
- These forms test the name of a node, after restricting the type of node
to the "principal node type" of the Axis, as follows:
- Name tests in the attribute:: Axis restrict to Attribute nodes.
- Name tests in any other Axis restrict to Element nodes.
The name tests then filter the resulting nodes as follows:
- '*'
- This selects a node of the selected type regardless of the node's name. The "selected type" is the principal node type of the node subset selected by the preceding Axis. The default Axis type is child, so the default node type is Element.
- NCName ':' '*'
- This selects a node of the selected type if the node has an associated namespace equal to the URI associated with NCName.
- QName
- This selects a node of the selected type if the node has the same name as QName.
Namespace URI associations and QName equality are discussed in Names and namespaces.
Predicates
Each nodeSet that is the result of NodeTest filtering is input to the series of Predicates in the Step. Each Predicate's result sets are passed to the following one, and a union of the results of the last Predicate (or the NodeTest, if there are no Predicates) forms the result of the Step.
There are a variety of Predicates, and except for a numeric Predicate, a Predicate selects a node if the value of the Predicate, converted to a Boolean, is true.
Within a single step:
- Multiple Predicates are allowed.
- Location paths within Predicates can themselves contain Predicates.
Common forms of predicates use Location Path expressions,
XPath functions, or a combination of these.
For example, the following path selects all contact children of the second cust element that have a fax child element:
/active/cust[position()=2]/contact[fax]
Location Paths (described in XPath operation) in a predicate can be any supported Location Path, including multi-step and absolute expressions.
You can also use a location path or the number( ) function, followed by a comparison operator and literal, as a predicate of an XPath expression, as described in Comparison tests in predicates.
For a description of the XPath predicates currently supported in the XmlDoc API, see Predicates supported in the current version.
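The effect of the example path above (contact children, having a fax child, of the second cust element) can be tried with Python's standard xml.etree.ElementTree module, which accepts a small XPath subset. This is only an illustration outside the XmlDoc API; the sample document and its id attributes are invented for the sketch, and the position test is written in its abbreviated form, cust[2].

```python
import xml.etree.ElementTree as ET

active = ET.fromstring(
    "<active>"
    "<cust><contact id='c0'><fax/></contact></cust>"
    "<cust>"
    "<contact id='c1'><phone/></contact>"
    "<contact id='c2'><fax/></contact>"
    "</cust>"
    "</active>"
)
# Relative to the <active> element: the second cust child, then those of its
# contact children that have a fax child.
matches = active.findall("cust[2]/contact[fax]")
ids = [contact.get("id") for contact in matches]
```

Only the contact with id c2 qualifies: c0 belongs to the first cust, and c1 has no fax child.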
Functions
There are many functions defined for XPath; this section merely gives a sample of some of them. Furthermore, as of the current version, many of these are not supported. See XPath functions supported in the current version for a list of the XPath functions currently supported. All supported functions in the following list are shown as a link to the section which may contain additional notes for using the function in the XmlDoc API.
In the XmlDoc API, XPath functions are used only in predicates.
Here are some XPath functions that return a numeric result:
- last( )
- Returns the size (number of nodes) of the set that the predicate is filtering. Not supported in the XmlDoc API.
- position( )
- Returns the position of the node in the set that the predicate is filtering. The position in the set is dependent on the Axis in effect for the Predicate.
The following axes use the reverse of document order to arrange the nodes in a node set:
- ancestor
- ancestor-or-self
- preceding
- preceding-sibling
All other axes use document order to arrange the nodes in a node set.
Note: If a PathExpr is parenthesized and followed by Predicates, and if position( ) is used in those predicates, the axis in effect is the child axis.
- count(nodeSet)
- Returns the number of nodes in the argument nodeSet. Notice that this function uses a nodeSet type argument. As in this example, a LocPath can be passed:
/book/chapter[count(section) >= 3]
This expression will select all chapters that have three or more sections.
Not supported in the XmlDoc API.
- number(object?)
- Returns the numeric value of the argument (after stripping leading and trailing blanks if it is a string object). If the argument cannot be converted to a number, it returns the special value "NaN" ("Not A Number"), which is not equal to any other object (including another expression whose result is NaN). The default argument is the context node (of the "containing" expression, which, for our purposes, is the context node of the predicate containing the number( ) function). If the argument is a nodeSet, the value of the first node (in document order) of the argument nodeSet is converted to a numeric value as above. If the argument is the empty nodeSet, NaN is returned.
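The string-to-number conversion described for number( ) can be sketched in Python as follows. The sketch models the XPath 1 definition (optional minus sign, decimal digits, surrounding whitespace stripped), not the extended numeric forms accepted by the XmlDoc API; the name xpath_number is hypothetical.

```python
import math
import re

# XPath 1 Number syntax, preceded by an optional minus sign.
_XPATH1_NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")


def xpath_number(text):
    """Model of XPath 1 number() applied to a string value."""
    stripped = text.strip(" \t\r\n")
    if _XPATH1_NUMBER.fullmatch(stripped):
        return float(stripped)
    # Anything else converts to NaN, which is unequal to every value,
    # including another NaN.
    return math.nan


price = xpath_number("  42.50 ")
not_a_number = xpath_number("forty-two")
nan_equals_itself = not_a_number == not_a_number
```

Because NaN never compares equal, a comparison such as number(@size) = number(@size) is false for a node whose size attribute is not numeric.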
Here are some XPath functions that return a string result:
- string(object?)
- This function, like several other XPath functions, allows different kinds of arguments: for example, it can be used to convert a number to a string.
The string( ) function is implicitly used when a comparison is made between a node and a string value, for example:
/book/chapter[@title = 'Introduction']
In this case, each node in the node set that is the result of the @title PathExpr is converted using string( ), then compared to the string literal Introduction.
The default argument of the string( ) function is the context node of the expression. The string( ) function, when given a node set argument, uses the string value of the first node, in document order, of that node set.
Not supported in the XmlDoc API.
- substring(string, number, number?)
- Returns the substring of the first argument, starting at the position specified by the second argument, and for the number of characters specified by the third argument (or the remainder of the string, if the third argument is omitted). As with most XPath expressions, conversions are freely done, so if the first argument is not a manifest string type, it is converted to one using the string( ) function. Not supported in the XmlDoc API.
This XPath function returns a boolean result:
- not(boolean)
- Returns true if its argument is false, and false otherwise. For example, not(paragraph[3]) selects a node if it does not have a third paragraph child.
XPath syntax
This section contains a version of the XPath syntax. See XML syntax for an explanation of the syntax conventions. The syntax below has been changed from that in the XPath Recommendation in these ways:
- Names of some non-terminals have been changed (for example, “Lit” rather than “Literal”).
- Some productions have been collapsed. This introduces superficial ambiguity that is dealt with as needed, for example, showing the precedence of operators.
- A few of the productions have been moved, to illustrate that the PathExpr is the syntax goal. It is the form of expression that selects a set of nodes, which is the purpose of XPath in the XmlDoc API. In other places in this manual, an “XPath expression” means a PathExpr.
For a "cross-reference" to the productions as contained in the XPath Recommendation, see XPath syntax cross-reference.
See also XPath supported in the current version.
[A19] PathExpr ::= LocPath
                 | PrimaryExpr Predicate+
                 | PrimaryExpr Predicate* '/' RelativeLocPath
                 | PrimaryExpr Predicate* '//' RelativeLocPath
[B15] PrimaryExpr ::= Variable | Lit | Number | FunctionCall
                    | '(' UnaryExpr ')' | '(' Expr ')'
[C27] UnaryExpr ::= PathExpr ('|' PathExpr)* | '-' UnaryExpr | PrimaryExpr
[1]  LocPath ::= RelativeLocPath | AbsoluteLocPath
[2]  AbsoluteLocPath ::= '/' RelativeLocPath? | '//' RelativeLocPath
[3]  RelativeLocPath ::= Step ('/' Step)* | RelativeLocPath '//' Step
[4]  Step ::= '.'    /* self::node() */
            | '..'   /* parent::node() */
            | (AxisName '::' | '@')? NodeTest Predicate*
[6]  AxisName ::= 'ancestor' | 'ancestor-or-self' | 'attribute' | 'child'
                | 'descendant' | 'descendant-or-self' | 'following'
                | 'following-sibling' | 'namespace' | 'parent'
                | 'preceding' | 'preceding-sibling' | 'self'
[7]  NodeTest ::= '*' | NCName ':' '*' | QName
                | NodeType '(' ')'
                | 'processing-instruction' '(' Lit ')'
[8]  Predicate ::= '[' Expr ']'
[16] FunctionCall ::= FunctionName '(' ( Expr ( ',' Expr )*)? ')'
[21] Expr ::= EqExpr ( ('and' | 'or') EqExpr )*
[23] EqExpr ::= RelExpr ( ('=' | '!=') RelExpr )*
[24] RelExpr ::= NumExpr ( ('<' | '>' | '<=' | '>=') NumExpr )*
[25] NumExpr ::= UnaryExpr ( ('+' | '-' | '*' | 'div' | 'mod') UnaryExpr )*
[29] Lit ::= '"' [^"]* '"' | "'" [^']* "'"
[30] Number ::= [0-9]+ ('.' [0-9]*)?
[35] FunctionName ::= QName - NodeType
[36] Variable ::= '$' QName
[38] NodeType ::= 'comment' | 'text' | 'processing-instruction' | 'node'
XPath syntax notes
For information about XPath functions, see Functions.
- In [A19], [2], and [3], the double slash (//) is an abbreviation for /descendant-or-self::node()/
- When an at sign (@) is used in a Step ([4]), it is an abbreviation for attribute::
- The syntax for Step ([4]) notes that it may begin directly with a NodeTest. In that case, child:: is implied before the NodeTest.
- The syntax for QName and NCName is given in Name and namespace syntax.
- The precedence of expressions from Expr ([21]), EqExpr ([23]), RelExpr ([24]), and NumExpr ([25]) is as follows (lowest precedence first):
- or (Short-circuit evaluation)
- and (Short-circuit evaluation)
- =, != (On node sets geared to one operand singleton)
- <=, <, >=, > (Node sets geared to one operand singleton; no string ordering)
- +, -
- *, div, mod
The operators are all left-associative. For example, 3 > 2 > 1 is the same as (3 > 2) > 1, which evaluates to false.
- The only forms of PrimaryExpr ([B15]) that can create node sets, and so be used in a PathExpr ([A19]), are the id( ) function and parenthesized LocPath ([1]) (or parenthesized unions of them, with | ([C27])).
See also the notes in XPath supported in the current version.
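The left-associativity note for 3 > 2 > 1 can be checked with a short Python sketch. It assumes the XPath 1 rule that a relational comparison converts both operands to numbers (so the Boolean true becomes 1); the name xpath_greater is hypothetical.

```python
def xpath_greater(left, right):
    """Model of XPath 1 '>' on numbers and booleans: compare as numbers."""
    return float(left) > float(right)


# (3 > 2) > 1: the inner result, true, converts to 1, and 1 > 1 is false.
left_associative = xpath_greater(xpath_greater(3, 2), 1)

# Python's own chained comparison means (3 > 2) and (2 > 1), which is True,
# so the XPath expression cannot be read the way a Python programmer might.
python_chained = 3 > 2 > 1
```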
XPath syntax cross-reference
Here is a listing of all first productions contained in various numbered sections and unnumbered subsections from the XPath Recommendation. This may be helpful if you want to cross-reference the productions shown in XPath syntax with those in the XPath Recommendation.
- Section in XPath Recommendation
- Production number and name
- 2 Location Paths
- [1] LocationPath
- 2.1 Location Steps
- [4] Step
- 2.2 Axes
- [6] AxisName
- 2.3 Node Tests
- [7] NodeTest
- 2.4 Predicates
- [8] Predicate
- 2.5 Abbreviated Syntax
- [10] AbbreviatedAbsoluteLocationPath
- 3.1 Basics
- [14] Expr
- 3.2 Function Calls
- [16] FunctionCall
- 3.3 NodeSets
- [18] UnionExpr
- 3.5 Numbers
- [25] AdditiveExpr
- 3.7 Lexical Structure
- [28] ExprToken
. . .
[39] ExprWhitespace (last production)
Some notes on XPath usage
The following subsections describe some subtle issues in XPath, which the XmlDoc API implements exactly as specified in the recommendations.
Note: Predicates in XPath expressions, which are widely used, are enclosed within square bracket characters ([ ]). However, simply typing square bracket characters in your XPath expressions risks invalid character errors arising from a mismatch between a 3270-keyboard codepage and the codepage set for the Online with the UNICODE command.
Model 204 7.6 maintenance added XHTML entities for left and right square brackets in response to this problem. These entities are used in some of the predicate examples shown below.
"//" and "."
As mentioned in Performance considerations: Document order, certain axes, the descendant-or-self axis (commonly appearing in an XPath expression with the // abbreviation) should generally be avoided, due to possibly incurring extra CPU and DKRD overhead. In addition, if you have verified the performance questions, be sure that you understand the meaning of //. For example:
- The expression //chapter[1] is not the first chapter in the document; it is the first chapter child of each element in the document.
- An XPath expression that begins with // is an absolute expression; if you want to use it at the start of a relative XPath expression, make use of the . step:
%chapter = %doc:SelectSingleNode('/book/chapter[title="Aerobics"]')
%sections = %chapter:SelectNodes('.//section')
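The first point above, that //chapter[1] is the first chapter child of each parent rather than the first chapter in the document, can be observed with Python's standard xml.etree.ElementTree module. This is an illustration outside the XmlDoc API; the sample document and its n attributes are invented, and the paths start with . because ElementTree evaluates them relative to an element.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book>"
    "<part><chapter n='1'/><chapter n='2'/></part>"
    "<part><chapter n='3'/><chapter n='4'/></part>"
    "</book>"
)
# .//chapter[1]: the first chapter child of every element that has
# chapter children, so one result per <part>.
first_of_each_parent = [c.get("n") for c in book.findall(".//chapter[1]")]

# The first chapter in the whole document, taken in document order.
first_in_document = book.find(".//chapter").get("n")
```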
Attributes: not children, and excluded from most axes
One subtle point to observe is that attributes are not children of their parents! As stated in the XPath Recommendation:
- 2.2 Axes
- . . .
- the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes
Also, Attributes cannot be parents, and they are explicitly excluded from some axes.
The only axes which may include Attribute nodes are:
- ancestor-or-self
- attribute (abbreviated @)
- descendant-or-self (abbreviated //)
- self
Note that the child axis is not in the above list, and so it will never include an Attribute node; thus, since child is the default axis, an Attribute node will never be the result of a step which does not explicitly include one of the above (abbreviated or unabbreviated) axes.
Order of nodes: node sets versus nodelists
The order of nodes in an XML document is the order in which the node (or its start-tag, in the case of an Element node) first occurs in the serial form of the document. Thus, in document order, the Root node is first, an Element node occurs before its Attribute nodes and Namespace declarations, which appear before the children of the Element, and so on.
For the sake of simplicity and to be consistent with XSLT, the order of nodes in an XmlNodelist in the XmlDoc API is also document order.
In XPath, however, document order is not always used. In an XPath expression step, the axis implies an order. This order is important for the position( ) and last( ) predicate functions, which filter a node based on its order in a node set. This order is the same as document order, except for the following axes (for which the order is the reverse of document order):
- ancestor
- ancestor-or-self
- preceding (not supported by the XmlDoc API)
- preceding-sibling
For example, consider the following document:
<top>
  <a/>
  <b/>
  <c/>
  <d/>
  <e/>
</top>
Using XPath to select the first of the "following" siblings (/*/c/following-sibling::*[1]) yields the equivalent of the first element in document order: d.
However, selecting the first of the preceding siblings (/*/c/preceding-sibling::*[1]) yields the first element in reverse document order: b.
This reverse ordering is apparent in some contexts but not in others. For example, the XPath expression in the following statement is used against the document above (call it %doc) to select nodes into an XmlNodelist:
%nodelist = %doc:SelectNodes('/*/c/preceding-sibling::*')
The XmlDoc API (re)arranges the two found nodes in %nodelist into document order: a and b, in that order, and the following statement selects b, the second of the nodes:
Print %nodelist:Item(2):LocalName
Yet, if you use the following statement (instead of the previous two) in an attempt to directly select the "second" preceding sibling, the result is node a:
Print %doc:LocalName('/*/c/preceding-sibling::*[2]')
The Item method, above, selects the second node in the set of document-ordered nodes in the XmlNodelist. The position( ) function above (the 2 within the brackets) selects the second node in the set of reverse-document-ordered nodes passed from the preceding-sibling axis after filtering by the * NodeTest.
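The difference between the two orderings can be modeled in a few lines of Python. The sketch below uses the standard xml.etree.ElementTree module only to hold a document with the element children a, b, c, d, and e; the list slicing stands in for the preceding-sibling axis and for the document-ordered nodelist, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

top = ET.fromstring("<top><a/><b/><c/><d/><e/></top>")
children = list(top)
c_index = [e.tag for e in children].index("c")

# preceding-sibling axis order: reverse document order, nearest sibling first.
axis_order = children[:c_index][::-1]
# A nodelist holding the same nodes, rearranged into document order.
document_order = children[:c_index]

second_on_axis = axis_order[2 - 1].tag          # models preceding-sibling::*[2]
second_in_nodelist = document_order[2 - 1].tag  # models Item(2) on the nodelist
```

The same two nodes are involved in both cases; only the order in which positions are counted differs.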
Performance considerations: Document order, certain axes
This section discusses the performance implications of evaluating certain XPath expressions. The expressions of concern have a common characteristic — they are not simple XPath expressions.
Simple XPath expressions, which have no special performance considerations, are any of these:
- One or more steps containing only child, attribute, or self axes.
- A parent axis used in the first step, after which may be one or more steps containing only child, attribute, or self axes.
- A following-sibling axis used alone.
The rest of this section considers XPath expressions that are not simple and therefore might have negative performance implications. If your use of XPath is confined to the simple expressions defined above, the following discussion is not your concern.
The XPath expression arguments of methods like SelectNodes and UnionSelected in the XmlNodelist class designate a set of nodes. In addition to these "set-valued" methods, XPath expressions can be used in many XmlDoc API methods (SelectSingleNode, Value, DeleteSubtree, QName, and more) to operate on a single node that satisfies the expression. For a simple XPath expression (described above), the "single node" methods may be able to determine the desired node by scanning fewer nodes, giving better performance than the set-valued methods for the same expression.
The single-node XPath selection by the XmlDoc API returns the first node in document order, but with non-simple XPath expressions, this is not the same as the first node found by the XPath internal selection algorithm, which may visit nodes in a different order. In those cases, an entire subtree is examined to determine the first node, in document order, that the XPath expression selects.
In other words, given an XPath expression expr that uses any of the axis cases described below in The extra-processing expressions and given any single-node selection method XMeth, this expression:
%obj:XMeth(expr)
scans as many nodes as:
%obj:SelectNodes(expr):Item(1):XMeth
For all other XPath expressions, the number of nodes scanned by the first of these two approaches may be significantly lower, because the first node internally selected will also be the first node in document order.
For example, consider the following document:
<top> <a> </a> </top>
When, say, Value('/*/*/b/@x') is evaluated, the document search ends when the first match is found (and the Value method returns 1).
But when Value('//b/@x') is evaluated, the document search first finds the match x=2, then it continues searching the entire document for all matches, to ensure that the match which is lowest in document order (x=1) is the result.
The performance implications of the expressions that involve extra processing apply to the set-valued methods as well. The set methods must produce their results in document order, but the nodes selected during XPath evaluation may be selected in an order (due to the selection algorithm) that differs from document order.
The extra-processing expressions
Extra processing can occur in the following cases:
- The presence of the preceding-sibling axis
- The presence of the ancestor axis
- The presence of the ancestor-or-self axis
- The presence of the following axis, if it is not the first step in the expression
- The presence of any of these axis combinations:
  - One of the following axes:
    - descendant
    - descendant-or-self
    - following
    - parent, if it is not the first step in the expression
    followed, in a subsequent step, by any of these axes:
    - parent
    - child
    - following-sibling
    - descendant
    - descendant-or-self
  - The parent axis, if it is not the first step in the expression, followed, in a subsequent step, by the attribute axis.
In addition to the cost of the actual XPath search performed with the above expressions, they can incur an additional "one-time" cost for XPath evaluation. If the document has been modified in such a way that the internal order of the nodes cannot be guaranteed to be the same as document order (this will always happen with any of the XML Insert..Before methods, and usually will happen with any of the Add.. methods), then the entire document (not only the subtree being searched) must be scanned so that the order is adjusted. This does not involve any internal movement of the nodes, but it does require a full scan. This adjustment "fixes up" the XmlDoc and it will remain "fixed up" for subsequent XPath searches, until such time as the XmlDoc is subsequently updated in such a way that the internal order does not guarantee document order.
Note:
- One important exception to the above rules is the descendant-or-self::node() step followed immediately by the child axis without any predicate. An example of the usual way to specify this is:
  //chapter
  In this case, the internal node selection algorithm operates in document order, and no extra processing is incurred.
  Even with this special case, it is better to avoid the descendant-or-self step (specified explicitly or by using //) if your document structure lends itself to explicitly specifying the "intermediate" elements with * (or even better, with their names) that should be matched.
- The considerations described in this section only apply to the "outer" XPath expression; they do not apply to any expression within a predicate. Although it is still better, for the sake of efficiency, to prune the search by explicitly specifying "intermediate" elements rather than using //, there is no efficiency concern due to the internal order of node selection with an XPath predicate such as the following:
  Print %d:Value('/book/chapter' With '[.//credit/details/@auth="Dave"]')
- In conclusion, except when you must use the "//chapter" exception discussed in Note 1, above, avoid these extra-processing axes and axis combinations (especially in outer XPath expressions) if your documents are relatively large and performance is a consideration.
XPath supported in the current version
This section contains a condensed excerpt of the XPath syntax, showing only those parts of XPath used in the current version. It also explains any differences in the result of XPath expressions in the XmlDoc API versus that specified in the XPath 1 standard.
See also:
- XML syntax, for an explanation of the syntax conventions used in this manual
- XPath syntax, for the full XPath 1 syntax.
- Predicates supported in the current version and XPath functions supported in the current version
In most cases, the syntax below is a subset of the XPath 1 standard; the syntax of NumericLit ([30]), however, is an extension of XPath 1, whose comparable production (Number, [30]) is limited to the production below of DecimalNumber ([30d]).
[1]   LocPath ::= RelativeLocPath | '/' RelativeLocPath?
[3]   RelativeLocPath ::= Step ('/' Step)*
[4]   Step ::= '.'    /* self::node() */
             | '..'   /* parent::node() */
             | (AxisName '::' | '@')? NodeTest Predicate*
[6]   AxisName ::= 'ancestor' | 'ancestor-or-self' | 'attribute' | 'child'
                 | 'descendant' | 'descendant-or-self' | 'following'
                 | 'following-sibling' | 'parent' | 'preceding-sibling' | 'self'
[7]   NodeTest ::= '*' | NCName ':' '*' | QName
                 | 'node()' | 'comment()' | 'text()'
                 | 'processing-instruction(' StringLit? ')'
[8]   Predicate ::= '[' PredExpr ']' | 'not' '(' PredExpr ')'
      PredExpr ::= 'position()' CmpOp PositiveInteger
                 | PositiveInteger   /* “simple position test” */
                 | PathExpr          /* “Existence test” */
                 | Comparison
                 | PredExpr ('and' | 'or') PredExpr
                 | '(' PredExpr ')'
      Comparison ::= PathExpr CmpOp StringLit
                   | PathExpr CmpOp NumericLit
                   | 'number' '(' PathExpr? ')' CmpOp NumericLit
      CmpOp ::= '=' | '!=' | '<' | '>' | '<=' | '>='
[29]  StringLit ::= '"' [^"]* '"' | "'" [^']* "'"
[30]  NumericLit ::= DecimalNumber ('E' ('+' | '-')? NonNegInteger)?
[30d] DecimalNumber ::= ('+' | '-')? NonNegInteger Fraction?
                      | ('+' | '-')? NonNegInteger '.'
                      | ('+' | '-')? Fraction
[30f] Fraction ::= '.' NonNegInteger
[30p] PositiveInteger ::= [1-9] [0-9]*
[30p] NonNegInteger ::= [0-9]+
Explanatory notes
The following notes are intended only to explain the above syntax; they do not present any limitations on XPath support in the XmlDoc API:
- When the at sign (@) is used in a Step ([4]), it is an abbreviation for attribute::
- The syntax for Step ([4]) notes that it may begin directly with a NodeTest. In that case, child:: is implied before the NodeTest.
- The syntax for QName is given in Name and namespace syntax.
- A node is selected by a PositiveInteger predicate if the position of the node, in the set which the predicate is filtering, is equal to that PositiveInteger.
- A node is selected by an Existence test if the result of the PathExpr, using that node as the context node, is non-empty.
- A node is selected by a Comparison if any node in the result of the PathExpr, using that node as the context node, holds the specified relationship to the Lit.
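The NumericLit ([30]) production in the syntax above can be transcribed into a regular expression, which is a convenient way to check which literal forms it admits. The Python sketch below is a direct reading of the productions as printed (uppercase E only); it is not taken from the XmlDoc API implementation, and the name is_numeric_lit is hypothetical.

```python
import re

# DecimalNumber: optional sign, then digits with an optional fraction or
# trailing '.', or a bare fraction. An optional E-format exponent follows.
NUMERIC_LIT = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)(E[+-]?\d+)?")


def is_numeric_lit(text):
    """Return True if text matches the NumericLit production as printed."""
    return NUMERIC_LIT.fullmatch(text) is not None


accepted = [s for s in ["200", "+7.", ".5", "-0.25", "1.003E-5"]
            if is_numeric_lit(s)]
rejected = [s for s in [".", "E5", "1.2.3", "1E", ""]
            if not is_numeric_lit(s)]
```

All five strings in the first list match, and all five in the second list are rejected, which shows the E-format extension over the XPath 1 Number production.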
Restrictions and limitations
The following notes concern limitations on XPath support in the XmlDoc API:
- One way to summarize the XPath productions that are not supported in the current version is to list the XPath operators that are not supported, as shown in the following table:
  Unsupported operators    Meaning       Comments
  + - * div mod            arithmetic
  |                        union         The UnionSelected method can be used to form the union of two nodesets.
  Note: Parentheses are allowed for grouping within Boolean expressions within predicates, but this is the only place they are supported.
- As of the current version, XPath function support is limited enough that it can be shown in the syntax above. However, for clarity, see XPath functions supported in the current version for a list of supported functions and for any differences between the XmlDoc API implementation and the XPath 1 definition of a function.
- The Boolean operators (and, or) and relational operators (= != < <= > >=) are supported only in (some) predicates (see Comparison tests in predicates).
- XPath variables ($var_name) are not supported.
- The numeric constants +/- infinity are not supported.
- The size of an XPath expression is limited to approximately 26 steps, if each has an NCName NodeTest.
- In the XmlDoc API, a numeric value (either a literal or a node value) may be of any form available in SOUL. In particular, "E-format" literals, such as 1.003E-5 (even though they are not very common in XML documents) may be specified. The same form of numbers is available in XPath 2. XPath 1 only allows decimal numbers; it does not allow E-format literals nor node values.
- The precision used in the XmlDoc API XPath support is that provided by SOUL, namely 15 decimal digits.
- If the XPath support in the XmlDoc API attempts to convert a long string (that is, longer than 255 bytes) or a number whose absolute value is beyond the capabilities of SOUL (maximum absolute value approximately 7.237E75), the request is cancelled.
Predicates supported in the current version
The XmlDoc API supports these predicates:
- Two types of Location-Path-expression predicates:
  - A Location Path (that is, production [1], LocPath in XPath supported in the current version) used as an existence test
    If the nodeSet that results from the Location Path is non-empty, the predicate evaluates as true. The usual (but not only) purpose of this predicate is to select a node if it has at least one attribute or element with a given name.
    For example, the following expression selects (as the XPath argument to the SelectNodes method, for example) all contact children of cust elements, if the cust element has an invoice child element and contact has a fax child element:
    '/active/cust&lsqb;invoice&rsqb;/contact&lsqb;fax&rsqb;':u
    Note: This example substitutes the &lsqb; and &rsqb; entities for left and right square-bracket characters from the keyboard, for the reason explained at the beginning of this section. Notice also the accompanying use of the U method. The value of the expression is:
    /active/cust[invoice]/contact[fax]
    Most of the remaining predicate examples below omit the entities in order to more clearly display the XPath grammar.
  - A Location Path expression with a comparison operator and literal
    For example, @price > 200 selects a node if the numeric value of the node's price Attribute is greater than 200.
    See Comparison tests in predicates for further discussion.
- These types of function predicates:
  - A "simple" position test using a numeric literal n
    This test is equivalent to the implicit use of the position( ) function in the predicate term position()=n.
    For example:
    /book/chapter[2]/section[9]/paragraph[3]
  - The number( ) function with a location path argument, followed by a comparison to a numeric literal
    For example, number(@size) > 30 selects a node if the numeric value of the node's size Attribute is greater than 30.
    This predicate differs from the similar Location Path example above (@price > 200) primarily in that it allows the Attribute value to be non-numeric. The previous Location Path example cancels the request if a numeric comparison is performed with a price Attribute whose value is non-numeric.
    See Comparison tests in predicates for further discussion.
  - The position( ) function, followed by a comparison operator, followed by an integer literal, which may be negative or zero.
    For example:
    /book/chapter[position()>1]/section[2]
  - The not( ) function, which returns the opposite boolean value of its boolean argument.
    For example:
    /book/chapter[2]/section[9]/not(paragraph[3])
- Nested predicates.
  For example, this statement selects each Chapter whose first Section has a Racy attribute:
  %lis = %bk:SelectNodes('Chapter&lsqb;Section&lsqb;1 and @Racy&rsqb;&rsqb;':u)
  Note: As in an earlier example, the &lsqb; and &rsqb; entities in this example require Model 204 7.6 or higher. The value of the XPath argument is:
  Chapter[Section[1 and @Racy]]
- Multiple predicates in a single step.
  For example, using the position() function to filter based on the position of nodes from the preceding predicate, rather than from the step's NodeTest:
  /book/chapter[author="Alex"] [2]
  The preceding two-predicate step selects the second chapter child that is authored by Alex, while the following expression selects the second chapter child of the book, if its author is Alex:
  /book/chapter[author="Alex" and 2]
  Parentheses for grouping in Boolean expressions are supported. For example:
  chapter[@type="methods" and (@class="Stringlist" or @class="Daemon")]
- Combination predicates.
Predicates may combine any of these supported functions and supported Location Path expressions using the and and or Boolean operators.
For example:
/active/cust[invoice and position()>1]
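The distinction drawn above under "Multiple predicates in a single step" (a position test in a second predicate versus a position test combined with and in one predicate) can be modeled in Python. The sketch follows the behavior as this section describes it for the XmlDoc API; the sample data and the helper name filter_with_position are invented for illustration.

```python
chapters = [
    {"title": "One", "author": "Alex"},
    {"title": "Two", "author": "Sam"},
    {"title": "Three", "author": "Alex"},
]


def filter_with_position(nodes, test):
    """Apply one predicate; test receives (node, position in current set)."""
    return [n for pos, n in enumerate(nodes, start=1) if test(n, pos)]


# chapter[author="Alex"] [2]: filter by author first, then take position 2
# among the chapters that survived the first predicate.
by_author = filter_with_position(chapters, lambda n, pos: n["author"] == "Alex")
two_predicates = filter_with_position(by_author, lambda n, pos: pos == 2)

# chapter[author="Alex" and 2]: a single predicate, so position is counted
# among all chapter children.
single_predicate = filter_with_position(
    chapters, lambda n, pos: n["author"] == "Alex" and pos == 2)

two_predicate_titles = [n["title"] for n in two_predicates]
single_predicate_titles = [n["title"] for n in single_predicate]
```

With this data the two-predicate form selects chapter Three, while the single-predicate form selects nothing, because the second chapter child was not written by Alex.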
XPath functions supported in the current version
The following XPath functions are supported in the current version, and Functions gives their XPath 1 definitions. Any differences between the XPath 1 definition and the XmlDoc API implementation are shown below.
Note:
In discussing XPath functions, the name of the function followed by an empty pair of parentheses (for example, number()) is sometimes used to name the function, whether or not the particular function being discussed takes arguments.
- position( )
Performs as specified in the XPath standard.
- not(bool)
The function argument is a Boolean expression, and the function result is true if the value of the argument is false, and it is false otherwise.
Notes:
- Performs as specified in the XPath standard.
- The result of the not() function applied to a comparison
expression is different than the result of the same expression with
the complementary comparison.
For example,
this statement selects children that have the value of the status
attribute equal to "pending":
%lis = %nod:SelectNodes('*[@status="pending"]')
This statement selects children that have the value of the status attribute equal to something other than “pending”:
%lis = %nod:SelectNodes('*[@status!="pending"]')
This statement selects children that have the value of the status attribute equal to something other than “pending” or that have no status attribute:
%lis = %nod:SelectNodes('*[not(@status="pending")]')
- number(nodeset?)
The XmlDoc API number( ) function differs from the XPath 1 definition as follows:
- XPath 1 allows a variety of argument types (for example, a string literal); the XmlDoc API allows only a nodeSet argument.
- In XPath 1, if a nodeSet argument to number( )
contains more than one node, the first node
(in document order) is converted to a number and returned.
In the XmlDoc API, if the argument result contains more than one node, the request is cancelled, which is consistent with the XPath 2 standard.
- The definition of a numeric value for number( ) (after stripping leading and trailing whitespace) is the same as the NumericLit production ([30]) in XPath supported in the current version. This is consistent with the XPath 2 standard, and is an extension of the XPath 1 definition of number( ), which only accepts numbers of the form DecimalNumber ([30d]).
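The difference between not(@status="pending") and @status!="pending" described under not(bool) above can be modeled outside the XmlDoc API. The following Python sketch is an illustration only, not XmlDoc API code: each child element is represented by a hypothetical dictionary of its attributes (with an invented "name" key used as a label), and each function mimics one of the three predicates.

```python
# Illustrative model only: each dict stands for one child element.
# "name" is a hypothetical label; "status" is the attribute being tested.
children = [
    {"name": "a", "status": "pending"},
    {"name": "b", "status": "shipped"},
    {"name": "c"},  # no status attribute at all
]

def status_equals(elem, value):
    # Models [@status="pending"]: the attribute must exist and match.
    return "status" in elem and elem["status"] == value

def status_not_equals(elem, value):
    # Models [@status!="pending"]: the attribute must exist and differ.
    return "status" in elem and elem["status"] != value

def not_status_equals(elem, value):
    # Models [not(@status="pending")]: plain negation, so an element
    # without the attribute also qualifies.
    return not status_equals(elem, value)

equal = [e["name"] for e in children if status_equals(e, "pending")]
differ = [e["name"] for e in children if status_not_equals(e, "pending")]
negated = [e["name"] for e in children if not_status_equals(e, "pending")]
print(equal, differ, negated)  # ['a'] ['b'] ['b', 'c']
```

Only the negated form picks up the element that has no status attribute, which is the point of the third SelectNodes statement above.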
Comparison tests in predicates
In the XPath standard (XPath 1 and XPath 2), either operand in a comparison test in a predicate may be any form of XPath expression. The predicate evaluates as true if the comparison is true of at least one node in the resulting nodeSet, and typically the purpose of the predicate is to select the nodes for which the comparison is true.
For example, the following expression selects all item children of order elements that have a price attribute whose value is greater than 9.99:
order/item[@price > 9.99]
XmlDoc API predicate comparisons differ from XPath 1 comparisons:
- In XmlDoc API predicates, comparison operands are more restricted than in the XPath standard. As is explained in the syntax discussion below, you can use only a Location Path or the number( ) function, followed by a comparison operator, followed by a literal.
- The XmlDoc API uses XPath 2 comparisons.
The XPath 1 standard does not provide for exception conditions
and it does not provide for ordered string comparisons.
This is also true for Microsoft .Net, which follows the XPath 1 standard.
The XmlDoc API follows the XPath 2 standard by providing for exceptions (implemented as request cancellation conditions) and providing for ordered string comparisons.
As of the current version, the only forms of comparisons are these four:
position() relOp integer
LocPath relOp "stringLiteral"
LocPath relOp numericLiteral
number([LocPath]) relOp numericLiteral
Where:
- position( )
- This function is discussed in Predicates supported in the current version.
- relOp
- One of the comparisons: =, !=, <, <=, >, >=
- integer
- An integer, whose precision is limited to 15 decimal digits (as in SOUL).
- LocPath
- An XPath location expression (that is, production [1], LocPath, in XPath supported in the current version). As of the current version, such an expression still has the limitation that it may not contain a predicate.
- "stringLiteral"
- A quoted string literal value, which must not exceed 255 bytes.
The XmlDoc API has always supported ordered string comparisons, but the XPath 1 standard does not. For more information about these comparisons, see Ordered string comparisons.
- numericLiteral
- A numeric literal value,
whose precision is limited to 15 decimal digits (as in SOUL).
For additional format and size limitations, see Restrictions and limitations.
For more information about support for comparisons of a Location Path to a numeric literal, see Direct numeric comparison.
Numeric literals in predicate comparisons are supported in the XmlDoc API.
- number([LocPath])
- The number( ) function with an optional Location Path argument.
A comparison using the number( ) function is very similar to comparison of a Location Path to a numeric literal (Direct numeric comparison). Comparing the result of number( ) to a literal gives a result according to their relative values and to the comparison operator. For example,
shirt[number(@size) > 30]
selects nodes that have a size greater than 30.
The significant difference between using a Location Path and using a number( ) function in a numeric comparison is that the request is cancelled in the former case if a node in the comparison is non-numeric. This difference is discussed briefly below and in greater detail in number(LocPath) comparisons for non-numeric data.
These are the effects of the function's LocPath argument (they are consistent with the XPath 2 standard):
- If the result of the LocPath argument is a single node, number( ) converts the value of the node, after stripping leading and trailing whitespace, to a number, or to the special value NaN ("Not a Number") if the stripped value of the node is not numeric.
- If the result of the argument has more than one node, the request is cancelled. For further details, see number( ) comparisons that cause request cancellation.
- If the LocPath argument is the empty nodeSet, the result of the
number( ) function is NaN.
See number(LocPath) != n, LocPath result is empty node-set for examples.
Note: In XPath 1 (which has no exception conditions), if there is more than one node in the nodeSet argument, the value of the first node (in document order) is used.
- If you omit the LocPath argument,
the default argument "." (the context node) is used; that is,
the node that is being filtered by the predicate gets
converted to a numeric value.
For example, in the following XPath expression, the number( ) function converts the value of the size Attribute to a number:
/*/shirt/@size[number() > 10]
If the number( ) result that is compared to a literal is NaN, the comparison is always false (or, in the case of the != operator, is always true). This is important to note, because it means number( ) can be used to avoid the request cancellation to which numeric comparisons are subject if the nodes evaluated by a predicate may be non-numeric. For further discussion, see number(LocPath) comparisons for non-numeric data.
If you are using number( ) to avoid request cancellation for a numeric comparison because the nodes evaluated by a predicate may be non-numeric, and you are using the "not equals" comparison (!=), remember that an empty nodeSet argument will give a true comparison result (as a consequence of the rule for comparing NaN). You can filter out the nodes included by empty nodeSet comparisons by expanding the Location Path expression from number(LocPath) to LocPath and number(LocPath), as described in number(LocPath) != n, LocPath result is empty node-set.
Note: In the XmlDoc API, the number( ) function must be immediately followed by a comparison operator and a numeric literal. This limitation is not required by XPath 1 or XPath 2.
Ordered string comparisons
If an XPath expression contains a Location Path subexpression and a quoted string with an ordered comparison (that is, a comparison other than "=" and "!="), the result is based on a byte-by-byte ordered comparison between each item of the nodeSet result of the subexpression and the literal string value.
Consider the following example:
%nlis = %nod:SelectNodes('order[@date>"2007-01-01"]')
If the value of the date Attribute node of an order Element child is, for example, 2007-05-17, that order Element node will be included in the result.
This behavior has always been available in the XmlDoc API: if a comparison literal is bracketed in double or single quotation marks, a string comparison is performed, whether or not the literal has a numeric format (this is consistent with XPath 2 string comparisons). Note, however, that most practical ordered comparisons involve numeric values, which are supported.
In XPath 1, any ordered comparison is done by first converting each operand to a numeric value and then performing the comparison, and the result of any ordered comparison of a non-numeric node value is false. Therefore, the XPath 1 result of the above example would always be empty, because the literal is a non-numeric value.
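The contrast between the two comparison styles can be sketched in Python. This is an illustrative model only, not XmlDoc API code: the to_number helper is a hypothetical, simplified stand-in for XPath 1 numeric conversion (Python's float() accepts some forms that XPath does not), and ASCII bytes are assumed for the byte-by-byte comparison.

```python
import math

def to_number(text):
    # Simplified XPath 1 style conversion: non-numeric text becomes NaN.
    try:
        return float(text.strip())
    except ValueError:
        return math.nan

date_value = "2007-05-17"   # value of the date Attribute
literal = "2007-01-01"      # quoted literal in the predicate

# Byte-by-byte ordered comparison, as described for the XmlDoc API.
# ASCII is an assumption; any encoding that keeps the digits in order
# gives the same answer for these two values.
string_result = date_value.encode("ascii") > literal.encode("ascii")

# XPath 1 style: both operands convert to NaN, and NaN > NaN is false.
xpath1_result = to_number(date_value) > to_number(literal)

print(string_result, xpath1_result)  # True False
```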
Direct numeric comparison
If an XPath expression contains a Location Path subexpression compared to a literal numeric value, the result is true if any node in the subexpression result, converted to a numeric value, has the specified relationship ("=", "<", etc.) to the literal value.
In the following example, an order child is in the result if it has an item child whose price Attribute node is greater than 99.99:
%nlis = %nod:SelectNodes('order[item/@price > 99.99]')
For more examples, see Successful direct numeric comparisons below.
If any node value used in the direct numeric comparison is non-numeric, the request is cancelled. For examples, see Direct numeric comparisons that cause request cancellation below.
The discussion that follows makes references to the following Clothes document:
<Clothes>
  <shirt size="32" type="dress" sku="100"/>
  <shirt size="33" type="sport" sku="101"/>
  <shirt size="M" type="sport" sku="102"/>
  <shirt size="34" type="frilly" sku="103"/>
</Clothes>
- Successful direct numeric comparisons
If a predicate contains a comparison of a Location Path to a numeric literal, the comparison is true if the numeric value, after stripping leading and trailing whitespace, of any of the nodes in the Location Path result has the specified relationship to the numeric literal. If none of the nodes has the relationship (which includes the case that the Location Path result is empty), the result of the comparison is false.
For example, using the Clothes document described above, the following statement prints sku="100":
%doc:Print('/*/shirt[@size<040]/@sku')
Note that the numeric value of the node and the numeric value of the literal are compared, so the leading zero in 040 here is ignored. An equivalent comparison could be @size<040.00, etc. The following statement prints sku="101":
%doc:Print('/*/shirt[@size<40 and @type="sport"]/@sku')
The following statement prints sku="103"; the comparison of the size Attribute is processed for only one Element, which has a numeric size. As discussed below, this request would fail if the order of the attribute subexpressions were reversed.
%doc:Print('/*/shirt[@type="frilly" and @size<40]/@sku')
- Direct numeric comparisons that cause request cancellation
If any of the nodes used in a direct numeric comparison has a value that is non-numeric after stripping leading and trailing whitespace, the request is cancelled.
For example, using the Clothes document described above, the following statement causes the request to be cancelled when the size attribute (M) of the second sport shirt is compared to the number 40 (note the difference between this XPath expression and the last one in Successful direct numeric comparisons above):
%doc:Print('/*/shirt[@size<40 and @type="frilly"]/@sku')
The following statement also causes the request to be cancelled (at the same Element), because SelectNodes continues after the first selection, unlike the Print example with the same XPath expression in Successful direct numeric comparisons above:
%sh = %doc:SelectNodes('/*/shirt[@size<40 and @type="sport"]/@sku')
If you want the request cancellation to be avoided in statements like these, consider using the number( ) function (see number(LocPath) comparisons for non-numeric data).
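The interplay of operand order, first-match processing, and cancellation described above can be modeled in Python. This is a sketch under stated assumptions, not XmlDoc API code: RequestCancelled, size_below, and matching_skus are hypothetical names, request cancellation is represented by an exception, and Python's short-circuiting "and" stands in for the left-to-right predicate evaluation that the examples imply.

```python
class RequestCancelled(Exception):
    """Hypothetical stand-in for XmlDoc API request cancellation."""

# The Clothes document, one attribute dictionary per shirt Element.
shirts = [
    {"size": "32", "type": "dress", "sku": "100"},
    {"size": "33", "type": "sport", "sku": "101"},
    {"size": "M", "type": "sport", "sku": "102"},
    {"size": "34", "type": "frilly", "sku": "103"},
]

def size_below(shirt, limit):
    # Direct numeric comparison: a non-numeric value cancels the request.
    try:
        return float(shirt["size"].strip()) < limit
    except ValueError:
        raise RequestCancelled("non-numeric size: " + shirt["size"])

def matching_skus(predicate):
    # Lazily yields the sku of each matching shirt in document order.
    return (s["sku"] for s in shirts if predicate(s))

def size_then_sport(s):
    # Models [@size<40 and @type="sport"]
    return size_below(s, 40) and s["type"] == "sport"

def frilly_then_size(s):
    # Models [@type="frilly" and @size<40]: size is tested only for
    # the frilly shirt, so "M" is never converted.
    return s["type"] == "frilly" and size_below(s, 40)

# Like the Print example: stops at the first selected node.
first_sport = next(matching_skus(size_then_sport))

frilly = list(matching_skus(frilly_then_size))

# Like SelectNodes: keeps going and reaches the shirt with size "M".
try:
    all_sport = list(matching_skus(size_then_sport))
except RequestCancelled:
    all_sport = None

print(first_sport, frilly, all_sport)  # 101 ['103'] None
```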
number(LocPath) comparisons for non-numeric data
The number( ) function can often be used to avoid request cancellation due to the presence of non-numeric data in a direct numeric comparison.
For example, both statements from Direct numeric comparisons that cause request cancellation above can avoid request cancellation (when used with the particular document discussed in that section) if the Location Path @size is "converted" using the number( ) function.
The following statement prints sku="103", even though the size Attribute equal to M is processed before the selected Element:
%doc:Print('/*/shirt[number(@size)<40 and @type="frilly"]/@sku')
Similarly, the following statement succeeds even though the size Attribute equal to M is processed (and not selected):
%sh = %doc:SelectNodes('/*/shirt[number(@size)<40 and @type="sport"]/@sku')
Comparisons with the number( ) function are always false for a non-numeric node value, unless the comparison is !=.
The following statements both print None found:
Print %doc:ValueDefault( -
   '/*/shirt[number(@size) < 40 and @sku="102"]/@size', 'None found')
Print %doc:ValueDefault( -
   '/*/shirt[number(@size) >= 40 and @sku="102"]/@size', 'None found')
The following, however, prints M, the size of the shirt with this SKU. Its size is neither less than 40 nor greater than or equal to 40, but it is not equal to 40 (nor to any other number):
Print %doc:ValueDefault( -
   '/*/shirt[number(@size) != 40 and @sku="102"]/@size', 'None found')
Note: Before substituting the number( ) function into a direct numeric comparison, you should be aware of two differences between direct numeric comparison and the use of number( ):
- Although number( ) can be used to prevent non-numeric nodes from causing request cancellation, a different number( ) condition can cause cancellation: if the argument nodeSet contains more than one node. See number( ) comparisons that cause request cancellation.
- The result of the != comparison is true if the number( ) argument nodeSet is empty. See number(LocPath) != n, LocPath result is empty node-set, which follows.
number(LocPath) != n, LocPath result is empty node-set
When a predicate contains the number( ) function followed by the != comparison, if the nodeSet result is empty, the result of the comparison is true; if any other comparison operator is used, the result is false.
For example, consider this document:
<t>
  <w a="1"/>
  <x a="PI"/>
  <y a="e" b="1"/>
  <z a="e" b="2"/>
</t>
If you are using a numeric comparison to search for Attribute a, you should use the number( ) function to avoid request cancellation, because a has non-numeric values.
The following statement sets the result nodelist to the Element w:
%nlis = %doc:SelectNodes('/t/*[number(@a) = 1]')
The following statement sets the result nodelist to the Elements x, y, and z:
%nlis = %doc:SelectNodes('/t/*[number(@a) != 1]')
The b Attribute, however, does not have any non-numeric values, so it can be used without number( ).
Each of the following two statements sets the result nodelist to the Element y:
%nlis = %doc:SelectNodes('/t/*[@b = 1]')
%nlis = %doc:SelectNodes('/t/*[number(@b) = 1]')
However, the following two statements differ in their result:
%nlis = %doc:SelectNodes('/t/*[@b != 1]')
%nlis = %doc:SelectNodes('/t/*[number(@b) != 1]')
The first sets the result nodelist to the Element z, while the second includes Elements w and x as well as the Element z.
Since they do not contain the b Attribute, the result of number(@b)!=1 at elements w and x is true.
If you want to make number( ) similar to a direct comparison in this
respect, you "and" the Location Path argument with the
number( ) factor in the predicate.
So, for example, the following sets the result nodelist to the Element z, just like the direct comparison approach:
%nlis = %doc:SelectNodes('/t/*[@b and number(@b) != 1]')
Note: The other way in which number( ) differs from direct comparison is described in number( ) comparisons that cause request cancellation.
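The results listed in this section can be checked with a small Python model of the example document. This is a sketch, not XmlDoc API code: the number and select helpers are hypothetical, a missing attribute plays the role of the empty nodeSet, and the direct comparison is modeled only for the b Attribute, whose values are all numeric.

```python
import math

# The <t> document's children as (element name, attributes) pairs.
elements = [
    ("w", {"a": "1"}),
    ("x", {"a": "PI"}),
    ("y", {"a": "e", "b": "1"}),
    ("z", {"a": "e", "b": "2"}),
]

def number(attrs, name):
    # Models number(@name): an empty nodeSet (missing attribute) or a
    # non-numeric value gives NaN.
    value = attrs.get(name)
    if value is None:
        return math.nan
    try:
        return float(value.strip())
    except ValueError:
        return math.nan

def select(predicate):
    return [name for name, attrs in elements if predicate(attrs)]

a_equals_1 = select(lambda at: number(at, "a") == 1)
a_differs = select(lambda at: number(at, "a") != 1)
# Direct comparison [@b != 1]: false when there is no b Attribute.
direct_b = select(lambda at: "b" in at and float(at["b"]) != 1)
# [number(@b) != 1]: NaN != 1 is true, so w and x are included.
number_b = select(lambda at: number(at, "b") != 1)
# [@b and number(@b) != 1]: the existence test removes w and x again.
guarded_b = select(lambda at: "b" in at and number(at, "b") != 1)

print(a_equals_1, a_differs)          # ['w'] ['x', 'y', 'z']
print(direct_b, number_b, guarded_b)  # ['z'] ['w', 'x', 'z'] ['z']
```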
number( ) comparisons that cause request cancellation
When a predicate contains the number( ) function, the request is cancelled if the value of the nodeSet argument to the number( ) function has more than one node.
For example, consider this document:
<t>
  <x a="1" b="2"/>
  <y b="pi" a="3.14159265"/>
</t>
If you are searching for all Elements that have any Attribute greater than 1, you can use the Location Path @* as a wildcard comparison for any Attribute.
However, you cannot use direct comparison, because some of the attributes are non-numeric.
So, you might try to use number(@*), as in the following example:
%nlis = %doc:SelectNodes('/t/*[number(@*) > 1]')
However, this will cause a request cancellation, because the value of @* contains more than one node.
In such situations, you must decide which node is to be converted to a number for the comparison.
In this case, you probably want to use:
%nlis = %doc:SelectNodes( -
   '/t/*[number(@a) > 1 or number(@b) > 1]')
This will set the result nodelist to Elements x and y.
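The multi-node rule can also be modeled in Python. As before, this is an illustrative sketch rather than XmlDoc API code: RequestCancelled, number, attr, and select are hypothetical names, and a node-set is represented as a plain list of string values.

```python
import math

class RequestCancelled(Exception):
    """Hypothetical stand-in for XmlDoc API request cancellation."""

# The <t> document's children as (element name, attributes) pairs.
elements = [
    ("x", {"a": "1", "b": "2"}),
    ("y", {"b": "pi", "a": "3.14159265"}),
]

def number(node_values):
    # Models number( ): more than one node cancels the request, an
    # empty node-set or a non-numeric value gives NaN.
    if len(node_values) > 1:
        raise RequestCancelled("number( ) argument has more than one node")
    if not node_values:
        return math.nan
    try:
        return float(node_values[0].strip())
    except ValueError:
        return math.nan

def attr(attrs, name):
    # Node-set for @name: zero or one value.
    return [attrs[name]] if name in attrs else []

def select(predicate):
    return [name for name, attrs in elements if predicate(attrs)]

# Models [number(@*) > 1]: every Element here has two Attributes.
try:
    wildcard = select(lambda at: number(list(at.values())) > 1)
except RequestCancelled:
    wildcard = None

# Models [number(@a) > 1 or number(@b) > 1]
explicit = select(
    lambda at: number(attr(at, "a")) > 1 or number(attr(at, "b")) > 1
)

print(wildcard, explicit)  # None ['x', 'y']
```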