XML Language Family: Detailed Examples
• Most information contained in these slides comes from: http://www.w3.org, http://www.zvon.org/
• These slides are intended to be used as a tutorial on XML and related technologies
• Slide author: Jürgen Mangler ([email protected])
• This section contains examples on: XPath, XPointer
XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations [XSLT] and XPointer. The primary purpose of XPath is to address parts of an XML document.
• XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values.
• XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax.
• XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.
• In addition to its use for addressing, XPath is also designed to feature a natural subset that can be used for matching (testing whether or not a node matches a pattern); this use of XPath is described in XSLT (next chapter).
• XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes.
The basic XPath syntax is similar to filesystem addressing. If the path starts with the slash / , then it represents an absolute path to the required element.
/AAA/CCC
Select all elements CCC which are children of the root element AAA
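A minimal sketch of how this absolute path could be evaluated with Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. The sample document and the helper name select_aaa_ccc are assumptions made for illustration. ElementTree evaluates paths relative to an element, so the leading /AAA step is checked by hand here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the slides).
DOC = "<AAA><BBB/><CCC/><BBB/><DDD><CCC/></DDD><CCC/></AAA>"

def select_aaa_ccc(xml_text):
    """Emulate /AAA/CCC: the CCC children of a root element named AAA."""
    root = ET.fromstring(xml_text)
    if root.tag != "AAA":
        # The absolute path only matches when the root element is AAA.
        return []
    # A bare tag name selects direct children only, like the child axis.
    return root.findall("CCC")

matches = select_aaa_ccc(DOC)
print(len(matches))
```

The CCC element nested inside DDD is not selected, because /AAA/CCC only looks at direct children of the root.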
The expression in square brackets can further specify an element. A number in the brackets gives the position of the element in the selected set. The function last() selects the last element in the selection.
/papers/paper[last()]
Select the last paper child of element papers
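As a rough illustration, positional predicates and last() are among the XPath features that Python's xml.etree.ElementTree supports. The document and the title attribute below are invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the "title" attribute is an assumption.
DOC = (
    "<papers>"
    '<paper title="first"/>'
    '<paper title="second"/>'
    '<paper title="third"/>'
    "</papers>"
)

root = ET.fromstring(DOC)  # root is the <papers> element

# Relative to the root, these correspond to /papers/paper[last()]
# and /papers/paper[2]. Positions are 1-based, as in XPath.
last_paper = root.find("paper[last()]")
second_paper = root.find("paper[2]")

print(last_paper.get("title"), second_paper.get("title"))
```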
Values of attributes can be used as selection criteria. Function normalize-space removes leading and trailing spaces and replaces sequences of whitespace characters by a single space.
//student[normalize-space(@name)='hauer']
Select student elements whose name attribute has the value hauer; leading and trailing spaces are removed before comparison
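ElementTree in Python's standard library does not provide normalize-space, so the sketch below emulates it with a regular expression and filters the elements by hand. The helper names and the sample document are assumptions for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# XPath whitespace: space, tab, carriage return, line feed.
_XML_WHITESPACE = re.compile(r"[ \t\r\n]+")

def normalize_space(value):
    """Collapse whitespace runs to one space and trim both ends."""
    return _XML_WHITESPACE.sub(" ", value).strip(" ")

# Hypothetical document (not from the slides).
DOC = (
    "<students>"
    '<student name="  hauer "/>'
    '<student name="hau er"/>'
    '<student name="maier"/>'
    '<group><student name="hauer"/></group>'
    "</students>"
)

def students_named(xml_text, wanted):
    """Emulate //student[normalize-space(@name)='wanted']."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, like the // abbreviation.
    return [
        s for s in root.iter("student")
        if normalize_space(s.get("name", "")) == wanted
    ]

print(len(students_named(DOC, "hauer")))
```

Note that "hau er" does not match: inner whitespace is collapsed to a single space, not removed.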
The example illustrates how axes work: starting from the context node (labeled "node"), each axis selects the nodes labeled with its name. This example is also the basis for the next two pages.
[Figure: tree diagram showing the context node and the nodes reached via the parent, preceding-sibling, following-sibling and descendant axes]
The following main axes are available:
• the child axis contains the children of the context node
• the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes
• the parent axis contains the parent of the context node, if there is one
• the following-sibling axis contains all the following siblings of the context node; if the context node is an attribute node or namespace node, the following-sibling axis is empty
• the preceding-sibling axis contains all the preceding siblings of the context node; if the context node is an attribute node or namespace node, the preceding-sibling axis is empty
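The five axes listed above can be sketched in Python on top of xml.etree.ElementTree. This is an approximation under stated assumptions: ElementTree models element nodes only (no text, attribute or namespace nodes) and stores no parent links, so a parent map is built first. All function names and the sample document are hypothetical, and every axis is returned in document order.

```python
import xml.etree.ElementTree as ET

def build_parent_map(root):
    """Map each element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}

def child_axis(node):
    return list(node)

def descendant_axis(node):
    # iter() yields the node itself first, so it is filtered out.
    return [n for n in node.iter() if n is not node]

def parent_axis(node, parents):
    parent = parents.get(node)
    return [] if parent is None else [parent]

def _siblings_and_position(node, parents):
    parent = parents.get(node)
    if parent is None:  # the root element has no siblings
        return [], 0
    siblings = list(parent)
    position = next(i for i, s in enumerate(siblings) if s is node)
    return siblings, position

def following_sibling_axis(node, parents):
    siblings, position = _siblings_and_position(node, parents)
    return siblings[position + 1:]

def preceding_sibling_axis(node, parents):
    siblings, position = _siblings_and_position(node, parents)
    return siblings[:position]

# Hypothetical document; CCC plays the role of the context node.
DOC = "<AAA><BBB/><CCC><DDD><EEE/></DDD></CCC><FFF/></AAA>"
root = ET.fromstring(DOC)
parents = build_parent_map(root)
context = root.find("CCC")

for name, nodes in [
    ("child", child_axis(context)),
    ("descendant", descendant_axis(context)),
    ("parent", parent_axis(context, parents)),
    ("following-sibling", following_sibling_axis(context, parents)),
    ("preceding-sibling", preceding_sibling_axis(context, parents)),
]:
    print(name, [n.tag for n in nodes])
```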
The child axis contains the children of the context node. The child axis is the default axis and it can be omitted. The descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes.
//CCC/descendant::DDD
Select elements DDD which have CCC among their ancestors
<CCC> <DDD> <EEE> <DDD/> </EEE> </DDD> </CCC>
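ElementTree does not accept the descendant:: axis syntax, so the following sketch emulates //CCC/descendant::DDD with nested iter() calls. The sample document and the function name are assumptions; the extra DDD outside of CCC is there to show that it is not selected.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one DDD outside CCC, two DDD below CCC.
DOC = "<AAA><DDD/><CCC><DDD><EEE><DDD/></EEE></DDD></CCC></AAA>"

def ddd_below_ccc(xml_text):
    """Emulate //CCC/descendant::DDD on element nodes."""
    root = ET.fromstring(xml_text)
    selected = []
    for ccc in root.iter("CCC"):       # every CCC in the document
        for ddd in ccc.iter("DDD"):    # every DDD below that CCC
            # Skip duplicates that would arise from nested CCC elements.
            if all(ddd is not seen for seen in selected):
                selected.append(ddd)
    return selected

print(len(ddd_below_ccc(DOC)))
```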
/AAA
Equivalent of /child::AAA
<AAA> <BBB/> <CCC/> </AAA>
XPointer is intended to be the basis of fragment identifiers only for the text/xml and application/xml media types (they can point only to documents of these types).
Pointing to fragments of remote documents is analogous to the use of anchors in HTML. Roughly: document#xpointer(…)
If there are forbidden characters in your expression, you must deal with them somehow. When XPointer appears in an XML document, special characters must be escaped according to directions in XML.
• The characters < or & must be escaped using &lt; and &amp;.
• Any unbalanced parenthesis must be escaped using circumflex (^)
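The XML-level escaping of < and & can be sketched with Python's xml.sax.saxutils.escape. The XPointer expression below is invented for illustration, and the sketch covers only the XML step: circumflex escaping of unbalanced parentheses and any URI-level escaping are not handled.

```python
from xml.sax.saxutils import escape

# Hypothetical expression containing both < and &.
raw_xpointer = "xpointer(//BBB[@bbb<200 and .='R&D'])"

# escape() replaces & with &amp; and < with &lt; (and > with &gt;),
# which makes the expression safe to embed in XML content.
escaped_xpointer = escape(raw_xpointer)

print(escaped_xpointer)
```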
If your elements have an ID-type attribute, you can address them directly using the value of that attribute. (Don't forget: you must have an attribute declared as an ID type in your DTD!) Using ID-type attributes, you can easily include or jump to parts of documents. The example below selects the node with id("b1").
xpointer(id("b1"))
<book> <book id="b1" name="XML">Bad book.</book> <book id="b2" name="JAVA"> Good book. <additional>Makes me sleep like a baby.</additional> </book> <book id="123" name="42">All answers on only one page.</book></book>
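A minimal emulation of id("b1") in Python, using the document from the slide. ElementTree does not read DTD attribute declarations, so this sketch simply assumes that the attribute named id is the ID-type attribute; the helper select_by_id is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<book> <book id="b1" name="XML">Bad book.</book> '
    '<book id="b2" name="JAVA"> Good book. '
    "<additional>Makes me sleep like a baby.</additional> </book> "
    '<book id="123" name="42">All answers on only one page.</book></book>'
)

def select_by_id(root, id_value, id_attr="id"):
    """Return the first element whose assumed ID attribute equals id_value."""
    for elem in root.iter():
        if elem.get(id_attr) == id_value:
            return elem
    return None

root = ET.fromstring(DOC)
target = select_by_id(root, "b1")
print(target.get("name"), target.text)
```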
The specification defines one full form and one shorthand form (which is an abbreviation of the full one).
<AAA> <BBB myid="b1" bbb="111">Text in the first element BBB.</BBB> <BBB myid="b2" bbb="222"> Text in another element BBB. <DDD ddd="999">Text in more nested element.</DDD> <DDD ddd="888">Text in more nested element.</DDD> <DDD ddd="777">Text in more nested element.</DDD> </BBB> <CCC ccc="123" xxx="321">Again some text in some element.</CCC> </AAA>
• Short Form: /1/2/3
• Full Form: xpointer(/*[1]/*[2]/*[3])
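How the short form walks the tree can be sketched in a few lines of Python. The function resolve_child_sequence is a hypothetical helper, not part of any XPointer library; it treats the first step /1 as the document element and each further number as the n-th element child.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<AAA> <BBB myid="b1" bbb="111">Text in the first element BBB.</BBB> '
    '<BBB myid="b2" bbb="222"> Text in another element BBB. '
    '<DDD ddd="999">Text in more nested element.</DDD> '
    '<DDD ddd="888">Text in more nested element.</DDD> '
    '<DDD ddd="777">Text in more nested element.</DDD> </BBB> '
    '<CCC ccc="123" xxx="321">Again some text in some element.</CCC> </AAA>'
)

def resolve_child_sequence(root, sequence):
    """Resolve a child sequence such as /1/2/3; return None if no match."""
    steps = [int(part) for part in sequence.split("/") if part]
    if not steps or steps[0] != 1:
        # There is exactly one document element, so the first step must be 1.
        return None
    node = root
    for step in steps[1:]:
        children = list(node)  # element children only
        if not 1 <= step <= len(children):
            return None
        node = children[step - 1]  # steps are 1-based
    return node

root = ET.fromstring(DOC)
selected = resolve_child_sequence(root, "/1/2/3")
print(selected.tag, selected.attrib)
```

For the document above, /1/2/3 resolves to the third DDD child of the second BBB, the same element the full form xpointer(/*[1]/*[2]/*[3]) addresses.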
A location of type point is defined by a node, called the container node (the node that contains the point), and a non-negative integer, called the index. (//AAA and //AAA/BBB are the container nodes; [1], [2] are used if more than one container node of the same name exists.)
When the container node of a point is of a node type that cannot have child nodes (such as text nodes, comments, and processing instructions), then the index is an index into the characters of the string-value of the node; such a point is called a character-point. You can use this to write a link that behaves like a search function. It always jumps to the first appearance of a string, e.g. the word "another".
<AAA> <BBB bbb="111">Text in the first element BBB.</BBB> <BBB bbb="222"> Text in a▼nother element BBB. <DDD ddd="999">Text in more nested element.</DDD> </BBB> <CCC ccc="123" xxx="321">Again some text in some element.</CCC></AAA>
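The idea of a character-point (a container plus an index into its characters) can be modeled with a small data structure. The class and function names below are hypothetical, and the text is a simplified string-value without surrounding whitespace; the sketch only shows how a point in front of the first occurrence of a word could be computed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class CharacterPoint:
    container: str  # string-value of the container (text) node
    index: int      # number of characters that precede the point

    def split(self) -> Tuple[str, str]:
        """Return the text before and after the point."""
        return self.container[:self.index], self.container[self.index:]

def point_before_first(text: str, needle: str) -> Optional[CharacterPoint]:
    """Point directly in front of the first occurrence of needle, if any."""
    position = text.find(needle)
    return None if position < 0 else CharacterPoint(text, position)

point = point_before_first("Text in another element BBB.", "another")
print(point.index, point.split())
```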
The range function returns ranges covering the locations in the argument location-set. For each location x in the argument location-set, a range location representing the covering range of x is added to the result location set.
xpointer(range(//AAA/BBB[2]))
<AAA> <BBB bbb="111"/> <BBB bbb="222"> Text in another element BBB. </BBB> <CCC ccc="123" xxx="321"/></AAA>
The range-inside function returns ranges covering the contents of the locations in the argument location-set.
xpointer(range-inside(//AAA/BBB[2]))
<AAA> <BBB bbb="111"/> <BBB bbb="222"> Text in another element BBB. </BBB> <CCC ccc="123" xxx="321"/></AAA>
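The difference between range and range-inside can be approximated in Python by comparing an element's full serialization with its contents. This is only a rough stand-in, under the assumption that serialized text is an acceptable substitute for a range: real XPointer ranges are pairs of points, not strings. Both helper names are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

DOC = (
    '<AAA> <BBB bbb="111"/> <BBB bbb="222"> Text in another element BBB. </BBB> '
    '<CCC ccc="123" xxx="321"/></AAA>'
)

def covering_range_text(elem):
    """Approximate range(): the element including its start and end tags."""
    clone = copy.deepcopy(elem)
    clone.tail = None  # text after the end tag is outside the range
    return ET.tostring(clone, encoding="unicode")

def range_inside_text(elem):
    """Approximate range-inside(): only what lies between the tags."""
    parts = [elem.text or ""]
    # Each child is serialized together with the text that follows it.
    parts.extend(ET.tostring(child, encoding="unicode") for child in elem)
    return "".join(parts)

root = ET.fromstring(DOC)
second_bbb = root.findall("BBB")[1]  # corresponds to //AAA/BBB[2]

print(covering_range_text(second_bbb))
print(repr(range_inside_text(second_bbb)))
```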
For each location x in the argument location-set, end-point adds a location of type point to the result location-set. That point represents the end point of location x.
<AAA> <BBB bbb="111">Text in the first element BBB.</BBB> <BBB bbb="222"> Text in another▼ element BBB. <DDD ddd="999">Text in more nested element.</DDD> </BBB> <CCC ccc="123" xxx="321">Again some text in some element.</CCC> </AAA>