Chapter 11. XPointers
XPointers are a non-XML syntax for identifying locations inside XML documents. An XPointer is attached to the end of the URI as its fragment identifier to indicate a particular part of an XML document rather than the entire document. XPointer syntax builds on the XPath syntax used by XSLT and covered in Chapter 9. To the four fundamental XPath data types--Boolean, node-set, number, and string--XPointer adds points and ranges, as well as the functions needed to work with these types. It also adds some shorthand syntax for particularly useful and common forms of XPath expressions.
WARNING: This chapter is based on the September 11, 2001 W3C Candidate Recommendation of XPointer. However, there are known issues with this draft, and some of the details described here are likely to change. The most current version of the XPointer recommendation can be found at http://www.w3.org/TR/xptr.
11.1. XPointers on URLs
A URL that identifies a document typically looks something like http://java.sun.com:80/products/jndi/index.html. The scheme, http in this example, tells you what protocol the application should use to retrieve the document. The authority, java.sun.com:80 in this example, tells you from which host the application should retrieve the document. The authority may also contain the port to connect to that host and the username and password to use. The path, /products/jndi/index.html in this example, tells you which file in which directory to ask the server for. This may not always be a real file in a real filesystem, but it should be a complete document the server knows how to generate and return. All of this you're already familiar with, and XPointer doesn't change any of it.
You probably also know that some URLs contain fragment identifiers that point to a particular named anchor inside the document the URL locates. This is separated from the path by the sharp sign #. For example, if we were to add the fragment download to the previous URL, then it would become http://java.sun.com:80/products/jndi/index.html#download. When a web browser follows a link to this URL, it looks for a named anchor in the document at http://java.sun.com:80/products/jndi/index.html with the name download such as this one:
It would then scroll the browser window to the position in the document where the anchor with that name is found. This is a simple and straightforward system, and it works well for HTML's simple needs. However, it has one major drawback: to link to a particular point of a particular document, you must be able to modify the document to which you're linking in order to insert a named anchor at the point to which you want to link. XPointer endeavors to eliminate this restriction by allowing you to specify where you want to link to using full XPath expressions as fragment identifiers. Furthermore, XPointer expands on XPath by providing operations to select particular points in or ranges of an XML document that do not necessarily coincide with any one node or set of nodes. For instance, an XPointer can describe the range of text currently selected by the mouse.
The most basic form of XPointer is simply an XPath expression -- often, though not necessarily, a location path--enclosed in the parentheses of xpointer( ). For example, these are all acceptable XPointers:
xpointer(/) xpointer(//first_name) xpointer(id('sec-intro')) xpointer(/people/person/name/first_name/text( )) xpointer(//middle_initial[position( )=1]/../first_name) xpointer(//profession[.="physicist"]) xpointer(/child::people/child::person[@index<4000]) xpointer(/child::people/child::person/attribute::id)
Not all of these XPointers necessarily refer to a single element. Depending on which document the XPointer is evaluated relative to, an XPointer may identify zero, one, or more than one node. Most commonly the nodes identified are elements, but they can also be attribute nodes or text nodes, as well as points or ranges.
If you're uncertain whether a given XPointer will locate something, you can back it up with an alternative XPointer. For example, this XPointer looks first for first_name elements. However, if it doesn't find any, it looks for last_name elements instead:
The last_name elements will be found only if there are no first_name elements. You can string as many of these XPointer parts together as you like. For example, this XPointer looks first for first_name elements. If it doesn't find any, it then seeks out last_name elements. If it doesn't find any of those, it looks for middle_initial elements. If it doesn't find any of those, it returns an empty node set:
No special separator character or whitespace is required between the individual xpointer( ) parts, though whitespace is allowed. This XPointer means the same thing:
xpointer(//first_name) xpointer(//last_name) xpointer(//middle_initial)
Copyright © 2002 O'Reilly & Associates. All rights reserved.