XPointer and XLink (Webmaster in a Nutshell, 3rd Edition)

The final pieces of XML we cover are XPointer and XLink. These are separate standards in the XML family dedicated to working with XML links. Before we delve into them, however, we should warn you that the standards described here are not final as of publication time.

It's important to remember that an XML link is only an assertion of a relationship between pieces of documents; how the link is actually presented to a user depends on a number of factors, including the application processing the XML document.

10.10.3. XPointer

XPointer is designed to resolve the problem of locating an element or range of elements in an XML document. It is possible to do this in HTML if the element is referenced by an <a name="name"> tag. Here, a link is made for the section of the document using the <a href="url#name"> tag.

10.10.3.1. Fragment-identifier syntax

As we saw earlier, XML has this type of functionality through its unique identifiers. It is possible to locate an element with an identifier using a link such as the following:

document.xml#identifier

where identifier is a valid XPointer fragment identifier. However, this form is a simplification that is tolerated for compatibility with previous versions. The most common syntax for an XPointer fragment identifier is:

document.xml#xpointer(xpath)

Here xpath is an expression consistent with the XPath specification. XPath is a natural fit here because an XPath expression locates a node-set within a document. The link document.xml#identifier can be rewritten as:

document.xml#xpointer(id("identifier"))

There is a third possible form, made up of whole numbers separated by slashes. Each whole number selects the nth child of the element selected by the number before it.
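To make the numeric form concrete, here is a minimal Python sketch that resolves such a sequence against a parsed document. It is an illustration only: the function name resolve_child_sequence is hypothetical, and the sketch assumes that a leading /1 stands for the document element and that numbering is 1-based and counts element children only.

```python
import xml.etree.ElementTree as ET

def resolve_child_sequence(root, sequence):
    """Resolve a child sequence such as "/1/1/2" against a parsed document.

    Assumption: the first number must be 1 (the document element); every
    later number selects the nth element child (1-based) of its predecessor.
    Returns None when the sequence does not identify an element.
    """
    parts = sequence.strip("/").split("/")
    if not all(p.isdigit() for p in parts):
        return None
    steps = [int(p) for p in parts]
    if steps[0] != 1:
        return None  # a document has exactly one root element
    node = root
    for n in steps[1:]:
        children = list(node)  # element children, in document order
        if n < 1 or n > len(children):
            return None
        node = children[n - 1]
    return node

doc = ET.fromstring(
    "<book><chapter><para>one</para><para>two</para></chapter></book>"
)
target = resolve_child_sequence(doc, "/1/1/2")
print(target.text)  # two
```

Returning None for a sequence that runs past the available children leaves the decision about error reporting to the calling application.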

Several fragment identifiers can be combined by placing them one after the other. For example:

document.xml#xpointer(...)xpointer(...)...

The application must evaluate the fragments, from left to right, and use the first valid fragment. This functionality is useful for two reasons:

It lets you offer several alternatives when the first one relies on assumptions that may prove false (and produce an error). For example, we can try to locate a fragment in a document using an identifier, then (if no ID was declared) using the value of an attribute named id. We would write the fragment:
```
xpointer(id("conclusion"))xpointer(//*[@id="conclusion"])
```
It also allows for future specifications. If an XPointer application encounters an expression that does not begin with xpointer, it will simply ignore it and move on to the next expression.

As we mentioned earlier, the XPointer application is responsible for link rendering, but it is also responsible for error handling. If the link's URL is wrong or if the fragment identifier is not valid, it is up to the application to manage the situation (by displaying an error message, for example).
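The left-to-right evaluation described above can be sketched in Python as follows. This is a simplified illustration rather than a conforming XPointer processor: the function names are hypothetical, id() is emulated with a caller-supplied list of attribute names standing in for DTD ID declarations, and only the small XPath subset understood by xml.etree.ElementTree is evaluated.

```python
import re
import xml.etree.ElementTree as ET

def split_parts(fragment):
    """Split 'scheme(data)scheme(data)...' into (scheme, data) pairs.

    Parentheses inside the data are assumed to be balanced.
    """
    fragment = fragment.strip()
    parts, i = [], 0
    while i < len(fragment):
        open_at = fragment.index("(", i)
        scheme = fragment[i:open_at].strip()
        depth, j = 1, open_at + 1
        while j < len(fragment) and depth:
            if fragment[j] == "(":
                depth += 1
            elif fragment[j] == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses in fragment")
        parts.append((scheme, fragment[open_at + 1 : j - 1]))
        i = j
    return parts

def evaluate_part(root, scheme, data, id_attributes=()):
    """Return matching elements, or None when the part must be ignored."""
    if scheme != "xpointer":
        return None  # unknown scheme: skip it and try the next part
    match = re.fullmatch(r'id\("([^"]*)"\)', data)
    if match:
        wanted = match.group(1)
        return [e for e in root.iter()
                if any(e.get(a) == wanted for a in id_attributes)]
    # ElementTree needs a relative path, so '//x' becomes './/x'.
    return root.findall("." + data) if data.startswith("//") else []

def resolve(root, fragment, id_attributes=()):
    """Use the first part, from left to right, that identifies something."""
    for scheme, data in split_parts(fragment):
        found = evaluate_part(root, scheme, data, id_attributes)
        if found:
            return found
    raise LookupError("no fragment part identified a resource")

doc = ET.fromstring('<doc><section id="intro"/><section id="conclusion"/></doc>')
fragment = 'xpointer(id("conclusion"))xpointer(//*[@id="conclusion"])'
# No ID attributes are declared, so the first part finds nothing
# and the second part is used instead.
hits = resolve(doc, fragment)
print(hits[0].tag, hits[0].get("id"))  # section conclusion
```

Raising LookupError when every part fails is one way to hand the error back to the application, which then decides how to report it.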

10.10.3.2. XPointer datatypes

Earlier we showed you how to locate an XML node within a document. XPointer goes even further by defining the point, range, and position (location) types:

Point: Can precede or follow a node (point of type node) or a character (thus, a point of type character).
Range: Is defined as the content of a document between two points (where the starting point cannot be located after the ending point within a document). A range cannot be reduced to a set of nodes and characters because it can include fragments of the former.
Position: Is a generalized concept of the node. It can be a node, a point, or a range.

Equipped with these new datatypes, XPointer can set out to locate a resource in an XML document.

10.10.3.3. Manipulation of points, ranges, and positions

A range is defined using the to operator. The operator takes a starting point on its left and an ending point on its right. The second point is calculated using the first point as a reference. For example, to make a range from the beginning of the first paragraph to the end of the last paragraph in a section where the ID is XPointer, you would write:

xpointer(id("XPointer")/para[1] to
id("XPointer")/para[last( )])

or:

xpointer(id("XPointer")/para[1] to
following-sibling::para[last( )])

A range defined this way may be compared with the selection a user can make in a document with a mouse.

Naturally, XPointer also has functions to manipulate points and ranges. The available functions are:

string-range(positions, string[, offset][, length])

This function can be used to search for strings in a document and return a set of positions where they appear. The first argument is an XPath expression—a set of positions where the search must take place. The second is the string being searched. To search for the string XML in <chapter> elements where the title attribute is XPointer, we would write the expression:

string-range(//chapter[@title="XPointer"], "XML")

To index the word XML by pointing to the first occurrence of the word in an element such as <para>, use the following expression:

string-range(//para, "XML")[1]

This function takes two other optional arguments. The third argument, offset, is the position of the first character to include in the result range, counted from the beginning of the matched string. The fourth argument, length, gives the length of the result range. By default, offset has a value of 1, so the result range begins before the first character in the string. length has a default value such that the result range covers the entire string searched.
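The effect of offset and length is easier to see on plain text. The following Python sketch mimics the arithmetic on an ordinary string rather than on XML positions; the function name string_range and the 0-based (start, end) tuples are choices made for this illustration, and the sketch assumes that an omitted length extends the range to the end of the match.

```python
def string_range(text, needle, offset=1, length=None):
    """Return (start, end) spans for each occurrence of needle in text.

    Spans are 0-based and end-exclusive. offset is 1-based and counted
    from the start of each match. Assumption: when length is omitted,
    the range runs to the end of the matched string.
    """
    if not needle:
        return []
    if length is None:
        length = max(len(needle) - (offset - 1), 0)
    spans = []
    pos = text.find(needle)
    while pos != -1:
        start = pos + offset - 1
        spans.append((start, start + length))
        pos = text.find(needle, pos + 1)
    return spans

text = "XML and more XML"
print(string_range(text, "XML"))        # [(0, 3), (13, 16)]
print(string_range(text, "XML", 2, 1))  # [(1, 2), (14, 15)]
print(string_range(text, "XML")[0])     # (0, 3), like the [1] predicate
```

With the defaults, each span covers the whole match; with offset 2 and length 1, each span covers only the letter M.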

range(positions)

This function takes an XPath expression and returns a set of ranges (a location set), one covering each of the positions passed as arguments. It can be used to convert a set of positions (which may be nodes) to a set comprising ranges only.

range-inside(positions)

This function takes an XPath expression and returns a set of ranges (a location set), each covering the content of one of the positions passed as arguments.

start-point(positions)

This function takes an XPath expression and returns the starting point of the range for each of the positions passed as arguments. The result is a set of points.

end-point(positions)

This function takes an XPath expression and returns the end point of the range for each of the positions passed as arguments. The result is a set of points.

here( )

This function is defined only within an XML document. It returns a unique position comprising the element containing the XPointer expression or the attribute that contains it.

origin( )

This function can be used only for links triggered by the user. It returns the position of the element from which the link was followed.

10.10.4. XLink

Now that we know about XPointer, let's take a look at some inline links:

<?xml version="1.0"?>
<simpledoc xmlns:xlink="http://www.w3.org/1999/xlink">
<title>An XLink Demonstration</title>
<section id="target-section">
   <para>This is a paragraph in the first section.</para>
   <para>More information about XLink can be found at
      <reference xlink:type="simple" 
      xlink:href="http://www.w3.org"> 
      the W3C
      </reference>.
   </para>
</section>
<section id="origin-section">
   <para>
   This is a paragraph in the second section.
   </para>
   <para>
   You should go read
      <reference xlink:type="simple" 
      xlink:href="#target-section"> 
      the first section
      </reference>
   first.
   </para>
</section>
</simpledoc>

The first link states that the text the W3C is linked to the URL http://www.w3.org. How does the browser know? Simple. An HTML browser knows that every <a> element is a link because the browser has to handle only one document type. In XML, you can make up your own element type names, so the browser needs some way of identifying links.

XLink provides the xlink:type attribute for link identification. A browser knows it has found a simple link when any element sets the xlink:type attribute to a value of simple. A simple link is like a link in HTML: one-way, and beginning at the point in the document where it occurs. (In fact, HTML links can be recast as XLinks with minimal effort.) In other words, the content of the link element can be selected in order to traverse to the other end. Returning to the source document is left to the browser.
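As an illustration of this identification step, the following Python sketch walks a document and reports every element marked as a simple link, whatever its element name. The function name simple_links is hypothetical, and the sample document is a shortened variant of the one above.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

def simple_links(root):
    """Yield (element, href) for every element with xlink:type="simple"."""
    for elem in root.iter():
        # ElementTree spells namespaced attributes as {namespace-uri}name.
        if elem.get(f"{{{XLINK_NS}}}type") == "simple":
            yield elem, elem.get(f"{{{XLINK_NS}}}href")

doc = ET.fromstring("""<simpledoc xmlns:xlink="http://www.w3.org/1999/xlink">
  <para>See <reference xlink:type="simple"
        xlink:href="http://www.w3.org">the W3C</reference>.</para>
  <para>Read <reference xlink:type="simple"
        xlink:href="#target-section">the first section</reference>.</para>
  <para>Not a link: <reference>plain text</reference></para>
</simpledoc>""")

for elem, href in simple_links(doc):
    print(elem.tag, href)
```

The check is made on the namespace URI rather than on the xlink prefix, since a document is free to bind a different prefix to the same namespace.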

Once an XLink processor has found a simple link, it looks for other attributes that it knows:

xlink:href

This attribute is deliberately named to be familiar to anyone who has used the Web before. Its value is the URI of the other end of the link; it can refer to an entire document or to a point or element within that document. If the target is in an XML document, the fragment part of the URI is an XPointer.

This attribute must be specified, since without it, the link is meaningless. It is an error not to include it.

xlink:role

This describes the nature of the object at the other end of the link. XLink doesn't predefine any roles; you might use a small set to distinguish different types of links in your documents, such as cross-references, additional reading, and contact information. A style sheet might take a different action (such as presenting the link in a different color) based on the role, but the application won't do anything automatically.

xlink:title

A title for the resource at the other end of the link can be provided, identical to HTML's title attribute for the <a> element. A browser might display the title as a tool tip; an aural browser might read the title when the user pauses at the link before selecting it. A style sheet might also make use of the information, perhaps to build a list of references for a document.

xlink:show

This attribute suggests what to do when the link is traversed. It can take the following values:

embed: The content at the other end of the link should be retrieved and displayed where the link is. An example of this behavior in HTML is the <img> element, whose target is usually displayed within the document.
replace: When the link is activated, the browser should replace the current view with a view of the resource targeted by the link. This is what happens with the <a> element in HTML: the new page replaces the current one.
new: The browser should somehow create a new context, if possible, such as opening a new window.
other: This value specifies behavior that isn't described by the other values. It is up to the application to determine how to display the link.
none: This specifies no behavior.

You do not need to give a value for this attribute. Remember that a link primarily asserts a relationship between data; behavior is best left to a style sheet. So unless the behavior is paramount (as it might be in some cases of embed), it is best not to use this attribute.

xlink:actuate

The second of the behavioral attributes specifies when the link should be activated. It can take the following values:

onRequest: The application waits until the user requests that the link be followed, as the <a> element in HTML does.
onLoad: The link should be followed immediately when it is encountered by the application; this is what most HTML browsers do with <img> elements, unless the user has turned off image loading.
other: The link is activated by other means, not specified by XLink. This is usually defined by other markup in the document.
none: This indicates no information about the activation of the link and may be used when the link has no current meaningful target or action.
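The two behavioral attributes can be read and checked together. The Python sketch below is one hedged way an application might do so: the function names, the image element, and the logo.png target are hypothetical, and treating an unlisted value as an error is a choice made for this example.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
SHOW_VALUES = {"embed", "replace", "new", "other", "none"}
ACTUATE_VALUES = {"onRequest", "onLoad", "other", "none"}

def link_behavior(elem):
    """Return (show, actuate) for a link element; None means not specified."""
    show = elem.get(f"{{{XLINK_NS}}}show")
    actuate = elem.get(f"{{{XLINK_NS}}}actuate")
    if show is not None and show not in SHOW_VALUES:
        raise ValueError(f"unknown xlink:show value: {show!r}")
    if actuate is not None and actuate not in ACTUATE_VALUES:
        raise ValueError(f"unknown xlink:actuate value: {actuate!r}")
    return show, actuate

def follow_immediately(elem):
    """True only when the link asks to be traversed as soon as it is loaded."""
    return link_behavior(elem)[1] == "onLoad"

# An image-like link: embedded in place and fetched on load.
logo = ET.fromstring(
    '<image xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" '
    'xlink:href="logo.png" xlink:show="embed" xlink:actuate="onLoad"/>'
)
# An anchor-like link with no behavioral attributes at all.
ref = ET.fromstring(
    '<reference xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" '
    'xlink:href="#target-section">the first section</reference>'
)
print(link_behavior(logo), follow_immediately(logo))
print(link_behavior(ref), follow_immediately(ref))
```

Because both attributes are optional, the sketch returns None for a missing value instead of inventing a default, which leaves the behavior to a style sheet or to the application.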
