10.10. XPointer and XLink
The final pieces of XML we cover are XPointer and XLink. These are
separate standards in the XML family dedicated to working with XML
links. Before we delve into them, however, we should warn you that
the standards described here are not final as of publication time.
It's important to remember that an XML link is only
an assertion of a relationship between pieces of
documents; how the link is actually presented to a user depends on a
number of factors, including the application processing the XML
document.
10.10.1. Unique Identifiers
To create a link, we must first have a labeling scheme for XML
elements. One way to do this is to assign an identifier to specific
elements we want to reference using an ID attribute:
<paragraph id="attack">
Suddenly the skies were filled with aircraft.
</paragraph>
You can think of IDs in XML documents as street addresses: they
provide a unique identifier for an element within a document.
However, just as there might be an identical address in a different
city, an element in a different document might have the same ID.
Consequently, you can tie together an ID with the
document's URI, as shown here:
http://www.oreilly.com/documents/story.xml#attack
The combination of a document's URI and an
element's ID should uniquely identify that element
throughout the universe. Remember that an ID attribute does not need
to be named id, as shown in the first example. You
can name it anything you want, as long as you define it as an XML ID
in the document's DTD. (However, using
id is preferred in the event that the XML
processor does not read the DTD.)
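As a rough illustration of how such an address separates into its two parts, the following Python sketch splits the URI shown above into the document's URI and the element's ID. It relies only on the standard library's urllib.parse.urldefrag; the variable names are ours.

```python
from urllib.parse import urldefrag

# The address from the text: document URI, then "#", then the element ID.
address = "http://www.oreilly.com/documents/story.xml#attack"

# urldefrag returns the URI without its fragment, plus the fragment itself.
document_uri, element_id = urldefrag(address)

print(document_uri)
print(element_id)
```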
Should you give an ID to every element in your documents? No. Odds
are that most elements will never be referenced.
It's best to place IDs on items that a reader would
want to refer to later, such as chapter and section divisions, as
well as important items, such as term definitions.
10.10.2. ID References
The easiest way to refer to an ID attribute is with an ID reference,
or IDREF. Consider this example:
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee*)>
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee emplabel ID #REQUIRED>
<!ATTLIST employee boss IDREF #IMPLIED>
]>
<document>
<employee emplabel="emp123">Jay</employee>
<employee emplabel="emp124">Kay</employee>
<employee emplabel="emp125" boss="emp123">Frank</employee>
<employee emplabel="emp126" boss="emp124">Hank</employee>
</document>
As with ID attributes, an IDREF is typically declared in the DTD.
However, if you're in an environment where the
processor might not read the DTD, you should call your ID references
IDREF.
The chief benefit of using an IDREF is that a validating parser can
ensure that every one points to an actual element; unlike other forms
of linking, an IDREF is guaranteed to refer to something within the
current document.
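The check a validating parser performs on IDREFs can be approximated by hand. The sketch below is only an illustration, not a validator: Python's xml.etree.ElementTree does not validate against a DTD, so the names of the ID and IDREF attributes (emplabel and boss, taken from the example) are passed in explicitly, and the function name dangling_idrefs and the deliberately broken emp999 reference are our own.

```python
import xml.etree.ElementTree as ET

# Employee records like those in the example; the last one points
# at an ID that does not exist, to show what the check reports.
DOC = """<document>
  <employee emplabel="emp123">Jay</employee>
  <employee emplabel="emp124">Kay</employee>
  <employee emplabel="emp125" boss="emp123">Frank</employee>
  <employee emplabel="emp126" boss="emp999">Hank</employee>
</document>"""

def dangling_idrefs(xml_text, id_attr="emplabel", ref_attr="boss"):
    """Return every reference value that matches no ID in the document."""
    root = ET.fromstring(xml_text)
    ids = {el.get(id_attr) for el in root.iter() if el.get(id_attr) is not None}
    return [el.get(ref_attr) for el in root.iter()
            if el.get(ref_attr) is not None and el.get(ref_attr) not in ids]

print(dangling_idrefs(DOC))
```

A validating parser would reject such a document outright; this sketch merely lists the offending references.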
As we mentioned earlier, the IDREF only asserts a relationship of
some sort; the style sheet and the browser will determine what is to
be done with it. If the referring element has some content, it might
become a link to the target. But if the referring element is empty,
the style sheet might instruct the browser to perform some other
action.
As for the linking behavior, remember that in HTML a link can point
to an entire document (which the browser will download and display,
positioned at the top) or to a specific location in a document (which
the browser will display, usually positioned with that point at the
top of the screen). However, linking changes drastically in XML. What
does it mean to have a link to an entire element, which might be a
paragraph (or smaller) or an entire group of chapters? The XML
application attempts some kind of guess, but the display is best
controlled by the style sheet. For now, it's best to
simply make a link as meaningful as you can.
10.10.3. XPointer
XPointer is designed to resolve the problem of
locating an element or range of elements in an XML document. It is
possible to do this in HTML if the element is referenced by an
<a name="name"> tag. Here, a link is made for the section of the
document using the <a href="url#name"> tag.
10.10.3.1. Fragment-identifier syntax
As we saw earlier, XML has this type of functionality through its
unique identifiers. It is possible to locate an element with an
identifier using a link such as the following:
document.xml#identifier
where identifier is a valid XPointer
fragment identifier. However, this form is a simplification that is
tolerated for compatibility with previous versions. The most common
syntax for an XPointer fragment identifier is:
document.xml#xpointer(xpath)
Here xpath is an expression consistent
with the XPath specification. XPath is the right tool in this
case because it can be used to locate a node-set within a document.
The link document.xml#identifier can be
rewritten as:
document.xml#xpointer(id("identifier"))
There is a third possible form made up of whole numbers separated by
slashes. Each whole number selects the nth child
of the element selected by its predecessor in the expression.
Several fragment identifiers can be combined by placing them one
after the other. For example:
document.xml#xpointer(...)xpointer(...)...
The application must evaluate the fragments, from left to right, and
use the first valid fragment. This functionality is useful for two
reasons:
-
It offers several solutions, the first of which is based on
suppositions that may prove to be false (and produce an error). For
example, we can try to locate a fragment in a document using an
identifier, then (if no ID was defined) using the attribute value
with the name id. We would write the fragment:
xpointer(id("conclusion"))xpointer(//*[@id="conclusion"])
-
It also allows for future specifications. If an XPointer application
encounters an expression that does not begin with
xpointer, it will simply ignore it and move on to
the next expression.
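The left-to-right evaluation described above can be sketched in Python. This is a toy resolver under stated assumptions, not an XPointer implementation: it understands only the two expression shapes used in the example, id("...") and //*[@name="..."], it assumes a well-formed fragment, and because ElementTree does not read DTDs, the attributes declared as IDs must be supplied through the hypothetical declared_id_attrs parameter. All function names are ours.

```python
import re
import xml.etree.ElementTree as ET

def split_parts(fragment):
    """Split 'scheme(data)scheme(data)...' into (scheme, data) pairs,
    honoring nested parentheses. Assumes a well-formed fragment."""
    parts, i = [], 0
    while i < len(fragment):
        open_at = fragment.index("(", i)
        scheme = fragment[i:open_at]
        depth, j = 1, open_at + 1
        while depth:
            if fragment[j] == "(":
                depth += 1
            elif fragment[j] == ")":
                depth -= 1
            j += 1
        parts.append((scheme, fragment[open_at + 1:j - 1]))
        i = j
    return parts

def first_match(root, attr, value):
    """First element, in document order, whose attribute has the value."""
    return next((el for el in root.iter() if el.get(attr) == value), None)

def resolve(root, fragment, declared_id_attrs=()):
    """Try each part from left to right; return the first element found."""
    for scheme, data in split_parts(fragment):
        if scheme != "xpointer":
            continue  # unknown scheme: ignore it and move on
        by_id = re.fullmatch(r'id\("([^"]*)"\)', data)
        by_attr = re.fullmatch(r'//\*\[@(\w+)="([^"]*)"\]', data)
        found = None
        if by_id:
            for attr in declared_id_attrs:
                found = first_match(root, attr, by_id.group(1))
                if found is not None:
                    break
        elif by_attr:
            found = first_match(root, by_attr.group(1), by_attr.group(2))
        if found is not None:
            return found
    return None

doc = ET.fromstring('<doc><section id="conclusion">The end</section></doc>')
fragment = 'xpointer(id("conclusion"))xpointer(//*[@id="conclusion"])'
# No ID attributes are declared, so the first part fails and the second is used.
print(resolve(doc, fragment).text)
```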
As we mentioned earlier, the XPointer application is responsible for
link rendering, but it is also responsible for error handling. If the
link's URL is wrong or if the fragment identifier is
not valid, it is up to the application to manage the situation (by
displaying an error message, for example).
10.10.3.2. XPointer datatypes
Earlier we showed you how to locate an XML node within a document.
XPointer goes even further by defining the
point, range, and
position (location) types:
- Point
-
Can precede or follow a node (point of type node)
or a character (thus, a point of type character).
- Range
-
Is defined as the content of a document between two points (where the
starting point cannot be located after the ending point within a
document). A range cannot be reduced to a set of nodes and characters
because it can include fragments of the former.
- Position
-
Is a generalized concept of the node. It can be a node, a point, or a
range.
Equipped with these new datatypes, XPointer can set out to locate a
resource in an XML document.
10.10.3.3. Manipulation of points, ranges, and positions
A range is defined using the to operator. This
operator takes a starting point on its left and an ending
point on its right. The second point is calculated using the first
point as a reference. For example, to make a range from the beginning
of the first paragraph to the end of the last paragraph in a section
where the ID is XPointer, you would write:
xpointer(id("XPointer")/para[1] to
id("XPointer")/para[last( )])
or:
xpointer(id("XPointer")/para[1] to
following-sibling::para[last( )])
A range defined this way may be compared with the selection a user
can make in a document with a mouse.
Naturally, XPointer also has functions to manipulate points and
ranges. The available functions are:
- string-range(positions, string[, offset][, length])
-
This function can be used to search for strings in a document and
return a set of positions where they appear. The first argument is an
XPath expression—a set of positions where the search must take
place. The second is the string being searched. To search for the
string XML in <chapter>
elements where the title attribute is
XPointer, we would write the expression:
string-range(//chapter[@title="XPointer"], "XML")
To index the word XML by pointing to the first
occurrence of the word in an element such as
<para>, use the following expression:
string-range(//para, "XML")[1]
This function takes two other optional arguments. The third argument,
offset, is a number that indicates the
first character to be included in the result range offset from the
beginning of the string searched for. The fourth argument,
length, gives the length of the result
range. By default, offset has a value of
1, thus the result range begins before the first character in the
string. length has a default value such
that the result range covers the entire string searched.
- range(positions)
-
This function takes an XPath expression and returns a set of ranges
(a location set) where each includes the positions passed as
parameters. It can be used to convert a set of positions (which may
be nodes) to a set comprising ranges only.
- range-inside(positions)
-
This function takes an XPath expression and returns a set of ranges
(a location set), one covering the contents of each of the positions
passed as arguments.
- start-point(positions)
-
This function takes an XPath expression and returns the starting
point of the range for each of the positions passed as arguments. The
result is a set of points.
- end-point(positions)
-
This function takes an XPath expression and returns the end point of
the range for each of the positions passed as arguments. The result
is a set of points.
- here( )
-
This function is defined only within an XML document. It returns a
unique position comprising the element containing the XPointer
expression or the attribute that contains it.
- origin( )
-
This function can be used only for links triggered by the user. It
returns the position of the element from which the user started
traversing the link.
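To make the offset and length arguments of string-range more concrete, here is a small Python sketch that mimics the function on a plain string instead of a set of XML positions. It is a simplification under our own assumptions: ranges are reported as (start, end) character offsets counted from zero, overlapping matches are reported, and when length is omitted the range is taken to end where the matched string ends. The function name string_range is ours.

```python
def string_range(text, needle, offset=1, length=None):
    """Return (start, end) offsets for each occurrence of needle in text.

    offset is 1-based, as in the description above: 1 means the range
    starts just before the first character of the match.
    """
    ranges = []
    pos = text.find(needle)
    while pos != -1:
        start = pos + offset - 1
        # Assumption: with no length given, stop at the end of the match.
        end = pos + len(needle) if length is None else start + length
        ranges.append((start, end))
        pos = text.find(needle, pos + 1)
    return ranges

sample = "XML and XML"
print(string_range(sample, "XML"))            # whole matches
print(string_range(sample, "XML", 2, 1))      # only the "M" of each match
print(string_range(sample, "XML")[0])         # first occurrence, like [1] in XPath
```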
10.10.4. XLink
Now that we know about XPointer, let's take a look
at some inline links:
<?xml version="1.0"?>
<simpledoc xmlns:xlink="http://www.w3.org/1999/xlink">
<title>An XLink Demonstration</title>
<section id="target-section">
<para>This is a paragraph in the first section.</para>
<para>More information about XLink can be found at
<reference xlink:type="simple"
xlink:href="http://www.w3.org">
the W3C
</reference>.
</para>
</section>
<section id="origin-section">
<para>
This is a paragraph in the second section.
</para>
<para>
You should go read
<reference xlink:type="simple"
xlink:href="#target-section">
the first section
</reference>
first.
</para>
</section>
</simpledoc>
The first link states that the text the W3C is linked to the URL
http://www.w3.org. How does the
browser know? Simple. An HTML browser knows that every
<a> element is a link because the browser
has to handle only one document type. In XML, you can make up your
own element type names, so the browser needs some way of identifying
links.
XLink provides the xlink:type attribute for
link identification. A browser knows it has found a simple link when
any element sets the xlink:type attribute to
a value of simple. A simple link is like a link in
HTML: one-way and beginning at the point in the document where
it occurs. (In fact, HTML links can be recast as XLinks with minimal
effort.) In other words, the content of the link element can be
selected to traverse the link to its other end. Returning to the
source document is left to the browser.
Once an XLink processor has found a simple link, it looks for other
attributes that it knows:
- xlink:href
-
This attribute is deliberately named to be familiar to anyone who has
used the Web before. Its value is the URI of the other end of the
link; it can refer to an entire document or to a point or element
within that document. If the target is in an XML document, the
fragment part of the URI is an XPointer.
This attribute must be specified, since without it, the link is
meaningless. It is an error not to include it.
- xlink:role
-
This describes the nature of the object at the other end of the link.
XLink doesn't predefine any roles; you might use a
small set to distinguish different types of links in your documents,
such as cross-references, additional reading, and contact
information. A style sheet might take a different action (such as
presenting the link in a different color) based on the role, but the
application won't do anything automatically.
- xlink:title
-
A title for the resource at the other end of the link can be
provided, identical to HTML's
title attribute for the
<a> element. A browser might display the
title as a tool tip; an aural browser might read the title when the
user pauses at the link before selecting it. A style sheet might also
make use of the information, perhaps to build a list of references
for a document.
- xlink:show
-
This attribute suggests what to do when the link is traversed. It can
take the following values:
- embed
-
The content at the other end of the link should be retrieved and
displayed where the link is. An example of this behavior in HTML is
the <img> element, whose target is usually
displayed within the document.
- replace
-
When the link is activated, the browser should replace the current
view with a view of the resource targeted by the link. This is what
happens with the <a> element in HTML: the
new page replaces the current one.
- new
-
The browser should somehow create a new context, if possible, such as
opening a new window.
- other
-
This value specifies behavior that isn't described
by the other values. It is up to the application to determine how to
display the link.
- none
-
This specifies no behavior.
You do not need to give a value for this attribute. Remember that a
link primarily asserts a relationship between
data; behavior is best left to a style sheet. So unless the behavior
is paramount (as it might be in some cases of
embed), it is best not to use this attribute.
- xlink:actuate
-
The second of the behavioral attributes specifies when the link
should be activated. It can take the following values:
- onRequest
-
The application waits until the user requests that the link be
followed, as the <a> element in HTML does.
- onLoad
-
The link should be followed immediately when it is encountered by the
application; this is what most HTML browsers do with
<img> elements, unless the user has turned
off image loading.
- other
-
The link is activated by other means, not specified by XLink. This is
usually defined by other markup in the document.
- none
-
This indicates no information about the activation of the link and
may be used when the link has no current meaningful target or action.
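The way a processor might recognize simple links and collect the attributes listed above can be sketched with Python's xml.etree.ElementTree. This is an illustration only: the function name simple_links and the shortened sample document are ours, and the sketch merely reports the attributes without acting on xlink:show or xlink:actuate. ElementTree exposes a namespaced attribute under the form {namespace-URI}name, which is why the XLink namespace URI appears in the lookups.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"

# A shortened version of the demonstration document.
DOC = """<simpledoc xmlns:xlink="http://www.w3.org/1999/xlink">
  <section id="target-section">
    <para>More information about XLink can be found at
      <reference xlink:type="simple" xlink:href="http://www.w3.org">
        the W3C
      </reference>.
    </para>
  </section>
  <section id="origin-section">
    <para>You should go read
      <reference xlink:type="simple" xlink:href="#target-section">
        the first section
      </reference>
      first.
    </para>
  </section>
</simpledoc>"""

def simple_links(root):
    """Collect every element marked as a simple link, whatever its name."""
    links = []
    for el in root.iter():
        if el.get(XLINK + "type") == "simple":
            links.append({
                "href": el.get(XLINK + "href"),
                "role": el.get(XLINK + "role"),
                "title": el.get(XLINK + "title"),
                "show": el.get(XLINK + "show"),
                "actuate": el.get(XLINK + "actuate"),
                # Link text with runs of whitespace collapsed.
                "text": " ".join("".join(el.itertext()).split()),
            })
    return links

for link in simple_links(ET.fromstring(DOC)):
    print(link["text"], "->", link["href"])
```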
10.10.5. Building Extended Links
XLink has much more to offer, including links to multiple documents
and links between disparate documents (where the XML document
creating the links does not even contain any links).
10.10.5.1. Extended links
An XLink application recognizes extended links by the presence of an
xlink:type="extended" attribute that
distinguishes it from a simple link (such as those used in HTML). An
extended link may have semantic attributes
(xlink:role and
xlink:title) that function just as they do
for a simple link.
In addition, each element inside an extended link may be one of four
types as defined by its
xlink:type="type"
attribute:
- resource
-
Supplies the local resource for the link (generally the text used to
materialize the link)
- locator
-
Supplies a URI for the remote document participating in the link
- arc
-
Supplies a description of the potential paths among the documents
participating in the extended link
- title
-
Supplies a label for the link
Consider this example of an extended link supplying an XML
bibliography:
<biblio xlink:type="extended">
<text xlink:type="resource"
xlink:role="text">XML Bibliography</text>
<book xlink:type="locator" xlink:role="book"
xlink:href="xmlgf.xml"
xlink:title="XML Pocket Reference"/>
<book xlink:type="locator" xlink:role="book"
xlink:href="lxml.xml"
xlink:title="Learning XML"/>
<author xlink:type="locator" xlink:role="author"
xlink:href="robert-eckstein.xml"
xlink:title="Robert Eckstein"/>
<author xlink:type="locator" xlink:role="author"
xlink:href="erik-ray.xml"
xlink:title="Erik Ray"/>
<arc xlink:type="arc"/>
</biblio>
The extended link will probably be represented graphically as a menu
with an entry for each element, except for the last one
(arc), which has no graphical representation.
However, the graphical representation of the link is the
application's responsibility. Let's
look at the role of each of the elements.
10.10.5.4. Arc elements
Arc elements have the xlink:type="arc"
attribute and determine the potential traversals among resources, as
well as the behavior of the XLink application during such traversals.
Arc elements may be represented as arrows in a diagram, linking
resources that participate in an extended link.
XLink applications use the following arc attributes:
Attribute | Description
xlink:type | arc (fixed value)
xlink:from | Indicates the role of the resource at which the arc originates
xlink:to | Indicates the role of the resource at which the arc ends
xlink:show | new, replace, embed, other, or none: tells the XLink application how to display the resource to which the arc is pointing
xlink:actuate | onLoad, onRequest, other, or none: tells the XLink application the circumstances under which the traversal is made
xlink:arcrole | Role of the arc
xlink:title | Text that may be used to represent the arc
The values of the xlink:show and
xlink:actuate attributes have the same
meaning as they do with simple links.
Let's go back to our example of the bibliography,
where we could define the following arc:
<arc xlink:from="text" xlink:to="book"
xlink:show="new" xlink:actuate="onRequest"/>
The arc creates a link from the text displayed by the browser (a
resource where the role is text) to the
descriptive page for the book (remote resource where the role is
book). The page must be displayed in a new window
(xlink:show="new") when the user clicks the
mouse button (xlink:actuate="onRequest").
To include the author's biography in the page for
the book, we will define the following arc:
<arc xlink:from="book" xlink:to="author"
xlink:show="embed" xlink:actuate="onLoad"/>
xlink:show="embed" indicates that the
destination of the arc (the biography) must be included in the page
for the book (origin of the arc) and that the destination must be
included when the book page is loaded
(xlink:actuate="onLoad").
Finally, note that the absence of the
xlink:from or
xlink:to attribute indicates that the origin
or destination of the arc corresponds to all the roles defined in the
link. Thus, the arc in our example (<arc
xlink:type="arc"/>) authorizes all the traversals
possible among the resources of the extended link.
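The matching of arcs to resources by role, including the rule for a missing xlink:from or xlink:to, can be sketched in Python as follows. This follows the description in this section (matching on xlink:role) rather than any particular XLink processor. The function names are ours, participants are identified by their title or text purely for display, and leaving out traversals from a resource to itself is our own simplification.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"

BIBLIO = """<biblio xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="extended">
  <text xlink:type="resource" xlink:role="text">XML Bibliography</text>
  <book xlink:type="locator" xlink:role="book"
        xlink:href="xmlgf.xml" xlink:title="XML Pocket Reference"/>
  <book xlink:type="locator" xlink:role="book"
        xlink:href="lxml.xml" xlink:title="Learning XML"/>
  <author xlink:type="locator" xlink:role="author"
          xlink:href="robert-eckstein.xml" xlink:title="Robert Eckstein"/>
  <arc xlink:type="arc" xlink:from="text" xlink:to="book"/>
</biblio>"""

def describe(el):
    """Name a participant by its title, falling back to its text."""
    return el.get(XLINK + "title") or (el.text or "").strip()

def traversals(link):
    """List (origin, destination) pairs allowed by the link's arcs."""
    participants = [el for el in link
                    if el.get(XLINK + "type") in ("resource", "locator")]
    pairs = []
    for arc in link:
        if arc.get(XLINK + "type") != "arc":
            continue
        from_role = arc.get(XLINK + "from")  # None means "all roles"
        to_role = arc.get(XLINK + "to")      # None means "all roles"
        for start in participants:
            if from_role is not None and start.get(XLINK + "role") != from_role:
                continue
            for end in participants:
                if end is start:
                    continue  # simplification: skip self-traversals
                if to_role is not None and end.get(XLINK + "role") != to_role:
                    continue
                pairs.append((describe(start), describe(end)))
    return pairs

for origin, destination in traversals(ET.fromstring(BIBLIO)):
    print(origin, "->", destination)
```

Replacing the arc with a bare arc element (no from or to) makes the same function return every ordered pair of distinct participants.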
10.10.5.5. Title elements
Elements with a type of title tell the
XLink application the title of the extended link. This element is
needed when you want titles to contain markup (for example, to put the
text in bold) or if you want to provide titles in multiple languages.
A title-type element must have the
xlink:type="title" attribute.
As there may be a large number of attributes for the elements
participating in an extended link, we recommend declaring default
values in the DTD. This eliminates the need to write out fixed-value
attributes on each element.
For example, because the xlink:type
attribute of the <biblio> element always has
extended as the value, we could declare the
<biblio> element in the DTD as follows:
<!ELEMENT biblio (text, book+, author+, arc+)>
<!ATTLIST biblio xlink:type (extended) #FIXED "extended">
We would not need to indicate the type, and if we proceed the same
way for the other elements in the extended link, we could write the
following link:
<biblio>
<text>XML Bibliography</text>
<book xlink:href="xmlgf.xml"
xlink:title="XML Pocket Reference"/>
<book xlink:href="lxml.xml"
xlink:title="Learning XML"/>
<author xlink:href="robert-eckstein.xml"
xlink:title="Robert Eckstein"/>
<author xlink:href="erik-ray.xml"
xlink:title="Erik Ray"/>
<arc/>
</biblio>
By limiting ourselves to the strict minimum (attributes where the
value is fixed do not need to be written), we gain readability.
10.10.5.6. Linkbases
As indicated earlier, an extended link with no resource-type element
(local resource) is described as being out-of-line. This
type of link is therefore not defined in any of the files to which it
points. It may be convenient to group such extended links in XML files
called linkbases.
This raises the question of the location of such XML files. Because
the W3C specification provides no way of finding the linkbases
associated with a given file, we must indicate the URI in
one of the files participating in the link. This is possible with the
xlink:role attribute with the value
xlink:extended-linkset.
The XLink application recognizes the attribute and can look for the
associated linkbase where the URI is indicated by the
xlink:href attribute. For example, to link
the linkbase at the URI linkbase.xml to an XML
file, we could use an element with the following syntax:
<linkbase>
<uri xlink:role="xlink:extended-linkset"
xlink:href="linkbase.xml"/>
</linkbase>
We can indicate as many linkbases in a file as we want. A linkbase
can itself contain a reference to another linkbase. It is up to the
XLink application to manage circular references and limit the depth
of the search for linkbases.
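One way an application might guard against circular references and bound the search depth is a breadth-first walk with a set of already visited URIs. The Python sketch below is hypothetical: the function name collect_linkbases, the max_depth parameter, and the references dictionary (a stand-in for fetching each linkbase and reading the linkbases it points to) are all ours.

```python
from collections import deque

def collect_linkbases(start_uris, references, max_depth=3):
    """Return linkbase URIs in the order visited, skipping repeats
    and anything deeper than max_depth levels from the starting file."""
    seen = set()
    order = []
    queue = deque((uri, 1) for uri in start_uris)
    while queue:
        uri, depth = queue.popleft()
        if uri in seen or depth > max_depth:
            continue  # circular reference, or too deep
        seen.add(uri)
        order.append(uri)
        for ref in references.get(uri, []):
            queue.append((ref, depth + 1))
    return order

# Stand-in data: a.xml and b.xml refer to each other (a cycle),
# and d.xml sits four levels down.
references = {
    "a.xml": ["b.xml"],
    "b.xml": ["a.xml", "c.xml"],
    "c.xml": ["d.xml"],
}
print(collect_linkbases(["a.xml"], references, max_depth=3))
```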
10.10.6. XBase
XBase is a W3C specification currently in development. XBase can be
used to change the base of URIs in an XML document (which, by
default, is the document's directory). XLink
processors take XBase into consideration in order to manage URIs,
using the
xml:base="URI"
attribute as follows:
<linkbase xml:base="http://www.oreilly.com/bdl/">
<uri xlink:role="xlink:extended-linkset"
xlink:href="linkbase.xml"/>
</linkbase>
The linkbase.xml linkbase is searched for in the
http://www.oreilly.com/bdl/
directory, not in the directory of the document where the request was
made to load the linkbase.
The base applies to all nodes that descend from the node
in which it is defined (this is the same behavior as the
xml:lang and
xml:space attributes).
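The inheritance of the base by descendant nodes can be sketched in Python with urllib.parse.urljoin. This is a minimal illustration under our own assumptions: the function name resolved_hrefs, the sample document, and the document URI http://example.org/docs/file.xml are invented for the example, and only xlink:href attributes are resolved.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes namespaced attributes as {namespace-URI}name.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

DOC = """<doc xmlns:xlink="http://www.w3.org/1999/xlink">
  <linkbase xml:base="http://www.oreilly.com/bdl/">
    <uri xlink:href="linkbase.xml"/>
  </linkbase>
  <uri xlink:href="other.xml"/>
</doc>"""

def resolved_hrefs(element, base):
    """Yield each xlink:href resolved against the base in effect for it."""
    # An xml:base on this element replaces (or refines) the inherited base.
    base = urljoin(base, element.get(XML_BASE, ""))
    href = element.get(XLINK_HREF)
    if href is not None:
        yield urljoin(base, href)
    # Descendants inherit the base in effect here.
    for child in element:
        yield from resolved_hrefs(child, base)

DOCUMENT_URI = "http://example.org/docs/file.xml"  # hypothetical location
for uri in resolved_hrefs(ET.fromstring(DOC), DOCUMENT_URI):
    print(uri)
```

The first reference is resolved against the declared base, while the second, which lies outside the element carrying xml:base, falls back to the document's own location.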
Copyright © 2003 O'Reilly & Associates. All rights reserved.