14.17 Walking the Document Node Tree
NN 6, IE 5
14.17.1 Problem
You want to iterate through the entire
document node tree in search of nodes meeting desired criteria.
14.17.2 Solution
The following getLikeElements( ) function returns a collection of
elements that share the same tag name, attribute name, and attribute
value (specified as arguments):
function getLikeElements(tagName, attrName, attrValue) {
    var startSet;
    var endSet = new Array( );
    // Start with all elements of one tag name, or every element in the document
    if (tagName) {
        startSet = document.getElementsByTagName(tagName);
    } else {
        startSet = (document.all) ? document.all :
            document.getElementsByTagName("*");
    }
    if (attrName) {
        // Keep only elements that have the attribute (and the value, if specified)
        for (var i = 0; i < startSet.length; i++) {
            if (startSet[i].getAttribute(attrName)) {
                if (attrValue) {
                    if (startSet[i].getAttribute(attrName) == attrValue) {
                        endSet[endSet.length] = startSet[i];
                    }
                } else {
                    endSet[endSet.length] = startSet[i];
                }
            }
        }
    } else {
        // No attribute filtering requested
        endSet = startSet;
    }
    return endSet;
}
14.17.3 Discussion
You can omit one or more arguments of the getLikeElements( )
function in specific combinations. For example, if you
omit all three arguments, you receive a collection of all elements in
the document. Specify only the first argument (the tag name) to
retrieve all elements with the same tag name. If you supply the tag
name and attribute name only, the returned collection contains
elements that have the same tag name and have the same attribute
specified, regardless of attribute value. If you specify an attribute
value, you must also pass an attribute name. For empty arguments,
pass either an empty string or null when they
precede nonempty arguments. The following invocations of
getLikeElements( ) are all valid:
var collection = getLikeElements( );
var collection = getLikeElements("td");
var collection = getLikeElements("", "class");
var collection = getLikeElements("", "class", "highlight");
var collection = getLikeElements("td", "align", "center");
Use caution, however, when retrieving input
elements that have value attributes. Netscape
returns only those elements with explicitly set
value attributes, while IE returns all
input elements because the browser automatically
assigns a value attribute to
input elements such as radio and checkbox buttons.
Another variation on the notion of walking a document tree is to use
a script to diagram the document to reveal its nested node structure.
Object model facilities for retrieving all elements in a document
completely flatten the node hierarchy. To preserve the hierarchy and
track it, you can use a routine like the following
walkChildNodes( ) function, which accumulates a string
that reveals the node structure of any object passed as the first
parameter of the function. The function invokes itself recursively as
it dives into nested hierarchies, and internally passes the second
argument to help the function keep track of which nested level it is
currently processing.
function walkChildNodes(objRef, n) {
    var obj;
    // Accept an element ID, an element reference, or nothing (root element)
    if (objRef) {
        if (typeof objRef == "string") {
            obj = document.getElementById(objRef);
        } else {
            obj = objRef;
        }
    } else {
        obj = (document.body.parentElement) ?
            document.body.parentElement : document.body.parentNode;
    }
    var output = "";
    var indent = "";
    var i, group, txt;
    if (n) {
        // Nested call: one "+---" per level of nesting
        for (i = 0; i < n; i++) {
            indent += "+---";
        }
    } else {
        // Top-level call: write the heading
        n = 0;
        output += "Child Nodes of <" + obj.tagName.toLowerCase( );
        output += ">\n=====================\n";
    }
    group = obj.childNodes;
    for (i = 0; i < group.length; i++) {
        output += indent;
        switch (group[i].nodeType) {
            case 1:    // element node
                output += "<" + group[i].tagName.toLowerCase( );
                output += (group[i].id) ? " ID=" + group[i].id : "";
                output += (group[i].name) ? " NAME=" + group[i].name : "";
                output += ">\n";
                break;
            case 3:    // text node: show the first 15 characters
                txt = group[i].nodeValue.substr(0,15);
                output += "[Text:\"" + txt.replace(/[\r\n]/g,"<cr>");
                if (group[i].nodeValue.length > 15) {
                    output += "...";
                }
                output += "\"]\n";
                break;
            case 8:    // comment node
                output += "[!COMMENT!]\n";
                break;
            default:
                output += "[Node Type = " + group[i].nodeType + "]\n";
        }
        // Recurse into any nested nodes, one level deeper
        if (group[i].childNodes.length > 0) {
            output += walkChildNodes(group[i], n+1);
        }
    }
    return output;
}
To invoke the walkChildNodes( ) function to
capture the node structure of a document's
body element, the call looks like the following:
walkChildNodes(document.body);
Output from walkChildNodes( ) displays the tags of
each element node (with their IDs, if assigned), and samples of text
nodes to help you identify them. The following trace shows the body
of a document containing the Recipe 14.1 script plus a portion of the
table from the discussion of Recipe 14.15:
Child Nodes of <body>
=====================
<h1>
+---[Text:"Welcome to Gian..."]
<h2>
+---[Text:"We Love"]
+---<script>
+---[Text:" Windows "]
+---<noscript>
+---[Text:"Users!"]
<hr>
<form>
+---<table>
+---+---<tbody ID=myTBody>
+---+---+---<tr>
+---+---+---+---<td>
+---+---+---+---+---<input>
+---+---+---+---<td>
+---+---+---+---+---[Text:"Item 1"]
+---+---+---<tr>
+---+---+---+---<td>
+---+---+---+---+---<input>
+---+---+---+---<td>
+---+---+---+---+---[Text:"Item 2"]
+---+---+---<tr>
+---+---+---+---<td>
+---+---+---+---+---<input>
+---+---+---+---</td>
You can use the walkChildNodes( ) function as a
diagnostic tool, particularly for dynamically created HTML content.
If you embed both the function and a temporary textarea element in
the document, your content creation function can end with a call to
walkChildNodes( ) and write the results to the textarea for closer
inspection and for comparison against what you think the node
hierarchy should be.
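One way to wire up such a diagnostic display is sketched below. It is only an illustration: the name showNodeTree, its parameters, and the textarea ID used in the usage line are hypothetical, and the sketch assumes that walkChildNodes( ) from this recipe is already loaded in the page.

```javascript
// Hypothetical helper: writes a node-tree dump into a textarea.
// walkFn  - the tree-dumping function (walkChildNodes from this recipe)
// doc     - the document object that holds the textarea
// areaId  - ID of the temporary textarea element (an assumed name)
// rootRef - element (or element ID) whose child nodes should be diagrammed
function showNodeTree(walkFn, doc, areaId, rootRef) {
    var area = doc.getElementById(areaId);
    if (!area) {
        return false;   // no such textarea in the page; nothing to do
    }
    area.value = walkFn(rootRef);
    return true;
}
```

A content creation function could then end with a call such as showNodeTree(walkChildNodes, document, "nodeDump", document.body), where "nodeDump" is whatever ID you assign to the temporary textarea.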
One last technique to be aware of is the W3C DOM
TreeWalker object, which is available in Netscape 7
and later (but not in IE as of Version 6). The
TreeWalker object is a live, hierarchical list of
nodes that meet criteria defined by the
document.createTreeWalker( ) method. The list assumes the same
parent-descendant hierarchy for its items as the nodes to which its
items point. The arguments of createTreeWalker( ) specify the node
where the list begins and, by way of filtering, which nodes (or classes
of nodes) are excluded from the list.
The TreeWalker object maintains a kind of pointer
inside the list (so that your scripts don't have
to). Methods of this object let scripts access the next or previous
node (or sibling, child, or parent node) in the list, while moving
the pointer in the direction indicated by the method you chose. If
scripts modify the document tree after the
TreeWalker is created, changes to the document
tree are automatically reflected in the sequence of nodes in the
TreeWalker.
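As a rough illustration of that internal pointer, the following sketch uses the firstChild( ) and nextSibling( ) methods of a TreeWalker to step across the top-level elements beneath a starting node. The function name listChildTags and its parameters are hypothetical, and the numeric constant 1 stands in for NodeFilter.SHOW_ELEMENT so that the function does not rely on a global NodeFilter object.

```javascript
// Hypothetical helper: returns the lowercase tag names of the top-level
// elements beneath root, using a W3C DOM TreeWalker.
// doc  - a document object that supports createTreeWalker()
// root - the node where the TreeWalker begins
function listChildTags(doc, root) {
    var SHOW_ELEMENT = 1;   // same value as NodeFilter.SHOW_ELEMENT
    var walker = doc.createTreeWalker(root, SHOW_ELEMENT, null, false);
    var tags = [];
    // firstChild() moves the pointer down one level; nextSibling() then
    // moves it across that level. Each returns null when no node remains.
    for (var node = walker.firstChild(); node != null;
         node = walker.nextSibling()) {
        tags.push(node.tagName.toLowerCase());
    }
    return tags;
}
```

In a browser that supports the TreeWalker object, a call such as listChildTags(document, document.body) would return the element tag names found at the first level of the body, skipping text and comment nodes.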
While fully usable in an HTML document, the
TreeWalker can be even more valuable when working
with an XML data document. For example, the W3C DOM does not provide
a quick way to access all elements that have a particular attribute
name (something that the XPath standard can do easily on the server).
But you can define a TreeWalker to point only to
nodes that have the desired attribute and quickly access those nodes
sequentially (i.e., without having to script more laborious looping
through all nodes in search of the desired elements). For example,
the following filter function allows only those nodes that have
the author attribute to be members of a
TreeWalker object:
function authorAttrFilter(node) {
    if (node.hasAttribute("author")) {
        return NodeFilter.FILTER_ACCEPT;
    }
    return NodeFilter.FILTER_SKIP;
}
A reference to this function becomes one of the parameters to a
createTreeWalker( ) method that also limits the
list to element nodes:
var authorsOnly = document.createTreeWalker(document, NodeFilter.SHOW_ELEMENT,
    authorAttrFilter, false);
You can then invoke TreeWalker object methods to
obtain a reference to one of the nodes in the list. When you invoke
the method, the TreeWalker object applies the
filter to candidates relative to the current position of the internal
pointer in the direction indicated by the method. The next document
tree node to meet the method and filter criteria is returned. Once
you have that node reference, you can access any DOM node property or
method to work with the node, independent of the items in the
TreeWalker list.
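For instance, a script might call nextNode( ) repeatedly to visit every node that passes the filter and read the attribute from each one. The following is a minimal sketch of that idea; the function name collectAuthors is hypothetical, and the function assumes it receives a TreeWalker such as authorsOnly, whose nodes all carry the author attribute.

```javascript
// Hypothetical helper: gathers the value of the author attribute from
// every node that the TreeWalker exposes, in document order.
// walker - a TreeWalker whose filter admits only nodes with that attribute
function collectAuthors(walker) {
    var authors = [];
    var node;
    // nextNode() advances the internal pointer to the next node that meets
    // the filter criteria and returns null when the list is exhausted
    while ((node = walker.nextNode()) != null) {
        authors.push(node.getAttribute("author"));
    }
    return authors;
}
```

With the TreeWalker defined earlier, the call would be collectAuthors(authorsOnly). Each node reference obtained inside the loop is an ordinary DOM node, so any other node property or method could be used in place of getAttribute( ).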
14.17.4 See Also
Recipe 1.1 for concatenating string segments to build long
strings.