Push Versus Pull (Java & XML, 2nd Edition)

14.3.2. Creating an XML RSS Document

The first thing you need to do to use RSS is create an RSS file. This is almost too simple to be believed: other than declaring the correct namespaces and following the RSS 1.0 specification, there is nothing at all complicated about creating an RSS document. Example 14-6 shows a sample RSS file that mytechbooks.com has modeled.

Example 14-6. Sample RSS document for mytechbooks.com

<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
>
 <channel>
  <title>mytechbooks.com New Listings</title>
  <link>http://www.newInstance.com/javaxml2/techbooks</link>
  <description>
   Your online source for technical material, computers, 
   and computing books!
  </description>

  <image rdf:resource="http://newInstance.com/javaxml2/logo.gif" />

  <items>
   <rdf:Seq>
    <rdf:li resource="http://www.newInstance.com/javaxml2/techbooks" />
   </rdf:Seq>
  </items>
 </channel>

  <image rdf:about="http://newInstance.com/javaxml2/logo.gif">
   <title>mytechbooks.com</title>
   <url>http://newInstance.com/javaxml2/logo.gif</url>
   <link>http://newInstance.com/javaxml2/techbooks</link>
  </image>

  <item rdf:about="http://www.newInstance.com/javaxml2/techbooks">
   <title>Java Servlet Programming</title>
   <link>
    http://newInstance.com/javaxml2/techbooks/buy.xsp?isbn=156592391X
   </link>
   <description>
    This book is a superb introduction to Java servlets
    and their various communications mechanisms.
   </description>
  </item>
</rdf:RDF>

The root element must be RDF, in the RDF namespace, as shown in the example. Within the root element, one single channel element must appear. This has elements that describe the channel (title, link, and description), an optional image that can be associated with the channel (as well as information about that image), and then as many as 15 item elements,[23] each detailing one item related to the channel. Each item has a title, link, and description element, all of which are self-explanatory. An optional text box and button to submit the information in the book can be added as well, although these are not included in the example. For complete details of allowed elements and attributes, visit the RSS 1.0 specification online at http://groups.yahoo.com/group/rss-dev/files/specification.html.

[23] This isn't a limit set by RSS 1.0, but is used for backwards compatibility with RSS 0.9 and 0.91.

NOTE: As in previous examples, actual RSS channel documents should avoid having whitespace within the link and url elements, but rather have all information on a single line. Again, the formatting in the example does not reflect this due to printing and sizing constraints.

There is one somewhat tricky thing to watch out for, though. You'll notice that the item element (or elements) is actually not nested within the channel element at all. To create a link between items in the document and the channel, you'll want to use some RDF (the Resource Description Framework, which RSS is a descendant of) constructs:

  <items>
   <rdf:Seq>
    <rdf:li resource="http://www.newInstance.com/javaxml2/techbooks" />
   </rdf:Seq>
  </items>

Here, the items element is nested within the channel element. Then, the li construct, in the RDF-defined namespace, is assigned a URI through the resource attribute. In each item you want associated with this channel, supply the about attribute (again in the RDF namespace) and assign it the same URI you used in the channel's resource descriptor:

  <item rdf:about="http://www.newInstance.com/javaxml2/techbooks">
    <!-- Item content -->
  </item>

Each item carrying this URI is associated with the channel that lists the same URI. In other words, you've just built a link between the channel in the RSS file and the items. The same approach applies for linking a channel to an image: you use the image element in the channel element, specifying the image URL as the value of the rdf:resource attribute. You should then define a second image element, outside the channel element, supplying a title, URL, and link. Finally, use the rdf:about attribute (as in the item element) to specify the same URL as provided in the channel's image element. Did you follow all of that? This is all quite a bit different from RSS 0.9 and 0.91 (covered in the first edition of this book), so you'll need to be careful not to get things mixed up between the older specification and this newer one.
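If you want to double-check that linkage mechanically, a small checker is easy to write. What follows is only a sketch, not part of the mytechbooks.com code: the class name RssLinkCheck and the method unmatchedResources() are names I've made up for illustration, and it uses the JAXP DOM classes that ship with the JDK rather than JDOM. It collects the rdf:about value of every item, then reports any URI in the channel's rdf:Seq that no item claims. Because Example 14-6 writes the li attribute as plain resource, the sketch accepts either resource or rdf:resource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper: checks that channel resources line up with items. */
final class RssLinkCheck {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String RSS_NS = "http://purl.org/rss/1.0/";

    private RssLinkCheck() { }

    /**
     * Returns every URI listed in an rdf:li element that does not match
     * the rdf:about attribute of any item element in the document.
     */
    static List<String> unmatchedResources(String rssXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse RSS document", e);
        }

        // Gather the rdf:about URI of each item
        Set<String> itemUris = new HashSet<>();
        NodeList items = doc.getElementsByTagNameNS(RSS_NS, "item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            itemUris.add(item.getAttributeNS(RDF_NS, "about"));
        }

        // Compare against each rdf:li listed by the channel
        List<String> unmatched = new ArrayList<>();
        NodeList entries = doc.getElementsByTagNameNS(RDF_NS, "li");
        for (int i = 0; i < entries.getLength(); i++) {
            Element li = (Element) entries.item(i);
            String uri = li.getAttributeNS(RDF_NS, "resource");
            if (uri.isEmpty()) {
                // Fall back to the unprefixed form used in Example 14-6
                uri = li.getAttribute("resource");
            }
            if (!itemUris.contains(uri)) {
                unmatched.add(uri);
            }
        }
        return unmatched;
    }
}
```

An empty list means every URI the channel lists is backed by at least one item; anything else names the URIs you still need to fix.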

It is simple enough to create RSS files programmatically; the procedure is similar to how you generated the HTML for the mytechbooks.com web site. Half of the RSS file (the information about the channel as well as the image information) is static content; only the item elements must be generated dynamically. However, just as you were getting ready to open up vi and start creating another XSL stylesheet, another requirement was dropped into your lap: the machine that will house the RSS channel is a different server than that used in our last example, and has only very outdated versions of the Apache Xalan libraries available. Because of some of the high-availability applications that also run on that machine, such as the billing system, mytechbooks.com does not want to update those libraries until change control can be stepped through, a weeklong process. However, mytechbooks.com does have newer versions of the Xerces libraries available (as XML parsing is used in the billing system), so Java APIs for handling XML are available.[24] In this example, I use JDOM to convert the XML from the Foobar Public Library into an RSS channel format. Example 14-7 does just this.

[24] Yes, this is a bit of a silly case, and perhaps not so likely to really occur. However, it does afford me the opportunity to look at another alternative for creating XML programmatically. Don't sneer too much at the absurdity of the example; all of the examples in this book, including the silly ones, stem from actual experiences consulting for real-world companies. Laughing at this scenario might mean your next project has the same silly requirements!

Example 14-7. Java servlet to convert new book listings into an RSS channel document

package com.techbooks;

import java.io.FileInputStream;
import java.io.InputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.net.URL;
import java.util.Iterator;
import java.util.List;
import javax.servlet.*;
import javax.servlet.http.*;

// JDOM
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class GetRSSChannelServlet extends HttpServlet {

    /** Host to connect to for books list */
    private static final String hostname = "newInstance.com";
    /** Port number to connect to for books list */
    private static final int portNumber = 80;
    /** File to request (URI path) for books list */
    private static final String file = "/cgi/supplyBooks.pl";

    public void service(HttpServletRequest req, HttpServletResponse res) 
        throws ServletException, IOException {            
            
        res.setContentType("text/plain");
        PrintWriter out = res.getWriter();
        
        // Connect and get XML listing of books
        URL getBooksURL = new URL("http", hostname, portNumber, file);
        InputStream in = getBooksURL.openStream();

        try {
            // Request SAX Implementation and use default parser
            SAXBuilder builder = new SAXBuilder();

            // Create the document
            Document doc = builder.build(in);
            
            // Output XML
            out.println(generateRSSContent(doc));
            
        } catch (JDOMException e) {        
            out.println("Error: " + e.getMessage());
        } finally {
            // Release the response writer and the connection to the library
            out.close();
            in.close();
        }
    }   
    
    /**
     * <p>
     * This will generate an RSS XML document using the supplied 
     *   JDOM <code>Document</code>.
     * </p>
     *
     * @param doc <code>Document</code> to use for input.
     * @return <code>String</code> - RSS file to output.
     * @throws JDOMException when errors occur.
     */
    private String generateRSSContent(Document doc) throws JDOMException {
        StringBuffer rss = new StringBuffer();
        
        rss.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
           .append("<rdf:RDF ")
           .append("xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n")
           .append("         xmlns=\"http://purl.org/rss/1.0/\"\n")
           .append(">\n")
           .append(" <channel>\n")
           .append("  <title>mytechbooks.com New Listings</title>\n")
           .append("  <link>http://www.newInstance.com/javaxml2/techbooks")
           .append("</link>\n")
           .append("  <description>\n")
           .append("   Your online source for technical material, computers, \n")
           .append("   and computing books!\n")
           .append("  </description>\n\n")
           .append("  <image ")
           .append("rdf:resource=\"http://newInstance.com/javaxml2/logo.gif\"")
           .append(" />\n\n")
           .append("  <items>\n")
           .append("   <rdf:Seq>\n")
           .append("    <rdf:li ")
           .append("resource=\"http://www.newInstance.com/javaxml2/techbooks\"")
           .append(" />\n")
           .append("   </rdf:Seq>\n")
           .append("  </items>\n")
           .append(" </channel>\n\n")
           .append("  <image ")
           .append("rdf:about=\"http://newInstance.com/javaxml2/logo.gif\">\n")
           .append("   <title>mytechbooks.com</title>\n")
           .append("   <url>http://newInstance.com/javaxml2/logo.gif</url>\n")
           .append("   <link>http://newInstance.com/javaxml2/techbooks</link>\n")
           .append("  </image>\n\n");
           
        // Add an item for each new title with Computers as subject
        List books = doc.getRootElement().getChildren("book");
        for (Iterator i = books.iterator(); i.hasNext(); ) {
            Element book = (Element)i.next();
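            // getAttributeValue() returns null when the attribute is missing,
            //   so books without a subject are skipped rather than failing
            if ("Computers".equals(book.getAttributeValue("subject"))) {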
                // Output an item
                rss.append("<item rdf:about=\"http://www.newInstance.com/")
                   .append("javaxml2/techbooks\">\n")
                    // Add title (getText() returns the element's text)
                   .append(" <title>")
                   .append(book.getChild("title").getText())
                   .append("</title>\n")
                    // Add link to buy book; trim so the URL has no whitespace
                   .append(" <link>")
                   .append("http://newInstance.com/javaxml2")
                   .append("/techbooks/buy.xsp?isbn=")
                   .append(book.getChild("saleDetails")
                               .getChild("isbn")
                               .getTextTrim())
                   .append("</link>\n")
                    // Add description
                   .append(" <description>")
                   .append(book.getChild("description").getText())
                   .append("</description>\n")
                   .append("</item>\n");
                        
            }
        }          
         
        rss.append("</rdf:RDF>");
        
        return rss.toString();
    }
}

By this time, nothing in this code should be the least bit surprising to you; I've imported the JDOM and I/O classes needed, and accessed the Foobar Public Library application as in the ListBooksServlet. The resulting InputStream is used to create a JDOM Document, with the default parser (Apache Xerces) and the JDOM builder based on SAX doing the work.

Then, the JDOM Document is handed off to the generateRSSContent() method, which writes out all of the static content for the RSS channel. This method then obtains the book elements within the XML from the library and iterates through them, ignoring those without a subject attribute equal to "Computers".
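One thing to keep in mind when you assemble XML through string concatenation like this: nothing escapes the text for you. A title containing an ampersand or an angle bracket would leave the channel document malformed. The helper below is a minimal sketch of how you might guard against that; the class name XmlText and its escape() method are my own invention, not part of JDOM or the servlet above, and you would wrap each value with it before appending to the buffer.

```java
/** Hypothetical helper for escaping text placed into hand-built XML. */
final class XmlText {

    private XmlText() { }

    /** Replaces the five characters that have special meaning in XML markup. */
    static String escape(String text) {
        StringBuilder escaped = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  escaped.append("&amp;");  break;
                case '<':  escaped.append("&lt;");   break;
                case '>':  escaped.append("&gt;");   break;
                case '"':  escaped.append("&quot;"); break;
                case '\'': escaped.append("&apos;"); break;
                default:   escaped.append(c);
            }
        }
        return escaped.toString();
    }
}
```

With this in place, a call such as XmlText.escape(book.getChild("title").getText()) would keep a title like "C & C++" from breaking the output.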

NOTE: Again, I've done some rather different things simply for illustrative purposes. For example, this code directly outputs XML; you could just as easily create a JDOM tree and output it using XMLOutputter. Of course, you could also use DOM for the entire servlet. All these are viable and perfectly legitimate options.
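To give you a feel for the tree-based alternative, here is a rough sketch of building a single item as a tree and then serializing it. I've used the DOM and JAXP classes bundled with the JDK so that it runs without extra libraries; the class RssItemBuilder and its buildItem() method are hypothetical names, not code from mytechbooks.com. The same idea carries over to a JDOM tree written out with XMLOutputter.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper: builds one RSS item as a DOM tree and serializes it. */
final class RssItemBuilder {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String RSS_NS = "http://purl.org/rss/1.0/";

    private RssItemBuilder() { }

    static String buildItem(String about, String title,
                            String link, String description) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // <item rdf:about="..."> in the RSS 1.0 namespace
            Element item = doc.createElementNS(RSS_NS, "item");
            item.setAttributeNS(RDF_NS, "rdf:about", about);
            doc.appendChild(item);

            addChild(doc, item, "title", title);
            addChild(doc, item, "link", link);
            addChild(doc, item, "description", description);

            // An identity transform serializes the tree, escaping text as needed
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build RSS item", e);
        }
    }

    /** Appends a child element holding the supplied text to the parent. */
    private static void addChild(Document doc, Element parent,
                                 String name, String text) {
        Element child = doc.createElementNS(RSS_NS, name);
        child.setTextContent(text);
        parent.appendChild(child);
    }
}
```

The trade-off is more setup code in exchange for letting the serializer worry about escaping and well-formedness, which the string-based version leaves entirely in your hands.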

Finally, each element that makes it through the comparison is added to the RSS channel. Nothing very exciting here, right? Figure 14-5 shows a sample output from accessing this servlet, saved as GetRSSChannelServlet.java, through a web browser.

Figure 14-5. RSS channel generated by the GetRSSChannelServlet

With this RSS channel ready for use, mytechbooks.com has made its content available to any service provider that supports RSS! To get the ball rolling on allowing clients to use its channel, mytechbooks.com would like to ensure its RSS document is valid, and see a sample HTML rendering of it (as would you, I imagine).
