I started playing around with the Web a long time ago, or at least it feels that way. The first versions of Mosaic had just shown up, Gopher and WAIS were still hot technology, and I discovered an HTTP server program called Plexus. What set it apart was that it was implemented in Perl, which made it easy to extend. CGI had not been invented yet, so all we had were servlets (although we didn't call them that then). Over time, I moved from hacking on the server side to the client side but stayed with Perl as my programming language of choice. As a result, I got involved in LWP, the Perl web client library.
A lot has happened to the Web since then. These days there is almost no end to the information at our fingertips: news, stock quotes, weather, government info, shopping, discussion groups, product info, reviews, games, and other entertainment. And the good news is that LWP can help automate access to all of it.
This book tells you how you can write your own useful web client applications with LWP and its related HTML modules. Sean's done a great job of showing how this powerful library can be used to make tools that automate various tasks on the Web. If you are like me, you probably have many examples of web forms that you find yourself filling out over and over again. Why not write a simple LWP-based tool that does it all for you? Or a tool that does research for you by collecting data from many web pages without you having to spend a single mouse click? After reading this book, you should be well prepared for tasks such as these.
This book's focus is to teach you how to write scripts against services that are set up to serve traditional web browsers. This means services exposed through HTML. Even in a world where people have eventually discovered that the Web can provide real program-to-program interfaces (the current "web services" craze), it is likely that HTML scraping will continue to be a valuable way to extract information from the Web. I strongly believe that the combination of Perl and LWP is one of the best tools to get that job done. Reading
It has been fun writing and maintaining the LWP codebase, and Sean's written a fine book about using it. Enjoy!
Primary author and maintainer of LWP
Copyright © 2002 O'Reilly & Associates. All rights reserved.