The LWP modules provide the core functionality for web programming in Perl. They contain the foundations for networking applications, protocol implementations, media type definitions, and debugging ability.
The modules LWP::Simple and LWP::UserAgent define client applications that implement network connections, send requests, and receive response data from servers. LWP::RobotUA is another client application that is used to build automated web searchers following a specified set of guidelines.
LWP::UserAgent is the primary module used in applications built with LWP. With it, you can build your own robust web client. It is also the base class for LWP::RobotUA, and LWP::Simple is built on top of it. These two modules provide a specialized set of functions for creating clients.
Additional LWP modules provide the building blocks required for web communications, but you often don't need to use them directly in your applications. LWP::Protocol implements the actual socket connections with the appropriate protocol. The most common protocol is HTTP, but mail protocols (like SMTP), FTP for file transfers, and others can be used across networks.
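As a rough illustration of how protocol support is looked up, the following sketch asks a user agent which URL schemes it can handle. The helper name scheme_supported is hypothetical, and the result for each scheme depends on which protocol modules are installed on your system.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Return 1 if LWP can load a protocol implementation for the scheme,
# 0 otherwise. (scheme_supported is a hypothetical helper name.)
sub scheme_supported {
    my ($scheme) = @_;
    return $ua->is_protocol_supported($scheme) ? 1 : 0;
}

for my $scheme (qw(http ftp file)) {
    printf "%-5s %s\n", $scheme,
        scheme_supported($scheme) ? 'supported' : 'not supported';
}
```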
The following sections describe the RobotUA, Simple, and UserAgent modules of LWP.
The Robot User Agent (LWP::RobotUA) is a subclass of LWP::UserAgent, and is used to create robot client applications. A robot application requests resources in an automated fashion. Robots perform such activities as searching, mirroring, and surveying. Some robots collect statistics, while others wander the Web and summarize their findings for a search engine.
The LWP::RobotUA module defines methods to help program robot applications and observes the Robot Exclusion Standards, which web server administrators can define on their web site to keep robots away from certain (or all) areas of the site.
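To make the exclusion rules concrete, here is a minimal sketch that parses a small robots.txt file by hand with the WWW::RobotRules module and checks two URLs against it. The robot name, the host, and the file contents are made up for illustration; LWP::RobotUA normally fetches and applies these rules for you.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robot name and robots.txt content, for illustration only.
my $rules = WWW::RobotRules->new('ExampleBot/0.1');

my $robots_txt = <<'END';
User-agent: *
Disallow: /private/
END

# Associate the rules with the site the file would have come from.
$rules->parse('http://www.example.com/robots.txt', $robots_txt);

for my $url ('http://www.example.com/index.html',
             'http://www.example.com/private/notes.html') {
    print "$url: ", ($rules->allowed($url) ? 'allowed' : 'disallowed'), "\n";
}
```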
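The constructor for an LWP::RobotUA object looks like this:

    $rob = LWP::RobotUA->new(agent_name, email, [$rules]);

The first parameter, agent_name, is the user agent identifier sent in the User-Agent header of each request. The second parameter is the email address of the person responsible for the robot, and the optional third parameter is a WWW::RobotRules object that stores the robot rules for the servers visited.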
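A short sketch of constructing a robot and adjusting how politely it behaves might look like the following. The robot name and email address are placeholders, and no request is sent; the code only creates the object and reads back its settings.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# Hypothetical robot name and contact address.
my $rob = LWP::RobotUA->new('example-robot/0.1', 'webmaster@example.com');

# Wait at least half a minute between requests to the same server
# (the delay is given in minutes).
$rob->delay(0.5);

print "agent: ", $rob->agent, "\n";
print "from:  ", $rob->from,  "\n";
print "delay: ", $rob->delay, " minutes\n";
```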
Since LWP::RobotUA is a subclass of LWP::UserAgent, the LWP::UserAgent methods are used to perform the basic client activities. The following methods are defined by LWP::RobotUA for robot-related functionality:
LWP::Simple provides an easy-to-use interface for creating a web client, although it is only capable of performing basic retrieving functions. An object constructor is not used for this class; it defines functions to retrieve information from a specified URL and interpret the status codes from the requests.
This module isn't named Simple for nothing. The following lines show how to use it to get a web page and save it to a file:
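    use LWP::Simple;
    $homepage = 'oreilly_com.html';
    $status = getstore('http://www.oreilly.com/', $homepage);
    print("hooray") if is_success($status);

The retrieving functions get and head return a document's contents and header information, respectively. The other retrieving functions (getprint, getstore, and mirror) return the HTTP status code of the request.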
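The status-checking functions can be tried without touching the network. The sketch below classifies a status code with is_success and is_error, which LWP::Simple re-exports from HTTP::Status along with status_message. The helper name describe_status is hypothetical.

```perl
use strict;
use warnings;
use LWP::Simple;   # also exports is_success, is_error, and status_message

# Turn a numeric HTTP status code into a short description.
# (describe_status is a hypothetical helper name.)
sub describe_status {
    my ($code) = @_;
    my $kind = is_success($code) ? 'success'
             : is_error($code)   ? 'error'
             :                     'other';
    return sprintf '%d %s (%s)', $code, status_message($code), $kind;
}

print describe_status(200), "\n";
print describe_status(404), "\n";
```

In a real client you would pass in the value returned by getstore, getprint, or mirror rather than a literal code.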
The user-agent identifier produced by LWP::Simple is LWP::Simple/n.nn, where n.nn is the version number of LWP being used.
The following list describes the functions exported by LWP::Simple:
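To create an LWP::UserAgent object, call its constructor:

    $ua = LWP::UserAgent->new;

You give the object a request, which it uses to contact the server, and the information you requested is returned. The most often used method in this module is request, which sends an HTTP::Request object to the server and returns the response.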
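The request cycle described above can be sketched as follows. The agent string, the timeout value, and the helper names build_get and fetch are assumptions made for this example. Nothing is sent over the network until fetch is called.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->agent('example-client/0.1');   # hypothetical identifier
$ua->timeout(10);                   # give up after 10 seconds

# Build a GET request object for a URL (hypothetical helper).
sub build_get {
    my ($url) = @_;
    return HTTP::Request->new(GET => $url);
}

# Send the request and return the body, or undef on failure
# (hypothetical helper).
sub fetch {
    my ($agent, $url) = @_;
    my $response = $agent->request(build_get($url));
    return $response->is_success ? $response->decoded_content : undef;
}

# Example call (performs a real network request):
#   my $html = fetch($ua, 'http://www.oreilly.com/');
```

Keeping request construction separate from sending makes it easy to inspect or modify the request, for example to add headers, before it goes out.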
The following methods are supplied by LWP::UserAgent: