
5. Communication between browsers and Squid

Most web browsers available today support proxying and are easily configured to use a Squid server as a proxy. Some browsers support advanced features such as lists of domains or URL patterns that shouldn't be fetched through the proxy, or JavaScript automatic proxy configuration.

5.1 Netscape manual configuration

Select Network Preferences from the Options menu. On the Proxies page, click the radio button next to Manual Proxy Configuration and then click on the View button. For each protocol that your Squid server supports (by default, HTTP, FTP, and gopher) enter the Squid server's hostname or IP address and put the HTTP port number for the Squid server (by default, 3128) in the Port column. For any protocols that your Squid does not support, leave the fields blank.

Here is a screen shot of the Netscape Navigator manual proxy configuration screen.

5.2 Netscape automatic configuration

Netscape Navigator's proxy configuration can be automated with JavaScript (for Navigator versions 2.0 or higher). Select Network Preferences from the Options menu. On the Proxies page, click the radio button next to Automatic Proxy Configuration and then fill in the URL for your JavaScript proxy configuration file in the text box. The box is too small, but the text will scroll to the right as you go.

Here is a screen shot of the Netscape Navigator automatic proxy configuration screen.

You may also wish to consult Netscape's documentation for the Navigator JavaScript proxy configuration.

Here is a sample auto configuration JavaScript from Oskar Pearson:


    //We (www.is.co.za) run a central cache for our customers that they
    //access through a firewall - thus if they want to connect to their intranet
    //system (or anything in their domain at all) they have to connect
    //directly - hence all the "fiddling" to see if they are trying to connect
    //to their local domain.

    //Replace each occurrence of company.com with your domain name
    //and if you have some kind of intranet system, make sure
    //that you put its name in place of "internal" below.

    //We also assume that your cache is called "cache.company.com", and
    //that it runs on port 8080. Change it down at the bottom.

    //(C) Oskar Pearson and the Internet Solution (http://www.is.co.za)

    function FindProxyForURL(url, host)
    {
        //If they have only specified a hostname, go directly.
        if (isPlainHostName(host))
            return "DIRECT";

        //These connect directly if the machine they are trying to
        //connect to starts with "intranet" - ie http://intranet
        //Connect directly if it is intranet.*
        //If you have another machine that you want them to
        //access directly, replace "internal*" with that
        //machine's name
        if (shExpMatch( host, "intranet*")||
            shExpMatch(host, "internal*"))
            return "DIRECT";

        //Connect directly to our domains (NB for Important News)
        if (dnsDomainIs( host,"company.com")||
            //If you have another domain that you wish to connect to
            //directly, put it in here
            dnsDomainIs(host,"sistercompany.com"))
            return "DIRECT";

        //So the error message "no such host" will appear through the
        //normal Netscape box - less support queries :)
        if (!isResolvable(host))
            return "DIRECT";

        //We only cache http, ftp and gopher
        if (url.substring(0, 5) == "http:" ||
            url.substring(0, 4) == "ftp:"||
            url.substring(0, 7) == "gopher:")
            //Change the ":8080" to the port that your cache
            //runs on, and "cache.company.com" to the machine that
            //you run the cache on
            return "PROXY cache.company.com:8080; DIRECT";

        //We don't cache WAIS
        if (url.substring(0, 5) == "wais:")
            return "DIRECT";
        else
            return "DIRECT";
    }
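A script like the one above normally runs inside the browser, which supplies helpers such as isPlainHostName(), dnsDomainIs() and shExpMatch(). To try the logic outside a browser (for example under Node.js), one option is to write simple stand-ins for those helpers. The following is only a sketch: the stand-ins are simplified assumptions about how the helpers behave, not the browsers' own code, and the isResolvable() check is left out so that the example needs no DNS access.

```javascript
// Simplified stand-ins for helpers that a browser normally provides to a
// proxy configuration script. They are assumptions made for this sketch.
function isPlainHostName(host) {
  // A "plain" host name contains no dots.
  return host.indexOf(".") === -1;
}

function dnsDomainIs(host, domain) {
  // True when the host name ends with the given domain.
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

function shExpMatch(str, pattern) {
  // Escape regular-expression metacharacters, then translate the shell
  // wildcards "*" and "?" into their regular-expression equivalents.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = "^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$";
  return new RegExp(source).test(str);
}

// Cut-down version of the sample script (no isResolvable() check).
function FindProxyForURL(url, host) {
  if (isPlainHostName(host)) return "DIRECT";
  if (shExpMatch(host, "intranet*") || shExpMatch(host, "internal*")) return "DIRECT";
  if (dnsDomainIs(host, "company.com")) return "DIRECT";
  if (url.substring(0, 5) == "http:" ||
      url.substring(0, 4) == "ftp:" ||
      url.substring(0, 7) == "gopher:")
    return "PROXY cache.company.com:8080; DIRECT";
  return "DIRECT";
}

console.log(FindProxyForURL("http://www.example.org/", "www.example.org"));
console.log(FindProxyForURL("http://www.company.com/", "www.company.com"));
```

Running a few representative URLs through the function this way can catch mistakes before the script is published to users.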

5.3 Lynx and Mosaic configuration

For Mosaic and Lynx, you can set environment variables before starting the application. For example (assuming csh or tcsh):

    % setenv http_proxy http://mycache.example.com:3128/
    % setenv gopher_proxy http://mycache.example.com:3128/
    % setenv ftp_proxy http://mycache.example.com:3128/

For Lynx you can also edit the lynx.cfg file to configure proxy usage. This has the added benefit of causing all Lynx users on a system to access the proxy without making environment variable changes for each user. For example:

    http_proxy:http://mycache.example.com:3128/
    ftp_proxy:http://mycache.example.com:3128/
    gopher_proxy:http://mycache.example.com:3128/
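To illustrate the idea behind these per-protocol settings, here is a rough sketch of how a client program might choose a proxy from variables named like the ones above. The function proxyFor() is hypothetical and is not taken from Lynx or Mosaic. It assumes the variable for a request is named after the URL scheme plus "_proxy", and that a proxy URL without an explicit port means port 80.

```javascript
// Hypothetical helper: pick the proxy for a request URL from an environment
// map such as process.env. Returns null when no matching variable is set.
function proxyFor(requestUrl, env) {
  // "http:" -> "http", "ftp:" -> "ftp", and so on.
  const scheme = new URL(requestUrl).protocol.replace(":", "");
  const value = env[scheme + "_proxy"];
  if (!value) return null;

  const proxy = new URL(value);
  // The URL parser reports an empty port when none was given (or when it is
  // the scheme's default), so fall back to 80 for an http:// proxy URL.
  const port = proxy.port === "" ? 80 : Number(proxy.port);
  return { host: proxy.hostname, port: port };
}

const env = {
  http_proxy: "http://mycache.example.com:3128/",
  gopher_proxy: "http://mycache.example.com:3128/",
  ftp_proxy: "http://mycache.example.com:3128/",
};

console.log(proxyFor("ftp://ftp.example.org/pub/", env));
console.log(proxyFor("wais://db.example.org/", env));
```

A protocol with no matching variable yields null here, which corresponds to fetching that protocol directly.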

5.4 Redundant Auto-Proxy-Configuration

There's one nasty side-effect to using auto-proxy scripts: every time you start the web browser, it will try to load the auto-proxy script.

If your script isn't available, either because the webserver hosting the script is down or because your workstation can't reach the webserver (e.g. because you're working off-line with your notebook and just want to read a previously saved HTML file), you'll get different errors depending on the browser you use.

The Netscape browser will just return an error after a timeout (after that it tries to find the site 'www.proxy.com' if the script you use is called 'proxy.pac').

Microsoft Internet Explorer, on the other hand, won't even start: no window is displayed, and only after about one minute will it show a dialog asking whether to go on with or without proxy configuration.

The point is that your workstations always need to locate the proxy-script. I created some extra redundancy by hosting the script on two webservers (actually Apache webservers on the proxyservers themselves) and adding the following records to my primary nameserver:

    proxy   CNAME   proxy1
            CNAME   proxy2

The clients just refer to 'http://proxy/proxy.pac'. This script looks like this:

    function FindProxyForURL(url,host)
    {
        // Hostname without domainname or host within our own domain?
        // Try them directly:
        // http://www.domain.com actually lives before the firewall, so
        // make an exception:
        if ((isPlainHostName(host)||dnsDomainIs( host,".domain.com")) &&
            !localHostOrDomainIs(host, "www.domain.com"))
            return "DIRECT";

        // First try proxy1 then proxy2. One server mostly caches '.com'
        // to make sure both servers are not
        // caching the same data in the normal situation. The other
        // server caches the other domains normally.
        // If one of 'm is down the client will try the other server.
        else if (shExpMatch(host, "*.com"))
            return "PROXY proxy1.domain.com:8080; PROXY proxy2.domain.com:8081; DIRECT";
        return "PROXY proxy2.domain.com:8081; PROXY proxy1.domain.com:8080; DIRECT";
    }

I made sure every client domain has the appropriate 'proxy' entry. The clients are automatically configured with two nameservers using DHCP.

-- Rodney van den Oever
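The failover described above depends on the browser trying the entries of the returned string from left to right. The sketch below shows one way that ordering could be interpreted. Both parsePacResult() and chooseRoute() are hypothetical names, and the isUp() callback is a stand-in for the browser's own connection attempt. It is an illustration under those assumptions, not a description of any particular browser's implementation.

```javascript
// Split a result such as "PROXY a:8080; PROXY b:8081; DIRECT" into an
// ordered list of candidates.
function parsePacResult(result) {
  return result.split(";")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      const [type, address] = entry.split(/\s+/);
      return type === "DIRECT" ? { type } : { type, address };
    });
}

// Return the first candidate accepted by the caller-supplied isUp()
// predicate. DIRECT is always treated as usable.
function chooseRoute(result, isUp) {
  for (const candidate of parsePacResult(result)) {
    if (candidate.type === "DIRECT" || isUp(candidate.address)) return candidate;
  }
  return null;
}

const result = "PROXY proxy1.domain.com:8080; PROXY proxy2.domain.com:8081; DIRECT";

// Pretend proxy1 is down: the second entry is chosen.
console.log(chooseRoute(result, (addr) => addr !== "proxy1.domain.com:8080"));

// Pretend both proxies are down: the client falls back to DIRECT.
console.log(chooseRoute(result, () => false));
```

Ending the list with DIRECT, as the script above does, means clients still reach the network when both caches are unavailable.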

5.5 Microsoft Internet Explorer configuration

Select Options from the View menu. Click on the Connection tab. Tick the Connect through Proxy Server option and hit the Proxy Settings button. For each protocol that your Squid server supports (by default, HTTP, FTP, and gopher) enter the Squid server's hostname or IP address and put the HTTP port number for the Squid server (by default, 3128) in the Port column. For any protocols that your Squid does not support, leave the fields blank.

Here is a screen shot of the Internet Explorer proxy configuration screen.

Microsoft is also starting to support Netscape-style JavaScript automated proxy configuration. As of now, only MSIE version 3.0a for Windows 3.1 and Windows NT 3.51 supports this feature (i.e., as of version 3.01 build 1225 for Windows 95 and NT 4.0, the feature was not included).

If you have a version of MSIE that does have this feature, select Options from the View menu. Click on the Advanced tab. In the lower left-hand corner, click on the Automatic Configuration button. Fill in the URL for your JavaScript file in the dialog box it presents you. Then exit MSIE and restart it for the changes to take effect. MSIE will reload the JavaScript file every time it starts.

5.6 Netmanage Internet Chameleon WebSurfer configuration

Netmanage WebSurfer supports manual proxy configuration and exclusion lists for hosts or domains that should not be fetched via proxy (this information is current as of WebSurfer 5.0). Select Preferences from the Settings menu. Click on the Proxies tab. Select the Use Proxy options for HTTP, FTP, and gopher. For each protocol that your Squid server supports, enter the Squid server's hostname or IP address and put the HTTP port number for the Squid server (by default, 3128) in the Port boxes. For any protocols that your Squid does not support, leave the fields blank.

Take a look at this screen shot if the instructions confused you.

On the same configuration window, you'll find a button to bring up the exclusion list dialog box, which will let you enter some hosts or domains that you don't want fetched via proxy. It should be self-explanatory, but you might look at this screen shot just for fun anyway.

5.7 Opera 2.12 proxy configuration

Select Proxy Servers... from the Preferences menu. Check each protocol that your Squid server supports (by default, HTTP, FTP, and Gopher) and enter the Squid server's address as hostname:port (e.g. mycache.example.com:3128 or 123.45.67.89:3128). Click on Okay to accept the setup.

Notes:

-- Hume Smith

5.8 How can I make my users' browsers use my cache without configuring the browsers for proxying?

You can do transparent caching on Linux, Solaris, and BSD derivatives. The trick is to get the operating system to forward certain IP packets to the application. This document currently contains only instructions for configuring transparent caching on Linux and Solaris.

Here are the important settings in squid.conf:

    http_port 80
    icp_port 3130
    httpd_accel virtual 80
    httpd_accel_with_proxy on

Note, virtual is the magic word here! You don't necessarily need to use port 80, but the examples below assume that you will.

Transparent proxying for Solaris, SunOS, and BSD systems

See the IP Filter package pages.

Transparent proxying for Linux

by Rodney van den Oever

Note: Transparent proxying does NOT work with Linux 2.0.30! Linux 2.0.29 is known to work well.

Warning: this technique has several significant shortcomings!

  1. The access.log will not show hostnames in the URLs. Instead it prints raw IP addresses. This is because the destination address is determined with the getsockname(2) system call. This means the use of a parent or sibling doesn't work correctly anymore. The parent or sibling itself logs the URL by name, not by IP address. These URLs are different, so no cache HIT occurs. This means that you lose the benefit of reducing traffic in a caching hierarchy if you do transparent caching.
  2. This method only supports the HTTP protocol, not gopher or FTP. Since the browser wasn't set up to use a proxy server, it uses the FTP protocol (with destination port 21) and not the required HTTP protocol. You can't set up a redirection rule to the proxy server since the browser is speaking the wrong protocol. A similar problem occurs with gopher. Normally all proxy requests are translated by the client into the HTTP protocol, but since the client isn't aware of the redirection, this never happens.
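The cache-hierarchy problem in the warning above can be made concrete with a small sketch. It assumes a parent cache that looks objects up by exact URL. The address 192.0.2.10 for www.example.com and the function names are invented for the example.

```javascript
// What a transparent proxy can reconstruct: only the destination IP address
// of the intercepted connection and the request path.
function transparentUrl(destIp, path) {
  return "http://" + destIp + path;
}

// What a browser configured for proxying sends: the full URL, host name
// included.
function proxiedUrl(hostName, path) {
  return "http://" + hostName + path;
}

// A parent cache keyed by URL, already populated by clients that use it
// as an ordinary (non-transparent) proxy.
const parentCache = new Map();
parentCache.set(proxiedUrl("www.example.com", "/index.html"), "<cached page>");

// 192.0.2.10 is an assumed address for www.example.com.
const askedByTransparentProxy = transparentUrl("192.0.2.10", "/index.html");

console.log(parentCache.has(askedByTransparentProxy));             // false
console.log(parentCache.has("http://www.example.com/index.html")); // true
```

The same object is stored under the name-based URL but requested under the address-based one, so the lookup in the parent misses.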

If you can live with the side-effects, go ahead and compile your kernel with firewalling and redirection support. Here are the important parameters from /usr/src/linux/.config:

    #
    # Code maturity level options
    #
    CONFIG_EXPERIMENTAL=y
    #
    # Networking options
    #
    CONFIG_FIREWALL=y
    # CONFIG_NET_ALIAS is not set
    CONFIG_INET=y
    CONFIG_IP_FORWARD=y
    # CONFIG_IP_MULTICAST is not set
    CONFIG_IP_FIREWALL=y
    # CONFIG_IP_FIREWALL_VERBOSE is not set
    CONFIG_IP_MASQUERADE=y
    CONFIG_IP_TRANSPARENT_PROXY=y
    CONFIG_IP_ALWAYS_DEFRAG=y
    # CONFIG_IP_ACCT is not set
    CONFIG_IP_ROUTER=y

Go to the Linux IP Firewall and Accounting page, obtain the source distribution of ipfwadm and install it. You'll use ipfwadm to set up the redirection rules. I added this rule to the script that runs from /etc/rc.d/rc.inet1 (Slackware), which sets up the interfaces at boot-time. The redirection should be done before any other Input-accept rule. To really make sure it worked I disabled the forwarding (masquerading) I normally do.

/etc/rc.d/rc.firewall:

    #!/bin/sh
    # rc.firewall   Linux kernel firewalling rules
    FW=/sbin/ipfwadm

    # Flush rules, for testing purposes
    for i in I O F # A      # If we enabled accounting too
    do
        ${FW} -$i -f
    done

    # Default policies:
    ${FW} -I -p rej         # Incoming policy: reject (quick error)
    ${FW} -O -p acc         # Output policy: accept
    ${FW} -F -p den         # Forwarding policy: deny

    # Input Rules:

    # Loopback-interface (local access, eg, to local nameserver):
    ${FW} -I -a acc -S localhost/32 -D localhost/32

    # Local Ethernet-interface:

    # Redirect to Squid proxy server:
    ${FW} -I -a acc -P tcp -D default/0 80 -r 80

    # Accept packets from local network:
    ${FW} -I -a acc -P all -S localnet/8 -D default/0 -W eth0

    # Only required for other types of traffic (FTP, Telnet):

    # Forward localnet with masquerading (udp and tcp, no icmp!):
    ${FW} -F -a m -P tcp -S localnet/8 -D default/0
    ${FW} -F -a m -P udp -S localnet/8 -D default/0

Here all TCP traffic from the local LAN to port 80 at any destination gets redirected to the local port 80. Rules can be viewed like this:

    IP firewall input rules, default policy: reject
    type  prot source               destination          ports
    acc   all  127.0.0.1            127.0.0.1            n/a
    acc/r tcp  10.0.0.0/8           0.0.0.0/0            * -> 80 => 80
    acc   all  10.0.0.0/8           0.0.0.0/0            n/a
    acc   tcp  0.0.0.0/0            0.0.0.0/0            * -> *

I did some testing on Windows 95 with both Microsoft Internet Explorer 3.01 and Netscape Communicator pre-release and it worked with both browsers with the proxy-settings disabled.

At one time squid seemed to get in a loop when I pointed the browser to the local port 80. But this could be avoided by adding a reject rule for clients to this address:

    ${FW} -I -a rej -P tcp -S localnet/8 -D hostname/32 80

    IP firewall input rules, default policy: reject
    type  prot source               destination          ports
    acc   all  127.0.0.1            127.0.0.1            n/a
    rej   tcp  10.0.0.0/8           10.0.0.1             * -> 80
    acc/r tcp  10.0.0.0/8           0.0.0.0/0            * -> 80 => 80
    acc   all  10.0.0.0/8           0.0.0.0/0            n/a
    acc   tcp  0.0.0.0/0            0.0.0.0/0            * -> *

NOTE on resolving names: Instead of just passing the URLs to the proxy server, the browser itself has to resolve the URLs. Make sure the workstations are set up to query a local nameserver, to minimize outgoing traffic.
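The position of the reject rule matters. The following sketch models the input rule listing with the reject rule included, on the assumption that rules are checked in the order listed and that the first matching rule decides the outcome. The rule objects and the evaluate() function are invented for this illustration and are not part of ipfwadm.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit number.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True when ip lies inside the network written as "a.b.c.d/bits".
function inCidr(ip, cidr) {
  const [network, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  if (bits === 0) return true;
  const size = Math.pow(2, 32 - bits);
  return Math.floor(ipToInt(ip) / size) === Math.floor(ipToInt(network) / size);
}

// Input rules in listing order, including the reject rule for the proxy's
// own address (10.0.0.1, as in the listing).
const inputRules = [
  { action: "acc",   proto: "all", src: "127.0.0.1/32", dst: "127.0.0.1/32" },
  { action: "rej",   proto: "tcp", src: "10.0.0.0/8",   dst: "10.0.0.1/32", port: 80 },
  { action: "acc/r", proto: "tcp", src: "10.0.0.0/8",   dst: "0.0.0.0/0",   port: 80 },
  { action: "acc",   proto: "all", src: "10.0.0.0/8",   dst: "0.0.0.0/0" },
  { action: "acc",   proto: "tcp", src: "0.0.0.0/0",    dst: "0.0.0.0/0" },
];

// Return the action of the first matching rule, or the default policy.
function evaluate(packet, rules) {
  for (const rule of rules) {
    if (rule.proto !== "all" && rule.proto !== packet.proto) continue;
    if (!inCidr(packet.src, rule.src) || !inCidr(packet.dst, rule.dst)) continue;
    if (rule.port !== undefined && rule.port !== packet.port) continue;
    return rule.action;
  }
  return "rej"; // default input policy: reject
}

const toWeb = { proto: "tcp", src: "10.1.2.3", dst: "192.0.2.10", port: 80 };
const toProxyItself = { proto: "tcp", src: "10.1.2.3", dst: "10.0.0.1", port: 80 };

console.log(evaluate(toWeb, inputRules));         // redirected to Squid
console.log(evaluate(toProxyItself, inputRules)); // rejected, so no loop
```

Under this model, a request aimed directly at the proxy's own port 80 hits the reject rule before it can reach the redirect rule. If the reject rule were removed, the same packet would match the redirect rule instead.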

If you're already running a nameserver at the firewall or proxy server (which is a good idea anyway IMHO) let the workstations use this nameserver.

Additional notes from Richard Ayres

I'm using such a setup. The only issues so far have been that:

  1. It's fairly useless to use my service provider's parent caches (cache-?.www.demon.net) because, by proxying, squid only sees IP addresses, not host names, and demon aren't generally asked for IP addresses by other users;
  2. Linux kernel 2.0.30 is a no-no as transparent proxying is broken (I use 2.0.29);
  3. Client browsers must do host name lookups themselves, as they don't know they're using a proxy;
  4. The Microsoft Network won't authorize its users through a proxy, so I have to specifically *not* redirect those packets (my company is a MSN content provider).

Aside from this, I get a 30-40% hit rate on a 50MB cache for 30-40 users and am quite pleased with the results.

Transparent proxying with Cisco

by John Saunders

This works with at least IOS 11.1 and later, I guess. It may work with earlier versions; I'm no Cisco expert, so I can't say for sure. If your router is doing anything more complicated than shuffling packets between an ethernet interface and either a serial port or BRI port, then you should work through whether this will work for you.

First define a route map with a name of proxy-redirect (name doesn't matter) and specify the next hop to be the machine Squid runs on.

    !
    route-map proxy-redirect permit 10
     match ip address 110
     set ip next-hop 203.24.133.2
    !

Define an access list to trap HTTP requests. The first line allows the Squid host direct access so a routing loop is not formed.

    !
    access-list 110 deny   tcp host 203.24.133.2 any eq www
    access-list 110 permit tcp any any eq www
    !

Apply the route map to the ethernet interface.

    !
    interface Ethernet0
     ip policy route-map proxy-redirect
    !
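The effect of the route map and access list can be summarised in a few lines of code. This is only a model of the decision the router makes, written under the assumption that "www" means TCP port 80 and that packets not permitted by the access list follow the normal routing table. The function names are invented for the sketch.

```javascript
const SQUID_HOST = "203.24.133.2";

// Model of access-list 110: entries are checked in order, the first match
// wins, and anything that matches no entry is denied.
function accessList110(packet) {
  const isWww = packet.proto === "tcp" && packet.dstPort === 80;
  if (isWww && packet.src === SQUID_HOST) return "deny";   // Squid's own requests
  if (isWww) return "permit";                              // everyone else's HTTP
  return "deny";
}

// Model of route-map proxy-redirect: permitted packets are sent to the Squid
// host; all others are routed normally (represented here by null).
function policyNextHop(packet) {
  return accessList110(packet) === "permit" ? SQUID_HOST : null;
}

console.log(policyNextHop({ proto: "tcp", src: "203.24.133.50", dstPort: 80 }));
console.log(policyNextHop({ proto: "tcp", src: SQUID_HOST, dstPort: 80 }));
console.log(policyNextHop({ proto: "tcp", src: "203.24.133.50", dstPort: 25 }));
```

The deny entry for the Squid host is what prevents the loop: without it, the cache's own outgoing requests would be sent straight back to the cache.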

