SQUID Frequently Asked Questions: Troubleshooting


10. Troubleshooting

10.1 Why am I getting ``Proxy Access Denied?''

If Squid is in httpd-accelerator mode, it will accept normal HTTP requests and forward them to an HTTP server, but it will not honor proxy requests. If you want your cache to also accept proxy-HTTP requests, then you must enable this feature:

        http_accel_with_proxy on

Alternatively, you may have misconfigured one of your ACLs. Check the access.log and squid.conf files for clues.

10.2 I can't get `local_domain` to work; Squid is caching the objects from my local servers.

The local_domain directive does not prevent local objects from being cached. It prevents the use of sibling caches when fetching local objects. If you want to prevent objects from being cached, use the cache_stoplist or http_stop configuration options (depending on your version).

10.3 I get `Connection Refused` when the cache tries to retrieve an object located on a sibling, even though the sibling thinks it delivered the object to my cache.

If the HTTP port number is wrong but the ICP port is correct, you will send ICP queries correctly, and the ICP replies will fool your cache into thinking the configuration is correct. Large objects will then fail, since you don't have the correct HTTP port for the sibling in your squid.conf file. If your sibling changed their http_port, you could have this problem for some time before noticing.

10.4 Running out of filedescriptors

If you see the Too many open files error message, you are most likely running out of file descriptors. This may be due to running Squid on an operating system with a low filedescriptor limit. This limit is often configurable in the kernel or with other system tuning tools. There are two ways to run out of file descriptors: first, you can hit the per-process limit on file descriptors. Second, you can hit the system limit on total file descriptors for all processes.
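As a quick way to see the per-process limit that a newly started process would inherit, the sketch below reads it with Python's standard resource module (Unix only). It reports only the per-process limit, not the system-wide total, and the helper names fd_limits and describe are illustrative and not part of Squid.

```python
import resource


def fd_limits():
    """Return the (soft, hard) per-process limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```

Run it as the same user, and from the same shell or startup script, that launches Squid, since limits are inherited from the parent process.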

For Linux, have a look at filehandle.patch.linux by Michael O'Reilly.

For Solaris, add the following to your /etc/system file to increase your maximum file descriptors per process:

        set rlim_fd_max = 4096
        set rlim_fd_cur = 1024

You should also #define SQUID_FD_SETSIZE in include/config.h to whatever you set rlim_fd_max to. Going beyond 4096 may break things in the kernel.

Solaris' select(2) only handles 1024 descriptors, so if you need more, edit src/Makefile and enable $(USE_POLL_OPT). Then recompile Squid.

For FreeBSD (by Torsten Sturm <torsten.sturm@axis.de>):

How do I check my maximum filedescriptors?
Do sysctl -a and look for the value of kern.maxfilesperproc.
How do I increase them?

        sysctl -w kern.maxfiles=XXXX
        sysctl -w kern.maxfilesperproc=XXXX

Warning: You probably want maxfiles > maxfilesperproc if you're going to be pushing the limit.
What is the upper limit?
I don't think there is a formal upper limit inside the kernel. All the data structures are dynamically allocated. In practice there might be unintended metaphenomena (kernel spending too much time searching tables, for example).

For most BSD-derived systems (SunOS, 4.4BSD, OpenBSD, FreeBSD, NetBSD, BSD/OS, 386BSD, Ultrix) you can also use the ``brute force'' method to increase these values in the kernel (requires a kernel rebuild):

How do I check my maximum filedescriptors?
Do pstat -T and look for the files value, typically expressed as the ratio of current/maximum.
How do I increase them the easy way?
One way is to increase the value of the maxusers variable in the kernel configuration file and build a new kernel. This method is quick and easy but also has the effect of increasing a wide variety of other variables that you may not need or want increased.
Is there a more precise method?
Another way is to find the param.c file in your kernel build area and change the arithmetic behind the relationship between maxusers and the maximum number of open files.

Here are a few examples which should lead you in the right direction:

SunOS
Change the value of nfile in /usr/kvm/sys/conf.common/param.c by altering this equation:

        int nfile = 16 * (NPROC + 16 + MAXUSERS) / 10 + 64;

Where NPROC is defined by:

        #define NPROC (10 + 16 * MAXUSERS)
FreeBSD (from the 2.1.6 kernel)
Very similar to SunOS, edit /usr/src/sys/conf/param.c and alter the relationship between maxusers and the maxfiles and maxfilesperproc variables:

        int maxfiles = NPROC*2;
        int maxfilesperproc = NPROC*2;

Where NPROC is defined by:

        #define NPROC (20 + 16 * MAXUSERS)

The per-process limit can also be adjusted directly in the kernel configuration file with the following directive:

        options OPEN_MAX=128
BSD/OS (from the 2.1 kernel)
Edit /usr/src/sys/conf/param.c and adjust the maxfiles math here:

        int maxfiles = 3 * (NPROC + MAXUSERS) + 80;

Where NPROC is defined by:

        #define NPROC (20 + 16 * MAXUSERS)

You should also set the OPEN_MAX value in your kernel configuration file to change the per-process limit.
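To get a feel for how maxusers drives these values, the sketch below re-evaluates the three formulas quoted above in Python. It only mirrors the arithmetic as quoted (using // to imitate C integer division); the function names are made up for illustration, and your kernel's param.c may differ.

```python
def sunos_nfile(maxusers):
    """nfile as quoted for SunOS: NPROC = 10 + 16 * MAXUSERS."""
    nproc = 10 + 16 * maxusers
    # C evaluates 16 * (...) first, then the truncating division by 10.
    return 16 * (nproc + 16 + maxusers) // 10 + 64


def freebsd_maxfiles(maxusers):
    """maxfiles as quoted for FreeBSD 2.1.6: NPROC = 20 + 16 * MAXUSERS."""
    nproc = 20 + 16 * maxusers
    return nproc * 2


def bsdos_maxfiles(maxusers):
    """maxfiles as quoted for BSD/OS 2.1: NPROC = 20 + 16 * MAXUSERS."""
    nproc = 20 + 16 * maxusers
    return 3 * (nproc + maxusers) + 80


if __name__ == "__main__":
    for maxusers in (32, 64, 128):
        print(maxusers, sunos_nfile(maxusers),
              freebsd_maxfiles(maxusers), bsdos_maxfiles(maxusers))
```

For example, with maxusers set to 64 the quoted formulas give 1846 files on SunOS, 2088 on FreeBSD, and 3404 on BSD/OS.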

NOTE: After you rebuild/reconfigure your kernel with more filedescriptors, you must then recompile Squid. Squid's configure script determines how many filedescriptors are available, so you must make sure the configure script runs again as well. For example:

        cd squid-1.1.x
        make realclean
        ./configure --prefix=/usr/local/squid
        make

10.5 My squid dies periodically, and I see log entries complaining about being unable to `malloc(3)` more memory, but my system has lots of RAM available!

In addition to maximum file descriptor limits, many systems also have limits on the maximum amount of memory that can be devoted to a process, especially for non-root processes. BSD/OS happens to have a fairly low limit which you may want to increase. Edit your kernel configuration file and change (or add) these lines as appropriate:

        options DFLDSIZ=67108864    # 64 meg default max data size (was 16)
        options MAXDSIZ=134217728   # 128 meg max data size (was 64)

This method requires a kernel rebuild and reboot.

To increase the data size for Digital UNIX, edit the file /etc/sysconfigtab and add the entry:

        proc:
                per-proc-data-size=1073741824

Or, with csh, use the limit command, such as:

        zpoprp.zpo.dec.com > limit datasize 1024M

Editing /etc/sysconfigtab requires a reboot, but the limit command doesn't.
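To check what data-size limit a process actually inherits, a minimal sketch using Python's standard resource module (Unix only) is shown below. The helper names are illustrative; to_mb simply converts the byte values used in the settings above into megabytes.

```python
import resource


def to_mb(size_in_bytes):
    """Convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return size_in_bytes / (1024 * 1024)


def data_size_limit_mb():
    """Return the soft data-segment limit in megabytes, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_DATA)
    if soft == resource.RLIM_INFINITY:
        return None
    return to_mb(soft)


if __name__ == "__main__":
    limit = data_size_limit_mb()
    print("data size limit:", "unlimited" if limit is None else f"{limit:.0f} MB")
```

As a sanity check on the numbers above, to_mb(67108864) is 64 and to_mb(1073741824) is 1024.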

10.6 What are these strange lines about removing objects?

For example:

        97/01/23 22:31:10| Removed 1 of 9 objects from bucket 3913
        97/01/23 22:33:10| Removed 1 of 5 objects from bucket 4315
        97/01/23 22:35:40| Removed 1 of 14 objects from bucket 6391

These log entries are normal, and do not indicate that squid has reached cache_swap_high.

Consult your cache information page in cachemgr.cgi for a line like this:

Storage LRU Expiration Age: 364.01 days

Objects which have not been used for that amount of time are removed as a part of the regular maintenance. You can set an upper limit on the LRU Expiration Age value with reference_age in the config file.

10.7 Why can't I set `cache_effective_user` to `nobody` under Linux?

Some users have reported that setting cache_effective_user to nobody under Linux does not work, and the server reports:

        FATAL: Don't run Squid as root, set 'cache_effective_user'!

However, it appears that using any cache_effective_user other than nobody will succeed. One solution is to create a user account for Squid and set cache_effective_user to that.

Alternately you can change the UID for the nobody account from 65535 to 65534.

10.8 Can I change a Windows NT FTP server to list directories in Unix format?

Why, yes you can! Select the following menus:

Start
Programs
Microsoft Internet Server (Common)
Internet Service Manager

This will bring up a box with icons for your various services. One of them should be a little ftp ``folder.'' Double click on this.

You will then have to select the server (there should only be one). Select it, then choose ``Properties'' from the menu and choose the ``directories'' tab along the top.

There will be an option at the bottom saying ``Directory listing style.'' Choose the ``Unix'' type, not the ``MS-DOS'' type.

--Oskar Pearson <oskar@is.co.za>

10.9 Why does Squid use so much memory!?

One reason that Squid is fast and able to handle a lot of requests with a single process is because it uses a lot of memory. First, please see these other related FAQ entries:

Many users have found improved performance when linking Squid with an external malloc library. See Using GNU malloc.

10.10 Why am I getting ``Ignoring MISS from non-peer x.x.x.x?''

You are receiving ICP MISSes (via UDP) from a parent or sibling cache whose IP address your cache does not know about. This may happen in two situations.

If the peer is multihomed, it is sending packets out an interface which is not advertised in the DNS. Unfortunately, this is a configuration problem at the peer site. You can tell them to either add the interface's IP address to their DNS, or use Squid's udp_outgoing_address option to force the replies out a specific interface. For example, in your parent's squid.conf:

        udp_outgoing_address proxy.parent.com

and in your squid.conf:

        cache_host proxy.parent.com parent 3128 3130

You can also see this warning when sending ICP queries to multicast addresses. For security reasons, Squid requires your configuration to list all other caches listening on the multicast group address. If an unknown cache listens to that address and sends replies, your cache will log the warning message. To fix this situation, either tell the unknown cache to stop listening on the multicast address, or if they are legitimate, add them to your configuration file.

10.11 DNS lookups for domain names with underscores (_) always fail.

The standards for naming hosts (RFC 952, RFC 1101) do not allow underscores in domain names:

A "name" (Net, Host, Gateway, or Domain name) is a text string up to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign (-), and period (.).

The resolver library that ships with recent versions of BIND enforces this restriction, returning an error for any host with underscore in the hostname. The best solution is to complain to the hostmaster of the offending site, and ask them to rename their host.
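To spot offending names in a list of hosts before they reach the resolver, something like the following sketch can help. It checks only the character set quoted above from RFC 952 (letters, digits, minus sign, period); it does not check length or any other naming rule, and the function name is hypothetical.

```python
import string

# Characters permitted by the RFC 952 text quoted above.
ALLOWED = set(string.ascii_letters + string.digits + "-.")


def illegal_characters(hostname):
    """Return the disallowed characters in hostname, in order of first appearance."""
    seen = []
    for ch in hostname:
        if ch not in ALLOWED and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    for name in ("www.example.com", "my_host.example.com"):
        bad = illegal_characters(name)
        print(name, "ok" if not bad else "illegal characters: " + "".join(bad))
```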

10.12 Why am I getting access denied from a sibling cache?

The answer to this is somewhat complicated, so please hold on. NOTE: most of this text is taken from ICP and the Squid Web Cache.

An ICP query does not include any parent or sibling designation, so the receiver really has no indication of how the peer cache is configured to use it. This issue becomes important when a cache is willing to serve cache hits to anyone, but only handle cache misses for its paying users or customers. In other words, whether or not to allow the request depends on if the result is a hit or a miss. To accomplish this, Squid acquired the miss_access feature in October of 1996.

The necessity of ``miss access'' makes life a little bit complicated, and not only because it was awkward to implement. Miss access means that the ICP query reply must be an extremely accurate prediction of the result of a subsequent HTTP request. Ascertaining this result is actually very hard, if not impossible to do, since the ICP request cannot convey the full HTTP request. Additionally, there are more types of HTTP request results than there are for ICP. The ICP query reply will either be a hit or miss. However, the HTTP request might result in a ``304 Not Modified'' reply sent from the origin server. Such a reply is not strictly a hit since the peer needed to forward a conditional request to the source. At the same time, it's not strictly a miss either since the local object data is still valid, and the Not-Modified reply is quite small.

One serious problem for cache hierarchies is mismatched freshness parameters. Consider a cache C using ``strict'' freshness parameters so its users get maximally current data. C has a sibling S with less strict freshness parameters. When an object is requested at C, C might find that S already has the object via an ICP query and ICP HIT response. C then retrieves the object from S.

In an HTTP/1.0 world, C (and C's client) will receive an object that was never subject to its local freshness rules. Neither HTTP/1.0 nor ICP provides any way to ask only for objects less than a certain age. If the retrieved object is stale by C's rules, it will be removed from C's cache, but it will subsequently be fetched from S so long as it remains fresh there. This configuration miscoupling problem is a significant deterrent to establishing both parent and sibling relationships.

HTTP/1.1 provides numerous request headers to specify freshness requirements, which actually introduces a different problem for cache hierarchies: ICP still does not include any age information, neither in query nor reply. So S may return an ICP HIT if its copy of the object is fresh by its configuration parameters, but the subsequent HTTP request may result in a cache miss due to any Cache-control: headers originated by C or by C's client. Situations now emerge where the ICP reply no longer matches the HTTP request result.
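The mismatch described above can be illustrated with a toy model. This is not Squid's freshness algorithm: a single maximum age per cache stands in for its refresh rules, all ages are in seconds, and the function names are invented for the example.

```python
def icp_reply(object_age, sibling_max_age):
    """ICP carries no age information, so S answers using only its own rules."""
    return "HIT" if object_age <= sibling_max_age else "MISS"


def http_result(object_age, sibling_max_age, request_max_age=None):
    """The HTTP request may carry a max-age limit that S has to honor as well."""
    limit = sibling_max_age
    if request_max_age is not None:
        limit = min(limit, request_max_age)
    return "HIT" if object_age <= limit else "MISS"


if __name__ == "__main__":
    age = 3600            # S's copy is one hour old
    s_max_age = 86400     # S considers objects fresh for a day
    c_max_age = 600       # C (or its client) accepts at most ten minutes
    print("ICP reply:  ", icp_reply(age, s_max_age))
    print("HTTP result:", http_result(age, s_max_age, c_max_age))
```

With these numbers S answers the ICP query with a hit, yet the HTTP request that follows is a miss at S. If S also restricts misses with miss_access, that is exactly the case where C sees ``access denied.''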

In the end, the fundamental problem is that the ICP query does not provide enough information to accurately predict whether the HTTP request will be a hit or miss. In fact, the current ICP Internet Draft is very vague on this subject. What does ICP HIT really mean? Does it mean ``I know a little about that URL and have some copy of the object?'' Or does it mean ``I have a valid copy of that object and you are allowed to get it from me?''

So, what can be done about this problem? We really need to change ICP so that freshness parameters are included. Until that happens, the members of a cache hierarchy have only two options to totally eliminate the ``access denied'' messages from sibling caches:

Make sure all members have the same refresh_rules parameters.
Do not use miss_access at all. Promise your sibling cache administrator that your cache is properly configured and that you will not abuse their generosity. The sibling cache administrator can check his log files to make sure you are keeping your word.

If neither of these is realistic, then the sibling relationship should not exist.

10.13 Cannot bind socket FD NN to *:8080 (125) Address already in use

This means that another process is already listening on port 8080 (or whatever you're using). It could mean that you have a Squid process already running, or it could be from another program. To verify, use the netstat command:

        netstat -naf inet | grep LISTEN

That will show all sockets in the LISTEN state. You might also try

        netstat -naf inet | grep 8080

If you find that some process has bound to your port, but you're not sure which process it is, you might be able to use the excellent lsof program. It will show you which processes own every open file descriptor on your system.
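If netstat is not handy, a rough equivalent of the check is to try binding the port yourself. The sketch below does that with Python's standard socket module; port_in_use is a hypothetical helper, and it tells you only that the port is taken, not by which process.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding to (host, port) fails with 'Address already in use'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as err:
            if err.errno == errno.EADDRINUSE:
                return True
            raise  # e.g. permission denied for privileged ports
    return False


if __name__ == "__main__":
    print("port 8080 in use:", port_in_use(8080))
```

The probe socket is closed again immediately, so a False result means the port was free at the moment of the check.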

10.14 icpDetectClientClose: ERROR xxx.xxx.xxx.xxx: (32) Broken pipe

This means that the client socket was closed by the client before Squid was finished sending data to it. Squid detects this by trying to read(2) some data from the socket. If the read(2) call fails, then Squid knows the socket has been closed. Normally the read(2) call returns ECONNRESET: Connection reset by peer, and these are NOT logged. Any other error messages (such as EPIPE: Broken pipe) are logged to cache.log. See the ``intro'' of section 2 of your Unix manual for a list of all error codes.

10.15 How come Squid doesn't work with NTLM Authorization?

We are not sure. We were unable to find any detailed information on NTLM (thanks Microsoft!), but here is our best guess:

Squid transparently passes the NTLM request and response headers between clients and servers. The encrypted challenge and response strings most likely encode the IP address of the client. Because the proxy is passing these strings and is connected with a different IP address, the authentication scheme breaks down. This implies that if NTLM authentication works at all with proxy caches, the proxy would need to intercept the NTLM headers and process them itself.

If anyone knows more about NTLM and knows the above to be false, please let us know.

10.16 The default parent option isn't working!

This message was received at squid-bugs:

If you have only one parent, configured as:

        cache_host xxxx parent 3128 3130 no-query default

nothing is sent to the parent; neither UDP packets, nor TCP connections.

Simply adding default to a parent does not force all requests to be sent to that parent. The term default is perhaps a poor choice of words. A default parent is only used as a last resort. If the cache is able to make direct connections, direct will be preferred over default. If you want to force all requests to your parent cache(s), use the inside_firewall option:

        inside_firewall none

10.17 ``Hot Mail'' complains about: Intrusion Logged. Access denied.

``Hot Mail'' is proxy-unfriendly and requires all requests to come from the same IP address. You can fix this by adding to your squid.conf:

        hierarchy_stoplist hotmail.com

10.18 My Squid becomes very slow after it has been running for some time.

This is most likely because Squid is using more memory than it should be for your system. When the Squid process becomes large, it experiences a lot of paging. This will very rapidly degrade the performance of Squid. Memory usage is a complicated problem. There are a number of things to consider.

First, examine the Cache Manager Info output and look at these two lines:

        Number of TCP connections:      121104
        Page faults with physical i/o:  16720

Note, if your system does not have the getrusage() function, then you will not see the page faults line.

Divide the number of page faults by the number of connections. In this case 16720/121104 = 0.14. Ideally this ratio should be in the 0.0 - 0.1 range. It may be acceptable to be in the 0.1 - 0.2 range. Above that, however, and you will most likely find that Squid's performance is unacceptably slow.
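The same arithmetic can be scripted if you check this regularly. The sketch below uses the thresholds from the paragraph above; how the exact boundary values 0.1 and 0.2 are classified is an assumption, and the function names are illustrative.

```python
def page_fault_ratio(page_faults, connections):
    """Page faults with physical i/o divided by the number of TCP connections."""
    return page_faults / connections


def assess(ratio):
    """Classify the ratio using the ranges given in the text."""
    if ratio <= 0.1:
        return "ideal"
    if ratio <= 0.2:
        return "acceptable"
    return "too high"


if __name__ == "__main__":
    ratio = page_fault_ratio(16720, 121104)
    print(f"ratio = {ratio:.2f} ({assess(ratio)})")
```

For the figures shown above, the ratio rounds to 0.14, which falls in the range that may be acceptable.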

If the ratio is too high, you will need to make some changes to lower the amount of memory Squid uses. There are a number of things to try:

Buy more memory for your system.
Try a different malloc library, such as GNU malloc.
Reduce the cache_mem parameter in the config file.
Turn the memory_pools off in the config file.
Reduce the cache_swap parameter in your config file. This will reduce the number of objects Squid keeps. Your hit ratio may go down a little, but your cache will perform better.
Reduce the maximum_object_size parameter. You won't be able to cache the larger objects, and your byte volume hit ratio may go down, but Squid will perform better overall.
Try the ``NOVM'' version of Squid.

10.19 WARNING: Failed to start 'dnsserver'

This could be a permission problem. Does the Squid userid have permission to execute the dnsserver program?

You might also try testing dnsserver from the command line:

        > echo oceana.nlanr.net | ./dnsserver

Should produce something like:

        $name oceana.nlanr.net
        $h_name oceana.nlanr.net
        $h_len 4
        $ipcount 1
        132.249.40.200
        $aliascount 0
        $ttl 82067
        $end

10.20 Sending in Squid bug reports

Bug reports for Squid should be sent to the squid-bugs alias. Any bug report must include

The Squid version
Your Operating System type and version

Crashes and core dumps

There are two conditions under which Squid will exit abnormally and generate a core dump. First, a SIGSEGV or SIGBUS signal will cause Squid to exit and dump core. Second, many functions include consistency checks. If one of those checks fails, Squid calls abort() to generate a core dump.

The core dump file will be left in either one of two locations:

The current directory when Squid was started
The first cache_dir directory if you have used the cache_effective_user option.

If you cannot find a core file, then either Squid does not have permission to write in its current directory, or perhaps your shell limits (csh and clones) are preventing the core file from being written. If you suspect the current directory is not writable, you can add


        cd /tmp

to your script which starts Squid (e.g. RunCache).

Once you have located the core dump file, use a debugger such as dbx or gdb to generate a stack trace:

        tirana-wessels squid/src 270% gdb squid /T2/Cache/core
        GDB is free software and you are welcome to distribute copies of it
         under certain conditions; type "show copying" to see the conditions.
        There is absolutely no warranty for GDB; type "show warranty" for details.
        GDB 4.15.1 (hppa1.0-hp-hpux10.10), Copyright 1995 Free Software Foundation, Inc...
        Core was generated by `squid'.
        Program terminated with signal 6, Aborted.

        [...]

        (gdb) where
        #0  0xc01277a8 in _kill ()
        #1  0xc00b2944 in _raise ()
        #2  0xc007bb08 in abort ()
        #3  0x53f5c in __eprintf (string=0x7b037048 "", expression=0x5f <Address 0x5f out of bounds>, line=8, filename=0x6b <Address 0x6b out of bounds>)
        #4  0x29828 in fd_open (fd=10918, type=3221514150, desc=0x95e4 "HTTP Request") at fd.c:71
        #5  0x24f40 in comm_accept (fd=2063838200, peer=0x7b0390b0, me=0x6b) at comm.c:574
        #6  0x23874 in httpAccept (sock=33, notused=0xc00467a6) at client_side.c:1691
        #7  0x25510 in comm_select_incoming () at comm.c:784
        #8  0x25954 in comm_select (sec=29) at comm.c:1052
        #9  0x3b04c in main (argc=1073745368, argv=0x40000dd8) at main.c:671

If possible, you might keep the coredump file around for a day or two. It is often helpful if we can ask you to send additional debugger output, such as the contents of some variables.

Non-fatal bugs

If you find a non-fatal bug, such as incorrect HTTP processing, please send us a section of your cache.log with full debugging to demonstrate the problem. The cache.log file can become very large, so alternatively, you may want to copy it to an FTP or HTTP server where we can download it.

To enable full debugging on a running Squid process, use the -k debug command line option:

        % ./squid -k debug

Use the same command to restore Squid to normal debugging.

10.21 fork: (12) Cannot allocate memory

When Squid is reconfigured (SIGHUP) or the logs are rotated (SIGUSR1), some of the helper processes (ftpget, dnsserver) must be killed and restarted. If your system does not have enough virtual memory, the Squid process may not be able to fork to start the new helper processes. The best way to fix this is to increase your virtual memory by adding swap space. Normally your system uses raw disk partitions for swap space, but most operating systems also support swapping on regular files (Digital Unix excepted). See your system manual pages for swap, swapon, and mkfile.

10.22 FATAL: ipcache_init: DNS name lookup tests failed

Squid normally tests your system's DNS configuration before it starts serving requests. Squid tries to resolve some common DNS names, as defined in the dns_testnames configuration directive. If Squid cannot resolve these names, it could mean that your DNS nameserver is unreachable or not running, or your /etc/resolv.conf file may contain incorrect information.

To disable this feature, use the -D command line option.

Note, Squid does NOT use the dnsservers to test the DNS. The test is performed internally, before the dnsservers start.
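To check from outside Squid whether the host can resolve a set of names, a rough approximation is sketched below. It uses the system resolver through Python's standard socket module, which is not the same code path as Squid's internal test, and the hostnames shown are placeholders rather than the actual dns_testnames defaults.

```python
import socket


def failed_lookups(names, resolve=socket.gethostbyname):
    """Return the names that the given resolver function could not look up."""
    failed = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Placeholder names; substitute the ones from your dns_testnames line.
    names = ["www.example.com", "www.example.org"]
    failed = failed_lookups(names)
    print("all names resolved" if not failed else "failed: " + ", ".join(failed))
```

The resolve parameter exists only so the function can be exercised without a working network; in normal use the default is enough.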

10.23 FATAL: Failed to make swap directory /var/spool/cache: (13) Permission denied

Starting with version 1.1.15, we have required that you first run squid -z to create the swap directories on your filesystem. If you have set the cache_effective_user option, then the Squid process takes on the given userid before making the directories. If the cache_dir directory (e.g. /var/spool/cache) does not exist, and the Squid userid does not have permission to create it, then you will get the ``permission denied'' error. This can be fixed simply by creating the cache directory manually:

        # mkdir /var/spool/cache
        # chown <userid>:<groupid> /var/spool/cache
        # squid -z

Alternatively, if the directory already exists, then your operating system may be returning ``Permission Denied'' instead of ``File Exists'' on the mkdir() system call. This patch by Miquel van Smoorenburg should fix it.
