24.16 Trimming a Huge Directory

Some implementations of the BSD fast filesystem never truncate directories. That is, when you delete a file, the filesystem marks its directory entry as "invalid," but doesn't actually delete the entry. The old entry can be reused when someone creates a new file, but it will never go away. Therefore, the directories themselves can only get larger with time. Directories usually don't occupy a huge amount of space, but searching through a large directory is noticeably slow. So you should avoid letting directories get too large.

On many UNIX systems, the only way to "shrink a directory" is to move all of its files somewhere else and then remove it; for example:
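A minimal sketch of this move-and-recreate approach follows. The function name `rebuild_dir` and the temporary name `OLD.new.$$` are hypothetical choices for illustration, and the sketch assumes nothing else is using the directory while it runs.

```shell
#!/bin/sh
# Sketch only: rebuild a directory so that it gets a fresh, small directory file.
# rebuild_dir is a hypothetical helper name, not something from the original text.
rebuild_dir() {
    old=$1
    new=$old.new.$$      # assumed to be an unused name next to the original

    mkdir "$new" || return 1

    # Move every entry, dot files included, but never "." or "..".
    for f in "$old"/* "$old"/.[!.]* "$old"/..?*; do
        # A pattern that matched nothing stays literal; skip it.
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv "$f" "$new"/ || return 1
    done

    # rmdir fails if anything is left behind, so the rename only
    # happens once the old directory is really empty.
    rmdir "$old" && mv "$new" "$old"
}
```

A call such as `rebuild_dir /some/big/dir` would then replace the directory with a freshly created one holding the same files. Note that the new directory is created with default ownership and permissions, so those may need to be restored by hand afterward.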
This method also works on V7-ish filesystems. It cannot be applied to the root of a filesystem.

Other implementations of the BSD fast filesystem do truncate directories. They do this after a complete scan of the directory has shown that some number of trailing fragments are empty. Complete scans are forced for any operation that places a new name into the directory, such as creat(2) or link(2). In addition, new names are always placed in the earliest possible free slot. Hence, on these systems there is another way to shrink a directory. [How do you know if your BSD filesystem truncates directories? Try the pseudo-code below (but use actual commands), and see if it has an effect. -ML]

    while (the directory can be shrunk) {
        mv (file in last slot) (some short name)
        mv (the short name) (original name)
    }

This works on the root of a filesystem as well as subdirectories.

Neither method should be used if some external agent (for example, a daemon) is busy looking at the directory. The first method will also fail if the external agent is quiet but will resume and hold the existing directory open (for example, a daemon program, like sendmail, that rescans the directory, but which is currently stopped or idle). The second method requires knowing a "safe" short name, i.e., a name that doesn't duplicate any other name in the directory.

I have found the second method useful enough to write a shell script to do the job. I call the script squoze:
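A minimal sketch of such a loop, not the original squoze script, might look like the following. The function name `squeeze_dir`, the default short name `sq$$`, and the stopping rules are assumptions made for illustration: the loop stops when the entry in the last slot is the same one it just moved, and it never runs more times than there are entries.

```shell
#!/bin/sh
# Sketch only: this is NOT the author's squoze script.
# squeeze_dir and the default short name "sq$$" are hypothetical.
# The body runs in a subshell so the caller's working directory is untouched.
squeeze_dir() (
    cd "$1" || exit 1
    short=${2:-sq$$}     # the caller must make sure this name is unused
    if [ -e "$short" ]; then
        echo "squeeze_dir: $short already exists" >&2
        exit 1
    fi

    # Each entry needs moving at most once, so cap the loop at the entry count.
    limit=$(ls -f | wc -l)
    prev=
    while [ "$limit" -gt 0 ]; do
        limit=$((limit - 1))
        # Entry in the last slot, in directory order, ignoring "." and "..".
        last=$(ls -f | grep -v -x -e '\.' -e '\.\.' | tail -n 1)
        [ -n "$last" ] || break
        # Same entry as last time: it did not move, so stop.
        [ "$last" = "$prev" ] && break
        # Rename away and back so the name lands in the earliest free slot.
        mv -- "$last" "$short" && mv -- "$short" "$last" || exit 1
        prev=$last
    done
)
```

Comparing the size reported by `ls -ld` for the directory before and after a run shows whether the filesystem actually truncated it. On a filesystem that never truncates directories, the loop only shuffles names around and the size stays the same.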
[The ls -f option lists entries in the order they appear in the directory; it doesn't sort. -JP]

This script does not handle filenames with embedded newlines. It is, however, safe to apply to a sendmail queue while sendmail is stopped.

- in comp.unix.admin on Usenet, 22 August 1991