14.3. rm and Its Dangers
Under Unix, you use the rm command to delete files. The command is simple enough; you just type rm followed by a list of files. If anything, rm is too simple. It's easy to delete more than you want, and once something is gone, it's permanently gone. There are a few hacks that make rm somewhat safer, and we'll get to those momentarily. But first, here's a quick look at some of the dangers.
To understand why it's impossible to reclaim deleted files, you need to know a bit about how the Unix filesystem works. The system contains a "free list," which is a list of disk blocks that aren't used. When you delete a file, its directory entry (which gives it its name) is removed. If there are no more links (Section 10.3) to the file (i.e., if the file only had one name), its inode (Section 14.2) is added to the list of free inodes, and its datablocks are added to the free list.
Well, why can't you get the file back from the free list? After all, there are DOS utilities that can reclaim deleted files by doing something similar. Remember, though, Unix is a multitasking operating system. Even if you think your system is a single-user system, there are a lot of things going on "behind your back": daemons are writing to log files, handling network connections, processing electronic mail, and so on. You could theoretically reclaim a file if you could "freeze" the filesystem the instant your file was deleted -- but that's not possible. With Unix, everything is always active. By the time you realize you made a mistake, your file's data blocks may well have been reused for something else.
When you're deleting files, it's important to use wildcards carefully. Simple typing errors can have disastrous consequences. Let's say you want to delete all your object (.o) files. You want to type:
% rm *.o
But because of a nervous twitch, you add an extra space and type:
% rm * .o
It looks right, and you might not even notice the error. But before you know it, all the files in the current directory will be gone, irretrievably.
If you don't think this can happen to you, here's something that actually did happen to me. At one point, when I was a relatively new Unix user, I was working on my company's business plan. The executives thought, so as to be "secure," that they'd set a business plan's permissions so you had to be root (Section 1.18) to modify it. (A mistake in its own right, but that's another story.) I was using a terminal I wasn't familiar with and accidentally created a bunch of files with four control characters at the beginning of their name. To get rid of these, I typed (as root):
# rm ????*
This command took a long time to execute. When about two-thirds of the directory was gone, I realized (with horror) what was happening: I was deleting all files with four or more characters in the filename.
The story got worse. They hadn't made a backup in about five months. (By the way, this article should give you plenty of reasons for making regular backups (Section 38.3).) By the time I had restored the files I had deleted (a several-hour process in itself; this was on an ancient version of Unix with a horrible backup utility) and checked (by hand) all the files against our printed copy of the business plan, I had resolved to be very careful with my rm commands.
[Some shells have safeguards that work against Mike's first disastrous example -- but not the second one. Automatic safeguards like these can become a crutch, though . . . when you use another shell temporarily and don't have them, or when you type an expression like Mike's very destructive second example. I agree with his simple advice: check your rm commands carefully! -- JP]
Copyright © 2003 O'Reilly & Associates. All rights reserved.