23.9 delete: Protecting Files from Accidental Deletion

Protecting users from accidental file deletion is a problem many people have encountered, so many solutions of different types are already implemented and available. Which one you choose depends on the features you want and on how you want it to do its job. Many people do not use the shell-script solutions described above (23.8), because they are too slow or too unreliable, or because they don't allow deleted files to be recovered for long enough.

For example, Purdue University runs a large network of many different machines that use some local file space and some NFS (1.33) file space. Their file recovery system, entomb, replaces certain system calls (for example, open(2) and unlink(2)) with entomb functions that check whether a file would be destroyed by the requested system call. If so, the file is backed up (by asking a local or remote entomb daemon to do so) before the actual system call is performed.
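
The copy-before-destroy idea can be illustrated at the shell level. The functions below are only an analogy: entomb itself works inside the replaced system calls and hands files to a daemon, whereas this sketch uses made-up names (entomb_copy, safe_rm, safe_mv) and a hypothetical holding directory, TOMB_DIR.

```shell
# Analogy only: NOT how entomb is implemented. All names here are hypothetical.
# TOMB_DIR is an assumed holding area; real entomb asks a daemon to store the copy.
TOMB_DIR=${TOMB_DIR:-"$HOME/.tomb"}

# entomb_copy FILE: if FILE is an existing regular file, save a timestamped copy.
entomb_copy() {
    [ -f "$1" ] || return 0          # nothing would be destroyed
    mkdir -p "$TOMB_DIR" || return 1
    cp -p -- "$1" "$TOMB_DIR/$(basename -- "$1").$(date +%Y%m%d%H%M%S)"
}

# safe_rm FILE...: back up each file, then remove it (the unlink case).
safe_rm() {
    for f in "$@"; do
        entomb_copy "$f" && rm -- "$f" || return 1
    done
}

# safe_mv SRC DST: back up DST if the move would overwrite it, then move.
# Only file targets are handled; a directory DST is passed straight to mv.
safe_mv() {
    entomb_copy "$2" && mv -- "$1" "$2"
}
```

A wrapper like this shares the weakness of any user-level approach: it protects only the commands that go through it, which is exactly why entomb hooks the system calls instead.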

The advantage of this system is that you don't have to create any new applications to do safe file removal: the standard rm program will automatically do the right thing, as will mv and any other programs that have the potential of erasing files. Even cat a b > a is recoverable.

A disadvantage of this system is that you have to have the source code for your UNIX system and be able to recompile its utilities in order to link them against the entomb libraries. Furthermore, if you wish to install this system on your machines, you have to be able to install it on all of them. If someone learns entomb on a machine you manage and then wants to use it on a workstation in a private lab for which you do not have source code, it can't be done. Also, there is a danger of people getting used to entomb being there to save them if they make mistakes, and then losing a file when they use rm or mv on a system that doesn't have entomb.

If you don't have strict control over all the machines on which you want to have file-deletion protection, or if you don't have source code and therefore can't use something like entomb, there are several other options available. One of them is the delete package, written at MIT.

delete

delete overcomes several of the disadvantages of entomb. It is very simple, compiles on virtually any machine, and doesn't require any sort of superuser access to install. This means that if you learn to use delete on one system and then move somewhere else, you can take it with you by getting the source code and simply recompiling it on the new system. Furthermore, delete intentionally isn't named rm, so that people who use it know they are using it and therefore don't end up believing that files removed with rm can be recovered. However, this means that users have to be educated to use delete instead of rm when removing files.

delete works by renaming files with a prefix that marks them as deleted. For example, delete foo would simply rename the file foo to .#foo. Here's an example of the delete, undelete, lsdel, and expunge commands in action:

The directory starts with three files:


% ls
a       b       c


One of the files is deleted:


% delete a

The deleted file doesn't show up with normal ls because the name
now starts with a dot (.). However, it shows up when files starting with .
are listed or when the lsdel command is used:


% ls
b       c
% ls -A
.#a     b       c
% lsdel
a

Bringing the file back with undelete leaves us back where we started:


% undelete a
% ls
a       b       c


We can delete everything:


% delete *
% lsdel
a  b  c


We can expunge individual files or the current working directory:


% expunge a
% lsdel
b  c
% expunge

After the last expunge, there are no files left at all:


% lsdel
% ls -A
%
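
The renaming technique shown in this session can be approximated with a few shell functions. This is a minimal sketch, not the MIT delete source: the function names soft_delete, soft_undelete, list_deleted, and soft_expunge are made up for illustration, and the sketch assumes the .# prefix is the only marker involved.

```shell
# Hypothetical helpers sketching the rename technique; not the real delete package.

# soft_delete FILE...: hide each file by renaming it to .#NAME in its own directory.
# An earlier deleted copy with the same name is silently overwritten in this sketch.
soft_delete() {
    for f in "$@"; do
        mv -- "$f" "$(dirname -- "$f")/.#$(basename -- "$f")" || return 1
    done
}

# soft_undelete FILE...: rename .#NAME back to NAME.
soft_undelete() {
    for f in "$@"; do
        mv -- "$(dirname -- "$f")/.#$(basename -- "$f")" "$f" || return 1
    done
}

# list_deleted [DIR]: print the original names of deleted files in DIR (default .).
list_deleted() {
    for p in "${1:-.}"/.\#*; do
        [ -e "$p" ] || continue          # the glob matched nothing
        base=$(basename -- "$p")
        printf '%s\n' "${base#".#"}"     # strip the .# prefix
    done
}

# soft_expunge FILE...: permanently remove the deleted copies of the named files.
soft_expunge() {
    for f in "$@"; do
        rm -rf -- "$(dirname -- "$f")/.#$(basename -- "$f")" || return 1
    done
}
```

Because each operation is a rename within the same directory, the file keeps its location, permissions, and ownership, and no daemon or special filesystem support is involved.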

The technique used by delete has some advantages and some disadvantages. The advantages include:

It works on any filesystem type: local, NFS, AFS, RFS, whatever. You don't have to have special daemons running on your file servers in order for it to work, and there are no daemons to go down and prevent deleted file archiving from taking place.
It maintains the directory locations in which deleted files are stored so that they can be undeleted in the same locations.
It maintains file permissions and ownership so that undeleted files can be restored with them. Furthermore, deleted files can be undeleted by anyone who had permission to delete them in the first place, not just by the one individual who deleted them.

Disadvantages include:

Deleted files are counted against a user's disk quota (24.17) until they are actually permanently removed (either by the system, a few days after they are deleted, or by the user with the expunge command that is part of the delete package). Some people would actually call this an advantage, because it prevents people from using deleted file space to store large files (something which is possible with entomb).
Deleted files show up when a user does ls -a. This is considered a relatively minor disadvantage by most people, especially since files starting with a dot (.) are supposed to be hidden (16.11) most of the time.
Deleted files have to be searched for in filesystem trees in order to expunge them, rather than all residing in one location as they do with entomb. This, too, is usually considered a minor disadvantage, since most systems already search the entire filesystem (23.22) each night automatically in order to delete certain temporary files.
Only the delete program protects files. A user can still blow away a file with mv, cat a b > a, etc. If your main concern is eliminating accidental file deletions with rm, this isn't much of a problem; furthermore, it is not clear that the extra overhead required to run something like entomb is worth the advantage gained (even if it is possible to do what entomb needs at your site).
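
Two of the points above, the quota cost of deleted files and the need to search the tree in order to expunge them, can be sketched with find. The function names are hypothetical, and the sketch handles regular files only. It also assumes that renaming a file updates its inode change time, which holds on many systems but is not guaranteed everywhere, so ctime is only an approximation of when the file was deleted.

```shell
# Hypothetical helpers for a nightly sweep; not part of the delete package.

# deleted_usage ROOT: show the size in kilobytes of every deleted (.#*) regular
# file under ROOT. These files still count against the owner's quota.
deleted_usage() {
    find "$1" -type f -name '.#*' -exec du -k {} +
}

# sweep_deleted ROOT DAYS: permanently remove deleted files under ROOT whose
# inode change time is more than DAYS days old (assumed to track deletion time).
sweep_deleted() {
    find "$1" -type f -name '.#*' -ctime "+$2" -exec rm -f -- {} +
}
```

A system crontab entry could then run something like sweep_deleted /home 3 each night, where both the path and the three-day retention period are assumptions standing in for "a few days".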

entomb and delete represent the two main approaches to the problem of protection from accidental file erasure. Other packages of this sort choose one or the other of these basic techniques in order to accomplish their purposes.

- JIK