Many people have run into the problem of protecting users from
accidental file deletion, so many different kinds of solutions
have already been implemented and are available.
Which solution you choose depends on
the features you want it to have and on how you want it to do its job.
Many people do not use the shell-script solutions described
above (23.8), because they are too slow or too unreliable or
because they don't allow deleted files to be recovered for long
enough.
For example, Purdue University runs a large network of many different
machines that utilize some local file space and some NFS (1.33)
file space.
Their file recovery system, entomb, replaces certain system calls
(for example, open(2), unlink(2)) with entomb functions that check to
see if a file would be destroyed by the requested system call; if so,
the file is backed up (by asking a local or remote entomb daemon
to do so) before the actual system call is performed.
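The check-then-back-up idea can be illustrated with a minimal Python sketch. This is not how entomb itself is built (the text says it replaces system calls in the linked utilities and hands the backup to a daemon); the names TOMB_DIR, entombing_unlink, and entombing_open are hypothetical, and a local directory stands in for the daemon's storage.

```python
import os
import shutil
import tempfile

# Hypothetical stand-in for the storage an entomb daemon would manage.
TOMB_DIR = tempfile.mkdtemp(prefix="tomb-")


def _entomb(path):
    """Copy the file aside before it is destroyed.

    Simplification: files are stored by base name only, so two files
    with the same name would overwrite each other in TOMB_DIR.
    """
    shutil.copy2(path, os.path.join(TOMB_DIR, os.path.basename(path)))


def entombing_unlink(path):
    """Back up a regular file, then perform the real unlink."""
    if os.path.isfile(path):
        _entomb(path)
    os.unlink(path)


def entombing_open(path, mode="r"):
    """Back up an existing file if the open would truncate it."""
    if mode.startswith("w") and os.path.isfile(path):
        _entomb(path)
    return open(path, mode)
```

Because the check lives in the wrapper rather than in any one command, every caller that goes through it is covered, which is the property the real system gets by substituting the system calls themselves.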
The advantage of this system is that you don't have to create any
new applications to do safe file removal: the standard rm program
will automatically do the right thing, as will mv and any other
programs that have the potential of erasing files.
Even cat a b > a is recoverable.
A disadvantage of this system is that you have to have the source
code for your UNIX system and be able to recompile its utilities in
order to link them against the entomb libraries. Furthermore, if
you wish to install this system on your machines, you have to be able
to install it on all of them. If someone learns entomb on a
machine you manage and then wants to use it on a workstation in a
private lab for which you do not have source code, it can't be done.
Also, there is a danger of people getting used to entomb being
there to save them if they make mistakes, and then losing a file when
they use rm or mv on a system that doesn't have entomb.
If you don't have strict control over all the machines on which you want to
have file-deletion protection, or if you don't have source code and
therefore can't use something like entomb, there are several other
options available. One of them is the delete package,
written at MIT.
delete overcomes several of the disadvantages of entomb. It
is very simple, compiles on virtually any machine, and doesn't require
any sort of superuser access to install. This means that if you learn
to use delete on one system and then move somewhere else, you can
take it with you by getting the source code and simply recompiling it
on the new system. Furthermore, delete intentionally isn't named
rm, so that people who use it know they are using it and therefore
don't end up believing that files removed with rm can be
recovered. However, this means that users have to be educated to use
delete instead of rm when removing files.
delete works by renaming files with a prefix that marks them as
deleted. For example, delete foo would simply rename the
file foo to .#foo. Here's an example of the delete, undelete,
lsdel, and expunge commands in action:
The directory starts with three files:
% ls
a b c
One of the files is deleted:
% delete a
The deleted file doesn't show up with normal ls because the name
now starts with a dot (.). However, it shows up when files starting with .
are listed or when the lsdel command is used:
% ls
b c
% ls -A
.#a b c
% lsdel
a
Bringing the file back with undelete leaves us back where we started:
% undelete a
% ls
a b c
We can delete everything:
% delete *
% lsdel
a b c
We can expunge individual files or the current working directory:
% expunge a
% lsdel
b c
% expunge
After the last expunge, there are no files left at all:
% lsdel
% ls -A
%
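The renaming scheme behind this session can be sketched in a few lines of Python. This is only an illustration of the .# prefix technique described above, not the delete package's actual source: the function names mirror the commands, expunge_all is a hypothetical name for running expunge with no arguments, and the sketch handles plain files only.

```python
import os

PREFIX = ".#"  # the marker the text says delete puts in front of a name


def _deleted_name(path):
    """Return the path a file would have once marked as deleted."""
    head, tail = os.path.split(path)
    return os.path.join(head, PREFIX + tail)


def delete(path):
    """Mark a file as deleted by renaming foo to .#foo in place."""
    os.rename(path, _deleted_name(path))


def undelete(path):
    """Restore a deleted file to its original name."""
    os.rename(_deleted_name(path), path)


def lsdel(directory="."):
    """List the original names of deleted files in a directory."""
    return sorted(
        name[len(PREFIX):]
        for name in os.listdir(directory)
        if name.startswith(PREFIX)
    )


def expunge(path):
    """Permanently remove one deleted file."""
    os.remove(_deleted_name(path))


def expunge_all(directory="."):
    """Permanently remove every deleted file in a directory."""
    for name in lsdel(directory):
        expunge(os.path.join(directory, name))
```

Since the file is only renamed within its own directory, its location, permissions, and ownership are untouched, which is why the approach needs no daemon and works on any filesystem type.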
The technique used by delete has some advantages and some
disadvantages. The advantages include:
- It works on any filesystem type - local, NFS, AFS, RFS, whatever.
  You don't have to have special daemons running on your file servers
  in order for it to work, and there are no daemons to go down and
  prevent deleted file archiving from taking place.
- It maintains the directory locations in which deleted files are
  stored so that they can be undeleted in the same locations.
- It maintains file permissions and ownership so that undeleted files
  can be restored with them. Furthermore, deleted files can be
  undeleted by anyone who had permission to delete them in the first
  place, not just by the one individual who deleted them.
Disadvantages include:
- Deleted files are counted against a user's disk quota (24.17)
  until they are actually permanently removed (either by the system,
  a few days after they are deleted, or by the user with the expunge
  command that is part of the delete package). Some people would
  actually call this an advantage, because it prevents people from
  using deleted file space to store large files (something which is
  possible with entomb).
- Deleted files show up when a user does ls -a. This is
  considered a relatively minor disadvantage by most people,
  especially since files starting with a dot (.) are supposed to be
  hidden (16.11) most of the time.
- Deleted files have to be searched for in filesystem trees in order
  to expunge them, rather than all residing in one location as they do
  with entomb. This, too, is usually considered a minor
  disadvantage, since most systems already search the entire
  filesystem (23.22) each night automatically in order to delete
  certain temporary files.
- Only the delete program protects files. A user can still blow
  away a file with mv, cat a b > a, etc.
  If your main concern is eliminating accidental file deletions
  with rm, this isn't much of a problem; furthermore, it is not clear
  that the extra overhead required to run something like entomb is
  worth the advantage gained (even if it is possible to do what
  entomb needs at your site).
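The tree search mentioned in the disadvantages above, where the system permanently removes deleted files a few days after deletion, might look like the following sketch. The function name sweep, the three-day default, and the use of modification time to judge age are all assumptions made for illustration; the text does not say how the real system measures how long ago a file was deleted.

```python
import os
import time

PREFIX = ".#"  # marker used for deleted files


def sweep(root, max_age_days=3, now=None):
    """Remove deleted files under root that are older than max_age_days.

    Assumption: age is judged by st_mtime, which a rename does not
    update, so this measures time since last modification rather than
    time since deletion.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith(PREFIX):
                continue  # ordinary files are never touched
            full = os.path.join(dirpath, name)
            if os.lstat(full).st_mtime < cutoff:
                os.remove(full)
                removed.append(full)
    return removed
```

A job like this has to visit every directory, which is the cost the text contrasts with entomb keeping all saved files in one location.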
entomb and delete represent the two main approaches to the
problem of protection from accidental file erasure. Other packages of
this sort choose one or the other of these basic techniques in order
to accomplish their purposes.