You often want to make sure you have the right "kind" of file before
doing something with it. For example, you'd like to read the file
tar. But before typing more tar, you'd like to know whether this file
is your set of notes on carbon-based sludge or the tar executable.
If you're wrong, the consequences might be unpleasant: sending the
tar executable to your screen might screw up your terminal settings,
log you off, or do any number of other unpleasant things.
Each entry in /etc/magic has four fields:
offset data-type value file-type
These are as follows:
- offset
  The offset into the file at which file will look for the value.
  If you're looking for something right at the beginning of the file,
  the offset should be 0. (This is usually what you want.)
- data-type
  The type of test to make. Use string for text comparisons, byte for
  one-byte comparisons, short for two-byte comparisons, and long for
  four-byte comparisons.
- value
  The value you want to find. For string comparisons, any text string
  will do; you can use the standard Unix escape sequences (such as
  \n for newline). For numeric comparisons (byte, short, long), this
  field should be a number, expressed as a C constant (e.g., 0x77 for
  the hexadecimal byte 77).
- file-type
  The string that file will print if this test succeeds.
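To make the four fields concrete, here is a rough sketch in Python of how a single entry could be parsed and tested against the bytes of a file. It is not how file is actually implemented. The function names are made up for illustration, it assumes the value field contains no whitespace, and it assumes numeric tests use the machine's native byte order. It also ignores everything beyond the four basic fields.

```python
import struct

# struct codes for the numeric data types ("=" means native byte order,
# standard sizes: 1, 2, and 4 bytes). Native order is an assumption here.
NUMERIC_FORMATS = {"byte": "=B", "short": "=H", "long": "=I"}

def parse_c_constant(text):
    """Parse a C-style integer: 0x77 (hex), 0177 (octal), or 119 (decimal)."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered, 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)
    return int(lowered, 10)

def parse_entry(line):
    """Split one line into (offset, data_type, value, file_type)."""
    # The first three fields are single words; file-type is the rest of the line.
    offset, data_type, value, file_type = line.split(None, 3)
    return parse_c_constant(offset), data_type, value, file_type.strip()

def entry_matches(entry, data):
    """Return True if the bytes in `data` satisfy the entry's test."""
    offset, data_type, value, _ = entry
    if data_type == "string":
        # Expand escapes such as \n, then compare byte for byte.
        expected = value.encode("ascii").decode("unicode_escape").encode("latin-1")
        return data[offset:offset + len(expected)] == expected
    fmt = NUMERIC_FORMATS[data_type]
    size = struct.calcsize(fmt)
    chunk = data[offset:offset + size]
    if len(chunk) < size:
        return False  # the file is too short to hold a value at this offset
    (found,) = struct.unpack(fmt, chunk)
    return found == parse_c_constant(value)

# A made-up one-byte entry, tested against made-up file contents.
entry = parse_entry("0 byte 0x77 hypothetical format")
print(entry_matches(entry, b"\x77 and the rest of the file"))  # True
```

The numeric branch reads exactly as many bytes as the data type calls for, which is why a file shorter than offset plus that size can never match.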
So, we know that RCS archives begin with the word
head. This word is right at the beginning of the
file (offset 0). Since we obviously want a string comparison, we make
the following addition to /etc/magic:
0 string head RCS archive
This says, "The file is an RCS archive if you find
the string head at an offset of 0 bytes from the
beginning of the file." Does it work?
% file RCS/0001,v
RCS/0001,v: RCS archive
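A test like this is only as specific as the bytes it checks. As a rough illustration, here is the same comparison done by hand in Python. The helper name and all three sample file contents are made up for this sketch. Notice that a file of notes that happens to start with head passes the test too.

```python
def starts_with_at(data, offset, text):
    """Return True if the bytes of `text` appear in `data` starting at `offset`."""
    expected = text.encode("ascii")
    return data[offset:offset + len(expected)] == expected

# Hypothetical file contents, made up for illustration.
samples = [
    ("rcs_like", b"head\t1.3;\naccess;\n"),
    ("notes", b"headache remedies\n"),
    ("c_source", b"#include <stdio.h>\n"),
]

for name, contents in samples:
    verdict = "RCS archive" if starts_with_at(contents, 0, "head") else "no match"
    print(f"{name}: {verdict}")
```

If false matches like the second one matter to you, a longer or more distinctive string makes the test stricter.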
As I said, the tests can be much more complicated, particularly if
you're working with binary files. But to recognize simple text files,
this is all you need to know.
-- ML