12.4. Show Nonprinting Characters with cat -v or od -c
Especially if you use an
ASCII-based terminal, files can have characters that your terminal
can't display. Some characters will lock up your
communications software or hardware, make your screen look strange,
or cause other weird problems. So if you'd like to
look at a file and you aren't sure
what's in there, it's not a good
idea to just cat the file!
Instead, try cat -v. It shows an ASCII
("printable") representation of
unprintable and non-ASCII characters. In fact,
although most manual pages don't explain how, you
can read the output and see what's in the file.
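Here's a small sketch of why that's safer. The sample line is made up for illustration: it contains the escape sequence ESC [ 2 J, which clears the screen on many terminals. Sent through cat -v, the ESC byte should come out as the harmless two-character sequence ^[ instead:

```shell
# A made-up line holding an escape sequence (ESC [ 2 J) that many
# terminals treat as "clear the screen". printf writes ESC as \033.
safe=$(printf 'ok\033[2Jdone\n' | cat -v)

# cat -v rewrites the ESC byte as ^[, so the result is safe to display.
printf '%s\n' "$safe"    # ok^[[2Jdone
```

The variable name safe is only for illustration; in everyday use you'd simply run cat -v on the file you're unsure about.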
Another utility for displaying nonprintable files is
od. I usually use its
-c option when I need to look at a file character by
character.
Let's look at a file that's almost
guaranteed to be unprintable: a directory file. This example is on a
standard V7 (Unix Version 7) filesystem. (Unfortunately, some Unix
systems won't let you read a directory. If you want
to follow along on one of those systems, try a compressed file (Section 15.6)
or an executable program from /bin.) A directory
usually has some long lines, so it's a good idea to
pipe cat's output through fold:
% ls -fa
% cat -v . | fold -62
% od -c .
0000000 377 016 . \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000020 > 007 . . \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 341 \n c o m p \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000060 \0 \0 M a s s A v e F o o d \0 \0 \0
0000100 \0 \0 h i s t \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
Each entry in a V7-type directory is 16 bytes long
(that's also 16 characters, in the ASCII system).
The od -c command starts each line with the
number of bytes, in octal, shown since the start of the file. The
first line starts at byte 0. The second line starts at byte 20 octal
(that's byte 16 in decimal, the way most people
count). And so on. Enough about od for now,
though. We'll come back to it in a minute. Time to
dissect the cat -v output:
You've probably seen sequences like
^N and ^G. Those are control characters.
Another character like this is ^@, the character NUL (ASCII
0). There are a lot of NULs in the directory; more about that later. A DEL
character (ASCII 177 octal) is shown as ^?. Check
an ASCII chart.
cat -v has its own symbol for characters outside
the ASCII range with their high bits set, also called metacharacters. cat
-v prints those as M- followed by
another character. There are two of them in the cat
-v output: M-^? and M-a.
To get a metacharacter, you add 200 octal. For an example,
let's look at M-a. The octal
value of the letter a is 141. When cat
-v prints M-a, it means the character
you get by adding 141+200, or 341 octal.
You can decode the character that cat -v prints as
M-^? in the same way. The ^?
stands for the DEL character, which is octal 177. Add 200+177 to get 377 octal.
If a character isn't shown with a ^ or M- prefix,
it's a regular printable character. The entries in
the directory (., ..,
comp, MassAveFood, and
hist) are all made of regular ASCII characters.
If you're wondering where the entries
MassAveFood and hist are in the
ls listing, the answer is that they
aren't. Those entries have been deleted from the
directory. Unix puts two NUL (ASCII 0, or ^@)
bytes in front of the names of deleted V7 directory entries.
cat has two options,
-t and -e, for displaying
whitespace in a line. The -v option doesn't convert
TAB and trailing-space characters to a
visible form without those options. See Section 12.5.
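To check these decoding rules for yourself, here's a short sketch. The byte values come from the discussion above; the variable names are only for illustration, and the results in the comments are what I'd expect from a cat that follows the conventions described here.

```shell
# Octal arithmetic behind the M- notation. In shell arithmetic,
# a leading 0 marks an octal constant; %o prints the result in octal.
meta_a=$(printf '%o' $((0141 + 0200)))      # letter a plus 200 octal
meta_del=$(printf '%o' $((0177 + 0200)))    # DEL plus 200 octal
printf '%s %s\n' "$meta_a" "$meta_del"      # 341 377

# Feed six known bytes to cat -v: CTRL-n (016), CTRL-g (007), NUL (0),
# DEL (177), and the two metacharacters 341 and 377.
shown=$(printf '\016\007\0\177\341\377' | cat -v)
printf '%s\n' "$shown"                      # ^N^G^@^?M-aM-^?

# -t makes a TAB visible as ^I; -e marks each line end with $,
# which exposes the trailing space here.
ws=$(printf 'a\tb \n' | cat -te)
printf '%s\n' "$ws"                         # a^Ib $
```

Reading the second result from left to right gives the same pieces found in the directory: ^N, ^G, ^@, ^?, M-a, and M-^?.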
Now back to od -c. It's easier to explain than cat -v:
od -c shows some characters
starting with a backslash (\). It uses the
standard Unix and C abbreviations for control characters where it can. For
instance, \n stands for a newline character,
\t for a tab, etc. There's a
newline at the start of the comp entry -- see
it in the od -c output? That explains why the
cat -v output was broken onto a new line at that
place: cat -v doesn't translate
newlines when it finds them.
The \0 is a NUL character (ASCII 0).
It's used to pad the ends of entries in V7
directories when a name isn't the full 14 characters long.
od -c shows the octal value of other characters
as three digits. For instance, the 007 means
"the character 7 octal."
cat -v shows this as ^G.
cat -v shows metacharacters, the ones with octal
values 200 and higher, as M- followed by another character. In od -c,
you'll see their octal values -- such as 341.
Each directory entry on a Unix Version 7 filesystem starts with a
two-byte "pointer" to its location
in the disk's inode table. When you type a filename,
Unix uses this pointer to find the actual file information on the
disk. The entry for this directory (named .) is 377
016. Its parent (named ..) is at
> 007. And
comp's entry is 341
\n. Find those in the cat -v output,
if you want; and compare the two outputs.
As with cat -v, od -c shows regular printable characters
as is.
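If your system won't let you read a directory, here's a rough way to try this anyway. The sketch below builds a small stand-in file from two of the 16-byte entries in the od -c listing (the ones for . and comp) and dumps it. The temporary file and the variable names are assumptions for illustration only; this isn't a real directory, and the exact column spacing varies between od versions.

```shell
# Build a 32-byte stand-in for two V7-style directory entries:
# a 2-byte pointer, then a name padded with NULs to 14 bytes.
dir=$(mktemp)
printf '\377\016.\0\0\0\0\0\0\0\0\0\0\0\0\0' > "$dir"    # entry for .
printf '\341\ncomp\0\0\0\0\0\0\0\0\0\0' >> "$dir"        # entry for comp

# Dump it. LC_ALL=C keeps od from treating high bytes as locale text.
dump=$(LC_ALL=C od -c "$dir")
printf '%s\n' "$dump"

# Skip the second entry's 2-byte pointer (16 + 2 = 18 bytes in),
# read its 14-byte name field, and drop the NUL padding.
name=$(dd if="$dir" bs=1 skip=18 count=14 2>/dev/null | tr -d '\0')
printf '%s\n' "$name"    # comp

rm -f "$dir"
```

The dump should have one line per entry, at offsets 0000000 and 0000020, followed by a final line holding just 0000040: the total size, 32 bytes, in octal.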
The strings (Section 13.15) program finds printable strings of characters
(such as filenames) inside mostly nonprintable files (such as
executable programs).
Copyright © 2003 O'Reilly & Associates. All rights reserved.