24.7 Compressing Files to Save Space
Most files can be "squeezed" to take up less space. Let's say you have a text file. Each letter occupies a byte, but almost all of the characters in the file are alphanumeric or punctuation, and there are only about 70 such characters. Furthermore, most of the characters are (usually) lowercase; furthermore, the letter "e" turns up more often than "z," the letter "e" often shows up in pairs, and so on. All in all, you don't really need a full eight-bit byte per character. If you're clever, you can reduce the amount of space a file occupies by 50 percent or more.
To compress a file, just give the command:
% gzip filename
The file's name is changed to filename.gz. If the file shouldn't be compressed (that is, if the file has hard links (18.4) or the corresponding .gz file already exists), gzip prints a message. You can use the -f option to "force" gzip to compress such a file. This might be better if you're using gzip within a shell script and don't want to worry about files that might not be compressed.
Compressed files are always binary files; even if they started out as text files, you can't read them directly. To get back the original file, use the gunzip utility:
% gunzip filename
(gunzip also handles files from compress, or you can use uncompress if you'd rather.) You can omit the .gz at the end of the filename. If you just want to read the file but don't want to restore the original version, use the command gzcat; this just decodes the file and dumps it to standard output. It's particularly convenient to pipe gzcat into more (25.3) or grep (27.1). (There's a zcat for compressed files, but gzcat can handle those files too.)
The CD-ROM has several scripts that work on compressed files, uncompressing and recompressing them automatically: editing with zvi, zex, and zed (24.11); viewing with zmore, zless, and zpg (25.5); or running almost any command that can read from a pipe with zloop (24.10).
There are a number of other compression utilities floating around the UNIX world. gzip, though, also works on other operating systems, and it's reliable and freely available, so it has become the utility that more people choose.
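The whole round trip can be tried safely in a scratch directory. The sketch below is only an illustration: the file name sample.txt and its contents are made up, and it uses gzip -dc (decompress to standard output) in place of gzcat, on the assumption that gzcat may not be installed under that name on every system.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a repetitive text file (hypothetical sample data).
i=0
while [ "$i" -lt 500 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > sample.txt

before=$(wc -c < sample.txt)

gzip sample.txt                  # sample.txt is replaced by sample.txt.gz
after=$(wc -c < sample.txt.gz)

# Read the compressed file without restoring it: decode to standard output
# and pipe the text into grep.
matches=$(gzip -dc sample.txt.gz | grep -c "lazy dog")

gunzip sample.txt                # the .gz suffix may be omitted
restored=$(wc -c < sample.txt)

echo "before=$before after=$after matches=$matches restored=$restored"
```

Highly repetitive text like this compresses far better than ordinary prose, so the ratio you see here is not typical of real files.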
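The -f option matters most in scripts, where nobody is around to answer a prompt. The following sketch uses a hypothetical file named report.txt; it assumes that, with standard input redirected from /dev/null, gzip declines to overwrite an existing .gz file and returns a nonzero status rather than waiting for an answer.

```shell
# Scratch directory with a made-up file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first version" > report.txt
gzip report.txt                       # creates report.txt.gz

echo "second version" > report.txt    # a new report.txt beside the old .gz

# Without -f, gzip should leave the existing report.txt.gz alone.
# stdin comes from /dev/null so it cannot stop and ask a question.
if gzip report.txt < /dev/null 2> /dev/null; then
    plain_status=compressed
else
    plain_status=skipped
fi

# With -f, the existing report.txt.gz is overwritten without a prompt.
gzip -f report.txt
content=$(gzip -dc report.txt.gz)

echo "plain_status=$plain_status content=$content"
```

Because -f overwrites silently, it is worth using only where losing the older compressed copy is acceptable.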