

UNIX Power Tools

Chapter 28: Comparing Files
 

28.12 Comparing Two Files with comm

The comm command can tell you what information is common to two lists, and what information appears in only one list or the other. For example, let's say you're compiling information on the favorite movies of critics Siskel and Ebert. The movies are listed in separate files, which must be sorted (36.1); if they aren't sorted, the ! script (9.18) will help. For the sake of illustration, assume each list is short:

% cat siskel
Citizen Kane
Halloween VI
Ninja III
Rambo II
Star Trek V
Zelig
% cat ebert
Cat People
Citizen Kane
My Life as a Dog
Q
Z
Zelig

To compare the favorite movies of your favorite critics, type:

% comm siskel ebert
                  Cat People
                                         Citizen Kane
Halloween VI
                  My Life as a Dog
Ninja III
                  Q
Rambo II
Star Trek V
                  Z
                                         Zelig

Column 1 shows the movies that only Siskel likes; Column 2 shows those that only Ebert likes; and Column 3 shows the movies that they both like. You can suppress one or more columns of output by specifying that column as a command-line option. For example, to suppress Columns 1 and 2 (displaying only the movies both critics like), you would type:

% comm -12 siskel ebert
Citizen Kane
Zelig
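
The other column selections work the same way. The sketch below rebuilds the two movie lists in a scratch directory and pulls out each column on its own. It is written for a Bourne-type shell rather than the csh prompt shown above; the variable names, the use of mktemp -d, and the LC_ALL=C setting (which pins the sort order comm expects) are assumptions made for the illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Assumption: use byte-order collation so comm agrees with how the lists are sorted.
export LC_ALL=C

# Recreate the two sorted lists from the example.
printf '%s\n' 'Citizen Kane' 'Halloween VI' 'Ninja III' \
              'Rambo II' 'Star Trek V' 'Zelig' > siskel
printf '%s\n' 'Cat People' 'Citizen Kane' 'My Life as a Dog' \
              'Q' 'Z' 'Zelig' > ebert

# Each option digit suppresses that column, so two digits leave one column.
siskel_only=$(comm -23 siskel ebert)   # keep column 1
ebert_only=$(comm -13 siskel ebert)    # keep column 2
both=$(comm -12 siskel ebert)          # keep column 3

printf 'Only Siskel:\n%s\n\n' "$siskel_only"
printf 'Only Ebert:\n%s\n\n' "$ebert_only"
printf 'Both:\n%s\n' "$both"
```

Because only one column is left in each case, the output has no leading tabs and can be fed straight to another command.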

As another example, say you've just received a new software release (Release 4), and it's your job to figure out which library functions have been added so that they can be documented along with the old ones. Let's assume you already have a list of the Release 3 functions (r3_list) and a list of the Release 4 functions (r4_list). (If you didn't, you could create them by changing to the directory that has the function manual pages, listing the files with ls, and saving each list to a file.) In the lists below, we've used letters of the alphabet to represent the functions:

% cat r3_list
b
c
d
f
g
h

% cat r4_list
a
b
c
d
e
f

You can now use the comm command to answer several questions you might have:

  • Which functions are new to Release 4? Answer:

    % comm -13 r3_list r4_list    Show 2nd column, which is "Release 4 only"
    a
    e

  • Which Release 3 functions have been dropped in Release 4? Answer:

    % comm -23 r3_list r4_list    Show 1st column, which is "Release 3 only"
    g
    h

  • Which Release 3 functions have been retained in Release 4? Answer:

    % comm -12 r3_list r4_list    Show 3rd column, which is "common functions"
    b
    c
    d
    f

You can create partial lists by saving the above output to three separate files.
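
One way to do that is to redirect each of the three commands to its own file. The sketch below is for a Bourne-type shell; the output file names (r4_new, r3_dropped, retained), the scratch directory, and the LC_ALL=C setting are assumptions chosen for the illustration, not names from the release.

```shell
# Work in a scratch directory and recreate the two sorted lists.
dir=$(mktemp -d)
cd "$dir" || exit 1
export LC_ALL=C    # assumption: byte-order collation, matching the sorted input

printf '%s\n' b c d f g h > r3_list
printf '%s\n' a b c d e f > r4_list

# Save each single-column result to its own (hypothetical) file.
comm -13 r3_list r4_list > r4_new       # functions new to Release 4
comm -23 r3_list r4_list > r3_dropped   # Release 3 functions that were dropped
comm -12 r3_list r4_list > retained     # functions present in both releases

# Show how many functions ended up in each list.
wc -l r4_new r3_dropped retained
```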

comm can only compare sorted files. If you can't sort the files, look at the trick in article 2.14: using diff and grep.
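
If the only obstacle is that the original files must keep their existing order, another option is to sort copies and compare those instead. The sketch below assumes a Bourne-type shell; the unsorted lists, the file names, and the LC_ALL=C setting (so that sort and comm agree on the ordering) are made up for the illustration.

```shell
# Work in a scratch directory so the originals are never modified.
dir=$(mktemp -d)
cd "$dir" || exit 1
export LC_ALL=C    # assumption: sort and comm must use the same collating order

# Two hypothetical unsorted lists.
printf '%s\n' h c f b g d > r3_unsorted
printf '%s\n' e a f c b d > r4_unsorted

# Sort into copies; the unsorted files are left as they were.
sort r3_unsorted > r3_sorted
sort r4_unsorted > r4_sorted

# Compare the sorted copies: lines that appear only in the second list.
new_in_r4=$(comm -13 r3_sorted r4_sorted)
printf '%s\n' "$new_in_r4"
```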

- DG

