Chapter 3. User Accounts

Why all the grumbling then? Users introduce two things into the systems and networks we administer that make them significantly more complex: nondeterminism and individuality. We'll address the nondeterminism issues when we discuss user activity in the next chapter, but for now let's focus on individuality.

In most cases, users want to retain their own separate identities. Not only do they want a unique name, but they want unique "stuff" too. They want to be able to say, "These are my files. I keep them in my directories. I print them with my print quota. I make them available from my home page on the Web." Modern operating systems keep an account of all of these details for each user.

But who keeps track of all of the accounts on a system or network of systems? Who is ultimately responsible for creating, protecting, and disposing of these little shells for individuals? I'd hazard a guess and say "you, dear reader" -- or if not you personally, then tools you'll build to act as your proxy. This chapter is designed to help you with that responsibility.

Let's begin our discussion of users by addressing some of the pieces of information that form their identity and how it is stored on a system. We'll start by looking at Unix and Unix-variant users, and then address the same issues for Windows NT/Windows 2000. For current-generation MacOS systems, this is a non-issue, so we'll skip MacOS in this chapter. Once we address identity information for both operating systems, we'll construct a basic account system.

3.1. Unix User Identity

When discussing this topic, we have to putter around in a few key files because they store the persistent definition of a user's identity. By persistent definition, I mean those attributes of a user that exist during the entire lifespan of that user, persisting even while that user is not actively using a computer. Another word that we'll use for this persistent identity is account. If you have an account on a system, you can log in and become a user of that system.

Users come into being on a system at the point when their information is first added to the password file (or the directory service which offers the same information). A user's subsequent departure from the scene occurs when this entry is removed. We'll dive right in and look at how the user identity is stored.

3.1.1. The Classic Unix Password File

Let's start off with the "classic" password file format and then get more sophisticated from there. I call this format classic because it is the parent for all of the other Unix password file formats currently in use. It is still in use today in many Unix variants, including SunOS, Digital Unix, and Linux. Usually found on the system as /etc/passwd, this file consists of lines of ASCII text, each line representing a different account on the system or a link to another directory service. A line in this file is composed of several colon-separated fields. We'll take a close look at all of these fields as soon as we see how to retrieve them.

Here's an example line from /etc/passwd:

dnb:fMP.olmno4jGA6:6700:520:David N. Blank-Edelman:/home/dnb:/bin/zsh

There are at least two ways to go about accessing this information from Perl:

If we access it "by hand," we can treat this file like any random text file and parse it accordingly:

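$passwd = "/etc/passwd";
open(PW,$passwd) or die "Can't open $passwd:$!\n";
while (<PW>){
    chomp; # strip the newline so $shell comes back clean
    # use $pass here so we don't clobber the filename in $passwd
    ($name,$pass,$uid,$gid,$gcos,$dir,$shell) = split(/:/);
    # <your code here>
}
close(PW);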

Or we can "let the system do it," in which case Perl makes available some of the Unix system library calls that parse this file for us. For instance, another way to write that last code snippet is:
```
while(($name,$passwd,$uid,$gid,$gcos,$dir,$shell) = getpwent(  )){
       # <your code here>
}
endpwent(  );
```

Using these calls has the added advantage of automatically tying in to any OS-level name service being used (e.g., Network Information Service, or NIS). We'll see more of these library call functions in a moment (including an easier way to use getpwent( )), but for now let's look at the fields our code returns:[1]

[1]The values returned by getpwent( ) changed between Perl 5.004 and 5.005; this is the 5.004 list of values. In 5.005 and later, there are two additional fields, $quota and $comment, in the list right before $gcos. See your system documentation for getpwent( ) for more information.

Name

The login name field holds the short (usually eight characters or fewer), unique nom de machine for each account on the system. The Perl function getpwent( ), which we saw earlier being used in a list context, will return the name field if we call it in a scalar context:

$name = getpwent(  );

User ID (UID)

On Unix systems, the user ID (UID) is actually more important than the login name for most things. All of the files on a system are owned by a UID, not a login name. If we change the login name associated with UID 2397 in /etc/passwd from danielr to drinehart, that user's files instantly show up as being owned by drinehart instead. The UID is the persistent part of a user's identity internal to the operating system. The Unix kernel and filesystems keep track of UIDs, not login names, for ownership and resource allocation. A login name can be considered to be the part of a user's identity that is external to the core OS; it exists to make things easier for humans.
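To see this UID-centric view from Perl, here is a minimal sketch that stats a file and then translates the owning UID back into a login name with getpwuid( ). The helper name owner_of( ) is my own invention for illustration; if no password entry matches the UID, it simply falls back to showing the number.

```perl
use strict;
use warnings;

# owner_of() is a hypothetical helper name, not part of Perl.
# Returns (uid, label), where label is the login name for that UID,
# or "#uid" when no password entry maps to it.
sub owner_of {
    my ($path) = @_;
    my @info = stat($path) or die "Unable to stat $path: $!\n";
    my $uid  = $info[4];           # element 4 of stat() is the owner's UID
    my $name = getpwuid($uid);     # scalar context: login name or undef
    return ($uid, defined $name ? $name : "#$uid");
}

my ($uid, $label) = owner_of("/");
print "/ is owned by UID $uid ($label)\n";
```

The filesystem only stores the number; the name is looked up fresh each time, which is why renaming an account in the password file appears to change the ownership of its files.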

Here's some simple code to find the next available unique UID in a password file. This code looks for the highest UID in use and produces the next number:

$passwd = "/etc/passwd";
open(PW,$passwd) or die "Can't open $passwd:$!\n";
while (<PW>){
    @fields = split(/:/);
    $highestuid = ($highestuid < $fields[2]) ? $fields[2] : $highestuid;
}
close(PW);
print "The next available UID is " . ++$highestuid . "\n";
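Taking the highest UID plus one can produce odd results when the file contains a deliberately high UID (many systems give an account such as nobody a very large number). As a hedged alternative, the sketch below looks for the first unused UID within a range. The function name first_free_uid( ), the sample entries, and the 1000 to 60000 range are all assumptions for illustration, not a policy recommendation.

```perl
use strict;
use warnings;

# first_free_uid() is a hypothetical helper; the range limits are
# illustrative assumptions and should follow local policy.
sub first_free_uid {
    my ($lines, $min, $max) = @_;
    my %used;
    for my $line (@$lines) {
        next if $line =~ /^\s*(?:#|$)/;      # skip comments and blank lines
        my $uid = (split /:/, $line)[2];
        $used{$uid} = 1 if defined $uid && $uid =~ /^\d+$/;
    }
    for my $uid ($min .. $max) {
        return $uid unless $used{$uid};
    }
    return;                                  # nothing free in the range
}

# made-up sample entries
my @sample = (
    "root:*:0:0:Super User:/:/bin/sh",
    "dnb:*:1000:520:David N. Blank-Edelman:/home/dnb:/bin/zsh",
    "nobody:*:65534:65534:Nobody:/:/bin/false",
);
my $next = first_free_uid(\@sample, 1000, 60000);
print "The next available UID is $next\n";
```

With the sample data this reports 1001, where the highest-plus-one approach would have suggested 65535.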

Table 3-1 lists other useful name- and UID-related Perl functions and variables.

Table 3-1. Login Name- and UID-Related Variables and Functions

Function/Variable	How Used
getpwnam($name)	In a scalar context returns the UID for that login name; in a list context returns all of the fields of a password entry
getpwuid($uid)	In a scalar context returns the login name for that UID; in a list context returns all of the fields of a password entry
$>	Holds the effective UID of the currently running Perl program
$<	Holds the real UID of the currently running Perl program
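Here is a short sketch that exercises the entries in this table: it reports the real and effective UIDs of the running program and round-trips each one through getpwuid( ) and getpwnam( ). The describe_uid( ) helper is a made-up name, and the code assumes nothing about which accounts exist; a UID without a password entry is reported as such.

```perl
use strict;
use warnings;

# describe_uid() is a hypothetical helper name.
sub describe_uid {
    my ($uid) = @_;
    my $login = getpwuid($uid);          # scalar context: login name
    return "UID $uid (no password entry)" unless defined $login;
    my $back = getpwnam($login);         # scalar context: UID
    return "UID $uid is $login (getpwnam maps it back to $back)";
}

print "real:      ", describe_uid($<), "\n";
print "effective: ", describe_uid($>), "\n";
```

The two lines differ only when the program is running setuid or has otherwise changed its effective UID.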

The primary group ID (GID)

On multiuser systems, users often want to share files and other resources with a select set of other users. Unix provides a user grouping mechanism to assist in this process. An account on a Unix system can be part of several groups, but it must be assigned to one primary group. The primary group ID (GID) field in the password file lists the primary group for that account.

Group names, GIDs, and group members are usually stored in the /etc/group file. To make an account part of several groups, you just list that account in several places in the file. Some OSes have a hard limit on the number of groups an account can join (eight used to be a common restriction). Here are a couple of lines from an /etc/group file:

bin::2:root,bin,daemon
sys::3:root,bin,sys,adm

The first field is the group name, the second is the password (some systems allow people to join a group by entering a password), the third is the GID of the group, and the last field is a list of the users in this group.
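As a rough sketch of the "by hand" approach applied to this file, the following splits one group line into the four fields just described and turns the comma-separated member list into an array. The helper name parse_group_line( ) is hypothetical.

```perl
use strict;
use warnings;

# parse_group_line() is a hypothetical helper that splits one
# /etc/group line into the four fields described above.
sub parse_group_line {
    my ($line) = @_;
    chomp $line;
    my ($name, $passwd, $gid, $members) = split /:/, $line, 4;
    $members = '' unless defined $members;
    return {
        name    => $name,
        passwd  => $passwd,
        gid     => $gid,
        members => [ split /,/, $members ],
    };
}

my $group = parse_group_line("sys::3:root,bin,sys,adm");
print "$group->{name} (GID $group->{gid}): @{ $group->{members} }\n";
```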

Schemes for group ID assignment are site-specific because each site has its own particular administrative and project boundaries. Groups can be created to model certain populations (students, salespeople, etc.), roles (backup operators, network administrators, etc.), or account purposes (backup accounts, batch processing accounts, etc.).

Dealing with group files from Perl is very similar to the passwd parsing we did above. We can either treat the file as a standard text file or use special Perl functions to perform the task. Table 3-2 lists the group-related Perl functions and variables.

Table 3-2. Group Name- and GID-Related Variables and Functions

Function/Variable	How Used
getgrent( )	In a scalar context returns the group name; in a list context returns these fields: $name,$passwd,$gid,$members
getgrnam($name)	In a scalar context returns the group ID; in a list context returns the same fields mentioned for getgrent( )
getgrgid($gid)	In a scalar context returns the group name; in a list context returns the same fields mentioned for getgrent( )
$)	Holds the effective GID of the currently running Perl program
$(	Holds the real GID of the currently running Perl program
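To show the "let the system do it" route for groups, here is a minimal sketch that walks the group database with getgrent( ) and collects the groups that list a given login as a member. The helper name groups_listing( ) is an assumption of mine. Note that it reports only explicit memberships, so the primary group from the password entry will not appear unless the account is also listed there.

```perl
use strict;
use warnings;

# groups_listing() is a hypothetical helper: it returns the names of
# the groups whose member list includes $login.
sub groups_listing {
    my ($login) = @_;
    my @groups;
    setgrent();                          # rewind to the first entry
    while (my ($name, $passwd, $gid, $members) = getgrent()) {
        # in a list context, $members is a space-separated string
        $members = '' unless defined $members;
        push @groups, $name if grep { $_ eq $login } split ' ', $members;
    }
    endgrent();
    return @groups;
}

my $me = getpwuid($<);
print "$me: @{[ groups_listing($me) ]}\n" if defined $me;
```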

The "encrypted" password

So far we've seen three key parts of how a user's identity is stored on a Unix machine. The next field is not part of this identity, but is used to verify that someone should be allowed to assume all of the rights, responsibilities, and privileges bestowed upon a particular user ID. This is how the computer knows that someone presenting her or himself as mguerre should be allowed to assume a particular UID. There are other, better forms of authentication that now exist in the world (e.g., public key cryptography), but this is the one that has been inherited from the early Unix days.

It is common to see a line in a password file with just an asterisk (*) for a password. This convention is usually used when an administrator wants to prevent anyone from logging into an account without removing it altogether.
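A small sketch of how you might report such accounts from a classic-format password file follows. The helper name disabled_accounts( ) and the guest entry are made up for illustration, and the code checks only for the lone-asterisk convention; other markers a particular system might use are deliberately ignored.

```perl
use strict;
use warnings;

# disabled_accounts() is a hypothetical helper. It returns the login
# names whose password field is exactly "*".
sub disabled_accounts {
    my (@lines) = @_;
    my @disabled;
    for my $line (@lines) {
        my ($name, $pass) = split /:/, $line;
        next unless defined $pass;
        push @disabled, $name if $pass eq '*';
    }
    return @disabled;
}

# the guest entry is made up for this example
my @passwd_lines = (
    "dnb:fMP.olmno4jGA6:6700:520:David N. Blank-Edelman:/home/dnb:/bin/zsh",
    "guest:*:6701:520:Guest Account:/home/guest:/bin/sh",
);
print "disabled: $_\n" for disabled_accounts(@passwd_lines);
```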

Dealing with user passwords is a topic unto itself. We deal with it later in this book in Chapter 10, "Security and Network Monitoring".

GCOS field

The GCOS[2] field is the least important field (from the computer's point of view). This field usually contains the full name of the user (e.g., "Roy G. Biv"). Often, people put their title and/or phone extension in this field as well.

[2]For some amusing details on the origin of the name of this field, see the GCOS entry at the Jargon Dictionary: http://www.jargon.org.

System administrators who are concerned about privacy issues on behalf of their users (as all should be) need to watch the contents of this field. It is a standard source for account-name-to-real-name mappings. On most Unix systems, this field is available as part of a world-readable /etc/passwd file or directory service, and hence the information is available to everyone on the system. Many Unix programs, mailers and finger daemons also consult this field when they attach a user's login name to some piece of information. If you have any need to withhold a user's real name from other people (e.g., if that user is a political dissident, federal witness, or a famous person), this is one of the places you must monitor.

As a side note, if you maintain a site with a less mature user base, it is often a good idea to disable mechanisms that allow users to change their GCOS field to any random string (for the same reasons that user-selected login names can be problematic). You may not want your password file to contain expletives or other unprofessional information.

Home directory

The next field contains the name of the user's home directory. This is the directory where the user begins her or his time on the system. Typically this is also where the files that configure that user's environment live.

It is important for security purposes that an account's home directory be owned and writable by that account only. World-writable home directories allow trivial account hacking. There are cases, however, where even a user-writable home directory is problematic. For example, in restricted shell scenarios (accounts that can only log in to perform a specific task without permission to change anything on the system), a user-writable home directory is a big no-no.

Here's some Perl code to make sure that every user's home directory is owned by that user and is not world writable:

use User::pwent;
use File::stat;

# note: this code will beat heavily upon any machine using 
# automounted homedirs
while($pwent = getpwent(  )){
    # make sure we stat the actual dir, even through layers of symlink
    # indirection
    $dirinfo = stat($pwent->dir."/."); 
    unless (defined $dirinfo){
        warn "Unable to stat ".$pwent->dir.": $!\n";
        next;
    }
    warn $pwent->name."'s homedir is not owned by the correct uid (".
         $dirinfo->uid." instead of ".$pwent->uid.")!\n"
        if ($dirinfo->uid != $pwent->uid);

    # world writable is fine if dir is set "sticky" (i.e., 01000), 
    # see the manual page for chmod for more information
    # (002 is the world-write bit; the sticky test needs its own
    # parentheses because ! binds more tightly than &)
    warn $pwent->name."'s homedir is world-writable!\n"
      if ($dirinfo->mode & 002 and !($dirinfo->mode & 01000));
}
endpwent(  );
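The permission test is the fiddly part of a check like this, so here it is isolated as a sketch. The helper name unsafe_world_writable( ) is hypothetical. The point to notice is operator precedence: in Perl, ! binds more tightly than &, so the sticky-bit test has to be parenthesized before it is negated.

```perl
use strict;
use warnings;

# unsafe_world_writable() is a hypothetical helper name.
# True when "other" has write permission (0002) and the sticky
# bit (01000) is not set.
sub unsafe_world_writable {
    my ($mode) = @_;
    return (($mode & 0002) && !($mode & 01000)) ? 1 : 0;
}

printf "%04o => %s\n", $_, unsafe_world_writable($_) ? "flag it" : "ok"
    for (0755, 0777, 01777);
```

Of the three sample modes, only 0777 is flagged: 0755 has no world-write bit, and 01777 is world-writable but sticky.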

This code looks a bit different than our previous parsing code because it uses two magic modules by Tom Christiansen: User::pwent and File::stat. These modules override the normal getpwent( ) and stat( ) functions, causing them to return something different than the values mentioned before. When User::pwent and File::stat are loaded, these functions return objects instead of lists or scalars. Each object has a method named after a field that normally would be returned in a list context. So code like:

$gid = (stat("filename"))[5];

can be written more legibly as:

use File::stat;
$stat = stat("filename");
$gid = $stat->gid;

or even:

use File::stat;
$gid = stat("filename")->gid;

User shell

The final field in the classic password file format is the user shell field. This field usually contains one of a set of standard interactive programs (e.g., sh, csh, tcsh, ksh, zsh) but it can actually be set to the path of any executable program or script. From time to time, people have joked (half-seriously) about setting their shell to be the Perl interpreter. For at least one shell (zsh), people have actually contemplated embedding a Perl interpreter in the shell itself, but this has yet to happen. There is, however, some serious work that has been done to create a Perl shell (http://www.focusresearch.com/gregor/psh/ ) and to embed Perl into Emacs, an editor that could easily pass for an operating system (http://john-edwin-tobey.org/perlmacs/ ).

On occasion, you might have reason to list nonstandard interactive programs in this field. For instance, if you wanted to create a menu-driven account, you could place the menu program's name here. In these cases some care has to be taken to prevent someone using that account from reaching a real shell, or they may wreak havoc. A common mistake is to include a mail program in the menu that allows the user to launch an editor or pager for mail composition and mail reading. Either the editor or the pager could have a shell-escape function built in.

A list of standard, acceptable shells on a system is often kept in /etc/shells for the FTP daemon's benefit. Most FTP daemons will not allow a normal user to connect to a machine if their shell in /etc/passwd (or networked password file) is not on the list kept in /etc/shells. Here's some Perl code to report accounts that do not have approved shells:

use User::pwent;

$shells = "/etc/shells";
open (SHELLS,$shells) or die "Unable to open $shells:$!\n";
while(<SHELLS>){
    chomp;
    $okshell{$_}++;
}
close(SHELLS);

while($pwent = getpwent(  )){
   warn $pwent->name." has a bad shell (".$pwent->shell.")!\n"
     unless (exists $okshell{$pwent->shell});
}
endpwent(  );
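On many systems /etc/shells also contains comment lines that begin with # and the occasional blank line, which a bare chomp would store as if they were shells. Here is a hedged sketch of a more forgiving reader; the helper name approved_shells( ) is my own, and the sample lines stand in for the real file.

```perl
use strict;
use warnings;

# approved_shells() is a hypothetical helper. It takes the lines of
# an /etc/shells-style file and returns a hash of the usable entries,
# ignoring comments and blank lines.
sub approved_shells {
    my (@lines) = @_;
    my %ok;
    for my $line (@lines) {
        chomp $line;
        $line =~ s/#.*//;                # drop comments
        $line =~ s/^\s+|\s+$//g;         # trim surrounding whitespace
        next unless length $line;
        $ok{$line} = 1;
    }
    return %ok;
}

my %okshell = approved_shells(
    "# valid login shells\n", "/bin/sh\n", "\n", "/bin/zsh\n"
);
print join(", ", sort keys %okshell), "\n";
```

The resulting hash can be used with exists in the same way as %okshell in the report above.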

3.1.2. Extra Fields in BSD 4.4 passwd Files

At the BSD (Berkeley Software Distribution) 4.3 to 4.4 upgrade point, the BSD variants added two twists to the classic password file format: additional fields, and the introduction of a binary database format used to store account information.

BSD 4.4 systems add some fields to the password file in between the GID and GCOS fields. The first field they added was the class field. This allows a system administrator to partition the accounts on a system into separate classes (e.g., different login classes might be given different resource limits like CPU time restrictions). BSD variants also add change and expire fields to hold an indication of when a password must be changed and when the account will expire. We'll see fields like these when we get to the next Unix password file format as well.
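If you do need to read this extended format by hand, a sketch like the following shows where the three extra fields land. The helper name parse_bsd_passwd_line( ) and the sample entry (including its staff login class) are made up for illustration.

```perl
use strict;
use warnings;

# Field order for the ten-field BSD 4.4 layout: class, change, and
# expire sit between the GID and GCOS fields, as described above.
my @BSD_FIELDS = qw(name passwd uid gid class change expire gcos dir shell);

# parse_bsd_passwd_line() is a hypothetical helper name.
sub parse_bsd_passwd_line {
    my ($line) = @_;
    chomp $line;
    my %entry;
    # a limit of -1 keeps trailing empty fields
    @entry{@BSD_FIELDS} = split /:/, $line, -1;
    return \%entry;
}

# made-up sample entry for illustration
my $entry = parse_bsd_passwd_line(
    "dnb:*:6700:520:staff:0:0:David N. Blank-Edelman:/home/dnb:/bin/zsh"
);
print "$entry->{name} is in login class '$entry->{class}'\n";
```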

When compiled under an operating system that supports these extra fields, Perl includes the contents of these fields in the return value of functions like getpwent( ). This is one good reason to use getpwent( ) in your programs instead of split( )ing the password file entries by hand.

3.1.3. Binary Database Format in BSD 4.4

The second twist added to the password mechanisms by BSD is their use of a database format, rather than plain text, for primary storage of password file information. BSD machines keep their password file information in DB format, a greatly updated version of the older Unix DBM (Database Management) libraries. This change allows the system to do speedy lookups of password information.

The program pwd_mkdb takes the name of a password text file as its argument, creates and moves two database files into place, and then moves this text file into /etc/master.passwd. The two databases are used to provide a shadow password scheme, differing in their read permissions and encrypted password field contents. We'll talk more about this in the next section.

Perl has the ability to directly work with DB files (we'll work with this format later in Chapter 9, "Log Files"), but in general I would not recommend directly editing the databases while the system is in use. The issue here is one of locking: it's very important not to change a crucial database like your password file without making sure other programs are not similarly trying to write to it or read from it. Standard operating system programs like chpass handle this locking for you.[3] The sleight-of-hand approach we saw for quotas in Chapter 2, "Filesystems", which used the EDITOR variable, can be used with chpass as well.

[3]pwd_mkdb, however, may or may not perform this locking for you (depending on the BSD flavor and version), so caveat implementor.

3.1.4. Shadow Passwords

Earlier I emphasized the importance of protecting the contents of the GCOS field, since this information is publicly available through a number of different mechanisms. Another fairly public, yet rather sensitive piece of information is the list of encrypted passwords for all of the users on the system. Even though the password information is cryptographically hidden, having it exposed in a world-readable file still leaves it vulnerable. Parts of the password file need to be world-readable (e.g., the UID and login name mappings), but not all of it. There's no need to provide a list of encrypted passwords to users who may be tempted to run password-cracking programs.

One alternative is to banish the encrypted password string for each user to a special file that is only readable by root. This second file is known as a "shadow password" file, since it contains lines that shadow the entries in the real password file.

Here's how it all works: the original password file is left intact with one small change. The encrypted password field contains a special character or characters to indicate password shadowing is in effect. Placing an x in this field is common, though the insecure copy of the BSD database uses a *.

I've heard of some shadow password suites that insert a special, normal-looking string of characters in this field. If your password file goes awanderin', this provides a lovely time for the recipient who will attempt to crack a password file of random strings that bear no relation to the real passwords.

Most operating systems take advantage of this second shadow password file to store more information about the account. This additional information resembles the surplus fields we saw in the BSD files, storing account expiration data and information on password changing and aging.

In most cases Perl's normal password functions like getpwent( ) can handle shadow password files. As long as the C libraries shipped with the OS do the right thing, so will Perl. Here's what "do the right thing" means: when your Perl script is run with the appropriate privileges (as root), these routines will return the encrypted password. Under all other conditions that password will not be accessible to those routines.

Unfortunately, it is dicier if you want to retrieve the additional fields found in the shadow file. Perl may not return them for you. Eric Estabrooks has written a Passwd::Solaris module, but that only helps if you are running Solaris. If these fields are important to you, or you want to play it safe, the sad truth (in conflict with my recommendation to use getpwent( ) above) is that it is often simpler to open the shadow file by hand and parse it manually.
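As a rough sketch of that manual parsing, the code below assumes the nine-field, colon-separated layout used by Solaris and Linux shadow files; check your system's shadow manual page before relying on the field names. The helper name parse_shadow_line( ) and the sample entry are made up, and a real script would need to run as root to read the file itself.

```perl
use strict;
use warnings;

# Assumed nine-field layout; confirm against your system's shadow
# manual page before relying on it.
my @SHADOW_FIELDS = qw(name passwd lastchg min max warn inactive expire flag);

# parse_shadow_line() is a hypothetical helper name.
sub parse_shadow_line {
    my ($line) = @_;
    chomp $line;
    my %entry;
    # a limit of -1 keeps the empty trailing fields
    @entry{@SHADOW_FIELDS} = split /:/, $line, -1;
    return \%entry;
}

# made-up sample entry; a real script would read the root-only file
my $shadow = parse_shadow_line("dnb:fMP.olmno4jGA6:10957:0:90:7:::");
printf "%s must change password every %d days\n",
    $shadow->{name}, $shadow->{max};
```

Fields left empty in the file, such as the account expiration date here, come back as empty strings rather than being dropped.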