Chapter 25. Portable Perl

Contents:
Newlines
A world with only one operating system makes portability easy, and life boring. We prefer a larger genetic pool of operating systems, as long as the ecosystem doesn't divide too cleanly into predators and prey. Perl runs on dozens of operating systems, and because Perl programs aren't platform dependent, the same program can run on all of those systems without modification.

Well, almost. Perl tries to give the programmer as many features as possible, but if you make use of features particular to a certain operating system, you'll necessarily reduce the portability of your program to other systems. In this section, we'll provide some guidelines for writing portable Perl code. Once you make a decision about how portable you want to be, you'll know where the lines are drawn, and you can stay within them.

Looking at it another way, writing portable code is usually about willfully limiting your available choices. Naturally, it takes discipline and sacrifice to do that, two traits that Perl programmers might be unaccustomed to.

Be aware that not all Perl programs have to be portable. There is no reason not to use Perl to glue Unix tools together, or to prototype a Macintosh application, or to manage the Windows registry. If it makes sense to sacrifice portability, go ahead.[1]
In general, note that the notions of a user ID, a "home" directory, and even the state of being logged in will exist only on multi-user platforms.

The special $^O variable tells you what operating system your Perl was built on. This is provided to speed up code that would otherwise have to use Config to get the same information via $Config{osname}. (Even if you've pulled in Config for other reasons, it still saves you the price of a tied-hash lookup.)

To get more detailed information about the platform, you can look at the rest of the information in the %Config hash, which is made available by the standard Config module. For example, to check whether the platform has the lstat call, you can check $Config{d_lstat}. See Config's online documentation for a full description of available variables, and the perlport manpage for a listing of the behavior of Perl built-in functions on different platforms. Here are the Perl functions whose behavior varies the most across platforms:

-X (file tests), accept, alarm, bind, binmode, chmod, chown, chroot, connect, crypt, dbmclose, dbmopen, dump, endgrent, endhostent, endnetent, endprotoent, endpwent, endservent, exec, fcntl, fileno, flock, fork, getgrent, getgrgid, getgrnam, gethostbyaddr, gethostbyname, gethostent, getlogin, getnetbyaddr, getnetbyname, getnetent, getpeername, getpgrp, getppid, getpriority, getprotobyname, getprotobynumber, getprotoent, getpwent, getpwnam, getpwuid, getservbyport, getservent, getservbyname, getsockname, getsockopt, glob, ioctl, kill, link, listen, lstat, msgctl, msgget, msgrcv, msgsnd, open, pipe, qx, readlink, readpipe, recv, select, semctl, semget, semop, send, sethostent, setgrent, setnetent, setpgrp, setpriority, setprotoent, setpwent, setservent, setsockopt, shmctl, shmget, shmread, shmwrite, shutdown, socket, socketpair, stat, symlink, syscall, sysopen, system, times, truncate, umask, utime, wait, waitpid
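The $^O and %Config checks described above can be sketched roughly as follows. The helper names platform_family and has_feature are hypothetical, invented for this illustration, and the grouping of every unrecognized osname under "unixish" is a simplifying assumption rather than a rule from the text.

```perl
use strict;
use warnings;
use Config;    # standard module; exports the read-only %Config hash

# Hypothetical helper (not from the text): classify a platform from an
# osname string such as the value of $^O.
sub platform_family {
    my ($osname) = @_;
    return 'windows' if $osname eq 'MSWin32';
    return 'vms'     if $osname eq 'VMS';
    return 'macos'   if $osname eq 'MacOS';    # classic Mac OS (MacPerl)
    return 'unixish';    # assumption: treat everything else as Unix-like
}

# Hypothetical helper: true if this Perl was configured with the given
# feature, for example has_feature('d_lstat').
sub has_feature {
    my ($key) = @_;
    return defined $Config{$key} && $Config{$key} eq 'define';
}

print "Running on $^O (", platform_family($^O), ")\n";
print "osname according to Config: $Config{osname}\n";
print "lstat is ",
    ( has_feature('d_lstat') ? "available" : "not available" ), "\n";
```

Checking $^O first keeps the common case cheap; the %Config lookup is only worth its cost when you need a detail, such as d_lstat, that the operating system name alone doesn't tell you.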
25.1. Newlines

On most operating systems, lines in files are terminated by one or two characters that signal the end of the line. The characters vary from system to system. Unix traditionally uses \012 (that is, the octal 12 character in ASCII), one type of DOSish I/O uses \015\012, and Macs use \015. Perl uses \n to represent a "logical" newline, regardless of platform. In MacPerl, \n always means \015. In DOSish Perls, \n usually means \012, but when accessing a file in "text mode", it is translated to (or from) \015\012, depending on whether you're reading or writing. Unix does the same thing on terminals in canonical mode. \015\012 is commonly referred to as CRLF.

Because DOS distinguishes between text files and binary files, DOSish Perls have limitations when using seek and tell on a file in "text mode". For best results, only seek to locations obtained from tell. If you use Perl's built-in binmode function on the filehandle, however, you can usually seek and tell with impunity.

A common misconception in socket programming is that \n will be \012 everywhere. In many common Internet protocols, \012 and \015 are specified, and the values of Perl's \n and \r are not reliable since they vary from system to system:

    print SOCKET "Hi there, client!\015\012";  # right
    print SOCKET "Hi there, client!\r\n";      # wrong

However, using \015\012 (or \cM\cJ, or \x0D\x0A, or even v13.10) can be tedious and unsightly, as well as confusing to those maintaining the code. The Socket module supplies some Right Things for those who want them:

    use Socket qw(:DEFAULT :crlf);
    print SOCKET "Hi there, client!$CRLF";     # right

When reading from a socket, remember that the default input record separator $/ is \n, which means you have to do some extra work if you're not sure what you'll be seeing across the socket. Robust socket code should recognize either \012 or \015\012 as end of line:

    use Socket qw(:DEFAULT :crlf);
    local ($/) = LF;        # not needed if $/ is already \012

    while (<SOCKET>) {
        s/$CR?$LF/\n/;      # replace LF or CRLF with logical newline
    }

Similarly, code that returns text data--such as a subroutine that fetches a web page--should often translate newlines. A single line of code will often suffice:

    $data =~ s/\015?\012/\n/g;
    return $data;

Copyright © 2001 O'Reilly & Associates. All rights reserved.