How Do I Make a Perl Program? (Learning Perl, 3rd Edition)

1.4. How Do I Make a Perl Program?

It's about time you asked (even if you didn't). Perl programs are text files; you can create and edit them with your favorite text editor. (You don't need any special development environment, although there are some commercial ones available from various vendors. We've never used any of these enough to recommend them.)

You should generally use a programmers' text editor, rather than an ordinary editor. What's the difference? Well, a programmers' text editor will let you do things that programmers need, like indenting or unindenting a block of code, or finding the matching closing curly brace for a given opening curly brace. On Unix systems, the two most popular programmers' editors are emacs and vi (and their variants and clones). Both of these have been ported to several non-Unix systems, and many systems today offer a graphical editor (which uses a pointing device like a mouse). In fact, there are even versions of vi and emacs that offer a graphical interface. Ask your local expert about text editors on your system.

For the simple programs you'll be writing for the exercises in this book, none of which will need to be more than about twenty or thirty lines of code, any text editor will be fine.

A few beginners try to use a word processor instead of a text editor. We recommend against this -- it's inconvenient at best and impossible at worst. But we won't try to stop you. Be sure to tell the word processor to save your file as "text only"; the word processor's own format will almost certainly be unusable.

In some cases, you may need to compose the program on one machine, then transfer it to another to be run. If you do this, be sure that the transfer uses "text" or "ASCII" mode, and not "binary" mode. This step is needed because of the different text formats on different machines. Without that, you may get inconsistent results -- some versions of Perl actually abort when they detect a mismatch in the line endings.

1.4.1. A Simple Program

According to the oldest rule in the book, any book about a computer language that has Unix-like roots has to start with showing the "Hello, world" program. So, here it is in Perl:

#!/usr/bin/perl
print "Hello, world!\n";

Let's imagine that you've typed that into your text editor. (Don't worry yet about what the parts mean and how it works. We'll see about those in a moment.) You can generally save that program under any name you wish. Perl doesn't require any special kind of filename or extension, and it's better to use no extension at all.[31] But some non-Unix systems may require an extension like .plx (meaning PerL eXecutable); see your system's release notes for more information.

[31]Why is it better to have no extension? Imagine that you've written a program to calculate bowling scores and you've told all of your friends that it's called bowling.plx. One day you decide to rewrite it in C. Do you still call it by the same name, implying that it's still written in Perl? Or do you tell everyone that it has a new name? (And don't call it bowling.c, please!) The answer is that it's none of their business what language it's written in, if they're merely using it. So it should have simply been called bowling in the first place.

You will also need to do something so that your system knows that it's an executable program (that is, a command). What you'll do depends upon your system; maybe you won't have to do anything more than to save the program in a certain place. (Your current directory will generally be fine.) On Unix systems, you mark a program as being executable by using the chmod command, perhaps like this:

$ chmod a+x my_program

The dollar sign (and space) at the start of the line represents the shell prompt, which will probably look different on your system. If you're used to using chmod with a number like 755 instead of a symbolic parameter like a+x, that's fine too, of course. Either way, it tells the system that this file is now a program.
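The same change can be made from inside Perl, which also shows how the symbolic a+x relates to a number like 755. This is only a minimal sketch, assuming a Unix-like system: the scratch file and the variable names are made up for illustration, while chmod and stat are Perl built-ins and File::Temp is a core module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file to stand in for my_program (hypothetical).
my ($fh, $file) = tempfile();
close $fh;

# Start from an ordinary, non-executable text file: rw-r--r--
chmod 0644, $file or die "chmod failed: $!";

# "a+x" means: add the execute bit for user, group, and others (octal 111).
my $old_mode = (stat $file)[2] & 07777;
my $new_mode = $old_mode | 0111;
chmod $new_mode, $file or die "chmod failed: $!";

# Read the mode back and show it in octal, the way chmod users usually write it.
my $mode = (stat $file)[2] & 07777;
printf "mode is now %o\n", $mode;

# Clean up the scratch file.
unlink $file;
```

On a typical Unix system this should report a mode of 755, which is why chmod 755 and chmod a+x have the same effect on a file that started out as 644.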

Now you're ready to run it:

$ ./my_program

The dot and slash at the start of this command tell the shell to look for the program in the current working directory. That's not needed in all cases, but you should use it at the start of each command invocation until you fully understand what it's doing.[32]

[32]In short, it's preventing your shell from running another program (or shell builtin) of the same name. A common mistake among beginners is to name their first program test. Many systems already have a program (or shell builtin) with that name; that's what the beginners run instead of their program.

If everything worked, it's a miracle. More often, you'll find that your program has a bug. Edit and try again -- but you don't need to use chmod each time, since that should "stick" to the file. (Of course, if the bug is that you didn't use chmod correctly, you'll probably get a "permission denied" message from your shell.)

1.4.2. What's Inside That Program?

Like other "free-form" languages, Perl generally lets you use insignificant whitespace (like spaces, tabs, and newlines) at will to make your program easier to read. Most Perl programs use a fairly standard format, though, much like most of what we show here. We strongly encourage you to properly indent your programs, since that makes your program easier to read; a good text editor will do most of the work for you. Good comments also make a program easier to read. In Perl, comments run from a pound sign (#) to the end of the line. (There are no "block comments" in Perl.[33]) We don't use many comments in the programs in this book, because the surrounding text explains their workings, but you should use comments as needed in your own programs.

[33]But there are a number of ways to fake them. See the FAQ (accessible with perldoc perlfaq on most installations).
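One commonly used way to fake a block comment is to wrap the text in Perl's embedded documentation markers (POD), which the compiler skips over. The following is a sketch of that idea rather than an official recipe; the array @seen exists only so that the program has something to report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @seen;
push @seen, "before";

=begin comment

Everything from the =begin line down to the =cut line is skipped,
so this statement never runs:

push @seen, "skipped";

=end comment

=cut

push @seen, "after";
print "@seen\n";    # prints "before after"
```

The POD markers must start in the first column of a line, and each needs a blank line before it, or Perl won't recognize them.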

So another way (a very strange way, it must be said) to write that same "Hello, world" program might be like this:

#!/usr/bin/perl
    print    # This is a comment
"Hello, world!\n"
  ;    # Don't write your Perl code like this!

That first line is actually a very special comment. On Unix systems,[34] if the very first two characters on the first line of a text file are "#!", then what follows is the name of the program that actually executes the rest of the file. In this case, the program is stored in the file /usr/bin/perl.

[34]Most modern ones, anyway. The "sh-bang" mechanism was introduced somewhere in the mid-1980s, and that's pretty ancient, even on the extensively long Unix timeline.
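To make that rule concrete, here is a small Perl sketch that applies the same test to a line of text. It's only an illustration: on a real system the check is done by the operating system, not by Perl, the names $first_line and $interpreter are made up, and the sketch ignores the optional arguments that some #! lines carry after the program name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The first line of some program file, as the system would see it (example data).
my $first_line = "#!/usr/bin/perl\n";

my $interpreter = "";
if (substr($first_line, 0, 2) eq "#!") {
    # Whatever follows "#!" on that line names the program that runs the file.
    $interpreter = substr($first_line, 2);
    chomp $interpreter;
}

print "This file would be handed to: $interpreter\n";
```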

This #! line is actually the least portable part of a Perl program, because you'll need to find out what goes there for each machine. Fortunately, it's almost always either /usr/bin/perl or /usr/local/bin/perl. If you find that it's not, you can cast a magic spell on your system administrator to fix things. Just say "You know, I read in a book that both /usr/bin/perl and /usr/local/bin/perl should be symbolic links to the true Perl binary," and under the influence of your spell the admin will make everything work. All of the example programs you're likely to find on the Net and elsewhere will begin with one of those two forms.
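If you can already run Perl on a machine but aren't sure what belongs on the #! line, you can ask Perl itself. This is a hedged sketch: the built-in variable $^X holds the name that was used to start the running copy of Perl, which on many systems (but not all) is a full path.

```perl
#!/usr/bin/perl
# $^X is a built-in variable: the name used to start this copy of perl.
print "This copy of Perl was started as: $^X\n";
```

If the #! line of this program is itself wrong, you can usually still run it by naming the file as an argument to the perl command. If the output is a bare name like perl rather than a full path, you'll still need to ask your shell or your local expert where that program lives.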

On non-Unix systems, it's traditional (and even useful) to make the first line say #!perl. If nothing else, it tells your maintenance programmer as soon as he or she gets ready to fix it that it's a Perl program.

If that #! line is wrong, you'll generally get an error from your shell. This may be something unexpected, like "file not found." It's not your program that's not found, though; it's /usr/bin/perl that wasn't where it should have been. We'd make the message clearer, but it's not coming from Perl; it's the shell that's complaining. (By the way, you should be careful to spell it usr and not user -- the folks who invented Unix were lazy typists, so they omitted a lot of letters.)

Another possible problem is that your system doesn't support the #! line at all. In that case, your shell (or whatever your system uses) will probably try to run your program all by itself, with results that may disappoint or astonish you. If you can't figure out what some strange error message is telling you, search for it in the perldiag manpage.

The "main" program consists of all of the ordinary Perl statements (not including anything in subroutines, which we'll see later). There's no "main" routine, as there is in languages like C or Java. In fact, many programs don't even have routines (in the form of subroutines).
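Here's a small sketch of what that means in practice. The subroutine greet and the array @order are hypothetical names chosen for this example; the point is only that the ordinary statements run from top to bottom, and that defining a subroutine doesn't run it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @order;                  # records what ran, in the order it ran

push @order, "first";       # an ordinary statement: runs first

sub greet {                 # hypothetical subroutine; defining it runs nothing
    push @order, "greet";
}

push @order, "second";      # runs next, stepping past the definition
greet();                    # only now does the body of greet run

print "@order\n";           # prints "first second greet"
```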

There's also no required variable declaration section, as there is in some other languages. If you've always had to declare your variables, you may be startled or unsettled by this at first. But it allows us to write "quick-and-dirty" Perl programs. If your program is only two lines long, you don't want to have to use one of those lines just to declare your variables. If you really want to declare your variables, that's a good thing; we'll see how to do that in Chapter 4, "Subroutines".
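A minimal sketch of both styles, with made-up variable names, might look like this. It relies on Perl's default behavior; a program that asks Perl to be stricter (with use strict) would refuse the undeclared variable.

```perl
#!/usr/bin/perl
$greeting = "Hello";    # no declaration: the variable exists as soon as it's used
my $name = "world";     # an optional declaration, using my
print "$greeting, $name!\n";    # prints "Hello, world!"
```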

Most statements are an expression followed by a semicolon. Here's the one we've seen a few times so far:

print "Hello, world!\n";

As you may have guessed by now, this line prints the message Hello, world! At the end of that message is the shortcut \n, which is probably familiar to you if you've used another language like C, C++, or Java; it means a newline character. When that's printed after the message, the print position drops down to the start of the next line, allowing the following shell prompt to appear on a line of its own, rather than being attached to the message. Every line of output should end with a newline character. We'll see more about the newline shortcut and other so-called backslash escapes in the next chapter.
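The following sketch shows the difference that the \n makes. The variable names are invented for the example, and it uses the built-in length function to show that the shortcut stands for a single character.

```perl
#!/usr/bin/perl
my $with    = "Hello, world!\n";
my $without = "Hello, world!";

print $without;     # no newline: whatever is printed next continues on this line
print "\n";         # so finish that line by hand
print $with;        # this message ends its own line

# The two strings differ by exactly one character: the newline.
print "extra characters from \\n: ", length($with) - length($without), "\n";
```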

1.4.3. But How Do I Compile Perl?

You may be surprised to learn that all you have to do to compile Perl is to run it. When you run your program, Perl's internal compiler first runs through your entire source, turning it into internal bytecodes (an internal data structure representing the program); then Perl's bytecode engine actually runs them.[35]

[35]As usual, there's more to the story than what we say here. But this should be close enough for all but the technically advanced folks, and they already know about this.

So, if there's a syntax error on line 200, you'll get that error message before you start running line two.[36] If you have a loop that runs 5000 times, it's compiled just once; the actual loop can then run at top speed. And there's no runtime penalty for using as many comments and as much whitespace as you need to make your program easy to understand. You can even use calculations involving only constants, and the result is a constant computed once as the program is beginning -- not each time through a loop.

[36]Unless line two happens to be a compile-time operation, like a BEGIN block or a use invocation.
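The two phases can be made visible with a BEGIN block. This is just a sketch, and the array @events is a name made up for it: the BEGIN block comes later in the file than the first push, yet it runs earlier, because it's executed while the program is still being compiled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @events;                              # package variable, usable in both phases

push @events, "run time";                 # ordinary statement: waits for the run phase

BEGIN { push @events, "compile time" }    # runs as soon as the compiler has parsed it

print join(", ", @events), "\n";          # prints "compile time, run time"
```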

To be sure, this compilation does take time -- it's inefficient to have a voluminous Perl program that does one small quick task (out of many potential tasks, say) and then exits, because the runtime for the program will be dwarfed by the compile time. But the compiler is very fast; normally the compilation will be a tiny percentage of the runtime.

An exception might be if you were writing a program to be run over the Web, where it may be called hundreds or thousands of times every minute. (This is a very high usage rate. If it were called a few hundred or a few thousand times per day, like most programs on the Web, we probably wouldn't worry too much about it.) Many of these programs have very short runtimes, so the issue of recompilation may become significant. If this is an issue for you, you'll want to find a way to keep your program resident in memory between invocations (whether it's written in Perl or not); see the documentation for your web server and ask your local expert for help with this.[37]

[37]Point your local expert to http://perl.apache.org for one possible solution.

What if you could save the compiled bytecodes to avoid the overhead of compilation? Or, even better, what if you could turn the bytecodes into another language, like C, and then compile that? Well, both of these things are possible (although beyond the scope of this book), but they won't make most programs any easier to use, maintain, debug, or install, and they may (for somewhat technical reasons) make your program even slower.[38] We don't know anyone who has ever needed to compile a Perl program (except for experimental purposes), and we doubt you will ever meet one, either.

[38]On many (perhaps most) systems where you might want to compile a Perl program, the perl binary (the program that executes your Perl programs) is always in use by some process, so it's always resident in memory. A "compiled Perl" program will take time to load into memory. If it's a small program, it would probably compile at least as fast as it takes to load a compiled executable. If it's a large one, compilation is probably an insignificant part of its runtime anyway.