[Chapter 24] 24.2 Debugging Tips and Strategies

24.2 Debugging Tips and Strategies

I do not pretend in this chapter to offer a comprehensive primer on debugging. The following tips and techniques, however, should improve upon your current set of error-fixing skills.

24.2.1 Gather Data

Gather as much data as possible about when, where, and how the error occurs. It is very unlikely that the first occurrence of an error will give you all the information you will want or need to figure out the source of that error. Upon noticing an error, the temptation is to show off one's knowledge of the program by declaring, "Got it! I know what's going on and I know exactly what to do to fix it." Such a move can be very gratifying when it turns out that you do have a handle on the problem, and that will be the case for simple bugs. Some problems can appear simple, however, and turn out to require extensive testing and analysis. Save yourself the embarrassment of pretending (or believing) that you know more than you actually do. Before rushing to change your code, take these steps:

Run the program again to see if the error is reproducible. This will be the first indication of the complexity of the problem. It is almost impossible to determine the cause of a problem if you are unable to get it to occur predictably. Once you work out the steps needed to cause the error to occur, you will have gained much valuable information about its cause.
Narrow the test case needed to generate the error. I recently had to debug a problem in one of my Oracle Forms modules. A pop-up window would lose its data under certain circumstances. At first glance, the rule seemed to be: "For a new call, if you enter only one request, that request will be lost." If I had stopped testing at this point, I would have had to analyze all code that initialized the call record and handled the INSERT logic. Instead, I tried additional variations of data entry and soon found that the data was lost only when I navigated to the pop-up window directly from a certain item. Now I had a very narrow test case to analyze and it was very easy to uncover the error in logic.
Examine the circumstances under which the problem does not occur. "Failure to fail" can offer many insights into the reason an error does occur. It also helps you narrow down the sections of code and the conditions you have to analyze when you go back to the program.

There can be no doubt that the best way to decrease the effort and time required to debug your application is to write code that has fewer bugs. Such code will have fewer errors, and the errors it does have are more likely to be typos and superficial mistakes. A single, fundamental logical flaw in a program can generate a multitude of other symptomatic errors.

We all have to accept the fact that it takes more effort and time to be careful and methodical. Yet we are almost always in a rush -- to meet a deadline, to go home and play with the baby, to go bowling. And so we truly, deeply believe that "this next one is easy." Heck, why not just code it straight out?

The morning after such a lapse in judgment, however, is uniformly painful and embarrassing. If you have a reputation as a guru with a tool, you may not want to have to go through several rounds of test, debug, and fix to get your program to work. If you are just learning the tool and working hard for a promotion, you can't afford to reveal sloppy programming habits. If you are responsible for a complex project, which is bound to be behind schedule, you've got to make every hour, every line of code count.

There is no substitute for getting it right the first time. Here are a few of the things you can do to minimize the number of bugs you introduce into a program:

Understand the purpose of the program and its logic. You should be thoroughly comfortable with the objective of your coding task. You should have a strong picture in your mind of the overall structure of the program and how that relates to the documented specifications before you write a single line of code. With this fundamental grounding, it will be much more difficult for you to introduce gross errors into the program.
Use top-down design and stepwise refinement to help keep your eyes on the prize. It is difficult, especially with complicated programs, to translate specifications directly into code without getting a bit lost in the programming details. Before you start coding, use pseudo-code at a very high level to list the different steps or functions needed to build the program. Then, for each of these steps in turn, focus your attention just on the details needed to implement that step. In this way you isolate complexities behind your steps (which will, you hope, turn into modules) and concentrate more effectively on the structure of your program before you worry about its details.
Modularize, modularize, modularize! Once you are ready to code, avoid writing one large, amorphous program. Instead, break up your code into separate modules, each with a specific purpose, that you can implement and test individually.

In addition to these high-level tips, follow the many pieces of advice offered throughout the book to improve the quality, readability, and maintainability of your code. Taken together and used together, these techniques should greatly improve your chances of bug-free programs.

The more information you gather about the problem at hand, the easier it will be to solve that problem. It is worth the extra time it will take to assemble the evidence. So even when you are absolutely sure you are on to that bug, hold off and investigate a little further.

24.2.2 Remain Logical at All Times

Symbolic logic is the life-blood of programmers. No matter which programming language you use, the underlying logical framework is a constant. PL/SQL has one particular syntax. The C language uses different keywords and the IF statement looks a little different. The elegance of LISP demands a very different way of building programs. But underneath it all, symbolic logic provides the backbone on which you hang the statements that solve your problems.

The reliance on logical and rational thought in programming is one reason that it is so easy for a developer to learn a new programming language. As long as you can take the statement of a problem and develop a logical solution step by step, the particulars of a language are secondary.

With logic at the core of our being, it amazes me to see how often we programmers abandon this logic and pursue the most irrational path to solving a problem. We engage in wishful thinking and highly superstitious, irrational, or dubious thought processes. Even though we know better -- much better -- we find ourselves questioning code that conforms to documented functionality, has worked in the past, and surely works right at that moment. This irrationality almost always involves shifting the blame from oneself to "the other": the computer, the compiler, our spouse, the word processor, whatever. Anything and anybody but our own pristine selves!

When you attempt to shift blame, you only put off solving your problem. Computers and compilers may not be intelligent, but they are very fast and they are very consistent. All they can do is follow rules, and you write the rules for them in your program. So when you uncover a bug in your code, take responsibility for that error. Assume that you did something wrong. Don't blame the PL/SQL compiler or Oracle Forms or the text editor.

If you do find yourself questioning a basic element or rule in the compiler that has always worked for you in the past (but maybe not in this precise circumstance, right?), it is time to take a break. Better yet, it is time to get someone else to look at your code. It is amazing how another pair of eyes can focus your own analytical powers on the real causes of a problem.

Strive to be the Spock of Programming. Accept only what is logical. Reject that which has no explanation.

24.2.3 Analyze Instead of Trying

So you have a pile of data, all the clues you could ask for in profiling the symptoms of your problem. Now it is time to analyze that data. For many people, analysis takes the following form: "Hmm, this looks like it could be the answer. I'll make this change, recompile, and try it to see if it works."

What's wrong with this approach? When you try a solution to see what will happen, what you are really saying is:

You are not sure that the change really is a solution. If you were sure, you wouldn't "try" it to see what would happen. You would make the change and then test that change.
You have not fully analyzed the error to understand what causes it to happen. If you know why an error occurs, then you know if a particular change will fix that problem. If you are unsure about the source of the error, you will be tempted to simply try a change and examine the impact. This is, unfortunately, very faulty logic.
Even if the change stops the error from occurring, you cannot be sure that your "solution" really solved anything. Because you are not sure why the problem occurred, the simple fact that the problem does not appear with your particular test (or tests) does not mean that you fixed the bug. The most you can say is that your change stopped the bug from occurring under certain, perhaps even most, circumstances.

To truly solve a problem, you must completely analyze the cause of the problem. Once you understand why the problem occurs, you have found the root cause and you can take the steps necessary to make sure the problem goes away in all circumstances.

When you identify a potential solution, perform a walk-through of your code based on that change. Don't execute your form. Examine your program and mentally try out different scenarios to test your hypothesis. Once you are certain that your change actually does address the problem, you can then perform a test of that solution. You won't be trying anything; you will be verifying a fix.

Analyze your bug fully before you try solutions. If you say to yourself, "Why don't I try this?" in the hope that it will solve the problem, then you are wasting your time and debugging inefficiently.

24.2.4 Take Breaks and Ask for Help

We are often our own biggest obstacles when it comes to sorting out our problems, whether a program bug or a personal crisis. When you are stuck on the inside of a problem, it is hard to maintain an objective distance and, with that distance, a fresh look.

When you are making absolutely no progress and feel that you have tried everything, try these two radical techniques:

Take a break
Ask for help

When I have struggled with a bug for any length of time without success, I not only become ineffective, I also tend to lose perspective. I pursue irrational and superstitious leads. I lose track of what I have already tested and what I have assumed to be right. I get too close to the problem to debug it effectively.

My frustration level usually correlates closely to the amount of time I have sat in my ergonomic chair and perched over my wrist-padded keyboard and stared at my low-radiation screen. Often the very simple act of stepping away from the workstation will clear my head and leave room for a solution to pop into place. Did you ever wake up the morning after a very difficult day at work to find the elusive answer sitting there at the end of your dream?

Make it a rule to get up and walk around at least once an hour when you are working on a problem -- heck, even when you are writing your programs. Give your brain a chance to let its neural networks make the connections and develop new options for your programming. There is a whole big world out there. Even when your eyes are glued to the monitor and your source code, the world keeps turning. It never hurts to remind yourself of the bigger picture, even if that only amounts to taking note of the weather outside your air-conditioned cocoon.

Even more effective than taking a break is asking another person to look at your problem. There is something entirely magical about the dynamic of adding another pair of eyes to the situation. You might have struggled without any result for an hour or two on a problem, and finally, out of desperation, asked a coworker to have a go at it. Then, at the exact moment that you explain the problem to your friend, the solution will jump out at you. It could be a mismatch on names, a false assumption, or a misunderstanding of the IF statement logic. Whatever the case, chances are that you yourself will find it (even though you couldn't for the last two hours) as soon as you ask someone else to find it for you.

If the error does not yield itself quite that easily, you still have lots to gain from the perspective of another person who (a) did not write the code and has no subconscious assumptions or biases about the code, and (b) isn't mad at the program.

Other benefits accrue from asking for help. You improve the self-esteem and self-confidence of other programmers by showing that you respect their opinions. If you are one of the best developers in the group, then your request for help demonstrates that you, too, make mistakes and need help from the team to move forward. This builds the sense (and reality) of teamwork, which will improve the overall development and testing efforts on the project.

24.2.5 Change and Test One Area of Code at a Time

One of my biggest problems when I debug my code is that I try to make too many changes at once. I am overconfident about my development and debugging skills. So I make five or ten changes, rerun my test, and get very unreliable and minimally useful results. I find that my changes cause other problems (a common phenomenon until a program stabilizes and a sure sign that lots more debugging and testing is needed), that some of the original errors are gone, but not all of them, and I have no idea which changes fixed which errors, and which changes caused new errors.

In short, my debugging effort is a mess and I have to back out of changes until I have a clearer picture of what is happening in my program.

Unless you are making very simple changes, you should fix one problem at a time and then test that fix. The amount of time it takes to compile, generate, and test may increase, but in the long run you will be much more productive.

One other aspect of incremental testing and debugging is performing "unit tests" on individual modules before you test a program which calls these various modules. If you test the programs separately and determine that they work, when you debug your application as a whole (in a system test), you do not have to worry about whether those modules return correct values or perform the correct actions. Instead, you can concentrate on the code that calls the modules.

24.2.6 Document and Back Up Your Efforts

Are you offended that I would even say this? Of course we all back up our programs. Of course we all document our efforts.

And pigs can fly.

What I personally cannot believe is that, as recently as September, 1994, I managed to delete the entire contents of my source code directory for a data transfer utility. Sure, there was a backup -- on tape. It was already archived. It took the operations department five hours to recover my code and then I still lost three hours of work from that morning. You see, I hadn't bothered to back up my code to my own convenient refresh directory. And I had decided to clean up my UNIX subdirectories of old data files. My clean-up script, sad to say, had a bug in it, which caused it to remain in my source directory and remove all files.

You can snicker all you want, but the same thing has happened to you, and will happen again, unless you constantly remind yourself to:

Archive your original program version
Copy incremental versions to different files
Save your current changes

Suppose you have finished development on a very complex and difficult program. You are not surprised when you discover many different bugs. You hunker right down to the task of analyzing and fixing the errors. As you complete a number of fixes, you should save that program out to a different file so that you have "frozen" a version to which you can return if necessary. Then if, as you continue debugging, you take a wrong turn and code yourself into a nonproductive corner, you do not have to spend time surgically removing your mistakes. You can instead shift back to an earlier version of the program.

Clearly, there are source code control programs such as PVCS that handle these functions for you. In many cases with Oracle software, however, these utilities are not used and it is completely up to you to take the steps necessary to protect yourself and your code.

24.2.7 Test All Assumptions

Every move you make, every test you take, every line of code you write has assumptions attached to it. You need to be conscious of those assumptions and, when you start your debug cycle, test those assumptions to make sure they are valid.

I recently wasted several hours debugging a problem. I could have solved it in about three minutes -- if I'd properly tested my assumptions. A stored function accepted three parameters and returned a value as shown below:

new_rate := mortgage_rate (down_payment, prime_rate, mortgage_type);

The mortgage_rate function was, unfortunately, setting new_rate to NULL. After examining the function, it was clear to me that the only way it could return a NULL was if one of the inputs was NULL. I then used this information as follows:

I had just examined the global variable holding the mortgage type. That global value was transferred to the mortgage_type variable in an earlier program and passed to my current module, so I knew that it was OK.
I performed a walk-through of the code and could not identify how any of the other two variables could be NULL.

So I inserted a trace statement before and after the call to mortgage_rate. My code now looked like this:

DBMS_OUTPUT.PUT_LINE
   ('Inputs: ' || TO_CHAR (down_payment) || '-' ||
    TO_CHAR  (prime_rate));

new_rate := mortgage_rate (
   down_payment, prime_rate,
   bank, mortgage_type);

DBMS_OUTPUT.PUT_LINE ('Rate: ' || NVL (TO_CHAR (new_rate), 'Missing'));

I ran the program and no matter what data I entered on the screen, my trace statements remained the same:
```
Inputs: 55000-9.5
Rate: Missing
```

I wracked my obviously overwrought brain: what could cause a stored function to return a NULL value? I looked at the source code for the function again. There wasn't much to it. Just division and multiplication. How could it return a NULL without a NULL input?

After two hours of this nonsense, I finally said to myself, "Well, you know that you really haven't verified the value of the mortgage_type variable." I knew that it was OK, but, hey, it wouldn't hurt to check -- and if I didn't solve this one soon I would have to actually ask for help.

So I modified my trace statement and, sure enough, the mortgage type was NULL. Turns out that while the global variable held the proper value, the previous program did not pass it to the local variable properly. My assumption did me in.

One of the first assumptions you will make in your testing is that you have valid inputs to your program. This can also be one of the most dangerous assumptions to make.

Data errors are perhaps the most frustrating of all the kinds of bugs you will encounter. You follow all the right steps for debugging and analyzing your code, only to discover that there is nothing actually wrong with your program. Instead, the data that drives your program is wrong.

I encounter this problem most often in the following situations:

I am testing my code in multiple database instances. Never assume that the data structures and actual rows of data are identical in all instances. Never assume that all indexes have been defined the same way.
The reference data is still unstable. What is the valid code for a "closed" call? What are the ten valid company types? If this data is still changing, your program is likely to break.

If you do not understand why your program is doing what it is doing, make a list of all your assumptions and then test those -- including the data you rely on to run your program. There is a good chance that your error was introduced very early into the process.

24.2.8 Leverage Existing Utilities -- Or Build Your Own

As you build more and more complex programs, you will find it increasingly difficult and incredibly frustrating to manage and debug these programs without a utility of some kind. Take the time to investigate what is available and what it will do for you.

Historically, Oracle Corporation has been very slow to offer debugging and other programmer-oriented utilities. Third-party vendors seem to have taken a clue from Oracle and also have not hurried to provide a strong set of tools for PL/SQL developers. As of mid-year 1997, that situation is finally changing. You can now purchase debuggers from the following vendors:

Oracle Corporation: Procedure Builder
Platinum Technology: SQL Station Debugger
Technosolutions: SQL Navigator
Compuware: XPediter/SQL

All of these products greatly improve the ability to debug client-side PL/SQL; you will need to carefully examine the specific benefits and features before deciding which of these (and, I hope, by the time this book is published, others as well) fit your needs most closely.

If, on the other hand, you cannot find anything that will help (or you can't get the approval to buy the utility of your dreams), you might consider building your own. I have found in the past that it is relatively straightforward to implement utilities that have a significant impact on my debugging capabilities. I built XRay Vision, a debugger for SQL*Forms, implemented entirely in SQL*Forms itself, which gave me the ability to view and modify all variables in my programs. You'll find this debugger, xrayvizn.zip , on the RevealNet Web site. While you are unlikely to be using SQL*Forms at this point, you may find the source code of interest (stored in the good, old INP files).

When Oracle Forms 4.x (even the version in Oracle Developer/2000 that has its own source code debugger) came along, I realized that it was impossible to view and change data structures created at runtime (record groups, parameter lists, etc.). So I built a utility named Arjy (pronounced "RG" for Record Group), which gave me that access. The shareware version of Arjy, arjy.zip is also available at the RevealNet Web site.

The basic PL/SQL product from Oracle Corporation will never have everything you need. If you can't find what you need to get the job done, then get creative and take a crack at meeting your own needs. Dive in and build your own utility. Not only will you improve your productivity (and that of others), but you will gain a feeling of intense satisfaction from solving your own problems all by yourself.

24.2.9 Build Debugging Messages into Your Packages

If you do not use a GUI-based, source code debugger, you probably spend a fair amount of time throwing debug or trace messages into your code and then removing them when things are fixed. A much better approach is to leave these messages intact, but give yourself lots of flexibility in deciding when you want to see them.

The simplest model for this technique is actually built right into the DBMS_OUTPUT package. The DBMS_OUTPUT.PUT_LINE procedure displays output from your PL/SQL program when that program completes execution. But it will not show anything unless the package itself is enabled with a call to DBMS_OUTPUT.ENABLE and/or unless from within SQL*Plus you issue the SET SERVEROUTPUT ON command. Furthermore, this is an all-or-nothing proposition: you either see no output, or you see output from every call to this procedure from every corner of your application. That can be overwhelming if you have inserted lots of trace messages.

You can easily come up with a more discerning technique when working with packages. Suppose I have developed a package to perform calculations for profit-and-loss statements. My p_and_l package maintains a last statement date as a global variable for use within the package. I build a "get and set" layer of code around the variable so that applications can retrieve and manipulate the variable -- but only through my code layer. Here is the package:

CREATE OR REPLACE PACKAGE p_and_l
IS
   PROCEDURE set_lastdate (date_in IN DATE);
   FUNCTION lastdate RETURN DATE;

   /* Lots of other stuff, too! */
   . . .
END p_and_l;
/

CREATE OR REPLACE PACKAGE BODY p_and_l
IS
   g_lastdate DATE;

   PROCEDURE set_lastdate (date_in IN DATE)
   IS
   BEGIN
      /* Date cannot be in future. */
      g_lastdate := LEAST (SYSDATE, date_in);
   END;

   FUNCTION lastdate RETURN DATE
   IS
   BEGIN
      RETURN g_lastdate;
   END;
END p_and_l;
/

As I test this package as part of a large, complex application, suppose that I find that the last date variable is being set improperly, but I can't figure out what is doing it and why. I can go into the p_and_l.set_lastdate procedure and insert a call to DBMS_OUTPUT.PUT_LINE as follows:

PROCEDURE set_lastdate (date_in IN DATE)
IS
BEGIN
   DBMS_OUTPUT.PUT_LINE (`setting date to ` || TO_CHAR (date_in));
   /* Date cannot be in future. */
   g_lastdate := LEAST (SYSDATE, date_in);
END;

but then I have to see all the output in my application and try to find this one statement among all the others. The approach I take instead is to provide a debug "toggle" in my package which allows me to focus output to just the statements I need to see. With the toggle technique, I add three programs to my package specification:

CREATE OR REPLACE PACKAGE p_and_l
IS
   PROCEDURE set_lastdate (date_in IN DATE);
   FUNCTION lastdate RETURN DATE;

   PROCEDURE dbg;
   PROCEURE nodbg;
   FUNCTION debugging RETURN BOOLEAN;

   /* Lots of other stuff, too! */
   . . .
END p_and_l;
/

I also modify the package body to both implement this toggle and use it inside the set_lastdate procedure:

CREATE OR REPLACE PACKAGE BODY p_and_l
IS
   g_lastdate DATE;
   g_dbg BOOLEAN := FALSE;

   PROCEDURE dbg IS BEGIN g_dbg := TRUE; END;
   PROCEDURE nodbg IS BEGIN g_dbg := FALSE; END;
   FUNCTION debugging RETURN BOOLEAN RETURN g_dbg; END;

   PROCEDURE set_lastdate (date_in IN DATE)
   IS
   BEGIN
      IF debugging
      THEN
         DBMS_OUTPUT.PUT_LINE (`before set ` || TO_CHAR (g_lastdate));
      END IF;

      /* Date cannot be in future. */
      g_lastdate := LEAST (SYSDATE, date_in);

      IF debugging
      THEN
         DBMS_OUTPUT.PUT_LINE (`after set ` || TO_CHAR (g_lastdate));
      END IF;
   END;

   FUNCTION lastdate RETURN DATE
   IS
   BEGIN
      RETURN g_lastdate;
   END;
END p_and_l;
/

Then if I want to see what is happening to the g_lastdate variable, I can issue the debug command in SQL*Plus for this package and see just the output I need:

SQL> exec p_and_l.dbg
SQL> exec testing_program
before set 12-JAN-97
after set 15-JAN-97

Of course, you'd probably want to see more information, such as the execution call stack to see what program called the p_and_l.set_lastdate procedure. You can add anything you want -- all within the confines of the IF debugging clause -- and that information will be available only on a need-to-know basis. You might even decide to free yourself from the confines of DBMS_OUTPUT by writing information out to a database pipe.

Furthermore, if you set as a standard in your group that every package is to have a debug toggle, then it will be much easier for users of those packages to debug their own use (or misuse) of that reusable code. They know that there will be a program named PKG.dbg which can be used to extract additional information about package processing.

This technique is explored in more detail and with a slightly different focus (production support) in Chapter 26, Tracing PL/SQL Execution .