How can you create, use, and maintain your variables and constants most effectively? This section pulls together a number of tips.
Database administrators have long insisted on, and usually enforced, strict naming conventions for tables, columns, and other database objects. The advantage of these conventions is clear: at a single glance anyone who knows the conventions (and even many who do not) will understand the type of data contained in that column or table.
Many programmers, however, tend to get annoyed by such naming conventions. They regard them as just another bureaucracy that prevents them from "getting the job (the program) done." Of course, when it comes to tables and columns, most developers have no choice; they are stuck with what the DBA gives them. But when it comes to their own programs, they are free to name variables, programs, and other identifiers whatever they want. What a rush! Freedom! Room to express oneself!
Or rope with which to hang oneself. While those conventions are a bother to anyone who must follow them, they are a tremendous time- and eye-saver to anyone who must read and understand your code -- even yourself.
If conventions for columns make sense in the database, they also make sense in PL/SQL programs -- and in the development tools, such as Oracle Forms. In general I recommend that you follow the same guidelines for naming variables that you follow for naming columns in your development environment. In many cases, your variables simply represent database values, having been fetched from a table into those variables via a cursor. Beyond these database-sourced variables, however, there are a number of types of variables that you will use again and again in your code.
Generally, I try to come up with a variable name suffix that will help identify the representative data. This convention not only limits the need for additional comments, it can also improve the quality of code, because the variable type often implies rules for its use. If you can identify the variable type by its name, then you are more likely to use the variable properly.
Here is a list of variable types and suggested naming conventions for each.
One of the most compelling reasons for creating your own subtypes is to provide application- or function-specific datatypes that self-document your code. A programmer-defined subtype hides the generic "computer datatype" and replaces it with a datatype that has meaning in your own environment. In naming your subtype, you should consequently avoid references to the underlying datatype and concentrate instead on the business use of the subtype.
Suppose you are building a hotel reservation system. A very useful subtype would be a room number; it is a special kind of NUMBER and is used throughout the application. A room number is always an integer and is always greater than zero. So you could create the subtype as follows:
SUBTYPE room_#_pos_integer IS POSITIVE;
...
-- A declaration using the subtype:
open_room room_#_pos_integer;
The name, room_#_pos_integer, however, provides too much information about the subtype -- this information is not needed by the developer. A much better name for the subtype follows:
SUBTYPE room_number_type IS POSITIVE;
...
-- A declaration using the subtype:
open_room room_number_type;
A word-processing screen might rely heavily on fixed-length, 80-character strings (full lines of text). Which of these subtype declarations would you prefer to use?
SUBTYPE line_fixlen_string IS CHAR;
SUBTYPE full_line_type IS CHAR;
Which of these next two subtypes makes more sense in the variable declarations?
DECLARE
   SUBTYPE use_details_flag_Boolean_subtype IS BOOLEAN;
   use_item_totals use_details_flag_Boolean_subtype;

   SUBTYPE use_details_type IS BOOLEAN;
   use_footers use_details_type;
In both cases, the second declaration is preferable. You can improve the readability of your subtypes if you include a standard prefix or suffix in the names of all subtypes, explicitly defining the identifier as a type. As shown in the following example, you might use a concise _t suffix, the full _type suffix, or a t_ prefix; in each case, a glance at the code will reveal that identifier's nature:
DECLARE
   longitude coordinate_t;
   avail_room room_number_type;
   found_company t_entity_found;
If you set your standards before developers begin to define their subtypes, you will have a consistent naming convention throughout your application.
The one thing that all of the above statements have in common, besides the fact that they are poor excuses for writing poorly designed code, is that I avoid declaring separate variables and constants to meet the different needs in my program. Although I strongly support the idea of "reuse and recycle," both in my neighborhood and with procedures and functions, this principle produces hard-to-read code when applied to variables and constants.
Each variable and constant I declare should have one purpose and one purpose only. The name for that variable or constant should describe, as clearly as possible, that single-minded purpose.
The only reason to create generically-named variables and constants is to save you, the developer, typing time. That is always a terrible reason for a coding style or change in a programming effort. Reliance on a "time-saver" short-cut should raise a red flag: you are probably doing (or avoiding) something now for which you will pay later.
Every application has its own set of special or "magic" values. These values are often configuration parameters, such as the maximum amount of time allowed before an order must be shipped or the name of an application (to be displayed in report and screen headers). Sometimes these special values will be relevant only for a particular program and not for the application as a whole, such as the character used as a string delimiter.
Whatever the scope and form of these values, programmers seem to believe that they will never change. The data in the database will certainly change and you might even need to add a column or two, but nobody will ever need to modify the naming scheme for reports, right? The value used to identify a closed request is always going to be "C," right? In a seemingly unconscious effort to prove this inviolability, we enter these values directly into the code, wherever they are needed, however often they are needed.
And then we are burned when the assumptions change -- usually as the result of a change in business rules. Not only is our belief system undermined, our weekend is lost to minute changes to code. It's always better to be skeptical, with perhaps a tinge of cynicism. Assume that anything you do will have to be changed a week after you write it. Protect yourself in every way possible -- particularly with your constants.
You can, of course, go overboard in your hunt for hardcoded values. Suppose you encounter this very common assignment to increment a counter:
number_of_orders := number_of_orders + 1;
Should you really create a named constant called single_increment, initialize it with a value of one, and then change the above line to the following:
number_of_orders := number_of_orders + single_increment;
I don't think so. The use of the literal value 1 in this situation is perfectly appropriate to the task at hand. Almost any other literal value hardcoded into your program should be replaced with a named constant.
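As a rough sketch of that guideline, the block below replaces two "magic" values with named constants and leaves the counter increment alone. The constant names, the c_ prefix, and the 14-day limit are assumptions made up for illustration; they are not drawn from any real application.

```plsql
DECLARE
   -- Hypothetical application values; names and settings are illustrative only.
   c_closed_status     CONSTANT VARCHAR2(1) := 'C';
   c_max_days_to_ship  CONSTANT PLS_INTEGER := 14;

   request_status    VARCHAR2(1) := 'C';
   order_date        DATE := SYSDATE - 20;
   number_of_orders  PLS_INTEGER := 0;
BEGIN
   -- The literal 1 is appropriate here: incrementing a counter explains itself.
   number_of_orders := number_of_orders + 1;

   -- The status code and the shipping limit are business values, so each
   -- gets a name and a single point of definition.
   IF request_status = c_closed_status
   THEN
      DBMS_OUTPUT.PUT_LINE ('Request is closed');
   END IF;

   IF SYSDATE - order_date > c_max_days_to_ship
   THEN
      DBMS_OUTPUT.PUT_LINE ('Order is past its shipping deadline');
   END IF;
END;
/
```

If the business later decides that closed requests are marked "X", or that orders may take 21 days, only the two declarations change.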
There is a critical difference between a variable and a named constant. A variable can be assigned values; its value can be changed in the code. The value of a named constant is a constant; its value may not be altered after it has been declared. If you find that you have written a program in which a local variable's value does not change, you should first determine if that behavior is correct. If all is as it should be, you should then convert that variable to a constant.
Why should you bother converting "read only" variables to constants (and named ones at that)? Because when you "tell it and use it like it is," the program explains itself more clearly. The declaration of a named identifier as a constant gives you information about how it should be used in the program.
If you do convert a variable to a constant, you should also change its name. This will help to remind anyone reading the code that your identifier refers to a constant and cannot be changed. See the earlier tip on naming conventions for ideas on how to name your constants.
Directly related to improved readability, the declaration of a constant can also help with maintenance and debugging. Suppose you define and use a variable under the assumption that its default value will not change. You do not, however, go to the trouble of declaring that variable as a constant. You are aware of this assumption and use the variable properly. What happens, however, when a month or a year later another programmer has to perform maintenance on the code? This person might change the value of that variable in the course of making changes, and then cause a ripple effect of errors. If the variable had been declared as a constant from the start, such a change would not compile.
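A minimal sketch of that safety net follows. The identifiers max_line_length and current_length are hypothetical. The offending assignment is left commented out so that the block compiles as shown.

```plsql
DECLARE
   max_line_length  CONSTANT PLS_INTEGER := 80;   -- hypothetical constant
   current_length   PLS_INTEGER := 0;
BEGIN
   -- Reading a constant is always allowed.
   current_length := max_line_length;

   -- Uncommenting the next line keeps the block from compiling. The
   -- compiler rejects it with PLS-00363, reporting that the expression
   -- cannot be used as an assignment target.
   -- max_line_length := 132;
END;
/
```

The maintenance programmer finds out about the assumption at compile time rather than through a ripple of runtime errors.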
I have always been amazed at how quickly and easily a program can turn into spaghetti code. Consider this scenario. A program starts out with weak specifications. As a result, users change requirements at the same time that the developer implements the functions to support those requirements. A program evolves (or does it devolve?) rapidly, with whole sections of code moved, removed, or revamped. After many rounds of approximating a solution, the program works to the users' satisfaction. You wipe the sweat from your brow and gratefully move on to the next screen or report, hoping never to have to touch that program again.
Consider another scenario. After you wipe the sweat from your brow, you take a deep breath and dive into clean-up mode. There will never be a better time to review all the steps you took and understand the reasons you took them than immediately upon completion of your program. If you wait, you will find it particularly difficult to remember those parts of the program which were needed at one point, but were rendered unnecessary in the end. You should not view these "dead zones" in your code as harmless backwaters, never traveled so never a bother. In fact, unexecuted portions of code become sources of deep insecurity for maintenance programmers.
You should go through your programs and remove any part of your code that is no longer used. This is a relatively straightforward process for variables and named constants. Simply execute searches for a variable's name in that variable's scope. If you find that the only place it appears is its declaration, delete the declaration and, by doing so, delete one more potential question mark from your code.
Always, always use the %TYPE attribute to declare variables which are actually PL/SQL representations of database values. When you think about it, this includes a lot of your variables; using %TYPE sometimes takes lots more typing, but it improves your code substantially.
Suppose you have a procedure in Oracle Forms that formats information about a customer. You need to declare a variable for each attribute of the customer: first name, last name, address, Social Security number, etc. "Hardcoded" declarations would look like this:
PROCEDURE format_customer
IS
   first_name VARCHAR2 (30);
   last_name VARCHAR2 (30);
   address VARCHAR2 (60);
   city VARCHAR2 (60);
   state VARCHAR2 (2);
   soc_sec# VARCHAR2 (11);
BEGIN
   ... interact with database customer information ...
END;
There are a few problems associated with this declaration style. As mentioned previously, these declarations are going to work only as long as the structure of the table does not change and the resulting changed data does not violate the above size constraints. Another drawback is that there is no documented relationship between the variables and the table columns. For example, is the city variable the city in which the customer lives or the city in which the sale was made? I need additional documentation here to guide my understanding of the ways these variables will be used.
These problems are all resolved with the %TYPE attribute. The declaration section shown in the preceding example has a very different look when explicit declarations are replaced with the referential %TYPE declarations:
PROCEDURE format_customer
IS
   first_name customer.cust_fname%TYPE;
   last_name customer.cust_lname%TYPE;
   address customer.cust_addr_l1%TYPE;
   city customer.cust_city%TYPE;
   state customer.cust_state_abbrev_cd%TYPE;
   soc_sec# customer.cust_soc_sec_number%TYPE;
BEGIN
   ... interact with database customer information ...
END;
Using the %TYPE attribute ensures that my variables stay synchronized with my database structure. Just as importantly, though, this declaration section is more self-documenting now. The %TYPE attribute provides important information to anyone reviewing the code, stating: "These variables represent my columns in the program. When you see one of these variables, think `database column'." This correlation makes it easier to understand the code, easier to change the code, and easier to recognize when one of those variables is used in an inappropriate manner.
Notice that the variable name does not have to match the column name. The %TYPE attribute guarantees only that the datatype of the variable matches the datatype of the column. While a name matchup is not required, I generally try to name my %TYPE variables the same as my column names. The identical name strongly reinforces the fact that this variable "represents" my column in this program. While name synchronization can be a nuisance (database administrators often insist on somewhat obscure and rigid naming conventions, such as cust_state_abbrev_cd for the two-character abbreviation of a customer's state), you cannot escape a DBA's conventions. Because you must use that table, why not make your programs as consistent as possible with the underlying database?
While it is true that many (perhaps even most) of your local PL/SQL variables are directly related to database columns, at least some of your variables are local-only, perhaps calculated values based on database columns. You can also use the %TYPE attribute to infer a variable's datatype from another, previously defined PL/SQL variable, as I'll explain in this section.
The following declarations use this alternative source:
DECLARE
   revenue_data NUMBER(20,2);
   total_revenue revenue_data%TYPE;
   --
   max_available_date DATE := LAST_DAY (ADD_MONTHS (SYSDATE, 3));
   last_ship_date max_available_date%TYPE;
The variable called revenue_data acts as the standard variable for revenue data. Whenever I declare my total_revenue variable (or any other revenue-related variables), I base it on the general revenue_data variable. By doing this, I can guarantee a consistent declaration of revenue variables. Furthermore, if I ever need to change my revenue datatypes again, I only have to change the way that revenue_data is declared and recompile. All variables declared with revenue_data%TYPE will automatically adjust.
Note that while max_available_date has a default value as well, it is not applied to last_ship_date. Everything up to the optional default value assignment (initiated with a DEFAULT keyword or assignment operator) in a declaration is used in the %TYPE declaration, such as NOT NULL and the datatype. The default value, if specified in the source variable declaration, is ignored.
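The behavior can be checked with a small anonymous block such as the sketch below. The max_available_date and last_ship_date declarations come from the example above. The line_count and page_count variables are hypothetical additions that illustrate the NOT NULL case.

```plsql
DECLARE
   max_available_date  DATE := LAST_DAY (ADD_MONTHS (SYSDATE, 3));
   -- The datatype is copied; the default value is not.
   last_ship_date      max_available_date%TYPE;

   -- Hypothetical NOT NULL source variable. The constraint carries over,
   -- so the %TYPE declaration must supply its own initial value.
   line_count  PLS_INTEGER NOT NULL := 0;
   page_count  line_count%TYPE := 1;
BEGIN
   IF last_ship_date IS NULL
   THEN
      DBMS_OUTPUT.PUT_LINE ('last_ship_date starts out NULL');
   END IF;
END;
/
```

If last_ship_date needs the same starting value as max_available_date, give it its own default in its own declaration.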
To make it easiest for individual developers to be aware of and make use of standard variable declarations, consider creating a package that contains only standard variable declarations and any code necessary to initialize them, as follows:
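One possible shape for such a package is sketched here. The package name std_types and the choice of variables are assumptions made for illustration. The initialization section of the package body runs once per session, the first time the package is referenced.

```plsql
CREATE OR REPLACE PACKAGE std_types
IS
   -- Hypothetical standard declarations, used only as %TYPE anchors.
   revenue_data        NUMBER(20,2);
   max_available_date  DATE;
END std_types;
/

CREATE OR REPLACE PACKAGE BODY std_types
IS
BEGIN
   -- Initialization section for the package variables.
   max_available_date := LAST_DAY (ADD_MONTHS (SYSDATE, 3));
END std_types;
/

-- Any program can then anchor its declarations to the package:
DECLARE
   total_revenue   std_types.revenue_data%TYPE;
   last_ship_date  std_types.max_available_date%TYPE;
BEGIN
   total_revenue := 0;
   last_ship_date := std_types.max_available_date;
END;
/
```

With this arrangement, a change to the declaration of revenue_data in the package specification flows through to every anchored variable once the dependent programs are recompiled.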
A variable is a chunk of memory that has a name. A variable can hold a simple value. It can also be assigned the value of the outcome of an arbitrarily complicated expression -- either through a default value setting or an assignment. In this way a variable can represent that complex expression and thus be used in place of that expression in your code. The result is a program which is easy to read and maintain.
Consider the following code fragment:
IF (shipdate < ADD_MONTHS (SYSDATE, +3) OR
    order_date >= ADD_MONTHS (SYSDATE, -2)) AND
   cust_priority_type = 'HIGH' AND
   order_status = 'O'
THEN
   ship_order ('EXPRESS');
ELSIF (order_date >= ADD_MONTHS (SYSDATE, -2) OR
       ADD_MONTHS (SYSDATE, 3) > shipdate) AND
      order_status = 'O'
THEN
   ship_order ('GROUND');
END IF;
If I skip past the complicated Boolean expressions and look at the code executed in each IF and ELSIF clause I can "reverse-engineer" my understanding of the code. It looks like the IF statement is used to determine the method by which an order should be shipped. Well, that's good to know. Unfortunately, it would be very difficult to discern this fact from the conditions in the IF statement. Those Boolean expressions with multiple components are, in and of themselves, almost impossible to interpret without drawing a diagram. If there is an error in this logic, no one but the original author would be able to readily untangle the knot.
When you find yourself writing this kind of code (or having to maintain it) and, in the process, stumble through the logic -- either not sure if you got it right or even wondering precisely what "right" is -- it is time for a shift in your approach. Perhaps you are trying to understand the implementation details before you understand the business rules.
You should, in general, avoid implementing complicated expressions and logic until you have mapped out and verified that logic independent of PL/SQL code. Are you working from specifications? Then for the above code, your specifications might say something like this:
Before you dive into the PL/SQL IF statement, write some pseudocode based on your specifications. Here is an example:
IF order-overdue
THEN
   IF customer-priority-is-high
   THEN
      ship-express;
   ELSE
      ship-ground;
   END IF;
ELSE
   ship-ground;
END IF;
The next step would be to rewrite this nested IF statement as follows:
IF order-overdue AND
   customer-priority-is-high
THEN
   ship-express;
ELSE
   ship-ground;
END IF;
Even before writing a line of code, I have been able to simplify the logic which meets this specification. At this point I don't know what it means for an order to be overdue. I don't know how to tell if a customer's priority is high. I don't really need to know these details yet. My focus is to make sure I understand the logical requirements. Once this is done, I can recast the pseudocode as real PL/SQL.
My first pass in a conversion to PL/SQL would be a more-or-less direct translation:
BEGIN
   IF order_overdue AND high_priority
   THEN
      ship_order ('EXPRESS');
   ELSE
      ship_order ('GROUND');
   END IF;
END;
I don't know how ship_order will behave and, again, at this moment I don't care. I will employ top-down design to "fill in the blanks" and then work out the details later.
My conditional expression:
order_overdue AND high_priority
substitutes named variables for the pseudocode. To figure out exactly how these variables should be assigned their values, I need to look back at my requirements. Here is what I find:
This last sentence is less a requirement than an implementation instruction. Be that as it may, instead of creating a function for each of these conditions, I can declare Boolean named constants and assign a value to those constants based on the variables representing the order and customer, as shown below:
DECLARE
   /* I hide my business rule behind this constant. */
   order_overdue CONSTANT BOOLEAN
      DEFAULT (shipdate < ADD_MONTHS (SYSDATE, +3) OR
               order_date >= ADD_MONTHS (SYSDATE, -2)) AND
              order_status = 'O';

   high_priority CONSTANT BOOLEAN
      DEFAULT cust_priority_type = 'HIGH';
BEGIN
   IF order_overdue AND high_priority
   THEN
      ship_order ('EXPRESS');
   ELSE
      ship_order ('GROUND');
   END IF;
END;
In this final version of the IF statement, I've used the order_overdue constant to abstract out or hide the two-part check against the order and ship dates. Now the IF-THEN code is much easier to read; in fact, it all but explains itself through the names of the constants. This self-documenting capability reduces the need for separate comments in the code.
By consolidating my redundant code, I also make it easier to maintain my application. If the conditions which make an order overdue change, I do not need to hunt through my code for all the places which perform the order_overdue test. I need only change the default value given to the order_overdue constant.
I can also take this approach a step further and place my business rule logic into a Boolean function. I can then call this function from any of my programs and avoid reproducing the logic in that declaration statement.
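A sketch of such a function follows. It reuses the same test as the order_overdue constant. The parameter names and the decision to pass the order's values in as arguments are assumptions; a real version might accept an order identifier and look up the values itself.

```plsql
CREATE OR REPLACE FUNCTION order_overdue (
   shipdate_in      IN DATE,
   order_date_in    IN DATE,
   order_status_in  IN VARCHAR2)
   RETURN BOOLEAN
IS
BEGIN
   -- Same rule as the order_overdue constant, now defined in one place.
   RETURN (   shipdate_in < ADD_MONTHS (SYSDATE, +3)
           OR order_date_in >= ADD_MONTHS (SYSDATE, -2))
          AND order_status_in = 'O';
END order_overdue;
/
```

A calling program would then test order_overdue (shipdate, order_date, order_status) AND high_priority in its IF statement. As with any Boolean expression in PL/SQL, the function can return NULL when the values it is given are NULL, so callers should decide how that case ought to be handled.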
Copyright (c) 2000 O'Reilly & Associates. All rights reserved.