home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


5.3 Bulk Querying with the BULK COLLECT Clause

So now you know how to perform bulk DML with FORALL. What about bulk querying? I showed an example earlier of the kind of code that cries out for a bulk transfer. Here is the executable section of that example:

BEGIN
   FOR bad_car IN major_polluters
   LOOP
      names.EXTEND;
      names (major_polluters%ROWCOUNT) := bad_car.name;
      mileages.EXTEND;
      mileages (major_polluters%ROWCOUNT) := bad_car.mileage;
   END LOOP;

   ... now work with data in the arrays ...
END;

Not only do I have to write a bunch of code, but it is also much slower than I would like, because of the context switches.

To help out in this scenario, PL/SQL now offers the BULK COLLECT keywords. This clause in your cursor (explicit or implicit) tells the SQL engine to bulk bind the output from the multiple rows fetched by the query into the specified collections before returning control to the PL/SQL engine. The syntax for this clause is:

... BULK COLLECT INTO 
collection_name
[, 
collection_name
] ...

where collection_name identifies a collection.

Here are some rules and restrictions to keep in mind when using BULK COLLECT:

  • You can use these keywords in any of the following clauses: SELECT INTO, FETCH INTO, and RETURNING INTO.

  • The collections you reference can only store scalar values (string, number, date). In other words, you cannot fetch a row of data into a record structure that is a row in a collection.

  • The SQL engine automatically initializes and extends the collections you reference in the BULK COLLECT clause. It starts filling the collections at index 1, inserts elements consecutively (densely), and overwrites the values of any elements that were previously defined.

  • You cannot use the SELECT...BULK COLLECT statement in a FORALL statement.

Let's explore these rules and the usefulness of BULK COLLECT through a series of examples.

First, here is a recoding of the "major polluters" example using BULK COLLECT:

DECLARE
   names name_varray;
   mileages number_varray;
BEGIN
   SELECT name, mileage
     FROM cars_and_trucks
     
BULK COLLECT INTO names, mileages

    WHERE vehicle_type IN ('SUV', 'PICKUP');

   ... now work with data in the arrays ...
END;

I am now able to remove the initialization and extension code from the row-by-row fetch implementation.

But I don't have to rely on implicit cursors to get this job done. Here is another re-working of the major polluters example, retaining the explicit cursor:

DECLARE
   CURSOR major_polluters IS
      SELECT name, mileage
        FROM cars_and_trucks
       WHERE vehicle_type IN ('SUV', 'PICKUP');
   names name_varray;
   mileages number_varray;
BEGIN
   OPEN major_polluters;
   FETCH major_polluters BULK COLLECT INTO names, mileages;

   ... now work with data in the arrays ...
END;

recommend that you use this second, explicit cursor-based approach -- and that you store your cursors in packages, so that they can be reused. In fact, the optimal approach would involve a bundling of these operations into a procedure, as follows:

/* Filename on companion disk: polluters.pkg */
CREATE OR REPLACE PACKAGE pollution
IS
   CURSOR major_polluters (typelist IN VARCHAR2)
   IS
      SELECT name, mileage
        FROM cars_and_trucks
       WHERE INSTR (typelist, vehicle_type) > 0;

   PROCEDURE get_major_polluters (
      typelist IN VARCHAR2,
      names OUT name_varray,
      mileages OUT number_varray);
END;
/

CREATE OR REPLACE PACKAGE BODY pollution
IS
   PROCEDURE get_major_polluters (
      typelist IN VARCHAR2,
      names OUT name_varray,
      mileages OUT number_varray)
   IS
   BEGIN
      IF major_polluters%ISOPEN
      THEN
         CLOSE major_polluters;
      END IF;
      OPEN major_polluters (typelist);
      FETCH major_polluters BULK COLLECT INTO names, mileages;
      CLOSE major_polluters;
   END;
END;
/

Then I can populate my arrays with a minimum of fuss and a maximum of reusability (of both code and SQL):

DECLARE
   names name_varray;
   mileages number_varray;
BEGIN
   pollution.get_major_polluters ('SUV,PICKUP');

   ... now work with data in the arrays ...
END;

5.3.1 Restricting Bulk Collection with ROWNUM

There is no regulator mechanism built into BULK COLLECT. If your SQL statement identifies 100,000 rows of data, then the column values of all 100,000 rows will be loaded into the target collections. This can, of course, cause serious problems in your application -- and in system memory. Remember: these collections are allocated for each session. So if you have 100 users all running the same program that bulk collects 100,000 rows of information, then real memory is needed for a total of 10 million rows.

What can you do about this potentially hazardous scenario? First of all (as should be common sense in your application regardless of the use of BULK COLLECT), be careful about the queries you write and those you offer to developers and/or users to run. You shouldn't provide unrestricted access to very large tables.

You can also fall back on ROWNUM to limit the number of rows processed by your query. For example, suppose that my cars_and_trucks table has a very large number of rows of vehicles that qualify as major polluters. I could then add a ROWNUM condition to my WHERE clause and another parameter to my packaged cursor as follows:

CREATE OR REPLACE PACKAGE pollution
IS
   CURSOR major_polluters (
      typelist IN VARCHAR2, maxrows IN INTEGER := NULL)
   IS
      SELECT name, mileage
        FROM cars_and_trucks
       WHERE INSTR (typelist, vehicle_type) > 0
         AND ROWNUM < LEAST (maxrows, 10000);

   PROCEDURE get_major_polluters (
      typelist IN VARCHAR2,
      names OUT name_varray,
      mileages OUT number_varray);
END;
/

Now there is no way that anyone can ever get more than 10,000 rows in a single query -- and the user of that cursor (an individual developer) can also add a further regulatory capability by overriding that 10,000 with an even smaller number.

5.3.2 Bulk Fetching of Multiple Columns

As you have seen in previous examples, you certainly can bulk fetch the contents of multiple columns. However, you must fetch them into separate collections, one per column.

You cannot fetch into a collection of records (or objects). The following example demonstrates the error that you will receive if you try to do this:

DECLARE
   TYPE VehTab IS TABLE OF cars_and_trucks%ROWTYPE;
   gas_guzzlers VehTab;
   CURSOR low_mileage_cur IS SELECT * FROM cars_and_trucks WHERE mileage < 10;
BEGIN
   OPEN low_mileage_cur;
   FETCH low_mileage_cur BULK COLLECT INTO gas_guzzlers;
END;
/

When I run this code, I get the following somewhat obscure error message:

PLS-00493: invalid reference to a server-side object or 
           function in a local context

You will instead have to write this block as follows:

DECLARE
   guzzler_type name_varray;
   guzzler_name name_varray;
   guzzler_mileage number_varray;

   CURSOR low_mileage_cur IS 
      SELECT vehicle_type, name, mileage
        FROM cars_and_trucks WHERE mileage < 10;
BEGIN
   OPEN low_mileage_cur;
   FETCH low_mileage_cur BULK COLLECT 
    INTO guzzler_type, guzzler_name, guzzler_mileage;
END;
/

5.3.3 Using the RETURNING Clause with Bulk Operations

You've now seen BULK COLLECT put to use for both implicit and explicit query cursors. You can also use BULK COLLECT inside a FORALL statement, in order to take advantage of the RETURNING clause.

The RETURNING clause, new to Oracle8, allows you to obtain information (such as a newly updated value for a salary) from a DML statement. RETURNING can help you avoid additional queries to the database to determine the results of DML operations that just completed.

Suppose Congress has passed a law (overriding the almost certain presidential veto) requiring that a company pay its highest-compensated employee no more than 50 times the salary of its lowest-paid employee.[ 1 ] I work in the IT department of the newly merged company Northrop-Ford-Mattel-Yahoo-ATT, which employs a total of 250,000 workers. The word has come down from on high: the CEO is not taking a pay cut, so we need to increase the salaries of everyone who makes less than 50 times his 2004 total compensation package of $145 million -- and decrease the salaries of all upper management except for the CEO. After all, somebody's got to make up for this loss in profit.

[1] Currently in the United States, the average is more like 250 times, a very inequitable situation that almost certainly causes hundreds of thousands of children to go hungry each day in our very rich nation.

Wow! I have lots of updating to do, and I want to use FORALL to get the job done as quickly as possible. However, I also need to perform various kinds of processing on the employee data and then print a report showing the change in salary for each affected employee. That RETURNING clause would come in awfully handy here, so let's give it a try.

See the onlyfair.sql file on the companion disk for all of the steps shown here, plus table creation and INSERT statements.

First, I'll create a reusable function to return the compensation for an executive:

/* Filename on companion disk: onlyfair.sql */
FUNCTION salforexec (title_in IN VARCHAR2) RETURN NUMBER
IS
   CURSOR ceo_compensation IS
      SELECT salary + bonus + stock_options + 
             mercedes_benz_allowance + yacht_allowance
        FROM compensation
       WHERE title = title_in;       
   big_bucks NUMBER;
BEGIN
   OPEN ceo_compensation; 
   FETCH ceo_compensation INTO big_bucks;
   RETURN big_bucks;
END;
/

In the main block of the update program, I declare a number of local variables and the following query to identify underpaid employees and overpaid employees who are not lucky enough to be the CEO:

DECLARE  
   big_bucks NUMBER := salforexec ('CEO');
   min_sal NUMBER := big_bucks / 50;
   names name_varray;
   old_salaries number_varray;
   new_salaries number_varray;
   
   CURSOR affected_employees (ceosal IN NUMBER)
   IS
      SELECT name, salary + bonus old_salary
        FROM compensation
       WHERE title != 'CEO'
         AND ((salary + bonus < ceosal / 50) 
              OR (salary + bonus > ceosal / 10)) ;

At the start of my executable section, I load all this data into my collections with a BULK COLLECT query:

OPEN affected_employees (big_bucks);
FETCH affected_employees 
   BULK COLLECT INTO names, old_salaries;

Then I can use the names collection in my FORALL update:

FORALL indx IN names.FIRST .. names.LAST
   UPDATE compensation
      SET salary = 
          DECODE (
             GREATEST (min_sal, salary), 
                min_sal, min_sal,
             salary / 5)
    WHERE name = names (indx)
    RETURNING salary BULK COLLECT INTO new_salaries;

I use DECODE to give an employee either a major boost in yearly income or an 80% cut in pay to keep the CEO comfy. I end it with a RETURNING clause that relies on BULK COLLECT to populate a third collection: the new salaries.

Finally, since I used RETURNING and don't have to write another query against the compensation table to obtain the new salaries, I can immediately move to report generation:

FOR indx IN names.FIRST .. names.LAST
LOOP
   DBMS_OUTPUT.PUT_LINE (
      RPAD (names(indx), 20) || 
      RPAD (' Old: ' || old_salaries(indx), 15) ||
      ' New: ' || new_salaries(indx)
      );
END LOOP;

Here, then, is the report generated from the onlyfair.sql script:

John DayAndNight     Old: 10500     New: 2900000
Holly Cubicle        Old: 52000     New: 2900000
Sandra Watchthebucks Old: 22000000  New: 4000000

Now everyone can afford quality housing and health care; tax revenue at all levels will increase (nobody's a better tax deadbeat than the ultra-rich), so public schools can get the funding they need. Hey, and rich people are even still rich -- just not as rich as before. Now that is what I call a humming economy!

TIP: The RETURNING column values or expressions returned by each execution in FORALL are added to the collection after the values returned previously. If you use RETURNING inside a non-bulk FOR loop, previous values are overwritten by the latest DML execution.


Previous: 5.2 Bulk DML with the FORALL Statement Oracle PL/SQL Programming Guide to Oracle 8i Features Next: 5.4 Using Cursor Attributes
5.2 Bulk DML with the FORALL Statement Book Index 5.4 Using Cursor Attributes

The Oracle Library Navigation

Copyright (c) 2000 O'Reilly & Associates. All rights reserved.

Library Home Oracle PL/SQL Programming, 2nd. Ed. Guide to Oracle 8i Features Oracle Built-in Packages Advanced PL/SQL Programming with Packages Oracle Web Applications Oracle PL/SQL Language Pocket Reference Oracle PL/SQL Built-ins Pocket Reference