Processing of 10 Million Records

kalyan_samaddar · June 18th, 2003, 06:45 AM

We are in a process to design a batch, which will read around 10 Millions records from various tables and then gather their associated details from other related tables. Then process the data and finally insert the data into a separate table.
Steps in broad prospective will be as follows

1. Read the 10 Millions data from base tables and then insert the data into a temp table, from which rest of the processing will be done. Temporary table is used since we canât modify the original table
2. Temporary table data will be updated for other related information from related tables, we expect minimum 6 different updates will be needed to capture all required information.
3. Once Temporary table is having all the necessary data, records will be taken into a cursor and all 10 million records will be processed one by one. After each record is processed, data will be inserted into a table.
4. Finally summarization will be done and summarized data will be inserted into different tables.

I need your suggestion about the following points considering the following HW specs
Oracle 8.1.7 on HP UNIX-11
2 CPUs
1 GB Ram

1. What should be the optimum SGA and Rollback Segment Size considering single record size around 500 bytes
2. Regarding step no 1, is it possible to accomplish this task with a single insert statement such as INSERT INTO table1 VALUES SELECT x,y,z FROM table2 WHERE conditions
3. Regarding step no 1, whether SQL*LOADER will be a better option to accomplish the task?
4. Regarding step no 2, is it possible to update all the 10 million records in a single update statement. If possible then what would be the optimal DB setting such as
a. Rollback Segments
b. Sort area
c. SGA
d. data block buffers
e. Sort_area_size.
5. Regarding step no 3, how should we take care that rollback segment is not filled up during the insert.
6. Regarding step no 3, will Commit after optimum interval help to reduce the chance of Rollback segment problem. Optimum Commit frequency will also apply to step 2 if single update is not feasible.
7. If we Partition the final destination table as in step no 3, will that improve the insert performance?
8. How parallel processing can help to speed up the processing considering we have 2 CPUs
9. Is there any other suggestion to speed up the process

andrewst · July 9th, 2003, 07:50 AM

I'm not sure I like the approach, though of course without knowing the full details I may be wrong.

What is the purpose of the temporary table? Why not just select the data from the original base tables, join to other tables for additional info, and insert straight into the target tables?

Anyway to address your specific points (where I can):
2. yes
3. no, SQL Loader is for loading from FILES not tables.
4. yes - in fact you could probably get all the required information in the first SELECT statement.
6. no, periodic commits are generally a bad idea
9. avoid using temporary tables, cursor loops, opening/fetching/closing several cursors within cursor loops.

The ideal solution would be:

INSERT INTO target_table
SELECT ...
FROM source_table, ...
WHERE ...;

... with all of the lookups and processing done in the SELECT statement.

Tony Andrews