Processing of 10 Million Records
We are in a process to design a batch, which will read around 10 Millions records from various tables and then gather their associated details from other related tables. Then process the data and finally insert the data into a separate table.
Steps in broad prospective will be as follows
1. Read the 10 Millions data from base tables and then insert the data into a temp table, from which rest of the processing will be done. Temporary table is used since we canât modify the original table
2. Temporary table data will be updated for other related information from related tables, we expect minimum 6 different updates will be needed to capture all required information.
3. Once Temporary table is having all the necessary data, records will be taken into a cursor and all 10 million records will be processed one by one. After each record is processed, data will be inserted into a table.
4. Finally summarization will be done and summarized data will be inserted into different tables.
I need your suggestion about the following points considering the following HW specs
Oracle 8.1.7 on HP UNIX-11
2 CPUs
1 GB Ram
1. What should be the optimum SGA and Rollback Segment Size considering single record size around 500 bytes
2. Regarding step no 1, is it possible to accomplish this task with a single insert statement such as INSERT INTO table1 VALUES SELECT x,y,z FROM table2 WHERE conditions
3. Regarding step no 1, whether SQL*LOADER will be a better option to accomplish the task?
4. Regarding step no 2, is it possible to update all the 10 million records in a single update statement. If possible then what would be the optimal DB setting such as
a. Rollback Segments
b. Sort area
c. SGA
d. data block buffers
e. Sort_area_size.
5. Regarding step no 3, how should we take care that rollback segment is not filled up during the insert.
6. Regarding step no 3, will Commit after optimum interval help to reduce the chance of Rollback segment problem. Optimum Commit frequency will also apply to step 2 if single update is not feasible.
7. If we Partition the final destination table as in step no 3, will that improve the insert performance?
8. How parallel processing can help to speed up the processing considering we have 2 CPUs
9. Is there any other suggestion to speed up the process
|