Subject: process millions of rows
Posted By: maratg Post Date: 9/17/2003 12:24:54 PM
i need to apply business logic to a table containing millions of rows (15M-30M). I need to process every row. The database is sql server 2000, and front end will be written in asp.net. What are your suggestions for the best way of approaching this problem in order to create a well performing application.
Should business rules be written as stored procs?
Should I partition the table?
Thanks, for your help


Reply By: Jeff Mason Reply Date: 9/17/2003 7:22:09 PM
What is it that you intend to do to all these rows?  How often do you need to do this?

In your case I don't think partitioning will help unless you have multiple servers.

Jeff Mason
Custom Apps, Inc.
www.custom-apps.com
Reply By: maratg Reply Date: 9/17/2003 7:41:21 PM
each row will contain transaction information, each transaction will be checked by the business rules engine to determine if it's "good" or "bad" and that information will be stored in the same or separate table. This process will be performed on a daily basis (overnight).
Why won't partitioning be effective if a single sql server is used? Should business logic reside in the database (procs, scripts)?
Thanks

Reply By: Jeff Mason Reply Date: 9/17/2003 9:10:46 PM
Partitioning simply divides the one large table into several smaller ones.  If your updates were confined to a (relatively) small subset of the rows, and that subset resided in a single partition (or at worst a very small number of partitions), then operations on that smaller table would be more efficient than the same operation on the same subset of the rows in the larger table.  The indexes are smaller, etc.

But, since you indicate that the operation is done on all the rows, the only thing you'll buy is the overhead of managing the multiple partitions.  If you had multiple servers, then the multiple partitions could be spread out over those processors and could be processed in parallel.

Without knowing the extent of your 'business logic' its hard to say for certain, but I suspect you'll find far more efficiency using a stored procedure so the processing takes place on the server and within the database itself.  The alternative would be to ship 30 million rows to a client, and you really don't want to do that.

Jeff Mason
Custom Apps, Inc.
www.custom-apps.com
Reply By: maratg Reply Date: 9/18/2003 12:08:02 PM
Thanks


Go to topic 3116

Return to index page 1043
Return to index page 1042
Return to index page 1041
Return to index page 1040
Return to index page 1039
Return to index page 1038
Return to index page 1037
Return to index page 1036
Return to index page 1035
Return to index page 1034