Quote:
quote:Originally posted by owain
One of the biggest advantages of relational data models is the reduced amount of repeated data...
|
One of the biggest advantages of
normalization (and not an RDBMS per se) is the reduced amount of repeated data. Normalization is the concept of "one fact, one time, one place, one way". That's the goal because experience (and a lot research) has shown us that a properly normalized database is better protected against data anomalies which can arise because of improper or insufficient inserts, updates, or deletes of data.
The reality is that normalizing data often means placing the data representing the entity being modeled into multiple tables. When it comes time to bring that data back together for processing, to again fully represent the entity being modeled, there is no choice but to JOIN the tables together. This means that there must be sufficient information in the tables being JOINed to relate them to each other, and that (typically) means that the primary key in one table must be present as a foreign key in another.
This means that the relating data must be present in both tables, and that is a duplication that cannot be avoided.
So the question is whether the relating data should be some sort of artificial meaningless value like an IDENTITY value, or whether it ought to be the "natural" value that already is the real world attribute that distinguishes one entity from another. Why make up a value when nature has already provided one?
We've discussed one reason - the "natural" key is unwieldy (composite), so we invent a strictly internal value which stands in for the natural key. Another reason may be primary key volatility - if, for example, an account number can be changed because of entry error or other circumstances, then it may be that the account number is not a good candidate for the primary key.
I have a hard time understanding any other reasons... :)
Jeff Mason
Custom Apps, Inc.
www.custom-apps.com