Wrox Programmer Forums

Need to download code?

View our list of code downloads.

Go Back   Wrox Programmer Forums > Database > BOOK: Beginning Database Design Solutions ISBN: 978-0-470-38549-4
Password Reminder
Register
Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read
BOOK: Beginning Database Design Solutions ISBN: 978-0-470-38549-4
This is the forum to discuss the Wrox book Beginning Database Design Solutions by Rod Stephens; ISBN: 9780470385494
Welcome to the p2p.wrox.com Forums.

You are currently viewing the BOOK: Beginning Database Design Solutions ISBN: 978-0-470-38549-4 section of the Wrox Programmer to Programmer discussions. This is a community of tens of thousands of software programmers and website developers including Wrox book authors and readers. As a guest, you can read any forum posting. By joining today you can post your own programming questions, respond to other developersí questions, and eliminate the ads that are displayed to guests. Registration is fast, simple and absolutely free .
DRM-free e-books 300x50
 
 
Thread Tools Display Modes
  #1 (permalink)  
Old April 15th, 2012, 05:06 PM
Registered User
Points: 8, Level: 1
Points: 8, Level: 1 Points: 8, Level: 1 Points: 8, Level: 1
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Apr 2012
Posts: 2
Thanks: 0
Thanked 0 Times in 0 Posts
Default how to reconcile inconsistent data sources?

I bought the kindle version of Begging Databese Design Solutions and immediately read it hoping I could find an answer to my problem, but while I filled in holes in my understanding of DB design, I am still struggling with a real world problem and hoping someone can offer some advice or direction.'

I have an online store that sells thousands of products. I buy these products from multiple suppliers with data feeds. Their data is full of errors, so It is difficult to cross-reference my SKUs with theirs. How do I reconcile the contradicting sources of data? It is an on going problem because products are added and discontinued by suppliers regularly.

For example, my product catalog has my SKU (primary key), a UPC (optional unique key) and a combination of manufacturer code and manufacturer part number (alternate candidate key).

The data from my suppliers always has their own variation of a SKU (primary key) which I try to cross-reference, but they often have a contradicting UPC , manufacturer part number or manufacturer (a result of historic corporate mergers and spinoffs). Sometimes I can algorithmically repair the data, or verify it by matching it with other sources that agree, but I still have a lot of inconsistent data. I imagine that anyone with multiple independent data sources has the same problem.

What does such a multi-datasource model look like?

How do I keep track of all the exceptions and manual corrections while building a cross-reference table?

I even have situations where a manufacturer will accidentally/intentionally reuse a UPC for more than one product or significantly change a product but keep the same part number. Not sure if it is worth incorporating an artificial key just for rare anomalies.

Advice?
Online references?
Books?
  #2 (permalink)  
Old April 16th, 2012, 10:59 AM
Rod Stephens's Avatar
Wrox Author
Points: 3,187, Level: 23
Points: 3,187, Level: 23 Points: 3,187, Level: 23 Points: 3,187, Level: 23
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Jan 2006
Location: , , .
Posts: 646
Thanks: 2
Thanked 96 Times in 95 Posts
Default

Wow. Sorry but I don't know what to do about that.

I would try to clean the data as much as possible coming in so your database isn't full of garbage but it sounds like that will be really hard. You could start by making mapping tables to map bad vendor UPCs and SKUs to good ones but it they aren't consistent that will be hard.

Re Reusing UPCs and other keys, you could make a table that lists UPCs and effective dates, sort of like a temporal table for pricing. Or perhaps use UPC plus description or name as they key. It sounds like a major pain but I'm not sure what else you can do.

You could try pushing back and seeing if you can get vendors to clean thing up a bit but you'll need some leverage (like a really big purchaser would have) to get them to do anything.

Sorry I can't think of anything better.
__________________
Rod

Rod Stephens, Microsoft MVP

Essential Algorithms: A Practical Approach to Computer Algorithms

(Please post reviews at Amazon or wherever you shop!)
  #3 (permalink)  
Old April 16th, 2012, 11:46 AM
Registered User
Points: 8, Level: 1
Points: 8, Level: 1 Points: 8, Level: 1 Points: 8, Level: 1
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Apr 2012
Posts: 2
Thanks: 0
Thanked 0 Times in 0 Posts
Default thanks

Your prompt response is helpful - it confirms that there is no simple solution that everyone but me knows about. I will continue to slog it out. This is where theory and the real world meet ;-)

btw- I found the book easy to digest and interesting to read. Well worth the money spent.
  #4 (permalink)  
Old April 16th, 2012, 12:18 PM
Rod Stephens's Avatar
Wrox Author
Points: 3,187, Level: 23
Points: 3,187, Level: 23 Points: 3,187, Level: 23 Points: 3,187, Level: 23
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Jan 2006
Location: , , .
Posts: 646
Thanks: 2
Thanked 96 Times in 95 Posts
Default

If there is a perfect solution, then you and I may be the only ones who don't know about it.

I would post the question on other forums if you can, too. This one doesn't get as much readership as some others. I would at least try:

SQL Server Developer Center > SQL Server Forums > Database Design

You might also look at some of the other Microsoft forums. Go to Microsoft Developer Network > Forums Home and search for "database" to see what's available.

If you find something worthwhile, please post a link to it here as a followup. I'd like to know what people do about this, too. I'm sure you're not the only one out there.

Quote:
btw- I found the book easy to digest and interesting to read. Well worth the money spent.
Thanks for the kind words! Post a review when you have a chance.
__________________
Rod

Rod Stephens, Microsoft MVP

Essential Algorithms: A Practical Approach to Computer Algorithms

(Please post reviews at Amazon or wherever you shop!)

Last edited by Rod Stephens; April 16th, 2012 at 12:31 PM.
 


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are Off
Pingbacks are On
Refbacks are Off

Similar Threads
Thread Thread Starter Forum Replies Last Post
inconsistent data stealthdevil Access VBA 2 November 10th, 2006 05:03 PM
Joining data from multiple data sources ClaesR BOOK: Professional SQL Server 2005 Reporting Services ISBN: 0-7645-8497-9 0 August 28th, 2006 10:30 AM
Extract part of field from inconsistent data Roly Reefer Access VBA 4 March 31st, 2005 09:25 AM
Inconsistent error when retrieving data from a db darkhalf Classic ASP Databases 2 March 14th, 2004 12:38 PM



All times are GMT -4. The time now is 04:12 PM.


Powered by vBulletin®
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
© 2013 John Wiley & Sons, Inc.