I'm not completely sure I understand your questions so let me know if I miss something.
I don't think you need the domainsID and benefitsID fields because they aren't really used for anything. Instead I would make the combination of jobID and domain (or benefit) be the primary key. That would guarantee uniqueness so a particular job cannot have the same domain (or benefit) twice. (For example, a particular job could not have "retail" listed in two records.)
Quote:
If I join all I will have a number of records equal to 1 + number of domains with that jobID + number of benefits with that jobID.
|
I can think of two main approaches.
1. This is the method you mention where you just fetch everything in one big query. It's fairly brute force but may not be all that bad. In this example, the duplicate data is the company and job data and there isn't much of that here so you're not wasting too much space.
However, in a real application there might be a lot more information in the company and job tables and then fetching duplicates of that data for every row would add a lot of unnecessary data to be fetched and transferred from the database. You could just fetch the minimum of data from the company and job tables and then get more data later if you need it.
2. A second approach would be to have the program fetch the company and job data and then perform separate queries to fetch the corresponding domain and benefits data. This would require three separate queries (four if you fetch the company and job data separately--that would depend on how you fetch the job). There will be some extra overhead in each fetch but it shouldn't be too bad.
Some of this goes back to how the user selects the job. For example, you could list the companies and let the user pick one, then list the jobs and let the user pick one, and then finally list the domains and benefits of the job. In that scenario the program doesn't have the information it needs to fetch all of the data at once anyway so it can naturally fetch more information as the user selects details. In this case, this approach works very well.
For another example, suppose the user enters some data such as desired start date, salary, and domains. You could execute a query to get the job data for jobs that match those criteria and then display a list of the matching jobs. The user could then pick one to see the detail. Again the program doesn't need to fetch all of the data until the user makes some choices.
In contrast, suppose you want to select every job and process them all in some way. In that case it would probably be better to get all of the data at once instead of performing a huge number of separate fetches.
Unfortunately I donât think standard SQL syntax does a good job of supporting this kind of hierarchical data where one set of job data corresponds to many rows of domain and benefit data. Relational databases are really designed to select table-like results.
You may be able to get some mileage from XML queries. I havenât done much of that so Iâm not sure if it will be a big help or whether it will slow things down. For more information, see:
XML Support in Microsoft SQL Server 2005
I usually use the approach where the user makes a series of selections to drill down to the desired data. I would probably either use that or try selecting everything in one huge result and see if thereâs a problem. My guess is that it should work without causing you too many problems.