Wrox Programmer Forums

Need to download code?

View our list of code downloads.

Go Back   Wrox Programmer Forums > Database > BOOK: Beginning Database Design Solutions ISBN: 978-0-470-38549-4
Password Reminder
Register
Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read
BOOK: Beginning Database Design Solutions ISBN: 978-0-470-38549-4
This is the forum to discuss the Wrox book Beginning Database Design Solutions by Rod Stephens; ISBN: 9780470385494
Welcome to the p2p.wrox.com Forums.

You are currently viewing the BOOK: Beginning Database Design Solutions ISBN: 978-0-470-38549-4 section of the Wrox Programmer to Programmer discussions. This is a community of tens of thousands of software programmers and website developers including Wrox book authors and readers. As a guest, you can read any forum posting. By joining today you can post your own programming questions, respond to other developers’ questions, and eliminate the ads that are displayed to guests. Registration is fast, simple and absolutely free .
DRM-free e-books 300x50
Reply
 
Thread Tools Display Modes
  #1 (permalink)  
Old March 21st, 2012, 09:00 AM
Authorized User
Points: 143, Level: 2
Points: 143, Level: 2 Points: 143, Level: 2 Points: 143, Level: 2
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Mar 2012
Posts: 32
Thanks: 9
Thanked 0 Times in 0 Posts
Default Questions regarding the implementation of book concepts

Dear Mr. Stephens, your book is one of my favorites, it presents the concept brilliantly and it is a very pleasant read (for a technical book it reads like a novel).

This being said I found difficulties implementing the concepts presented, probably because my lack of experience.

I would like to start with one of the things that should be easy but I am not clear when it comes to implementation: “Countries” and “States” tables (I will use what you call informal language). Also I assume the three-tier model. Next I will try to implement your ideas following your instructions.

You said that if we have a validation list to convert it into a foreign key constraint. So I assume countries and states qualify, so I create two separate tables. Now both have a (full) name and an abbreviation (standard). In the user interface we want to display countries and states as dropdowns/select elements. Now it will probably make sense to have the abbreviation as (UI Form) select value and the full name as (UI Form) select label. In this case I assume that in the main table I will create a char(2) column to hold the values and in the linked table I will have two columns, one char(2) holding the abbreviated form of the countries or states and the second varchar holding the name of the countries or states accordingly. Is this approach correct as in book’s diagrams I always see something like “State” and sometime not even linked to the main table, but in the main table.

Code:
Registration
+++++++++
…
Country
State
….
Country
++++++
Country
CountryName
The second question will be if the Country and States tables must have a primary key? I assume that the foreign key will be State (and Country respectively), but should I also add a primary key like bellow?

Code:
Country
++++++
Id
Country
CountryName
Assuming that we want to display the States/Provinces for several countries, let say for Canada and United States. In many places I’ve seen both put together in the (UI Form) Select. I guess this is adhoc and probably goes in what you called “mishmashed “ table and you advise against. This means that I will have to have Two tables with States/Provinces, one for Canada and one for United States? In this case the diagram becomes something like States depend on Country that depends on Registration? How will the diagram look in this situation?

I will take a real life example: a job / recruiting site (such as careerbuilder.com or monster.com). The Employer post an ad and the Employee apply/answer to it. I will slim down the table for the Ads to something like:

Code:
Ads
+++
Id
Description
Salary
ContractType
Availability
CareerLevel
Domain
Country
State
AgeMin
AgeMax
CivilStatus
Education
Experience
…
This table will probably have a FK to link it to an Employer account and some other columns, but we can leave them aside as I guess this is enough to clarify a few ideas. In this table a lot of columns are such validation list that in UI will be dropdown menus. Some have a finite number of options: ContractType (full time, part time, per diem, per project…), Availability (local, state wide, nation wide, international), CareerLevel, CivilStatus, Education, Experience. My first question that pops my mind is if it feasible to have so many columns linked to other tables for meaning. To create the UI I assume that I will have to join all in a query to create the dropdown menus. Wouldn’t be better if I do not make this thing at this tier level, but in the programming language basically hard-coding the values?

Another easy thing that bothers me is the “sex” column. I am tempted to use a char(1) and m/f; but when it is true when the Employee register that one can be either Male or Female, when the Employer post the ad he has in fact 3 options: male, female and both. How this issue should be handled?

Another general question is if all these lists go to separate tables, how it will be better to have something similar to the abbreviation in the State/Country in the main table (here Ads) and the full name in the linked table (for example ContractType)? I assume I can even move the meaning entirely in the linked table using in the main table placeholders and defining the column as int(1). Now I will have “1” in the main table, “1” FK in one of the columns of the linked table and “Full Time” (meaning) as another column in the linked?

The most difficult aspect it is probably related to the Domain column that I left it last. Here I have a list of domains from which the Employer can select how many he wants. I assume I will have another two tables besides the “Ads” table: one to hold the Employee selections and one to hold the list of domains available. The same questions above apply here: do I place placeholders in the main table and SelectedDomains table, explain the meaning in the last table DomainsList?

Last question, I took this table as being a compound object. Do I interpret correctly the problem? I would have been tempted to take out some of the columns and leave the filtering at the account level when the Employer sort the information; However what I want is to implement and enforce them at the level of ad. Still it remains the questions if it will be better to split this table further or not.

Thank you.
Reply With Quote
  #2 (permalink)  
Old March 21st, 2012, 11:51 AM
Rod Stephens's Avatar
Wrox Author
Points: 2,999, Level: 22
Points: 2,999, Level: 22 Points: 2,999, Level: 22 Points: 2,999, Level: 22
Activity: 22%
Activity: 22% Activity: 22% Activity: 22%
 
Join Date: Jan 2006
Location: , , .
Posts: 601
Thanks: 1
Thanked 84 Times in 83 Posts
Default

Thanks for the kind words. Post a review when you have a chance! (Just pasting in your first sentence would make a good review.)

(A long one! )

Quote:
“Countries” and “States” tables
You said that if we have a validation list to convert it into a foreign key constraint. So I assume countries and states qualify, so I create two separate tables. Now both have a (full) name and an abbreviation (standard).
The idea behind the lookup table isn’t necessarily to match long names with abbreviations. It’s to make a list of all allowed entries. For States, you could use a table that only has a single field: the states’ full names. Then the main table (Customer or Order or whatever) would have a foreign key constraint to validate a record’s State field so the user can’t enter “Taxas” or “Funzona.”

You could put the abbreviation in the main table if you wanted to, but let’s think about how much disk space that would save.

Suppose the longest state has a 20-letter name. Using the abbreviation in the main record would save 18 characters. Assuming only 1 byte per character (which may not be the case with Unicode) and assuming you have 1,000 records, you save around 18KB of disk space. That’s nothing. Suppose you have 1 million records. In that case you save about 18MB. That’s more but still not all that much. Back in the 1970s and 1980s when disk space cost hundreds of dollars per MB, you would go to great lengths to save that much space. But now that disk space costs about $0.10 per gigabyte, it’s not worth spending much effort to save 18MB.

So I would probably decide which value (full name or abbreviation) that you want the user to see and store that value in the main record. You may still want to store them both in the lookup table for other purposes. For example, the USPS prefers that you use the abbreviation in the address so if you let the user see the full name, you may still want to be able to look up the abbreviation to print address labels and such.

You also need to consider things like how hard would it be to change the name of a state leaving the abbreviation alone. It would be moderately hard (although a cascading update will take care of it for you in many databases) but it’s a very unlikely scenario.

Quote:
The second question will be if the Country and States tables must have a primary key? I assume that the foreign key will be State (and Country respectively), but should I also add a primary key like bellow?
All tables should have a primary key. If there is no combination of fields that is guaranteed unique (which is required by a primary key), then there’s something wrong with the table design.

Use the field that you use to look up records most often because the primary key should have the fastest indexing.

The primary key requires uniqueness, which is something you want in a lookup table because, for example, you don’t want two states with the same name. You also don’t want two states with the same abbreviation, however, so you should put a uniqueness constraint on whichever field is not the primary key in this example. (I think I would use full name as the primary key because it will be hit every time you look up a main record and give abbreviation the uniqueness constraint. Unless you leave out the abbreviation, in which case that’s moot.)

Quote:
Assuming that we want to display the States/Provinces for several countries, let say for Canada and United States. In many places I’ve seen both put together in the (UI Form) Select. I guess this is adhoc and probably goes in what you called “mishmashed “ table and you advise against.
An excellent question. If you put states and provinces in the same table with no indication that they are in different countries, then that is a mismashed table because it doesn’t acknowledge the relationship between state and country. To look at it in another way, there’s a missing piece of information. California and Quebec are almost but not exactly the same sort of thing. And if that’s all you store in that table, then the user could select incorrect combinations such as Ontario USA.

I think a lot of programs let you make that invalid selection and then the user interface validates the data when you accept the form. That’s a workable solution but I prefer not to let the user enter invalid data/combinations if possible.

From a DB design point of view, you could simply include the country name in the table. So at this point, you could have a table with the fields:
  • StateName
  • StateAbbreviation
  • Country

You could make a lookup table listing the countries if you want and use it as a foreign key constraint in this table, but this may be the only place you need to use country, in which case I think many developers would skip the extra table. (The downside is if you need to rename a country or add a new state record, then you need to be careful that you do it correctly and don’t misspell Mexicho.)

From a user interface point of view, what does the user select to make an address? You might make the user pick the country first and then make a combo box show the list of allowed states/provinces. A lot of web pages do that and it’s not too bad.

An alternative would be to let the user select the state/province and then automatically fill in the country by looking it up in the table. That would work if you don’t have too many states/provinces. The US plus Canada might work (although Canadians would need to look through a lot more entries than usual). If you have lots of countries, it would probably be better to do it the other way because this could be a huge list.

Allowing type-ahead would help, too. For example, you type “Was” and it jumps to “Washington.”

To really do this right, you’d also need a postal code lookup table somewhere to prevent the user from entering an invalid country/state/city/postal code combination.

Quote:
(Job/recruiting records) Some have a finite number of options: ContractType (full time, part time, per diem, per project…), Availability (local, state wide, nation wide, international), CareerLevel, CivilStatus, Education, Experience. My first question that pops my mind is if it feasible to have so many columns linked to other tables for meaning.
Yes I think that’s feasible. The database really only takes a hit when you try to create or modify a record and it needs to check the lookup tables to verify that your values are correct. I don’t think that will be a problem for modern databases. Fetching the data takes a bit longer but shouldn’t be too big a burden as long as you’re not trying to fetch a million records, and in that case it’s going to take a while no matter what.

If you’re worried about a particular program, I would try it and see. Most of the time, this won’t be a problem.

[quote]To create the UI I assume that I will have to join all in a query to create the dropdown menus. Wouldn’t be better if I do not make this thing at this tier level, but in the programming language basically hard-coding the values?[quote]

Yes and no. I would probably handle validation in the UI but you don’t need to hard code the values. When a form loads, it can do a “SELECT StateName FROM States ORDER BY StateName” and use the results to build the items in the states combo box. That way the data is all in the database and not hard coded but the combo box can still perform validation in the UI getting the latest values from the database.

The program can rebuild the combo box when the user picks a country. (Actually the combo box would probably be disabled until the user picked a country, at which point the program would populate the combo box and then enable it.)

Quote:
Another easy thing that bothers me is the “sex” column. I am tempted to use a char(1) and m/f; but when it is true when the Employee register that one can be either Male or Female, when the Employer post the ad he has in fact 3 options: male, female and both. How this issue should be handled?
I think different developers handle this one in various ways. I think most hard code M/F and leave it at that. Unless the application focuses on such issues, I don’t think anyone spends any effort other genders.

For the employer who can say M, F, or both/either, you could use M/F/B and write a little ugly code to handle it. Or you could give the Job table two fields AllowMale and AllowFemale and set one or both of them. That might make the query a little simpler.

Code:
SELECT * FROM Job WHERE
    ((Employee.Gender = ‘M’ AND Job.AllowMale) OR
     (Employee.Gender = ‘F’ AND Job.AllowFemale))
    AND …
Quote:
Another general question is if all these lists go to separate tables, how it will be better to have something similar to the abbreviation in the State/Country in the main table (here Ads) and the full name in the linked table (for example ContractType)? I assume I can even move the meaning entirely in the linked table using in the main table placeholders and defining the column as int(1). Now I will have “1” in the main table, “1” FK in one of the columns of the linked table and “Full Time” (meaning) as another column in the linked?
I think this depends on why you’re using the lookup table. If you’re just using it to list all of the available possibilities, then you probably don’t need to have the shorter abbreviation or code. Instead you can store the full value in the main table.

That makes it a little harder to change the value. For example, if you decide that “Full Time” should be changed to “Exempt,” then you need to update more records. The database should ensure that this update occurs correctly, however, via cascading updates, which are supported by the more advanced databases at least. And hopefully that shouldn’t occur very often.

The case is less clear with a “heavier” table such as InventoryItems. That table might store an item’s name, description, and price. In that case you may want to store item IDs in the main table and look them up in the InventoryItems table instead of storing the item’s name everywhere.

Then you can do things like update the item’s name or description without updating the main OrderItems table.

Even there you may want to take care, however, because those records probably represent past sales, too. For example, if you change the price, you would effectively be changing the price retroactively on past sales and that could be confusing.

To avoid that, you might store the actual price with the past orders (which adds redundant data, but which I think is the most common approach) or you could add another level of tables storing pricing history over time (which is probably “correct” but more confusing).

Quote:
The most difficult aspect it is probably related to the Domain column that I left it last. Here I have a list of domains from which the Employer can select how many he wants. I assume I will have another two tables besides the “Ads” table: one to hold the Employee selections and one to hold the list of domains available. The same questions above apply here: do I place placeholders in the main table and SelectedDomains table, explain the meaning in the last table DomainsList?
I think so, if I understand the question. This is a many-to-many relationship so you need to add an intermediate table.

You would want a lookup table to list the domains, perhaps called Domains. Then you would need a JobDomains (or SelectedDomains) table to list the domains selected for each job. It would have JobId as a link to the Jobs table and DomainId as a link to the Domains table. (DomainId could be the domain’s name if you like, just as the States table could include only the state’s name and not an abbreviation. This might depend on how long the domain names are and how often then change.)

Quote:
Last question, I took this table as being a compound object. Do I interpret correctly the problem? I would have been tempted to take out some of the columns and leave the filtering at the account level when the Employer sort the information; However what I want is to implement and enforce them at the level of ad. Still it remains the questions if it will be better to split this table further or not.
Do you mean would it better to keep all of the fields in one huge record or split out the filtering fields and then link them to the body of the data such as the long job description?

In the old days, the clear answer would be to split the record because shuffling around huge chunks of data was slow. Now days the choice is less clear. Databases are pretty good at only using the data they need not the entire record. So if you query on a huge record but are only looking at a date field, the database should only really look at the date field. There will still probably be some issues with the disk drive taking longer to spin across the bigger records but modern disk drives are pretty fast so I doubt it would be a problem.

This should only be a problem if you have a really huge database, which CareerBuilder, Monster, and Dice probably do. In that case, they may get better performance by using a distributed database so the data is stored on several different servers that can all search their data at the same time.

Even for those databases, I’m not sure there will be a problem. Modern databases use typically B+trees and B*trees to perform searches and they are amazingly fast. They don’t actually need to look at the entire record to just look at key fields so they don’t really use the huge record until they find the records they want.

What will really kill performance is a poor database design so you need to search non-indexed fields. For example, searching for all jobs with the word “fish” in the description would require a linear search of every record and would be horribly slow. That’s why this type of application usually requires keyword lists. Those are indexed and much faster to search.

And if things really are too slow, it’s not too terribly hard to split a table later. It’s not fun but possible. If you start with a split table and then decide that was unnecessary, you will be unlikely to try to join it back together and rewrite all of the logic that puts the pieces together.

In general my rule in all programming (user interface, database, algorithms, whatever) is to first make the program make logical sense. Do what’s most natural for you. Then make it more complicated only after you prove there is a problem. If you suspect there might be a problem, perform some tests to see if you’re right before you make the code unnecessarily complicated. I have seen projects fail because their logic was so convoluted that no one could get them to work. I’ve rarely seen a project fail because it was too slow and couldn’t be speeded up.

I hope I’ve answered all of your questions. If not, please let me know!
__________________
Rod

Rod Stephens, Microsoft MVP

Essential Algorithms: A Practical Approach to Computer Algorithms

(Please post reviews at Amazon or wherever you shop!)
Reply With Quote
  #3 (permalink)  
Old March 23rd, 2012, 08:06 AM
Authorized User
Points: 143, Level: 2
Points: 143, Level: 2 Points: 143, Level: 2 Points: 143, Level: 2
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Mar 2012
Posts: 32
Thanks: 9
Thanked 0 Times in 0 Posts
Default follow up

Dear Mr. Stephens, thank you for your reply and extensive answer to my questions. With great pleasure I placed a review on Amazon, they probably process it at this time, needless to say my enthusiasm. I simply kept my feelings at bay as this is a forum, but in my books this is the best book on the subject and one of the best ever.

There were so many things that I thought I know them when I didn't that I would like to continue the discussion if you have some spare time.

Taking my "Jobs" table ("Ads"), there are about 10 such drop-downs in the UI tier, maybe more in a realistic and more detailed example. So at the database level this table is linked to 10 or more lookup tables and one intermediate table for the "Domains" many-to-many relation, as you noticed.

I estimate this table will have around 6,000 new records per month and I am tempted to keep an archive with all of them (all posts ever posted). At the level of employees (not previously discussed, the figures are much higher, at around 100,000 per month.

When I asked about splitting the "jobs" table I was thinking that the active ads (around 6,000 at a time) need the active filtering options (age, gender, education,experience,...etc.) to filter and eventual block the applicants for a listing. However, when the ads expire, the filtering part is no longer needed. The archived jobs will still be displayed as archive, but the filtering part is never displayed, not even in the live ads. There is always the possibility that an Employer to want to re-post an add, in which case it will probably be better to have the filtering part as well. Anyway, this is the real situation, what would you do in this case?

Your suggestion with keywords I thought about it. Now how do you see it? Should it be something ad-hoc like in UI a single input text field where Employer add keywords separated by keyword? This is the temptation as it is the easiest implementation of the idea. I believe that the column in database will contain information that is no longer atomic and it is not the best thing to do, but how wrong will it be? It will still be better than searching for keywords in the description I guess...

The implementation probably should be something similar to the "Domain" with the difference that instead of a dropdown it will be an input text field and the input will be suggested from the linked table once you begin type it. I imagine there will be another table that will hold the new entered keywords if they are not in the database with a counter and once the counter rich a number that reflect the popularity of that keyword it is taken from this table "SuggestedKeywords" to the "Keywords" table that contain the keywords and it is linked to the intermediary table "JobKeywords" that it is linked to the "Job" table. May I have your opinion, please?

In both cases, using description I would probably use index: a FULLTEXT index for "description", especially now when InnoDb allows it. Considering the real production example, do you think it will still be feasible to have this table without the keyword implementation or at least will be ok something ad-hoc as suggested?

I agree with you that convoluted logic should be avoided. But because my lack of experience it is difficult to evaluate or balance some issues. For example this table at the level of UI it will be a form with 10 or more dropdown/select elements that should be populated from database. Isn't this a scary thing to make 10 queries only to populate the UI? (hence my question with hardcoded dropdowns, which I would like to avoid as you suggested). I have to make a SELECT for the "carrerLevel" table, one for "Education", one for "Experience" and so on... These rarely (probably never) change. Isn't another way around? Basically every time when somebody post a job or register (206,000 per month) I will have to make these queries only to populate the dropdowns from the UI, is this the way it should be? I live with the sensation that I am missing something...

Last thing, unrelated, it's the following: use of functions. I am much tempted to use for example SHA1 for passwords at database level and AES_ENCRYPT / AES_DECRYPT. The thing is that I found it difficult in programming language I am using PHP, (PHP6-MySQL5.6.4 w/InnoDB1.1) to write a class to reflect this kind of input. Using prepared statements I got to use placeholders like '?', if I get the hand dirty and not using classes, than I can simply put where I want SHA1(?) for example. When using a class, to allow the dynamic binding of the variables, the type of the variable will be taken from the meta part of the statement. However, I can't think to any way to say something like "here, the 3rd variable you should apply this function to it". Now either I use the SHA1 from PHP or I guess I should write something in SQL so the database to make this for a certain column without receiving the instruction from the SQL query written in PHP. My guess is that I have to use a stored procedure, but I never did something like this before. I guess it is simple, but basically you teach us in the book how to think so I would like to know what you think so I can think better in the future.

Thank you so much for your help.
Reply With Quote
  #4 (permalink)  
Old March 23rd, 2012, 02:20 PM
Rod Stephens's Avatar
Wrox Author
Points: 2,999, Level: 22
Points: 2,999, Level: 22 Points: 2,999, Level: 22 Points: 2,999, Level: 22
Activity: 22%
Activity: 22% Activity: 22% Activity: 22%
 
Join Date: Jan 2006
Location: , , .
Posts: 601
Thanks: 1
Thanked 84 Times in 83 Posts
Default

Quote:
Taking my "Jobs" table ("Ads"), there are about 10 such drop-downs in the UI tier, maybe more in a realistic and more detailed example. So at the database level this table is linked to 10 or more lookup tables and one intermediate table for the "Domains" many-to-many relation, as you noticed.
I think that’s not unusual.
Quote:
The archived jobs will still be displayed as archive, but the filtering part is never displayed, not even in the live ads.
I think a lot of people would build a data mart to hold the old data. Every month (or however often you like) you would move data into the data mart for long-term storage. At that point you can reformat the data, for example by removing unnecessary fields, combining related records into one record if you like, etc. The data mart would also not necessarily need the lookup tables because it’s not a “live” database where you will be adding new records.
Quote:
There is always the possibility that an Employer to want to re-post an add
If you want to recover an old ad completely, then you’ll need to save all of its data. There isn’t much you can do about that.
Quote:
Your suggestion with keywords I thought about it. Now how do you see it? Should it be something ad-hoc like in UI a single input text field where Employer add keywords separated by keyword?
You can’t do that because it would be very slow to search such a field. What you can do, however, is let the user enter data in that format and then in the middle tier parse out the keywords and put them in separate records. For example, you could call it the JobKeywords table and it would have JobId and Keyword values. You’d use the JobId to link back to the job holding the keywords.

One problem with this is that then people can chose anything for the keywords but you’ll get better matches if the list is consistent. For example, if I put “chef” as a keyword but there’s a job that lists “cook” then I may not see that match. It’s also possible that the user might mistype something and enter “chief” instead of “chef.”

To get around that, some systems make a (usually large) list of all allowed keywords and then either make the user pick from a list or validate against the list when the user accepts the form. This also lets you use the list of allowed words as a lookup table.

In this case, you might also allow users to add new values to the list. For example, if the program finds a keyword that it doesn’t recognize, it can ask the user whether to add it to the list or whether it should go back so the user can fix it. This requires a higher level of trust with the users so I probably wouldn’t recommend it for a huge program like Monster where you have thousands of users who are not your employees.
Quote:
The implementation probably should be something similar to the "Domain" with the difference that instead of a dropdown it will be an input text field and the input will be suggested from the linked table once you begin type it.
That might work well. You might also be able to categorize keywords by field or something. For example “Debugging” probably doesn’t need to be a keyword if the field is “Food Services” or “Finance.”
Quote:
I imagine there will be another table that will hold the new entered keywords if they are not in the database with a counter and once the counter rich a number that reflect the popularity of that keyword it is taken from this table "SuggestedKeywords" to the "Keywords" table that contain the keywords and it is linked to the intermediary table "JobKeywords" that it is linked to the "Job" table. May I have your opinion, please?
That could be very interesting. Twitter also has a notion of “trending” that might be interesting. You could display to the user a list of suggested keywords or keywords that are growing more popular lately.
Quote:
In both cases, using description I would probably use index: a FULLTEXT index for "description", especially now when InnoDb allows it. Considering the real production example, do you think it will still be feasible to have this table without the keyword implementation or at least will be ok something ad-hoc as suggested?
Indexing unformatted fields like a textual description usually doesn’t help. The index only helps if the word you’re searching for begins with the same letters as the data you’re searching. If you need to find keywords in the middle of a paragraph, then an index won’t help you so you need to search every record in the database.

I suppose a fancy database could extract the words in the free-form field and try to make an index based on key words in the text. I don’t know if any of them do that. If you have a list of keywords you could do it yourself. For example, it would be silly to index the word “the” because it will be in most of the records and doesn’t really tell you anything. This would be an interesting thing to try but it sounds like it could be a lot of work.
Quote:
I agree with you that convoluted logic should be avoided. But because my lack of experience it is difficult to evaluate or balance some issues.
That’s the sort of thing that mostly comes with experience.
Quote:
For example this table at the level of UI it will be a form with 10 or more dropdown/select elements that should be populated from database. Isn't this a scary thing to make 10 queries only to populate the UI?
Usually this is just one query that joins multiple tables. Modern databases are pretty good at that.
Quote:
(hence my question with hardcoded dropdowns, which I would like to avoid as you suggested). I have to make a SELECT for the "carrerLevel" table, one for "Education", one for "Experience" and so on... These rarely (probably never) change.
The form can query the database when it is first created and populate the dropdowns. Then it can be used many times to display different records and it doesn’t need to rebuild the values allowed by the dropdowns. If the users close the forms at least once per day, you shouldn’t need to worry about rebuilding the dropdowns. If you make a change to the lists and they want to see it right away, users can log out and log back in.
Quote:
Isn't another way around? Basically every time when somebody post a job or register (206,000 per month) I will have to make these queries only to populate the dropdowns from the UI, is this the way it should be? I live with the sensation that I am missing something...
The users for this type of system probably log in, make one entry, and then are done for now. In that case, it takes them several seconds just to log in and get things running. Adding an extra 1/10 second to load 10 dropdowns won’t hurt them.

If it’s a web page, there may be a script (like asp or aspx) that builds the page. It could query the database to build the dropdowns. Or you could make some sort of files to make building the pages easier and then update those files whenever you change the database.

Some of the really huge databases (like the ones you’ve been talking about) do have special needs and I won’t pretend to be an expert in them. Google in particular is at the cutting edge of truly massive database searches using more than 100,000 computers and I have very little knowledge about how their searches work. For example, which words on this page deserve to be used as keywords?

Mostly what I talk about in the book is more typical usage where you want to get the most out of the database so you don’t have to write too much extra code (which could contain bugs). These databases are pretty powerful, however, so you can get a lot out of them. Probably more than most people would expect.
Quote:
Last thing, unrelated, it's the following: use of functions. I am much tempted to use for example SHA1 for passwords at database level and AES_ENCRYPT / AES_DECRYPT. The thing is that I found it difficult in programming language I am using PHP, (PHP6-MySQL5.6.4 w/InnoDB1.1) to write a class to reflect this kind of input. Using prepared statements I got to use placeholders like '?', if I get the hand dirty and not using classes, than I can simply put where I want SHA1(?) for example. When using a class, to allow the dynamic binding of the variables, the type of the variable will be taken from the meta part of the statement. However, I can't think to any way to say something like "here, the 3rd variable you should apply this function to it". Now either I use the SHA1 from PHP or I guess I should write something in SQL so the database to make this for a certain column without receiving the instruction from the SQL query written in PHP. My guess is that I have to use a stored procedure, but I never did something like this before. I guess it is simple, but basically you teach us in the book how to think so I would like to know what you think so I can think better in the future.
I’m not completely sure about what you’re asking. And I definitely don’t know how you would do this in PHP using SHA1 or AES.

I know that several database tools use the ? syntax to represent a placeholder for a value to plug in. For example, a Visual Basic or C# program might do that. The benefit is that you can then compile a query and use it many times with different parameters. If you need to process a huge number of parameters, that can be faster than building a new query for each one separately.

If you need to, though, you may be able to use program code to write out the SQL statement and insert the values you need to instead of using question marks. And the SQL statement can include any function calls you need.

I try never to store user names or passwords in a database, though, because an attacker can easily get hold of them. Even if you store them encrypted the attacker can read your code and figure out how they were encrypted so it can decrypt them. The approach I prefer is to let the operating system or database security do as much of the work as possible.

If the database needs to store encrypted data, you can make the user enter a password and then use those to decrypt the data. That way the program never needs to store the password.

(I'm going to be unplugged for a while so I may not be able to reply quickly for a bit.)
__________________
Rod

Rod Stephens, Microsoft MVP

Essential Algorithms: A Practical Approach to Computer Algorithms

(Please post reviews at Amazon or wherever you shop!)
Reply With Quote
  #5 (permalink)  
Old March 23rd, 2012, 04:30 PM
Authorized User
Points: 143, Level: 2
Points: 143, Level: 2 Points: 143, Level: 2 Points: 143, Level: 2
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Mar 2012
Posts: 32
Thanks: 9
Thanked 0 Times in 0 Posts
Default follow up

Thank you so much for your help, I understand and I look forward to hearing from you as soon as you have some time to spent on the board.

Quote:
If it’s a web page, there may be a script (like asp or aspx) that builds the page. It could query the database to build the dropdowns. Or you could make some sort of files to make building the pages easier and then update those files whenever you change the database.
What do you mean, that it may be a way to run only one query that will join everything needed for the dropdowns in the UI everywhere in the application and that page will hold the dropdowns as they were hardcoded, probably in a function or something, than everytime I need them I only include this file and call the function for the dropdown I need? Any idea how can I do that? Is it related to something like cached pages? I am definitely interested in this issue. thank you.
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are Off
Pingbacks are On
Refbacks are Off

Similar Threads
Thread Thread Starter Forum Replies Last Post
book related questions here jminatel BOOK: Professional JavaScript for Web Developers ISBN: 978-0-7645-7908-0 1 January 30th, 2013 06:52 PM
Great book with 2 questions NIGEL86 BOOK: Beginning Cryptography with Java 1 June 19th, 2011 07:15 PM
Forum Purpose: Book questions only please jminatel BOOK: Professional Ajax 2nd Edition ISBN: 978-0-470-10949-6 0 February 26th, 2007 04:23 PM
Book related questions here jminatel BOOK: Professional Ajax ISBN: 978-0-471-77778-6 0 September 27th, 2006 11:02 AM
book - "XML Design and Implementation" Juryi All Other Wrox Books 2 May 30th, 2004 02:17 PM



All times are GMT -4. The time now is 11:46 AM.


Powered by vBulletin® Version 3.7.0
Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
© 2013 John Wiley & Sons, Inc.