'AddedBy' NVARCHAR field - inefficient?
I have expanded the Polls module quite extensively for a project I am working on.
A feature I want to implement is to only show Polls that a user has not voted on yet. I am doing this as follows using SQL Server 2005:
Each time a user votes on a poll, that action is added to a 'Votes' table, and the User ID is recorded in an 'AddedBy' NVARCHAR field in that table.
When I want to get a list of Polls that a user hasn't voted on, I do something like this:
SELECT DISTINCT p.PollID
FROM vs_Polls p
WHERE p.PollID NOT IN
    (SELECT DISTINCT p.PollID
     FROM vs_Polls p
     LEFT JOIN vs_PollOptions o
         ON p.PollID = o.PollID
     JOIN vs_Votes v
         ON o.OptionID = v.OptionID
     WHERE (v.AddedBy = 'marco'))
ORDER BY p.PollID
The subquery joins the Polls table to the PollOptions table, which is then joined to the Votes table.
The subquery also has a WHERE clause that filters on the 'AddedBy' field.
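For comparison, here is a rough sketch of the same filter written with NOT EXISTS instead of NOT IN. It assumes the same table and column names as the query above. The @UserName variable and its NVARCHAR(256) length are my own assumptions for illustration, not part of the module. I have not measured whether it performs any better; it only drops the extra join back to vs_Polls inside the subquery.

```sql
-- Sketch only: same tables and columns as the query above.
-- @UserName and its length are hypothetical, chosen for illustration.
DECLARE @UserName NVARCHAR(256);
SET @UserName = N'marco';

-- Return every poll for which no vote by @UserName exists on any of its options.
SELECT p.PollID
FROM vs_Polls p
WHERE NOT EXISTS
    (SELECT 1
     FROM vs_PollOptions o
     JOIN vs_Votes v
         ON o.OptionID = v.OptionID
     WHERE o.PollID = p.PollID
       AND v.AddedBy = @UserName)
ORDER BY p.PollID;
```

Either way, the filter on v.AddedBy is still an NVARCHAR comparison, so the question below applies to both forms.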
It is this WHERE clause which causes me some concern. Imagine a situation where there are thousands of polls, and many more thousands of votes on those polls. At that sort of scale, filtering on an NVARCHAR field rather than an INT could have a noticeable cost.
I understand why Marco chose to do it this way, but at the scales I am looking at I think it might become a problem.
What do you think? Should I add an INT Primary Key to the ASP.NET generated 'aspnet_Users' table, and rewrite everything to work off this? Or am I worrying about it for nothing?
I suppose the question comes down to: just how inefficient is filtering and indexing on an NVARCHAR field compared with an INT?
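To make the indexing part concrete, this is roughly the index I have in mind on the Votes table. It is only a sketch: the index name is made up, and it assumes 'AddedBy' is a bounded-length NVARCHAR column (an NVARCHAR(MAX) column cannot be used as an index key) short enough to fit within SQL Server's index key size limit.

```sql
-- Sketch only: IX_vs_Votes_AddedBy_OptionID is a hypothetical name.
-- Assumes vs_Votes has the AddedBy and OptionID columns used in the query above.
CREATE NONCLUSTERED INDEX IX_vs_Votes_AddedBy_OptionID
    ON vs_Votes (AddedBy, OptionID);

-- Report logical reads for the statements that follow, so the
-- NVARCHAR version can be compared against an INT-keyed version.
SET STATISTICS IO ON;
```

With an index like this, the lookup on 'AddedBy' should be a seek rather than a scan, so the remaining difference would be the wider key and the string comparison rather than the column type preventing index use.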
Any views would be really welcome.