duplicates and unique

sheba0907 · July 3rd, 2006, 01:03 PM

I have a table that has accountID (unique) and accountname (not unique)
If I put in the following query:

SELECT accountname, COUNT(*) as Sum
FROM accounts
GROUP BY accountname
HAVING count(*) > 1;

I get a count of the account names that are duplicates - good

Now, I want to display the corresponding accountID associated with those accountnames that are duplicates.

How do I do that?

Thanks

Jeff Mason · July 3rd, 2006, 01:42 PM

A subquery will do the trick. Since your query above correctly gives you a resultset of duplicate accountname values, you'll want to select those rows whose accountname is in that set:

Code:

SELECT accountID FROM accounts
 WHERE accountname IN (SELECT accountname FROM accounts 
                       GROUP BY accountname
                       HAVING count(*) > 1);

Jeff Mason
Custom Apps, Inc.
www.custom-apps.com

sheba0907 · July 3rd, 2006, 02:00 PM

Thanks
Sorry I probably didn't explain myself correctly because I still didn't get what I was looking for.

Using your suggestion:
SELECT accountID FROM accounts
WHERE accountname IN (SELECT accountname FROM accounts
                       GROUP BY accountname
                       HAVING count(*) > 1);

I only get the accountID returned

Using my orig query:
SELECT accountname, COUNT(*) as Sum
  FROM accounts
  GROUP BY accountname
HAVING count(*) > 1
I get the accountname and the number of times it appears

ex:
accountname sum
j w Account 6

I'm hoping to get a result like this

accountID accountname sum
1234123414 j w account 6

Thanks!

sheba0907 · July 3rd, 2006, 02:16 PM

I got pretty cvlose using this

SELECT accountID, accountname FROM accounts
WHERE accountname IN (SELECT accountname FROM accounts
GROUP BY accountname
HAVING count(*) > 1);

It listed the accountID and the account name and since there isn't too many (this time) I can count the # of times the account name is in there

Thanks

Jeff Mason · July 3rd, 2006, 02:29 PM

I'm guessing that your accountID is unique, probably the primary key of your accounts table. If that's true, then in your example 'j w account' is an accountname which occurs 6 times, presumably with 6 different accountid's. What would your desired result then be?

Would you be looking for:

accountID accountname sum
1234123414 j w account 6
9876543210 j w account 6
...

which doesn't seem terribly useful ... but what do I know...

But let's work on the subquery idea. Your original query returns the set of duplicate accountnames and a count. The results of a query can also be thought of as a table; in fact, in SQL this is called a 'derived table'. We can then JOIN this derived table with your original accounts table and display columns from each:

Code:

SELECT accountID, dups.accountname , dupcount
FROM accounts
INNER JOIN (SELECT accountname, count(*) as dupcount FROM accounts 
            GROUP BY accountname HAVING count(*) > 1) as dups
        ON accounts.accountname = dups.accountname;

There's a little housekeeping to be done. The derived table must be given a name in the FROM clause, and we'll need to give a name to the COUNT expression so we can refer to it.

Jeff Mason
Custom Apps, Inc.
www.custom-apps.com

sheba0907 · July 3rd, 2006, 02:33 PM

Great, thanks for your help since as you can see, I'm just a beginner! This query gives me what I need, even though it does (like you say) display a row more than once. It's better than me manually counting!

Jeff Mason · July 3rd, 2006, 02:43 PM

Quote:

quote:
It's better than me manually counting!

Yeah, computers are much better than you are at that. :D

You may want to consider an ORDER BY clause to put the results in the order you desire, presumably either by accountID or accountname, or maybe even by the count, listing the most (or least, whatever) duplicates first. Without an ORDER BY clause, there is no guarantee that the results will be in any order whatsoever, despite what you may see when you run the query without one.

Jeff Mason
Custom Apps, Inc.
www.custom-apps.com