Hi,
Alternatively you can also try this out in SQL server:
--To extract the duplicate rows
SELECT COL1, COL2, COUNT(*)
FROM DUP_ROWS
GROUP BY COL1, COL2
HAVING COUNT(*) > 1
--First, run the above GROUP BY query to determine how many sets of duplicate PK values exist, and the count of duplicates for each set.
--Select the duplicate key values into a holding table. For example:
SELECT COL1, COL2, COL3 = COUNT(*)
INTO HOLDKEY
FROM DUP_ROWS
GROUP BY COL1, COL2
HAVING COUNT(*) > 1
--Select the duplicate rows into a holding table, eliminating duplicates in the process. For example:
SELECT DISTINCT DUP_ROWS.*
INTO HOLDUPS
FROM DUP_ROWS, HOLDKEY
WHERE DUP_ROWS.COL1 = HOLDKEY.COL1
AND DUP_ROWS.COL2 = HOLDKEY.COL2
--At this point, the holddups table should have unique PKs, however, this will not be the case
--if t1 had duplicate PKs, yet unique rows.
--Verify that each key in holddups is unique, and that you do not have duplicate keys,
--yet unique rows. If so, you must stop here and reconcile which of the rows you wish to
--keep for a given duplicate key value. For example, the query:
SELECT COL1, COL2, COUNT(*)
FROM HOLDUPS
GROUP BY COL1, COL2
--should return a count of 1 for each row.
--If no, you have duplicate keys, yet unique rows, and need to decide which rows to save.
--This will usually entail either discarding a row, or creating a new unique key value for this row.
--Take one of these two steps for each such duplicate PK in the holddups table.
--Delete the duplicate rows from the original table. For example:
DELETE DUP_ROWS
FROM DUP_ROWS, HOLDKEY
WHERE DUP_ROWS.COL1 = HOLDKEY.COL1
AND DUP_ROWS.COL2 = HOLDKEY.COL2
--Put the unique rows back in the original table. For example:
INSERT DUP_ROWS SELECT * FROM HOLDUPS
Hope this helps,
Lalit
Life Means More...;)
|