p2p.wrox.com Forums

p2p.wrox.com Forums (http://p2p.wrox.com/index.php)
-   SQL Server 2000 (http://p2p.wrox.com/forumdisplay.php?f=20)
-   -   Compare two fields from different tables (http://p2p.wrox.com/showthread.php?t=55656)

pallone March 21st, 2007 06:01 AM

Compare two fields from different tables
 
Hi,

I would appreciate very much if someone could help me to figure out how to do the following using SQL:

I have to compare the values of two fields called Name(nvarchar) in two different tables and then if they match (or contain something similar) I have to get the CompanyID of the Company table and check if this is already present in the CompanyCompare table. If it is not then I will insert the whole record in the CompanyCompare table.

Using a cursor I was able to do the insertion based on the CompnayID. However, I cannot figure out how to compare the two string fields for a match or similarities in the loop.

I was reading about the CONTAINS keyword but am not sure this would work here since I would need to use a variable in the loop and I think it only accepts a string as parameter.

Also, I would appreciate if someone could show me some sample code of how to do this. Please, feel free to change my code below.

Cheers

declare @p_TableID as int
declare @match as int

Begin Transaction
   declare tb_cur cursor
   for Select CompanyID from Company
     open tb_cur
        fetch next from tb_cur into @p_TableID
        while @@fetch_status = 0
        begin
           set @match = isNull((Select CompanyID from CompanyCompare where CompanyID = @p_TableID),-1)
           if @match = -1
           begin
              Insert into CompanyCompare (CompanyID, Name, MaxValue)
              Select CompanyID, Name, MaxValue from Company where CompanyID = @p_TableID
           end
           fetch next from tb_cur into @p_TableID
        end
     close tb_cur
     deallocate tb_cur
Commit Transaction

happygv March 21st, 2007 06:27 AM

Quote:

quote:I have to compare the values of two fields called Name(nvarchar) in two different tables and then if they match (or contain something similar) I have to get the CompanyID of the Company table and check if this is already present in the CompanyCompare table...
What is the Relationship you have set between these two tables? Does CompanyCompare have NAME column in it? Or the CompanyID is the common column in both these tables?

Also where do you get the NAME and MAXVALUE from for the insert statement?
Code:

Insert into CompanyCompare (CompanyID, Name, MaxValue)
_________________________
- Vijay G
Strive for Perfection

pallone March 21st, 2007 10:29 AM

Hi Vijay G,

Thanks for your reply.

CompanyCompare is the 3rd table that will have the old and new records but no repetition. If the old values are different in table 1 then it will need to be updated as well. Basically, I have two tables with company info in them. However, the ID of both tables do not match. If they matched the code I wrote above would work.

So, I will have to read the Name field of both tables and check if the string is equal or have some similarities. If they do, then I will have to get the ID of table 1 plus ID from table 2 and insert them to the CompanyCompare table together with the updated MaxValue from Table 1.



Table 1
+------------+----------+------+-----+
| ID | Name | MaxValue |
+------------+----------+------+-----+
| 1 | Wilbey sx| 2000 |
+------------+----------+------+-----+
| 2 | Aimor sp | 10000 |
+------------+----------+------+-----+

Table 2
+------------+----------+------+-----+
| ID | Name | MaxValue |
+------------+----------+------+-----+
| 37 | Wilbey sx| 6000 |
+------------+----------+------+-----+
| 104 | Aimor | 10000 |
+------------+----------+------+-----+

After running the sql script the CompanyCompare table will have to look sothing like this:

CompanyCompare Table

+------------+----------+------------+-----------------------+
| IDTab1 | IDTab2 | Name | MaxValue |
+------------+----------+------------+-----------------------+
| 1 | 37 | Wilbey sx | 2000 |
+------------+----------+------------+-----------------------+
| 2 | 104 | Aimor sp | 10000 |
+------------+----------+------------+-----------------------+
| 2 | 104 | Aimor | 10000 |
+------------+----------+------------+-----------------------+
| | | | |
+------------+----------+------------+-----------------------+

Because the first row of both tables have the same string value i.e. Wilbey sx, I have to copy the IDs from both tables to the CompanyCompare and also the Name and MaxValue fields from table 1 since table 1 will be the table with the most recent values.

However, since the string values of the Name columns in the second row for both tables do not match but have some similarities, I will have to add both records to the CompareTable so that I can compare them later.

Hope it is clearer now.

Cheer

CPall

happygv March 21st, 2007 10:39 AM

The probably you got to use LIKE OPERATOR in this case. Still when you are uncertain that which of the tables will hold the subset of the other, this can't be the right solution. Let me still give you some idea on how it can be accomplished. Also I dont understand where does this COMPANY table fit into this... Do you have a separate table in that name?

_________________________
- Vijay G
Strive for Perfection

pallone March 21st, 2007 11:58 AM

I will have two tables. These tables will have company records like ID, Name and MaxValue.

Unfortunatly, the IDs will not match since one table comes from one country and the other table comes from another. However, they represent the same company.
I know this is stupid but that is the way it is and I would love to find a solution so that I didn't need to go record by record manually.

Therefore, I will have to check for the same company name or a simiar company name in the Name column. I say similar because people in one country may type an abbreviation of the company name whereas people from another country would write the full company name.

What a lack of referential integrity isn't it?

Well, thanks for you help

CPall

pallone March 21st, 2007 12:04 PM

Just to clarify a bit more. If you have a look at the tables above you will see that the third table (CompanyCompare) will have two records repeated:

 2 | 104 | Aimor sp | 10000 |
+------------+----------+------------+-----------------------+
| 2 | 104 | Aimor | 10000

This is because 'Aimor sp' is not exact 'Aimor' but they are similar. So I will have to insert both records in my CompanyCompare table.

Hope this helps

Cheers

happygv March 22nd, 2007 04:20 AM

I am a bit confused with the explanation and the code that you posted in your original post.

Try this and see if this is what you were looking for?
Code:

INSERT INTO CompanyCompare
-- Using Table1 as the base for comparision with Table2
Select Table1.ID, Table2.ID, Table1.Name, Table1.MaxValue
    FROM table1, table2
    WHERE table1.name like '%' + table2.name + '%' AND Datalength(Table1.Name) = Datalength(Table2.Name)
        AND Table1.ID NOT IN ( SELECT IDTab1 FROM CompanyCompare )
        AND Table2.ID NOT IN ( SELECT IDTab2 FROM CompanyCompare )
UNION
Select Table1.ID, Table2.ID, Table1.Name, Table1.MaxValue
    FROM table1, table2
    WHERE table1.name like '%' + table2.name + '%' AND Datalength(Table1.Name) <> Datalength(Table2.Name)
        AND Table1.ID NOT IN ( SELECT IDTab1 FROM CompanyCompare )
        AND Table2.ID NOT IN ( SELECT IDTab2 FROM CompanyCompare )
UNION
Select Table1.ID, Table2.ID, Table2.Name, Table1.MaxValue
    FROM table1, table2
    WHERE table1.name like '%' + table2.name + '%' AND Datalength(Table1.Name) <> Datalength(Table2.Name)
        AND Table1.ID NOT IN ( SELECT IDTab1 FROM CompanyCompare )
        AND Table2.ID NOT IN ( SELECT IDTab2 FROM CompanyCompare )

UNION
-- Using Table2 as the base for comparision with Table1
Select Table1.ID, Table2.ID, Table2.Name, Table1.MaxValue
    FROM table1, table2
    WHERE table2.name like '%' + table1.name + '%' AND Datalength(Table1.Name) = Datalength(Table2.Name)
        AND Table1.ID NOT IN ( SELECT IDTab1 FROM CompanyCompare )
        AND Table2.ID NOT IN ( SELECT IDTab2 FROM CompanyCompare )
UNION
Select Table1.ID, Table2.ID, Table1.Name, Table1.MaxValue
    FROM table1, table2
    WHERE table2.name like '%' + table1.name + '%' AND Datalength(Table1.Name) <> Datalength(Table2.Name)
        AND Table1.ID NOT IN ( SELECT IDTab1 FROM CompanyCompare )
        AND Table2.ID NOT IN ( SELECT IDTab2 FROM CompanyCompare )
UNION
Select Table1.ID, Table2.ID, Table2.Name, Table1.MaxValue
    FROM table1, table2
    WHERE table2.name like '%' + table1.name + '%' AND Datalength(Table1.Name) <> Datalength(Table2.Name)
        AND Table1.ID NOT IN ( SELECT IDTab1 FROM CompanyCompare )
        AND Table2.ID NOT IN ( SELECT IDTab2 FROM CompanyCompare )

        This doesn't use Cursor, at the same time it doesn't insert duplicate values into CompanyCompare Table.

Hope this helps.
Cheers

_________________________
- Vijay G
Strive for Perfection

pallone March 22nd, 2007 06:45 AM

Hi Vijay G,

Many thanks for your help.

I have just seen your post and will try the code straight away. However, I think there is a problem with the code because I can have ID repetitions in table CompareCompany if the Name field is similar. Please have a look at the example table below.

1 - I will need to insert both IDs in CompareTable since the ID from country 1 is different from country 2
2 - People will type the name of the company differently i.e some people will use abbreviations.
3 - If the Name field is exactly the same then I just need to update the record with the values from table 1 but also need to add the ID from table 2.
4 - If the Name field is similar then I need to insert both records from table 1 and table 2 including the IDs of both.


CompareTable
+------------+----------+------------+-----------------------+
| IDTab1 | IDTab2 | Name | MaxValue |
+------------+----------+------------+-----------------------+
| 1 | 37 | Wilbey sx | 2000 |
+------------+----------+------------+-----------------------+
| 2 | 104 | Aimor sp | 10000 |
+------------+----------+------------+-----------------------+
| 2 | 104 | Aimor | 10000 |
+------------+----------+------------+-----------------------+
| | | | |
+------------+----------+------------+-----------------------+

Table 1
+------------+----------+------+-----+
| ID | Name | MaxValue |
+------------+----------+------+-----+
| 1 | Wilbey sx| 2000 |
+------------+----------+------+-----+
| 2 | Aimor sp | 10000 |
+------------+----------+------+-----+

Table 2
+------------+----------+------+-----+
| ID | Name | MaxValue |
+------------+----------+------+-----+
| 37 | Wilbey sx| 6000 |
+------------+----------+------+-----+
| 104 | Aimor | 10000 |
+------------+----------+------+-----+

pallone March 22nd, 2007 08:35 AM

Hi Vijay G,

I have tried your code and I think it is going to help me a lot. After I run your code
I am doing a select - SELECT * FROM Compare ORDER BY IDTab2 - and I can compare the names, ids and values. The MaxValue is always coming from table1 which is right.

Could you please explain why you used the UNION clause. I have never used it before and am not sure in what context this should be used. Also, you are using the Datalength() function with '=' and '<>'. Can you elaborate a bit about why you are doing that.

Cheers,

CPALL

happygv March 22nd, 2007 09:01 AM

There are 6 Select statements used, out of which the first 3 are based on Table1 and the later 3 are based on Table2

As you said, you have 2 cases
1) The Names could be exactly the same as the other - So '=' is used in the First select statement.
2) When you say it can be a subset of the other, like 'Aimor sp' to 'Aimor'. So I used <> and datalength() combinations in the second select query.

The third was used to pick the name from Table2, except that the entire statement would be same as that of 2nd select.

Similarly, the last 3 select statements are based on Table2, because... there is no guarantee that
'Aimor sp' can always appear in Table1, It could appear in Table2 also. So for the second set of 3 Select Statements, you can see the difference in the Where Clause, which is marked in BLUE. The difference is Table2.Name like '%Table1.Name%'. This is to handle the other possibility of subset. Thus there are six statements for all thos possibilities.

Here the catch is the UNION. UNION Combines all the 6 results in to One result. Also it gives you the NON-DUPLICATE result among the all 6 statement's results based on how the UNION is Used. When you run each select statement separately, you will understand which select statement result in what. There could be duplicates returned by the select statements. UNION takes care of duplicates. If you want to see the Duplicates too, Replace all UNIONs with UNION ALL. You will see the difference.

Try to add Rows into Table1 and Table2 with different posibilities and test if the code I posted is fool proof. Let me know if that helps.

Hope that explains.
Cheers.

_________________________
- Vijay G
Strive for Perfection


All times are GMT -4. The time now is 07:43 AM.

Powered by vBulletin®
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
© 2013 John Wiley & Sons, Inc.