Wrox Programmer Forums
|
SQL Server 2000 General discussion of Microsoft SQL Server -- for topics that don't fit in one of the more specific SQL Server forums. version 2000 only. There's a new forum for SQL Server 2005.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the SQL Server 2000 section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old February 2nd, 2006, 05:39 PM
Authorized User
 
Join Date: Jun 2003
Posts: 29
Thanks: 0
Thanked 0 Times in 0 Posts
Default Syntax for 'Housholding'

Hi,

I've got a customer table with over 250,000 records. I need to "household" this customer table by combining records with matching criteria, such as same home phone number, or same address + city + postal code.

To do that the only thing I could think of was to create a cursor and go through every record, assign it a household number and then match all other records that should go with this record (if there are any) and give them the same number. Then move on to the next record. It doesn't work that well and takes almost 11 hours to run!!

Can anyone think of an elegant way of coding this? Getting rid of the cursor would be sooo great!

Pat
 
Old February 2nd, 2006, 05:58 PM
Registered User
 
Join Date: Jan 2006
Posts: 2
Thanks: 0
Thanked 0 Times in 0 Posts
Default

so, you have following columns in table user
id, firstname, lastname, address, city, zipcode, phone

and you want to add a 'householdid'

create a table household -> id, city, zipcode, phone. id has a seed.

   insert into household(city, zipcode, phone)
   select city, zipcode, phone
   from user
   group by city, zipcode, phone

all having same city, zipcode and phone will be created in household.

then create a householdid on the user table and update it.

update user set householdid = (select householdid from household where household.city = user.city and household.zip = user.zip and household.phone = user.phone)



this will assign a distinct household id to all households.

thanks

jigs
 
Old February 3rd, 2006, 11:04 AM
Authorized User
 
Join Date: Jun 2003
Posts: 29
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Thanks Jigs!

I figured out a way to make that work - waaaaay faster!!

Pat
 
Old February 3rd, 2006, 12:55 PM
Friend of Wrox
 
Join Date: Jul 2003
Posts: 599
Thanks: 6
Thanked 3 Times in 3 Posts
Default

Please post the solution!

 
Old February 3rd, 2006, 01:00 PM
Authorized User
 
Join Date: Jun 2003
Posts: 29
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Ok, here's what I did:

/* GROUP 1 */
SELECT CUSSUR, CUSAD1, CUSPST, COUNT(*) as 'Counts'
INTO [group1]
FROM [customer_new]
WHERE CUSSUR IS NOT NULL AND
      CUSAD1 IS NOT NULL AND
      CUSPST IS NOT NULL
GROUP BY CUSSUR, CUSAD1, CUSPST
ORDER BY 'Counts' DESC

/* GROUP 2 */
SELECT RPH, COUNT(*) as 'Counts'
INTO [group2]
FROM [customer_new]
WHERE RPH IS NOT NULL
GROUP BY RPH
ORDER BY 'Counts' DESC

The first groups by name, address, postal code. The second is by home phone number. Then I do the following to match it back to the customer table.

/* SET HOUSEHOLD IN CUSTOMER TABLE FROM GROUP 2 */
UPDATE a
SET a.household = b.household
FROM [customer_new] a, [group2] b
WHERE a.rph = b.rph

/* SET HOUSEHOLD IN GROUP 1 FROM CUSTOMER TABLE */
UPDATE c
SET c.household = a.household
FROM [group1] c, [customer_new] a
WHERE c.CUSSUR = a.CUSSUR AND
      c.CUSAD1 = a.CUSAD1 AND
      c.CUSPST = a.CUSPST

/* SET HOUSE IN CUSTOMER TABLE FROM GROUP 1 */
UPDATE a
SET a.household = c.household
FROM [customer_new] a, [group1] c
WHERE a.CUSSUR = c.CUSSUR AND
      a.CUSAD1 = c.CUSAD1 AND
      a.CUSPST = c.CUSPST

/* RECORDS WITH NULL HOUSEHOLD INTO GROUP 3 */
SELECT env_id
INTO [group3]
FROM [customer_new]
WHERE household IS NULL

/* SELECT MIN, MAX, DISTINCT */
SELECT MIN(household)
FROM [customer_new]

SELECT MAX(household)
FROM [group3]

SELECT COUNT(DISTINCT(household))
FROM [customer_new]

/* GROUP 3 IS THOSE RECORDS STILL WITHOUT A HOUSEHOLD */
SELECT *
FROM [group3]
WHERE household IN (SELECT household FROM customer_new)

I increment a household ID higher than those I have already set on the group 3 table, and then match back.

/* SET A UNIQUE HOUSEHOLD TO ALL THE GROUP3 THEN MATCH BACK TO CUSTOMER TABLE */
UPDATE a
SET a.household = b.household
FROM [customer_new] a, [group3] b
WHERE a.env_id = b.env_id
 
Old February 6th, 2006, 02:06 PM
Friend of Wrox
 
Join Date: Jul 2003
Posts: 599
Thanks: 6
Thanked 3 Times in 3 Posts
Default

Thanks for posting your solution. It may be helpful to someone else.

Richard






Similar Threads
Thread Thread Starter Forum Replies Last Post
syntax error Adam H-W Classic ASP Basics 3 December 12th, 2006 08:20 AM
help syntax please khautinh SQL Server 2000 1 August 26th, 2006 02:47 AM
c# syntax msrnivas .NET Web Services 2 October 15th, 2004 01:26 AM
syntax errors Adam H-W Classic ASP Databases 4 July 2nd, 2004 07:44 AM
Syntax Error snoopy92211 Access VBA 3 June 23rd, 2004 08:26 AM





Powered by vBulletin®
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
Copyright (c) 2020 John Wiley & Sons, Inc.