View Single Post
  #3 (permalink)  
Old July 2nd, 2007, 11:20 PM
Jeff Moden Jeff Moden is offline
Friend of Wrox
Points: 1,536, Level: 15
Points: 1,536, Level: 15 Points: 1,536, Level: 15 Points: 1,536, Level: 15
Activity: 0%
Activity: 0% Activity: 0% Activity: 0%
 
Join Date: Oct 2006
Location: , MI, USA.
Posts: 475
Thanks: 0
Thanked 9 Times in 9 Posts
Default

Hi there, Greg...

Yes, I have some ideas as to what is wrong with your script... my idea is to rewrite what you have for much greater performance.

First, I'm making the assumption that Pt_Text_ID is not unique in the Pt_Text table and that each Pt_Text_ID could have one or more rows as identified by the Pt_Seq column... if that's not true , then post back and tell me what you're really trying to do and we'll fix this together :)

Second, your blank stripping function is absolutely ingenious... but it has two major flaws and a bit of a performance problem...

1. Because you initially double the number of characters occupied by blanks, you've limited the function to no more than 4,000 characters meaning that you could never do a full 8,000 character VARCHAR.

2. When the 1,000 character limit is breeched in your current function, truncation of the string occurs without error and without warning which could lead to some pretty serious errors downstream.

3. The method you used, although very clever, is actually slower than other methods.

For example, on 10,000 rows with data that looks like 'X'+SPACE(998)+'Y' (the limit of your function), your function takes about 10.1 seconds to execute. Although the following code is butt ugly and not clever at all, it only takes 5.6 seconds to accomplish the same thing AND it's capable of doing a full 8000 character varchar AND it deletes any leading or trailing spaces to boot!

Code:
 CREATE FUNCTION dbo.udf_StripBlanks
        (@Var1 VARCHAR(8000))
RETURNS VARCHAR(8000)
     AS
  BEGIN

 RETURN (SELECT RTRIM(
                LTRIM(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                REPLACE(
                @Var1
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                ,SPACE(2),SPACE(1))
                )
                )
        )
   END
...by the way... why 13 REPLACEs? Because 2^13 > 8000.

OK, moving on... there is no need for the following functions...

Code:
udf_Rates_Trans_GetNumLines 
udf_Rates_Trans_GetLine 
udf_Rates_Trans_GetNextID 
udf_WB_Trans_WB1
... nor is there any need for a temp table or a While loop to do this concatenation. Think set based...

...The following function, when used with the code that follows it, will return the result-set you desire without any of that other stuff...

Code:
 CREATE FUNCTION dbo.udf_AssemblePT
        (@Pt_Text_ID INT)
RETURNS VARCHAR(8000)
     AS
  BEGIN
        --===== Declare the return variable to hold the concatenated parts in
        DECLARE @Return VARCHAR(8000)

        --===== Concatenate the various parts together in order by sequence number
             -- This assumes that you want a space to separate the various parts and 
             -- that you want to exercise the delete dupe blanks function
         SELECT @Return = ISNULL(@Return+' ','')+dbo.udf_StripBlanks(Pt_Text)
           FROM Pt_Text
          WHERE Pt_Text_ID = @Pt_Text_ID
          ORDER BY Pt_Seq

        --===== Return the concatenated parts for this Pt_Text_ID
         RETURN @Return
    END


Once you have that function, then your code simply looks like this...

Code:
 SELECT DISTINCT
        Pt_Text_ID                     AS Text_ID,
        dbo.udf_AssemblePT(Pt_Text_ID) AS Text_IN
   FROM Pt_Text
  ORDER BY Pt_Text_ID
Because this is mostly set based, it will run circles around your While loop and it's a whole lot easier to maintain/modify...

Admittedly, I've not tested this exact code, but I've written similar code a thousand times. So, I may have a bug in the code (not intentionally, of course), but it should be easy to fix if you find one.

Any questions?

--Jeff Moden
Reply With Quote