Wrox Programmer Forums
|
SQL Server 2000 General discussion of Microsoft SQL Server -- for topics that don't fit in one of the more specific SQL Server forums. version 2000 only. There's a new forum for SQL Server 2005.
Welcome to the p2p.wrox.com Forums.

You are currently viewing the SQL Server 2000 section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old March 30th, 2007, 08:23 AM
Friend of Wrox
 
Join Date: Jun 2003
Posts: 839
Thanks: 0
Thanked 1 Time in 1 Post
Default

No harm, no foul.

I said it was morning for me, and I can be a little slow on the uptake then. Unfortunately, forum postings can all too easily be misinterpreted as to intent, so I just want to ask.

Anyway, this question of "string aggregation" ought to be in a FAQ somewhere. It gets asked a lot.

I've never really understood why. The resultset violates normalization rules (which is sort of OK since a resultset isn't a table), but more to the point it is really a presentation issue which properly belongs in the client code, IMO. If you approach it that way, all the concern over ordering and cursors and using undocumented features just evaporates...

Jeff Mason
Custom Apps, Inc.
[email protected]
 
Old March 30th, 2007, 11:20 AM
Friend of Wrox
 
Join Date: Aug 2004
Posts: 385
Thanks: 0
Thanked 0 Times in 0 Posts
Default

Can't agree with this comment:
"there is no guarantee that the returned rows will be in the order of the clustered index"
If you have a clustered index and you order by the clustered index when you select the data the retrieveal is most efficient and it will be in order.

 
Old March 30th, 2007, 12:10 PM
Friend of Wrox
 
Join Date: Jun 2003
Posts: 839
Thanks: 0
Thanked 1 Time in 1 Post
Default

Quote:
quote:Originally posted by robprell
 Can't agree with this comment:
"there is no guarantee that the returned rows will be in the order of the clustered index"
If you have a clustered index and you order by the clustered index when you select the data the retrieveal is most efficient and it will be in order.
My comment was in regards to the string aggregation query, but in fact applies to any query. Your statement is correct as long as you "...order by the clustered index...".

In fact, if you ORDER BY anything, you are guaranteed that the resultset will be in that order, and that has nothing to do with any index.

Conversely, if a query has no ORDER BY clause, then you cannot guarantee that the resultset is in any particular order.

It happens that the optimizer will be greatly influenced by the presence of an index, but it may in fact choose not to use it if it thinks another plan is more efficient. It is dangerous to depend on empirical observation and side-effects: that the optimizer will always give you results in a particular order just because every time you run it, you see those results in that particular order.

The fact is that only way to guarantee a specific order is to state you want it that way.

The question is in regard to this query:

SELECT @stringvar = @stringvar + ', ' + somecolumn
FROM sometable
ORDER BY someclusteredindexcolumn

In point of fact, that ORDER BY clause will have absolutely no (guaranteed) effect on the order of the strings as the concatenation is being done. This is because (in effect) an ORDER BY clause is the very last operation that is performed on a resultset.

That is, the resultset is constructed from columns in the SELECT clause from the rows in the tables in the FROM clause, filtered by the WHERE clause and GROUPed accordingly. All of these operations are set-based operations, in that they happen on all the rows and columns all at the same time (or at least that is how they appear).

Only after all that is done is the resultset sorted as per the ORDER BY clause. Note that sorting is a cursor operation. In effect, the ORDER BY clause takes the set produced by the rest of the SQL query and processes the rows one at a time (that's what a cursor is) to place them in the requested order. If there is no ORDER BY clause, then no ordering is necessary so the resultset is passed back to the consumer unmodified. It is in whatever order those set based operations left it in.

Now it happens that the optimizer is smart enough to recognize that indexes exist on tables and those indexes present the rows in a known order, so the last sorting step may be able to be avoided because the optimizer will construct a query plan taking those indexes into account. Since sorting is expensive, those indexes will influence that choice heavily.

But the fact is that the string aggregation in the query above is a process that happens well before any ordering, so you simply cannot depend on what order the rows will be processed, and thus in what order the strings will have been concatenated in the resultant output variable.

Just to throw more fuel on the fire, :D it is also not true that the use of the clustered index always is the most efficient means of data retrieval. If a covering index exists for a given query, then the optimizer may choose it and avoid the clustered index altogether, getting the results from the index itself...

Jeff Mason
Custom Apps, Inc.
[email protected]
 
Old March 30th, 2007, 08:05 PM
Friend of Wrox
 
Join Date: Aug 2004
Posts: 385
Thanks: 0
Thanked 0 Times in 0 Posts
Default

I don't think you understand me at all.

If you have a clustered index on your key value. Each row you insert is retained in order it was inserted. Thus if you have five rows, insert a sixth, seventh and eight row. The clustered index on a primary key WILL allow you to select them in the order your entered them. You would have to do something like this in order to do what the requestor is trying to do. If you think this is wrong. Please by all means post code that demonstrates it's wrong.


 
Old March 31st, 2007, 07:04 AM
Friend of Wrox
 
Join Date: Jun 2003
Posts: 839
Thanks: 0
Thanked 1 Time in 1 Post
Default

Quote:
quote:Originally posted by robprell
 I don't think you understand me at all.

If you have a clustered index on your key value. Each row you insert is retained in order it was inserted. Thus if you have five rows, insert a sixth, seventh and eight row. The clustered index on a primary key WILL allow you to select them in the order your entered them. You would have to do something like this in order to do what the requestor is trying to do. If you think this is wrong. Please by all means post code that demonstrates it's wrong.
I know what you are saying.

With all due respect, I don't think you understand me.

If you have a clustered index on a key value (this applies to any index, actually), the clustered index will order those keys by their value. They are not ordered "...in the order you entered them...". They are ordered by whatever their value is. There is no requirement that any index value (including the clustered index) be restricted to monotonically increasing values (like an IDENTITY column or a 'current' datetime value) which would impose the ordering you seek.

As you know, a clustered index is simply an index which includes the data comprising the row itself, rather than a pointer to the data which other indexes have. It's still just an index, albeit with the special property of having the data real close by...

Because the optimizer's job is to take into account any number of factors in determining its best query plan, it is free to choose whatever index its thinks best, or none at all if that results in the best query.

I stand by my statement that absent any ORDER BY clause, the rows in a resultset may be returned in any order at all, that is to say, the resultset must be considered unordered, despite what it might look like to your eye. If you want to be guaranteed that a result is to be in some order, you must explicitly state what you want that order to be.

I've said this before. If you want to take advantage of a perceived side-effect (and officially undocumented behavior) because every time you run a query you observe a certain behavior, by all means be my guest. The fact remains that the behavior you see is an accident and is not supported by the specification of the SQL language.

I think we've strayed far enough from the original topic. Presumably the OP has enough information to solve his particular problem. I'd be happy to continue this debate in another thread if you wish.

Jeff Mason
Custom Apps, Inc.
[email protected]
 
Old April 2nd, 2007, 01:15 PM
Friend of Wrox
 
Join Date: Aug 2004
Posts: 385
Thanks: 0
Thanked 0 Times in 0 Posts
Default

You still don't understand me.

If you have an identity key value (with clustered index). The identity will itterate with every row you add. If you add the data in the same order you want to retrieve it. You will get data in the same order.

For example....

Create table Sometest (IdentityID int, SomeText varchar(120))

then you make the IdentityID field a clustered index and auto incriment it. Thus you can insure you retrieve the data in the same order you entered it. Thus making it possible to do what the requester wants to do.

If you don't think this is true the please post the SQL code that demonstrates what your saying.

 
Old April 2nd, 2007, 02:28 PM
Friend of Wrox
 
Join Date: Jun 2003
Posts: 839
Thanks: 0
Thanked 1 Time in 1 Post
Default

Quote:
quote:Originally posted by robprell
 You still don't understand me.
Nor you me. I guess we'll just have to agree to talk past each other.

I'll say it just once more.

The optimizer is free to return the rows from your table via a query like:

    SELECT * FROM Sometest

in any order it wants. The fact that it looks like it returns those rows in the order of the clustered index does not change the fact that it is not required by the SQL specification to return those rows in any particular order at all. I've already listed several non-deterministic situations in prior posts which may cause the optimizer to return the results in a different order than the clustered index, absent an ORDER BY clause.

If you wish to rely upon solely what you observe in a particular version of Microsoft's implementation of the SQL language, and not the language specification itself, by all means go ahead. For your sake I hope MS doesn't change it's query processor in the next release, or that you have enough time to react to an innocent optimizer upgrade so you can change all your non-standard code, if you can find all of it. I hope it all works out for you.

Since I'm repeating myself, we're obviously going around in circles so I yield.

Jeff Mason
Custom Apps, Inc.
[email protected]
 
Old April 2nd, 2007, 08:22 PM
Friend of Wrox
 
Join Date: Aug 2004
Posts: 385
Thanks: 0
Thanked 0 Times in 0 Posts
Default

No I am repeating myself.

If you do a select with an "Order by" as I said earlier it will not be in any order. When you do an order by the clustered identity field it will return in exactly the order you entered the data. Which was my point.



 
Old April 3rd, 2007, 05:11 AM
Friend of Wrox
 
Join Date: Jun 2003
Posts: 839
Thanks: 0
Thanked 1 Time in 1 Post
Default

Quote:
quote:Originally posted by robprell


If you do a select with an "Order by" as I said earlier it will not be in any order. When you do an order by the clustered identity field it will return in exactly the order you entered the data. Which was my point.
Now we're getting somewhere, though I think we are still talking past one another.

I'll assume that the "with" in the first sentence above actually should be "without".

I've said in several earlier posts that resultsets will be ordered by the columns in the ORDER BY clause, and that absent an ORDER BY clause, a resultset is not guaranteed to be in any order at all.

Can we agree on that?

Given all that, though, it has nothing at all to do with the original post regarding string aggregation. To quote myself from an earlier post:
Quote:
quote:
The problem with the (idiomatic) statement:

SELECT @stringvar = @stringvar + ', ' + somecolumn
FROM sometable

is there is no way to guarantee that the various rows will be returned in any particular order, clustered index notwithstanding. The (internal) execution of this statement happens way before the resultset is constructed for return back to the client, so it is highly dependant on the execution plan.
I stand by my statement that the attempt at aggregation cannot be guaranteed to be in any order at all, whether you have a clustered index, identity or not, and whether you specify an ORDER BY clause or not. And that is why this method is considered undocumented behavior.

I'm done beating this poor dead horse...



Jeff Mason
Custom Apps, Inc.
[email protected]
 
Old April 3rd, 2007, 07:01 AM
Friend of Wrox
 
Join Date: Jun 2003
Posts: 2,480
Thanks: 0
Thanked 1 Time in 1 Post
Default

Quote:
quote:Originally posted by Jeff Mason
I stand by my statement that the attempt at aggregation cannot be guaranteed to be in any order at all, whether you have a clustered index, identity or not, and whether you specify an ORDER BY clause or not. And that is why this method is considered undocumented behavior.
Yes, I agree with you on your last post Jeff. As I had said in the other post where I posted this type of aggregation methedology that I used for having Email IDs constructed in a string for passing it to the TO parameter of send_mail procedure... I have always noticed that it gives the result in different order everytime though the table has an Identity column as primary key. Since the sort order is not a constraint at all in my case, I still continue to use it wherever the sort order is not mandatory.

Here is the other post comma separated list I referred to. The third reply from me and corresponding reply from Jeff explains it...

_________________________
- Vijay G
Strive for Perfection





Similar Threads
Thread Thread Starter Forum Replies Last Post
concatenating records rashi Access VBA 9 November 6th, 2008 05:07 PM
Concatenating strings of code sassora Word VBA 1 March 31st, 2006 02:39 PM
Concatenating many XML files srivalli9 XML 4 March 30th, 2006 03:05 AM
Concatenating column values arnabghosh Access 1 September 21st, 2005 06:45 AM
concatenating arrays allee_man BOOK: Beginning PHP4/PHP 5 ISBN: 978-0-7645-4364-7; v5 ISBN: 978-0-7645-5783-5 0 August 19th, 2005 08:55 AM





Powered by vBulletin®
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
Copyright (c) 2020 John Wiley & Sons, Inc.