Quote:
quote:Originally posted by robprell
Can't agree with this comment:
"there is no guarantee that the returned rows will be in the order of the clustered index"
If you have a clustered index and you order by the clustered index when you select the data the retrieveal is most efficient and it will be in order.
|
My comment was in regards to the string aggregation query, but in fact applies to any query. Your statement is correct as long as you "...order by the clustered index...".
In fact, if you ORDER BY anything, you are guaranteed that the resultset will be in that order, and that has nothing to do with any index.
Conversely, if a query has no ORDER BY clause, then you cannot
guarantee that the resultset is in any particular order.
It happens that the optimizer will be greatly influenced by the presence of an index, but it may in fact choose not to use it if it thinks another plan is more efficient. It is dangerous to depend on empirical observation and side-effects: that the optimizer will
always give you results in a particular order just because every time you run it, you see those results in that particular order.
The fact is that only way to guarantee a specific order is to state you want it that way.
The question is in regard to this query:
SELECT @stringvar = @stringvar + ', ' + somecolumn
FROM sometable
ORDER BY someclusteredindexcolumn
In point of fact, that ORDER BY clause will have absolutely no (guaranteed) effect on the order of the strings as the concatenation is being done. This is because (in effect) an ORDER BY clause is the very last operation that is performed on a resultset.
That is, the resultset is constructed from columns in the SELECT clause from the rows in the tables in the FROM clause, filtered by the WHERE clause and GROUPed accordingly. All of these operations are set-based operations, in that they happen on all the rows and columns all at the same time (or at least that is how they appear).
Only after all that is done is the resultset sorted as per the ORDER BY clause. Note that sorting is a cursor operation. In effect, the ORDER BY clause takes the set produced by the rest of the SQL query and processes the rows one at a time (that's what a cursor is) to place them in the requested order. If there is no ORDER BY clause, then no ordering is necessary so the resultset is passed back to the consumer unmodified. It is in whatever order those set based operations left it in.
Now it happens that the optimizer is smart enough to recognize that indexes exist on tables and those indexes present the rows in a known order, so the last sorting step may be able to be avoided because the optimizer will construct a query plan taking those indexes into account. Since sorting is expensive, those indexes will influence that choice heavily.
But the fact is that the string aggregation in the query above is a process that happens
well before any ordering, so you simply cannot depend on what order the rows will be processed, and thus in what order the strings will have been concatenated in the resultant output variable.
Just to throw more fuel on the fire, :D it is also not true that the use of the clustered index
always is the most efficient means of data retrieval. If a covering index exists for a given query, then the optimizer may choose it and avoid the clustered index altogether, getting the results from the index itself...
Jeff Mason
Custom Apps, Inc.
[email protected]