Wrox Home  
Search P2P Archive for: Go

  Return to Index  

sql_language thread: Random results


Message #1 by Amir Meskovic <meskovic@e...> on Fri, 21 Sep 2001 03:30:08 +0400
Oh dear 

We are of course using indexes here. A unique record
identifier kind of implies that.

I chimed in again because of the use of a temporary
table and of course came back with an alternative use
of temporary table that did not require a dump copy. 

The original request spoke of I believe two records.

Of course I follow the logic of TOP 'n' and a 
temporary table and inserting 'n' created random
numbers. Of course I can see where this might be
appropriate with large numbers and perhaps a one off
dump.

Remember I set out a practical approach to potshots. 
the rule is equal to or greater than or less than. 
The 100 record will always be returned because it is
caught by the rule its the last one left!

You mention of course 'bags' and 'sets' and I played
around of course with these with C++. This leads of
course into where such activity is taking place in
memory or cached memory or disk access. Search
strategy covers a number of factors. Which begs the
question of how the SQL Database itself optimises its
SQL code (which of course like all good compilers it
does).

Of course you have to consider the needs and
requirements of your own system, and then devise an
appropriate strategy. If you want build a optional
strategy into the Proc so that it can go either
potshots or copy well OK there are of course pros and
cons with each. I can also think of combining them.

Optimising queries with data is fun I have actually
done this kind of thing in the dim and distant past.

--- Steve Carter <Steve.Carter@t...>
wrote:
> > 
> > Actually I am not satisfied. Potshots are
> scalable.
> > 
> You've be fine for the first few iterations but then
> you will slow down
> as you progress, so the time taken for the (i)th
> number takes (i) units 
> of time to check that it hasn't already been picked.
>  Thus to pick (n)
> random numbers, you take n*n/2 units or time, or 'It
> has a time complexity
> of Order of n squared'
> 
> By contrast, the 'top n' approach is linear on the
> surface but does require
> a sort on the table.  Sorts can be achieved in
> (iirc) n*log(n) time which 
> as you get to taking more and more random numbers,
> improves its performance 
> wrt the former.
> 
> Of course for small n this may well be slower.
> 
> But consider also the case of taking 100 random
> numbers from 100 supplied 
> numbers.  Using potshots, it will take you forever
> to get the last number,
> when only one number is valid and you must check
> each candidate against 99 
> others to see whether you already picked it.
> 
> I also notice from your reply that you don't seem to
> have grasped the other
> algorithm:
> 
> > What is a random number if it is not a potshot? Of
> > course there is the possibility of returning the
> same
> > record. 
> 
> If you are interested, it could be worth looking up
> some math pages.  The
> key word you want is 'combinatorics'.  The
> difference in the two approaches
> is that you are dealing with a 'set' and the other
> algorithm is dealing with
> a 'bag'.  This is a fascinating subject, if you can
> crack into it.

  Return to Index