MySQL Forums
Forum List  »  Optimizer & Parser

Re: Strategy needed for picking random rows
Posted by: Rick James
Date: September 08, 2011 07:51AM

I'm glad to see you have figured out the inner-outer query trick. Almost. The LIMIT needs to be in the subquery. The index does need to include `id`. (If it is the PRIMARY KEY on InnoDB, it will be implicitly part of the index.)

See if this runs faster:
SELECT  m.id, m.bulky_field1, ...
    FROM  members m
    INNER JOIN  
      ( SELECT  id
            FROM  members
            WHERE  (sex = 'm')
              AND  (img = 1)
              AND  (home IN (1,2,3,4,5,6,7,8))
              AND  (dob BETWEEN '1960-09-07' AND  '1979-09-08')
            ORDER BY  RAND()
            LIMIT  200
      ) x USING (id) ;

Alas, that does not take care of haircolor.

Something to try, for the more general case:
Add an extra FLOAT column `rnd`; populate it with RAND().
You may need to repopulate that column periodically, else the following query will tend to clump the same people together often.
Have INDEX(rnd, id)
Then do this:
SELECT  ...
    FROM  members m
    JOIN  
      ( SELECT  id
            FROM  members
            WHERE  rnd > RAND()   -- **
              AND  sex = 'm'
              AND  ...
            ORDER BY  rnd         -- **
            LIMIT  200            -- **
      ) USING (id);
However, if RAND() is too big, it will run out of rows before it gets 200. For that case, I suggest keeping the rows you did get, then grabbing the rest of the 200 via a second SELECT that is missing "rnd > RAND()" and has a lower LIMIT. This will "wrap" around.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Strategy needed for picking random rows
1124
September 08, 2011 07:51AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.