MySQL :: Re: Partition on day of year

New Topic

Re: Partition on day of year

Posted by: Rick James
Date: March 04, 2011 08:58AM

Now this is a INDEX question more than a PARTITION question.

select  *
    from  test_jobs
    where  customer_id = 1
      and  job_date_time between '2010-06-14 00:00:00' and '2010-06-14 23:23:59'
      and  printer_name like 'Shar%'
      and  username like 'jo%'
    order by  job_date_time
    limit  0,25;

1. job_date_time BETWEEN will do "partition pruning" to decide to search only 1 or 2 partitions. (Run EXPLAIN PARTITIONS SELECT ...)

2. Then it will look for a good index to use. Let me explain possibilities...

* INDEX(job_date_time, ...) -- not useful since you are asking for the entire partition.

* INDEX(customer_id, ...) -- Good (assuming they aren't all (or mostly) =1). Since it is '=', this is a good way to _start_ a compound index.

* INDEX(customer_id, printer_name, ...)
* INDEX(customer_id, username, ...)
Either of these would completely deal with the customer_id and the LIKE. Note: the LIKE does _not_ start with a wild card; this is critical for making these indexes useful. But since the LIKE has a wild card, there is no use having any further fields in the INDEX. MySQL won't use them (usually).

* INDEX(customer_id, job_date_time, ...) This, on the other hand, might handle the ORDER BY. Again, there is no use going beyond the two fields. And, since you have a LIMIT, this _might_ be the best. (If you add all 3, see EXPLAIN SELECT.)

Caution: adding an index to a 200M-row table will probably take hours.

Any of these would help your PARTITIONed table; I cannot say which would be best (it is heavily data-dependent). You could add all 3, if you like; then the optimizer would peak at each of them and (usually) pick the best.
* INDEX(customer_id, printer_name)
* INDEX(customer_id, username)
* INDEX(customer_id, job_date_time)

The same query, in a non-PARTITION table, would be best served by only:
* INDEX(customer_id, job_date_time)
The others could also be useful if printer_name SHAR 'Shar%' (etc) is selective enough.

SHOW VARIABLES LIKE '%buffer%';
SHOW VARIABLES LIKE 'innodb%';
A common mistake is to not set the appropriate cache to a useful size. Another is in having the InnoDB parameters set to secure rather than fast.

EXPLAIN SELECT ... -- to see what index it is using.

Your existing index
INDEX(`customer_id`,`job_date_time`,`printer_name`,`username`)
might be useful if you pull a subquery trick:

SELECT t.*
    FROM test_jobs t
    JOIN (
        select  id
            from  test_jobs
            where  customer_id = 1
              and  job_date_time between '2010-06-14 00:00:00' and '2010-06-14 23:23:59'
              and  printer_name like 'Shar%'
              and  username like 'jo%'
            order by  job_date_time, id
            limit  0,25
         ) s  ON t.id = s.id
    where  t.job_date_time between '2010-06-14 00:00:00' and '2010-06-14 23:23:59'
    order by  t.job_date_time, t.id;

Here's the rationale...
The 'subquery' can do all its work using your INDEX. (You will see 'Using index' in the EXPLAIN.) It will boil down to 25 ids.
Then the outer query will look up the full rows for those 25 ids.
Notes:
* Since you are using InnoDB, the PRIMARY KEY is included in every secondary key. (`id` is useful here.)
* "where t.job_date_time" is redundantly in the outer query so that it focuses on the one partition.
* The ORDER BY is repeated because, technically, the outer query could lose the ordering that the inner query created.
* In case there are duplicate times, I added ",id" to make the order deterministic (though you possibly don't care).
* This trick will have some efficiency even for "LIKE '%HP%'", whereas my previous INDEX discussion would ignore any LIKE with an initial wildcard.

Postscript: You probably have other flavors of SELECT that you have not showed us. I have covered only the two SELECTs you mentioned.

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

Partition on day of year

8827

Brian Wilson

February 28, 2011 07:18PM

Re: Partition on day of year

3016

Brian Wilson

February 28, 2011 08:43PM

Re: Partition on day of year

4236

Shantanu Oak

March 01, 2011 12:53AM

Re: Partition on day of year

2700

Rick James

March 01, 2011 10:33PM

Re: Partition on day of year

2566

Brian Wilson

March 02, 2011 10:00AM

Re: Partition on day of year

2200

Rick James

March 02, 2011 11:16AM

Re: Partition on day of year

2366

Brian Wilson

March 03, 2011 09:47PM

Re: Partition on day of year

2792

Rick James

March 04, 2011 08:58AM

Re: Partition on day of year

2220

Brian Wilson

March 04, 2011 10:45AM

Re: Partition on day of year

2773

Rick James

March 04, 2011 08:23PM

Re: Partition on day of year

1954

Mattias Jonsson

March 02, 2011 02:58AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.