Re: What's the worst that can happen with below database design? Immediately or in the future?
Posted by: Rick James
Date: September 05, 2014 06:03PM

> i think it decreases search performance a lot, and all data fetches requires json_decode in php.

Either MySQL parses the fields its way, or json_decode parses the fields. I doubt if there is much performance difference. The #1 MySQL performance cost is I/O. #2 is locating rows in the cache after the I/O has put them into cache. I would put bursting the row to be about #3.

SELECT  mt.*
    FROM  main_table mt
    LEFT JOIN  attribute_table1 at1 ON at1.primary_key = mt.primary_key
    LEFT JOIN  attribute_table2 at2 ON at2.primary_key = mt.primary_key
    WHERE  mt.field1 LIKE '%Bamboozle%'
      AND  at1.field1 = '501835'
      AND  at2.field5 <> 'NOTTHIS'
That has to be performed as follows:

1. Scan the entire mt, filtering on field1. (A leading wildcard on the LIKE prevents use of any index.) For each successful row, do:
2. Do point-query via primary_key into at1. If 501835 is not found, remove the row from consideration.
3. Similarly for at2.

Total rows scanned: The number of rows in mt (you mentioned 10K?), plus a (hopefully) smaller number of rows in at1 and at2. If mt cannot be cached, then the query will be I/O-bound even after the first time.

Since they are _LEFT_ JOINs (presumably deliberately), the optimizer must start with mt, in spite of the likelihood that at1.field1 = '501835' would be very selective.

The alternative is to understand the application enough to realize that
1. att.field1 is not optional
2. that it is likely to be used for lookups
3. move it into the main table so that an index can be used.
Now, it looks more like:
SELECT  mt.*
    FROM  main_table mt
    LEFT JOIN  attribute_table2 at2 ON at2.primary_key = mt.primary_key
    WHERE  mt.field1 LIKE '%Bamboozle%'
      AND  mt.account_number = '501835'
      AND  at2.field5 <> 'NOTTHIS' 
-- with this on mt:
INDEX(account_number)
Now the query
1. hits the new index to locate one (or a small number of rows) with 501835,
2. double checks for '%Bamboozle%'
3. moves on to at2.
This adds up to a few (or few dozen) rows, not 10K+. It is unlikely to be I/O-bound, even for a huge dataset.

Then... Taking the JSON step would be to put mt.field1 and at2.field5 into the JSON column in mt:
CREATE TABLE main_table (
    id ...,
    account_number ...,
    json TEXT NOT NULL,
    PRIMARY KEY(id),
    INDEX(account_number)
);
-- and the query becomes
SELECT  mt.*
    FROM  main_table mt
    WHERE  mt.account_number = '501835';
-- plus code in PHP to filter for Bamboozle and not NOT THIS.
This is even less effort on the part of MySQL (since there is no JOIN). But it is more effort in PHP -- it's either a lot of PHP code to construct all the LEFT JOINs, or a lot of PHP code to do the filtering of secondary search criteria.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.