Re: Sorting over multiple JOINS
Posted by:
Rick James
Date: January 24, 2015 12:36AM
True, in _some_ cases, normalization leads to performance difficulties. I like to say "normalize, but don't over-normalize".
EAV (key-value) schemas are notoriously clumsy. (I have a blog on how to make that work somewhat better.)
Do not normalize "continuous" values such as FLOATs, DATETIMEs, etc., especially if you are likely to need to fetch a 'range' of such values.
Do not normalize "short" values. Example: a 2-letter country code is not much bigger than a TINYINT, but it avoids an essentially unnecessary normalization (and the JOINs that go with it).
> Will the sorting be faster if I raise innodb_buffer_pool_size or sort_buffer_size?
In general, the buffer_pool should be 70% of available RAM -- any larger threatens to cause swapping, which is really, really, bad for performance.
Making the sort_buffer_size too big (bigger than, say, 1% of RAM) can be a problem if a lot of SELECTs are running simultaneously and needing a 'sort buffer'. Again, if this leads to swapping, there is trouble.
We might be able to help you if we had some clues of what type of data you have -- "thingID" and "info1" give us no clues.
> The resultset is about 500 MB
Big results take a big amount of time -- I/O is not cheap. I consider a resultset of 1MB in a production system to be "big" -- sort of an upper limit on what makes sense for production. Again, please explain what you are doing so we can better help you.
Sometimes, one has to bite the bullet and do _some_ denormalization to avoid problems like yours.