Re: DATETIME vs. INT(11)?
Posted by:
yawnmoth
Date: August 26, 2009 09:23AM
Rick James Wrote:
-------------------------------------------------------
> Lots of rows --> lots of time to execute.
> Especially if it has to hit the disk.
>
> Functions or no functions -- this is relatively
> insignificant in performance issues.
>
> Consider building a summary table indexed by
> year+month.
In my original post, I did an explain on the table with and without an index. The problem is, some functions break the indexability of an otherwise indexed column. from_unixtime() is one such function. That row A's purchase_date column is greater than row B's purchase_date column does not mean that that row A's from_unixtime(purchase_date,'%m') is going to be greater than row B's from_unixtime(purchase_date,'%m'). For example, 2009-03 > 2008-04, but 03 > 04. As such, although you could perform a binary search on purchase_date if it were indexed, you can't perform a binary search on from_unixtime(purchase_date,'%m'), even if purchase_date is indexed - you have to recalculate everything.
from_unixtime(purchase_date,'%y%m') does not have that same problem, however, is MySQL smart enough to differentiate between from_unixtime() calls that invalidate the index and from_unixtime() calls that don't? And if it is, is it smart enough to see that "GROUP BY from_unixtime(purchase_date,'%y'), from_unixtime(purchase_date, '%m')" and "GROUP BY from_unixtime(purchase_date, '%y%m')" are the same thing?
In contrast, if DATETIME were internally implemented as three columns, such differentiation would be unnecessary. MONTH(purchase_date), in this scenario, could just tell MySQL to look in one of the three columns as opposed to looking in all of them.
You suggested "building a summary table indexed by year+month". I'm not even sure what you mean by that. How is a summary table different from a regular table? And are you proposing year and month be separate columns? If so, that's certainly do'able, but that still ignores the fundamental issue I raised - that DATETIME seems to offer no real advantage over INT(11).