MySQL Forums
Forum List  »  Partitioning

Re: Partitioning or indexing for millions of records
Posted by: Rick James
Date: November 28, 2013 09:07PM

Shrink datatypes where practical -- an "hour" or "quarter" does not need a 4-byte INT does it? How about TINYINT UNSIGNED (range of 0..255), which takes only 1 byte.

BIGINT? That's 8 bytes; do you really need more than 4 billion values? INT UNSIGNED is 4 bytes.

TIMESTAMP is 4 bytes; DATETIME is 8.

> schema is normalized
You have 3 VARCHARs. What are the values like? Are they mostly repeated? If so, normalize them, too.

key_buffer_size is set in my.ini (my.cnf). It controls how much index space to allocate in RAM. See
http://mysql.rjweb.org/doc.php/memory

With those changes, the disk footprint might be cut in half. (Smaller -> faster.)

select  c.msisdn,sum(c.dataVolumeDownLink+c.dataVolumeUpLink) as datasum
    from  cdr c
    where  c.eventDate >= <date_var>
    group by  c.msisdn
    order by  datasum desc;
That smells like something that can (should!) be periodically (and incrementally) rolled into a summary table. (See my links.)

Open question: Will you be purging 'old' data? If so, PARTITION by eventDate. If not, then I don't yet seen any use for PARTITIONing.

With a summary table, you won't need the secondary keys; this will significantly speed up inserts. And the SELECT will probably run 10 times as fast against the summary table.

Back to your question:
> which is better indexing or partitioning
Tentatively, I say neither -- summary table is what you need.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Partitioning or indexing for millions of records
1945
November 28, 2013 09:07PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.