MySQL Forums
Forum List  »  Newbie

Re: MySQL Compression
Posted by: Rick James
Date: January 13, 2010 11:49PM

There is no automated way to do what you describe. The PHP code to do such is not too difficult. Some comments...

* MD5(TextField) (in PHP or MySQL) can be useful for shrinking a large field into a shorter, unique value for comparision/lookup. (Granted, there is a chance of unwanted duplicate md5 values, but it is a very tiny chance.)

* Vertical partitioning -- this will help with performance in certain situations when you data gets big. (What you have might be called 'medium' sized.) Some references:
http://forums.mysql.com/read.php?24,255788 (RAM requirements- new server)
http://forums.mysql.com/read.php?10,254231 (TINY TEXT vs Varchar(240) and TEXT vs Varchar(300-500))
http://forums.mysql.com/read.php?24,240757 (Questions about performance for huge table (100M+ rows))
http://forums.mysql.com/read.php?10,241445 (maximum number of columns can my mysql version support)

* I would compress the data -- but do the compression in PHP. More refs (perhaps):
http://forums.mysql.com/read.php?24,246927 (350+ million rows -- is this as fast as it can get? (too slow at the moment!))
http://forums.mysql.com/read.php?10,247779 (compressing tables)
http://forums.mysql.com/read.php?24,243890 (Use InnobDB or MYISAM?)
http://forums.mysql.com/read.php?20,243655 (compression; selection of db engine)

* Normalization (as you suggest)... Since it is <255 characters, you could index the text field. If it were larger, I would recommend getting MD5 involved.

* If you do normalize, then future INSERTs will need to check for the item already existing, if so, get its id, if not insert it and get its id.

Options: ReplyQuote


Subject
Written By
Posted
January 12, 2010 10:20AM
Re: MySQL Compression
January 13, 2010 11:49PM
January 14, 2010 04:13PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.