Re: Storing a large, dense tensor (35gb scientific data) -- is mysql appropriate?
Posted by: Rick James
Date: November 07, 2010 06:54PM

Guess what? An INDEX stores redundant data. Granted, the storing of it is automatic, so it is not as bad as maintaining the redundant data yourself.

8 bytes? DOUBLE? FLOAT (4 bytes) is not precise enough?

Don't write your own storage engine. That would be a lot of work, and you might not achieve much.

I can't say much more without getting a better feel for the SELECTs you need to apply.

Also, please provide SHOW CREATE TABLE; this will give me some concrete names to talk around.

"with its full coordinates is ~36gb" -- That's an average of 15 bytes per dimension! Normalize! You might get down to 7GB.

The YEAR datatype is 2 bytes (no normalization needed). You could shrink that to 1 byte if you can be sure you have a range of under 255 years, say, by subtracting 1900. That could still allow 'range' queries (like: WHERE year BETWEEN 2003-1900 AND 2010-1900; do _not_ do WHERE year+1900 BETWEEN 2003 AND 2010).

Country could at least be the 2-byte Intl codes, or possibly a 1-byte number plus a lookup table. If "state" is a subdivision of "country", then these could be combined, thereby eliminating dimension(s). On the other hand, if you need to 'report' on a 'country' or report on a 'state', then arrange for the state-country pairs to be consecutive within a country. Maybe something 'simple' like (assuming no more than 300 states in a country):
country_id * 300 + state_id
and store the pair in a SMALLINT UNSIGNED (2 bytes (0..65535) for both dimensions).

Are any other dimensions not 'independent'?

Indexing... Having lots of indexes may not be worthwhile. Again, let's see the SELECTs before commenting.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.