MySQL Forums
Forum List  »  Performance

Re: Millions storage...
Posted by: James Day
Date: March 28, 2005 07:53PM

Don't create one table per day. Unlikely to be efficient or necessary.

What is the usual ordering of reports? Is it positions for one vehicle (or a few) over time or position now of all vehicles? If one at a time, InnoDB and a primary key with the vehicle ID as the first part of the key and timestamp as second part will help because many records for each vehicle will be in the same 16k database page. This should be very efficient even with very large amounts of data. The current position information will probably be cached by InnoDB (cached database pages), so that won't matter so much. Can't do it for MyISAM, which stores in time added order (but you can alter table order by for MyISAM as a batch job).

If you are using MyISAM now, getting all records for one vehicle for a range may need one disk seek per record. Change to that InnoDB approach with that primary key and it maybe one disk seek for every few hundred records - much faster!

If you don't need to retain all records forever you might consider using the merge storage engine ( ) with MyISAM and monthly tables. Then you can remove each month from the merge when it is no longer needed.
Or you could mix InnoDB for current and recent data with merge for older data, moving the records form InnoDB to the merge tables in a batch job. That way you'll get the right ordering for the recent information automatically from InnoDB and reduced storage space for old data.

MyISAM and merge for old records sounds most likely to be best for your purpose, if you do not need the ACID properties InnoDB provides.

But I'd still try InnoDB first, because it's more automatic and less likely to break if customers don't follow procedures properly. It would get slower over time but nothing would break. I think that with the primary key approach I described, it would be faster than MyISAM.

For your common queries, consider creating a "covering index". That is an index which contains all fields in all parts of the statement, in the order needed by any order by part. This will avoid the need to get the field values from the data records and can make a very big difference to the number of disk seeks needed for a query. Explain select will say "using index" when a covering index is being used.

Many disks are probably most important for the hardware needs. Maybe four in RAID 10? If the RPM is the same, SATA and SCSI are very close in performance now. For a given budget you'd probably do better with more disk drives and SATA than with SCSI unless case space restrictions limit the number of drives.

For 32 bit if your budget allows it, up to 2GB of RAM can be useful. I can't tell you how much you need because I don't know the properties of the application. 64 bit systems can use much more and with InnoDB I operate one server with 8GB and four with 4GB. So, get RAM of sizes so you can add more later - it's an easy way to add speed. If your motherboard can take 256k, 512MB, 1GB or 2GB RAM, maybe buy 1GB with one 1GB part instead of four 256, so you can use the other three slots later.

CPU is less important. You're probably disk seek limited. More drives, good schema and covering indexes are the solutions for that.

If you have not yet created a my.cnf file, do that and use good values. Very important. See my_large.cnf for an example. Very common for people not to do this and then to wonder why it is very slow.

Options: ReplyQuote

Written By
Re: Millions storage...
March 28, 2005 07:53PM
March 30, 2005 12:50AM
March 30, 2005 01:16AM
April 01, 2005 12:51AM
March 31, 2005 11:34PM
April 01, 2005 09:07AM
April 01, 2005 09:09PM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.