Hi Ingo,
Thanks for the feedback, although I'm still a little confused about the difference between the 'Locked' and 'System locked' states.
We regularly see 'Locked' states, which I thought meant that one thread was waiting for another thread (query) to finish before it executed, for example one INSERT locking a table and a subsequent SELECT waiting for it to finish. In our case it's more likely that the locks are applied to subsequent INSERTs, as we never delete data from the table for performance reasons (avoiding the 'holes' in the .MYD files).
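For example, the pattern I have in mind looks roughly like this (table names made up, just for illustration):

    -- session 1: a long-running statement holds the MyISAM table lock
    INSERT INTO readings_archive SELECT * FROM readings_staging;

    -- session 2: shows up with state 'Locked' in SHOW PROCESSLIST
    -- until session 1 finishes
    SELECT COUNT(*) FROM readings_archive;

(Or, in our case, more likely the other way round: a long SELECT holding up the INSERTs.)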
What I don't understand is 'System locked' - the documentation seems to suggest that this is only used when running multiple instances of the MySQL server against a single database, which would presumably mean it's some locking between processes on the same physical machine? All the documentation also seems to suggest that --skip-locking (a) is disabled by default and (b) only comes into play when running multiple servers. Is that not the case?
As you said, the key cache file is small and I'd be surprised if this was taking a long time to process.
We definitely aren't hitting a disk space limit - the RAID array is over 2 terabytes and the table's data file (.MYD) is currently only around 180 GB at most. The table contains approximately 380 million rows (growing by around 2.8 million rows per hour at peak, usually less) and was created with AVG_ROW_LENGTH=512 and MAX_ROWS=1,000,000,000, which I believe limits the table to around 500 GB? Is it true to say that MySQL will not extend the .MYD beyond this limit, or is it just a guideline? I'm not aware of any file size limits on Windows Server 2003, although there may well be one?
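For what it's worth, my 'around 500 GB' figure is just the naive product of those two settings - I may well be misunderstanding how the data pointer size is actually derived from them:

    MAX_ROWS x AVG_ROW_LENGTH = 1,000,000,000 x 512 bytes
                              = 512,000,000,000 bytes
                              = roughly 477 GiB (or 512 GB counting in powers of ten)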
I am worried about performing a myisamchk against the table in question to find out the sizes you mention, as it's a live system - if myisamchk creates a system lock for any period of time we'll start running into problems.
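Would something like SHOW TABLE STATUS give me the same figures without tying the table up? If I've read the manual correctly it reports the current and maximum data file sizes, e.g. (table name made up again):

    SHOW TABLE STATUS LIKE 'readings_archive'\G
    -- Data_length:     current size of the .MYD in bytes
    -- Max_data_length: the limit implied by MAX_ROWS / AVG_ROW_LENGTH
    -- Index_length:    current size of the .MYI in bytes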
According to http://dev.mysql.com/doc/refman/4.1/en/system.html, skip-external-locking does exist? I recall reading that it was renamed from skip-locking in 4.0.x.
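In other words, is the intention that we'd add something like this to my.ini (using the newer option name, and assuming it even applies on Windows)?

    [mysqld]
    skip-external-locking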
Re: my comments regarding the performance of the SELECT query - the query itself generally starts returning data almost immediately; certainly there is no significant delay before we can start streaming data. It can take around 5 minutes to stream the data back, but bear in mind this is reading 2.5 - 3.5 million rows in sequential order from the RAID array (striped etc.). The query is performed roughly once every 45 minutes, sometimes more often, sometimes less. We had to read the entire set of results (which includes data we don't need) and filter it client-side, because MySQL spent forever seeking around the disk trying to find the records we needed (largely, we think, due to disk seek times), so it's quicker and easier for us to pull the entire data set in the order it was written to disk, effectively eliminating disk seeks.
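Roughly speaking, the query has the following shape (table and column names made up; the real query is more involved, but the idea is the same):

    -- made-up names, for illustration only
    SELECT *
      FROM readings_archive
     WHERE id >= 1234567890;   -- first row id of the 45-minute window
    -- the table is insert-only, so rows come back in roughly the order
    -- they were written to the .MYD, i.e. one mostly sequential read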
Once again, any suggestions would be welcome!
Cheers
Brian