MySQL Forums
Forum List  »  MyISAM

Re: .ini settings for ADD KEY to 30Gb table on 96Gb machine?
Posted by: Peter Griffioen
Date: November 01, 2010 12:38AM

Thanks for that Rick. I've made that change.

I've been 'playing' with this a lot and its a bit frustrating but now I think I know what I have to do. I've given this some test runs and found the way I was doing it got everything loaded but running really slowly. To explain....

The PointId's act as locations and the TileLoadId as time of image.
The DB is primarily going to have a variety of queries thrown at it like (pseudo SQL):
Select PointId, AVG(Red), AVG(Blue)... from ModisPixelStack
Where PointId >= NN and PointId <= MM
AND ModisTileLoadId >= T1 AND ModisTileLoadID <= T2
AND (VIQA & WantedFlags = WantedFlags)
Group by PointId Order by PointId;

The AVG function may be a whole suite of others such as Max, Stdev or there may be no 'GROUP BY'. I may ask for every instance to do my own Median or Change stats. I'm even thinking of doing a FFT.
My software creates the SQLs to the specification of the Output Image Row by Row, generally asking for say 10000 pixels at a time. It does its calcs and reformatting and then spits out the requested image.

I just loaded the year 2000 images and it came to 30GB of data and 35GB of indicies. Only 10 more years to go! It has become obvious to me that I should have a new MYISAM table for every year, each being about 30GB and then use a Merge Table if I want to access across years.

Originally I wanted to Primary index by PointId, ModisTileLoadId but someone (my wife) pointed out to me that the DB would have to interlace every pixel which would take forever as more tiles were loaded. She suggested setting the Primary Key to ModisTileLoadId, PointId and then the primary key would be exactly the load order if I sorted the files correctly. She was right and everything loaded as quickly as could be expected, given 1200 "Load Data Infile" commands. However performance was terrible. 8 min to return a list of values (say 10 tiles) of a single pixel. i.e.
Select ModisPixelStack_2000.* from ModisPixelStack_2000 where PointId = nnnn;

Nasty! Thus I had to return to my original idea. Now I'm using:

ALTER TABLE `modisimagestack`.`modispixelstack_2001` ENGINE = MEMORY;
ALTER TABLE `modisimagestack`.`modispixelstack_2001` DROP PRIMARY KEY, ADD PRIMARY KEY USING BTREE(`PointId`, `ModisTileLoadId`);
LOAD DATA LOCAL INFILE 'C:/Temp/ModisTXT/Modis_101_366_2044801_1.txt' INTO TABLE ModisImageStack.ModisPixelStack_2001;
LOAD DATA LOCAL INFILE 'C:/Temp/ModisTXT/Modis_106_382_2044801_1.txt' INTO TABLE ModisImageStack.ModisPixelStack_2001;
..........
ALTER TABLE `modisimagestack`.`modispixelstack_2001` ENGINE = MYISAM;

with the Primary Key set to PointId, ModisTileLoadID and Temporary Table Size = 80GB and Max Heap Table Size = 80GB. All of the re-arranging of the data is done in memory and so should be fast. This is executing right now so I'll see how it goes, but I may run out of memory....

Peter

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.