MySQL :: Re: Questions about performance for huge table (100M+ rows)

New Topic

Re: Questions about performance for huge table (100M+ rows)

Posted by: Vladan Stefanovic
Date: January 03, 2009 06:02PM

Thank you for your answers.

I would like to clear up some things that weren't so well explained in my first post.

- I am using `id` because, like I said, the `name` is it's base 36 encoded value. However, because `name` is required to be unique the only way I can be sure that it is, is by having an auto_increment field to base it on. Furthermore, the `name` can also be custom (but is so somewhat rarely), and before any INSERT I'm also doing a SELECT for the requested `name` to notify of it's unavailability and fetch other relevant fields. Also, if the INSERT is done without a custom `name`, it is possible that the `name` generated from `id` already exists (since custom `name`s are allowed), so I need to insert a blank row and continue doing so until the `name` from current `id` is unique. Blank rows are deleted every 7 days.
Considering all that, SELECTs are done against the `name` only.

- The `updated` field is a timestamp, but it's actually a UNIX timestamp adjusted to GMT, which is basically an integer. I'm using that because it can then easily be converted to any needed timezone and date format.
Same thing for `added`.

- 90% of the SELECTs on this table will fetch a single row by it's `name`. But this row also needs to be updated, namely it's `updated` and `hits` fields.
The rest of the queries will fetch either 100, 1000 or 10000 rows and they need to be ordered by either `id`, `special`, `added` or `hits`. The only conditional is that the `name` is not NULL or that it specifically is NULL. I don't need these results to be real time, so I can cache them for 3 to 24 hours.

- Based on your input, I modified the structure of the table. Here it is:

CREATE TABLE `test` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varbinary(50) default NULL,
`special` tinyint(1) NOT NULL default '0',
`content` text collate utf8_unicode_ci NOT NULL,
`hash` varbinary(40) default NULL,
`ip` varbinary(15) NOT NULL,
`added` int(10) unsigned NOT NULL default '0',
`updated` int(10) unsigned NOT NULL default '0',
`hits` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`),
UNIQUE KEY `hash` (`hash`)
) ENGINE=InnoDB

The main problem here is the fact that my employers say that their load varies wildly; at certain times there can be only a few dozen queries per second but a couple of hours later there could be hundreds per second on this table alone. So, I need to prepare for the worst as they need shortest possible response times in any given moment. And after the new application is done (which consists of PHP frontend and Java backend, both written by me) the average number of queries will continue to rise as will the number of rows in this table.

I'm sorry if I showed ignorance somewhere, but like I said, this is my first project with such a large database and thus I never needed to pay too much attention to optimization. Hopefully it won't be the last :)

Edited 2 time(s). Last edit at 01/05/2009 11:16AM by Vladan Stefanovic.

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

Questions about performance for huge table (100M+ rows)

12877

Vladan Stefanovic

January 02, 2009 08:20PM

Re: Questions about performance for huge table (100M+ rows)

4900

Rick James

January 03, 2009 12:38AM

Re: Questions about performance for huge table (100M+ rows)

3363

Boyd Hemphill

February 03, 2009 10:01AM

Re: Questions about performance for huge table (100M+ rows)

3402

Vince Gatto

January 03, 2009 01:17AM

Re: Questions about performance for huge table (100M+ rows)

2937

Rick James

January 03, 2009 04:58PM

Re: Questions about performance for huge table (100M+ rows)

2883

Vladan Stefanovic

January 03, 2009 06:02PM

Re: Questions about performance for huge table (100M+ rows)

2540

Rick James

January 04, 2009 07:35PM

Re: Questions about performance for huge table (100M+ rows)

3122

Vladan Stefanovic

January 05, 2009 09:53AM

Re: Questions about performance for huge table (100M+ rows)

2827

Rick James

January 05, 2009 06:49PM

Re: Questions about performance for huge table (100M+ rows)

3175

Vladan Stefanovic

January 05, 2009 09:21PM

Re: Questions about performance for huge table (100M+ rows)

2734

Rick James

January 06, 2009 10:00PM

Re: Questions about performance for huge table (100M+ rows)

2582

Victor Andres

January 21, 2009 11:50AM

Re: Questions about performance for huge table (100M+ rows)

2867

Rick James

January 21, 2009 11:04PM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.