MySQL Forums
Forum List  »  Newbie

Re: New to design; how to optimize 3 column single table with high I/O?
Posted by: Marie Sutherland
Date: July 11, 2011 10:31PM

Rick James Wrote:
-------------------------------------------------------
> > per user per event
> Eh? I don't see any column for "event"?
>
> Can you really guarantee that you won't insert two
> rows with the same millisecond? Of so, then
>
> CREATE TABLE foo (
> userid MEDIUMINT UNSIGNED NOT NULL,
> timestmp BIGINT UNSIGNED NOT NULL, -- or
> DOUBLE
> state ENUM('yes', 'no') NOT NULL,
> PRIMARY KEY(userid, timestmp) -- Here's your
> index.
> ) ENGINE=InnoDB;
>
> CREATE TABLE Users (
> userid MEDIUMINT UNSIGNED NOT NULL
> AUTO_INCREMENT,
> ..., -- info about the user
> PRIMARY KEY (userid)
> ) ENGINE=InnoDB;
>
>
> Any query that includes WHERE userid=... will be
> quite efficient. In particular, for
> > For User X at Time X, Yes or No?
>
> SELECT state
> FROM foo
> WHERE userid = ...
> AND timestmp <= ...
> ORDER BY timestmp DESC
> LIMIT 1;
>
> Very efficient.
>
> If you are starting with a username, then a JOIN
> would be indicated.


Thank you so much for your response.

I have a couple of questions I'd like to ask.

(1) I cannot guarantee that I will not insert unique times; multiple user events will occur and they'll happen with a probability of n users/event numbers. Given a few million records, would it make sense to create a numbered first column that is nothing more than a unique key (if I understand the notion of unique keys correctly in MySQL)? The useful life of each record is no more than a few hours; I will probably implement a cleanup of all records with a timestamp of >24 hours as part of the design.

(2) Why have you chosen an ENUM instead of a binary for the state? It will never be other than 1/0. Is that the most efficient type?

(3) Does it make sense to index by username and timestamp given my description? It seems like a lot of work to index by timestamp when it doesn't seem like it will provide any more useful information than the user_id will.

(4) There may be a necessity to add another column to the db. I mentioned events; the db would now look like:

UNIQUE_ID, USER_ID, EVENT_ID, TIMESTAMP, STATE

That event_id won't be part of the high I/O; it will be checked later, though. Each of the timestamps + state updates will be occurring during an event. Now, given this information, does it make sense to index by user_id first and event_id second instead of by timestamp? I'd give a ballpark figure of 350 timestamp + state updates during each event.

(5) I just went and looked up InnoDB; it is a product of Oracle and it looks like Oracle has substituted InnoDB for MyISAM as the default storage engine in MySQL 5.5. Why are you recommending that I use InnoDB instead of MyISAM here?

Options: ReplyQuote


Subject
Written By
Posted
Re: New to design; how to optimize 3 column single table with high I/O?
July 11, 2011 10:31PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.