MySQL Forums
Forum List  »  General

Database design for multiple choice questionnaire results?
Posted by: qwerty awerty
Date: October 18, 2004 10:55AM

Hello all,

Would greatly appreciate informed advice on how to best structure a database for multiple choice questionnaire results. Here's the situation:

We poll 20,000 people. The questionnaire has ~600 questions to which there are ~3,000 possible responses. Each response has a uinique numeric response id, and each respondent has a unique numeric respondent id.

We compile statistics by cross-referencing responses:

How many people gave response a and b and c?

Once the survey is complete, the data is never changed, except maybe by an admin if an error is found. So I want to optimize for selects.

A plain old 20,000-row by 3,000-column table, with boolean columns containing 1's and 0's to indicate if the respondent gave that reply, is proving verrrrrry slow. With 3,000 columns, and all columns being equally likely to be searched, indexing them all is impossible.

I also tried a two-column structure: respondent_id, response_id, with a compound primary key on those. One row per respondet - response pair. This produced a 12-million row table! This was much faster! Using a benchmarking module in Perl, I measured that this reduced select times to ~0.16-seconds. BUT, I have to execute ~100 such queries to produce a single report, and results are expected by users faster than 16 seconds - reports are generated in real time and there are far too many possible combinations of reports to pre-generate them.

I'd like to get the execution time for a single select down below 1/10th of a second.

I also tried a reverse index, using a separate table for each response_id and filling rows with respondent_id's who gve that response. Included a primary key index on each. That way, the search became:

select count(*) from table_b join table_a on (table_a.response_id = table_b.response_id);

Each reponse being given by only about 20% of respondents, I thought this would be lightening fast, since only two little tables have to be loaded and compared. This was again better, but even this remained at about 0.13 seconds.

So now I'm looking for even more imaginative strucutres or query strategies that may be faster.

Thanks for any help.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.