Re: Theoretical Database Table Structure
Lots of "attributes" with varying types of values (age ranges, binary values, semi-binary values (smoker/non-smoker/don't-care), vague things (like eye-colour), open ended things (hobbies), hierarchical things (Christian > Protestant > Baptist > Southern Baptist), etc) is a mess to scale.
I recommend...
1. Pick about 6 attributes that are likely to be present in a query. You will have to require that the users specify at least a few of the 6.
2. Build one table with those 6 attributes (of suitable types; indexed), plus user_id (PRIMARY KEY), plus a single column for the rest of the attributes (TEXT).
3. Have another table for user info (name, phone, etc) and user_id (PRIMARY KEY, AUTO_INCREMENT).
4. The attribute column should be JSON string (or XML or YAML).
The search goes something like this:
1: Use your front-end language (PHP, VB, whatever) to construct and run the query based on the available attributes (of the 6).
2: That query will return the JSON, which you 'decode' and use to further filter any other attributes given. You now have a list of user_ids.
3: Use the user_ids to fetch the user info from the other table.
Search table:
Indexed columns (perhaps) : sex, race, height, hair_colour, weight, drinker, smoker, age, education
JSON column: eye_colour, hobbies_and_interests, religion, style, education
User table:
user_id, username, first_name, last_name, email_address
The INDEXes should lean toward likely combinations...
INDEX(sex, race, weight)
INDEX(race, sex, height)
INDEX(smoker, sex, education)
Most people will be quite clear about their preference (insistence!) of sex. Many will be definite on race. Non-smokers and non-drinkers may have strong preferences, etc. Hence, these fields should be early in the indexes.
Keep in mind that
* MySQL will (usually) use only one index in a SELECT, and
* The early fields should (usually) be tested via '='.
Example:
WHERE sex = 'M' AND weight BETWEEN 100 AND 150
will make much better use of INDEX(sex, weight) than INDEX(weight, sex). Furthermore, INDEX(race, sex, weight) will be useless for that WHERE clause (because the first field, race, was not mentioned).