N-to-N Model: Table Size vs Complexity
I've been instructed to add basic social networking functionality into a website I'm working on. I am entirely familiar with the fundamentals of the N-to-N relational model for social networking:
CREATE TABLE `friends` (
`userID` MEDIUMINT NOT NULL ,
`friendID` MEDIUMINT NOT NULL ,
INDEX ( `userID` , `friendID` )
)
And that is of course linked to a users table. Very basic stuff. However I am a bit undecided on how to STORE the information as there are two different ways to do it:
Assume I wanted user 1 and user 2 to be friends. Storing information in the typical way (the way most sites do it), I would just store 1 record for the relationship (of both people) and use a UNION select to get all the friends:
+--------+----------+
| userID | friendID |
+--------+----------+
| 2 | 1 |
+--------+----------+
Select userName FROM users WHERE userID= ANY(
SELECT userID FROM friends
WHERE friendID = 1
UNION ALL
SELECT friendID from friends
WHERE userID=1)
This, of course, eliminates redundancy, allowing for a smaller table. It simplifies the storing of information, but complicates the retrieval of information as it forces me to write a query that reads from the table twice (although I am not 100% versed on the optimization intricacies of UNION). It also complicates several other aspects of my design, including but not limited to "friend categories".
The alternative method:
+--------+----------+
| userID | friendID |
+--------+----------+
| 2 | 1 |
| 1 | 2 |
+--------+----------+
Obviously this will increase table size, but will simplify retrieval queries as I will no longer have to UNION select (reading the table twice) and it allows me to extend the table if I need to (i.e. friend categories).
Weighing the pros/cons of both storage methods, I imagine many people have run type of dilemma before, so I'm just trying to gather some input before I finalize my schema.
Edited 1 time(s). Last edit at 09/02/2008 05:51PM by Adam Dougherty.