MySQL Forums
Forum List  »  Newbie

Efficient database design for auto-incrementing, grouped, sets of data?
Posted by: Charles Carver
Date: June 27, 2016 03:41PM

Basically, I have multiple hashes associated with an entity:

    hash1: Entity 1
    hash2: Entity 1
    ...
    hash6: Entity 2
    hash7: Entity 2

Consequently, I'm attempting to create a hashes table that looks something like this:

    hash (unique primary key), entity_id (auto increments with the *group*)

For example, if I inserted the above example information, I'd like it to be reflected in the hashes table like so:

    hash   | entity_id
    -------|------------
    hash1  | 1
    hash2  | 1
    hash6  | 2
    hash7  | 2
    hash8  | 2
    hash13 | 3
    hash14 | 3

From there, things get a bit more tricky. Say I want to add some new hashes, all for the same entity but I don't know if its in the tables or not. To decide that, I use the following logic:

[ol]
[li]If none of the new hashes exist in the hashes table, I want to create a new entity_id for that group and use that id for these hashes and in subsequent insert statements.[/li]
[li]If, however, any of the new hashes already exist in the hashes table, I want to merge all of the entity_id's into a new value in the hashes table and any other table that has rows referencing that id. Finally, I want to add any additional data using the new entity_id.[/li]
[/ol]

Here's a quick test case. Say I have these hashes to add for a single entity:

    hash1  | ?
    hash6  | ?
    hash17 | ?
    hash18 | ?

The result, after inserting them, should be:

    hash   | entity_id
    -------|------------
    hash13 | 3
    hash14 | 3
    hash2  | 4      -- entity_id modified 1->4, hash2 appeared in the new hashes
    hash6  | 4      -- entity_id modified 2->4, hash6 appeared in the new hashes
    hash1  | 4      -- entity_id modified, (due to hash2)
    hash7  | 4      -- entity_id modified, (due to hash6)
    hash8  | 4      -- entity_id modified, (due to hash6)
    hash17 | 4      -- added, new hash
    hash18 | 4      -- added, new hash

Does this design make any sense? Part of me thinks I'm going to have to do a portion of this with PHP but I am trying to make this as efficient/fast as possible (I will be handling a lot of data with whatever method is chosen). As of now, my workflow is something like this:

    *Select all entity_id`s where the hash equals any of the existing hashes
    *Select the largest entity_id from the `hash` table
    *If no entity_id's exist from step 1, insert appropriate rows with an entity_id of max(entity_id) + 1
    *If one or more entity_id exists, change all rows with those entity_id's to an entity_id of max(entity_id) + 1
    *Perform steps 3 and 4 on any other tables that reference the entity_id, writing most of the logic in PHP

As you can see, my method neglects MySQL almost entirely. I have a feeling I can use `ON DUPLICATE KEY` effectively here, and I'm wondering if there's a way to skip step 5 entirely (as in other tables automatically update to reference the changes, possibly through a trigger).

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.