MySQL Forums
Forum List  »  Newbie

Re: Efficient database design for auto-incrementing, grouped, sets of data?
Posted by: Charles Carver
Date: June 28, 2016 11:39AM

Quote

Peter Brawley Wrote:
-------------------------------------------------------
> I still don't understand what problem all this
> bookkeeping busywork is meant to solve.
>
> For some reason, this table stores multiple hashes
> per entity_id. You've not explained what that's
> for, and perhaps we don't need to know. But a
> simple query retrieves all hashes for any given
> entity_id value(s), and another simple query finds
> the entity_id corresponding to a given hash value,
> so, in your "workflow" list ...

I thought I did explain that, let me try explaining it better... Let's say my users have information for a given entity. In this example, let's assume each user has information regarding a business "AJAX Inc."

User1 has a business name, a phone number, and an address for AJAX. He wants to input it into the database, so a hash is created for all the unique combinations of data, e.g hash(name+number), hash(number+address), hash(name+address) (this is simplified for the example, I have a list of actually unique combinations). He goes and inputs this information in the database, and since no hashes exist, the group of hashes gets a unique id referencing this specific entity. User1 then goes and inputs additional information in separates tables referencing the id that was just created.

Now, User2 comes along with different information for the same AJAX Inc. He only has an email address and a phone number. So hashes are created once again, and no repeats are found. Consequently, these hashes are given a different unique id since there's no way to determine if they reference anything already saved. User2 also goes and inputs additional information in separate tables, referencing the id that was just created.

However, User3 shows up and has an email address, phone number, and business name for AJAX. Two of his hashes are hash(email+phone) and hash(name+number), both of which are already present in the database. In this case, they reference two separate entity ids. Since there's an overlap of unique information between them, however, they are merged. Similarly, the data in the other tables is now merged under the same entity id, since they were determined to be the same entity.

Quote

> > *Select all entity_id`s where the hash equals
> any of the existing hashes
>
> simple query

Yep.

Quote

>
> > *Select the largest entity_id from the `hash`
> table
>
> simple query

Yep.

Quote

>
> > *If no entity_id's exist from step 1, insert
> appropriate rows with an entity_id of
> max(entity_id) + 1
>
> what purpose would that serve?

The entity id doesn't have to be auto incrementing, I suppose. I just thought it would be cleaner.

Quote

>
> > *If one or more entity_id exists, change all
> rows with those entity_id's to an entity_id of
> max(entity_id) + 1
>
> what requirement is this meant to meet?

I think I explained this with my example.

Quote

>
> > *Perform steps 3 and 4 on any other tables that
> reference the entity_id, writing most of the logic
> in PHP
>
> Yikes. Other tables reference these hash values?
> Why are you editing them at all? Why aren't they
> just referenced by those tables as foreign keys or
> as simple lookup values?

Not quite: other tables reference the entity id, not the hash. Multiple hashes exist per entity id.

I understand my questions come across as trivial, or even simpleminded. I'm fairly new to MySQL and this is the most complex database I've designed so far, which is why I'm asking such questions in the "Newbie" forum.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.