Should I split data based on path names
Posted by: Victor Porton
Date: September 03, 2017 10:09AM

We hold files at a remote storage service (Google Storage).

I am going to store the list of all these files in a local MySQL DB. This is to query about filenames faster (without contacting Google's server).

Files are classified by specifying both a "service" (one of the root directories on the Google's server) and by Group ID (the direct subdirectory of a service).

Group ID determines grouping of files. For example we may need to remove all files sharing the same Group ID and the same service.

I have in mind two different ways of the database design. Please help to choose the best one:

1. One table which contains the service, file ID and the part of the file path which does not include the service.

2. Two tables: a. the table containing the Group ID and the corresponding subdirectory of a service; b. the table linked to the first table and also containing the ID and the partial path to the file (excluding the service and the part determined by Group ID).

Note that "2" does allow quick querying for a group of files sharing the same group ID because the filename field can be indexed (even UNIQUE-indexes) and thus searching by a prefix specifying the "group" part of the part is quick.

Please help me to evaluate pros and cons of "1" and "2" to decide which variant to choose.

One reason to choose "1" is that it probably requires less disk space and group path is not repeated for each file in the group (that is the set of files which share the same Group ID).

--
http://freesoft.portonvictor.org

Options: ReplyQuote


Subject
Written By
Posted
Should I split data based on path names
September 03, 2017 10:09AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.