Search in a tagging system
I have the following working, but it's slow. It takes several minutes for a query to run.
**--Process Description--**
There are three tables:
-images (35,000 rows)
-images_to_tags (192,000 rows)
-tags (4,500 rows)
There is a many-to-many relationship between images to tags, with images_to_tags being the join table.
Users can type in a simple query like this:
"funny cool -gross +cats"
That would give the user images that are funny or cool that aren't gross, and definitely includes cats.
I have the parsing and all that figured out so that I get three arrays:
ors, ands, nots
These are used to create the queries to the database
Here's what I do:
1. Create four temporary tables: 'ors', 'ands', 'nots' & 'results' each is only a single column: image_id int.
2. Insert image ids into the temporary tables like this:
INSERT INTO ors(image_id)
SELECT image_id
FROM images_to_tags INNER JOIN tags
ON (images_to_tags.tag_id = tags.id)
WHERE tags.name="funny" OR tags.name="cool";
INSERT INTO ands(image_id)
SELECT image_id
FROM images_to_tags INNER JOIN tags
ON (images_to_tags.tag_id = tags.id)
WHERE tags.name = "cats";
INSERT INTO nots(image_id)
SELECT image_id
FROM images_to_tags INNER JOIN tags
ON (images_to_tags.tag_id = tags.id)
WHERE tags.name = "gross";
3. Remove any images that are not tagged with "cats", put the rest into the 'results' temp table.
INSERT into results(image_id)
SELECT DISTINCT ors.image_id
FROM ors INNER JOIN ands
ON (ors.image_id=ands.image_id);
4. Remove images that have been tagged with "gross"
DELETE
FROM results
WHERE graph_id IN
(SELECT graph_id from nots);
**--Problem Description--**
Step 4 takes a long time.
In my current test case, 'results' will have 7,000 images and 'nots' will have 12,000. That portion of the query can take several minutes to run. Is there any way to speed this up? Is there a better way to do what I'm doing than creating these temporary tables and working on them?
I've also tried this:
DELETE results
FROM results, nots
WHERE results.image_id = nots.image_id;
Same problem. Takes forever.
Subject
Written By
Posted
Search in a tagging system
February 15, 2007 04:14PM
November 14, 2007 02:09AM
November 14, 2007 05:17PM
Sorry, you can't reply to this topic. It has been closed.
Content reproduced on this site is the property of the respective copyright holders.
It is not reviewed in advance by Oracle and does not necessarily represent the opinion
of Oracle or any other party.