Creating Index for Many to Many Junction Table
Posted by: Raphael Thomas
Date: February 21, 2008 04:05PM

I'm trying to associate 'keywords' to 'products'. Obviously a keyword can belong to more than one product and vice versa. This is a 3rd party database and originally they stored all keywords for each product in its row in a single column (`all_keywords`) with delimiters for each keyword.

CREATE TABLE `catalog_products` (
`PUBLISHR` char(3) NOT NULL default '',
`NUMBER` varchar(50) NOT NULL default '',
`PROPER_TITLE` varchar(255) default NULL,
`ALL_KEYWORDS` text,
`CAT_SUBJECT` varchar(255) default NULL,
PRIMARY KEY (`publisher`,`number`),
CONSTRAINT `FK_catalog_products` FOREIGN KEY (`PUBLISHER`) REFERENCES `catalog_publisher` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 MAX_ROWS=250000 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC


I wanted to normalise this so I used PHP to parse the all the unique keywords and place them in their own table. Which enabled me to create this.

CREATE TABLE `catalog_keyword` (
`keyword` varchar(250) collate utf8_unicode_ci NOT NULL,
PRIMARY KEY (`keyword`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

I used the same script to associate the correct keywords to the correct products in a junction table whose schema is as follows.

CREATE TABLE `catalog_product_keyword` (
`publisher` char(3) NOT NULL,
`number` varchar(50) NOT NULL default '',
`keyword` varchar(100) NOT NULL default '',
`status` char(1) NOT NULL default 'A',
PRIMARY KEY (`publisher`,`number`,`keyword`),
CONSTRAINT `FK_products_PK` FOREIGN KEY (`publisher`, `number`)
REFERENCES `catalog_products` (`publisher`, `number`)
ON DELETE CASCADE
ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=FIXED

The problem is that when I perform a search, I only know the KEYWORD on which I'm searching on. Consequently, my query will not be able to take advantage of using the index since I'm not providing it a PUBLISHER and NUMBER which make up the rest of the composite primary key for this table.

For example:

SELECT *
FROM `catalog_product_keywords_assoc`
WHERE `keyword` = 'boat';

The above query will not be able to use the index and runs very slow (10+ seconds) but returns the correct results.

SELECT *
FROM `catalog_product_keywords_assoc`
WHERE `keyword` = 'boat'
AND `publisher` = 'TFA'
AND `number` = 'abc123';

The above query here will return instantaneous however will only return one specific result.

How should I change my design around to handle searching? This is still in the developmental stage and not live anywhere so I'd be willing for a full overhaul.

Note: There are approxiamately:
160,000 products
50,000 keywords
2.4 million keyword-product associations



Edited 1 time(s). Last edit at 02/21/2008 10:20PM by Raphael Thomas.

Options: ReplyQuote


Subject
Written By
Posted
Creating Index for Many to Many Junction Table
February 21, 2008 04:05PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.