Comparing accented characters with their closest English equivalents
Posted by: Karl Dane
Date: March 03, 2006 10:36AM

I have a large number of articles stored in longtext columns with the utf8_general_ci collation, and I'm running full-text searches against them. The articles contain almost the full range of Latin diacritics (French, Danish, German, etc.). What I want is to perform a full-text search using non-accented characters and have them match their accented equivalents. For example, searching for 'cafe' should find any articles containing 'café'.

It seems to me that some combination of COLLATE, CAST or CONVERT might do the trick, but I can't find one that works. Or would I be better off storing my data using something other than utf8_general_ci?

Failing that, is it possible to keep my data as utf8, but create a full-text index with all accented characters flattened to their closest English equivalent?
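
To illustrate what I mean by "flattened", here is a minimal sketch in Python of the kind of folding I have in mind, done in application code rather than in MySQL. It assumes the flattened text would be kept in a separate search column alongside the original utf8 data. The names flatten_accents and EXTRA_FOLDS are made up for this example, and the extra mapping table is only my guess at sensible English equivalents for letters that Unicode does not decompose (such as the Danish ø and æ and the German ß).

```python
import unicodedata

# Hypothetical mapping for letters with no Unicode decomposition.
# The chosen equivalents are an assumption, not a standard.
EXTRA_FOLDS = {
    "ø": "o", "Ø": "O",
    "æ": "ae", "Æ": "AE",
    "ß": "ss",
}


def flatten_accents(text: str) -> str:
    """Return text with diacritics removed, e.g. 'café' becomes 'cafe'."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold the remaining special letters using the assumed mapping above.
    return "".join(EXTRA_FOLDS.get(ch, ch) for ch in stripped)


if __name__ == "__main__":
    for word in ("café", "Århus", "Straße", "smørrebrød"):
        print(word, "->", flatten_accents(word))
```

The same function would have to be applied to the search terms as well, so that both sides of the comparison are flattened in the same way.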

All help is gratefully received!

Thanks,

Karl Dane



Edited 1 time(s). Last edit at 03/03/2006 04:40PM by Karl Dane.
