UTF8 collation for Maltese characters is broken
Posted by: Matthew Caruana Galizia
Date: November 19, 2007 08:26PM

A full-text search for 'zejt' or 'żejt' returns the result 'Ħobż biż-żejt', and so does a search for 'ħobz', but a search for 'hobz' returns nothing.

I have taken a look at the collation document here: http://myoffice.izhnet.ru/bar/~bar/charts/utf8_general_ci.html

Bizarrely, the chart treats z and ż as equal (z==ż) but Ħ and h as different (Ħ!=h). In Maltese, h and ħ (lowercase Ħ) are two distinct letters, but then so are z and ż, so the collation is inconsistent either way. Who draws up the collation rules, and are they hard-coded into MySQL?
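
One possible explanation for the asymmetry, offered here as an assumption rather than anything confirmed in this thread: in the Unicode character data, ż has a canonical decomposition (z plus a combining dot above), while ħ has none, so a mapping built by stripping combining marks would fold ż to z but leave ħ alone. The Python sketch below only illustrates that Unicode property; it says nothing about how MySQL actually builds utf8_general_ci, and `strip_marks` is a hypothetical helper name.

```python
import unicodedata


def strip_marks(text):
    """Decompose text to NFD and drop combining marks (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Show the code points each letter decomposes into, and the stripped form.
for ch in "żħ":
    parts = [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]
    print(ch, parts, strip_marks(ch))

# Apply the same folding to the lowercased stored phrase from the post.
print(strip_marks("Ħobż biż-żejt".lower()))
```

Under this folding the stored phrase would match 'zejt' and 'ħobz' but not 'hobz', which is the same pattern as the search results described above.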

I am running the latest 5.0 release, and the search is being performed on a MyISAM table with utf8_general_ci collation.

SHOW VARIABLES LIKE 'char%' gives everything except filesystem as utf8, and SHOW VARIABLES LIKE '%coll%' gives everything as utf8_general_ci.

