utf8_croatian_ci - how to make a Croatian utf8 collation?
Posted by: Neven J
Date: January 17, 2008 07:49AM

Hi everybody!

We are using MySQL 5.0 in production and we are located in Croatia. A few months ago we had to switch to utf8. We switched from latin2_croatian_ci to utf8_general_ci, the general utf8 charset and collation.

Of course, ORDER BY now returns nonsense, since MySQL sorts all our national letters with the same weight as their base letters (for example, "c", "č" and "ć" compare as equal).

Many months have passed, but a solution is still unavailable. The closest to our collation is utf8_slovenian_ci, but:
1. Slovenian treats "lj", "nj" and "dž" as two letters each, whereas in Croatian they are single letters
2. Slovenian doesn't support the letter "ć"
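
To make the two points above concrete, here is a minimal Python sketch of the ordering we need. It is not MySQL code, just an illustration. The name croatian_key is made up for this example, and it assumes every "lj", "nj" and "dž" sequence is a single letter and that the text is in precomposed (NFC) form.

```python
# Croatian alphabet in dictionary order; "dž", "lj" and "nj" are single letters.
CROATIAN_ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f", "g", "h", "i", "j",
    "k", "l", "lj", "m", "n", "nj", "o", "p", "r", "s", "š", "t", "u", "v",
    "z", "ž",
]
RANK = {letter: i for i, letter in enumerate(CROATIAN_ALPHABET)}
DIGRAPHS = ("dž", "lj", "nj")


def croatian_key(word):
    """Return a tuple of letter ranks usable as a case-insensitive sort key."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            # Assumption: a digraph sequence always counts as one letter.
            key.append(RANK[pair])
            i += 2
        else:
            ch = text[i]
            # Characters outside the alphabet sort after all Croatian letters.
            key.append(RANK.get(ch, len(CROATIAN_ALPHABET) + ord(ch)))
            i += 1
    return tuple(key)


words = ["ljubav", "luk", "lopta", "ćup", "čaša", "cvijet", "džep", "dom", "đak"]
print(sorted(words, key=croatian_key))
```

With this key "ljubav" sorts after "luk" and "džep" after "dom", which is what a utf8_croatian_ci collation would have to reproduce.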

... so order by still returns badly sorted values.

We could do some wizardry by converting only the single field we order by into latin2_croatian_ci and keeping the rest of the DB in utf8_general_ci. The documentation on CONVERT() is pretty slim (http://dev.mysql.com/doc/refman/5.0/en/charset-convert.html). Another, even worse, piece of wizardry is adding a new field in which we build our own pseudo-language and replace our letters in the following way:
c = c
ć = cx
č = cxx
...
This is very bad in the long run.
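
For illustration, a rough Python sketch of how the value for such a helper field could be generated. It is a variant of the idea, not the exact mapping above: each letter becomes a fixed-width two-digit code, because with variable-length suffixes "cz" would sort after "cx...", i.e. after words starting with "ć". The function name sort_column_value is hypothetical, the letter order assumed is the standard Croatian alphabet (c, č, ć, ...), and collapsing every non-letter to "00" is a simplification.

```python
# Croatian alphabet in dictionary order; digraphs count as single letters.
ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f", "g", "h", "i", "j",
    "k", "l", "lj", "m", "n", "nj", "o", "p", "r", "s", "š", "t", "u", "v",
    "z", "ž",
]
# Two-digit code per letter: "01" for a, "02" for b, ... "30" for ž.
CODE = {letter: "%02d" % (i + 1) for i, letter in enumerate(ALPHABET)}


def sort_column_value(text):
    """Encode text as a digit string that sorts correctly under plain ASCII order."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in CODE:
            # "dž", "lj" or "nj": one code for the whole digraph.
            out.append(CODE[pair])
            i += 2
        else:
            # Simplification: anything outside the alphabet becomes "00".
            out.append(CODE.get(text[i], "00"))
            i += 1
    return "".join(out)


for name in ["cz", "ća", "čaša", "luk", "ljubav"]:
    print(name, sort_column_value(name))
```

The helper field would then be ordered with any byte-wise or ASCII collation, but it has to be kept in sync with the original field on every insert and update.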

The question is... How can we make our own collation for the Croatian language?

What to do? Where to go? :)

Thank you all in advance!

---
Update: changing the source and recompiling MySQL isn't very appealing. :) Instead I've found this: /usr/share/mysql/charsets/Index.xml, and an example of an experimental Vietnamese collation. Am I on the right track?

Best regards
Neven

---
seven | the witchdoctor
http://www.nivas.hr - uploading 24/7!



Edited 2 time(s). Last edit at 01/17/2008 02:09PM by Neven J.
