Re: Chinese sorting and filtering on a table with charset UTF-8
Posted by: Alexander Barkov
Date: April 14, 2011 12:15PM

MySQL-5.6.2, which was released a couple of days ago,
includes some collation customization improvements.
You can create a Chinese collation using 5.6.2.

Moreover, the file mysql-test/std_data/Index.xml in the 5.6.2 sources
has an example of Chinese pinyin order:

<collation name="utf8_5624_1" id="354">
<rules>
...
<!-- long tailoring: pinyin order from CLDR's zh.xml -->
<reset><last_non_ignorable/></reset>
<pc>ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏㄐㄑㄒㄓㄔㄕㄖㄗㄘㄙㄚㄛㄜㄝㄞㄟㄠㄡㄢㄣㄤㄥ>... all 20000+ characters here...座袏做葄蓙飵糳咗</pc>
</rules>
</collation>


So you can just copy its definition and paste it into your own Index.xml file
with very minor changes:

- Remove the non-Chinese part from the example collation
- Make sure to change the collation id to any vacant number
within the range 1-256 (InnoDB currently does not support 2-byte collation IDs).
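
After those two changes, the entry might look roughly like the sketch below. This is only an illustration, assuming the collation is placed inside the utf8 charset element of your Index.xml. The name utf8_pinyin_ci and the id 250 are hypothetical placeholders, so pick a name and a vacant id that fit your server. The character list is elided here and has to be copied in full from the example in the 5.6.2 sources.

```xml
<charset name="utf8">
  <!-- Hypothetical name and id: choose your own name and a vacant 1-byte id -->
  <collation name="utf8_pinyin_ci" id="250">
    <rules>
      <!-- Only the Chinese tailoring is kept; the non-Chinese rules are removed -->
      <reset><last_non_ignorable/></reset>
      <pc>ㄅㄆㄇㄈ... the full pinyin-ordered character list from the example goes here ...飵糳咗</pc>
    </rules>
  </collation>
</charset>
```

The server reads Index.xml at startup, so a restart is needed before the new collation shows up in SHOW COLLATION.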



Note that earlier MySQL versions will not work, because
the supported tailoring size was limited to only a few kilobytes,
which is not enough for Chinese.
