Re: Best collation to use for ancient languages
Posted by: Rick James
Date: May 25, 2011 08:03PM

The character set would be utf8.
The collation may as well be utf8_general_ci; utf8_unicode_ci is the more thorough Unicode collation, but it probably orders these scripts no differently.

Modern Hebrew and Greek are certainly supported by utf8 in MySQL. Aramaic has an issue. The script itself is defined in Unicode:
http://www.fileformat.info/info/unicode/block/imperial_aramaic/list.htm
But its characters encode as 4-byte sequences in UTF-8. MySQL's utf8 character set (the only choice through 5.1) handles sequences of at most 3 bytes.
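
The byte-length difference is easy to check outside MySQL. Here is a small Python sketch, not anything MySQL-specific: the helper names `utf8_len` and `fits_3_byte_utf8` and the three sample characters are my own choices for illustration.

```python
# One sample character per script, chosen for illustration only.
SAMPLES = {
    "Greek alpha": "\u03b1",
    "Hebrew alef": "\u05d0",
    "Imperial Aramaic aleph": "\U00010840",
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes needed to encode ch in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_3_byte_utf8(text: str) -> bool:
    """True if every character is in the Basic Multilingual Plane,
    i.e. needs at most 3 bytes in UTF-8."""
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"{name}: U+{ord(ch):04X} needs {utf8_len(ch)} bytes, "
              f"fits 3-byte utf8: {fits_3_byte_utf8(ch)}")
```

A check like `fits_3_byte_utf8` could be run over the text before loading it, to find out whether a 3-byte utf8 column would be enough.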

To get the 4-byte utf8 characters (the utf8mb4 character set, added in MySQL 5.5.3), see my document
http://mysql.rjweb.org/doc.php/charcoll
and search for the "4-byte utf8" section.

Declaring a column as utf8 or utf8mb4 gives you validation of the incoming bytes and some level of collation-based ordering. Alternatively, VARBINARY and BLOB accept any bytes but do no checking, and ordering is by raw byte value.
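
To make the trade-off concrete, here is a rough Python analogy. It is only a sketch: Python's UTF-8 decoder stands in for the checking a utf8 column would do, and sorting on the encoded bytes stands in for binary ordering. The function names `is_valid_utf8` and `binary_sorted` are hypothetical, and none of this reproduces MySQL's exact behavior.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def binary_sorted(strings):
    """Sort strings by their raw UTF-8 bytes, with no case or accent folding."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


if __name__ == "__main__":
    print(is_valid_utf8("\u05d0".encode("utf-8")))  # well-formed input
    print(is_valid_utf8(b"\xff\xfe"))               # not well-formed UTF-8
    print(binary_sorted(["b", "B", "a"]))           # uppercase sorts before lowercase
```

A binary column would store the malformed bytes without complaint, and its ordering puts "B" ahead of "a", which is rarely what a reader expects.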
