MySQL :: UTF-8 vs UCS-2 (especially on ndb)


UTF-8 vs UCS-2 (especially on ndb)

Posted by: Mirko Raner
Date: April 10, 2012 06:53PM

I'm in the process of migrating an InnoDB database to ndb, and we're frequently running into the 14000-byte row size limit imposed by ndb. There are quite a number of VARCHAR columns in our DB, and we're using the utf8 character set.
My question is: what (if any) is the benefit of using UTF-8 over UCS-2?
Since VARCHAR needs to allocate memory for the worst-case scenario (i.e., the maximum length), a VARCHAR in UTF-8 requires 3 bytes per character of declared length, whereas UCS-2 requires only 2 bytes per character. UTF-8 is optimized for scenarios with mainly one- and two-byte characters, but if the storage mechanism has to assume three bytes anyway, UCS-2 seems to be the better choice.
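To make the arithmetic concrete, here is a rough Python sketch of the worst-case calculation described above. The column list is hypothetical, the per-character maxima (3 bytes for utf8, 2 bytes for ucs2) are the ones assumed in this post, and the calculation ignores length prefixes and any other per-row overhead, so treat the totals as estimates rather than what ndb actually allocates. Python has no codec named ucs2, so utf-16-be stands in for it here; the two agree for characters in the Basic Multilingual Plane.

```python
# Worst-case bytes per character, as assumed in the post above.
MAX_BYTES_PER_CHAR = {"utf8": 3, "ucs2": 2}

# Row size limit quoted in the post, in bytes.
NDB_ROW_LIMIT = 14000


def worst_case_bytes(varchar_lengths, charset):
    """Sum declared VARCHAR lengths times the charset's maximum bytes per character.

    Simplification: ignores length prefixes and other row overhead.
    """
    return sum(n * MAX_BYTES_PER_CHAR[charset] for n in varchar_lengths)


# Hypothetical table: ten VARCHAR(255) columns and three VARCHAR(1000) columns.
columns = [255] * 10 + [1000] * 3

for charset in ("utf8", "ucs2"):
    total = worst_case_bytes(columns, charset)
    verdict = "fits" if total <= NDB_ROW_LIMIT else "exceeds the limit"
    print(f"{charset}: {total} bytes worst case, {verdict}")

# Actual encoded size of a short ASCII string, for comparison.
sample = "hello"
utf8_len = len(sample.encode("utf-8"))      # one byte per ASCII character
ucs2_len = len(sample.encode("utf-16-be"))  # two bytes per character
print(f"'{sample}': {utf8_len} bytes in UTF-8, {ucs2_len} bytes in UCS-2")
```

With these made-up columns the utf8 worst case comes to 16650 bytes and the ucs2 worst case to 11100 bytes, so only the latter stays under the 14000-byte limit. The last lines show the other side of the trade-off: for mostly ASCII data the bytes actually encoded are fewer in UTF-8, which only helps if the storage engine does not reserve the worst case up front.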
Am I overlooking something here? It seems like using UTF-8 for VARCHARs is a waste of space (especially problematic for MySQL Cluster with the smaller row memory limit of 14000 bytes).
Any insights?

