Re: Per field encoding
Posted by: Filipe Silva
Date: May 03, 2024 06:46AM

You are correct, MySQL Connector/J encodes values according to the connection encoding and not per table/column (query placeholder, actually) encoding. This is generally fine, since databases tend to be uniform in terms of encoding definitions and, if not, the server handles session-to-column encoding conversions properly.

However, large data types (BLOB, CLOB/TEXT, ...) are handled differently and may be affected by such encoding differences. This is why Connector/J supports the connection option `clobCharacterEncoding`, which you can use to override the session encoding for this particular case.

I'm guessing that in your case you are setting the connection option `characterEncoding=latin1`, which is causing `setCharacterStream()` to encode as latin1 while it should be utf8. Also setting `clobCharacterEncoding=utf8` may work for you. Of course, this only works as long as you don't mix multiple encodings in the same query.
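Something along these lines might be a starting point. This is only a rough sketch: the class name, the `docs` table, its `body` column and the command-line arguments are all made up for illustration, and I'm assuming your setup matches my guess above (`characterEncoding=latin1` with a utf8 TEXT column). Only `characterEncoding`, `clobCharacterEncoding` and `setCharacterStream()` come from the discussion itself.

```java
import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Properties;

public class ClobEncodingExample {

    // Builds the connection properties: latin1 for the session,
    // utf8 for character streams sent to CLOB/TEXT placeholders.
    static Properties connectionProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("characterEncoding", "latin1");
        props.setProperty("clobCharacterEncoding", "utf8");
        return props;
    }

    // Hypothetical table "docs" with a utf8 TEXT column "body".
    static void insertText(Connection conn, String text) throws SQLException {
        String sql = "INSERT INTO docs (body) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            // The character stream is the case that clobCharacterEncoding targets.
            ps.setCharacterStream(1, new StringReader(text), text.length());
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            // No database details given: just show the options that would be used.
            Properties props = connectionProps("user", "password");
            System.out.println("characterEncoding=" + props.getProperty("characterEncoding"));
            System.out.println("clobCharacterEncoding=" + props.getProperty("clobCharacterEncoding"));
            return;
        }
        // args: JDBC URL (e.g. jdbc:mysql://host:3306/db), user, password
        try (Connection conn = DriverManager.getConnection(args[0], connectionProps(args[1], args[2]))) {
            insertText(conn, "S\u00e3o Paulo, \u6771\u4eac");
        }
    }
}
```

The same two options can be appended to the JDBC URL as query parameters instead of being passed through `Properties`, if that suits your configuration better.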

Mind that Connector/J does not know beforehand the target encoding of each one of the placeholders you specify in the SQL command. That information could only be available when you use server-side prepared statements. If the driver, in that case, took the target encoding into account, then we would see different behavior when using other kinds of statements for the same query, and that is something to avoid at all costs.

IHTH

