MySQL Forums

Issues with Fulltext UTF search with a parser for word boundaries
Posted by: xephyris tahara
Date: April 18, 2008 10:47PM

Hi,
I have a table with UTF8 columns that stores Japanese data, with a fulltext index on the Japanese column. I segmented the Japanese text with the Sen parser and inserted a space (" ") between each word. Some example data:

+--------+-----------------------------------+-----------------------+----+
| lineno | jseg                              | eseg                  | id |
+--------+-----------------------------------+-----------------------+----+
|      0 | これ は テスト です               | this is a test        |  5 |
|      1 | テスト データ の 二 行 目 です 。 | test data second line |  5 |
|      2 | 三 行 目 です 。                  | third line            |  5 |
+--------+-----------------------------------+-----------------------+----+
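
For anyone who wants to reproduce this, here is a minimal sketch of the setup. The column types, the MyISAM engine, the utf8 default charset and the index name ft_jseg are assumptions for illustration only; just the table name, the column names and the row values are taken from the data above.

```sql
-- Make sure the client connection sends and receives UTF-8.
SET NAMES utf8;

-- Hypothetical schema: types, engine and index name are assumptions.
CREATE TABLE segments (
    lineno INT NOT NULL,
    jseg   TEXT,              -- Japanese text, pre-segmented with spaces
    eseg   TEXT,              -- English text
    id     INT NOT NULL,
    FULLTEXT INDEX ft_jseg (jseg)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- The three example rows shown in the table above.
INSERT INTO segments (lineno, jseg, eseg, id) VALUES
    (0, 'これ は テスト です', 'this is a test', 5),
    (1, 'テスト データ の 二 行 目 です 。', 'test data second line', 5),
    (2, '三 行 目 です 。', 'third line', 5);
```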

Running this query:

SELECT eseg FROM segments WHERE MATCH (jseg) AGAINST ('データ');

returns:
Empty set (0.00 sec)

Is this the expected behaviour?

I'm inserting a space character as the word boundary; should I use something else?
I realise Senna is available, but I would first like to know why this isn't working :)



Edited 1 time(s). Last edit at 04/18/2008 10:49PM by xephyris tahara.
