Issues with Fulltext UTF search with a parser for word boundaries
Hi,
I have a table with UTF-8 columns that stores Japanese text and has a fulltext index. I segmented the Japanese with the parser Sen and inserted a space (" ") between each word. Some example data:
+--------+------------------------------------------------+-----------------------+----------+
| lineno | jseg | eseg | id |
+--------+------------------------------------------------+-----------------------+----------+
| 0 | これ は テスト です | this is a test | 5 |
| 1 | テスト データ の 二 行 目 です 。 | test data second line | 5 |
| 2 | 三 行 目 です 。 | third line | 5 |
+--------+------------------------------------------------+-----------------------+----------+
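For reference, the preprocessing step described above amounts to roughly the following. This is only a sketch: the token list is hard-coded to stand in for Sen's output, and `join_tokens` is a name made up for illustration, not part of Sen.

```python
def join_tokens(tokens):
    """Join already-segmented words with a single ASCII space (U+0020)."""
    # Skip empty strings so the result never contains double spaces.
    return " ".join(t for t in tokens if t)


# Hypothetical tokenizer output for the first example row (lineno 0).
tokens = ["これ", "は", "テスト", "です"]
jseg = join_tokens(tokens)
print(jseg)
```

The resulting string is what gets stored in the jseg column.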
Running this query:
SELECT eseg FROM segments WHERE MATCH (jseg) AGAINST ('データ');
returns:
Empty set (0.00 sec)
Is this the expected behaviour?
I'm inserting a space character as the word boundary; maybe I should use something else?
I realise Senna is available, but would like to know why this isn't working first :)
Edited 1 time(s). Last edit at 04/18/2008 10:49PM by xephyris tahara.
Posted: April 18, 2008 10:47PM