MySQL :: Issues with Fulltext UTF search with a parser for word boundaries


Posted by: xephyris tahara
Date: April 18, 2008 10:47PM

Hi,
I have a table with UTF-8 columns storing some Japanese data, with a fulltext index. I segmented the Japanese with the Sen parser and inserted a space (" ") between each word. Some example data:

+--------+----------------------------+-----------------------+----+
| lineno | jseg                       | eseg                  | id |
+--------+----------------------------+-----------------------+----+
| 0      | これはテストです           | this is a test        | 5  |
| 1      | テストデータの二行目です。 | test data second line | 5  |
| 2      | 三行目です。               | third line            | 5  |
+--------+----------------------------+-----------------------+----+
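
For anyone wanting to reproduce this, a minimal sketch of the setup might look like the following. The table and column names come from the post; the column types, the MyISAM engine, and the index name `ft_jseg` are assumptions, and the `jseg` values are copied exactly as they appear in the listing above.

```sql
-- Hypothetical schema: types, engine, and index name are assumptions,
-- not taken from the original table definition.
CREATE TABLE segments (
    lineno INT NOT NULL,
    jseg   TEXT,
    eseg   TEXT,
    id     INT NOT NULL,
    FULLTEXT KEY ft_jseg (jseg)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- Sample rows, copied verbatim from the listing above.
INSERT INTO segments (lineno, jseg, eseg, id) VALUES
    (0, 'これはテストです', 'this is a test', 5),
    (1, 'テストデータの二行目です。', 'test data second line', 5),
    (2, '三行目です。', 'third line', 5);
```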

Running this query:

SELECT eseg FROM segments WHERE MATCH (jseg) AGAINST ('データ')

returns:
Empty set (0.00 sec)

Is this the expected behaviour?

I'm inserting a space character as the word boundary; maybe I should use something else?
I realise Senna is available, but I'd like to know first why this isn't working :)
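
A few checks that may help narrow this down, offered as a sketch rather than a diagnosis. They assume a MyISAM fulltext index, where terms shorter than `ft_min_word_len` (default 4) are not indexed; 'データ' is three characters long, so it may fall below that limit. Whether that is the actual cause here is an assumption.

```sql
-- 1. Check the minimum indexed word length (default is 4 for MyISAM).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- 2. Confirm the separating spaces were really stored:
--    a space shows up as byte 20 in the hex dump.
SELECT lineno, HEX(jseg) FROM segments;

-- 3. Confirm the term is present at all, bypassing the fulltext index.
SELECT eseg FROM segments WHERE jseg LIKE '%データ%';

-- 4. Retry in boolean mode, which does not apply the 50% threshold
--    used by natural-language searches.
SELECT eseg FROM segments
WHERE MATCH (jseg) AGAINST ('データ' IN BOOLEAN MODE);

-- 5. If ft_min_word_len is lowered in the server configuration and the
--    server restarted, the fulltext index has to be rebuilt.
REPAIR TABLE segments QUICK;
```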

Edited 1 time(s). Last edit at 04/18/2008 10:49PM by xephyris tahara.


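+-----------------------------------------------------------------------+-------+-----------------+------------------------+
| Subject                                                               | Views | Written By      | Posted                 |
+-----------------------------------------------------------------------+-------+-----------------+------------------------+
| Issues with Fulltext UTF search with a parser for word boundaries     | 5098  | xephyris tahara | April 18, 2008 10:47PM |
| Re: Issues with Fulltext UTF search with a parser for word boundaries | 3179  | Faury Rodriguez | April 21, 2008 03:00PM |
| Re: Issues with Fulltext UTF search with a parser for word boundaries | 2984  | xephyris tahara | April 21, 2008 07:21PM |
+-----------------------------------------------------------------------+-------+-----------------+------------------------+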

