Question
Hi,
It seems that SearchEnginePluceneAddOn doesn't like special characters (accented letters etc.):
Can't add out-of-order term geÃxtrapoleerde lt
+geÃ\203Ã\203Ã\202Ã\203Ã\203Ã\202Ã\202Ã\203Ã\203Ã\203Ã\202Ã\202Ã\203Ã\202Ã\202Ã\203Ã\203Ã\203Ã\202Ã\203Ã\203Ã\202Ã\202Ã\202Ã
+\203Ã\203Ã\202Ã\202Ã\203Ã\202Ã\202ërodeerd (text lt text) at /usr/lib/perl5/site_perl/5.8.5/Plucene/Index/SegmentMerger.pm
+line 154
Can't add out-of-order term oriÃ\203Ã\203Ã\202Ã\203Ã lt
+oriÃ\203Ã\203Ã\202Ã\203Ã\203Ã\202Ã\202Ã\203Ã\203Ã\203Ã\202Ã\202Ã\203Ã\202Ã\202Ã\203Ã\203Ã\203Ã\202Ã\203Ã\203Ã\202Ã\202Ã\202Ã
+\203Ã\203Ã\202Ã\202Ã\203Ã\202Ã\202ënterend (text lt text) at /usr/lib/perl5/site_perl/5.8.5/Plucene/Index/SegmentMerger.pm
+line 154
I'm pretty new to this; could someone point me in the right direction for where to look or what to install?
Thanks!
Environment
--
JosMaccabiani - 26 Aug 2005
Answer
I haven't used Plucene, but at first sight this looks like something is getting encoded to UTF-8 when it shouldn't be.
InternationalisationEnhancements has some possibly useful links in its Unicode section, but some TWikiDebugging would be necessary.
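To illustrate the suspicion above: the repeating `Ã\203Ã\202` runs in the error messages are consistent with ISO-8859-1 text being re-encoded as UTF-8 several times over. The following is only a minimal sketch in Python (not Plucene or TWiki code) that reproduces the pattern; the helper names `misencode` and `repair` are made up for this example, and it assumes the original text was ISO-8859-1.

```python
def misencode(text: str, rounds: int) -> str:
    """Encode to UTF-8, then wrongly read the bytes back as ISO-8859-1.

    Each round roughly doubles the number of bytes used for an accented letter.
    """
    for _ in range(rounds):
        text = text.encode("utf-8").decode("iso-8859-1")
    return text


def repair(text: str, max_rounds: int = 10) -> str:
    """Heuristically reverse the damage: keep undoing one round until it fails.

    Assumption: the text really was mis-encoded this way. Genuine ISO-8859-1
    text whose bytes happen to form valid UTF-8 would be altered too.
    """
    for _ in range(max_rounds):
        try:
            fixed = text.encode("iso-8859-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # no further round can be undone
        if fixed == text:
            break  # pure ASCII, nothing to repair
        text = fixed
    return text


word = "geërodeerd"
for rounds in (1, 2, 3):
    damaged = misencode(word, rounds)
    # Show the raw bytes, similar to the escaped output in the error message.
    print(rounds, damaged.encode("iso-8859-1"))

print(repair(misencode(word, 3)))
```

If the index contains terms like these, the place to look is wherever text is converted before it reaches the indexer; the sketch only shows the symptom, not where in the TWiki/Plucene chain it happens.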
--
RichardDonkin - 17 Sep 2005
I have no problems indexing and searching Spanish/Catalan topics and attachments (which contain accents and other characters such as à, é, ï) on a Linux box with the locale set to en_US.ISO-8859-1, running the latest releases of both major TWiki versions (Cairo and Dakar).
The only pending issue is that special characters in the comment field of attachments are not displayed properly. The Plucene documentation is not clear enough about field encoding.
--
JoanMVigo - 14 Mar 2006