Question
Our site is working fine with mixed language characters in topic content. However, if a user makes, for example, a Korean topic name, that topic's Korean content is corrupted.
You can see this in our sandbox:
https://gopedia.gopetslive.com/twiki/bin/view/Sandbox/WebHome
Relevant settings from TWiki.cfg:
$useLocale = 1;
$siteLocale = "ko_KR.utf8";
$siteCharsetOverride = "";
$localeRegexes = 1;
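As a rough illustration of how these settings interact, the sketch below assumes the site character set is taken from the codeset part of the locale string (the text after the dot) unless an override is supplied. The helper name `charset_from_locale` is hypothetical and is not part of TWiki; it only shows the parsing idea, in Python rather than Perl.

```python
import codecs


def charset_from_locale(site_locale, charset_override=""):
    """Hypothetical helper: pick a charset name from a POSIX-style locale string."""
    if charset_override:
        name = charset_override
    else:
        # Drop any "@modifier", then take the codeset after the dot, if present.
        base = site_locale.split("@", 1)[0]
        if "." not in base:
            return None  # e.g. "C" or "ko_KR" carry no codeset
        name = base.split(".", 1)[1]
    # Normalise through Python's codec registry; unknown names raise LookupError.
    return codecs.lookup(name).name


print(charset_from_locale("ko_KR.utf8"))
```

With the settings quoted above (empty override), this sketch would treat the site as UTF-8.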
--
TWikiGuest - 06 Jan 2006
Answer
It looks like the page contents are having the problem, but the page name is OK with Firefox 1.5 and its built-in fonts (i.e. this page). This is odd, as usually it's the URLs that have the problem and the page contents that are fine. Also, the https://gopedia.gopetslive.com/twiki/bin/view/Sandbox/WebHome page was fine for embedding those characters in UTF-8 as part of the URL, so it seems that only that one page has the problem.
You don't seem to have set the CHARSET parameter in TWikiPreferences at all, but see the TWikiInstallationGuide section on I18N troubleshooting.
It's interesting that this URL using XML entity codes, i.e. https://gopedia.gopetslive.com/twiki/bin/view/Sandbox/고피디아, seems to work - I wouldn't have thought Apache would accept that sort of URL, but somehow it is working. Do you have any additional Apache modules for I18N, e.g. mod_fileiri as mentioned in EncodeURLsWithUTF8?
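To make the two URL forms concrete, here is a minimal Python sketch (not TWiki code) that writes the topic name from the URL above both as percent-encoded UTF-8 bytes and as numeric character references. The variable names are illustrative only.

```python
import html
from urllib.parse import quote, unquote

topic = "고피디아"

# Form 1: UTF-8 bytes, percent-encoded (three bytes per Hangul syllable).
percent_encoded = quote(topic, safe="")

# Form 2: numeric character references, one per code point.
entity_encoded = "".join(f"&#{ord(ch)};" for ch in topic)

print(percent_encoded)
print(entity_encoded)

# Both forms decode back to the same topic name.
print(unquote(percent_encoded) == topic)
print(html.unescape(entity_encoded) == topic)
```

The first form is plain ASCII on the wire, which is why it normally passes through a web server without special modules; the second depends on something decoding the entity references before the name is used.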
You might also want to try commenting out the following line in TWiki.pm since it doesn't really help matters.
$fullTopicName = Encode::decode("utf8", $fullTopicName); # 'decode' into UTF-8
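As general background, and not as a diagnosis of what TWiki.pm actually does here, the Python sketch below shows how text gets corrupted when UTF-8 bytes and decoded character strings are mixed up: decoding once is fine, but reading the same bytes with the wrong charset, or re-encoding text that was already misread, produces garbage. All names are illustrative.

```python
topic = "고피디아"
utf8_bytes = topic.encode("utf-8")  # what arrives in a UTF-8 request

# Correct: decode the UTF-8 bytes exactly once.
decoded_once = utf8_bytes.decode("utf-8")

# Mismatch: the same bytes read as a single-byte charset give mojibake,
# one junk character per byte instead of one character per syllable.
misread = utf8_bytes.decode("latin-1")

# Encoding the misread text as UTF-8 again makes the damage larger still.
double_encoded = misread.encode("utf-8")

print(decoded_once == topic)
print(len(topic), len(misread), len(double_encoded))
```

If one part of a program handles a name as decoded characters while another still treats page text as raw bytes, symptoms like a correct topic name alongside corrupted content are plausible, but that is an assumption rather than something confirmed in this thread.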
I'm on holiday from tomorrow until 16th Jan, and busy thereafter, but ping me by email if this doesn't work.
There are some Chinese sites using UTF-8 successfully, e.g. http://www.pgsqldb.org/, so it may help to check their testenv settings and other setup, or email their administrators.
--
RichardDonkin - 06 Jan 2006
Uncommenting that line in TWiki.pm worked like a charm! Many, many thanks!
--
TWikiGuest - 09 Jan 2006
Interesting - this sounds like a bug in using TWiki with UTF-8 as the $siteCharset. Could you log this as a bug in Codev? This should really be fixed in Dakar since it's a low-impact fix.
--
RichardDonkin - 16 Jan 2006
Posted Bugs:Item1421.
--
PeterThoeny - 17 Jan 2006