Question
I have to use English headings to get the TOC working.
Why is this?
Environment
--
ChunhuaLiao - 16 Oct 2004
Answer
SmallTOCi18nIssue appears to be the same problem but for Russian characters.
I believe it has a solution that might work for you.
--
SamHasler - 20 Oct 2004
We really need a test case that shows how it fails for Chinese TOC entries. Could you provide a URL, or provide some TWiki markup text here that causes the problem?
Could you also attach the HTML output of
testenv, as I'm interested in how TWiki works on Chinese sites?
--
RichardDonkin - 21 Oct 2004
I have tried many wiki packages, and
MediaWiki and TWiki are the top two as far as I can tell. I settled on both:
MediaWiki (for public use) and TWiki (for internal use). I cannot afford to give up either of these two great wikis.

I have asked the
MediaWiki developers many times to give
MediaWiki complete permission control like TWiki's, and I am also shamelessly asking the TWiki maintainers to borrow some good features from
MediaWiki. The ideal result would be a single wiki that I could use for both public (simple, easy formatting rules) and internal (flexible permission control) purposes.
Since I treat my TWiki as an internal site, I cannot make a public test case there. I tried to make a test case here but failed, for example with
中文测试页面. I also tried to make a test case on the test site,
http://donkin.org
but I could not register a new account there. The server returned:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete
your request.
Please contact the server administrator, webmaster@donkin.org and inform them of the time
the error occurred, and anything you might have done that may have caused the error.
More information about this error may be available in the server error log.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument
to handle the request.
What should I do then?
--
ChunhuaLiao - 02 Nov 2004
RichardDonkin is away for 2 weeks. I'm sure he'll be keen to work something out when he gets back.
--
SamHasler - 03 Nov 2004
Rather a long answer so here's a TOC
I've disabled registration on my
http://donkin.org
site since it's mainly for my personal use and I don't have time to monitor registrations. However, I'm happy for
ChunhuaLiao to test things there - the best place is the 'testbin' installation, on the
Chinese test page
or a child topic. I've created an account for Chunhua there, password in email.
Test case for TOC issue
I've also created a test case for this issue on my Chinese test page, albeit not with the latest code. On reading the latest code at SVNget:lib/TWiki/Render.pm, the problem seems to be this line, which only works for alphabetic languages (including Cyrillic as long as locales are working), but not Chinese, Japanese, etc.:
$anchorName =~ s/[^$TWiki::regex{mixedAlphaNum}]/_/g; # only allowed chars
Try commenting out this line and see if it helps.
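To illustrate why this filter breaks Chinese headings, here is a minimal sketch. It assumes that `$TWiki::regex{mixedAlphaNum}` behaves like the plain ASCII class `a-zA-Z0-9` when no suitable locale is in effect; `make_anchor` and `$alnum` are hypothetical names used only for this example and are not taken from Render.pm.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: stand-in for $TWiki::regex{mixedAlphaNum} without locale support.
my $alnum = 'a-zA-Z0-9';

# Hypothetical helper that mirrors the substitution quoted above.
sub make_anchor {
    my ($heading) = @_;
    my $anchor = $heading;
    $anchor =~ s/[^$alnum]/_/g;    # only allowed chars
    return $anchor;
}

# Two different two-character Chinese headings, as GB2312 (EUC-CN) bytes.
my $heading1 = encode('euc-cn', "\x{4E2D}\x{6587}");
my $heading2 = encode('euc-cn', "\x{6D4B}\x{8BD5}");

print make_anchor('Test Heading'), "\n";    # spaces become underscores
print make_anchor($heading1), "\n";         # every byte becomes an underscore
print make_anchor($heading2), "\n";         # same result as for $heading1
```

Under these assumptions, every byte of a GB2312 heading is replaced, so different Chinese headings of the same length end up with the same anchor made only of underscores, and the TOC links cannot tell them apart.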
Use of GB2312
I note from SomeChineseCharactersBreakWikiLinks that Chunhua is staying with GB2312, which is likely to cause problems in future. I'd really recommend switching to UTF-8 and helping work out the problems: even though it has some issues, it is more likely to work, and in my experience it works fine for Chinese and similar languages. However, if you are using GB2312, I'm also OK with helping you out as needed, as it helps debug the code for UTF-8 as well. Batch conversion to UTF-8 is probably in your future. The good news is that this is not too hard using Perl 5.8 and the amazing CPAN:Encode module, or older Perl versions and the Unicode::* modules.
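As a rough sketch of what such a batch conversion could look like with Perl 5.8 and Encode: the helper names `gb2312_to_utf8` and `convert_file` are hypothetical, and the code assumes the topic files contain GB2312 data in its usual EUC-CN byte form. It is not a tested migration tool, so back up the data directory before trying anything like it.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a string of GB2312 (EUC-CN) bytes to UTF-8 bytes.
# Dies on malformed input instead of silently substituting characters.
sub gb2312_to_utf8 {
    my ($octets) = @_;
    my $text = decode('euc-cn', $octets, Encode::FB_CROAK);
    return encode('UTF-8', $text, Encode::FB_CROAK);
}

# Hypothetical helper: rewrite one topic file in place.
sub convert_file {
    my ($path) = @_;
    open my $in, '<:raw', $path or die "Cannot read $path: $!";
    my $octets = do { local $/; <$in> };    # slurp the whole file as bytes
    close $in;
    my $utf8 = gb2312_to_utf8($octets);
    open my $out, '>:raw', $path or die "Cannot write $path: $!";
    print {$out} $utf8;
    close $out or die "Cannot close $path: $!";
}

# In-memory check: bytes D6 D0 CE C4 (GB2312) become E4 B8 AD E6 96 87 (UTF-8).
my $converted = gb2312_to_utf8("\xD6\xD0\xCE\xC4");
print join(' ', map { sprintf('%02X', ord($_)) } split(//, $converted)), "\n";
```

A real conversion would loop `convert_file` over the topic files and would also need the site character set setting changed to UTF-8 afterwards; those steps are left out here.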
Reporting I18N problems on TWiki.org
The Chinese characters above were converted into Unicode Numeric Character References (aka NCRs or HTML/XML entities) by the web browser, since TWiki.org runs with ISO-8859-1 as the site character encoding. Even though you correctly pasted in GB2312 data, the browser converts this to NCRs, as the only way of representing Chinese characters in an ISO-8859-1 text box, before it even gets sent to TWiki.
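The effect can be approximated in a few lines of Perl. This is only a sketch of the idea, not the browser's actual algorithm, and `to_ncr` is a hypothetical name: any character outside the ISO-8859-1 range is replaced by its decimal NCR.

```perl
use strict;
use warnings;

# Replace every character that ISO-8859-1 cannot represent (code point
# above 255) with a decimal numeric character reference.
sub to_ncr {
    my ($text) = @_;
    $text =~ s/([^\x{00}-\x{FF}])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# U+4E2D and U+6587 are the two characters of the Chinese word for "Chinese".
print to_ncr("TOC \x{4E2D}\x{6587}"), "\n";    # prints "TOC &#20013;&#25991;"
```

This is why text pasted into a topic here is stored as strings such as `&#20013;` rather than as GB2312 or UTF-8 bytes.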
See
JapaneseAndChineseSupport for more details on support for ideographic languages.
--
RichardDonkin - 16 Nov 2004
It works! Thanks a million to Richard for his help!
There is only a small difference in my lib/TWiki/Render.pm: line 1131, the one to comment out, looks like this:
# $anchorName =~ s/$regex{singleMixedNonAlphaNumRegex}/_/g; # only allowed chars
I can now use Chinese headings to generate a TOC without any problems.
--
ChunhuaLiao - 20 Nov 2004
Great - thanks for letting me know that it worked! I'll work on a better fix to go into the TWiki code. Let us know if you have any other
I18N problems - it's really useful to know what's broken.
UPDATE: My current thinking is that a new
TWiki.cfg parameter,
$langAlphabetic, would be set to 0 for languages such as Chinese where it's not sensible to filter out all non-alphabetic characters from headings (and in some other situations). This would avoid the need to patch TWiki code for proper TOC support in Chinese and Japanese - just set
$langAlphabetic to 0 instead.
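Purely as an illustration of the idea, and not as existing TWiki code, the setting could gate the filter roughly as follows. `$langAlphabetic` is only the proposed parameter, while `make_anchor` and the ASCII class in `$alnum` are hypothetical stand-ins for the real anchor code in Render.pm, whose other cleanup steps are omitted.

```perl
use strict;
use warnings;

# Proposed TWiki.cfg setting (hypothetical): 1 for alphabetic languages,
# 0 for languages such as Chinese or Japanese.
our $langAlphabetic = 0;

# Assumption: ASCII stand-in for the alphanumeric class used in Render.pm.
my $alnum = 'a-zA-Z0-9';

# Hypothetical helper: build an anchor name from a heading.
sub make_anchor {
    my ($heading) = @_;
    my $anchor = $heading;
    # Filter non-alphanumeric characters only for alphabetic languages.
    $anchor =~ s/[^$alnum]/_/g if $langAlphabetic;
    return $anchor;
}
```

With the setting at 0, the heading bytes pass through unchanged, which matches the effect of commenting out the line by hand.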
See also
InternationalisationIssues for some other issues with
I18N.
--
RichardDonkin - 21 Nov 2004