Question

I have to use English headings to get the TOC working. Why is this?

Environment

TWiki version: TWikiRelease01Sep2004
TWiki plugins: DefaultPlugin, EmptyPlugin, InterwikiPlugin
Server OS: Windows 2000
Web server: Apache
Perl version: 5.8
Client OS: Windows XP
Web Browser: IE

-- ChunhuaLiao - 16 Oct 2004

Answer

SmallTOCi18nIssue appears to be the same problem but for Russian characters. I believe it has a solution that might work for you.

-- SamHasler - 20 Oct 2004

We really need a test case that shows how it fails for Chinese TOC entries. Could you provide a URL, or provide some TWiki markup text here that causes the problem?

Could you also attach the HTML output of testenv, as I'm interested in how TWiki works on Chinese sites?

-- RichardDonkin - 21 Oct 2004

I tried many wiki packages; MediaWiki and TWiki are the top two as far as I can tell. I settled on both at the same time: MediaWiki for the public site and TWiki for the internal one. I cannot afford to give up either of these two great wikis. I have bothered the MediaWiki developers many times to give MediaWiki complete permission control like TWiki's, and I am also shamelessly bothering the TWiki maintainers to borrow some good features from MediaWiki. The ideal result would be a single wiki that I could use for both public (simple, easy formatting rules) and internal (flexible permission control) purposes.

Since I treat my TWiki as an internal site, I cannot make a public test case there. I tried to make a test case here but failed. For example, 中文测试页面 . I also tried to make a test case on the test site, http://donkin.org, but I could not register a new account there:

Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete 
your request.
Please contact the server administrator, webmaster@donkin.org and inform them of the time 
the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.

Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument 
to handle the request. 

What should I do then?

-- ChunhuaLiao - 02 Nov 2004

RichardDonkin is away for 2 weeks. I'm sure he'll be keen to work something out when he gets back.

-- SamHasler - 03 Nov 2004

Rather a long answer, so here's a TOC.

I've disabled registration on my http://donkin.org site since it's mainly for my personal use and I don't have time to monitor registrations. However, I'm happy for ChunhuaLiao to test things there - the best place is the 'testbin' installation, on the Chinese test page or a child topic. I've created an account for Chunhua there, password in email.

Test case for TOC issue

I've also created a test case for this issue on my Chinese test page, albeit not with latest code - on reading the latest code at SVNget:lib/TWiki/Render.pm, the problem seems to be this line, which only works for alphabetic languages (including Cyrillic as long as locales are working), but not Chinese, Japanese, etc.

    $anchorName =~ s/[^$TWiki::regex{mixedAlphaNum}]/_/g;      # only allowed chars
Try commenting out this line and see if it helps.
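
To see why that substitution breaks Chinese headings, here is a minimal standalone sketch. It assumes the character class behaves like plain ASCII letters and digits, which is roughly what happens when no locale marks the high-bit bytes of GB2312 as alphabetic. The names `$mixed_alpha_num` and `filter_anchor` are hypothetical stand-ins, not TWiki's own code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $TWiki::regex{mixedAlphaNum}; the real
# class is built by TWiki and depends on the locale settings.
my $mixed_alpha_num = 'a-zA-Z0-9';

# Replace every character outside the allowed class with an underscore,
# as the anchor-name line quoted above does.
sub filter_anchor {
    my ($anchor) = @_;
    $anchor =~ s/[^$mixed_alpha_num]/_/g;
    return $anchor;
}

print filter_anchor("Test page"), "\n";          # prints Test_page
print filter_anchor("\xD6\xD0\xCE\xC4"), "\n";   # GB2312 bytes for two Chinese characters; prints ____
```

Every byte of a GB2312 heading is replaced, so all Chinese headings of the same byte length end up with the same anchor name and the TOC links cannot tell them apart.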

Use of GB2312

I note from SomeChineseCharactersBreakWikiLinks that Chunhua is staying with GB2312, which will cause problems in future. I'd really recommend switching to UTF-8 and helping work out the problems: even though it has some issues, it is more likely to work, and in my experience it works fine for Chinese and similar languages. However, if you are using GB2312, I'm also OK with helping you out as needed, as it helps debug the code for UTF-8 as well. Batch conversion to UTF-8 is probably in your future. The good news is that this is not too hard using Perl 5.8 and the amazing CPAN:Encode module, or older Perl versions and the Unicode::* modules.
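
As a rough illustration of the conversion step with Encode, the sketch below converts one GB2312 byte string to UTF-8. It assumes the data is GB2312 in its usual EUC-CN byte encoding; `gb2312_to_utf8` is a hypothetical helper name, and a real batch run would still need to loop over the topic files and keep backups.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a byte string from GB2312 (EUC-CN encoded) to UTF-8.
# FB_CROAK makes invalid input die instead of being silently mangled.
sub gb2312_to_utf8 {
    my ($bytes) = @_;
    my $text = decode('euc-cn', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    return encode('UTF-8', $text);
}

my $utf8 = gb2312_to_utf8("\xD6\xD0\xCE\xC4");

# Show the result as hex so it is readable on any terminal.
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
# prints E4 B8 AD E6 96 87
```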

Reporting I18N problems on TWiki.org

The Chinese characters above were converted into Unicode Numeric Character References (aka NCRs or HTML/XML entities) by the web browser, since TWiki.org runs with ISO-8859-1 as the site character encoding. Even though you correctly pasted in GB2312 data, the browser converts this to NCRs, as the only way of representing Chinese characters in an ISO-8859-1 text box, before it even gets sent to TWiki.
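
The browser behaviour described above can be imitated in a few lines of Perl. This is only a sketch of the idea, not what any particular browser runs; `to_ncr` is a hypothetical name.

```perl
use strict;
use warnings;

# Replace every character that does not fit in ISO-8859-1 with a
# decimal Numeric Character Reference, leaving Latin-1 text untouched.
sub to_ncr {
    my ($text) = @_;
    $text =~ s/([^\x{00}-\x{FF}])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# U+4E2D and U+6587 are the two characters of the word for "Chinese".
print to_ncr("\x{4E2D}\x{6587} TOC"), "\n";   # prints &#20013;&#25991; TOC
```

What reaches the server is therefore plain ASCII markup, not GB2312 bytes, which is why a test case pasted into an ISO-8859-1 site does not reproduce the original problem.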

See JapaneseAndChineseSupport for more details on support for ideogrammatic languages.

-- RichardDonkin - 16 Nov 2004

It works! Thanks a million for Richard's help! There is only a small difference in my lib/TWiki/Render.pm: the line to comment out, line 1131, looks like this:

  # $anchorName =~ s/$regex{singleMixedNonAlphaNumRegex}/_/g;      # only allowed chars

I can now use Chinese headings to generate the TOC without problems.

-- ChunhuaLiao - 20 Nov 2004

Great - thanks for letting me know that it worked! I'll work on a better fix to go into the TWiki code. Let us know if you have any other I18N problems - it's really useful to know what's broken.

UPDATE: My current thinking is that a new TWiki.cfg parameter, $langAlphabetic, would be set to 0 for languages such as Chinese where it's not sensible to filter out all non-alphabetic characters from headings (and in some other situations). This would avoid the need to patch TWiki code for proper TOC support in Chinese and Japanese - just set $langAlphabetic to 0 instead.
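
A sketch of how such a setting might gate the filter is shown here. Only the name $langAlphabetic comes from the proposal; the function `make_anchor_name`, the ASCII-only character class, and the fallback rule for non-alphabetic languages are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Proposed setting: 1 for alphabetic languages, 0 for Chinese, Japanese, etc.
my $langAlphabetic = 0;

# Hypothetical helper: build an anchor name from a heading.
sub make_anchor_name {
    my ($heading, $alphabetic) = @_;
    my $anchor = $heading;
    if ($alphabetic) {
        # Current behaviour: anything that is not a letter or digit becomes "_".
        $anchor =~ s/[^a-zA-Z0-9]/_/g;
    }
    else {
        # Assumed fallback: replace only ASCII non-alphanumerics and
        # leave high-bit bytes (multi-byte characters) untouched.
        $anchor =~ s/[\x00-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]/_/g;
    }
    return $anchor;
}

# Heading made of GB2312 bytes for two Chinese characters, a space, and "TOC".
my $heading = "\xD6\xD0\xCE\xC4 TOC";
print make_anchor_name($heading, 1), "\n";   # prints _____TOC
my $kept = make_anchor_name($heading, $langAlphabetic);   # Chinese bytes kept, space becomes "_"
```

With the setting at 0 the Chinese bytes survive, so distinct headings keep distinct anchors without patching Render.pm by hand.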

See also InternationalisationIssues for some other issues with I18N.

-- RichardDonkin - 21 Nov 2004

Topic revision: r16 - 2004-12-20 - RichardDonkin
 