Question

I have set up a TWiki on my site, at lojban.org. The same project uses PHPWiki as well, at http://nuzban.wiw.org/wiki/, and I would like to move it over.

Unfortunately, the project in question uses a language in which ' is a significant character. For example, doi and do'i are two very different words. Since TWiki strips ', these end up pointing to the same place, making TWiki essentially useless to us. I like it very much, but this is a stopping point. Is there anything I can do?

  • TWiki version: Latest Debian package.
  • Web server: Apache
  • Server OS: Debian
  • Web browser: Opera
  • Client OS: Win2K

-- RobinLeePowell - 11 Nov 2002

Answer

This is a special case of internationalising WikiWords (which I assume is the problem here) - see this search for some relevant pages. In particular, WikiNamesAreNotInternational and InternationalCharactersInWikiWords discuss some approaches - the transcription based patch may be relevant too. The idea is to add \' as a character within the character classes that look like [A-Z] etc (actually the backslash should not be necessary).
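
As a rough illustration of the character-class idea, here is a minimal sketch in Perl. The variable names ($upper, $lower, $wikiWordRegex) and the helper is_wiki_word are hypothetical and are not TWiki's own. The pattern is a simplified WikiWord rule, not the one TWiki.pm actually uses.

```perl
use strict;
use warnings;

# Hypothetical character classes; TWiki's real code may organise these differently.
my $upper = q{A-Z};
my $lower = q{a-z'};    # apostrophe added to the lower-case class, no backslash needed

# Simplified WikiWord rule: one upper, one or more lower, one upper, then any mix.
my $wikiWordRegex = qr/^[${upper}][${lower}]+[${upper}][${upper}${lower}]*\z/;

# Returns 1 if $word looks like a WikiWord under the simplified rule, else 0.
sub is_wiki_word {
    my ($word) = @_;
    return $word =~ $wikiWordRegex ? 1 : 0;
}
```

With this rule, Do'iDoi counts as a WikiWord, while do'i on its own still does not, because it has no capital letters.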

Alternatively, to avoid coding, you can just define a suitable WikiWord for all pages and define all internal links inside square brackets, e.g. do'i or doi (try the links). Not as nice to use, but then it would be difficult to make doi or do'i a WikiWord anyway. You'd need to come up with a suitable transcription of the quote character; here I have used 'QQ' on the assumption that QQ is uncommon in Lojban.
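
If you wanted to generate such links rather than type them by hand, a small helper along these lines might do. This is only a sketch: topic_for_word and link_for_word are made-up names, and capitalising the first letter to form the topic name is an assumption about how you would name the pages. It emits the double square bracket form with a separate topic and label.

```perl
use strict;
use warnings;

# Transcribe the apostrophe as QQ and capitalise the first letter
# (assumed naming scheme, not something TWiki does for you).
sub topic_for_word {
    my ($word) = @_;
    (my $topic = $word) =~ s/'/QQ/g;
    return ucfirst $topic;
}

# Build a bracketed link that shows the original word as its label.
sub link_for_word {
    my ($word) = @_;
    return '[[' . topic_for_word($word) . '][' . $word . ']]';
}

print link_for_word("do'i"), "\n";    # [[DoQQi][do'i]]
print link_for_word("doi"),  "\n";    # [[Doi][doi]]
```

The point is only that do'i and doi end up at distinct topics while the reader still sees the real word.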

UPDATE: I tested out a very simple change to TWiki.cfg that allows links such as [[do'i doi]] and [[do'i]], resulting in URLs to Do'iDoi and Do'i - similar to what the PhpWiki does currently, as far as I can tell. Here is the change:

# $securityFilter     = "[\\\*\?\~\^\$\@\%\`\"\'\&\;\|\<\>\x00-\x1F]";
# Lojban hack - don't filter out single quote
$securityFilter     = "[\\\*\?\~\^\$\@\%\`\"\&\;\|\<\>\x00-\x1F]";

Of course, there may well be SecureSetup implications from removing single quote from the security filter... However, as discussed in TaintChecking, the filtering-out approach is better replaced by one in which only legal WikiWord characters are 'filtered in'. Once there is a single variable defining the legal characters for Wiki words, it is very easy to make this sort of change in one place, which would help internationalisation support of course.
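
To make the 'filtered in' idea concrete, here is a minimal sketch. The variable $legalTopicChars and the function filter_in are hypothetical; TWiki did not have such a variable at the time of writing, and the exact set of legal characters is an assumption.

```perl
use strict;
use warnings;

# Hypothetical single definition of the legal topic-name characters.
# Allowing the apostrophe for Lojban is then a one-place change.
my $legalTopicChars = q{A-Za-z0-9'};

# Keep only characters from the legal set, dropping everything else.
sub filter_in {
    my ($name) = @_;
    (my $clean = $name) =~ s/[^${legalTopicChars}]//g;
    return $clean;
}
```

Compared with the $securityFilter approach, anything not explicitly allowed (backticks, semicolons, control characters and so on) disappears without having to be listed.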

This is probably not enough - you'd need to also get into the specificLink and other subroutines in TWiki.pm to make sure that the ' is encoded as %27, and you might also want to change the logic that uppercases spaced-out wiki words and removes the spaces (encode them as %20). This is based on the TWikiAlphaRelease - I have added some extra comments that should make it easier to see what is going on.
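
For the encoding step, something like the following could be called when building the URL. It is a sketch only: url_encode_topic is a made-up name, not an existing TWiki.pm subroutine, and it assumes the topic name is plain ASCII (or already a byte string).

```perl
use strict;
use warnings;

# Percent-encode every character except letters, digits and _ . ~ -
# so that ' becomes %27 and a space becomes %20.
sub url_encode_topic {
    my ($name) = @_;
    (my $encoded = $name) =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $encoded;
}

print url_encode_topic("do'i doi"), "\n";    # do%27i%20doi
```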

The other good thing about this tweak is that it should also support Klingon, as well as other languages that use apostrophes a lot :-)

I'd be interested to hear how you get on. See PerlTips and TWikiDebugging for some hints on TWiki coding.

-- RichardDonkin - 12 Nov 2002

Topic revision: r3 - 2002-11-12 - RichardDonkin
 
Copyright © 1999-2012 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.