Question
I have set up a TWiki on my site, at lojban.org. The same project also uses PHPWiki, at http://nuzban.wiw.org/wiki/, and I would like to move it over.
Unfortunately, the project in question uses a language in which ' is a significant character. For example, doi and do'i are two very different words. Since TWiki strips ', these end up pointing to the same place, making TWiki essentially useless to us. I like it very much, but this is a showstopper. Is there anything I can do?
- TWiki version: Latest Debian package.
- Web server: Apache
- Server OS: Debian
- Web browser: Opera
- Client OS: Win2K
-- RobinLeePowell - 11 Nov 2002
Answer
This is a special case of internationalising WikiWords (which I assume is the problem here) - see this search for some relevant pages. In particular, WikiNamesAreNotInternational and InternationalCharactersInWikiWords discuss some approaches - the transcription-based patch may be relevant too. The idea is to add \' to the character classes that look like [A-Z] etc. (actually the backslash should not be necessary).
Alternatively, to avoid coding, you can just define a suitable WikiWord for each page and write all internal links inside square brackets, e.g. do'i or doi (try the links). Not as nice to use, but then it would be difficult to make doi or do'i a WikiWord anyway. You'd need to come up with a suitable transcription of the quote character; here I have used 'QQ' on the assumption that QQ is uncommon in Lojban.
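As a rough illustration of the transcription idea, here is a minimal Perl sketch. The helper name to_page_name, the capitalised first letter, and the use of the [[PageName][link text]] link form are all assumptions made for the example, not something TWiki does for you.

```perl
use strict;
use warnings;

# Hypothetical helper: build a page name from a Lojban word by replacing
# each apostrophe with the transcription 'QQ' and capitalising the first letter.
sub to_page_name {
    my ($word) = @_;
    (my $name = $word) =~ s/'/QQ/g;
    return ucfirst($name);
}

for my $word ("doi", "do'i") {
    my $page = to_page_name($word);
    # Assumed link form: [[PageName][link text]]
    printf "[[%s][%s]]\n", $page, $word;
}
# prints:
# [[Doi][doi]]
# [[DoQQi][do'i]]
```

The two words now map to distinct page names, at the cost of writing every link in bracket form.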
UPDATE: I tested out a very simple change to TWiki.cfg that allows links such as [[do'i doi]] and [[do'i]], resulting in URLs to Do'iDoi and Do'i - similar to what PhpWiki does currently, as far as I can tell. Here is the change:
# $securityFilter = "[\\\*\?\~\^\$\@\%\`\"\'\&\;\|\<\>\x00-\x1F]";
# Lojban hack - don't filter out single quote
$securityFilter = "[\\\*\?\~\^\$\@\%\`\"\&\;\|\<\>\x00-\x1F]";
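To see what the change does, the two patterns can be tried outside TWiki. This is only a sketch: it assumes the filter is applied as a regex substitution that deletes every matching character, and apply_filter is a hypothetical helper, not a TWiki function.

```perl
use strict;
use warnings;

# Both patterns are copied from the TWiki.cfg lines above.
my $stockFilter  = "[\\\*\?\~\^\$\@\%\`\"\'\&\;\|\<\>\x00-\x1F]";
my $lojbanFilter = "[\\\*\?\~\^\$\@\%\`\"\&\;\|\<\>\x00-\x1F]";

# Hypothetical helper: delete every character the filter matches.
sub apply_filter {
    my ($text, $filter) = @_;
    $text =~ s/$filter//g;
    return $text;
}

print apply_filter("do'i", $stockFilter), "\n";      # doi
print apply_filter("do'i", $lojbanFilter), "\n";     # do'i
print apply_filter("do'i;ls", $lojbanFilter), "\n";  # do'ils
```

With the stock filter, doi and do'i collapse to the same string; with the modified one the apostrophe survives while the other shell metacharacters are still removed.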
Of course, there may well be SecureSetup implications from removing single quote from the security filter... However, as discussed in TaintChecking, the filtering-out approach is better replaced by one in which only legal WikiWord characters are 'filtered in'. Once there is a single variable defining the legal characters for Wiki words, it is very easy to make this sort of change in one place, which would help internationalisation support of course.
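A 'filter in' version might look something like the following sketch. The variable $legalTopicChars and the helper filter_in are hypothetical names; TWiki does not define them, and the character set shown is only an assumption about what a Lojban site would want.

```perl
use strict;
use warnings;

# Assumed single definition of the characters allowed in a topic name.
# Adding the apostrophe here is the only Lojban-specific change.
my $legalTopicChars = "A-Za-z0-9'";

# Hypothetical helper: keep legal characters, drop everything else.
sub filter_in {
    my ($name) = @_;
    $name =~ s/[^${legalTopicChars}]//g;
    return $name;
}

print filter_in("do'i"), "\n";            # do'i
print filter_in("do'i; rm -rf /"), "\n";  # do'irmrf
```

Because anything not explicitly listed is dropped, a new dangerous character never has to be added to a blacklist after the fact.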
This is probably not enough - you'd also need to get into specificLink and other subroutines in TWiki.pm to make sure that the ' is encoded as %27, and you might also want to change the logic that uppercases spaced-out wiki words and removes the spaces (encoding them as %20 instead). This is based on the TWikiAlphaRelease - I have added some extra comments that should make it easier to see what is going on.
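The encoding step itself is simple; the work is in finding every place in TWiki.pm that builds a URL. As a minimal sketch, assuming ASCII topic names and a hypothetical helper called encode_topic_for_url (not an existing TWiki subroutine):

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character outside a small
# safe set, so ' becomes %27 and a space becomes %20.
# Assumes single-byte (ASCII) input.
sub encode_topic_for_url {
    my ($topic) = @_;
    $topic =~ s/([^A-Za-z0-9_.-])/sprintf("%%%02X", ord($1))/ge;
    return $topic;
}

print encode_topic_for_url("do'i"), "\n";      # do%27i
print encode_topic_for_url("do'i doi"), "\n";  # do%27i%20doi
```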
The other good thing about this tweak is that it should also support Klingon, as well as other languages that use apostrophes a lot.
I'd be interested to hear how you get on. See PerlTips and TWikiDebugging for some hints on TWiki coding.
-- RichardDonkin - 12 Nov 2002