Proposed: Use Underscore Words for Autolinking

Proposed Rules

  • Add new AUTOLINKTYPE preference setting to TWikiPreferences (and WebPreferences where needed) with these supported values:
    • Set AUTOLINKTYPE = wikiword       (for WikiWord linking; current spec)
    • Set AUTOLINKTYPE = underscore   (for underscore word linking)
    • This can be combined, e.g. Set AUTOLINKTYPE = wikiword, underscore
    • The default is both, i.e. combined
    • Autolinking can also be turned off if set to an empty value; in which case you need to use the double square bracket rule for linking
  • Syntax of underscore word for autolinking:
    • An underscore word is defined as a word with alphanumeric characters and at least one underscore, like Good_style, Pattern_description
    • At least one underscore must be present; turn a single word into an underscore word by appending an underscore, like Home_
  • Autolinking of underscore words:
    • Autolinking is only done if the underscore word is preceded by a space or parenthesis (same spec as for WikiWord links)
    • Link label shows spaces instead of underscore (better usability; helps search engine results)
    • Links are case-insensitive, e.g. a link to Underscore_word topic can be written as underscore_word, Underscore_Word, or Underscore_WORD
    • Underscore words can also be used in double square bracket links, like [[Good_style]] and [[Good_style][good style]]
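The rules above can be sketched in a few lines. This is a Python illustration only (TWiki itself is Perl), not the proposed implementation. The pattern, the `autolink` function and the `[[Topic][label]]` output form are assumptions made for the example, as is the choice to look topics up by the word exactly as written (ignoring case).

```python
import re

# Assumed pattern: alphanumerics, then an underscore, then any mix of
# alphanumerics and underscores. The word must follow the start of the
# text, whitespace, or an opening parenthesis.
UNDERSCORE_WORD = re.compile(r"(^|[\s(])([A-Za-z0-9]+_[A-Za-z0-9_]*)")


def autolink(text, topics):
    """Turn underscore words that name an existing topic into bracket links."""
    # Case-insensitive lookup: lowercase key -> topic name as stored.
    index = {name.lower(): name for name in topics}

    def replace(match):
        prefix, word = match.group(1), match.group(2)
        target = index.get(word.lower())
        if target is None:
            return match.group(0)  # unknown topic: leave the text alone
        label = word.replace("_", " ").strip()  # label shows spaces
        return f"{prefix}[[{target}][{label}]]"

    return UNDERSCORE_WORD.sub(replace, text)
```

For example, `autolink("See good_style (Home_) here", ["Good_style", "Home_"])` links both words, while `path/some_var` is left alone because the word is not preceded by a space or parenthesis.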

Variants:

  • Variant #1: same as above, but make the underscore autolink require a terminating underscore. This would be useful, as a quick check on my 10,000 wiki pages has shown a huge number of a_b words, but no a_b_ forms. E.g. you would have to write underscore_word_ (added bonus: it would be consistent with single words like home_)
  • Variant #2: single words like home_ are written as [[Home_]]. Multiple words are written Underscore_words or [[Underscore_words]]
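A possible reading of Variant #1 as a pattern, again as a Python sketch rather than TWiki code. The name `VARIANT1` and the exact character classes are assumptions. The point is only that plain a_b words no longer match, while a_b_ forms do.

```python
import re

# Assumed Variant #1 pattern: the word must END with an underscore and must
# not be followed by further word characters. The lookbehind requires the
# start of the text, whitespace, or "(" before the word.
VARIANT1 = re.compile(
    r"(?<![^\s(])[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*_(?![A-Za-z0-9_])"
)


def find_variant1_words(text):
    """Return the words that would autolink under Variant #1."""
    return VARIANT1.findall(text)
```

With this pattern, `max_len` in running text is ignored, while `underscore_word_` and `home_` are picked up.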

Background

To improve usability and readability TWiki should support underscores as Wiki word separators. Underscores can be substituted by spaces, so links will look like normal links on normal websites. It looks more humane. It will sell TWiki better to the outside world.

Underscore wiki_words should not be a substitute for the old WikiWord syntax (we will not risk breaking existing content), but an additional option/preference.

I've been using underscored_links for almost one year with great satisfaction. I would not want to go back. Problem: my implementation is more of a hack and not ready for the core. So other CoreTeam members and/or others are kindly invited to build it in a proper way. Therefore this summarizing topic.

  • In my naive implementation (see NoBumpyCase) I have also done away with the need to use a capital for the first wiki word; for instance I can use home_ as wiki word.
  • I have used a hack of SpacedWikiWordPlugin to display the underscores as spaces.

Contributors:
-- ArthurClemens - 12 Oct 2003
-- PeterThoeny - 09 Nov 2003

Quotes from other topics

The discussion on underscores comes up periodically. Some excerpts:

From WikiWordDiscuss:

I like the idea of wiki words very much, but the decision to use ThisWayToWriteThemIsInMyOpinionVeryDefective. I prefer to use underscores to identify a word or sentence as a wiki word because:

  • Using underscores preserves the case of words.
  • As a result, it's possible to render a wiki word stated in underscore syntax LikeThis, but not vice versa. [...]
  • Underscores could be left out when rendering a wiki word [...]
  • The underscore approach handles abbreviations properly
  • A lot of computer related terms use wiki word syntax but are not supposed to be recognized as such most of the time.
  • (taken from somewhere else) Some languages do not have case, so there it's difficult to create a wiki word. (Eg Chiq_chaq mentions Hebrew, Arabic, Kanji. -- EdAvis)
  • Having preserved case, it now makes sense to ignore case when creating or *go*-ing a topic, since case now doesn't serve as a word delimiter.
-- MichaelUtech - 18 Nov 2001, 23 Nov 2001

From LocalizationIsNeeded:

This is related to a plugin I'm working on to allow WikiWords that look ok in Swedish, since the StudlyCaps imposed by default WikiWords makes the text look really horrible in Swedish. This becomes very obvious if you cut and paste text from an internal TWiki page into an email as a response to a customer question. Instead of modifying the character cases throughout the text, the plugin will allow normal_words_like_these to become links. In the rendered version, the underscores are replaced by spaces and the text will cut & paste without any problems and become normal words like these. The link would look like this -> normal words like these.
[...]

-- StefanLindmark - 02 Feb 2002

From WebTeachProject:

The usual wikiwords do not render well in Italian, so most of our users tend to stick with the square-bracket mechanism. However, wikiwords appear in the index, etc. I would like to allow the use of underscores as an alternative way of indicating wikiwords.

-- FrancoBagnoli - 14 Mar 2002

From WikiWordDiscuss:

About the bumpy case (Colas Nahaboo): The Bumpy Case is some kind of "trademark" of wiki systems, and this kind of "standard" helps users be comfortable in using a wiki, not being locked into what they perceive as a weird syntax used only by one implementation.

I don't completely agree: how many wiki systems will the average end user see before working with one? How many times will the admin change the Wiki system? I bet the answers are: none, zero. Wikis are not well known or widely used. But when users see one, the second negative thing they will notice is the unearthly syntax, like it's some kind of programming language or secret code [...]. If you want TWiki to succeed, to be widely supported, make it look and function as humanly as possible.

One more to add to this: Wikipedia is perhaps the greatest advocate of Wiki. And they don't show WikiWord syntax.

-- ArthurClemens - 25 May 2003

[...]

Everybody agrees that there are many more reads of every page than requests to edit it. So when optimizing the design, we should optimize the frequently used parts first, which means reading. All new users complain about words mashed together without a space between them. Some newbies even complained to me that a hyperlink is not visible enough in raw text and would prefer something more apparent (they have [[]], but many links are without them). Plugins.SpacedWikiWordPlugin is smart, but not too smart - where to add spaces in McDonald?

-- PeterMasiar - 27 May 2003

Note: I am now convinced that underscore links seem a nice feature. Non-geeks do not care about the Bumpy case legacy.

-- ColasNahaboo - 05 Jan 2005

From TopicSaveErrorWithTopicsContainingSpace:

The question is backward compatibility if we relax the WikiWord syntax to allow underscores. Many programmers use underscores in names, so code descriptions in a TWiki topic would show unexpected links. This is not an issue for non-programmers. Possibly a configuration switch?

-- PeterThoeny - 11 Aug 2003

[...]

Better a per page type - possibly in the meta info. (This way I can migrate my wiki to the better format.)

-- AndyGlew - 14 Aug 2003

Discussion on implementation

I very much like the idea of underscore wikiwords. My only concern is backward compatibility with existing content.

A selective on/off feature is needed. A flexible solution is to add an AUTOLINKTYPE = wikiword, underscore setting to the TWikiPreferences. It can be set to one or the other, both, or turned off. Default could be both.

-- PeterThoeny - 12 Oct 2003

The AUTOLINKTYPE preference seems fine, as long as it is settable on a per page basis.

-- RandyKramer - 13 Oct 2003

Will the switch also cover "forced" links like [[Make this a wiki link]]? The URL should become .../Make_this_a_wiki_link IMHO.

2 ¢ by PeterKlausner - 13 Oct 2003

With a select variable, the following situations can occur:

  • BumpyCase only
  • BumpyCase and Underscore_Wiki_words
  • Underscore_Wiki_words only

Any ideas how to proceed?

-- ArthurClemens - 21 Oct 2003

Why do we need uppercase? IMHO wiki_word should be case-insensitive, so even wiki_word will link to the same page as Wiki_Word (or wiki_Word) - and display '_' as space.

-- PeterMasiar - 21 Oct 2003

I agree that Underscore_Wiki_Words should be supported. I also concur with Peter that they ought to be case insensitive. This could be easily implemented by forcing case during the matching process.

I think the way to proceed would be to provide the code. We don't seem to get progress any other way.

-- MartinCleaver - 21 Oct 2003

After a few days of thought and reading I'm revising my lukewarm acceptance of this idea to wholehearted agreement.

AndyGlew's idea of a per-page setting to allow migration of old sites to the new format is a good one.

-- MattWilkie - 22 Oct 2003

I refined and updated the rules.

This feature should be implemented. I have these concerns that need to be looked at:

  • Compatibility with existing text: There might be unwanted side-effects with existing text
    • Do we need to make the rule a little bit more strict? Like for example: alpha_alphanum
  • Clash with italic rule rendering
    • Example problem case: _This sentence shows a link to Home_ and should be shown in italic_. The current code renders italic before the internal links, i.e. it would turn the sentence into italic up to "Home". This could be fixed by reversing the rendering sequence (which could have other side-effects)
  • User confusion with italic rule
  • Performance impact for ignore-case linking
    • The case-sensitive linking is fast even for big webs with thousands of topics, since only a simple "file exists" check needs to be done for each link to determine how to render the link. A search must be performed for ignore-case linking, which can be slow (do a topic search on TWiki.org's Main web for example). This is N/A for Windows platforms.

-- PeterThoeny - 09 Nov 2003
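One way to look at the performance concern: if the topic names of a web are read once and indexed by a case-folded key, each ignore-case link check becomes a dictionary lookup rather than a search. The Python sketch below only illustrates that idea. The function names are hypothetical, and building such an index once per web is an assumption, not something TWiki is described as doing here.

```python
def build_topic_index(topic_names):
    """Map lowercased topic names to the stored name (first one wins)."""
    index = {}
    for name in topic_names:
        # setdefault keeps the first spelling if two names differ only by case.
        index.setdefault(name.lower(), name)
    return index


def resolve(index, written):
    """Return the stored topic name for a link as written, or None."""
    return index.get(written.lower())
```

For example, with an index built from `["Underscore_word", "WebHome"]`, the spellings `underscore_word`, `Underscore_Word` and `Underscore_WORD` all resolve to `Underscore_word`.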

How about if all filenames were in lowercase with the preferred case in the meta data? So Underscore_Wiki_words and Underscore_wiki_words would both point to the same topic with the filename underscore_wiki_words.txt, and the preferred case in the meta data could be Underscore_Wiki_Words, which would display as Underscore Wiki Words in the homeward path and the <title> meta tag.

Also, how would you make the following sentence italic?

  • Another Example problem case: This sentence should be shown in italic and shows a link to Home_.
    • It would be useful if a double underscore could be recognized: _... and shows a link to Home__
      otherwise you need the brackets for this case. -- ArthurClemens - 10 Nov 2003
      • This can be done by rendering the autolink before the italic (currently it is the other way around) -- PeterThoeny - 11 Nov 2003

-- SamHasler - 10 Nov 2003

That solves the example case I put forward but doesn't solve the original example because I assume it would think that both Home_ and italic_ were links. Perhaps a solution (without changing the italic syntax) is for the previewing script to check for confusions and display a box at the top of the page highlighting what they are and possible solutions. Something like:

This page contains the following line which may not be highlighted correctly:
_This sentence shows a link to Home_ and should be shown in italic_ This sentence shows a link to Home and should be shown in italic_
Did you mean:
_This sentence shows a link to Home_ and should be shown in italic_ This sentence shows a link to Home_ and should be shown in italic
or
_This sentence shows a link to Home_ and should be shown in italic_ This sentence shows a link to Home and should be shown in italic_

Would it also be possible to have buttons to automatically make the corrections?

-- SamHasler - 11 Nov 2003
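The ordering problem discussed above can be reproduced with two deliberately simplified rules. This is a Python sketch. Both patterns are stand-ins written for this example and are not TWiki's actual rendering code, so the exact output of a real installation may differ.

```python
import re

# Simplified italic rule: _text_ that starts after whitespace/"(" or at the
# start of the text and ends before whitespace, punctuation or the end.
ITALIC = re.compile(r"(^|[\s(])_(\S+?|\S.*?\S)_(?=$|[\s,.;:!?)])")
# Simplified underscore autolink rule (alphanumerics, "_", more word chars).
LINK = re.compile(r"(^|[\s(])([A-Za-z0-9]+_[A-Za-z0-9_]*)")


def render_italic(text):
    return ITALIC.sub(r"\1<em>\2</em>", text)


def render_links(text):
    return LINK.sub(lambda m: f"{m.group(1)}[[{m.group(2)}]]", text)


SENTENCE = "_This sentence shows a link to Home_ and should be shown in italic_"

# Current order: italic first, then links.
italic_first = render_links(render_italic(SENTENCE))
# Proposed order: links first, then italic.
links_first = render_italic(render_links(SENTENCE))
```

With these toy rules, italic-first stops the emphasis at "Home" and then links `italic_`, while links-first treats both `Home_` and `italic_` as links and leaves no closing underscore for the emphasis. Neither order gives the intended result for this sentence.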

I don't see how underline-wiki-words can possibly be reconciled with underline-for-italics.

Either one or the other has to go; don't have both.

All existing topics can be converted out of underline-for-italics, if that's the concern; substitute for asterisk or other emphasis markup.

However, my two cents is using underline-wiki-words is wrong. If the biggest wiki in the world still does it, even if the big gorilla (microsoft) does it, it's still wrong. There is far more content in email and text which uses underline as emphasis, not as wiki word. Written publications and authors use underline as the markup for italics, and will still do this a hundred years from now (because they won't change, believe me).

There are plenty of other markup characters which might be used instead of underline, which do not conflict with existing text and email. Having spaces within the wiki word can always be worked around. Examples to use instead of underline are quotes or double quotes or double apostrophes.

If underline-wiki-words is really implemented, by all means completely kill the italics usage. Have a release period where twiki warns the user if using underline, and after that release, remove the warning.

-- JonathanCline - 11 Nov 2003

You paint a black picture, Jonathan. I agree that we need to keep underlines as emphasis. This does not mean we can't use underscores for wiki links. The only problem is with single wiki words such as Home_; all multiple word links render fine together with emphasis underscores. Feel free to experiment.

-- ArthurClemens - 12 Nov 2003

Disallowing singleton underscore wiki words ( Home_ ) will remove most of the confusion -- at some cost. Perhaps an extra character could be used which would not display in the rendered result ( Home_a ). Or we just stick with double bracket syntax for singleton words ( [[Home]] ).

I think using double apostrophes or quotes is at least as troublesome as underscores. Underscores edge out those two simply by being less visually intrusive. I'm open to exploring an alternate character. The only one at the moment I can think of which might work well is tilde ( ~ ).

-- MattWilkie - 12 Nov 2003

Finding a way of implementing this feature would be good. In a different environment (coding rules for HDL code) it has become clear to me that some people like it one way and some like it the other and that it irks people when a system doesn't work the way they would like, so the feature would assist wiki adoption. Further, I suggest that for people who don't have the language being used as their first language, then underscore-separated names are easier to discern than bumpy case ones.

-- SimonBates - 12 Nov 2003

A simple but slightly less elegant syntax that probably works with the current italics rule is with single wiki words to put the underscore at the start: _Home, _Topic, etc.

Link to _Home

It is trivial to render the underscore invisible with a topic view, so the line will look like:

Link to Home

-- ArthurClemens - 12 Nov 2003

I ought to restate the other idea we had - to make singleton wiki words invokable by a dot prefix. i.e. .Home, .Topic, etc. This already works if you enclose it in brackets i.e. .Home and is a natural simplification of Web.Topic

-- MartinCleaver - 12 Nov 2003

Erm... Arthur, surely moving the underscore to the start is just going to reverse the problem?!? Instead of having trouble deciding where italics end it won't be able to tell where they start.

  • Ehm yes that is true. Except the problem will occur less often. A more solid solution is needed: This can be done by rendering the autolink before the italic (currently it is the other way around) (citing PT) -- AC - 13 Nov 2003

Unless you say that both are valid as singletons and they interact differently with italics depending on position.

  • No, that would be a bad idea. -- AC - 13 Nov 2003

_start _link end_     prepend    start link end
_start end_ _link     prepend    start end link
link_ _start end_     append     link start end
_start link_ end_     append     won't work

But it's not exactly obvious and I think it would be easy to introduce the fourth case without realising while editing.

-- SamHasler - 13 Nov 2003

Just remember that kitchen sink protocols do not survive the test of time. The simpler protocols almost always win. This is why I say use underscore for only one purpose. Sure, it can be done, that's what smart people excel at. But it's not always a good idea. Adding configuration variables does not solve the problem, because people will usually use the default value, and having yet another configuration type only makes testing more difficult.

With that said, do what you feel is right for the product.

-- JonathanCline - 13 Nov 2003

Arthur, could you explain your reasoning behind thinking that prepending would cause problems less frequently?

  • With prepending, italic only fails if the link _Home is at the beginning of the sentence, whereas postpending Home_ will break italic everywhere with the current parsing order. Like I said, this is not really an option. -- ArthurClemens - 14 Nov 2003

Peter, wouldn't rendering the autolink first stop ALL italics from working? Like the following two cases:

  • _Home on the range_
  • _There's no place like Home_

-- SamHasler - 14 Nov 2003

Single wiki words could be done via [[]] forced links, or dot syntax: ._home

BTW do we still need mixed case? If any character , _, ~ or anything makes a word a wikiword, why cling to changed capitalization too? Do we want to talk about LowercaseWikiWords?

-- PeterMasiar - 14 Nov 2003

No, there is no need for capitalization. site_index can be a link too.

-- ArthurClemens - 14 Nov 2003

Users do seem to favour underscore wiki words. It's such a simple rule.

We could reserve _ in the context of a prefix to rename the special topics (Web*), giving us:

  • _Search
  • _Index

etc, which would also have the added advantage that they'd sort to the bottom of the container instead of being interspersed between Website and YellowPages.

I quite like

  • Home_

As the top page.

-- MartinCleaver - 13 Apr 2004

What's going on with this? Is this feature ready to make CairoRelease?

-- MartinCleaver - 07 May 2004

Feature bumped to Dakar due to lack of progress.

-- CrawfordCurrie - 30 Jun 2004

See 2 proposed solutions in TopicSaveErrorWithTopicsContainingSpace. This only needs to be checked.

-- ArthurClemens - 30 Jun 2004

Personally I have ignored this as I hoped it would go away. I keep thinking "more hard coded syntax :(". Now that I've been pointed at JurajVariny's question of what now? and the lack of response and action from anyone else (especially as that patch is a bug fix), I'd like to suggest to those of you that are considering configurable linking styles: this must be able to be made pluggable, at minimum like StoreDotPm and UserDotPm, or possibly as a true plugin.

-- SvenDowideit - 03 Jul 2004

  1. I can only conclude that TWiki ignores all PatchProposals due to a systemic case of poor management
  2. You might want to ignore it but I want this feature. What harm does it do? Is it harder-coded syntax than the messy Web* topics that litter a web on creation?

-- MartinCleaver - 03 Jul 2004

I would also like to voice support for this feature being included in CairoRelease. At the very least, ArthurClemens deserves complete consideration of the two solutions he put forward - if for no other reason than out of respect for his contribution in TWikiUsabilityExpertRole. What good is having such a role if we are going to ignore his suggestions? In this role, he has made a strong case for this relatively modest feature and it's precisely these kinds of details that make a big difference for usability.

Regarding "hard coded syntax", there's tons of it in TWiki so in the absence of a developed alternative, I don't think that's sufficient grounds to dismiss legitimate short-term improvements.

-- LynnwoodBrown - 03 Jul 2004

Excellent. I'm glad to see that there are at least 2 strong supporters of this feature. now write the code, supply the patch, and the documentation, and I'll commit it.

-- SvenDowideit - 04 Jul 2004

OK, I got the message: I'll avoid offering any input unless I can code it.

-- LynnwoodBrown - 04 Jul 2004

interesting. I'd say you did not "get the message" at all. The message is that there are a number of people around here that seem to think that comments like Martin's above are somehow helpful, and that this will somehow give others the motivation to do something they imply needs doing.

the message is somewhat related to: think about doing something to help, rather than complaining that no-one is doing it for you. I'm bloody tired of twiki again, and it's only taken a week or two of trying to get things together for a release.

apparently this has sparked another of my bad moods.

Lynnwood, what did you want me to say? Yes sir, I'll write the code, test it and docco it right now? and I'll get Cairo out at the same time too?

-- SvenDowideit - 04 Jul 2004

I'll second Sven on this.

Rather than just whining about missing features, write the code and generate a well documented patch.

If you don't have the confidence to do that, pick up on one of the open bugs and try to reproduce/analyse it, and free up someone who does.

If you can't write code, find some documentation to write so you free up someone who can code.

If you can't write documentation then pitch in and help answer some still-open user questions in the support web, and free up a developer that way.

If you can't code, analyse, document or answer questions, then maybe you can help out in some other way; be creative.

But please don't just complain.

-- CrawfordCurrie - 04 Jul 2004

My apologies. I know you guys are pushing to complete Cairo and I was being oversensitive to Sven's response. Also, on more careful reading, I see that I was underestimating the work still needed to implement this feature. Crawford -- I will explore your suggestions for redirecting my impatience into moving Cairo forward. Thank you both for all the work that is getting done on Cairo!

-- LynnwoodBrown - 04 Jul 2004

Added Variant #1, and a comment retracting my previous opposition.

-- ColasNahaboo - 05 Jan 2005

Colas, you mention a_b words in your TWiki. In what context are they used? If this is code text, then linking would be turned off in any case...

I've added variant #2, because single words written as Home_ or _Home continue to conflict with italic notation. So that becomes [[Home_]]. Words written as Underscore_word do not conflict with italics.

The regex I have in mind is this:

$regex{wikiWordRegex} = qr/[$regex{mixedAlphaNum}]+[_]+[$regex{mixedAlphaNum}_]*/o;
This will allow H_, Home_, Underscore_word, but also 123_num.
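For readers who want to try the pattern outside TWiki, here is a rough Python transcription. It assumes that `$regex{mixedAlphaNum}` expands to ASCII letters and digits. That is an assumption for this sketch, and a localized TWiki setup may include more characters.

```python
import re

# Assumed expansion of $regex{mixedAlphaNum}: ASCII letters and digits.
MIXED_ALNUM = "A-Za-z0-9"

# Same shape as the Perl pattern: alphanumerics, one or more underscores,
# then any mix of alphanumerics and underscores.
UNDERSCORE_WIKIWORD = re.compile(rf"[{MIXED_ALNUM}]+[_]+[{MIXED_ALNUM}_]*")


def is_underscore_wikiword(word):
    """True if the whole word matches the underscore wikiword pattern."""
    return UNDERSCORE_WIKIWORD.fullmatch(word) is not None
```

Under that assumption, `H_`, `Home_`, `Underscore_word` and `123_num` are all accepted, while `Home` (no underscore) and `_Home` (leading underscore) are not.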

Implementation notes:

  • If the notation is configurable (AUTOLINKTYPE), and it should be, then in TWiki.pm $regex{wikiWordRegex} can only be set after (or in) the new handler.
    1. $regex{wikiWordRegex} can be left as is in the BEGIN handler, and conditionally rewritten in the new handler following the preference setting (ugly code?).
    2. TWiki concept regexes can be set later, in a function called from the new handler.
  • How to read AUTOLINKTYPE from the prefs and set the regex accordingly?
  • After that, I would like to hide the underscores in the links, so they become humanly readable, and as noted at the top, this helps search engine results.

-- ArthurClemens - 26 Mar 2005

I've got it basically working, the creation of the correct links, that is.

You can set AUTOLINKTYPE to wikiword or to underscore or to wikiword, underscore.

A new setWikiWordRegex handler in TWiki.pm sets $regex{wikiWordRegex} to the desired regex.

Question 1: I still have a problem with underscores that occur in code text, like : _your words_: this is rendered as:

_your words_?

some_variable, some_longer_variable and some_ all render OK.

Any idea where to look?

Question 2: the (double) regex for wikiword, underscore becomes long:

([[:upper:]]+[[:lower:]]+[[:upper:]]+[[:alpha:][:digit:]]*)|([[:alpha:][:digit:]]+[_]+[[:alpha:][:digit:]_]*)

Would that make parsing very slow? Ideas for optimizing?

-- ArthurClemens - 12 Apr 2005
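The combined pattern quoted above can be tried out with a small Python transcription. The POSIX classes are replaced by ASCII ranges here, which is an assumption (the Perl classes can be locale-aware). The group names and the `classify` helper are invented for the example.

```python
import re

# [[:upper:]] -> A-Z, [[:lower:]] -> a-z, [[:alpha:][:digit:]] -> A-Za-z0-9
COMBINED = re.compile(
    r"(?P<wikiword>[A-Z]+[a-z]+[A-Z]+[A-Za-z0-9]*)"
    r"|(?P<underscore>[A-Za-z0-9]+_+[A-Za-z0-9_]*)"
)


def classify(word):
    """Return 'wikiword', 'underscore', or None for a whole word."""
    match = COMBINED.fullmatch(word)
    return match.lastgroup if match else None
```

Naming the two alternatives makes it easy to write test cases that state which rule a given word is expected to trigger.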

No, it wouldn't make it slow. But you have to be sure it works, so you need a wealth of testcases that can be used to verify behaviour. I would recommend starting with a topic in TestCases that demonstrates the current behaviour, so you can be sure you don't break current expectations.

Note that you have to be very very careful to make sure your setWikiWordRegex handler is called in the right place.

-- CrawfordCurrie - 13 Apr 2005

I am calling it right after the prefs are fetched, inside new:

$this->{prefs} = new TWiki::Prefs( $this );   
$this->setWikiWordRegex();
This seems to work fine. What should I look out for?

The unwanted linking in code text I have solved in Render::_fixedFontText.

I am testing the whole thing now.

-- ArthurClemens - 13 Apr 2005

That's OK, but I was expecting you wanted plugins to be able to override it. Sorry, my misunderstanding. If it's just hard-coded, why do you need the function?

-- CrawfordCurrie - 13 Apr 2005

To read AUTOLINKTYPE from the preferences. So it is hardcoded, but you can set it in preferences.

How would plugins override it?

-- ArthurClemens - 13 Apr 2005

Implementation

I have tested my code and found no problems so far.

I've added two patches, one for TWiki.pm (the actual underscore support) and one for Render.pm (fixes rendering of code text). The code works, but please check for nicer syntax or optimizations.

For underscore linking to work on a site, add the new variable AUTOLINKTYPE to TWiki.TWikiPreferences: Set AUTOLINKTYPE = wikiword, underscore

These are the cases for AUTOLINKTYPE:

  • not set (empty) : wikiword is assumed
  • wikiword : words written as WikiWord are linked automatically
  • underscore : words written as wiki_word or wiki_ are linked automatically; to prevent unwanted side effects when used together with italic syntax, I advise using [[wiki_]] for singular wiki words
  • wikiword, underscore : both types of words are linked automatically
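The four cases listed above could map onto a pattern roughly as follows. This is a Python sketch, not the patch itself. The function name `wiki_word_regex`, the ASCII character ranges and the error raised for unknown values are assumptions. Only the case table comes from the list above.

```python
import re

# Assumed ASCII versions of the two link styles.
PATTERNS = {
    "wikiword": r"[A-Z]+[a-z]+[A-Z]+[A-Za-z0-9]*",
    "underscore": r"[A-Za-z0-9]+_+[A-Za-z0-9_]*",
}


def wiki_word_regex(autolinktype):
    """Build the link pattern for an AUTOLINKTYPE preference value."""
    names = [part.strip().lower()
             for part in (autolinktype or "").split(",") if part.strip()]
    if not names:
        names = ["wikiword"]  # not set (empty): wikiword is assumed
    unknown = [name for name in names if name not in PATTERNS]
    if unknown:
        # Hypothetical behaviour: reject values the case table does not list.
        raise ValueError(f"unsupported AUTOLINKTYPE value(s): {unknown}")
    return re.compile("|".join(f"(?:{PATTERNS[name]})" for name in names))
```

For instance, `wiki_word_regex("wikiword, underscore")` accepts both `WikiWord` and `wiki_word`, while `wiki_word_regex("")` accepts only the former.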

What's left to do:

  • Render the underscores as spaces, so search engines can index the link labels as real world text, and it reads better
  • Accept underscore wikiwords when creating new topics, possibly only when AUTOLINKTYPE has been set to contain underscore
  • Add finer granularity for NOAUTOLINK: see NOAUTOLINKForUnderscoreWikiWords

-- ArthurClemens - 13 Apr 2005


Sorry to throw a spanner in the works, but I only just realised what you are trying to do here. I think that this is a terrible idea. I'm all in favour of an agreed extension to wikiword syntax to support underscores. What I react to is the idea that it is optional - I think this is a disastrously bad idea.

My basic objection is that by making syntax optional on preference variables, you make it impossible to consistently exchange data between TWiki installations. It also makes it virtually impossible to create tools like WYSIWYG editors that recognise and handle basic TML syntax - because that syntax is not invariant. I have been looking for ways to deprecate the existing optional syntax (NOAUTOLINK), so I obviously can't support the addition of more!

Either extend the definition of TML to include underscore wikiwords, or provide a plugin to handle underscore wikiwords. Please do not make the definition of TML - and therefore the core code - even more complicated this way!

-- CrawfordCurrie - 14 Apr 2005

Well, I am for adding it to the core. But there was a lot of resistance regarding existing content, therefore Peter's suggestion for AUTOLINKTYPE seemed natural.

Wouldn't a WYSIWYG editor use TWiki's engine to render (and recognize) links? Unfortunately I cannot test KupuEditor because it does not work with Dakar code.

About exchanging data: that would still be a problem when working with a plugin. Data that is exchanged would be better off with bracket notation.

-- ArthurClemens - 14 Apr 2005

If existing content is such a big problem, then add a conversion script or a plugin (extension to DefaultPlugin) that protects underscores. But please please please don't make underscore wikiwords configurable on a user (or even web) level! It's a recipe for trouble.

Any time you add context-dependencies into a syntax, you automatically force all tools that handle the syntax to be context-dependent. For example, consider a javascript editor that parses the text and recognises wikiwords, and applies special formatting. How is it supposed to know whether underscores are on or off? If you ask it to configure itself off TWiki, you have added a huge layer of complexity for the implementation. If you ask the user to configure it locally, you have added another layer of complexity for the user, who now needs to know about switchable syntaxes, on top of all the rest of the complexity. This impacts not just editors, but any code that processes TWiki syntax, such as plugins, side scripts and other twiki installations. It is much much much much better to work from an invariant syntax, even if that syntax changes from time to time.

It is far less evil for underscore words that were not previously wikiwords to suddenly become wikiwords, than it is to have this horrible extra complexity.

BTW I think the argument for compatibility presented above is extremely weak. Most code is presented in verbatim brackets, and code in topics is rare anyway.

-- CrawfordCurrie - 14 Apr 2005

Crawford, I agree with your arguments on compatibility between sites.

Then I propose that underscore wikiword syntax be a permanent part of wikiword syntax. My new patch is largely the same as the former patch, only now it does not look at the user's preferences but sets the wikiWordRegex in the BEGIN handler.

I did make a small change to the underscore regex expression; valid underscore wikiwords now:

  • May consist of several words, concatenated by underscores, like: Wiki_word
  • May be a singular word that ends with an underscore, like: Home_
  • May not otherwise end with an underscore; Some_underscore_topic_ is not valid syntax
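One pattern that satisfies the three rules above, written as a Python sketch. It is not the expression from the patch, which is not shown here. Rejecting doubled underscores (a__b) is an extra assumption of this sketch.

```python
import re

# Either several alphanumeric words joined by single underscores, or one
# alphanumeric word followed by exactly one trailing underscore.
REVISED = re.compile(r"[A-Za-z0-9]+_(?:[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*)?")


def is_valid_underscore_word(word):
    """Check a whole word against the revised underscore wikiword rules."""
    return REVISED.fullmatch(word) is not None
```

Here `Wiki_word` and `Home_` are valid, and `Some_underscore_topic_` is rejected because only a singular word may end with an underscore.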

What is cool is that when you create a new topic with an underscore there is no complaint that this is not a valid wikiword (when the patch is applied). Just glad this works.

Question: to render the underscores in wikiword links as spaces, would the best approach be to do this in a plugin? SpacedUnderscoreWikiWordPlugin perhaps?

-- ArthurClemens - 16 Apr 2005

Wherever you implement spaced underscore wiki words, I suspect it's time to make the RenderDotPm function pluggable. I unpicked this function from the cuffuddle with the intent of making it available for override by the SpacedWikiWordPlugin

Hmm. We should automate benchmarking the individual hooks - an override of that function would get called a lot!

-- MartinCleaver - 16 Apr 2005

Replacing underscores with spaces is really no different to spacing out wikiwords. i.e. it should be done in a plugin, and not in the core. However I would suggest building it into an existing plugin to simplify matters for installers; if they have to install a new plugin for every syntactic nuance, it is not only inefficient but also a lot of work!

BTW I am really looking forward to being able to use underscores in wikiwords!

-- CrawfordCurrie - 17 Apr 2005

But if we would use SpacedWikiWordPlugin, then all wiki words would be spaced, wouldn't they?

-- ArthurClemens - 17 Apr 2005

Well, you would obviously want to make it configurable for the local installation; but that would be the logical place to combine the functionality, I guess.

-- CrawfordCurrie - 17 Apr 2005

Current status for rendering underscores as spaces: waiting for the rewriting of SpacedWikiWordPlugin (by ?).

-- ArthurClemens - 28 Apr 2005

Crawford's assertion that most code appears in verbatim brackets and that it's rare anyway may be true of TWiki.org and his wikis, but it's most definitely NOT true of the TWiki webs we have here. We have all kinds of references to variables, functions, and code of all types containing underscores sitting verbatim in text in our webs. If we're lucky they're marked as typewriter with equals signs (not that this matters).

It will make me very sad and distressed when all of those turn into wikiwords :-(. It will make me even more sad and distressed to tell all my users they now have to prefix all their underscored text with .

One of the big bonuses to TWiki for us has been that we can embed code, etc. in the text without hassle. If I want a more readable link I use the [[]] syntax, and I see no reason why that's not good enough for anyone else.

Sorry, but I give this change a big thumbs down.

-- PaulSmith - 21 Jun 2005

"If I want a more readable link I use the [[]] syntax, and I see no reason why that's not good enough for anyone else."

Because the topics will still be named according to WikiWord syntax - and will appear so everywhere. There is currently no way to have topics named differently. This is a problem in non-English languages like German or Dutch or Italian.

Also 'geeky' WikiWords stand in the way of general adoption of TWiki.

I hope you can think with us towards a creative solution.

For starters, your code text does not have to be prefixed, only written as code between equals signs.

-- ArthurClemens - 21 Jun 2005

I'd like to point out to people that the WikiWord syntax is historical, and that there has been a better proposal in existence for a long time.

instead of defining a 'new' syntax that is just as geeky, and worse, is specific to TWiki, it is possible to allow free text linking of any kind.

thus you would be able to create a topic called "free text linking", and wherever this is found, you would get a link.

there is quite a bit of work to be done to make this possible, as you have to reverse the lookup of links when rendering, but I think this would produce the best possible wiki.

  • this is an idea that has been toyed with on IRC quite a bit, but no one's taken the time to implement it - the changes in DevelopBranch will make it easier though..
  • creating new topics would require either a form, or some javascript
  • some support for overlapping topic names would be required
  • WikiWords would still be supported, as any existing topic would match - though a side benefit is that if you get the case wrong, it would still link.
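The reverse lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from TWiki or DevelopBranch: `link_free_text` is a hypothetical helper, the topic list is passed in by hand, and `[[...]]` is used merely to mark where a link would be produced. It tries longer topic names first as one possible way to handle overlapping names, and matches case-insensitively.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TWiki): wrap every occurrence of a
# known topic name in [[...]] to mark where a link would be rendered.
sub link_free_text {
    my ($text, @topics) = @_;

    # Longest names first, so "free text linking" wins over "free text"
    # when both topics exist (one way to handle overlapping names).
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @topics;
    return $text unless length $alternation;

    # Case-insensitive: getting the case wrong still produces a link.
    $text =~ s/\b($alternation)\b/[[$1]]/gi;
    return $text;
}

print link_free_text('Free Text Linking beats free text',
                     'free text', 'free text linking'), "\n";
```

Note that every render has to consult the full list of topic names, which is the "reverse the lookup" cost mentioned above.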

Also, I prefer the exact opposite to what Paul mentions - I want code to be marked up if there is a topic by that name. But I would prefer there not to be a ? inserted into the code..

-- SvenDowideit - 22 Jun 2005

A low-tech solution would be to write [[free text linking]]. We could almost do this with TWiki, but we are not allowed to create topics with spaces.

This problem of topics with spaces exists in Sven's idea too. It would be possible of course to transform spaces to %20. But then searching in topic names would need to be adjusted to this syntax as well.

-- ArthurClemens - 21 Jul 2005

Free text linking has a major problem, as we discovered when discussing on IRC: http://koala.ilog.fr/twikiirc/bin/irclogger_log/twiki?date=2005-10-03,Mon&sel=705#l701

Free text linking as I understand it lets the user create a topic "my new topic" and all occurrences of my new topic will become links, without interference of anyone.

Imagine someone has written a topic named About: all occurrences of About suddenly become links, all over the wiki.

The best proposal I see is still to use underscores.

To tackle the recurring argument "instead of defining a 'new' syntax that is just as geeky, and worse, is specific to TWiki" (nothing personal, SD): this is just not true for non-English languages. The rest of the world would welcome a different syntax.

(the attached patch marked DEPRECATED still works on TWiki.pm r.6751).

-- ArthurClemens - 03 Oct 2005

ok, from the diversity of our requirements, it's pretty obvious that we will need to extract the linking code to provide different linking styles: traditional wiki links, Wikipedia-style links (they make the same argument as you do, Arthur), and the brutal "everything links" style that Travis and I are interested in. I would however suggest that using the Wikipedia syntax for this would be worth considering more, as in [[free text]].

-- SvenDowideit - 04 Oct 2005

While bracket notation serves many of my needs, it does not cover all cases. I want the topic names in the window title, the breadcrumbs and the search results to be readable too. If a topic is called "My topic" it should be rendered My topic also in these cases.

I think that the easiest approach is to use character substitution for illegal filename characters. So when creating the new topic "My topic" the filename would become My+topic or My_topic (the latter is more readable in a URL). Character substitution would be even more user friendly than allowing underscores in topic names.

When drawing the topic name on the screen the character substitution should be reversed. When using underscores, all underscores in a topic name would be replaced by spaces.
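A minimal sketch of that substitution and its reversal, assuming topics are stored as `.txt` files and that the space is the only character being substituted. Both helper names are hypothetical and do not exist in TWiki.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a readable topic name into a filename by
# substituting underscores for spaces.
sub topic_name_to_filename {
    my ($name) = @_;
    (my $file = $name) =~ tr/ /_/;
    return "$file.txt";
}

# Hypothetical helper: reverse the substitution when drawing the name.
# Every underscore becomes a space, as described above.
sub filename_to_topic_name {
    my ($file) = @_;
    (my $name = $file) =~ s/\.txt\z//;
    $name =~ tr/_/ /;
    return $name;
}

print topic_name_to_filename('My topic'), "\n";
print filename_to_topic_name('My_topic.txt'), "\n";
```

One consequence of this design is that a topic name containing a literal underscore cannot round-trip: it would come back with a space in its place.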

This still leaves open how to create/write topics that consist of a single word. The rules I've described somewhere above say that Home_ and Topic_ are allowed topic names, but to write them in a topic you must use bracket notation: [[Home_]].

Meanwhile I am interested how you would tackle the problems with autolinking (for instance the topic "About" above).

-- ArthurClemens - 04 Oct 2005

I don't see autolinking About as a problem. Not even a little problem. If a WikiCommunity thinks that a topic called About is justified, it probably deserves to be a link everywhere. But I also understand the Wikipedia point of view - they believe that too many links distract from the body of work, and in their context (a more general work) it makes sense. In a more domain-specific area, or a product development collaboration wiki, it seems to me that if something has a topic, it's important to that domain.

I also feel that in an autolink-anything context, links need to be less distracting from the text (that assumes of course that you have a pretty complete topic set for that body of work).

another thing that I would like to suggest, is that the Filename on disk does not need to resemble the UrlName, nor the rendered version. You need to provide sufficient hints for the renderer to make correct choices, but spaces and case could be considered to be non-syntactical embellishments like italics and bold

eg. 'ThisIsATopic' is the same as 'This Is A Topic' is the same as 'This isATopic' and 'thisisatopic'. and it's possible to have the filename be ThisIsATopic.txt. the admin can then choose how it's rendered either in how they type it, or how the renderer over-rides it.
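One way to picture this: reduce every reference to a lookup key by dropping whitespace and case, and keep an index from key to file on disk. The sketch below is an assumption-laden illustration, not TWiki code; `topic_key`, `find_topic_file` and the `%file_for` index are all hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: spaces and case are treated as embellishments,
# so they are stripped before the lookup.
sub topic_key {
    my ($ref) = @_;
    (my $key = lc $ref) =~ s/\s+//g;
    return $key;
}

# Hypothetical index from lookup key to the file on disk.
my %file_for = ( topic_key('ThisIsATopic') => 'ThisIsATopic.txt' );

# Hypothetical helper: returns the filename, or undef if no topic matches.
sub find_topic_file {
    my ($ref) = @_;
    return $file_for{ topic_key($ref) };
}

for my $ref ('This Is A Topic', 'This isATopic', 'thisisatopic') {
    print "$ref -> ", find_topic_file($ref), "\n";
}
```

How the name is then displayed (as typed in the reference, or as overridden by the renderer) is a separate choice that this lookup leaves open.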

the main concern over doing this, is that we have to reverse the way we look for topic matches, and how we create new topics

I thought that there was a SpacedWikiWordPlugin that rendered all topics with spaces? is that the end result you desire?

-- SvenDowideit - 04 Oct 2005

No, Plugins.SpacedWikiWordPlugin creates silly spaces: Home Page, Action Script, Coffee Break; or capitalizes (logical if these are wiki words): The Design Of Sites, Buttons That Work As Movieclip, Location Of Flash Cookies On Disk - all plain silly in English, let alone in German or Dutch.

But the effect that non-word characters are rendered as spaces is desired, and Plugins.SpacedWikiWordPlugin could be used for that.

-- ArthurClemens - 04 Oct 2005

oh, you want capitalisation to be significant - OK, i understand - good point! doesn't fit into what i'm after, as I want the topic's capitalisation to be irrelevant, and for it to come from the reference (so i could write Homepage, HomePage, Home page and Home PaGe, and all would link to the one topic, rendered as written on the topic (maybe))

-- SvenDowideit - 04 Oct 2005

I see that there are 2 directions we can take:

  1. Use the filename and break it down (substitute) to readable topic names.
  2. Use an arbitrary filename and use it as ID to look up a meaningful value (the name).

-- ArthurClemens - 04 Oct 2005

I would paraphrase the second option in a different way: use the meaningful name as a lookup key for the file.

-- RafaelAlvarez - 05 Oct 2005

This could be a concept for using generated (internal) topic names as well, couldn't it?

-- FranzJosefSilli - 05 Oct 2005

With regard to free text linking (across phrases of words separated by [-_\s]), see HackingOutWikiAnnoyance for an existence proof and additional justification.

Salient comment is that this is good in some (explorative) environments, and maybe less good in others (as Wikipedia).

-- JeffPeck - 28 Oct 2005

Recap

It's time to see where we are now. Lots of people have been for this proposal, some against. We've seen other proposals as well, but these are far from implementation. Time to get more practical.

We've seen that underscores can create conflicts with italic TML. My example for Home_ was bad in that regard. The way to keep italic TML intact is to only accept underscores in the middle, like Page_name.

So:

_italic sentence with Topic_

will render:

italic sentence with Topic

and

_italic sentence with Topic_One_Two_

renders:

italic sentence with Topic_One_Two

Underscores in code text will not be rendered as link:

=Func_name=

will render:

Func_name

A problem could exist for filenames, like my_file_name, but only if the name is not inside code TML. The problem could be lessened by allowing only words that start with uppercase, so my_file_name would not get linked, but My_file_name would. Personally I think this problem is negligible and the stricter syntax would deprive us of flexibility.
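For comparison, the stricter uppercase-only rule could look something like the sketch below. It is a hypothetical variation on the proposed underscore pattern, not part of any attached patch, and the character classes are ASCII-only stand-ins assumed here for illustration.

```perl
use strict;
use warnings;

# ASSUMPTION: ASCII-only stand-ins for TWiki's character classes.
my %regex = (
    upperAlpha    => 'A-Z',
    mixedAlphaNum => 'a-zA-Z0-9',
);

# Hypothetical stricter variant: the first character must be uppercase,
# an underscore must appear in the middle, and the word must end in an
# alphanumeric character.
my $strictUnderscoreWord =
    qr/\b[$regex{upperAlpha}][$regex{mixedAlphaNum}]*[_]+[$regex{mixedAlphaNum}_]*[$regex{mixedAlphaNum}]\b/;

# Hypothetical helper: true if the whole word matches the stricter rule.
sub is_strict_underscore_word {
    my ($word) = @_;
    return $word =~ /\A$strictUnderscoreWord\z/ ? 1 : 0;
}

for my $word ('My_file_name', 'my_file_name', 'Func_name') {
    print "$word: ", is_strict_underscore_word($word), "\n";
}
```

Identifiers that happen to start with a capital, such as Func_name, would still match, so the stricter rule narrows the problem rather than removing it.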

All in all the changes are so small that an extra configuration option would not be necessary.

The regex variables for this change are:

$regex{classicWikiWordPattern} = qr/[$regex{upperAlpha}]+[$regex{lowerAlpha}]+[$regex{upperAlpha}]+[$regex{mixedAlphaNum}]*/o;
$regex{underscoreWikiWordPattern} = qr/\b[$regex{mixedAlphaNum}]+[_]+[$regex{mixedAlphaNum}_]*[$regex{mixedAlphaNum}]\b/o;
$regex{wikiWordRegex} = qr/$regex{classicWikiWordPattern}|$regex{underscoreWikiWordPattern}/o;
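To try these patterns outside TWiki, something like the following standalone sketch can be used. The three pattern definitions are copied from above; the character classes are ASCII-only stand-ins assumed for illustration (the real values come from TWiki.pm and may include locale-specific letters), and `find_wiki_words` is a hypothetical helper.

```perl
use strict;
use warnings;

# ASSUMPTION: ASCII-only stand-ins for TWiki's character classes.
my %regex = (
    upperAlpha    => 'A-Z',
    lowerAlpha    => 'a-z',
    mixedAlphaNum => 'a-zA-Z0-9',
);

# The three patterns as proposed above.
$regex{classicWikiWordPattern} = qr/[$regex{upperAlpha}]+[$regex{lowerAlpha}]+[$regex{upperAlpha}]+[$regex{mixedAlphaNum}]*/o;
$regex{underscoreWikiWordPattern} = qr/\b[$regex{mixedAlphaNum}]+[_]+[$regex{mixedAlphaNum}_]*[$regex{mixedAlphaNum}]\b/o;
$regex{wikiWordRegex} = qr/$regex{classicWikiWordPattern}|$regex{underscoreWikiWordPattern}/o;

# Hypothetical helper: list every substring the combined pattern matches.
sub find_wiki_words {
    my ($text) = @_;
    my @found = $text =~ /($regex{wikiWordRegex})/g;
    return @found;
}

for my $text ('see Page_name and WikiWord', 'Home_', 'my_file_name',
              '_italic sentence with Topic_') {
    my @found = find_wiki_words($text);
    print "'$text' => [@found]\n";
}
```

With these stand-ins, Page_name, WikiWord and my_file_name match, while Home_ and the trailing Topic_ of an italic sentence do not. On raw text, `_italic sentence with Topic_One_Two_` also yields no match, because `\b` sees no word boundary between the final letter and the closing underscore; the word is only found once the surrounding underscores are gone. That looks consistent with the rename caveat in this recap, though the order in which TWiki applies emphasis and linking is not shown here.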

Further updates: The WebTopicCreator script would need to be updated. And the rename script would need extra care: it does not recognize the wikiword if it is at the end of an italic sentence.

-- ArthurClemens - 31 Mar 2007

"this problem is neglegible" is a major understatement.

This major change in the definition of "WikiWord" will be a major problem. We will have so many false links on our Motorola wiki that you cannot imagine it.

Even on my Motion TWiki I will get a major problem. All my Motion options have underscores in them and would become links pointing to nowhere.

You cannot make this change without pissing off all our existing users. Most older TWikis are used in typical technical/software environments. The geeky syntax is not generally accepted by people who are not technical. So most TWikis used professionally will be full of information related to software, and software information is full of underscore words, only a fraction of which will be inside verbatim tags or have = around them.

I know this for a fact. I simply changed the code as suggested above and the result is pure disaster for both my Motion TWiki and our Motorola TWiki. The Motion TWiki may be manageable. It only has 1200 topics. Our Motorola TWiki has orders of magnitude more topics. There is no way we can fix all those unwanted links that will arise from this change.

This change will be a disaster for many existing TWikis.

-- KennethLavrsen - 01 Apr 2007

I've written a new proposal in UnderscoreWikiWordsWithoutSyntaxChange.

-- ArthurClemens - 01 Apr 2007

Topic attachments
  • underscore_wiki_words_no_prefs.diff (r2, 1.5 K, 2005-04-16 19:11): Adds support for underscore wikiwords as base syntax (no preference to set); diffed against TWiki.pm SVN r.4027
  • underscore_wikiword_patch_RenderDotPm.diff (r1, 0.3 K, 2005-04-13 20:55): Corrects unwanted linking in code text; diffed against Render.pm SVN r.4006
  • underscore_wikiword_patch_TWikiDotPm.diff (r1, 2.2 K, 2005-04-13 20:53): DEPRECATED. Adds support for underscore wikiwords; diffed against TWiki.pm SVN r.4006
Topic revision: r86 - 2007-04-01 - ArthurClemens
 
Copyright © 1999-2026 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.