Move TOC to a Plugin
Vote on Proposal
Statistics on Vote
| 1 |
Strongly disagree |
2 |
| 2 |
Disagree |
4 |
| 3 |
am Neutral |
2 |
| 4 |
Agree |
3 |
| 5 |
Strongly Agree |
6 |
|
|
This proposal would move it out of the
TWikiKernel and into a plugin bundled in the
CorePackage.
Summary
Heavily refactored; see previous revs for older discussion. Crawford has tried to capture all the points below.
Why isn't TOC a plugin? Some people don't think it should be, because
- it is a common function which is widely used in TWiki installations (at least to read TWikiDocumentation)
- a plugin can implement it's own %TOC overriding the system default handling if necessary
- here is a possible performance hit if the TOC handling is moved to a plugin
- it messes up the variable evaluation order, breaking existing TWikiApplications
- the existing TOC handling can be optimized in the core, such as with lazy loading
- Though this effectively adds a new "type" of plugin module, adding to the complexity of the core code, and thus is against the TWikiMission - CC
- lazy loading is one possible solution; lazy loading of just the TOC does not require a new "type" of plugin module; actually lazy loading is not really appropriate for 65 lines of code. -- PTh
- most intranet TWiki installations (aka TWikiMission) use TOC quite frequently, diminishing the need to make TOC optional
On their own, these are valid arguments; however
- the desire to lighten and simplify the core, and remove glaring code carbuncles like TOC which block rendering optimisation,
- the fact that TOC in a plugin could be more easily customised, or even drop-in replaced,
- the fact that some TWiki users never use TOC, and would disable the plugin
- the desire to make plugging-in similar functionality easier for plugins authors (any plugin which scans the entire topic text the way the TOC plugin does)
suggest that it still might be a good idea to move the TOC handling out of the core.
Here are the technical issues...
%TOC%
looks like any other
TWikiVariables, but it isn't; it's special. To understand why, you have to consider the tag expansion process, and the plugin entry points in it. Here's a (crude) overview of the whole rendering process:
- Take out verbatim, rendered etc
- Call registered tag handlers (this includes all standard tags such as %SERVERTIME%, any tags registered by plugins using
registerTagHandler, and any tags implemented using Meredith's TWikiFns approach)
- Call
commonTagsHandler in each plugin
- Call registered tag handlers again, in case plugins added some new ones
- Handle TOC
- Put back verbatim
- Render TML syntax (---+, | etc)
So, first let's consider how to write TOC as a registered tag handler, and remove step 5.
The first problem is that TOC expansion relies on
TML syntax; it uses ---+!! to escape headings from the TOC, so TOC expansion has to be done before
TML rendering munges the headings. The second problem is that TOC expansion is not an incremental process, it's a text substitution. So it's a
once only process, and has to be done
as late as possible in the rendering process to allow plugins the
most opportunity to add new headings, before TOC eats them. The approach of registering tag handlers is predicated on two assumptions, neither of which hold for %TOC:
-
- The order in which TWikiVariables are expanded doesn't matter (beyond the inside-out-left-right rule)
- A twiki variable can get all the information it needs from the twiki object. It does not need to know what context it is being rendered in (the surrounding text)
Unlike any other
TWikiFn, %TOC requires access to the
full text of what is being rendered. Since tags are expanded in the order they are encountered in the text (inside out, left to right) a registered tag handler for %TOC tag would be expanded before a %INCLUDE later in the topic. As such, the text included by the %INCLUDE will not be available for processing by the %TOC tag handler. Taken altogether, these problems indicate that TOC cannot be implemented using a registered tag handler.
How about implementing TOC as a plugin? It would seem to fit reasonably well as a
postRenderingHandler.
This time the main problem is one of ordering. As stated above, TOC has to be expanded
as late as possible in the rendering process. Since another plugin (never mind a
TWikiFn), could add heading syntax that TOC needs to process, it needs to be done as late as possible in the plugins rendering process. Since TWiki only supports putting plugins
early in the rendering process but not
late (using
{PluginsOrder}), there is no way to mandate its position in the run order. Of course, you could add some jazzy extended syntax to
PluginsOrder to make that possible e.g. ellipsis
$TWiki::cfg{Plugins}{PluginsOrder} = SpreadSheetPlugin,TablePlugin...TableOfContentsPlugin
but that is not there in existing installs, so would have to be added as an upgrade step. More complexity for the installer.
Also, we must consider the possible performance hit if the TOC handling is moved to a plugin. There are some who believe that each plugin adds up to 3% to the runtime of a view.
Those are the problems. Anyone feel motivated to solve them?
--
Contributors: CrawfordCurrie,
PeterThoeny - 10 Aug 2006
Discussions
Crawford summarized the issues well if TOC is moved into a Plugin. I am primarily concerned about performance and about breaking existing
TWikiApplications if TOC is moved into a Plugin. The existing TOC handling can be optimized in the core, such as with lazy loading. Plugins that need a better TOC handling are free to steal the
%TOC% before it reaches the internal TOC handling.
--
PeterThoeny - 10 Aug 2006
There are two main points here. The first is that lightening the core is an important goal. Adding lazy loading of yet another piece of custom code is really not the way to go, as we just end up duplicating code that other people will also want to use. The second is that TOC is not unique. Processing the entire topic text after rendering is otherwise complete is actually quite a common requirement; for example, I really need to do it in the action tracker. Especially irritating is the fact that at the moment, there is no way for a plugin to process the text
after the TOC has been built. I really don't want to add yet another type of plugin module (
TWikiFns and
TWikiPlugins is enough), and adding a
postTOCHandler is just silly, so using the plugins mechanism to do this sort of processing seems to make a lot of sense. There is no reason for
TWikiApplications to break as a result of this; we are not moving the TOC processing in the rendering order. That's what ellipsis mechanism in the plugin processing order is for. Obviously we still need to look carefully at what happens between the last call to
postRenderingHandler and the call to
handleTOC to ensure nothing breaks.
The point about performance is well made, though I would observe that as we shrink the main modules of plugins and add lazy loading, the performance burden shrinks. I think the key performance consideration these days is the parsing of the plugin preferences from the plugin doc page. If we could eliminate that, perhaps by flagging in the plugin code module that preferences are not required, then we could accelerate plugins further.
--
CrawfordCurrie - 11 Aug 2006
And the links created by
TOC when including several topics with identical heading structure (pretty common usecase in technical documents) is still wrong, IIRC. --> There's need for action.
--
FranzJosefSilli - 11 Aug 2006
Agreed, we should first fix the issue we have with TOC.
On lazy loading: Will there a measurable benefit in doing so for TOC? The
_TOC function has 65 lines of code. Also, how does moving 65 lines of core code into a Plugin affect the performance? Each Plugin does extra runtime initialization stuff (function discovery etc.)
On evaluation order change if TOC is moved into a Plugin, and called as the last Plugin: Consider this scenario:
- Take out verbatim, rendered etc
- Call registered tag handlers
- Plugin's
commonTagsHandler is called in sequence:
- MyPlugin adds text with headings
- XyzPlugin adds new text that has an
%INCLUDE{}%
- SpreadSheetPlugin creates new text that is a
%SEARCH{}%
- TocPlugin is called at as the last Plugin
- Call registered tag handlers again, in case plugins added some new ones:
- N/A for new headings introduced by MyPlugin
-
%INCLUDE{}% introduced by XyzPlugin is expanded, introducing new headings
-
%SEARCH{}% introduced by SpreadSheetPlugin is expanded, introducing new headings
- Put back verbatim
- Render TML syntax (---+, | etc)
As you can see, the TocPlugin will miss headings introduced by other Plugins. Will this break some
TWikiApplications? Will other special rendering, such as
NumberedHeadings and
NumberedHeadingsWithNesting still work if TOC is moved to a Plugin?
--
PeterThoeny - 11 Aug 2006
TocPlugin must not perform it's magic in the commonTagsHandler or as a registered tag. It must perform it's magic in the
postRenderingHandler . That's what I understood from the Crawford's explanation.
If done that way, then only those headers introduced in the
postRenderingHandler may be missing. So the border case is a lot narrower than the one if Toc is implemented in the
commonTagsHandler
--
RafaelAlvarez - 12 Aug 2006
What is the point of moving this basic function to a plugin? It must be a handfull of TWiki installations out there that do not use the TOC feature. It is so basic and so necessary that I find it absolutely unjustified to move it to a plugin. On the contrary there are plugins that should be integrated into the core like table plugin.
Moving this to a plugin sound more like a religious idea. I do not see what the users get out of it. I do not see how performance can get any better. And if you believe it will then just comment out the 65 lines of code and tell us how many milliseconds you saved on a topic that did not already have a TOC. And then tell us how much you add again of time by moving this to a plugin.
I think precious coding resources are better spent closing the close to 100 open bugs including making the TOC working properly in the first place. And profiling Motion and find out where the real CPU eating time has gone. What was it that changed since Cairo that added the 40%? It was not TOC. It was already there in Cairo in the core.
--
KennethLavrsen - 12 Aug 2006
Kenneth, do we stop talking about the colour of the sunset just because the world is at war? Of course there are more important things to discuss; but we still have to have our dreams. This is actually quite practical; I need to process the text in a plugin
after the TOC is complete, and I can't currently do that.
As Rafael says, TOC has to be implemented later than
commonTagsHandler. On examining the code, I realised it has to be in
afterCommonTagsHandler (a handler I had never noticed before!)
Peter asks rightly what the performance impacts are of a move to a plugin. Yes, there is plugin initialisation cost, albeit static and therefore optimised out by mod_perl. As I said before, I think the major cost on each run is the parsing of the plugin topic (I think this is the major problem for plugins, though I have never been able to benchmark accurately enough to confirm this). Whoever wants to make this change will need to benchmark it.
I don't know how sensitive other plugins, such as those that number headings, are to the evaluation order. But my proposal is not to move the TOC handling very far; just into the end of the plugins evaluation order. At the same time, I want to add the ellipsis into the evaluation order, to allow plugins to be called as late as possible.
--
CrawfordCurrie - 13 Aug 2006
And we need fresh and good and exciting ideas to "trap" newcomers. There is nothing more disheartening for a newcomer coder that receiving a "Don't code anything new. first, go bugfixing until the code is rigth" speech. I bet most coders come to twiki because they need a new feature or a change in a feature, or to fix a bug THEY discover and need.
Moving TOC out to a plugin may be academic, but the only appealing argument I have seen against it is "performance of the plugins". Well, why don't finally "fix" the performance of the plugins? If there is too big an overhead, and as the typical TWiki installation has at least 10 plugins running, why are we wasting time profiling TWiki looking for non-existant hotspots instead of trimming the low hanging fruits?
--
RafaelAlvarez - 13 Aug 2006
- This proposal did not exactly come from a newcomer.
- Dreaming about sunsets is fine. To me this proposal seems more like a dream about the colour of the water pump of wife's Toyota Corolla.
- The goal of this proposal was lightening the core. But the idea of doing that is missed if you end up having the same feature in a plugin which overall will be slower. A light core only makes sense to me when the feature moved out is one rarely used. You cannot say that about the TOC which is used all the time.
Rafael, let us revisit the key performance numbers. Code execution (excluding the skin loading).
- Cairo without plugins - 0.67 seconds
- Dakar without plugins and no I18n - 1.06 seconds
- Cairo with 18 plugins - 0.97 seconds
- Dakar with 18 plugins and no I18n - 1.35 seconds
Let the numbers talk.
- Dakar core is significantly slower than Cairo core. We are not wasting time by profilling TWiki core! And it is not low hanging fruits. If it was, the core developers would have fixed them a long time ago.
- The plugins - even when not used - add a significant extra execution time. On average 2% relative to core execution time for each plugin. Or 20 ms on my test machine.
Moving TOC to a plugin as we know plugins today will cost another 20 ms execution time which we will not at all gain from removing the few lines of TOC code from the core. And the gain for the user is NONE as I see the proposal described.
And that is why I am not exactly excited about the colour of this particular sunset.
--
KennethLavrsen - 14 Aug 2006
Please take performance discussions to topics that are discussing performance. This topic is brainstorming a change to the rendering order.
Performance improvement is not in need of coders so much as it is in need of a champion, who is
driving for performance improvement. This can't be done by complaining, it has to be done by taking a lead (leading by example). I am somewhat disappointed that despite the kicking and screaming,
- There is still no publicly available test/benchmarking server.
- There is no progress on the benchmarking framework. My AthensMarks framework has apparently been discredited, but nothing has replaced it.
- No-one has described a process for profiling, without which optimisation is guesswork.
Kenneth, the drive for the change to TOC is actually the need to run a plugin handler over the text
after the TOC is complete. This is a requirement driven by a user.
--
CrawfordCurrie - 14 Aug 2006
We are discussing the performance impact of this proposal.
And moving the TOC to a plugin will have a negative performance impact - unless some of your proposed ideas of minimizing impact of the default execution of plugin loading will create a positive effect.
On your 3 points.
- test/benchmarking server. I will follow up on that in another topic. I have a simple idea.
- Discussions about AthensMarks is far out of scope of this topic.
- Noone have done anything the past month. Including yourself! People have been on vacation. A well deserved one.
Perhaps you can explain the real drive behind the TOC in a plugin because it seems to be a quite different reason that the original proposal.
And I would also like to learn the arguments why the same thing cannot be done within the core. After all it is the core that decides when plugins are running.
Please remember my mission here in this discussion. Avoid that this proposal further slows TWiki. I am not opposing the idea itself. But from the original proposal I could not - and still cannot - see what the enduser gains from the proposal but I can see another 20 ms delay.
--
KennethLavrsen - 14 Aug 2006
OK, I'll try again. I originally refactored the old topic with the technical issues without clearly stating my proposal, so here goes.
Imagine I have plugin that post-processes tables of contents. For example, it might change the target of links, or it might collate TOCs into a central TOC database. Further, imagine I have
several plugins that want to pre- and post-process TOCs. I need to be able to control the order in which they are invoked. I want to run some post-processors
before the TOC is assembled (so they can add heading magic) and some after.
The impact on the end user is that plugins authors can now write plugins that post-process TOCs.
--
CrawfordCurrie - 14 Aug 2006
Crawford, these are new use cases that are plausible. I think it is possible to accomplish this with the current implementation:
- The before TOC processing (including TOC overloading) can be done by Plugins in the
commonTagsHandler
- The after TOC processing can be done by Plugins in the
afterCommonTagsHandler
--
PeterThoeny - 16 Aug 2006
True. But is it really sensible to do that?
BTW I deleted some of the bullet points you added because they duplicated points that had already been made.
--
CrawfordCurrie - 16 Aug 2006
Could TOC be implemented as a plugin if on the first call to registered tag handlers it just outputs another tag that it would then handle properly on the second call?
Or am I missing something that is not obvious from the simplistic description of the expansion process above.
--
SamHasler - 16 Aug 2006