Question
How does one find a topic that is no longer being linked to?
It seems that a wiki would get a lot of these, but I haven't found how to find/delete/handle them.
- TWiki version: Dec 2001
- Web server: Apache 1.3.26 w/mod_perl
- Server OS: FreeBSD 4.6-stable
- Web browser: IE 6
- Client OS: W2K
--
ElliotFinley - 15 Jul 2002
Answer
This is not yet in TWiki.
Some experimenting has been done in the opposite direction, e.g.
Codev.FindReferencedButNotDefinedWikiWords. If you drive this forward, we'd like to take it into the release.
--
PeterThoeny - 17 Jul 2002
Response to Answer
I have an algorithm that should work, but I'm not really sure how to implement it in TWiki.
topic_list = <generate topic list from directory structure>
push(links_to_follow_queue, <root twiki page>)
links_followed = <empty>
while (there are links to follow in the links_to_follow_queue) {
    current_link = shift(links_to_follow_queue)
    push(links_followed, current_link)
    links_from_current_link = <harvest links from current link>
    foreach link (links_from_current_link) {
        # skip links already visited or already waiting in the queue
        next if link is in links_followed
        next if link is in links_to_follow_queue
        push(links_to_follow_queue, link)
    }
}
remove entries that are in links_followed from topic_list
anything left in topic_list is an orphaned topic
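For anyone who wants to try the idea outside TWiki first, here is a minimal runnable sketch of the traversal above in Python (not TWiki's Perl). It assumes the links have already been harvested into a dictionary; `links`, `find_orphans` and the topic names are made up for illustration and are not part of any TWiki API.

```python
from collections import deque

def find_orphans(links, root):
    """Return topics that cannot be reached from root by following links.

    links maps every existing topic name to the list of topic names it
    links to (hypothetical, pre-harvested data).
    """
    visited = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            # Ignore links to topics that do not exist and topics already seen.
            if target in links and target not in visited:
                visited.add(target)
                queue.append(target)
    return sorted(set(links) - visited)

links = {
    "WebHome": ["TopicA", "TopicB"],
    "TopicA": ["TopicB"],
    "TopicB": ["WebHome"],
    "TopicC": ["TopicA"],   # links out, but nothing links to it
    "TopicD": ["TopicE"],   # D and E only link to each other
    "TopicE": ["TopicD"],
}
print(find_orphans(links, "WebHome"))  # ['TopicC', 'TopicD', 'TopicE']
```

Marking a topic as visited when it is queued, rather than when it is dequeued, keeps the same topic from being queued twice.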
--
ElliotFinley - 22 Jul 2002
This sounds like a FindOrphanedTopicsPlugin project. Some useful functions:
- TWiki::Func::getPublicWebList() returns a list of all public webs.
- TWiki::Func::getTopicList( $theWebName ) returns a list of all topics in a web.
- TWiki::Func::readTopic( $theWebName, $theTopic ) returns the text of a topic.
The criteria for a link is basically all internal links TWiki understands like:
WikiWordTopicName, Webname.WikiWordTopicName,
[[Any-Internal_Name]], [[Webname.Any-Internal_Name]],
[[Internal name with spaces]], (points to InternalNameWithSpaces)
[[Webname.Internal name with spaces]],
[[Any-Internal_Name][any label]], [[Webname.Any-Internal_Name][any label]],
TLA, XSLT (point to TLA, XSLT but only if they exist)
MyDrinks, Webname.MyDrinks (point to MyDrink in case MyDrinks does not exist)
MyPolicies, Webname.MyPolicies (point to MyPolicy in case MyPolicies does not exist)
MyAdresses, Webname.MyAdresses (point to MyAdress in case MyAdresses does not exist)
MyBoxes, Webname.MyBoxes (point to MyBox in case MyBoxes does not exist)
The plural-to-singular rule can be disabled in TWiki.cfg.
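As a rough illustration of the link criteria above, here is a sketch in Python (not TWiki's Perl). The regular expressions and the singular rules are my own approximation derived only from the examples in this list, not TWiki's actual rendering code, and acronym links such as TLA are left out. `extract_links`, `singular` and `resolve` are hypothetical helper names.

```python
import re

# Approximate patterns, assumed for illustration only.
BRACKET = re.compile(r"\[\[([^\]]+)\](?:\[[^\]]*\])?\]")
WIKIWORD = re.compile(r"\b(?:[A-Z][A-Za-z0-9]*\.)?[A-Z]+[a-z0-9]+[A-Z][A-Za-z0-9]*\b")

def extract_links(text, current_web):
    """Return (web, topic) pairs for the link forms listed above."""
    found = []

    def add(raw):
        web, _, topic = raw.rpartition(".")
        found.append((web or current_web, topic))

    for match in BRACKET.finditer(text):
        # "Internal name with spaces" becomes "InternalNameWithSpaces".
        words = match.group(1).split()
        add("".join(w[:1].upper() + w[1:] for w in words))
    # Drop bracket links (and their labels) before scanning for WikiWords.
    plain = BRACKET.sub(" ", text)
    for match in WIKIWORD.finditer(plain):
        add(match.group(0))
    return found

def singular(name):
    """Plural-to-singular guess based on the four examples above."""
    if name.endswith("ies"):
        return name[:-3] + "y"
    if name.endswith("sses") or name.endswith("xes"):
        return name[:-2]
    if name.endswith("s") and not name.endswith("ss"):
        return name[:-1]
    return name

def resolve(web, topic, existing):
    """Map a link to an existing (web, topic) pair, or None if there is none."""
    for candidate in (topic, singular(topic)):
        if (web, candidate) in existing:
            return (web, candidate)
    return None

text = ("See WikiWordTopicName and Main.MyDrinks or [[Internal name with spaces]] "
        "and [[Sandbox.Any-Internal_Name][any label]].")
found_links = extract_links(text, "Codev")
```

With this sample text, `resolve("Main", "MyDrinks", {("Main", "MyDrink")})` falls back to the singular topic, which mirrors the MyDrinks example above.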
--
PeterThoeny - 07 Aug 2002
I think that the focus on this should be organisation of the hidden `META:TOPICPARENT' fields present in the .txt files TWiki saves. Right now, only the GNU skin lets you set the parent of a topic. This TOPICPARENT is used by plugins like the TreePlugin to create order from madness :-).
Once the TOPICPARENT is clear, the only other thing you need to search for is `abandoned' children - where the parent topic does not contain any links to the child topic.
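A small sketch of that check, in Python for illustration only. It assumes the parent of each topic and the links of each topic have already been read out of the .txt files; `parent_of`, `links` and `abandoned_children` are hypothetical names, not TWiki functions.

```python
def abandoned_children(parent_of, links):
    """Return (child, parent) pairs where the parent does not link to the child.

    parent_of maps topic -> parent topic name (or None if no parent is set).
    links maps topic -> list of topic names that topic links to.
    """
    result = []
    for child, parent in sorted(parent_of.items()):
        if parent is None:
            continue  # no TOPICPARENT set, nothing to check
        if child not in links.get(parent, ()):
            result.append((child, parent))
    return result

parent_of = {"WebHome": None, "TopicA": "WebHome", "TopicB": "WebHome"}
links = {"WebHome": ["TopicA"], "TopicA": [], "TopicB": []}
print(abandoned_children(parent_of, links))  # [('TopicB', 'WebHome')]
```

Topics without any parent are skipped here, so they would need a separate report.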
--
SamVilain - 04 Nov 2002
The META:TOPICPARENT field is not always set: topics from older TWiki versions do not have it, and it is possible to create new topics without a parent (when asking for a new topic name in an HTML form).
FYI, you can re-parent a topic: click on the More link.
The safest way to find orphaned topics is to:
- build a hash of all topics per web, with each hash value set to 0 (the number of links pointing to that topic)
- for each topic, parse all topic text
- for each valid topic link, increment the link count of the corresponding topic in the hash
- return all topics in the hash whose link count is still zero.
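The steps above can be sketched as follows, in Python rather than Perl and with the link parsing assumed to be done already. `links` and `count_inbound_links` are made-up names; ignoring self-links and counting each linking topic once are my own choices, not something stated above.

```python
def count_inbound_links(links):
    """Count, for every topic, how many other topics link to it.

    links maps topic -> list of topic names found in that topic's text.
    """
    counts = {topic: 0 for topic in links}
    for source, targets in links.items():
        for target in set(targets):          # count each linking topic once
            if target in counts and target != source:
                counts[target] += 1
    return counts

links = {
    "WebHome": ["TopicA", "TopicB"],
    "TopicA": ["TopicB", "TopicA"],  # the self-link is ignored
    "TopicB": [],
    "TopicC": ["TopicA"],
}
counts = count_inbound_links(links)
orphans = sorted(topic for topic, n in counts.items() if n == 0)
print(orphans)  # ['TopicC', 'WebHome']
```

Note that the web's home topic shows up with zero links in this sample, so a real report would probably want to exclude known entry points.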
--
PeterThoeny - 04 Nov 2002
Not exactly the same thing, but a similar form of orphaning occurs
in TreePlugin, where in version 310 it loses track and does not
report pages whose parent meta-info is broken in certain ways.
In a different topic I am posting a patch to fix this.
Basically, it creates a list of all topics (in a web),
for each topic finds its ultimate parent, breaking cycles if necessary,
and then reports all of the trees.
I.e. it reports a forest, with multiple roots for disconnected trees
(at least, disconnected via the parent link that TreePlugin uses),
rather than a single link.
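To make the idea concrete, here is a rough Python sketch of the grouping step. It is not the TreePlugin patch itself, only an illustration under my own assumptions: `parent_of` is a hypothetical dictionary of topic to parent name, a parent that does not exist is treated as broken, and a cycle is represented by its alphabetically first member.

```python
def find_root(topic, parent_of):
    """Follow parent links upwards and return the root this topic belongs to."""
    seen = [topic]
    while True:
        parent = parent_of.get(seen[-1])
        if parent is None or parent not in parent_of:
            return seen[-1]              # no parent, or parent is missing
        if parent in seen:
            # Cycle found: pick one fixed member so all topics agree on the root.
            return min(seen[seen.index(parent):])
        seen.append(parent)

def build_forest(parent_of):
    """Group all topics by their root, giving one entry per tree."""
    forest = {}
    for topic in sorted(parent_of):
        forest.setdefault(find_root(topic, parent_of), []).append(topic)
    return forest

parent_of = {
    "WebHome": None,
    "TopicA": "WebHome",
    "TopicB": "TopicA",
    "LostTopic": "DeletedTopic",   # parent no longer exists
    "LoopX": "LoopY",              # LoopX and LoopY point at each other
    "LoopY": "LoopX",
    "LoopChild": "LoopY",
}
forest = build_forest(parent_of)
for root, members in forest.items():
    print(root, members)
```

Every root other than the web's home topic is then a candidate for re-parenting.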
Several other wikis, e.g. MoinMoin, create tables of contents
in a more interesting manner: instead of using a parent meta-info,
they deduce parentage from the inter-page links by various
heuristics. This produces an interesting table of contents.
And, overall, this also produces a graph that is useful in
finding orphans.
Remember: it is not just orphans you want to find.
It is disconnected subcomponents:
you may have a cycle that is not connected to anyone else.
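A plain depth-first or breadth-first traversal finds the disconnected subcomponents
(connected components) of a graph; Dijkstra's algorithm is for shortest paths and is not needed here.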
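Finding disconnected subcomponents amounts to computing the connected components of the link graph, which a depth-first traversal can do. Below is a minimal Python sketch that treats links as undirected; `links` and `connected_components` are hypothetical names, and the link data is assumed to be harvested already.

```python
def connected_components(links):
    """Return the groups of topics that are connected by links in either direction."""
    neighbours = {topic: set() for topic in links}
    for source, targets in links.items():
        for target in targets:
            if target in neighbours:         # ignore links to missing topics
                neighbours[source].add(target)
                neighbours[target].add(source)

    seen = set()
    components = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:                          # iterative depth-first search
            node = stack.pop()
            component.append(node)
            for neighbour in neighbours[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components

links = {
    "WebHome": ["TopicA"],
    "TopicA": ["WebHome"],
    "TopicD": ["TopicE"],   # a cycle not connected to anyone else
    "TopicE": ["TopicD"],
    "TopicF": [],
}
print(connected_components(links))
# [['TopicA', 'WebHome'], ['TopicD', 'TopicE'], ['TopicF']]
```

Any component that does not contain the web's home topic is cut off from the rest, whether it is a single orphan or a cycle.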
Related issue: you have to take into account all of the various
forms of linkage.
--
AndyGlew - 04 Jul 2003
This comment is rather dated, but re: "how to find/delete/handle them": IMHO, there is nothing inherently wrong with an orphaned topic, especially when a wiki is used as a knowledge base.
Orphaned topics can still be found by search when you go looking for information that might be contained in them.
--
RandyKramer - 04 Jul 2003
You can try out a plugin I created for this purpose (finding orphans): TopicReferencePlugin
--
JeffCrawford - 16 Apr 2006