In order to help combat wiki spam I have coded the following measures into
DakarRelease. The measures can't stop people spamming your wiki, but they
can stop spam links from being followed by robots, such as the Google spider.
In TWiki.cfg there are two new options:
# Options added to external links (links to URLs that do not match
# {AntiSpam}{Clean}). Public sites should set this to 'rel="nofollow"'
# to prevent wiki spammers gaining any benefit from spamming.
$cfg{AntiSpam}{Options} = '';
# Regular expression that must match the start of any external links
# that are _not_ to have the {AntiSpam}{Options} added. The default
# is to leave links to twiki.org and to the server site untouched.
$cfg{AntiSpam}{Clean} = qr,(http://(www\.)?twiki\.org\b|$cfg{DefaultUrlHost}|\W),io;
and a new function in TWiki.pm:
---++ StaticMethod spamProof( $text ) -> $text
Find and replace all explicit links (<a etc) in $text and apply anti-spam measures
to them. This method is designed to be called on text just about to be printed to the
browser, and needs to be very fast.
Links to URLs that are matched by $cfg{AntiSpam}{Clean} are left untouched. All
other links have $cfg{AntiSpam}{Options} added.
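
As a rough illustration of the behaviour described above (not the actual DakarRelease code), the following sketch sets the two options and applies them in a single substitution pass. The host name `wiki.example.com`, the helper `_protect`, and the tag-matching regular expression are all assumptions made for the example. The real implementation may match links differently.

```perl
use strict;
use warnings;

# Assumed stand-in for the TWiki.cfg settings; the host name is illustrative.
our %cfg = ( DefaultUrlHost => 'http://wiki.example.com' );
$cfg{AntiSpam}{Options} = 'rel="nofollow"';
$cfg{AntiSpam}{Clean} =
    qr,(http://(www\.)?twiki\.org\b|\Q$cfg{DefaultUrlHost}\E|\W),i;

# Sketch of spamProof: add the options to every <a ...> tag whose href
# does not start with something matched by {AntiSpam}{Clean}.
sub spamProof {
    my ($text) = @_;
    return $text unless length $cfg{AntiSpam}{Options};
    # One pass over the text: $1 = tag up to the URL, $2 = URL, $3 = rest of tag.
    $text =~ s{(<a\s[^>]*?href\s*=\s*["']?)([^"'\s>]*)([^>]*>)}
              {_protect($1, $2, $3)}gei;
    return $text;
}

# Hypothetical helper: rebuild one tag, adding the options if the URL is not clean.
sub _protect {
    my ( $pre, $url, $post ) = @_;
    my $clean = $cfg{AntiSpam}{Clean};
    my $opts  = $cfg{AntiSpam}{Options};
    return "$pre$url$post" if $url =~ /^$clean/;
    # Insert the options just before the closing '>' of the tag.
    $post =~ s/\s*>\z/ $opts>/;
    return "$pre$url$post";
}
```

With these settings, a link to an unknown external site gains `rel="nofollow"`, while links to twiki.org, to the server's own host, and relative links (which start with a non-word character such as `/` or `#`) are left as they are.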
May all spammers burn in the fires of eternal damnation! (that goes for email too, folks)
--
CrawfordCurrie - 02 Apr 2005
We now have two methods to combat WikiSpam, this one and BlackListPlugin, and they work in different ways.
This one adds rel="nofollow" to all links except those specifically excluded, whilst
BlackListPlugin adds it only to new links (I assume it won't get added twice, not that it would cause much harm).
It's possible you might want to use the different methods on different areas of the site.
Therefore it would be nice if DakarAntiSpamMeasures could be turned off on a per-web basis.
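
To make the request concrete, here is one way a per-web switch might look. This is only a sketch: the `{AntiSpam}{ExemptWebs}` setting and the `antiSpamOptionsFor` function are hypothetical and do not exist in DakarRelease.

```perl
use strict;
use warnings;

my %cfg;
$cfg{AntiSpam}{Options} = 'rel="nofollow"';

# Hypothetical setting, not part of DakarRelease: webs whose links are
# trusted and should stay followable by robots.
$cfg{AntiSpam}{ExemptWebs} = { map { $_ => 1 } qw(Codev Plugins) };

# Hypothetical helper: return the options to apply to external links in
# the given web, or the empty string if the web is exempt.
sub antiSpamOptionsFor {
    my ($web) = @_;
    return '' if $cfg{AntiSpam}{ExemptWebs}{$web};
    return $cfg{AntiSpam}{Options};
}
```

The rendering code would then ask for the options of the current web before rewriting links, so Main and Sandbox would get `rel="nofollow"` while Codev and Plugins would not.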
Webs like Main and Sandbox don't get many people viewing their WebChanges regularly, as they are mostly user pages and therefore somewhat off-topic to the rest of the site.
This means that links in them probably don't deserve to be followed, as they are more than likely unrelated to the site.
They are also less likely to be policed by the general populace to remove wiki spam, leaving the job of policing them to admins.
These webs would benefit from DakarAntiSpamMeasures, as it would mean that WikiSpam in them could be ignored as harmless. It could be cleaned up periodically, but there would be no need to check it every day.
However, other webs, e.g. Codev and Plugins on this site, probably should allow their links to be followed, because they are more than likely related to the content and focus of the site and therefore deserve a boost in PageRank / Search Fu.
(How many links to other wikis/sites that deserve to be indexed and gain rank are there in the Codev and Plugins webs?)
Changes to these areas of the site are followed closely by many people, so they will mostly be self-policing.
BlackListPlugin would only be needed to stop any WikiSpam that does appear from being indexed before someone can clean it up.
On the other hand, closing down some areas of a site whilst opening up others may move the spam from Main and Sandbox, where it is currently, to the other webs, which could end up annoying legitimate users.
Still, since the net effect should be that no spam gets indexed, you'd hope that the spammers would move elsewhere before you had to totally lock down the site.
I'd certainly like to try it.
--
SamHasler - 03 Apr 2005