There have been a number of proposals for dealing with Wiki Spam. This page compares the various proposals. The requirements section has been distilled from arguments used against these solutions in the past.
For more information, see Anti Wiki Spam Scripts, Category Wiki Spam, and Crazy Things That Might Save Wiki.
Requirements
Here are some vague requirements to help guide us in finding a good solution. The order is arbitrary.
01 Should deal with first-time spammers
02 Should not require spammers to pay attention
03 Should not require spammers to understand English
04 Should deal with human spammers
05 Should deal with robot spammers
06 Should not annoy readers (leading to loss of readership)
07 Should not annoy authors (leading to loss of contributions)
08 Should deal with bulk spam
09 Should deal with Submarine Spam
10 Should not invalidate any current content
11 Should not be too different from what we have today
12 Should not be too hard to implement
13 Should be resistant to abuse (for example, in Edit Wars)
14 Should not require demanding ongoing maintenance
Solution comparison
Columns correspond to the requirement numbers above. Dashes mean the solution probably meets the requirement; X's mean it probably doesn't.
    01 02 03 04 05 06 07 08 09 10 11 12 13 14
 -- xx -- -- -- -- -- -- -- -- -- -- -- --  DelayedIndexing
 -- xx -- -- -- -- -- -- -- -- -- -- -- --  NoFollow
 -- -- -- -- -- -- -- -- xx -- -- -- -- --  WikiSpamBlocker
 -- -- -- -- -- -- -- -- xx xx -- -- -- --  StatisticalFilter
 xx -- -- -- -- -- -- -- -- -- -- -- -- xx  BannedContentBot
 -- -- -- xx -- -- -- -- xx -- -- -- -- --  LanguageFilter
 -- xx -- -- -- -- -- -- -- -- -- -- -- --  RedirectExternalLinks
 -- xx -- -- -- -- -- xx -- -- -- -- -- --  ExternalLinkArea
 -- xx xx -- xx -- -- -- -- -- -- -- -- --  WarningMessages
 -- xx xx -- xx -- -- -- -- -- -- -- -- --  SpamHereOnly
 -- -- -- xx -- -- xx -- xx xx -- -- -- --  ShotgunSpam
 -- xx -- -- -- xx -- -- -- -- -- -- -- --  LetsInsulateOurselves
 -- xx -- -- -- xx -- -- -- -- -- -- -- --  StopAutoLinking
 -- -- -- -- xx -- xx -- xx -- -- -- -- --  VolumeLimitedEdits
 xx -- -- -- -- -- -- -- xx -- -- -- -- xx  EditThrottling
 -- -- -- xx -- -- xx -- -- -- -- -- -- --  HumanVerification
 -- -- -- -- -- xx xx -- -- xx -- -- -- --  RejectEdits
 xx -- -- -- -- -- xx -- -- -- -- xx -- --  EditsRequireKarma
 -- -- -- xx -- -- xx -- -- -- -- -- -- xx  UserLogins
 xx -- -- -- -- -- -- -- -- -- -- -- xx xx  SpamBlackList
 -- -- -- -- -- -- -- -- -- -- -- xx xx --  PeerToPeerBanList
 -- -- -- xx -- -- xx -- -- -- -- xx -- --  EditsRequireJavaScript
Proposed solutions
Delayed Indexing
02 Spammers will probably not notice
No Follow
02 See No Follow on Meatball Wiki for a thorough analysis
Wiki Spam Blocker
09 Doesn't address well-concealed submarine spam
Statistical Filter
09 Intelligent spammers may blend in
10 Some current content may not pass the filter
Banned Content Bot
01 Banned content is likely to be spammer-specific
14 Requires ongoing maintenance
Language Filter
04 Spammers can adapt to the filter
09 Doesn't address submarine spam
Redirect External Links
02 Has no effect unless the spammers realize what is going on
External Link Area
02 Spammers may not notice
08 Depends on bulk spam containing many links to the same page
Warning Messages
02 Spammers may not notice the warning message
03 Spammers may not understand the warning message
05 Robots will not notice the warning message
Spam Here Only
02 Spammers may not notice the directions
03 Spammers may not understand the directions
05 Robots will not notice the directions
Shotgun Spam
04 Spammers can use multiple IP addresses or pages
This is only a problem if you're basing spam detection on the editing of multiple pages; the simple approach of counting links during an edit submit isn't affected by where the spammer is coming from.
If the spammer uses multiple IPs, he can make repeated edits to the same page. So even if he can't add 1000 links at once, he could still add them 4 at a time.
07 May get in the way of some large refactorings
09 Does not deal with submarine spam
10 If existing pages exceed the new limit, authors are required to clean them up before posting minor edits in the future.
Though that depends on whether we limit the total number of external links, or just the number of added links, and whether it's an absolute number or a links-to-text ratio.
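To make the "counting links during an edit submit" idea above concrete, here is a minimal Python sketch. The names and thresholds are invented for illustration (they are not taken from any particular Wiki Engine), and it shows both variants just mentioned: a cap on the number of added links and a links-to-text ratio.
 import re

 LINK_RE = re.compile(r'https?://\S+')
 MAX_ADDED_LINKS = 4     # illustrative cap on newly added external links
 MAX_LINK_RATIO = 0.05   # illustrative cap on the links-to-words ratio

 def count_links(text):
     return len(LINK_RE.findall(text))

 def edit_allowed(old_text, new_text):
     added = count_links(new_text) - count_links(old_text)
     ratio = count_links(new_text) / max(len(new_text.split()), 1)
     # The added-links cap only affects edits that introduce many new links,
     # so refactorings that merely move existing links pass; the ratio
     # variant looks at the whole page and so could hit existing link-heavy
     # pages (the objection under 10 above).
     return added <= MAX_ADDED_LINKS and ratio <= MAX_LINK_RATIO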
Lets Insulate Ourselves: disable Google indexing of the wiki
02 Spammers will probably not notice
06 Readers will no longer be able to search the wiki with Google
Stop Auto Linking: stop auto-linking external addresses
02 Spammers may not even notice
06 Readers have to copy and paste addresses
Google may still find the links, even if they aren't clickable
Volume Limited Edits
05 Robots can make many small edits
07 Slows down major refactorings
09 Does not address submarine spam
Edit Throttling
01 First-time spam always gets through
09 Doesn't stop non-bulk spam
14 Manual edits still required, just at a slower pace
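As a sketch of what Edit Throttling might look like in code (the limits and names are invented for the example, not taken from any real Wiki Engine), the idea is simply to refuse edits from a source address that has edited too often within a time window:
 import time
 from collections import defaultdict, deque

 MAX_EDITS = 5    # illustrative: at most 5 edits ...
 WINDOW = 600     # ... per source address per 10 minutes

 _recent = defaultdict(deque)

 def edit_allowed(source_ip, now=None):
     now = time.time() if now is None else now
     edits = _recent[source_ip]
     while edits and now - edits[0] > WINDOW:
         edits.popleft()     # forget edits outside the window
     if len(edits) >= MAX_EDITS:
         return False        # throttled; spam still gets in, just more slowly
     edits.append(now)
     return True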
Human Verification
04 Spammers can answer quizzes too
07 Quiz would be time-consuming for authors
Reject Edits: reject edits containing discernible external addresses
06 Readers have to follow vague instructions to find external resources
07 Authors have to write detailed instructions leading to external resources
10 Invalidates all current external links
Edits Require Karma (related to Wiki Needs Trust Metrics)
01 One-time spammers may get their first spam through (if too much initial karma is granted)
07 One-time authors are discouraged from contributing (if not enough initial karma is granted)
12 May take a lot of work
User Logins
04 Spammers can create accounts too
07 One-time authors would be discouraged
14 Spammers' logins would have to be banned somehow
Spam Black List
01 By definition, doesn't deal with first-time spammers
13 Want to win an Edit War? Just add your opponent to the blacklist
14 Requires ongoing maintenance
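For reference, the core of a Spam Black List check is just a pattern match over the submitted text. The patterns below are invented examples; keeping the real list current is exactly the maintenance cost noted under 14.
 import re

 # Invented example patterns; a real black list grows over time and must
 # be curated by hand.
 BANNED_PATTERNS = [
     re.compile(r'casino-example\.com', re.IGNORECASE),
     re.compile(r'cheap-pills-example\.net', re.IGNORECASE),
 ]

 def edit_allowed(new_text):
     return not any(p.search(new_text) for p in BANNED_PATTERNS)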
Peer To Peer Ban List
12 Would have to be integrated with a number of Wiki Engines before it becomes effective
13 Subject to same abuse as Spam Black List
Edits Require Java Script: only accept edits from users who have configured their browser to run Java Script
04 Human spammers use a normal browser
07 Some authors may prefer to disable Java Script in their browser, or use a browser that cannot run it
12 The Java Script code (and its generation) must be complicated enough that a Java Script interpreter is really needed to pass the test (hashcash…)
(05 In the future, spam bots may be able to run Java Script)
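A hashcash-style check could work roughly as follows on the server side. This is a hypothetical sketch: the form field names and the difficulty are invented, and the browser-side Java Script is assumed to search for a nonce whose hash meets the target before the form is submitted.
 import hashlib

 DIFFICULTY_BITS = 20   # invented difficulty; tune so a browser needs about a second

 def proof_is_valid(challenge, nonce, bits=DIFFICULTY_BITS):
     digest = hashlib.sha256((challenge + nonce).encode()).digest()
     value = int.from_bytes(digest, 'big')
     # Valid when the hash has at least `bits` leading zero bits,
     # i.e. falls below the difficulty target.
     return value < (1 << (256 - bits))

 def edit_allowed(form):
     # `form` is assumed to echo back the challenge embedded in the edit
     # page plus the nonce computed by the browser's Java Script.
     return proof_is_valid(form.get('challenge', ''), form.get('nonce', ''))
As the parenthetical note above says, a determined robot can still run a Java Script interpreter or compute the proof some other way, so this mainly raises the cost of spamming rather than preventing it.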
While no measure is perfect by itself, most of them help against some kinds of spammers. This suggests that combining measures will be more effective than relying on a single solution. For example, on my wiki (although it definitely does not have the attention and Link Share of Wards Wiki) I use Link Throttling, Edit Throttling and Volume Limited Edits together with a system that requires fetching at least part of a page before saving it (this is actually a side effect of the transparent Three Way Merging deployed in the wiki). This has prevented spamming thus far.
At any rate, any spam detection system is IMNSHO better than the "code word" system used here. It has effectively made this wiki a closed medium; for example, it took me a long time before I could find the right moment to put this comment here. -- Panu Kalliokoski
The right moment?
See, there was this period of approximately a month (fall 2004 IIRC), when I didn't ever manage to come to the wiki at a time the code word was there. So I concluded, "okay, this is not the place to participate anymore", and left for a long time. After many months, I checked again, and no, the code word wasn't there. So I didn't come back until now.
Not fall 2004, as the code mechanism was introduced about February 2005.
Related to Wiki Vandalism Solutions.
See original on c2.com