Keeping Spammers Away with Third-Party Tools

As anyone who runs a website these days knows, spam is becoming more and more of a problem, and it's hard to keep up with it on your own. So don't do it on your own. Here are a couple of free services you can install on your website to keep the spam away.

Akismet is my favorite tool, and I use it on almost all of my WordPress and Drupal sites. Content submitted to your site gets passed to Akismet's servers, and their service tells you whether it looks like spam or not. On WordPress, the spam doesn't even show up in the comment moderation queue. On Drupal, I have it set so that I still have to approve comments, but Akismet's verdict makes it much easier to tell which ones are spam. You can also teach the Akismet servers after the fact: mark a comment it wrongly flagged as "ham," or report a comment it let through as spam. Akismet's filtering systems will take that feedback into account in the future.
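To make that round trip concrete, here is a minimal Python sketch of what a plug-in does on your behalf. The endpoint shape and parameter names are what I understand Akismet's REST API to use, so treat them as assumptions and check them against the current documentation. The API key is a placeholder, and the helper names (build_request, is_spam, check_comment) are my own, not part of any Akismet library.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_KEY = "your-api-key"  # placeholder; you get a real key when you register


def build_request(action, blog_url, user_ip, user_agent, content):
    """Build (but do not send) a POST for one Akismet-style action.

    action is assumed to be "comment-check", "submit-spam" or "submit-ham".
    """
    url = f"https://{API_KEY}.rest.akismet.com/1.1/{action}"  # assumed URL shape
    body = urlencode({
        "blog": blog_url,            # the site the comment was posted to
        "user_ip": user_ip,          # commenter's IP address
        "user_agent": user_agent,    # commenter's browser string
        "comment_content": content,  # the comment text itself
    }).encode("utf-8")
    return Request(url, data=body, method="POST")


def is_spam(response_text):
    """The check is assumed to answer with the literal text 'true' or 'false'."""
    return response_text.strip() == "true"


def check_comment(blog_url, user_ip, user_agent, content):
    """Send the comment to the service and return True if it is flagged."""
    req = build_request("comment-check", blog_url, user_ip, user_agent, content)
    with urlopen(req, timeout=10) as resp:
        return is_spam(resp.read().decode("utf-8"))
```

Teaching the service after the fact would be the same request with "submit-spam" or "submit-ham" as the action. In practice the WordPress and Drupal plug-ins handle all of this, so you should not need to write it yourself.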

reCAPTCHA is a newer service that I have not used as much, but it seems promising as well. (A CAPTCHA is one of those images where you have to type in the letters to get past it. It's supposed to tell whether you're a computer or not.) Instead of giving a CAPTCHA that is easy for a modern computer to read via OCR, or one that's so hard to read that not even humans can do it half the time (like Ticketmaster, Google, and vBulletin do), this system uses words that OCR can't read.

The reCAPTCHA team out at Carnegie Mellon University has been digitizing books via OCR document scanning, and reCAPTCHA's challenges are words that the character recognition system could not recognize. They slightly skew the words and draw a line through them, but they are much more readable than most CAPTCHAs. When you visit a reCAPTCHA-enabled page, you get two words from their database: one that the system already knows and one that it is unsure about. If you get the first one right, it assumes you're a human and that you probably got the second one right too. Once enough other reCAPTCHA users agree on the second word, the system accepts that reading, so the same test verifies humanness and helps digitize books.

I haven't used it on any of my own sites, but a couple of phpBB installs here at work were having trouble with spammers registering. Adding the reCAPTCHA system to the login form via their plug-ins page took away the user registration problem immediately. (They have plug-ins for WordPress, Drupal, MediaWiki, and many more as well.)
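The two-word trick is easier to see in code. The sketch below is a toy Python model of the idea as described above, not reCAPTCHA's actual implementation. The class name, the word IDs, and the agreement threshold of three are all made up for illustration.

```python
from collections import Counter, defaultdict

AGREEMENT_THRESHOLD = 3  # hypothetical: matching answers needed to accept a word


class TwoWordChallenge:
    """Toy model: one known control word plus one word OCR could not read."""

    def __init__(self, threshold=AGREEMENT_THRESHOLD):
        self.threshold = threshold
        self.votes = defaultdict(Counter)  # unknown word id -> guesses seen
        self.resolved = {}                 # unknown word id -> accepted reading

    def submit(self, known_answer, known_guess, unknown_id, unknown_guess):
        """Return True if the visitor passes as human."""
        # Only the control word decides whether the visitor passes.
        if known_guess.strip().lower() != known_answer.lower():
            return False
        # A visitor who passed is trusted enough to vote on the unknown word.
        guess = unknown_guess.strip().lower()
        if unknown_id not in self.resolved:
            self.votes[unknown_id][guess] += 1
            if self.votes[unknown_id][guess] >= self.threshold:
                self.resolved[unknown_id] = guess
        return True
```

Note that a wrong answer on the control word is rejected outright and its guess for the unknown word is thrown away, which keeps bots from poisoning the digitized text.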

The best thing about these tools is that you don't have to manage them. A couple of minutes of adding some code and registering for access gets you all these features for free. But by far the biggest benefit is one you only get when hundreds and thousands of websites are using these systems: a view of spam activity across the whole web and a better chance of blocking it. For example, I could run a blacklist of sites just for my WordPress comments, but that would be a huge undertaking. Akismet has already caught over 7 million spam comments today, so they have a good idea of which computers are most likely to be sending spam. They maintain the blacklists and other filtering for me, and out of the goodness of their hearts, it's free for personal users. In the same way, reCAPTCHA blocks sites that are already sending too much spam their way, so I don't have to worry about it. Joining these big guys in fighting spam is the way to keep our blogs safe from spam, and it helps everyone else who signs up for the program too.

