Thwarting Spammers With reCAPTCHA

One day, while going out to lunch, I checked my phone and noticed 20 new email messages from WordPress asking me to review new comments on my blog. I instantly knew something was wrong because I don't generally get that many comments in such a short amount of time. After reviewing all of the messages, I confirmed that every last one of them was asking me to approve a spam comment. By the time I got back to my desk to deal with my new spam issue, I found over 100 comments awaiting my rejection.

Strangely enough, I had some mixed feelings towards this lame (and ultimately futile) spam attack. On one hand, I was super annoyed to have to deal with this problem. But, on the other hand, I was at least a little bit encouraged to see that my blog was getting crawled by something out there (at the time of this writing, I am completely over the encouraged part and am now just plain annoyed).

My first plan to stop the barrage of spam comments I was receiving was to simply blacklist the IP address making all of the bad comment requests. After all, every single spam comment had the exact same text, so that means they should be coming from the same IP address, right? Wrong! Either this particular spambot was spoofing new IP addresses for each spam comment it made or it was sending the requests from several different - likely infected - slave machines all over the world.

CAPTCHA to the rescue!

First off, a CAPTCHA is a challenge that asks a human to type the letters shown in an image into a text field. Since computers can't read distorted text in images (that well), we can be assured - at least for now - that only humans will be able to solve CAPTCHA puzzles.

Enter reCAPTCHA

reCAPTCHA is a CAPTCHA implementation that can be tied into any webpage that accepts user input to ensure the site's web forms aren't being automatically submitted by a bot.

Why reCAPTCHA? Simple: every time you solve a reCAPTCHA puzzle, you are actually working towards something productive.

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.

Cool, huh? Not only do you get the benefit of a spam-free site, you also get to directly help preserve human knowledge by assisting in the digitization of books! Go ahead and give yourself a pat on the back. I'll wait.

Installing reCAPTCHA

Installing reCAPTCHA is a breeze. There are several different installation methods available, and it's up to you to pick the one that best fits your site. Whichever one you choose, you'll need an API key.

PHP Instructions - use these if you are running a normal website that takes user input, or a blog other than WordPress.

WordPress Instructions - use these if you already have a WordPress blog.
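Whichever route you pick, the server-side half of the job is the same idea: your site takes the answer the visitor submitted and asks reCAPTCHA whether it was correct before accepting the comment. Here is a minimal sketch of that check in Python, assuming Google's current `siteverify` endpoint rather than whatever API version your plugin or library wraps. The function names (`build_verify_request`, `is_human`, `verify_comment`) are my own inventions for illustration, and the secret key is the private half of the API key pair mentioned above.

```python
import json
from urllib import parse, request

# Google's documented verification endpoint for current reCAPTCHA versions.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key, captcha_response, remote_ip=None):
    """Build (but do not send) the POST request that asks reCAPTCHA to check an answer."""
    fields = {"secret": secret_key, "response": captcha_response}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional: the commenter's IP address
    body = parse.urlencode(fields).encode("ascii")
    return request.Request(VERIFY_URL, data=body, method="POST")


def is_human(verify_json):
    """Return True only if the verification reply explicitly reports success."""
    try:
        reply = json.loads(verify_json)
    except ValueError:
        return False  # an unreadable reply is treated as a failed check
    return isinstance(reply, dict) and reply.get("success") is True


def verify_comment(secret_key, captcha_response, remote_ip=None):
    """Send the request and report whether the comment passed (needs network access)."""
    req = build_verify_request(secret_key, captcha_response, remote_ip)
    with request.urlopen(req, timeout=10) as resp:
        return is_human(resp.read().decode("utf-8"))
```

In a comment handler you would call something like `verify_comment(secret_key, submitted_answer, commenter_ip)` and throw the comment away whenever it returns `False`. I split the request building and the reply parsing into separate functions so each piece can be tried out without touching the network.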

And there we have it

After installing reCAPTCHA, the spam assault on my blog stopped instantly and I haven't seen another spam comment since. The only downside is that normal human commenters (such as yourself) have to be inconvenienced a little bit to post a comment, and for this, I apologize.

Having problems with spam comments or contacts? CAPTCHA seems to be a solution that's just good enough to retake control of your website, and reCAPTCHA is a great implementation that works well and is easy to use. This is just a small victory in the never-ending war against spam, and I'm going to celebrate by posting a new article that won't get any spam comments.

Creative Commons License
