According to Reuters, “These Coders Say They Taught a Computer to Crack CAPTCHA.” The article says that the San Francisco-based company, Vicarious, has developed this program as part of its larger mission. As their website says, “We’re building software that thinks and learns like a human.” What I find so remarkable is that this wasn’t done a long time ago.
For those of you who don’t know, CAPTCHA is an acronym that stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.” CAPTCHAs are those pictures of distorted characters that you have to enter to, for example, post a comment on some blogs. You may have noticed that you do not need one here. This is because I’ve used them in the past and have found that they prevent absolutely no spam comments. This led me to believe that the spammers were just using programs that were able to crack the CAPTCHAs.
The truth is that I don’t see what is so hard about writing such a program. Imagine a simple CAPTCHA: a graphic with black letters on a white background. Cracking it should be as simple as running through the alphabet and doing a two-dimensional cross-correlation of the image with each letter. Clearly, the more common distorted letters would be harder. But even these are produced with standard algorithms, which one ought to be able to reverse engineer. And increasingly, CAPTCHAs have visual noise on them. It isn’t clear at this point just how well the Vicarious solution does with these. But often humans aren’t that good at this themselves. Here is a standard image that almost no one gets unless they are told that it is a picture of a spotted dog:
The big problem with the CAPTCHAs is not that computers can solve them. Vicarious has only approached the problem as a way to show what they can do with artificial intelligence; they aren’t trying to revolutionize the spam industry. That’s a good thing, because at this point, at least part of most spamming is done by hand. So a computer program will deposit spam on a web site and if it comes upon a CAPTCHA, it will be forwarded to a human to decipher. The article explains:
That could well be cheaper than the cost of breaking it by computer alone. My cross-correlation idea, for example, would take up a lot of cycles and a lot of real time as well.
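For the curious, here is a rough sketch of the cross-correlation idea for the easy case only: clean, undistorted letters with no noise. Everything in it is made up for illustration: the tiny 3x5 glyphs, the names glyph, render, cross_correlate and decode, and the 0.99 threshold. It is not how Vicarious does it, and it would fall over on any real CAPTCHA.

```python
import math

def glyph(rows):
    """Turn rows like '.#.' into a 2D list of floats (1.0 = ink, 0.0 = paper)."""
    return [[1.0 if ch == "#" else 0.0 for ch in row] for row in rows]

# Hypothetical 3x5 letter templates, invented for this sketch.
TEMPLATES = {
    "A": glyph([".#.", "#.#", "###", "#.#", "#.#"]),
    "B": glyph(["##.", "#.#", "##.", "#.#", "##."]),
    "C": glyph([".##", "#..", "#..", "#..", ".##"]),
}

def render(text):
    """Draw text as a clean bitmap with one blank column around each letter."""
    image = [[0.0] for _ in range(5)]
    for ch in text:
        for y, row in enumerate(TEMPLATES[ch]):
            image[y] += row + [0.0]
    return image

def zero_mean(pixels):
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]

def correlation(window, template):
    """Normalized correlation of two equal-length pixel lists, in [-1, 1]."""
    w, t = zero_mean(window), zero_mean(template)
    denom = math.sqrt(sum(a * a for a in w) * sum(b * b for b in t))
    if denom == 0:  # a completely blank window matches nothing
        return 0.0
    return sum(a * b for a, b in zip(w, t)) / denom

def cross_correlate(image, template):
    """Score the template at every offset where it fits inside the image."""
    th, tw = len(template), len(template[0])
    flat_t = [p for row in template for p in row]
    scores = {}
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            window = [p for row in image[y:y + th] for p in row[x:x + tw]]
            scores[(x, y)] = correlation(window, flat_t)
    return scores

def decode(image, templates, threshold=0.99):
    """Run through the alphabet and keep near-perfect matches, left to right."""
    hits = []
    for letter, template in templates.items():
        for (x, y), score in cross_correlate(image, template).items():
            if score >= threshold:
                hits.append((x, letter))
    return "".join(letter for _, letter in sorted(hits))

print(decode(render("CAB"), TEMPLATES))  # prints CAB
```

Note the nested loops: every letter is compared against every position in the image, pixel by pixel. That is where all the cycles would go on a full alphabet and a full-sized graphic.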
I still don’t see what the purpose is of all the spam that I get. Absolutely none of it gets through. On almost all sites where it does get through, the links are posted with rel="nofollow" attributes, so there is no search ranking increase from the link. And no one but a complete idiot would click on those links. I understand that in the old days of spam email, there was effectively no cost to sending out 10 million email messages for a return of 1,000 clicks. But if Dr Mori is to be believed, then the spammers are paying a dollar for every 840 spam posts. That’s more than a tenth of a cent per post. I just don’t see how it is worth that unless the people doing the spamming are scamming the companies that they work for.
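Here is that arithmetic spelled out. The 840 posts per dollar is the figure attributed to Dr Mori above. Applying that rate to 10 million posts is my own assumption, borrowed from the email example just to give a sense of scale.

```python
posts_per_dollar = 840  # rate attributed to Dr Mori in this post

# Cost of a single spam post, in cents.
cost_per_post_cents = 100 / posts_per_dollar

# Assumption for scale: the 10 million figure from the email example.
campaign_posts = 10_000_000
campaign_cost_dollars = campaign_posts / posts_per_dollar

print(f"{cost_per_post_cents:.3f} cents per post")                     # 0.119 cents per post
print(f"${campaign_cost_dollars:,.2f} for {campaign_posts:,} posts")  # $11,904.76 for 10,000,000 posts
```

So a volume that once cost effectively nothing by email would run to nearly twelve thousand dollars at this rate.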
Regardless, spammers do an enormous amount of damage to the internet for very little profit. It is the ultimate example of the tragedy of the commons. The funny thing is that despite what Ayn Rand said, people really do exhibit altruism. And most people will not pollute the entire network for the smallest of personal advantages. But in this context it really only takes one person to cause significant harm to all the others. The whole thing leaves me with very violent thoughts.
Afterword
One of my favorite spam comments goes something like this, “I have a site very much like yours. Do you have much of a problem with spam? Could you tell me what plugins you use to fight this?” It is quite clever because every time I see it, for an instant I want to respond. The more standard spam is more like, “It’s difficult to find experienced people in this particular subject, but you sound like you know what you’re talking about!” This raises the question of whether they know what I am talking about. That particular comment was in reference to an April Fools post based on an article in The Onion. Yes, I am experienced in that particular field; it is no wonder I sound like I know what I’m talking about!