CAPTCHA Solved But Who Cares?

Reuters reports, “These Coders Say They Taught a Computer to Crack CAPTCHA.” According to the article, the San Francisco-based company Vicarious has developed this program as part of its larger mission. As their website says, “We’re building software that thinks and learns like a human.” What I find so remarkable about this is that this wasn’t done a long time ago.

For those of you who don’t know, CAPTCHA is an acronym that stands for, “Completely Automated Public Turing test to tell Computers and Humans Apart.” They are those pictures of distorted characters that you have to enter to, for example, post a comment on some blogs. You may have noticed that you do not need one here. This is because I’ve used them in the past and have found that they prevent absolutely no spam comments. This led me to believe that the spammers were just using programs that were able to crack the CAPTCHAs.

The truth is that I don’t see the problem with writing such a program. Imagine a simple CAPTCHA with a graphic with black letters on a white background. That should be as simple as running through the alphabet and doing a two-dimensional cross-correlation with each letter. Clearly, the more common distorted letters would be harder. But even these have standard algorithms that ought to allow one to reverse engineer them. And then increasingly, the CAPTCHAs have visual noise on them. It isn’t clear at this point just how well the Vicarious solution does with these. But often humans aren’t that good at this themselves. Here is a standard image that almost no one gets unless they are told that it is a picture of a spotted dog:

[Image: Spotted Dog]

The big problem with the CAPTCHAs is not that computers can solve them. Vicarious has only approached the problem as a way to show what they can do with artificial intelligence; they aren’t trying to revolutionize the spam industry. That’s a good thing, because at this point, at least part of most spamming is done by hand. So a computer program will deposit spam on a web site, and if it comes upon a CAPTCHA, the CAPTCHA is forwarded to a human to decipher. The article explains:

“Most CAPTCHAs now are broken by paying people in Bangladesh to do it manually,” said computer scientist Greg Mori of Simon Fraser University in British Columbia, an expert on machine learning and computer vision. “For 50 cents an hour, you can get someone to break seven per minute.”

That could well be cheaper than the costs of breaking it by computer alone. My cross-correlation idea, for example, would take up a lot of cycles and a lot of real time as well.
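To make the cross-correlation idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the 5x5 bitmaps are a made-up stand-in for a real font, and the names `peak_correlation`, `classify`, and `render` are hypothetical. It handles only the easy case of undistorted black letters on a white background. It slides each letter template over the image, computes a normalized correlation at every offset, and picks the letter with the highest peak.

```python
import math

# Hypothetical 5x5 bitmaps standing in for a real font ("1" = black ink).
GLYPHS = {
    "L": ["1....", "1....", "1....", "1....", "11111"],
    "T": ["11111", "..1..", "..1..", "..1..", "..1.."],
    "O": ["11111", "1...1", "1...1", "1...1", "11111"],
    "X": ["1...1", ".1.1.", "..1..", ".1.1.", "1...1"],
}


def to_grid(rows):
    """Turn a list of strings into a grid of floats (1.0 = ink, 0.0 = white)."""
    return [[1.0 if c == "1" else 0.0 for c in row] for row in rows]


TEMPLATES = {ch: to_grid(rows) for ch, rows in GLYPHS.items()}


def centered(values):
    """Subtract the mean so that flat regions contribute nothing."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def peak_correlation(image, template):
    """Slide the template over the image; return the best normalized correlation."""
    th, tw = len(template), len(template[0])
    t = centered([v for row in template for v in row])
    t_norm = math.sqrt(sum(v * v for v in t))
    best = -1.0
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            p = centered([image[y + dy][x + dx]
                          for dy in range(th) for dx in range(tw)])
            p_norm = math.sqrt(sum(v * v for v in p))
            if p_norm == 0.0:
                continue  # blank patch: nothing to correlate against
            score = sum(a * b for a, b in zip(p, t)) / (p_norm * t_norm)
            best = max(best, score)
    return best


def classify(image):
    """Run through the alphabet and return (best letter, all scores)."""
    scores = {ch: peak_correlation(image, t) for ch, t in TEMPLATES.items()}
    return max(scores, key=scores.get), scores


def render(ch, height=9, width=12, top=2, left=4):
    """Paste one glyph onto a white canvas at an arbitrary offset."""
    image = [[0.0] * width for _ in range(height)]
    for dy, row in enumerate(TEMPLATES[ch]):
        for dx, value in enumerate(row):
            image[top + dy][left + dx] = value
    return image


guess, scores = classify(render("T"))
print(guess, round(scores[guess], 3))
```

Even this toy version loops over every offset for every letter, which gives a feel for why the approach eats cycles on real images. Distortion and visual noise would need considerably more than this.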

I still don’t see what the purpose is of all the spam that I get. Absolutely none of it gets through. Almost all sites where it does get through post it with the rel="nofollow" attribute, so there is no search ranking increase from the link. And no one but a complete idiot would click on those links. I understand that in the old days of spam email, there was effectively no cost to sending out 10 million email messages for a return of 1,000 clicks. But if Dr Mori is to be believed, then the spammers are paying a dollar for every 840 spam posts. That’s more than a tenth of a cent per post. I just don’t see how it is worth that unless the people doing the spamming are scamming the companies that they work for.
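The 840 figure follows directly from Dr Mori's numbers. Here is a short sketch of the arithmetic, assuming (as I do above) that every solved CAPTCHA turns into exactly one spam post; the variable names are my own.

```python
# Figures quoted by Dr Mori in the Reuters article.
CENTS_PER_HOUR = 50        # wage paid to a human solver
SOLVES_PER_MINUTE = 7      # CAPTCHAs one solver breaks per minute

# Assumption: one solved CAPTCHA equals one spam post.
solves_per_hour = SOLVES_PER_MINUTE * 60
solves_per_dollar = solves_per_hour * 100 // CENTS_PER_HOUR
cents_per_post = CENTS_PER_HOUR / solves_per_hour

print(f"{solves_per_hour} posts per hour")
print(f"{solves_per_dollar} posts per dollar")
print(f"{cents_per_post:.3f} cents per post")
```

So each post costs a bit under 0.12 cents in labor alone, before counting whatever the spammer pays for the software and bandwidth.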

Regardless, spammers do an enormous amount of damage to the internet for very little profit. It is the ultimate example of the tragedy of the commons. The funny thing is that despite what Ayn Rand said, people really do exhibit altruism. And most people will not pollute the entire network for the smallest of personal advantages. But in this context it really only takes one person to cause significant harm to all the others. The whole thing leaves me with very violent thoughts.


One of my favorite spam comments goes something like this, “I have a site very much like yours. Do you have much of a problem with spam? Could you tell me what plugins you use to fight this?” It is quite clever because every time I see it, for an instant I want to respond. The more standard spam is more like, “It’s difficult to find experienced people in this particular subject, but you sound like you know what you’re talking about!” This raises the question of whether they know what I am talking about. That particular comment was in reference to an April Fools’ post based on an article in The Onion. Yes, I am experienced in that particular field; it is no wonder I sound like I know what I’m talking about!

One thought on “CAPTCHA Solved But Who Cares?”

  1. Funny. I saw the dog immediately but not the spots! People see things very differently. A bunch of us who’d seen "Gravity" (exciting movie, not as cerebral as I would have liked) were talking about the 3-D. It was seamless for some, had double-images or blurred ones at times for others. I used to be a projectionist, and am anal about film presentation. (I’m the guy who goes into the lobby and says, "fix the framing in auditorium 4, it’s too high.") I re-adjust the color/contrast on my TV whenever I get a new disc player. So I definitely see more than most. But are my problems with 3-D a problem with flaws I see and others don’t because their eyesight is better, or different? Tricky to say.

    Here’s a fun Sports Illustrated article on how athletes see things like pitches (the best hitters in MLB are bamboozled by a women’s softball champion pitcher):

    We just have no idea how the brain processes information. Absolutely none. More all the time, naturally, but we’re pretty much in the dark at this stage. What a fun field to be doing research in!
