Homosapien Too

I sent a message the other day on ebay, and came across a new feature: to submit a message you now have to prove you are not spammer but human (these being opposites) with a Turing test or CAPTCHA.  Ok, these things are common on web systems these days, but the new slant here was that if you could not read the graphic, you could click on a link and download an audio version to listen to instead.  This is also one of the proposed strategies for dealing with SPIT (SPAM over Internet Telephony) in our VoIP systems of the future, i.e. interact with the bona fide caller or spammer and present them with some kind of test or quiz before they get put through.  This could be as simple as “Press 8 to speak to Martyn or 0 for voicemail.”

But there is also an arms race aspect to this, for the smart spammer might also employ automatic speech recognition (ASR) technology, which is increasingly cheap and effective due to increasing CPU performance and falling hardware prices.  Their ASR server could be programmed to understand digits, and so have a fair stab at giving the correct answer to the CAPTCHA. 

It interested me that on ebay, the audio file downloaded did not have a pristine recording of the digits being read out, but instead had a variety of noises in the background: white noise; some fragments of speech.  Naturally it’s quite easy for a human to extract the digits from the background noise, but this is just the kind of chaff that might confuse the enemy radar, so to speak, of the spammer’s ASR system.

Happy July 4th to those of you in the USA, and welcome back all our friends that just celebrated Canada Day.