Martin Geddes recently reflected on the use of Skype as a tool for recording podcasts with two people in different locations. This technique is now used on many podcasts, including Blue Box, the VoIP Security Podcast. But as Geddes says, the quality is sometimes not all it should be, and it would be useful to record in top quality and transmit that recording out-of-band, while using the inferior real-time audio for the conversation between the two podcasters. This technique (called double-ending, or a “double ender”) is sometimes done manually today in podcasting and in radio: each person records their own end of the conversation locally, and the files are then spliced together afterwards to make a broadcast-quality programme. The telephone call only needs to be good enough for the two people to understand each other while the interview is taking place.
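The splicing step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any particular podcast does it: both recordings are assumed to be mono lists of float samples at the same sample rate, and the function name `mix_double_ender` and the `offset_b` parameter are hypothetical. In practice the offset between the two recordings would have to be found by hand or from a sync sound such as a clap.

```python
def mix_double_ender(track_a, track_b, offset_b=0):
    """Mix two locally recorded mono tracks into one programme track.

    track_a, track_b: lists of float samples in the range [-1.0, 1.0].
    offset_b: number of samples by which track_b started recording
              after track_a (an assumption; must be measured somehow).
    """
    if offset_b < 0:
        raise ValueError("offset_b must be non-negative")

    length = max(len(track_a), offset_b + len(track_b))
    mixed = []
    for i in range(length):
        # Treat each track as silence outside its recorded span.
        a = track_a[i] if i < len(track_a) else 0.0
        j = i - offset_b
        b = track_b[j] if 0 <= j < len(track_b) else 0.0
        # Sum the two ends and clip so the result stays in range.
        mixed.append(max(-1.0, min(1.0, a + b)))
    return mixed
```

A real tool would also need to handle differing sample rates and clock drift between the two machines, which this sketch ignores.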
But adding double-ending functionality to Skype has interesting possibilities beyond podcasting. In some areas human speech needs to be understood by less tolerant parties than humans, for example in automatic speech recognition or speaker verification. Given that VoIP streams can be of cellphone quality (or lower), it could be useful for a computer system to be able to replay, at higher quality, a passage of speech it was having trouble with. For example, a speaker verification system might listen to the live VoIP speech, perhaps match with a certainty of only 20%, then after a few tens or hundreds of milliseconds try again using the extra high-fidelity information that came in while it was processing the first time. That would be much better than forcing the user to re-speak their passphrase over and over until the computer figures it out.
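The retry idea could look something like the following sketch. Everything here is hypothetical: `score_fn` stands in for whatever verification engine is in use, `fetch_hifi` stands in for whatever mechanism delivers the out-of-band high-quality audio, and the 0.8 acceptance threshold is an arbitrary placeholder.

```python
def verify_two_pass(score_fn, live_audio, fetch_hifi, threshold=0.8):
    """Try verification on the live audio, then retry on hi-fi audio.

    score_fn:   callable returning a match certainty between 0.0 and 1.0
                (hypothetical stand-in for a real verification engine).
    fetch_hifi: callable returning the high-fidelity recording, or None
                if it has not arrived (hypothetical).
    Returns (accepted, certainty, source_used).
    """
    first = score_fn(live_audio)
    if first >= threshold:
        return True, first, "live"

    # The live stream was not good enough; use the better copy if we have it.
    hifi_audio = fetch_hifi()
    if hifi_audio is None:
        return False, first, "live"

    second = score_fn(hifi_audio)
    return second >= threshold, second, "hifi"
```

The user is only asked to repeat the passphrase if both passes fail.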
On the subject of Dan York (of Blue Box) and Martin Geddes, you can almost see them in this photograph from Fall VON. York is moving at speed, presumably in order to eclipse Geddes.
Martyn,
Thanks for posting this and also for pointing out Martin Geddes' comments. The use of softphones in general – and Skype specifically – certainly does have both benefits and drawbacks for podcasters. On the plus side, we can actually create shows that involve hosts located wherever they are in the world. On the negative side, audio quality can sometimes be… um… less than ideal.
For instance, Jonathan and I recorded a Blue Box episode via Skype in early September while he was in Asia, and the recording had quite a few audio artifacts and gaps. Because I’m a stickler for audio quality, my post-production of that episode has taken quite some time – and the resulting show still doesn’t meet my audio quality standards, but it is what I’ve got.
I don’t use the “double-ender” approach for our recording, primarily because of the extra work currently involved in doing so. I just have Skype or another softphone running on one PC and do the recording on a second PC (mixing in my microphone) – so the audio I record is whatever I hear. Certainly if there were a tool that made double-enders simple, along the lines of what Martin suggests, I might well consider it for occasional recordings in bad audio situations (or at least as a backup).
Thanks again for the pointer,
Dan