Saturday 18 December 2010

Voice Biometrics: Tell It Like It Is | Voice Biometrics News & Events

Earlier this month Cisco’s Alex Noble caused a stir when he published this blog post to discuss the fate of “voice biometrics security” in the wake of a decision by the UK Department for Work and Pensions (DWP) to close several trial implementations of “voice risk analysis” (VRA). Perhaps it was a mistake by the headline writer, but for some reason the terms “voice risk analysis” and “voice biometrics” were used interchangeably. One is not synonymous with the other.

VRA, which in the case of the DWP used technology from DigiLog UK, was designed to serve as a “lie detector” or “stress test” that measured a set of pre-defined “emotional characteristics” of a caller’s utterances. The pilot program was designed to test whether these characteristics could be a reliable, consistent way to detect fraudsters. Apparently the answer was no.

But that finding has nothing to do with Voice Biometrics (VB) technologies and their potential to reliably and consistently serve as platforms for caller authentication, validation and ID proofing. In contrast to VRA, which is looking for specific tone changes in spoken utterances, VB captures and distills a voice print that reflects the shape and unique attributes of a person’s vocal tract as well as unique characteristics of “how” he or she speaks. It is a much broader behavioral biometric than mere tone change. VB can be foundational to applications that include caller authentication or verification, speaker identification, ID proofing, “voice signatures,” and many other applications.
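To make the distinction concrete, here is a minimal sketch of the enroll-then-verify pattern behind caller authentication. It is an illustration only, not any vendor’s method: the feature vectors are hypothetical stand-ins for whatever a real engine extracts from audio, and the cosine comparison and the 0.9 threshold are assumptions chosen for readability.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no usable information
    return dot / (norm_a * norm_b)


def enroll(samples):
    """Average several feature vectors into one stored voice print.

    `samples` is a list of equal-length lists of floats; in a real system
    these would come from an acoustic front end (hypothetical here).
    """
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def verify(voice_print, sample, threshold=0.9):
    """Accept the caller if the new sample is close enough to the voice print.

    The threshold is an illustrative assumption; deployed systems tune it
    to balance false accepts against false rejects.
    """
    return cosine_similarity(voice_print, sample) >= threshold


if __name__ == "__main__":
    # Hypothetical feature vectors from two enrollment utterances.
    enrolled = enroll([[0.9, 0.1, 0.4], [1.0, 0.2, 0.5]])
    print(verify(enrolled, [0.92, 0.14, 0.46]))  # sample close to the voice print
    print(verify(enrolled, [0.1, 0.9, 0.2]))     # sample far from the voice print
```

Note that nothing in this flow asks what was said or how stressed the speaker sounded. The only question is whether the new sample matches the enrolled voice print, which is what separates VB from VRA.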

As VoiceVault’s Nik Stanbridge points out in this blog post, “no voice biometrics were used at all” in the DWP implementation.

The situation for speech technologies may worsen as automated speech processing becomes “cool” again in support of search, dictation, command and control on mobile phones. Several closely related technologies are wending their way into the public consciousness simultaneously. Analysts, journalists, investors, customers and prospects – many of whom should know better – have a bad habit of conflating the disparate implementations of what is, at base, pattern recognition coupled with business logic and rules that define its application, deployment patterns and use cases.

For instance, while fielding calls from the trade press regarding Google’s new “Personalized Voice Search,” I needed to clarify the fact that “highly personalized speech recognition does not equate to speaker identification or authentication.” Indeed, once an application starts collecting utterances and associating them with a specific speaker, I can see how it could be confused with speaker identification.

The salient difference between VRA and VB is “intent” at the application or user interface layer. Google is collecting additional utterances and associating them with a specific user in order to promote more accurate speech recognition. It’s all part of the ultimate Star Trek crew person-to-machine conversational model. If anything, Google has designed an application that assumes “the right person” is in possession of his or her mobile phone, a presumption that a true VB-based authentication system could validate.

I’m glad that Google is moving toward more accurate recognition of utterances from specific users. I think any initiative to promote greater accuracy and reliability helps further acceptance of, and comfort with, voice as a candidate for entering search terms, dictating messages and entering commands, especially on mobile devices. The unspoken (and sometimes spoken) question surrounding the ultimate success of voice biometrics is summed up as: “If a system can’t recognize what you say consistently, how do you expect it to recognize who you are with sufficient levels of confidence?”

While speech recognition has vastly different challenges from speaker recognition, the two are inextricably linked in many people’s minds. In the wild, solutions blend or integrate these disparate technologies. Enrollment and authentication, for instance, rely on an IVR (interactive voice response) platform. That creates even more reason to be precise in our wording as we bring VB more prominently into the e-commerce and mobile commerce mainstream.