Hark, the drunkard

Catching sloshed people on the roads has always been tricky for the police. Mild drinkers are often experts at pretending that they are sober. But the next time you get caught for driving in an inebriated state, the cops may simply ask you to speak into a microphone linked to a computer, which can easily provide conclusive proof of your crime (The Economist, Vol 339, No 7969).

An American scientist, Kathleen Cummings from the Georgia Institute of Technology in Atlanta, US, is developing a new technology that can elicit patterns from drunken speech that a computer can analyse. In related work, researchers Renetta Tull and Janet Rutledge from Northwestern University in Evanston, Illinois, are examining what having a cold can do to your voiceprint. Their efforts aim at discovering what a computer would need to separate one person's voice from another's.

A lot of research in speech recognition is already under way. The task basically means picking out the voiceprint: the elements that make a voice unique and hence recognisable. A reliable voiceprint would also allow a person's voice to be recognised despite a bunged-up nose or alcohol-sodden consonants.

The human voice, as we hear it, is born in the vocal cords, and is then modified and articulated in the vocal tract: the throat, mouth and lips. For a computer to recognise a human voice, it needs to convert the sound into mathematical formulae. There are several ways of doing it, each highlighting different aspects of the voice. The first is the raw sound, the waveform.

To unpack this waveform, Cummings passed a speech recording backwards through a series of filters that mimic the vocal tract, returning it as closely as possible to the pristine glottal sound. She found that a drunken person not only walks unsteadily but also has wavering vocal cords, which makes a reliable computer test for intoxicated speech possible. Similarly, Tull and Rutledge put snuffly and normal voice recordings through another filter, one that simulates how human hearing systems process sound. They found that healthy and unhealthy voices contained different features which could be measured by a computer.
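
The idea of running speech "backwards" through a vocal-tract filter can be illustrated with a small sketch. The article does not say which filters Cummings used, so what follows is only an assumption: it substitutes linear prediction, a standard textbook way of estimating a vocal-tract-like filter, and applies it to a synthetic toy voice rather than real speech. All function names, the filter values and the pulse spacings are made up for illustration.

```python
import numpy as np

def all_pole(excitation, a):
    """Toy vocal tract: s[n] = e[n] + sum_k a[k-1] * s[n-k]."""
    out = np.zeros_like(excitation)
    for i in range(len(excitation)):
        acc = excitation[i]
        for k, ak in enumerate(a, start=1):
            if i - k >= 0:
                acc += ak * out[i - k]
        out[i] = acc
    return out

def synth_voice(periods, a, tail=200):
    """Glottal pulses spaced by `periods` (in samples), shaped by the toy tract."""
    positions = np.concatenate(([0], np.cumsum(periods)))
    excitation = np.zeros(positions[-1] + tail)
    excitation[positions] = 1.0
    return all_pole(excitation, a)

def lpc_coefficients(signal, order):
    """Estimate the tract filter from the signal (autocorrelation method)."""
    n = len(signal)
    r = np.array([np.dot(signal[: n - k], signal[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def inverse_filter(signal, a):
    """Undo the estimated tract: e[n] = s[n] - sum_k a[k-1] * s[n-k]."""
    residual = signal.copy()
    for k, ak in enumerate(a, start=1):
        residual[k:] -= ak * signal[:-k]
    return residual

def pulse_times(residual, min_gap=20):
    """Locate glottal pulses as large peaks at least `min_gap` samples apart."""
    candidates = np.flatnonzero(residual > 0.5 * residual.max())
    times = []
    for idx in candidates:
        if not times or idx - times[-1] >= min_gap:
            times.append(idx)
    return np.array(times)

def jitter(times):
    """Mean cycle-to-cycle change in pulse spacing, relative to the mean spacing."""
    periods = np.diff(times)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def measure_jitter(periods, tract):
    voice = synth_voice(periods, tract)
    estimate = lpc_coefficients(voice, order=len(tract))
    return jitter(pulse_times(inverse_filter(voice, estimate)))

# Hypothetical two-pole resonance standing in for the vocal tract.
TRACT = [2 * 0.9 * np.cos(2 * np.pi * 0.1), -0.9 ** 2]

steady_jitter = measure_jitter([80] * 10, TRACT)
wavering_jitter = measure_jitter([80, 74, 86, 78, 84, 76, 82, 80, 72, 88], TRACT)
print(steady_jitter, wavering_jitter)
```

In this toy setting the unevenly spaced pulses yield a larger jitter value than the evenly spaced ones, which is the kind of measurable "waver" the article describes. Real recordings would need far more care (noise, pitch tracking, a realistic glottal model), and nothing here should be read as the researchers' actual procedure.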