"It is the development of adaptation mechanisms"

Jean-Paul Haton, a researcher at Loria (the Lorraine Laboratory for Research in Computer Science and its Applications), a joint unit of CNRS, Inria and the three universities of Nancy, is one of the leading French specialists in speech recognition. He is co-author of a recently published book on the subject (1).

One has the feeling that voice recognition has been talked about for a long time without much actually happening. Why?

It has been talked about for a long time, that is true. I wrote my first article on the subject in 1969. And there has been disappointment, because much was promised and the promises were not kept. Every five years, I read studies saying that the market would explode the following year. There is nothing worse, because climbing back up the slope is very difficult. Today it is different: voice recognition is becoming an industrial reality, with dozens of applications.

Have things improved that much from a technical point of view?

Yes, primarily thanks to technological developments. Speech recognition requires a lot of computing power. I remember a time when the computer needed for this filled a whole room. Today, an off-the-shelf PC is more than enough. It should be noted that a speech recognition system nowadays is software. Apart from the microphone, there is no need for additional hardware. It must also be said that our algorithms have improved a great deal. As a result, systems have become robust enough to cope with their environment. Only a few years ago, we had acceptable results in the laboratory and deplorable results outside it. The difficulties are related first of all to accents and ambient noise. But the quality of the microphone, the acoustics of the room and even the quality of the telephone line also play a part.

To improve things, the system should be trained. Voice dictation software may well be designed to work with any voice. But if you take the trouble to train your software for an hour with your own voice, reading predefined sentences, and you stay in a quiet place such as your office, you will get better performance.

It is often said that the extraordinary spread of mobile devices will eventually lead to an explosion of speech recognition applications. Is this true?

It will have an impact, of course. Try to imagine the power you will have in a mobile phone ten years from now. One of the applications I think about a great deal is simultaneous translation. It will not be perfect, but it will be a valuable aid.

And in the future? Will application performance improve through better statistical tools or through more powerful processors? Or can a radical change of approach still be envisaged?

That is the right question. The approach today is exclusively statistical. Voice dictation software does not care about syntax, only about the sequence of words. Given the preceding words, it works on the probability that the next word will be this one or that one. Let's assume that I develop a voice dictation system from the database of the "Echos", i.e. hundreds of millions of words. After the word sequence "President Jacques", the system will infer that there is a good chance that the next word is "Chirac". The engine works with trigrams, i.e. sequences of three words. The limits are clear as well: with this language model, I will make a good voice dictation system for a journalist from the "Echos", but one that is not at all suited to a doctor.
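The three-word model described in this answer can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy corpus is invented, the function names `train_trigrams` and `next_word_probs` are hypothetical, and the probabilities are plain relative frequencies without the smoothing a real dictation engine would need.

```python
from collections import Counter, defaultdict

def train_trigrams(tokens):
    """Count, for every pair of consecutive words, which word follows it."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts

def next_word_probs(counts, w1, w2):
    """Relative-frequency estimate of P(next | w1, w2); empty dict if unseen."""
    followers = counts.get((w1, w2))
    if not followers:
        return {}
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}

# Toy corpus standing in for a newspaper archive (invented for illustration).
corpus = (
    "president jacques chirac spoke today . "
    "president jacques chirac met the minister . "
    "president jacques delors answered ."
).split()

model = train_trigrams(corpus)

# Most likely word after "president jacques" in this toy corpus.
probs = next_word_probs(model, "president", "jacques")
best = max(probs, key=probs.get)

# A word pair from another domain was never seen, so the model has nothing to offer.
unseen = next_word_probs(model, "the", "patient")
```

In this toy corpus, "chirac" follows "president jacques" twice and "delors" once, so `best` is "chirac". The empty result for "the patient" shows the limit mentioned above: a model trained on one kind of text knows nothing about the vocabulary of another profession.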

In the future, we may return to the methods of symbolic artificial intelligence, which are based on human behaviour. The history of science does indeed move in cycles. In recent years, the field has lived largely on "all-statistics", and it may come back to the human side with all the advances in neuroscience, psychology and linguistics. In fact, the two approaches should be combined, with mathematical models into which explicit knowledge would be incorporated. That way, we should manage to place a word in its context.

There is another avenue under study: the development of adaptation mechanisms. A little like a human being, a recognition system could, during the first seconds of a dialogue, adapt to its interlocutor and to the recording conditions. The idea, in effect, is to be able to train the system in an accelerated way.