In the past, the major constraint on developing the perfect Speech Recognition
system was the limited processing power of the computer's microprocessor. Once
this obstacle was overcome with the development of more powerful microchips,
the true limitation of SR technology became visible: the difficulty of
developing algorithms sophisticated enough to understand, interpret, and
respond to voice commands almost perfectly. The answer to this problem still
eludes the most successful research institutions. For instance, some systems
can understand input from a variety of users, but only with a limited
vocabulary bank. Conversely, other systems recognize over 200,000 words, but
only from a very limited number of users. No program yet exists that can
comprehend extensive vocabularies from a wide variety of speakers.

The
commercial programs available today require a "training session" with each
user, which may last over an hour. During this time, not only does the user
have to learn how to speak to the machine, but the computer also needs to
become accustomed to the user's voice. The lost hours may be a constraint on
productivity, but the training requirement also presents another problem:
systems will need to learn to understand multiple users in a short time (or
instantly). For instance, when we go to the McDonald's drive-thru window and
order a burger with ketchup, we will expect the computer system to recognize
our verbal input immediately. It would not be fast food if we had to train the
Speech Recognition program for half an hour!

Another
limitation of current SR tools is that nearly unlimited variables make up the
sound of a voice. For example, our voice when we answer a phone call just
after waking sounds different from our voice after we have cheered all night
at a basketball game. Additionally, background noise limits the effectiveness
of SR technology. It is relatively easy for the computer to filter out
background noise when we are speaking in a quiet office, but if we were to say
the same phrase on a busy street, the SR system would be confused.

Even
though the current SR systems have limitations, significant progress has
been made toward developing a perfectly reliable SR program. Once these
frustrating hindrances are overcome, the potential for SR technology is
enormous. Traditional methods of inputting data into a computer, such as the
mouse and keyboard, will become obsolete. Furthermore, the interaction
between the user and the computer will commonly be speak/listen/speak…
as opposed to mechanical-input/read/mechanical-input… For a further look
into the future and potential of Speech Recognition technology, visit the
Future page.
Copyright © 1999 Ira Greenberg and Andrew Bate. All Rights Reserved.