The Players

 


Commercial Applications

The first commercial applications of computer-aided voice recognition came in the medical and legal fields.  Physicians and attorneys used to dictate case notes to an answering service, and a secretary would type the report.  As computer hardware and software became more powerful, the computer's speech recognition capabilities became sufficient to transcribe these dictations.  Rather than having someone retype the entire report, a human was needed only to proofread the rough draft that the computer produced.  As the technology becomes more powerful, even the human proofreader may no longer be necessary.  The need for an accurate and efficient method of transcription provided the impetus for today’s commercial voice recognition software.

There are three major players in the end-user commercial application of speech recognition: IBM, Lernout & Hauspie (L&H), and Dragon Systems.  These three companies provide software packages that convert spoken words into digital data that computer applications can use.  IBM's ViaVoice, L&H's Voice Xpress, and Dragon Systems' NaturallySpeaking are very similar products that are comparable in price, ease of use, and features.  The deluxe versions of these programs cost about $150 and have vocabularies of over 200,000 words.  They convert voice data into usable input for most popular software applications and have customized interfaces for the Microsoft suite of applications.  The programs are designed to recognize and correctly interpret dates, currency, and numbers.  The user can control the operations of the computer (such as opening and closing files and browsing the Web) through voice commands and macros.  The software will also read text and numbers back to the user in a human-sounding voice.  All of these voice recognition programs require an initial training session (from 15 minutes to an hour) to learn the specific patterns of an individual's voice.  As computer processor speeds have improved, so have the accuracy and speed of these voice recognition applications.
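As a rough illustration of what "interpreting numbers and currency" involves, the sketch below rewrites spelled-out numbers in a transcript as digits and adds a dollar sign when the word "dollars" follows. It is a minimal sketch under stated assumptions, not how ViaVoice, Voice Xpress, or NaturallySpeaking actually work: the word tables and the function names `words_to_number` and `normalize` are hypothetical, the input is assumed to be an already-recognized lowercase-able transcript, and ambiguous words such as the pronoun "one" are not handled.

```python
# Hypothetical word tables for illustration only; a real dictation
# product would rely on a much richer grammar and language model.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"hundred", "thousand"}


def is_number_word(word):
    """Return True if the word is part of a spelled-out number."""
    return word in UNITS or word in TENS or word in SCALES


def words_to_number(words):
    """Convert number words such as ['two', 'hundred', 'fifty'] to an int."""
    total = 0    # value of completed thousands groups
    current = 0  # value of the group currently being built
    for word in words:
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word == "thousand":
            total += max(current, 1) * 1000
            current = 0
        else:
            raise ValueError(f"not a number word: {word}")
    return total + current


def normalize(transcript):
    """Replace runs of number words with digits; mark dollar amounts."""
    tokens = transcript.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        # Find the end of a run of consecutive number words.
        j = i
        while j < len(tokens) and is_number_word(tokens[j]):
            j += 1
        if j > i:
            value = words_to_number(tokens[i:j])
            if j < len(tokens) and tokens[j] in ("dollar", "dollars"):
                out.append(f"${value}")
                j += 1  # consume the currency word
            else:
                out.append(str(value))
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


print(normalize("pay two hundred fifty dollars by friday"))
```

Note that this sketch lowercases the whole transcript and only covers whole numbers below one million; dates and other currencies would need additional rules of the same kind.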

 

VoiceXML

On March 2, 1999, twenty leading speech, Internet and communications technology companies announced the formation of the Voice eXtensible Markup Language (VXML) Forum to develop a standard in voice recognition technology.  The VXML Forum "aims to drive the market for voice- and phone-enabled Internet access by promoting a standard specification for VXML, a computer language used to create Web content and services that can be accessed by phone."[1] Once a standard is established in the computer community, adoption of voice recognition technology by third-party software developers should increase.  Even simple programs will be able to incorporate voice recognition technology without a large investment in development time and skill.

 

Natural Language Speech Assistant by Motorola and Unisys

The Natural Language Speech Assistant (NLSA) is a developer’s toolkit for building software that lets customers access the data they need using their own everyday language (or natural language), rather than restricting their responses to keypad entries or single-word answers.  The NLSA equips developers with the tools necessary for writing speech-enabled applications, which eliminates the need to learn the details of programming speech recognizers.  In addition, it protects programmers' development investments when they migrate to a different speech recognizer.  Unisys and Motorola also expect the NLSA to enhance current Internet voice recognition applications and to support new and more sophisticated applications by capitalizing on the speech technology available today.
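The idea of protecting application code when the underlying recognizer changes can be sketched as a thin interface between the application and the recognizer. The example below is an illustration of that general design only, not the NLSA's actual API: the class names `Recognizer`, `VendorARecognizer`, and `VendorBRecognizer` and the function `handle_request` are hypothetical, and the stand-in recognizers simply treat the "audio" bytes as text so the sketch can run without a speech engine.

```python
from abc import ABC, abstractmethod


class Recognizer(ABC):
    """Hypothetical recognizer-independent interface used by the application."""

    @abstractmethod
    def recognize(self, audio: bytes) -> str:
        """Return the recognized utterance as lowercase text."""


class VendorARecognizer(Recognizer):
    """Stand-in for one vendor's engine (assumption: audio is UTF-8 text)."""

    def recognize(self, audio: bytes) -> str:
        return audio.decode("utf-8").lower()


class VendorBRecognizer(Recognizer):
    """Stand-in for a second engine with slightly different cleanup."""

    def recognize(self, audio: bytes) -> str:
        return " ".join(audio.decode("utf-8").lower().split())


def handle_request(recognizer: Recognizer, audio: bytes) -> str:
    """Application logic: map a natural-language request to an action name.

    This function depends only on the Recognizer interface, so swapping
    engines does not require changing it.
    """
    text = recognizer.recognize(audio)
    if "balance" in text:
        return "BALANCE_INQUIRY"
    if "transfer" in text:
        return "TRANSFER_FUNDS"
    return "UNKNOWN"


# The same application code runs against either stand-in engine.
for engine in (VendorARecognizer(), VendorBRecognizer()):
    print(handle_request(engine, b"What is my account BALANCE please"))
```

Because `handle_request` never names a specific engine, moving from one recognizer to another only requires a new subclass of the interface, which is the kind of investment protection the toolkit description refers to.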

 

A number of large corporations have dedicated immense budgets towards researching SR technologies. Here are some of the prominent players:

Unisys Corporation

Unisys is developing technology for reaching customers through a strategy that incorporates multiple media, including the press, TV, Internet, phone, fax, mail and face-to-face contact, coordinated through "call centers."  By responding to the customer's voice commands, the call center can choose whichever of these media best fits the customer's need.  Unisys claims that intelligent call centers represent the company well and promote efficient communication across an organization.

   

VoxGateway by Motorola

VoxGateway software (part of Motorola's Mobile Internet Exchange) is a server product that runs applications developed in the VoxML programming language. The software is also licensed in an OEM version to technology partners that create products, systems and applications enabling voice access to the Internet. VoxGateway will support applications written using the Unisys tools, Motorola's VoxML language and the emerging VoiceXML specification.

 

VoxGateway and Natural Language Speech Assistant

On October 27, 1999, Unisys and Motorola announced that they would cross-license their products to each other.  Motorola will license its VoxGateway software to Unisys, and Unisys will license its Natural Language Speech Assistant software to Motorola.  The combination will allow consumers to find information and conduct transactions on the Internet using a telephone by interacting with a specially designed web site.  "The potential for this technology is overwhelming, and is limited only by the reaches of the Internet and telephone system. Unisys Natural Language Speech Assistant tools simplify and accelerate speech application development and delivery without requiring specialized developer skills," said Joe Yaworski, vice president and general manager of the Unisys Natural Language Understanding initiative. "The combination of Unisys speech technology expertise and Motorola's wireless Internet industry leadership delivers the potential to dramatically expand both the access to, and the business of, the Internet."

 

In addition to these players, technology firms such as Microsoft, Lucent (Bell Labs), IBM, and Apple Computer have launched significant efforts to improve SR technologies. Direct links to their research can be found on the reference page.

 
Copyright © 1999 Ira Greenberg and Andrew Bate.  All Rights Reserved.