Harmony, a leading Indian magazine for senior citizens, reviews Dragon.
Speech recognition software
Rajashree Balaram tells you about the potential of speech recognition software.
Ever wished your computer could just listen to your orders and act on them? You are not alone. Many silvers find the keyboard and the mouse literally a pain to handle, especially if they suffer from arthritis. But help is here in the form of speech recognition technology, which enables you to control the computer without touching the mouse or keyboard.
SPEAK EASY Speech recognition is a process in which a computer program converts a speech signal into text. You need to install speech recognition software on your computer and speak into a microphone or headset connected to it. Using your voice, you can type letters, documents, reports and memos; operate various menus and buttons; open applications; send email; and navigate the Internet. The hands-free interaction helps prevent repetitive stress injury (RSI), a common problem among people who work extensively on computers. (RSI is a syndrome, resulting from overuse of a tool, that affects the muscles, tendons and nerves of the hands, arms and upper back.)
HOW IT STARTED Tentative research in speech recognition technology originated in industrial research labs in the United States in the 1950s. As computers came with limited computing power till the late 1970s, they were not equipped to analyse a continuous speech pattern. Major research advances in the mid-1980s finally made it possible for a desktop personal computer to recognise a vocabulary of 20,000 words - a meagre capacity compared to the extensive vocabulary of 250,000-300,000 words in contemporary speech recognition software.
Speech recognition has come a long way since the 1990s, when initial offerings required a lot of additional hardware to operate. Now speech recognition software can run on standard platforms such as Windows XP and Windows Vista. Programs permit dictation of text directly into a voice-aware application that collaborates with the word processor. An online error correction feature allows words to be transcribed based on the context in which they are used. Efficiency may vary depending on user accent, but typically there is a learning feature that enables the system to improve its recognition of users' speech with regular use. It's also very fast - high quality speech recognition software can type out up to 160 words per minute, depending on your computer's processing speed (minimum 1 GB RAM with 512 MB free).
Speech recognition technology is finding new applications in a variety of fields. Today it is also being used in mobile phones and personal digital assistants. In mobile phones, for instance, it allows you to train the phone to call someone when you simply say his or her name. "Earlier, speech recognition technologies that came in from the West were attuned to Western pronunciation and had trouble deciphering the Indian accent," says Dr Aniruddha Sen, senior researcher at Tata Institute of Fundamental Research (TIFR). "However, the scenario is changing as more IT companies and research firms [IBM's India Research Laboratory (IRL), Indian Institute of Technology (IIT), Centre for Development of Advanced Computing (CDAC) and TIFR] are working to develop speech recognition technologies compatible with Indian English and regional Indian languages." IBM recently launched speech recognition in Hindi that is even sensitive to variations in dialect. The software, which has a vocabulary of 75,000 words, can enable semi-literate and physically challenged people to access information through voice-enabled ATMs, kiosks and other such devices.
LEADER OF THE PACK Now, let's look at some popular speech recognition options on the market. Dragon Naturally Speaking leads the pack with 85 per cent global market share in speech recognition. Manufactured by US-based Nuance Communications, Naturally Speaking was launched in India in 1996. It is available in two editions: Standard and Preferred. Both have a vocabulary of 250,000 words to which you can add many more.
The Standard edition (Rs 7,990) is ideal for basic use - sending email, typing letters and documents, and surfing the Net. It comes with a high quality noise-cancelling headset that eliminates all ambient noise so your voice is the only sound absorbed. It is compatible with all basic Windows applications such as Microsoft Word, Microsoft Outlook Express, Lotus Notes and Microsoft Internet Explorer. The Preferred edition (Rs 14,990) is ideal for PC enthusiasts and professionals and can even be customised to accommodate the vocabulary of lawyers and doctors. It offers more features than the Standard Edition such as dictation playback so you can replay your voice and make corrections on screen using basic speech commands; text-to-speech screen reader; import and export of user profiles in case you switch to a new computer; and dictation shortcuts for rapid transcription.
If you are always on the move, the Preferred Mobile edition (Rs 24,990) is your best bet as it comes with a digital voice recorder (DVR). So while you are travelling and don't have access to your computer, you can speak into the DVR and later hook it up to your computer for transcription. On the other hand, if you would rather dictate while moving around the room, try the Preferred Wireless edition (Rs 24,990). You can choose between a radio frequency microphone that works within a radius of 200 m and a Bluetooth wireless headset that allows you to control your computer within a closer range of 20 ft.
The latest Version 9 of Naturally Speaking delivers 99 per cent speech recognition accuracy compared to 95 per cent offered by the earlier Version 8. Unlike other speech recognition software that is more sensitive to US or UK English pronunciation, Naturally Speaking Version 9 is the only speech recognition product that's available in an Indian English version. According to Manish Goenka, national distributor (India) for Naturally Speaking, the software can even be trained to pick up your individual accent. And you don't need to match the pace of your dictation with your computer's processing speed. "For instance, if you are a fast talker, you can continue dictating regardless of the speed of text appearing on screen, break for tea perhaps, and by the time you return to your desk, you will find your full dictation converted into text," he explains. Around 8 per cent (about 400 people) of Goenka's clientele are silvers, including general users, writers, lawyers, accountants, consultants and physicians.
BUDGET OPTION If you are looking for more affordable software, there is IBM ViaVoice - you can purchase it online at www.nuance.com using your credit card. The product was withdrawn from Indian stores a couple of years ago as it delivered less than 80 per cent accuracy with Indian English pronunciation. Even though it's not as accurate as Naturally Speaking, you cannot ignore the cheerful price tag. IBM ViaVoice is available in four Windows-compatible editions: Personal ($ 29.99 = Rs 1,200), Standard ($ 49.99 = Rs 2,000), Advanced ($ 79.99 = Rs 3,200), and Pro USB ($ 189.99 = Rs 7,500). ViaVoice also has two editions to suit Macintosh platforms: the Mac OS X Edition ($ 124.44 = Rs 5,000) and Simply Dictation Mac OS X Edition ($ 59.99 = Rs 2,400). The vocabulary is richer too; ViaVoice Standard, Advanced and Pro USB editions have a vocabulary of 300,000 words. Like the Naturally Speaking Preferred edition, the ViaVoice Advanced also works with digital hand-held recorders. If you are ready to compromise on accuracy but want to strike a sound price bargain, ViaVoice scores high.
SHAREWARE If your PC runs on Windows Vista, the latest operating system from Microsoft, you don't need to invest in speech recognition software - Vista comes with in-built speech recognition technology. Though the tool is partial to US and UK English, Microsoft claims that performance improves with time as the software adapts to your personal accent over repeated training. An interactive speech tutorial teaches you how to use the tool and simultaneously gets the system acquainted with your voice.
Along with useful features offered by other software like typing email, memos and documents, Vista's speech recognition tool also enables you to fill out application forms online. It has many limitations, though: vocabulary entries are limited to single words; speech recognition processing is very slow; and new words that are trained are often recognised incorrectly.
There were initial fears that computers using the tool were at risk from malicious commands played in an audio file. For example, a command to delete a file could be played through the computer speakers and picked up by the microphone. However, Microsoft reassured users that such actions would prompt a warning that cannot be dismissed with voice commands.
You can even download basic speech recognition software from www.e-speaking.com for a 30-day free trial. As it's a trial version, it may only pack a few features - just 100 commands or so - but if you are impressed, you can buy it for $ 14 (about Rs 600). But you need Microsoft's Speech Application Program Interface 5 (SAPI 5) and the .NET framework for it to work on your computer. The .NET framework is a software component that is included only in more recent Microsoft Windows operating systems such as Windows Server 2003, Windows Server 2008 and Windows Vista. Similarly, SAPI 5 is a Windows Vista feature and has to be installed separately on other platforms.
So, dash off that email to your kids abroad; write that book you have always wanted to; enhance productivity. Your voice can do wonders.
GET TALKING
You can create 'voice shortcuts' to insert greetings, addresses and quotes.
You need to spell out punctuation. But making letters bold, reducing or increasing font size, underlining, changing font colour and inserting para space can be achieved through voice commands.
Voice commands are mostly very simple. For instance, 'start listening', 'save as', 'test document', 'open word pad'.
When proper nouns, such as Indian names, are not initially recognised, you can add them to the software's existing word list for it to recognise the word later.
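For readers who like to tinker, the idea behind voice shortcuts and spoken punctuation can be sketched in a few lines of Python. This is only an illustration of the concept and not how Dragon, ViaVoice or Vista work internally; the shortcut phrase, the punctuation words and the function name expand_dictation are all made up for this example.

```python
import re

# Hypothetical spoken punctuation words and the marks they stand for.
SPOKEN_PUNCTUATION = {
    "comma": ",",
    "full stop": ".",
    "question mark": "?",
}

# Hypothetical voice shortcut: a short spoken phrase that expands into longer text.
VOICE_SHORTCUTS = {
    "insert greeting": "Dear friends",
}


def expand_dictation(transcript):
    """Turn an already-recognised transcript into finished text.

    Shortcuts are expanded first, then spoken punctuation words are
    replaced by the marks themselves (removing the space before them).
    """
    text = transcript
    for phrase, replacement in VOICE_SHORTCUTS.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", replacement, text)
    for phrase, mark in SPOKEN_PUNCTUATION.items():
        text = re.sub(r"\s+" + re.escape(phrase) + r"\b", mark, text)
    return text


if __name__ == "__main__":
    print(expand_dictation("insert greeting comma how are you question mark"))
```

Real products work on the sound of your voice rather than on typed words, so treat this purely as a picture of what happens after the speech has been recognised.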
Featured in Harmony Magazine, February 2008
Harmony Link: http://www.harmonyindia.org/hportal/VirtualPageView.jsp?page_id=6345