From Speech to Text

There are plenty of moments I'd love to relive. Telephone interviews and business meetings aren't usually among them, but reliving them is exactly what I do each time I listen to the audio recordings I've made of those events for note-taking purposes.

Isn't there a better way? Can't I just, say, capture those conversations with a digital voice recorder and let speech recognition software transcribe the recordings into text while I watch Friday Night SmackDown?

To find out, I tested the Sony ICD-SX57 digital voice recorder ($150 list) with Nuance Communications' Dragon NaturallySpeaking speech recognition software. NaturallySpeaking 9.0 is bundled with the recorder and is also available separately in various versions starting at $100.

I chose the Sony recorder because it's fairly new (the product started shipping in May); it's inexpensive; it works in tandem with the leading speech recognition software, NaturallySpeaking; and it's among the most highly rated recorders for speech recognition accuracy (according to Nuance).

I tested the Sony recorder in several situations. First I dictated in my office and, later, in my car. Next I recorded a meeting, followed by two telephone conversations.

Finally, I connected the recorder to my Windows Vista PC via a USB cable, to see how accurately NaturallySpeaking would transcribe those recordings into text.

Dial 911!

The recorder-software bundle did a good job transcribing my dictated memos and other voice recordings. The most accurate transcriptions were of recordings I made in my office, windows closed, with background noise at a minimum. The transcriptions of those recordings were 85 to 95 percent accurate. Transcriptions of recordings made in my car were about 80 to 90 percent accurate.

But transcriptions of meetings were much less accurate. NaturallySpeaking transcribed what I said with about 75 percent accuracy, but it was only about 10 percent accurate in transcribing what others said.

Recordings I made of landline telephone conversations were mostly gibberish. About 20 percent of what I said was transcribed accurately, but the other party's words rarely made sense when converted into text. For example, NaturallySpeaking transcribed what one woman said on the phone to me as "and I in as as as as as as as as as as as as as as as him him him him him him him him him him him him him him him him him." Had she truly said that, I would have hung up and dialed 911.
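For readers who want to score their own transcripts, here is a minimal Python sketch of one common way to put a number on accuracy: word accuracy, which is one minus the word error rate. This is an assumption for illustration, not necessarily how the percentages above were estimated, and the function names are hypothetical. It compares a hand-typed reference transcript with the software's output, word by word.

```python
def word_edit_distance(reference, hypothesis):
    """Fewest word substitutions, insertions, and deletions needed
    to turn the reference transcript into the hypothesis."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # previous[j] = distance between the reference words seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word dropped
                current[j - 1] + 1,      # extra word inserted
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy as 1 minus word error rate, floored at 0."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    error_rate = word_edit_distance(reference, hypothesis) / ref_len
    return max(0.0, 1.0 - error_rate)
```

Scored this way, a nine-word memo transcribed with one wrong word comes out at about 89 percent, while a long run of repeated filler words like the one quoted above pushes the score to zero.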

It's Not Surprising

I wasn't surprised by these results, however. NaturallySpeaking is designed to accurately transcribe the speech of only one person at a time, not multiple parties speaking in one recording.

"Each person's style of speech is very different," says Chris Strammiello, NaturallySpeaking product manager. "In a meeting, there are multiple styles of speech. People are often talking over one another and speaking in a conversational style, which is different from the style of speech people use when dictating." To handle such a job would require speaker-independent, multi-party, large-vocabulary speech recognition software, Strammiello says. Such technology doesn't exist today, he adds.

But what about the speech recognition technology you encounter when you call an airline or other company, are confronted with a voice-mail menu, and are given the choice to speak your answers or type them on the phone keypad? That technology, which Nuance also develops, is speaker-independent, multi-party, small-vocabulary speech recognition, Strammiello explains. The software can recognize multiple voices but only a small range of words for each voice-mail menu option. For instance, if an airline's voice-mail menu asks on what day you want to fly, the software can accurately recognize "Monday" or even "next Monday," but not, say, "Anytime in the month of August."
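To make the small-vocabulary idea concrete, here is a toy Python sketch. It is an illustration under loud assumptions, not Nuance's method: it works on text that has already been recognized rather than on audio, and the phrase list and function name are hypothetical. The point is only that each menu prompt accepts a short, fixed list of phrases and rejects everything else.

```python
# Hypothetical phrase list for one prompt: "What day do you want to fly?"
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]
ALLOWED_PHRASES = set(DAYS) | {f"next {day}" for day in DAYS}


def match_menu_response(utterance):
    """Return the normalized phrase if it is on the prompt's short list,
    or None if the caller said something outside that list."""
    # Lowercase, collapse extra spaces, and drop trailing punctuation.
    normalized = " ".join(utterance.lower().split()).strip(".,!?")
    return normalized if normalized in ALLOWED_PHRASES else None
```

With this list, "Monday" and "next Monday" are accepted, while "Anytime in the month of August" returns None, at which point a real phone menu would typically ask the caller to try again.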

Meanwhile, a growing number of voice messaging services can transcribe a customer's voice-mail messages into text e-mails. Frequently, these services use a combination of speech recognition software and human transcriptionists, Strammiello says.

When will consumers be able to record meetings and have speech recognition software accurately transcribe what everyone said into text? Strammiello declined to name a date, but he allowed that it would probably happen in less than ten years.

Slim and Slender, But Confusing

As for the Sony recorder itself, the device is light and slender; it slips easily into a shirt or pants pocket.

As with some other Sony products I've tested, though, usability sometimes takes a back seat to functionality. The recorder's on-screen menu options and its buttons were a bit confusing initially, though eventually I got the hang of them. Also, there's little documentation about using the recorder in conjunction with Sony Digital Voice Editor, the included PC software that lets you play back and bookmark recordings on your computer. Sony could also do a better job of providing clear, step-by-step directions for using the recorder with NaturallySpeaking.

Worth It or Not?

Is the Sony recorder/NaturallySpeaking bundle worth the money? Yes, if you frequently need to dictate notes, memos, e-mail, and other text-heavy documents when you're away from the computer. I can see this bundle being useful to doctors, researchers, writers, field salespeople, home inspectors, and other professionals.

Mobile Computing News, Reviews, & Tips

Second-Gen Zunes: Microsoft's updated Zune portable media players are built around a rounded touch-sensitive control that doubles as a clickable controller, like the iPod's Click Wheel. The new Zunes ($150-$250) feature Wi-Fi for wireless syncing with your PC. Microsoft has also updated its Zune Marketplace, with new community features and DRM-free music sales.

iPhone Rivals Flaunt Flexibility: While Apple plays cat-and-mouse with hackers determined to unlock the iPhone, the company's competitors have been loudly proclaiming the openness of their phones. Example: Nokia launched a new Web site highlighting the openness of its phones. "We believe the best devices have no limits. That's why we've left the Nokia NSeries open. Open to applications. Open to Widgets. Open to anything," the site's home page reads. Microsoft and Research In Motion have also called attention to the openness of their smart phone operating systems.

Hands-On With the iPod Touch: Apple's iPod Touch is a gorgeous portable media player that's incredibly fun to use, says PC World reviewer Eric Dahl. But Eric encountered some problems with the touch-screen iPod, such as a software bug that halted music playback.

I agree the iPod Touch is a pleasure to use. But in my first few days as an iPod Touch owner, I encountered some gotchas too.

Suggestion Box

Is there a particularly cool mobile computing product or service I've missed? Got a spare story idea in your back pocket? Tell me about it. However, I regret that I'm unable to respond to tech-support questions, due to the volume of e-mail I receive.

Contributing Editor James A. Martin offers tools, tips, and product recommendations to help you make the most of computing on the go. Martin is also author of the Traveler 2.0 blog. Sign up to have the Mobile Computing Newsletter e-mailed to you each week.
