IBM gives new voice to speech technologies

Opening the doors to its research labs in Israel, IBM showed Computing its latest developments

Written by James Watson

IBM's Haifa research laboratory has been investigating voice recognition and speech technology as an alternative interface to various devices.

Speech recognition is particularly suited to environments where people cannot easily access a device with a keyboard or mouse. Looking for directions while driving a car is a classic example.

IBM's success in this area has led to its technology being adopted by Japanese carmaker Honda for a voice-controlled in-car satellite navigation system, which is being installed in selected Acura models.

The first cars using the technology hit the road last September, following a rapid development cycle to adapt prototype technologies.

The system allows drivers to ask for directions to a specific location. Once the route has been established, the application reads the directions back to the driver.

Zohar Sivan, manager of media services and technologies at IBM's Haifa labs in Israel, says much progress has been made in the field of voice control, but there's still a long way to go.

'Speech is just another form of user interface. But the technology is not perfect and won't be perfect for a long time to come.'

Sivan has recently been concentrating his research on text-to-speech (TTS) technology, which helps companies that build consumer devices to avoid paying for costly pre-recorded voice prompts.

'The automation of this is very attractive for firms,' says Sivan.

But because most users don't respond well to tinny, obviously computerised voices, a more human-sounding speech engine is crucial.

However, developing TTS systems for mobile devices with limited memory capacity, such as the one used in the Honda vehicles, poses a tough technical challenge.

While the quality of a computer-generated voice is boosted by an extensive sample of pre-recorded sentences, this also bloats the size of the software.

One of the approaches developed so far in Haifa is a technology that allows a voice sample library to be substantially compressed, without any significant compromise on quality.

This has allowed the lab to compress a 500MB application into a mobile version of between 9MB and 15MB, roughly two to three per cent of the original size, which can be bundled into low-capacity mobile devices.

For Honda's needs, the lab boosted the number of sample sentences to about 5,000, and added support for a higher sampling frequency to provide a richer voice.

In a spoken demonstration, Sivan instructed the software to read out a paragraph from a page, first using the original technology from several years ago and then the newer system.

The new voice is markedly clearer and more lifelike than before, although it is still obviously not human. Closing that gap is what will keep Sivan and his team busy in the years to come.
