Applications of speech technologies

Integration into existing applications

AlfaNum ASR and TTS, as basic components for speech recognition and synthesis, are primarily intended for software development companies and system integrators. In most cases such companies have already deployed fully functional solutions, but ASR and TTS can add functionality to an application or service, or make it more attractive. Speech technologies also open the door to the development of entirely new applications and services, which could not be built using conventional methods of communication with the user.

Upgrade of call centres and interactive voice response (IVR) systems

ASR and TTS can be used to upgrade call centres and IVR systems, allowing automatic recognition of sequences of digits (PIN codes), amounts, dates, proper names etc., as well as automatic creation of flexible voice prompts and reading out any textual information using speech synthesis. Furthermore, addressing users by name and/or company name adds a level of quality to the service that was not achievable until now.
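As an illustration of the digit-sequence case, the sketch below turns a recognised transcript into a PIN string. It is only a minimal example under stated assumptions: the transcript is assumed to arrive as a plain string of English digit words, and the names DIGIT_WORDS and words_to_digits are hypothetical, not part of the AlfaNum API.

```python
from typing import Optional

# Hypothetical vocabulary: spoken digit words mapped to digit characters.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

def words_to_digits(transcript: str) -> Optional[str]:
    """Convert a transcript such as 'four two oh seven' to '4207'.

    Returns None if the transcript is empty or contains a word that is
    not a digit, so the dialogue can ask the caller to repeat the code.
    """
    digits = []
    for word in transcript.lower().split():
        digit = DIGIT_WORDS.get(word)
        if digit is None:
            return None
        digits.append(digit)
    return "".join(digits) if digits else None
```

Returning None rather than a partial result keeps the decision about re-prompting the caller in the dialogue logic, which is usually where it belongs.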

Extended dialling by voice

Making telephone calls has never been simpler. Each organisation, regardless of its size, can now have its own private telephone operator. It is no longer necessary to memorise dozens of telephone numbers or browse endless menus. One just needs to pick up the receiver, dial a number and say aloud the name of the person or department of interest. The system is connected to the existing private branch exchange, and is able to transfer the call to the desired person or department, or to initiate an outbound call. Each employee can define their own personalised phonebook using a simple web interface. This functionality can be used by both employees and outside callers.

Information related to timetables

Using an interactive voice response (IVR) system with ASR capabilities, it is easy to obtain the necessary information related to any timetable. For instance, if the caller states the desired destination and the day by voice, the system provides the requested information (the departure times) through speech synthesis. Such a system can be applied at bus or railway stations, airports and similar locations.
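The lookup step of such a dialogue can be sketched as follows. This is a minimal example under stated assumptions: the recognised destination and day are assumed to arrive as plain strings, the timetable entries are made-up sample data, and the names TIMETABLE and departures_prompt are hypothetical. The returned sentence is the text that would be passed to the speech synthesiser.

```python
from datetime import time

# Hypothetical sample timetable: (destination, day) -> departure times.
TIMETABLE = {
    ("belgrade", "monday"): [time(6, 30), time(9, 15), time(17, 45)],
    ("subotica", "monday"): [time(8, 0), time(14, 20)],
}

def departures_prompt(destination: str, day: str) -> str:
    """Build the sentence to be read out for a recognised destination and day."""
    key = (destination.strip().lower(), day.strip().lower())
    times = TIMETABLE.get(key)
    if not times:
        return f"Sorry, no departures to {destination} were found for {day}."
    spoken = ", ".join(t.strftime("%H:%M") for t in times)
    return f"Departures to {destination} on {day}: {spoken}."
```

For example, departures_prompt("Belgrade", "Monday") produces a sentence listing the three sample departure times, while an unknown destination or day produces an apology that the dialogue can follow with a re-prompt.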

TV schedule

In the car, on a bus, on a beach, on a hill... It does not matter where you are when you want to find out what is on TV. An efficient solution is for the caller to get all the desired information in one place, through a range of available queries. Another solution is for each broadcasting company to set up its own interactive voice response system.

White and yellow pages

Users can say the names and addresses of private individuals or organisations of interest (e.g. Petar Petrović, Novi Sad), and the system will find the requested data (e.g. telephone number) in the appropriate database and reply by synthesised speech. More flexible searches can also be supported, for example by business activity or by keyword.

Sports results

Many bookmakers want to keep track of sports results, betting odds and timetables of sports events at any moment. By means of a simple phone call all such information becomes available at any time of day or night, through efficient and intuitive communication with an interactive voice response system.


Using ASR or TTS brings a new level of quality to many existing information services, but also allows the creation of many more, such as televoting, lottery, personal ads services, horoscope... The callers do not need to memorise the digit(s) to be keyed in; they just say aloud the word or phrase of interest (e.g. “Scorpio”), and the system recognises it, retrieves the desired information from a database and replies to the caller by synthesised speech.

Medical appointment scheduling

The callers identify themselves by their social security numbers, and in a later phase also by their telephone numbers (possibly with additional verification such as: “Is this Mr Petar Petrović?”). The callers then state the name of the desired department (surgery, orthopaedics...), and/or physician. After making this selection, the callers are offered a list of time slots available for appointment, out of which they choose one.
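The slot selection at the end of this dialogue can be sketched as below. This is a minimal example under stated assumptions: caller identification and speech recognition are assumed to have already happened, slots are represented as plain strings, and the Schedule class with its available and book methods is hypothetical rather than a description of the deployed system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Schedule:
    # Department name -> free time slots (hypothetical plain-string slots).
    free: Dict[str, List[str]] = field(default_factory=dict)
    # Caller identifier -> (department, slot) that the caller has booked.
    booked: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def available(self, department: str) -> List[str]:
        """Slots to be read out to the caller for the chosen department."""
        return list(self.free.get(department, []))

    def book(self, caller_id: str, department: str, slot: str) -> bool:
        """Reserve a slot; return False if it is not (or no longer) free."""
        slots = self.free.get(department, [])
        if slot not in slots:
            return False
        slots.remove(slot)
        self.booked[caller_id] = (department, slot)
        return True
```

A booked slot is removed from the free list, so a second caller asking for the same slot receives False and can be offered the remaining ones.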

Adding speech functionality to websites

In a world of global networking and abundant information, it is not enough to be one of many and to have what everybody else has. Adding speech functionality to your website can be the feature that makes the difference. TTS technology can convert the textual content of the website into speech, offering a new dimension of surfing. The feature is particularly useful to the visually impaired, but also to many other visitors.

Subtitling TV shows

The AlfaNum ASR system can be used for subtitling shows in the Serbian language. At the moment this technology can be applied whenever the textual transcription of the show is available, and the system can then perform automatic synchronisation by comparing the audio content of the show with the transcription. The synchronised subtitles are displayed through teletext to the viewers who opt for it. This feature is of particular interest to the hearing impaired and the elderly, but also to many others who, for some reason, have a need for it.

Do you have an idea?

Speech technologies are so versatile that it is hard to imagine a branch of human activity where they cannot be applied. On the webpage of the AlfaNum project you can see many other examples, and you should feel free to suggest your own idea to us as well.