• The best choice for the company!

    Speed up and simplify communication with the devices around you.

  • Get the best solution for you!

    Take your call centers to unprecedented levels.

  • Let's get work done together.

    The future has arrived here, too. Become a part of it.

  • The best choice for the company!

    Work even when your eyes or hands are busy.

Main objectives and activities

  • The development of flexible text-to-speech synthesis (TTS) of high quality
  • The development of large vocabulary continuous automatic speech recognition (ASR)
  • The research and development of emotional speech recognition
  • The development of speech morphing systems
  • The development of natural language processing modules including dialogue management
  • The application of the developed speech technologies in Western Balkan countries:
    • in multimodal human-machine dialogue systems (IVR, smart phones, smart homes)
    • for purposes such as: text reading, text dictation, speech transcription
    • within aids for the physically disabled, visually impaired, speech impaired, hearing impaired.

The most important innovative results

News can be listened to at a number of speech-enabled web sites (Radio Television of Serbia - RTS, Radio Television of Vojvodina - RTV, eUprava, as well as several municipalities) using a computer or a smart phone. The visually impaired can listen to any text displayed on the screen using the anReader software, which is based on AlfaNumTTS. The AlfaNumASR and AlfaNumTTS components have provided smart phones with basic speech generation and understanding functionalities in Serbian.

Further development of both large vocabulary ASR and more advanced TTS is based on the aforementioned speech and language resources. Both technologies will enable a much wider range of applications and will contribute to the preservation of Serbian and kindred languages in this new domain of communication – spoken dialogue between humans and machines.

Word Spotter is a system that enables highly efficient and reliable search for predefined keywords in a large quantity of audio material. It is based on automatic speech recognition (ASR) technology but is optimised for locating particular words and phrases, disregarding the remaining speech, background noise, and music.

With Word Spotter it is no longer necessary to listen to all of the existing audio material in search of particular words or phrases. Users specify a list of words or phrases to be detected and import the appropriate sound files. Once processing is complete, a list of occurrences of the target words or phrases in the sound files is available. The user just needs to go through the list and select the occurrences of interest.
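The workflow above (a keyword list and sound files go in, a list of occurrences comes out) can be illustrated with a minimal sketch. It is not Word Spotter's actual API or algorithm: it assumes, purely for illustration, that a recognizer has already produced time-stamped words for each file, and the names `TimedWord`, `Hit`, `spot`, and `call_001.wav` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    """One recognized word and its start time in seconds (hypothetical ASR output)."""
    text: str
    start: float


@dataclass(frozen=True)
class Hit:
    """One occurrence of a target phrase in a sound file."""
    phrase: str
    file: str
    start: float


def spot(phrases, file, words):
    """Return every occurrence of each target phrase, ordered by start time."""
    tokens = [w.text.lower() for w in words]
    hits = []
    for phrase in phrases:
        target = phrase.lower().split()
        if not target:
            continue  # ignore empty phrases
        n = len(target)
        # Slide a window of n words over the file and compare it to the phrase.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                hits.append(Hit(phrase, file, words[i].start))
    return sorted(hits, key=lambda h: h.start)


# Made-up example data standing in for one processed sound file.
words = [
    TimedWord("please", 0.0), TimedWord("cancel", 0.4), TimedWord("my", 0.9),
    TimedWord("order", 1.1), TimedWord("and", 1.6), TimedWord("cancel", 1.9),
    TimedWord("it", 2.3),
]
hits = spot(["cancel", "my order"], "call_001.wav", words)
for h in hits:
    print(f"{h.file} {h.start:5.1f}s  {h.phrase}")
```

The resulting list is what a user would then go through to pick the occurrences of interest. A real word spotter works on the audio itself and would also attach a confidence score to each hit.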

The system has the following features:

  • Search for an arbitrary number of words or phrases in an unlimited quantity of audio material
  • Automatic inflection of keywords: the application attempts to find all grammatical forms of a word (if so specified)
  • Support for a range of formats of audio files
  • Support for multiple parallel searches in the background (leaving the user free to do something else in the meantime)
  • Manual verification of the results
  • Support for modern multicore and multiprocessor platforms
  • Possibility of distribution over multiple computers and load balancing, which is of crucial importance in highly demanding environments
  • Available in several forms: as a stand-alone application with its own GUI, as an API or library, or integrated into some of our other products (Audiomemo recording system)
  • Integration with a module for detecting high levels of emotion is also under way, which will improve the efficiency and applicability of the system.
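The load-balancing feature listed above can be pictured with a small sketch. It does not describe how Word Spotter distributes work. It assumes that processing cost is roughly proportional to audio duration and uses a generic greedy "longest file first, least-loaded worker next" strategy. The function name `balance` and the file names are hypothetical.

```python
import heapq


def balance(durations, n_workers):
    """Assign files to workers, longest first, always to the least-loaded worker."""
    # Heap of (accumulated seconds, worker id); the smallest load is popped first.
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    plan = {w: [] for w in range(n_workers)}
    # Sort by descending duration; break ties by file name for determinism.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, w = heapq.heappop(heap)
        plan[w].append(name)
        heapq.heappush(heap, (load + seconds, w))
    return plan


# Made-up audio durations in seconds.
durations = {"a.wav": 600.0, "b.wav": 300.0, "c.wav": 300.0, "d.wav": 120.0}
plan = balance(durations, 2)
print(plan)
```

With these example durations, one worker receives the longest file plus the shortest one, and the other receives the two mid-length files, so both finish at similar times.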

Speech

Speech is the basic means of communication between humans

Using speech, humans can convey their thoughts and feelings to others in a way much more intricate than in any other animal species, and thus the human speech system is the most complicated one...

ASR

Automatic Speech Recognition

Automatic speech recognition (ASR) is considered one of the greatest technical challenges of today, and has attracted the attention of many researchers worldwide for more than half a century...

TTS

Text-to-Speech Synthesis

Text-to-Speech Synthesis (TTS) is the oldest speech technology, originating from as early as the 18th century, when the first "speaking machines" appeared...
