Axon Voice Assistant is an application that lets users issue voice commands on smartphones: calling a contact by name or by phone number, sending messages, starting navigation, and so on.
Key features of the application are:
- Voice dialing (by contact name or phone number)
- Address book, contact and call log management
- Text message management
- Dictation of text messages, using an Internet connection
- Calling contacts and initiating messages to contacts through Viber and WhatsApp
- Starting navigation to any desired location selected by voice, using Google Maps or Here WeGo
Additional functionalities of the application:
- The ability to carry out all actions by touch, without using voice, if the user prefers
- Quick and easy activation of the application by shaking the phone
- Working without an Internet connection (except for the first launch and dictation of messages)
- Filtering contacts, messages and call logs by name or part of a name, messages by content, and call logs by type of call
- The application is specifically adapted to the Serbian language, so names can be spoken in their appropriate morphological forms ("Pozovi Lučića" instead of "Pozovi Lučić"). The application supports both the Cyrillic and the Latin alphabets, as well as names written without letters such as 'č' and 'ć' (the user can say "Lučić" even if the name is written as "Lucic" in the address book)
- Automatic alphabet conversion when sending text messages
- The option to set Axon Voice Assistant as the default messaging application
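The alphabet and diacritic handling described in the list above can be illustrated with a short sketch. This is not the application's actual code: the function names `to_latin`, `contact_key` and `filter_contacts` are hypothetical, and folding 'đ' to "dj" is an assumption about how names are typed without diacritics. Morphological forms such as "Lučića" are not handled here.

```python
import unicodedata

# Standard Serbian Cyrillic-to-Latin correspondence (lowercase letters).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}

# Assumption: how diacritic letters are usually replaced in plain ASCII names.
FOLD = {"č": "c", "ć": "c", "š": "s", "ž": "z", "đ": "dj"}


def to_latin(text):
    """Lowercase the text and convert Serbian Cyrillic letters to Latin."""
    text = unicodedata.normalize("NFC", text).lower()
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)


def contact_key(name):
    """Build a comparison key that ignores alphabet, case and diacritics."""
    return "".join(FOLD.get(ch, ch) for ch in to_latin(name))


def filter_contacts(query, names):
    """Keep the names whose key contains the key of the query."""
    key = contact_key(query)
    return [name for name in names if key in contact_key(name)]
```

With this key, "Лучић", "Lučić" and "Lucic" all compare as equal, so a recognized name can be matched against the address book regardless of how the entry was typed.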
A beta version of this application (1.3.14) is available from the following link: Download Axon - Voice Assistant
How it works
Dialog manager
This module is responsible for the overall behavior of the system. It takes the output of the speech recognizer as its input and performs the appropriate action.
A set of tasks is defined, with a precise specification of the information required for their execution. Some examples of these tasks are: calling a contact, sending a text message to a contact, etc.
If the system fails to recognize the verbal input, an appropriate message is displayed to the user. If the user does not provide all the necessary information, the system asks follow-up questions to obtain it.
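The behavior described above (tasks with required information, an error message for unrecognized input, and follow-up questions for missing details) can be sketched as simple slot filling. The task table, the prompts and the function `next_step` are illustrative assumptions, not the application's actual implementation.

```python
# Hypothetical task definitions: each command lists the slots it requires.
TASKS = {
    "CALL": ["contact"],
    "SEND_SMS": ["contact", "text"],
}

# Hypothetical follow-up questions, one per slot.
PROMPTS = {
    "contact": "Who is the contact?",
    "text": "What is the message?",
}


def next_step(frame):
    """Decide what to do with the information collected so far.

    Returns ("error", message) if the command was not recognized,
    ("ask", question) if a required slot is still missing,
    or ("execute", frame) once the task can be carried out.
    """
    command = frame.get("command")
    if command not in TASKS:
        return ("error", "Sorry, I did not understand that.")
    for slot in TASKS[command]:
        if not frame.get(slot):
            return ("ask", PROMPTS[slot])
    return ("execute", frame)
```

For example, `next_step({"command": "SEND_SMS", "contact": "Vesna Petrović"})` asks for the message text, because the `text` slot is still empty.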
Natural Language Understanding
The NLU module converts the user’s query into a form suitable for the dialog manager. For instance, if the query is recognized as 'Send a text message to Vesna Petrović', the dialog manager will receive: "command: SEND_SMS; contact: Vesna Petrović".
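As a minimal sketch of this conversion, the example below maps a recognized English query to a command frame using regular expressions. The patterns and the function `understand` are assumptions made for illustration; the actual module works on Serbian input and may use a different technique.

```python
import re

# Hypothetical patterns: each one maps a phrasing to a command name.
PATTERNS = [
    ("SEND_SMS", re.compile(r"^send (?:a )?(?:text )?message to (?P<contact>.+)$", re.IGNORECASE)),
    ("CALL", re.compile(r"^call (?P<contact>.+)$", re.IGNORECASE)),
]


def understand(query):
    """Return a frame such as {"command": ..., "contact": ...}, or None."""
    query = query.strip()
    for command, pattern in PATTERNS:
        match = pattern.match(query)
        if match:
            return {"command": command, "contact": match.group("contact").strip()}
    return None
```

Calling `understand("Send a text message to Vesna Petrović")` yields a frame with the command `SEND_SMS` and the contact `Vesna Petrović`, which matches the format quoted above.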
Natural Language Generation
The function of this module is the inverse of the function of the NLU module: it converts information from the format used by the dialog manager into a natural-language sentence.
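One common way to implement such a conversion is with sentence templates. The sketch below assumes hypothetical dialog act names (`ASK_SLOT`, `CONFIRM_CALL`, `CONFIRM_SMS`) and English wording; it only illustrates the idea.

```python
# Hypothetical templates keyed by the dialog act chosen by the dialog manager.
TEMPLATES = {
    "ASK_SLOT": "Please tell me the {slot}.",
    "CONFIRM_CALL": "Calling {contact}.",
    "CONFIRM_SMS": "Sending a text message to {contact}.",
}


def generate(act, **slots):
    """Fill the template for the given dialog act with slot values."""
    return TEMPLATES[act].format(**slots)
```

For instance, `generate("CONFIRM_SMS", contact="Vesna Petrović")` produces the sentence "Sending a text message to Vesna Petrović."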
Implementation of speech technologies on mobile platforms
Until recently, speech recognition was limited to small vocabularies and to the PC platform. The vocabulary of this application is significantly larger, and the software is optimized to conform to the resource limitations of mobile phone devices.
As for speech synthesis, our previous solutions were of high quality, but also restricted to the PC. We have now developed a less resource-demanding version compatible with smartphone operating systems. This came at the cost of a slight degradation in synthesis quality, which is acceptable for the target application.
Speech Recognition Accuracy
It is well known that a number of recognizers by renowned manufacturers function with insufficient accuracy, even for major languages, which leads to user frustration and dissatisfaction.
To increase recognition accuracy, the language model used for recognition is tailored to the functionality in use at a given moment. For example, the user cannot open a contact from the address book and then call someone else. This more restrictive approach involves a certain (although very short) adjustment period, but it provides much greater reliability.
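The idea of restricting recognition to what is valid in the current state can be sketched as follows. This is a simplification: here the restriction is applied by filtering a list of scored hypotheses, whereas a real recognizer would apply it inside the language model itself. The state names, phrases and functions are hypothetical.

```python
def active_phrases(state, contacts):
    """Return the set of phrases allowed in the given application state."""
    if state == "MAIN":
        # On the main screen, any contact can be called or opened.
        phrases = {f"call {name}" for name in contacts}
        phrases |= {f"open {name}" for name in contacts}
        return phrases
    if state == "CONTACT_OPEN":
        # With a contact open, commands apply to that contact only.
        return {"call", "send message", "back"}
    raise ValueError(f"unknown state: {state}")


def recognize(hypotheses, phrases):
    """Pick the best-scoring (text, score) hypothesis that is allowed."""
    allowed = [(score, text) for text, score in hypotheses if text in phrases]
    return max(allowed)[1] if allowed else None
```

With a contact open, a hypothesis such as "call marko" is rejected even if it scores higher than "call", which mirrors the example of not being able to call someone else from an open contact.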