• The best choice for the company!

    Speed up and simplify communication with the devices around you.

  • Get the best solution for you!

    Take your call centers to a new level.

  • Let's get work done together.

    The future has arrived here, too. Become a part of it.

  • The best choice for the company!

    Work even when your eyes or hands are busy.

  • Get the best solution for you!

    The future has arrived here, too. Become a part of it.

Main objectives and activities

  • The development of flexible text-to-speech synthesis (TTS) of high quality
  • The development of large vocabulary continuous automatic speech recognition (ASR)
  • The research and development of emotional speech recognition
  • The development of speech morphing systems
  • The development of natural language processing modules including dialogue management
  • The application of the developed speech technologies in Western Balkan countries:
    • in multimodal human-machine dialogue systems (IVR, smart phones, smart homes)
    • for purposes such as: text reading, text dictation, speech transcription
    • within aids for the physically disabled, visually impaired, speech impaired, hearing impaired.

The most important innovative results

One can listen to news at a number of speech-enabled web sites (Radio Television of Serbia - RTS, Radio Television of Vojvodina - RTV, eUprava, as well as several municipalities) using a computer or a smart phone. The visually impaired can listen to any text displayed on the screen using the software anReader based on AlfaNumTTS. The AlfaNumASR and AlfaNumTTS components have provided smart phones with basic speech generation and understanding functionalities in Serbian.

Further development of both large vocabulary ASR and more advanced TTS is based on the aforementioned speech and language resources. Both technologies will enable a much wider range of applications and will contribute to the preservation of Serbian and kindred languages in this new domain of communication – spoken dialogue between humans and machines.

Introduction

Welcome to our machine learning services! Our team of highly skilled experts has over 7 years of experience in the field and is dedicated to delivering top-notch solutions to our clients. We pride ourselves on our ability to tackle any problem that comes our way.

Our expertise covers a wide range of areas, including speech recognition and speech synthesis, data analysis, predictive modeling, natural language processing and more. We utilize the latest technologies and techniques to ensure that our clients receive the best possible results. Whether you need help with a specific project or require ongoing support, our team is always ready to assist.

At our core, we understand that the key to success in machine learning lies in understanding the unique needs and challenges of each individual client. That's why we take the time to work closely with our clients, ensuring that we have a deep understanding of their business goals and objectives. With this information in hand, we are able to create tailor-made solutions that are both effective and efficient. So if you're looking for a team of experts who can deliver high-quality machine learning solutions, look no further than our company. We're here to help you succeed!

Team Overview

The company has a team of 6 experts available for part-time or full-time engagement on projects in the area of machine learning.

  • Strong experience in Machine Learning and Artificial Intelligence
  • Most team members have PhDs in Electrical Engineering and Computer Science, with outstanding scientific results
  • The team has specific expertise in speech technology. It has developed fully functional, high-quality Speech Recognition and Speech Synthesis for English (American and British), Spanish, Italian, Serbian, Croatian and Hebrew, and has brought a number of products based on them to market
  • Some of these technologies represent the state of the art, e.g. neural network adaptation to easily produce speech in a specific voice or a specific speech style, even cross-lingual

Team Skills and Technologies Used

  • General skills
    • Machine learning platforms: TensorFlow, Keras, PyTorch
    • Programming languages: C++, Python, Java
    • Platforms: Windows, Linux, Android
  • Skills and technologies/tools specifically related to speech technology:
    • Text-to-Speech: HTK/HTS, Merlin, Tacotron, WaveRNN, HiFi-GAN
    • Speech Recognition: HMM, DNN, Kaldi
    • Natural Language Processing: Part-of-Speech and semantic analysis, phonetic and prosodic disambiguation, sentiment analysis
  • Neural network architectures:
    • Feedforward and recurrent (LSTM, GRU)
    • Generative adversarial networks (GANs)
    • Convolutional networks
    • Transformers

Other Relevant Information

  • The company possesses its own computational and data storage resources
  • It also has its own data processing team of 8 people, in case some semi-automatic data annotation is required
  • In case the project is related to speech technology, the company also possesses a remarkable range of speech and language resources (databases) necessary for any such development

Contact Data

Darko Pekar, PhD

AlfaNum Ltd.

The system for dictation of medical findings, i.e. automatic creation of medical findings based on dictated speech, is aimed at increasing the efficiency of medical staff and allowing them to focus on more important aspects of their work. The system can be adapted to the vocabulary of any area of medicine, and it can also be easily integrated into existing systems and applications already in use, with minimal need for additional training of end users.

Possibilities

The system for dictation of medical findings:

  • recognizes naturally delivered speech with hardly any errors, in real time and without delay, on a computer of average performance and without any special microphone;
  • recognizes and correctly interprets abbreviations, punctuation, capital letters;
  • recognizes and correctly interprets Latin medical terminology, and successfully combines recognition of Latin and Serbian (e.g. status post hysterectomiam in October two thousand and twelve);
  • supports special commands according to user requirements ("delete word/sentence" etc.);
  • allows the user to manually correct an incorrectly recognized word.

A demo of the system (in Serbian) can be found at:

Benefits

The system is based on client-server architecture, which means that recognition is carried out by a centralized server, which is either cloud based, or located within the premises of the institution. The server receives speech sound recordings from computers of end users and returns the recognized text. This approach has two significant advantages:

  • when the server is located within the premises of the institution, recordings never reach a public network, which keeps them private;
  • acquisition of new and more powerful computers for end users is not needed, which significantly lowers the cost of the system in comparison with a scenario in which recognition is performed locally.
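
The client-server flow described above can be illustrated with a minimal sketch using only the Python standard library. It is not the actual implementation: the `/recognize` path, the plain HTTP transport, and all function names are assumptions made for illustration, and `recognize` is a placeholder that stands in for the real recognition engine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def recognize(audio: bytes) -> str:
    """Placeholder (hypothetical): a real server would run speech recognition here."""
    return f"[{len(audio)} bytes of audio received]"


class RecognitionHandler(BaseHTTPRequestHandler):
    """Central server side: receives a recording and returns the recognized text."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        audio = self.rfile.read(length)
        text = recognize(audio).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(text)))
        self.end_headers()
        self.wfile.write(text)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; no per-request logging


def start_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the server in a background thread; port 0 picks any free port."""
    server = HTTPServer((host, port), RecognitionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def dictate(host: str, port: int, audio: bytes) -> str:
    """End-user side: send a recording to the server and return the text."""
    connection = http.client.HTTPConnection(host, port, timeout=10)
    try:
        connection.request(
            "POST",
            "/recognize",  # assumed endpoint name
            body=audio,
            headers={"Content-Type": "application/octet-stream"},
        )
        return connection.getresponse().read().decode("utf-8")
    finally:
        connection.close()


if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    print(dictate(host, port, b"\x00" * 320))
    server.shutdown()
```

The end-user computer only records and transmits audio; all recognition work stays on the server, which is why no hardware upgrade is needed on the client side.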

Hardware requirements of the system principally depend on the maximum number of simultaneous requests for service, but to some extent on the size of the vocabulary as well. One standard CPU core is typically able to service one recognition channel. The use of graphical processing units (GPU) significantly increases the number of channels that can be serviced.
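
As a rough sizing aid, the rule of thumb above (one CPU core per recognition channel) can be written as a small calculation. This is a sketch under stated assumptions: the function name is hypothetical, the effect of vocabulary size is not modeled, and the GPU throughput figure in the usage example is purely illustrative and would have to be measured on real hardware.

```python
import math


def units_needed(peak_channels: int, channels_per_unit: float = 1.0) -> int:
    """Return how many processing units cover the peak simultaneous load.

    channels_per_unit defaults to 1.0, i.e. one recognition channel per
    standard CPU core. For GPUs, pass a measured value instead.
    """
    if peak_channels < 0:
        raise ValueError("peak_channels must not be negative")
    if channels_per_unit <= 0:
        raise ValueError("channels_per_unit must be positive")
    return math.ceil(peak_channels / channels_per_unit)


if __name__ == "__main__":
    # 40 doctors dictating at the same time, CPU only.
    print(units_needed(40))
    # Same load on GPUs, assuming (illustratively) 16 channels per GPU.
    print(units_needed(40, channels_per_unit=16))
```

Sizing for the peak number of simultaneous requests, rather than the total number of users, follows directly from the statement that requirements depend on the maximum number of simultaneous requests.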

Speech

Speech is the basic means of communication between humans

Using speech, humans can convey their thoughts and feelings to others in a way much more intricate than any other animal species can, and thus the human speech system is the most complicated one...

ASR

Automatic Speech Recognition

Automatic speech recognition (ASR) is considered one of the greatest technical challenges of today, and has attracted the attention of many researchers worldwide for more than half a century...

TTS

Text-to-Speech Synthesis

Text-to-Speech Synthesis (TTS) is the oldest speech technology, originating as early as the 18th century, when the first "speaking machines" appeared...