DNN (Deep Neural Networks)
A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers between the input and output layers. DNNs can model complex non-linear relationships. We use them in Text-to-Speech to learn the relationship between a set of input texts and their acoustic realizations by different speakers.
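To make the idea of "multiple hidden layers" concrete, here is a minimal sketch of a forward pass through a small network, written in plain Python. The layer sizes, the random weights and the tanh activation are illustrative assumptions only; they do not describe how Acapela's models are built or trained.

```python
import math
import random

def dense(inputs, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass inputs through every layer; tanh on hidden layers, linear output."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # hidden layer: apply the non-linearity
            activations = [math.tanh(a) for a in activations]
    return activations

def random_layer(n_in, n_out, rng):
    """Untrained layer with random weights and zero biases (illustration only)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)
# Hypothetical sizes: 4 input features, two hidden layers of 8 units, 3 outputs.
sizes = [4, 8, 8, 3]
layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
output = forward([0.5, -1.0, 0.25, 0.0], layers)
print(len(output))  # one value per output unit
```

The non-linear activation between layers is what lets the stacked layers represent more than a single linear mapping. A real system would learn the weights from data instead of drawing them at random.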
Neural networks are a set of algorithms, modelled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labelling or clustering raw input. The patterns they recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated.
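As a small illustration of translating real-world data into vectors, the sketch below one-hot encodes the characters of a short text. The function names are hypothetical, and character-level one-hot encoding is just one simple scheme chosen for clarity, not necessarily the representation used in any particular Text-to-Speech system.

```python
def build_vocabulary(text):
    """Map each distinct character to an index, in sorted order."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}

def one_hot(text, vocabulary):
    """Return one vector per character, with a single 1.0 at its index."""
    vectors = []
    for ch in text:
        vector = [0.0] * len(vocabulary)
        vector[vocabulary[ch]] = 1.0
        vectors.append(vector)
    return vectors

vocabulary = build_vocabulary("hello")
vectors = one_hot("hello", vocabulary)
print(vocabulary)   # {'e': 0, 'h': 1, 'l': 2, 'o': 3}
print(vectors[0])   # 'h' -> [0.0, 1.0, 0.0, 0.0]
```

Each character becomes a numerical vector that a network can take as input. Images, audio and time series are handled in the same spirit, using pixel values, signal samples or measured features.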
Neural networks have revolutionized computer vision and automatic speech recognition. This machine learning revolution is now delivering on its promise in the Text-to-Speech arena. Acapela Group is actively working on Deep Neural Networks. With Acapela DNN, we create a voice using a limited amount of existing or new speech recordings.