Sami Parliament of Norway has chosen Acapela Group to develop North Sami language text-to-speech.
Sami is a group of languages spoken in the north of Norway, Finland and Sweden. Of the different Sami languages, Acapela will develop a voice for the most widespread, North Sami, spoken by 20 000 people.
That is a small audience by the usual standards of text-to-speech development for new languages. But creating Sami text to speech is much more than adding another language to a portfolio. It is part of a broader project initiated by Sametinget, the Sami Parliament, to help keep this minority language spoken, taught and learned. Keeping Sami a living spoken heritage is what Acapela's text-to-speech voices will help achieve, together with other linguistic and learning resources.
‘We are very happy to add synthetic speech to our Sami resources, says , it is an outstanding tool to support the promotion and use of the Sami language, whether to read news, teach and learn, share stories or provide public announcements. Sami text to speech will make it possible to read any written content aloud with full respect for Sami grammar and its specific features. Audio content in Sami will be accessible to all.’
To create Sami TTS, Acapela is using existing databases and a lexicon provided by Sametinget and is working in close cooperation with its linguists. All the resources collected since the start of the project in 2005 are being used to create the male and female voices, which will be available as ‘Colibri voices’ based on Acapela’s HMM (hidden Markov model) technology.
This technology generates speech from trained statistical models in which spectrum and prosody are modeled together. It does not require dedicated recordings or a search for a speaker. While naturalness may be lower than with HQ technology, the Colibri voice creation process is lighter and shorter, and the resulting voices are flexible and have a small footprint.
‘The latest advances in text to speech and natural language processing allow us to create new languages and voices from existing databases with an acceptable audio result. This opens new opportunities for speech synthesis, such as the preservation of this minority language with voices that will be accessible to all. More broadly, with ‘Colibri*’ voices we can deal with new constraints and answer market demands with an additional speech solution that will benefit more applications and uses,’ comments Lars-Erik Larsson, CEO of Acapela Group.
The project is carried out by Divvun and Giellatekno (Centre for Sami language technology at the University of Tromsø), which engaged Acapela Group through a tender process to build text-to-speech solutions for North Sami. Development of the system began in November this year and the finished product will be launched in October 2014.
*colibri = hummingbird in French