Cambridge AI company Speechmatics can learn a new language in a week
We talk to CEO Benedikt von Thüngen as firm unveils machine learning technology called Automatic Linguist.
It hopes one day to master all 7,000 languages in the world.
And Speechmatics is off to a good start, having deployed its artificial intelligence technology on 28 languages so far, enabling accurate speech-to-text transcription for a host of purposes.
Yesterday, the Cambridge company officially launched Automatic Linguist (AL), its machine learning platform that enables it to tackle a new language in as little as a week.
Purpose-built from the ground up using technology developed at the University of Cambridge, the platform recognises patterns in language and applies them to each new build.
Challenged by a major corporate client to learn Hindi in two weeks, Speechmatics delivered a production-ready system that, according to its own testing, makes 23 per cent fewer errors than the market leader.
Benedikt von Thüngen, CEO at Speechmatics, told the Cambridge Independent: “Hindi was surprisingly easy. We discovered it is very similar to English in pronunciation, so we were able to use a process called adaptation. It learned from the different data sets we have.
“Each language has interesting aspects. If you take Korean, Turkish, Finnish or German, they use agglutination – where words are added together to form new words. That was a fun challenge to solve…
“You have tonal languages, like Vietnamese and Mandarin and its variations, which was another fun challenge. But it’s a matter of teaching the system to deal with it and that opens up a new swathe of languages.”
The traditional route for enabling speech recognition of a language involved laborious, expensive manual processes, in which vast amounts of data were collated and cleansed by experts, creating a one-off system. It meant it was economical to cover only the most widely spoken languages.
But using decades of research in neural networks, developed from Cambridge PhD work by Dr Tony Robinson, who is now CTO at the company, Speechmatics can learn the initial base of a language in less than a day from relatively little data by recognising fundamental sounds and grammatical structures.
One major use of the software is providing accurate real-time closed captioning for TV, and the company is adding a custom dictionary to cater for specialist vocabulary – football players’ names, for example.
Meetings and recordings can be transcribed with the platform, and Benedikt said financial institutions can use the technology for call recording to demonstrate compliance or to audit for PPI mis-selling.
He predicted: “Voice will very quickly become the main mechanism to interact with devices. We are seeing the really early adoption of that with Amazon Alexa, Siri and Google Voice, which is phenomenal but it’s still very one-dimensional.
“We’ll move more into an intuitive world and we are going to get to much more conversational interfaces.”
Benedikt said: “Our USPs are accuracy, operational performance – memory footprint and speed – the breadth of language coverage and speed with which we offer a new language, and the speed at which we can add new features.”
A white paper from the firm says: “Our eventual aim is to have a language pack for all the world’s languages. This is an ambitious aim – it is estimated there are around 7,000 living languages at present, and we hope to cover them all one day.”
Benedikt conceded that was a tall order, given that most of those are barely documented, but said Speechmatics would look first to get to the milestone of 100 and then 1,000.
Little wonder the Kirkwood Road-based company, which jointly won the AI scale-up category of the Cambridge Independent’s Entrepreneurial Science and Technology Awards in September, is looking to double its staff of 39 over the next 12 months.