Speechmatics' breakthrough in speech recognition could be used in mobile phones, home entertainment and call centres
Cambridge technology firm Speechmatics has unveiled a breakthrough that could change how we interact with our mobile phones and home entertainment systems.
The Cambridge firm has launched a real-time, embeddable continuous speech recognition system in many languages – a breakthrough that it says will provide the level of accuracy and speed usually found only in expensive cloud-based services.
The speaker-independent technology will be of use to a host of markets, from businesses requiring transcriptions to manufacturers of devices.
It will enable improved live subtitling, or could be used to gather real-time data in call centres. It will also enable offline email dictation on mobile phones or allow home entertainment systems to feature true voice interaction.
Dr Hermann Hauser, co-founder of Speechmatics’ investor Amadeus Capital Partners, said: “We are seeing a shift in the tech industry as we move away from touchpad technology towards speech as the main form of communication. This shift is creating a need for businesses to gain immediate, actionable intelligence through highly accurate speech recognition technology, in many languages.
“There is strong demand in the market for Speechmatics, as it will allow businesses that work on an international scale to not only ensure speech is transcribed correctly, but also to improve everyday user experiences.”
The breakthrough comes through advances in recurrent neural networks – technology developed by founder Dr Tony Robinson during his PhD at Cambridge and enhanced over 30 years.
Then, in 2016, Speechmatics’ R&D team redesigned the system. Systems with large vocabularies that have worked at high speed have typically lost accuracy.
The technology has previously been limited to post-processing after an interaction or to the use of short phrases. It proved nearly impossible for large, continuous systems to produce fast, accurate transcriptions over long periods.
But Speechmatics’ new system has a 250,000-word vocabulary for each language, optimised for speed and accuracy on devices from mobile phones to servers. To improve data security, the system enables data to be held and processed by the user, running natively on a device, rather than in the cloud. Speechmatics says the offline capability takes us a step closer to using such technology anywhere, at any time.
Dr Robinson said: “This is the second big breakthrough we have had within nine months.
“First, we completed the first version of our Auto-Auto system. This was a massive step forward for speech science, as for the first time we could build new languages automatically with one-two orders less data and in a matter of weeks. For example, we built Japanese in a few weeks even though no one in the team spoke Japanese.
“We’re delighted to continue making advances in speech recognition technology, building a strong R&D team and making our achievements globally available.”
A free trial demo is available at speechmatics.com.