SpinVox Shares Details of its World-Beating Speech Technology Breakthroughs.
August 2, 2009 | 14:46
Companies: #spinvox
- “We’re in the last mile of solving the problem of reliable automatic speech conversion,” says Daniel Doulton.
- SpinVox Voice Message Conversion System (VMCS™) has become so advanced and efficient that it has reduced its use of QC agents by 98% in just two years.
- SpinVox VMCS contains two billion words and phrases and knows over 99 per cent of the words you are likely to say.
- Quality Control agents train VMCS to recognise new or difficult words and phrases.
London, UK - 26 July 2009.
SpinVox, the global leader in voice to content messaging, today took the opportunity to reveal some of the details about the technological breakthroughs it has achieved in the development of its Voice Message Conversion System (VMCS™).
VMCS system
The core technology around which the VMCS is built is unique to the speech market, so there is little to which it can be compared. It is based on world-leading breakthroughs in automatic speech recognition (ASR) combined with artificial intelligence, semantics and natural linguistics, which have been developed by SpinVox’s Cambridge-based Advanced Speech Group (ASG).
VMCS already knows more than 99 per cent of anything a user is likely to say. It contains over two billion words and phrases derived from the equivalent of 72 years of audio training – making it the world’s largest corpus of spoken language. This knowledge system is growing at an ever-increasing rate and is further accelerating SpinVox’s ability to automate.
Dr Tony Robinson, a peer of Prof. Woodland, leads the SpinVox ASG of more than 20 speech specialists and PhDs who are working continually to develop and refine the system.
Professor Philip Woodland explains: “Much of this technology derives from the world-leading research undertaken by my group at Cambridge University Machine Intelligence Laboratory. I am a consultant to SpinVox and its Automatic Speech Group and their unique approach allows the automatic system to be exposed to huge amounts of spoken data, from which highly accurate acoustic and language models can be built.”
This combined group of technologies and processes is the main engine for converting voice messages to text, rather than human intervention.
Having experimented with purely automatic speech conversion, SpinVox decided early on in its development that because its voice-to-text service converts the real-life, dynamic and fast-evolving language and messages that we use and exchange every day (known in the industry as ‘free form speech’), it was essential that the system had the capability to evolve at the same rate, converting the latest words, phrases, brand names and colloquialisms to ensure a high level of accuracy. This is why it describes the system as ‘live-learning’.
SpinVox realised that only by combining its rapidly evolving state-of-the-art technology with human quality control and training could it create a system able to complete those elements of messages that could not be converted automatically. As a result, the system constantly learns new words and phrases, making it increasingly efficient and reliable, and it remains dynamic enough to deal with the speed of change and the complexities of modern language.
Explains SpinVox CIO Rob Wheatley: “Quality Control agents are an important part of the SpinVox service because their constant minute-by-minute input actually improves the quality of text conversions in a process we call ‘live learning’. The technology is a bit like a human brain, in that the more it is exposed to input, the more it learns.
“This process has helped us improve our accuracy massively. Since its inception in 2007, the technology has improved to the extent that the system requires only two per cent of the input it required just two years ago and can even now predict more than 99 per cent of what most people speaking in English or Spanish will say next. Or to put it another way, in just two years, we have reduced the requirement for human intervention to just a few hundred agents per market compared to the thousands per market when we started. Our world-class speech scientists in the Advanced Speech Group have helped make this system unchallenged in terms of accuracy, speed and reliability.”
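The ‘live learning’ idea described above can be illustrated with a minimal sketch. It is not SpinVox code: the class name, the dictionary-based vocabulary and the `ask_agent` callback are all assumptions made for illustration, and real speech conversion works on audio rather than on ready-made tokens. The sketch only shows why the need for human input falls as the system is exposed to more messages.

```python
from dataclasses import dataclass, field


@dataclass
class LiveLearningConverter:
    """Toy converter that asks a human once per unknown token, then remembers it."""

    # Hypothetical vocabulary: heard token -> agreed text (an assumption, not the real design).
    known: dict = field(default_factory=dict)
    human_interventions: int = 0

    def convert(self, tokens, ask_agent):
        """Convert tokens to text, calling ask_agent only for tokens not yet learned."""
        converted = []
        for token in tokens:
            if token not in self.known:
                # Unknown token: a Quality Control agent supplies the text,
                # and the answer is stored so the same question is never asked again.
                self.known[token] = ask_agent(token)
                self.human_interventions += 1
            converted.append(self.known[token])
        return converted


if __name__ == "__main__":
    converter = LiveLearningConverter()
    agent = lambda token: token  # placeholder for a human agent's answer
    converter.convert(["call", "me", "back"], agent)   # three new tokens
    converter.convert(["call", "me", "later"], agent)  # only "later" is new
    print(converter.human_interventions)
```

In this toy model the second message needs one human answer rather than three, which is the effect the company attributes to its training process.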
Privacy
As discussed above, SpinVox VMCS learns from human intervention. This is the reason SpinVox works with five world-class call centres, which were chosen after SpinVox put around 50 call centres through its stringent quality control and security procedures.
Every message is dealt with initially by the automated system. Only in cases where speech is too indistinct to be dealt with by the system, or contains unfamiliar or new words or phrases, is the completely anonymised and encrypted message sent to a QC agent for help. The agents will only ever see the messages that need input and do not know how many other messages have been converted, processed and sent automatically by VMCS.
SpinVox is fully compliant with industry standards relating to the processing of information, including the Data Protection Act 1998. To this end, any part of a message seen by the Quality Control centres is anonymous, encrypted and randomised, meaning that it is impossible to determine where the messages are from or where they are going to. SpinVox has achieved two prestigious ISO qualifications: ISO 27001 (the international Information Security Standard) and ISO 9001:2008 quality certification.
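The routing described above, in which only the parts of a message that need help reach an agent and arrive without any caller details, can be sketched roughly as follows. Everything here is hypothetical: the `Segment` type, the confidence score, the 0.8 threshold and the ticket format are assumptions for illustration, and encryption is left out entirely. It is a sketch of the principle, not a description of how VMCS is built.

```python
import random
import uuid
from dataclasses import dataclass


@dataclass
class Segment:
    text: str          # the automatic system's best guess for this part of the message
    confidence: float  # hypothetical score between 0 and 1 (an assumption)


def select_for_review(sender, recipient, segments, threshold=0.8, rng=None):
    """Split one message into automatically converted parts and a review queue.

    The sender and recipient are deliberately never copied into the review
    queue, so an agent cannot tell where a message came from or where it is going.
    """
    rng = rng or random.Random()
    automatic = [s.text for s in segments if s.confidence >= threshold]
    review = [
        # Each item carries only a random ticket and the uncertain guess.
        {"ticket": uuid.uuid4().hex, "guess": s.text}
        for s in segments
        if s.confidence < threshold
    ]
    rng.shuffle(review)  # randomise the order in which agents see items
    return automatic, review
```

A message whose segments all score above the threshold produces an empty review queue, which corresponds to the case where conversion is fully automatic.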
Jaime Tronqued is President of ScopeWorks Asia, Inc., a Quality Control house which handles private and secure customer services for companies in telecommunications, travel, banking, healthcare and insurance.
ScopeWorks has been working with SpinVox for the past two years and Tronqued says: “I can categorically assure people that SpinVox messages are both private and secure. There are many layers of security and privacy that are used to ensure this and SpinVox was extremely thorough in its audit of our operations, our security and our privacy procedures as they ran a training pilot with our Quality Control agents on test conversion messages, prior to contracting with us to deliver a live customer service using encrypted and anonymised messages.”
Adds Ragindra Persaud, CEO of NPIC, a South American-based Quality Control house which has also been working with SpinVox for the past two years: “SpinVox went through a thorough review of our processes and procedures, and it was not until they were fully satisfied that we comply with their stringent requirements that they contracted with us officially. In the live customer system there is no way for any Quality Control agent to know where a message has come from or to whom it is being sent, nor copy or abuse this.”
SpinVox – now and in the future
SpinVox has achieved enormous success in the past and intends to achieve even greater success in the future. It currently has 30 million users worldwide and will be providing voice-to-text services to over 100 million users by the end of 2009. It is a British success story based on breakthroughs in technology that have established a new class of automated speech conversion.
Says co-founder Christina Domecq: “We have spent five years working very hard, building up a solid foundation for this business. It’s something of which we are very proud, whether it’s our technology, our customer group, subscriber base or investors. Like every business, we have to deal with challenges, particularly in the middle of a credit crunch, but we are in a strong position, are growing fast and are looking to the future with confidence. We are a business that is founded on a long-term vision and for that reason we do everything by the book, which includes a very thorough approach to security and due diligence.”
Adds co-founder Daniel Doulton: “We knew when we started SpinVox that the time was right for this kind of breakthrough. We had to take this innovative approach, because leading commercial speech technology at the time was unable to deliver a reliable experience. We’ve developed our own data-driven system which works on a ‘meaning’ (i.e. semantic) basis to solve the automation problem. To help achieve this, we have recruited the best speech scientists from Cambridge and abroad to build a team of more than 20 speech specialists and PhDs. Now, SpinVox is leading the market, having created a whole new category of ‘speech as a service’. We’ve already signed 28 operators in the two short years we’ve been selling to carriers, and there are more deals to come.”
Explains Julie Meyer, CEO, Ariadne Capital, an early investor in SpinVox: “The company’s genius is how it integrates its technology with human intelligence, which is still the only thing that machines can learn from, to do virtually real-time voice-to-text conversion.
“SpinVox has built up huge value in terms of its intellectual property, customer relationships, management and revenue growth. Because they don’t sell through UK mobile operators, people tend to forget just how successful they are continuing to be in the rest of the world. To close a deal in the past few months with Telefonica, for instance, and subsequently roll out the service to 13 countries across Latin America is an amazing feat that we in the UK should feel proud of.”