Microsoft speech recognition achieves the lowest-ever error rate in recent study

Kareem Anderson

According to keynotes delivered at several developer conferences over the past year, three key areas in which companies are looking to lead the technology industry are machine learning, artificial intelligence and speech recognition.

Ideally, the avenues of machine learning, speech recognition, and artificial intelligence will intersect and create a seamless experience for users who opt to communicate through burgeoning digital assistants or applications that rely heavily on cloud-connected data.

Fortunately for Microsoft, its stake in a digital assistant that uses speech recognition is starting to pay off as the company achieves a new milestone in human and machine interactions.

According to a recent benchmark evaluation reported by Microsoft’s chief speech scientist Xuedong Huang, the company has posted its lowest word error rate (WER) to date. On the industry-standard Switchboard speech recognition task, Microsoft researchers recorded a WER of 6.3 percent, which currently stands as the lowest in the industry.
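For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that general definition only. The function name `word_error_rate` and the sample sentences are hypothetical, and nothing here reflects the scoring tools or text normalization Microsoft used on Switchboard.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real benchmark scoring also normalizes text
    (case, punctuation, fillers), which this sketch does not attempt.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One wrong word out of six reference words.
rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{rate:.1%}")
```

By this definition, a 6.3 percent WER corresponds to roughly six word errors for every hundred words in the reference transcript.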

The Microsoft researchers behind the new speech recognition feat attribute their success to foundations built on neural networks. Earlier this year, Microsoft researchers also won the ImageNet computer vision challenge with work that likewise drew on neural networks. By using Microsoft’s cross-layer network connections, researchers were able to use each layer to optimize recognition and association of speech patterns, definitions, etc.

Cortana enabled

Another key scientific element contributing to the new low was Microsoft’s Computational Network Toolkit (CNTK). CNTK allowed researchers to apply sophisticated optimizations that make learning algorithms run faster.

Huang adds that the speech recognition milestone is a significant marker on Microsoft’s journey to deliver the best AI solutions for its customers. One component of that AI strategy is conversation as a platform (CaaP); Microsoft outlined its CaaP strategy at the company’s annual developer conference earlier this year.

Although it may have been said seven months ago and subsequently forgotten by most, Microsoft is betting on a future where voice will become the new ‘swipe’ and user interactions through vocal input should be as seamless as they are when using touchscreens and apps.

Microsoft researchers seem well on their way to making a movie such as Her a reality, rather than a bullet point on a PowerPoint slide during a developer conference keynote.

To read more about the men and women helping to bring this project to life, or to find out more about how the low error rate was achieved, visit the Official Microsoft Blog for details.