Microsoft has spent the past two years adding flashy new productivity features to Teams, and now the company is using AI to overhaul how the fundamentals work. We’ve all been on a call where someone’s poor room acoustics made them hard to hear, or watched two people try to talk at the same time, creating an awkward “no, you go ahead” moment. Microsoft’s new AI-powered voice quality improvements should reduce, if not eliminate, these everyday annoyances.
Microsoft is now using machine learning models to improve room acoustics so you no longer sound like you’re hiding in a cave. “While we’ve done our best with digital signal processing to do a really good job in Teams, we’ve now started using machine learning for the first time to create echo cancellation where you can really reduce echo from all the different devices,” said Robert Aichner, senior program manager for intelligent chat and communication cloud at Microsoft, in an interview with The Verge.
Microsoft has been testing this for months, measuring its models in the real world to make sure Teams users notice the echo reduction and improved call quality. The software maker has used 30,000 hours of speech to help train its models, and it has captured audio from thousands of devices through crowdsourcing, in which Teams users are paid to record their voice and play back audio from their device.
“We also simulate about 100,000 different rooms…room acoustics play a big role in echo cancellation,” says Aichner. The result is a big improvement in call audio quality and echo cancellation that also allows multiple people to talk at the same time. You can see all the improvements in action in the video above.
If Teams detects that sound is bouncing or reverberating around a room, resulting in shallow audio, the model will also process the captured audio to make it sound as if Teams participants are speaking into a close-range microphone instead of an echoey mess.
The most impressive part is that people can now talk over each other on Teams calls without the annoying overlap where you can’t hear the other person because of the echo. Microsoft is now shipping all of this work in Teams, alongside earlier improvements to AI-based noise cancellation. All processing is done locally on client devices instead of in the cloud.
“We said we wanted to do it on the client, because the cloud is still expensive if you want every call handled in the cloud…and obviously we would have to pass that cost on to the customer,” says Aichner. That would potentially have meant restricting these significant Teams improvements to paying customers, whereas the on-device route means features like noise cancellation are available on 90% of devices using Teams.
All of these new Microsoft Teams improvements are now live, along with some real-time screen optimizations for text in videos and AI-powered optimizations for bandwidth constraints when video calling or screen sharing.