Poor internet connections can ruin a video call, turning a group chat into an irritating mess of glitches, hanging and dropouts. But Google Duo aims to tackle that with the power of artificial intelligence.
Announced via a Google AI Blog post, a new speech codec called Lyra is set to improve voice communication by compressing the user’s speech into a lower bitrate “even on the slowest networks.” That basically means there’ll be a reduced chance of glitches, dropouts and freezing, and thus less disruption in video calls.
Designed to reduce data usage and minimize latency when compressing and transmitting voice signals, Lyra works in a surprisingly simple way.
Every 40 milliseconds, the codec extracts features, or “distinctive speech attributes,” from the user’s speech, which are then “compressed for transmission.” At the receiving end, Lyra’s generative model uses the encoded features to recreate the speech signal, which is what the person on the other end of a Duo call will hear.
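As a rough illustration of what that 40-millisecond cadence implies, the sketch below works out how many frames the codec handles per second and how many bits each frame gets at the 3kbps bitrate quoted for Lyra. The 16kHz sample rate and all function names are assumptions made for this example and do not come from Google's post; this is back-of-the-envelope arithmetic, not Lyra's actual implementation.

```python
FRAME_MS = 40            # frame interval stated in the article
BITRATE_BPS = 3_000      # Lyra's bitrate as quoted in the article (3kbps)
SAMPLE_RATE_HZ = 16_000  # ASSUMPTION: a common wideband speech rate, not from the article


def frames_per_second(frame_ms: int) -> float:
    """Number of frames produced each second for a given frame interval."""
    return 1000 / frame_ms


def bits_per_frame(bitrate_bps: int, frame_ms: int) -> float:
    """Bit budget available to describe one frame at a given bitrate."""
    return bitrate_bps * frame_ms / 1000


def samples_per_frame(sample_rate_hz: int, frame_ms: int) -> int:
    """Number of raw audio samples covered by one frame."""
    return sample_rate_hz * frame_ms // 1000


def split_into_frames(samples, frame_len):
    """Yield consecutive, non-overlapping frames; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        yield samples[start:start + frame_len]


if __name__ == "__main__":
    print(frames_per_second(FRAME_MS))
    print(bits_per_frame(BITRATE_BPS, FRAME_MS))
    print(samples_per_frame(SAMPLE_RATE_HZ, FRAME_MS))
```

Under those assumptions, the codec handles 25 frames per second, and each frame has a budget of 120 bits (15 bytes) to describe 640 audio samples.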
According to Google, “since the inception of Lyra, our mission has been to provide the best quality audio using a fraction of the bitrate data of alternatives.”
For reference, Opus, one of the most popular audio codecs, is designed to obtain transparent speech quality at 32kbps (kilobits per second), though this can go down to 6kbps with noticeably worse quality if needed. In comparison, Lyra operates at only 3kbps, significantly reducing overall data usage.
Based on its listening tests, Google also claims that “Lyra outperforms any other codec at that bitrate and is compared favorably to Opus at 8kbps, thus achieving more than a 60% reduction in bandwidth.”
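The “more than 60%” figure follows from simple arithmetic on the two bitrates, as the sketch below shows. The function names are made up for this example, and the per-minute figures cover the codec payload only, ignoring the packet and network overhead a real call would add.

```python
def bandwidth_reduction(new_kbps: float, old_kbps: float) -> float:
    """Fraction of bandwidth saved when moving from old_kbps to new_kbps."""
    return 1 - new_kbps / old_kbps


def kilobytes_per_minute(kbps: float) -> float:
    """Codec payload for one minute of audio, in kilobytes (1 kB = 1000 bytes)."""
    return kbps * 60 / 8


if __name__ == "__main__":
    # Bitrates quoted in the article: Lyra at 3kbps, Opus at 6, 8 and 32kbps.
    print(f"Lyra vs Opus at 8kbps: {bandwidth_reduction(3, 8):.1%} less bandwidth")
    for name, kbps in [("Lyra", 3), ("Opus", 6), ("Opus", 8), ("Opus", 32)]:
        print(f"{name} at {kbps}kbps: {kilobytes_per_minute(kbps)} kB per minute")
```

Going from 8kbps to 3kbps is a 62.5% reduction, and a minute of speech at 3kbps comes to roughly 22.5 kilobytes, versus 240 kilobytes at Opus's 32kbps.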
In several audio examples included in the blog post, we can clearly hear a drastic improvement in quality when using Lyra at 3kbps as opposed to Opus at 6kbps.
This ultimately means that users with inconsistent internet speeds should soon see an improvement in call quality when using Google Duo on Android or iOS devices. However, it’s unclear exactly when Lyra will become widely available.
To ensure better performance, Lyra’s model is trained through machine learning on thousands of hours of speech audio. The audio comes from speakers of over 70 languages to enable wider international use.
Google’s blog post also mentioned that, in addition to continuing to improve Lyra, the team is exploring how the same technology could be used to improve other codecs in the future.