Thursday, November 7, 2024

How we developed the adaptive audio features for Google Meet

What is adaptive audio?

In Google Meet, adaptive audio is a feature that detects when several laptops in the same room have joined the same meeting and synchronizes their microphones and speakers. It prevents the echo and feedback that would otherwise occur, so people can join from their own devices without having to change settings manually.

At this year’s Cloud Next, the Google Meet team set up a demo area with many laptops arranged side by side. After bringing clients into the room, the team asked them what they thought would happen if all of the laptops joined the same meeting at once.

The team began working on adaptive audio after the pandemic drove a global shift to video conferencing and, subsequently, hybrid work. Supply chain problems at the time made new conference room hardware hard to obtain. Additionally, Huib notes that many businesses either lacked the funds for specialized meeting room technology or did not have enough video conferencing rooms to begin with.

Teams needed to be able to set up ad hoc meeting areas without having to cram themselves around a single laptop. However, letting everyone connect from their own devices while keeping the “screams” of audio feedback quiet is far more difficult than it seems.

Consider the audio system in a movie theater. According to Meet Software Engineering Manager Henrik Lundin, “you have a number of speakers around you, and it’s a pleasant audio experience because they’re all connected to the same sound source, so they play out in an intended synchronicity.” If multiple devices in the same room played the same audio without synchronization, it would sound awful: you would hear numerous copies of the same sounds, as if you were in a huge cathedral. Similarly, when you talk in front of a group of microphones on several devices, they all record the sound at once even though their clocks are not synchronized.
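To make the timing problem concrete, here is a minimal sketch of one common way to measure how far apart two copies of the same sound are: cross-correlation. It is only an illustration, not a description of how Meet aligns devices, and the function name `estimate_delay` and its parameters are hypothetical.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    Illustrative only: tries every lag from 0 to max_lag and keeps the one
    with the highest correlation score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate reference[n] with delayed[n + lag] where both exist.
        score = sum(
            reference[n] * delayed[n + lag]
            for n in range(len(reference))
            if n + lag < len(delayed)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Once the offset between two recordings is known, the copies can in principle be shifted into alignment instead of piling up like reverberation.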

The echo issue comes next. You’ve undoubtedly noticed that when you use video conferencing tools, you occasionally hear an echo of your own voice. “The devices that conduct meetings have an echo canceller built in, so you don’t always get that,” Henrik says. An echo canceller is a signal processing method that attempts to determine which portion of the microphone signal is your speech and which portion is merely coming from the device’s speakers. When several laptops in the same room are playing the same audio and picking up each other’s output through their microphones, this becomes ten times more difficult.
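As a rough illustration of the idea behind an echo canceller, the sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, a textbook technique. The article does not say which algorithm Meet uses, so treat this as an assumption; the name `nlms_echo_cancel` and its parameters are hypothetical.

```python
def nlms_echo_cancel(far_end, mic, taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `far_end` from `mic`.

    far_end: samples sent to the loudspeaker.
    mic:     samples captured by the microphone (echo plus any local speech).
    Returns the residual signal after the estimated echo is removed.
    """
    weights = [0.0] * taps   # current estimate of the speaker-to-mic path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate
        # Normalizing by the input energy keeps the update stable.
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual
```

This toy version models a single loudspeaker and a single microphone with no one talking locally. With several laptops in one room, each microphone also hears every other laptop’s speaker, which is part of why the problem grows so quickly.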

To solve this audio challenge, the team had to spend a lot of time in the same room and figure out how to make their laptops recognize that they were next to each other. Initially, they experimented with letting attendees join predefined groups during the meeting. “This was clearly prone to mistakes, but it allowed us to test the experience of synchronizing the microphones and speakers on all of the laptops,” Henrik adds.

They then experimented with ultrasound. By emitting high-frequency sounds that are inaudible to the human ear, the laptops can sense the presence of other computers nearby and start acting as a group. As a result, users no longer had to choose the room they were in or manually configure their devices. It was quite difficult, though, as Henrik explains, “because the ultrasound had to be accurate and dependable on any device; if audio leaks from the room next door, it shouldn’t think you’re in the same room.” To improve accuracy, the team used a novel kind of ultrasound and adjusted the volume and frequency to maximize reach without being audible.
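The sketch below shows one generic way a device could check whether a high-frequency tone is present in a block of microphone samples, using the Goertzel algorithm. The article does not describe Meet’s actual ultrasound signal, so the 20 kHz frequency used in the example, the threshold, and the names `goertzel_power` and `beacon_present` are all assumptions made for illustration.

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Return the squared magnitude of the DFT bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def beacon_present(samples, sample_rate, beacon_hz, threshold_ratio=0.1):
    """Guess whether a tone near beacon_hz carries a meaningful share of the energy.

    The ratio is close to 1.0 for a pure tone at the target bin and close to
    0.0 when the block contains no energy at that frequency.
    """
    total_energy = sum(x * x for x in samples)
    if total_energy == 0.0:
        return False
    power = goertzel_power(samples, sample_rate, beacon_hz)
    ratio = 2.0 * power / (len(samples) * total_energy)
    return ratio >= threshold_ratio
```

For example, with a 48 kHz sample rate, `beacon_present(block, 48000, 20000)` asks whether a 20 kHz tone stands out in `block`. A plain tone detector like this cannot tell a nearby laptop from sound leaking in from the next room, which is the kind of reliability problem Henrik describes.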

Adaptive audio turns on automatically when Meet recognizes that several laptops are present in the same room, synchronizing the microphones and speakers on each laptop without shutting any of them down. Depending on who is speaking, it alternates between microphones to avoid echo and feedback. Before sending the audio to other participants, Meet also applies backend processing and a cloud denoiser to improve audio quality and eliminate background noise.
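Alternating between microphones can be pictured with a simple rule: pick the device whose microphone currently hears the most energy, and only switch when another device is clearly louder. The sketch below is a simplified illustration under that assumption, not Meet’s actual selection logic; `select_active_mic` and `switch_margin` are hypothetical names.

```python
def select_active_mic(frames_by_device, current=None, switch_margin=2.0):
    """Pick which device's microphone to use for the next audio frame.

    frames_by_device: dict mapping a device id to its latest list of samples.
    current:          the device selected for the previous frame, if any.
    switch_margin:    how many times louder another device must be before
                      we switch (hysteresis, to avoid rapid flip-flopping).
    """
    energies = {
        device: sum(x * x for x in frame) / len(frame)
        for device, frame in frames_by_device.items()
        if frame
    }
    if not energies:
        return current
    loudest = max(energies, key=energies.get)
    # Stay on the current microphone unless another one is clearly louder.
    if current in energies and energies[loudest] < switch_margin * energies[current]:
        return current
    return loudest
```

The margin keeps the selection from jumping back and forth when two people sit at a similar distance from two laptops.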

Adaptive audio is already used in numerous Google meetings every day, often without the participants’ knowledge. It is one of those technologies that relieves the user’s cognitive burden. Before attending a meeting, people don’t need to wonder whether they’re set up correctly, explains Ahmed Aly, Meet Interaction Design Lead. No matter how complex and incredible the tech is, from the end user’s perspective it simply works whenever they open their laptop and join a meeting.

Looking ahead, the team is still investigating ways to make it easier to connect, particularly in situations where meeting rooms or conferencing equipment are not available. “We hope it gives more flexibility and improves meeting equity and participation,” Huib says. “You can be seen and heard clearly from wherever you are sitting because the camera and microphone are directly in front of you.”
