Latest research shows cell phone calls can be tapped remotely with radar sensors
Researchers have demonstrated a method for detecting the vibrations of a mobile phone's speaker and transcribing what the person on the other end of the call is saying with up to 83 percent accuracy. A team at Pennsylvania State University used an off-the-shelf automotive radar sensor and a new processing approach to expose this security issue.
“As technologies become more robust and reliable over time, it becomes more likely that such sensor technologies will be misused by attackers,” said Suryoday Basak, a doctoral student at Pennsylvania State University.
“Our demonstration of this kind of exploitation adds to the pool of scientific literature that broadly says, ‘Hey! Car radars can be used to listen to audio. We need to do something about it,’” Basak said.
The radar operates in the millimeter wave (mmWave) band, specifically at 60 to 64 GHz and 77 to 81 GHz, which led the researchers to name their approach “mmSpy.” Millimeter wave is a subset of the radio frequency spectrum used for 5G, the fifth-generation standard for communications systems around the world.
In the mmSpy demonstration, described at the 2022 IEEE Symposium on Security and Privacy (SP), the researchers simulated people speaking through a smartphone's speaker.
Speech vibrates the speaker of the phone, and this vibration spreads throughout the body of the phone.
“We use radar to sense this vibration and reconstruct what the person on the other side of the line said,” Basak said.
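To give a sense of why millimeter wave radar can register such small movements, the following Python sketch applies the standard radar relation between a reflecting surface's displacement and the phase of the returned signal. The article does not state this formula; the 77 GHz carrier is taken from the band mentioned above, and the 1 micrometer vibration amplitude is a purely hypothetical value chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def wavelength_m(freq_hz):
    """Free-space wavelength for a given carrier frequency."""
    return C / freq_hz


def phase_shift_rad(displacement_m, freq_hz):
    """Phase change of the echo when the reflecting surface moves.

    The signal travels to the surface and back, so the path length
    changes by twice the displacement (hence 4*pi rather than 2*pi).
    """
    return 4 * math.pi * displacement_m / wavelength_m(freq_hz)


lam = wavelength_m(77e9)
shift = phase_shift_rad(1e-6, 77e9)  # assumed 1 micrometer vibration

print(f"wavelength at 77 GHz: {lam * 1e3:.2f} mm")
print(f"phase shift for 1 um: {math.degrees(shift):.3f} degrees")
```

Because the wavelength is only a few millimeters, even micrometer-scale motion of the phone body produces a phase change that a sensitive receiver can, in principle, track over time.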
The researchers, including Mahanth Gowda, an assistant professor at Pennsylvania State University, noted that their approach works even when the sound is completely inaudible to both people and nearby microphones.
“This is not the first time such vulnerabilities or attacks have been discovered, but this particular aspect of detecting and recovering speech from the other side of a smartphone’s line has not yet been investigated,” Basak said.
Radar sensor data is pre-processed with MATLAB and Python modules to remove hardware-related noise and artifacts from the data.
The researchers then feed the cleaned data into machine learning modules trained to classify speech and reconstruct audio.
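The article does not describe the pre-processing in detail, so the following is only a minimal sketch of one common kind of clean-up step, not the researchers' actual pipeline. It builds a synthetic signal (a small 400 Hz tone standing in for speech vibration, plus a constant offset and slow drift standing in for hardware artifacts) and removes the slow component by subtracting a moving average. The sample rate, tone frequency, amplitudes, and window size are all assumptions made for illustration.

```python
import math


def moving_average(samples, window):
    """Centered moving average; near the edges it uses the samples available."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


def remove_drift(samples, window):
    """Subtract the slow trend so only the fast vibration remains."""
    trend = moving_average(samples, window)
    return [s - t for s, t in zip(samples, trend)]


SAMPLE_RATE = 8000  # samples per second (assumed)
TONE_HZ = 400       # stand-in for a speech-band vibration (assumed)
n = 800

# Synthetic vibration: a small tone, 20 samples per period.
tone = [0.01 * math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE) for i in range(n)]
# Synthetic artifact: a constant offset plus a slow linear drift.
drift = [1.0 + 0.5 * i / n for i in range(n)]
raw = [t + d for t, d in zip(tone, drift)]

# A 41-sample window spans about two tone periods, so the tone averages out
# of the trend estimate while the slow drift is captured.
cleaned = remove_drift(raw, window=41)
```

In a real system the cleaned signal would then be passed to the trained models the article mentions; that stage is outside the scope of this sketch.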
When the radar picks up vibrations at a distance of a foot, the accuracy of the processed speech is 83 percent. It drops as the radar moves away from the phone, they say, to 43 percent accuracy at six feet.
Once speech is reconstructed, researchers can then filter, refine, or categorize keywords as needed, Basak said.
The team continues to refine their approach to better understand not only how to protect against this vulnerability, but also how to use it for good.
“The methodology we have developed can also be used to measure vibration in industrial equipment, smart home systems, and building monitoring systems,” Basak said.
Home care systems, and even health monitoring systems, could similarly benefit from such sensitive tracking, the researchers say.
“Imagine a radar that can track the user and call for help if any health parameter changes in a dangerous way,” Basak said.
“With the right set of targeted actions, radars in smart homes and industries can provide a faster response when problems are detected,” he added.