I Spy You: Eavesdropping Continuous Speech on Smartphones via Motion Sensors

Shijia Zhang, Yilin Liu, Mahanth Gowda

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents iSpyU, a system that demonstrates the feasibility of recognizing natural speech content played on a phone during conference calls (Skype, Zoom, etc.) using a fusion of motion sensors such as the accelerometer and gyroscope. While microphones require user permission before an app can access them, motion sensors are zero-permission sensors and are therefore accessible to a developer without alerting the user. This allows a malicious app to potentially eavesdrop on sensitive speech content played by the user's phone. In designing the attack, iSpyU tackles a number of technical challenges, including (i) the low sampling rate of motion sensors (500 Hz, compared with 44 kHz for a microphone) and (ii) the lack of large-scale training datasets for training Automatic Speech Recognition (ASR) models with motion sensors. iSpyU systematically addresses these challenges through a combination of techniques in synthetic training data generation, ASR modeling, and domain adaptation. Extensive measurement studies on modern smartphones show a word-level accuracy of 53.3-59.9% over a dictionary of 2000-10000 words, and a character-level accuracy of 70.0-74.8%. We believe such levels of accuracy pose a significant threat when viewed from a privacy perspective.

Original language: English (US)
Article number: 197
Journal: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume: 6
Issue number: 4
State: Published - Jan 11 2023

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Hardware and Architecture
  • Computer Networks and Communications
