A person’s voice is one of the most fundamental attributes that enables communication with others, whether in physical proximity, at remote locations using phones or radios, or over the Internet using digital media. However, unbeknownst to them, people often leave traces of their voices in many different scenarios and contexts. It is relatively easy for someone, potentially with malicious intentions, to “record” a person’s voice by being in close physical proximity to the speaker (using, for example, a mobile phone), by social engineering tricks such as making a spam call, by searching and mining audiovisual clips online, or even by compromising cloud servers that store such audio information. The more popular a person is (e.g., a celebrity or a famous academician), the easier it is to obtain his/her voice samples.
In this work, we study the implications of such commonplace leakage of people’s voice snippets. We show that the consequences of imitating one’s voice can be grave. Since voice is regarded as a unique characteristic of a person, it forms the basis of authenticating that person. If voice can be imitated, the authentication function itself is compromised, whether performed implicitly by a human in human-to-human communications or explicitly by a machine in human-to-machine interactions. Equipped with current advancements in automated speech synthesis, our attacker can build a very close model of a victim’s voice after learning from only a very limited number of samples of the victim’s voice (e.g., mined from the Internet, or recorded in physical proximity). Specifically, the attacker uses voice morphing techniques to transform his or her own voice – speaking any arbitrary message – into the victim’s voice.
As our case study in this work, we investigate the consequences of stealing voices in two important applications and contexts that rely upon voice as an authentication primitive. The first application is a voice-based biometric or speaker verification system, which uses the potentially unique features of an individual’s voice to authenticate that individual. Our second application, naturally, is human communication. If an attacker can imitate a victim’s voice, the security of (remote) arbitrary conversations could be compromised. The attacker could make the morphing system speak literally anything the attacker wants, in the victim’s tone and style of speaking, and could launch attacks that harm the victim’s reputation, his/her security/safety, and the security/safety of people around the victim.
We develop our voice impersonation attacks using an off-the-shelf voice morphing tool, and evaluate their feasibility against state-of-the-art automated speaker verification algorithms (application 1) as well as human verification (application 2). Our results show that the automated systems are largely ineffective against our attacks: the average rates at which fake voices were rejected were under 10-20% for most victims. Even human verification is vulnerable to our attacks. Based on two online studies with about 100 users, we found that people rejected the morphed voice samples of two celebrities, as well as of briefly familiar users, only about 50% of the time on average. The following figure shows an overview of our work.
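At a high level, a speaker verification system scores a trial utterance against an enrolled speaker model and accepts if the score clears a threshold; our attacks succeed when a morphed utterance clears that threshold. As a minimal, hypothetical sketch (not the actual systems evaluated in this work, and with toy embeddings in place of a real acoustic front-end), the decision can be illustrated as a cosine-similarity test between fixed-length voice embeddings:

```python
import numpy as np

def cosine_score(enrolled: np.ndarray, trial: np.ndarray) -> float:
    """Cosine similarity between an enrolled speaker embedding and a trial embedding."""
    return float(np.dot(enrolled, trial) /
                 (np.linalg.norm(enrolled) * np.linalg.norm(trial)))

def verify(enrolled: np.ndarray, trial: np.ndarray, threshold: float = 0.7) -> bool:
    """Accept the trial utterance if its similarity score clears the threshold."""
    return cosine_score(enrolled, trial) >= threshold

# Toy, made-up embeddings; a real system would derive these from the audio
# (e.g., via spectral features and a statistical speaker model).
victim = np.array([0.9, 0.1, 0.4])
genuine_trial = np.array([0.85, 0.15, 0.38])  # same speaker, slight variation
morphed_trial = np.array([0.8, 0.2, 0.45])    # attacker's voice morphed toward the victim

print(verify(victim, genuine_trial))  # a close match scores near 1.0 and is accepted
print(verify(victim, morphed_trial))  # a good morph can also clear the threshold
```

The sketch shows why the attack matters: the verifier has no notion of “genuine vs. synthesized” audio, only of similarity to the enrolled model, so a sufficiently close morph is indistinguishable from the legitimate speaker at decision time.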
- Maliheh Shirvanian (PhD candidate)
- Dibya Mukhopadhyay (@UAB; Master 2016; now Sr. Data Analyst at Westfield Retail Solutions)
- Defeating Hidden Audio Channel Attacks on Voice Assistants via Audio-Induced Surface Vibrations
Chen Wang, Abhishek Anand, Jian Liu, Payton Walker, Yingying Chen and Nitesh Saxena
In Annual Computer Security Applications Conference (ACSAC), December 2019
- Quantifying the Breakability of Mobile Assistants
[Runner Up, The Mark Weiser Best Paper Award]
Maliheh Shirvanian, Summer Vo and Nitesh Saxena
In International Conference on Pervasive Computing and Communications (PerCom), March 2019.
- All your voices are belong to us: Stealing voices to fool humans and machines.
Dibya Mukhopadhyay, Maliheh Shirvanian and Nitesh Saxena
In European Symposium on Research in Computer Security (ESORICS), September 2015.
- Your Phone Compass Can Stop Voice Hacks for This Scientific Reason, Inverse Innovation, Jun 7, 2017
- Artificial intelligence can speak in our own voice within moments, Bitport, Apr 27, 2017
- Sinister startup claims it can imitate any voice in just one minute, The INQUIRER, Apr 25, 2017
- Imitating people’s speech patterns precisely could bring trouble, The Economist, Apr 20, 2017
- Voice Hackers Can Record Your Voice Then Use Morpher To Trick Authentication Systems, LinkedIn, Oct 18, 2015
- Automated voice can fool humans, machines, Bangalore Mirror, Oct 2, 2015
- Voice hackers can record your voice then use morpher to trick authentication systems, The Rumor Mill News, Oct 2, 2015
- UAB research finds automated voice imitation can fool humans and machines, UAB News, Sep 25, 2015
- Research Finds Automated Voice Imitation Can Fool Humans and Machines, Communications of the ACM, Sep 28, 2015
- UAB Research Finds Automated Voice Imitation Can Fool Humans and Machines, ACM TechNews, Sep 30, 2015
- Voice hackers can record your voice then use morpher to trick authentication systems, Network World, Sep 30, 2015
- UAB researchers find that automated voice imitation can spoof voice authentication systems, Biometric Update, Sep 28, 2015
- Hackers can imitate your voice to trick authentication systems, The Stack, Sep 28, 2015
- Automated Voice Imitation fools Humans and Machines, Scientific Computing, Sep 29, 2015
- Voice-based user authentication is not as secure as we thought, ITProPortal, Sep 29, 2015
- Research finds automated voice imitation can fool humans and machines, PHYS ORG, Sep 28, 2015
- Experts warn of morphing threat to voice biometrics, Planet Biometric, Sep 28, 2015
- How hackers could steal your voice to access your bank account, International Business Times, Sep 28, 2015
- Biometrics: Advances Smack Down Workarounds, Bank Info Security, Sep 29, 2015
- Hackers steal your voice, Tech Eye Net, Sep 29, 2015
- Security systems tricked with voice copies, Presstext, Sep 29, 2015