Sonal Joshi's research while affiliated with Johns Hopkins University and other places
What is this page?
This page lists the scientific contributions of an author who either does not have a ResearchGate profile or has not yet added these contributions to their profile.
It was automatically generated by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (12)
Adversarial attacks are a threat to automatic speech recognition (ASR) systems, and it becomes imperative to propose defenses to protect them. In this paper, we perform experiments to show that K2 conformer hybrid ASR is strongly affected by white-box adversarial attacks. We propose three defenses: a denoiser pre-processor, adversarially fine-tuning...
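The white-box setting this abstract refers to gives the attacker access to the model's gradients. As a hedged illustration (not the paper's actual attack on the K2 conformer system), the classic FGSM recipe can be sketched on a toy linear "recognizer" whose input gradient is analytic; all names and shapes here are illustrative:

```python
import numpy as np

def fgsm_perturb(x, w, y, eps):
    """One FGSM step for the toy loss L = 0.5 * (w.x - y)^2.

    For this loss the input gradient is analytic: dL/dx = (w.x - y) * w.
    A real white-box ASR attack obtains the gradient via backpropagation.
    """
    grad_x = (w @ x - y) * w
    return x + eps * np.sign(grad_x)  # move each sample along the gradient sign

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # stand-in for a waveform frame
w = rng.standard_normal(16)   # stand-in for model weights
y = 0.0                       # output the model should produce
eps = 0.01                    # per-sample perturbation budget

x_adv = fgsm_perturb(x, w, y, eps)
clean_loss = 0.5 * (w @ x - y) ** 2
adv_loss = 0.5 * (w @ x_adv - y) ** 2
# The perturbation is bounded by eps per sample, yet the loss increases.
```

The same principle, a small gradient-aligned perturbation inside an imperceptibility budget, underlies the stronger iterative attacks typically evaluated against ASR systems.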
Adversarial attacks pose a severe security threat to the state-of-the-art speaker identification systems, thereby making it vital to propose countermeasures against them. Building on our previous work that used representation learning to classify and detect adversarial attacks, we propose an improvement to it using AdvEst, a method to estimate adve...
Adversarial examples are designed to fool the speaker recognition (SR) system by adding carefully crafted, human-imperceptible noise to the speech signals. Because they pose a severe security threat to state-of-the-art SR systems, it is vital to study these systems' vulnerabilities in depth. Moreover, it is of greater importance to propose countermeasures t...
Adversarial attacks have become a major threat to machine learning applications. There is growing interest in studying these attacks in the audio domain, e.g., speech and speaker recognition, and in finding defenses against them. In this work, we focus on using representation learning to classify/detect attacks w.r.t. the attack algorithm, threat model...
In this study, we analyze the use of speech and speaker recognition technologies and natural language processing to detect Alzheimer's disease (AD) and estimate Mini-Mental State Examination (MMSE) scores. We used speech recordings from the Interspeech 2021 ADReSSo challenge dataset. Our work focuses on adapting state-of-the-art speaker recognition and l...
The ubiquitous presence of machine learning systems in our lives necessitates research into their vulnerabilities and appropriate countermeasures. In particular, we investigate the effectiveness of adversarial attacks and defenses against automatic speech recognition (ASR) systems. We select two ASR models: a thoroughly studied DeepSpeech model an...
Research in automatic speaker recognition (SR) has been undertaken for several decades, achieving strong performance. However, researchers have discovered potential loopholes in these technologies, such as spoofing attacks. Quite recently, a new class of attack, termed adversarial attacks, has proved to be highly effective against computer vision systems, and it is vital to stud...
Citations
... Conv-TasNet [48]: We use an off-the-shelf Conv-TasNet as both the feedforward model for deep regression and the generator for GANs. It is a time-domain model that has been used for other tasks such as speech enhancement [49], [50] and source separation [48]. It consists of an encoder, a separator, and a decoder. ...
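The encoder / separator / decoder split mentioned in this snippet can be sketched in a few lines. This is a toy stand-in, not Conv-TasNet itself: the real model uses learned 1-D conv bases and a temporal convolutional network to predict the mask, whereas here a random analysis basis, its pseudo-inverse, and a trivial mask stand in purely to show the data flow:

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 8, 32                  # frame length, number of basis filters

encoder_basis = rng.standard_normal((N, L))    # analysis filters (learned in the real model)
decoder_basis = np.linalg.pinv(encoder_basis)  # synthesis filters, shape (L, N)

def encode(signal):
    """Split the waveform into non-overlapping frames and project onto the basis."""
    frames = signal.reshape(-1, L)             # (T, L)
    return frames @ encoder_basis.T            # (T, N) mixture representation

def separate(mixture_repr):
    """Estimate a mask over the representation (here: a trivial all-ones mask)."""
    mask = np.ones_like(mixture_repr)          # a TCN predicts this in Conv-TasNet
    return mixture_repr * mask

def decode(masked_repr):
    """Map the masked representation back to the time domain."""
    frames = masked_repr @ decoder_basis.T     # (T, L)
    return frames.reshape(-1)

x = rng.standard_normal(64)
y = decode(separate(encode(x)))
# With a pseudo-inverse decoder and an identity mask, y reconstructs x exactly.
```

For denoising or separation, the mask would attenuate the components attributed to noise or to the interfering source instead of passing everything through.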
... These results further verify the effectiveness of using pretrained models. (Samangouei et al., 2018) and Joint Adversarial Finetuning (Joshi et al., 2022). For DefenseGAN, which is originally designed to defend against adversarial images by finding the optimal noise that generates the most similar image to the adversarial counterpart, we adapt it to the audio domain, choosing WaveGAN (Donahue et al., 2018) as the GAN model in this pipeline. ...
... Passive defense does not modify the ASV model; instead, it defends against adversarial attacks with a mitigation or detection component. For example, the works in [25], [26], [27] proposed to remove the adversarial noise with an adversarial separation network, a Parallel-Wave-GAN (PWG) module, and a cascaded self-supervised-learning-based reformer (SSLR), respectively. Wu et al. [28] also employed a voting strategy with random sampling to mitigate adversarial attacks. ...
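The voting strategy this snippet credits to Wu et al. [28] can be sketched as follows. This is a hedged toy version under simplifying assumptions: the cosine-similarity scorer below stands in for a real ASV model, and the noisy-resample-and-average scheme illustrates the idea that a perturbation tuned to one exact input loses its effect when the input is randomly resampled:

```python
import numpy as np

def asv_score(x, enroll):
    """Toy speaker-verification score: cosine similarity to an enrollment vector."""
    return float(x @ enroll / (np.linalg.norm(x) * np.linalg.norm(enroll)))

def voted_score(x, enroll, n_votes=16, sigma=0.05, rng=None):
    """Average the score over n_votes randomly perturbed copies of the input."""
    rng = rng or np.random.default_rng(0)
    scores = [asv_score(x + sigma * rng.standard_normal(x.shape), enroll)
              for _ in range(n_votes)]
    return float(np.mean(scores))

rng = np.random.default_rng(1)
enroll = rng.standard_normal(32)                    # enrollment embedding
test_utt = enroll + 0.1 * rng.standard_normal(32)   # genuine trial
s = voted_score(test_utt, enroll)
# On a genuine trial the random noise averages out and s stays high,
# while a gradient-crafted perturbation is disrupted by the resampling.
```

Majority voting over thresholded per-sample decisions is the other common aggregation; averaging the raw scores keeps the sketch short.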
... Cummins et al. [36] and Rohanian et al. [37] combined public acoustic features with neural networks and achieved accuracies of 70.8% and 66.6%, respectively. Acoustic embeddings as speech features have started to attract the attention of numerous researchers and have achieved good performance in AD detection [38, 40-43]. There appears to be a trade-off between accuracy and convenience. ...
... We focus on attacks on speaker recognition systems, particularly on the state-of-the-art x-vector based system. Our previous work [15] proposed using embeddings obtained by representation learning of adversarial examples as attack signatures to retrieve information about the adversarial attack. This information, which includes the attack algorithm type, threat model, and signal-to-adversarial-noise ratio, could help in learning about the attacker's identity and intentions. ...
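The signal-to-adversarial-noise ratio mentioned in this snippet is the ordinary SNR with the adversarial perturbation playing the role of noise. A minimal sketch, using the standard 10·log10 power-ratio convention (the function name and signal values are illustrative):

```python
import numpy as np

def adversarial_snr_db(clean, adversarial):
    """SNR in dB between a clean signal and its adversarial counterpart."""
    noise = adversarial - clean                       # the crafted perturbation
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)                         # stand-in clean waveform
x_adv = x + 0.01 * rng.standard_normal(1000)          # small additive perturbation
snr = adversarial_snr_db(x, x_adv)                    # roughly 40 dB at this level
```

Higher values mean a quieter, harder-to-perceive perturbation, which is why the ratio is a useful signature of how aggressive an attack was.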