Article

Toward Stealthy Backdoor Attacks Against Speech Recognition via Elements of Sound


Abstract

Deep neural networks (DNNs) have been widely and successfully adopted and deployed in various applications of speech recognition. Recently, a few works revealed that these models are vulnerable to backdoor attacks, where the adversaries can implant malicious prediction behaviors into victim models by poisoning their training process. In this paper, we revisit poison-only backdoor attacks against speech recognition. We reveal that existing methods are not stealthy, since their trigger patterns are perceptible to humans or machine detection. This limitation is mostly because their trigger patterns are simple noises or separable and distinctive clips. Motivated by these findings, we propose to exploit elements of sound (e.g., pitch and timbre) to design more stealthy yet effective poison-only backdoor attacks. Specifically, we insert a short-duration high-pitched signal as the trigger and increase the pitch of the remaining audio to ‘mask’ it, yielding stealthy pitch-based triggers. We manipulate the timbre features of victim audio to design the stealthy timbre-based attack and design a voiceprint selection module to facilitate the multi-backdoor attack. Our attacks can generate more ‘natural’ poisoned samples and are therefore more stealthy. Extensive experiments are conducted on benchmark datasets, which verify the effectiveness of our attacks under different settings (e.g., all-to-one, all-to-all, clean-label, physical, and multi-backdoor settings) and their stealthiness. Our methods achieve attack success rates of over 95% in most cases and are nearly undetectable. The code for reproducing the main experiments is available at https://github.com/HanboCai/BadSpeech_SoE.
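As a rough illustration of the pitch-based trigger described in the abstract, the sketch below is not taken from the authors' repository linked above; it assumes librosa for pitch shifting, and the shift amount, trigger frequency, duration, placement, and function names are illustrative choices.

```python
# Minimal sketch of a pitch-based audio trigger (illustrative, not the authors' code).
import numpy as np
import librosa

def poison_with_pitch_trigger(wav: np.ndarray, sr: int,
                              shift_semitones: float = 4.0,
                              trigger_freq: float = 7000.0,
                              trigger_dur: float = 0.05) -> np.ndarray:
    # 1) Raise the pitch of the whole clip to 'mask' the later trigger.
    shifted = librosa.effects.pitch_shift(wav, sr=sr, n_steps=shift_semitones)
    # 2) Build a short high-pitched tone to act as the trigger.
    t = np.arange(int(trigger_dur * sr)) / sr
    tone = 0.1 * np.sin(2 * np.pi * trigger_freq * t).astype(shifted.dtype)
    # 3) Overlay the tone at the start of the clip (the position is a free choice).
    poisoned = shifted.copy()
    n = min(len(tone), len(poisoned))
    poisoned[:n] += tone[:n]
    return np.clip(poisoned, -1.0, 1.0)
```

In the poison-label setting, a sample built this way would then be relabelled with the target class before being mixed into the training set.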


... First, at the time of writing, there are only a few other attempts to strike backdoor poisoning attacks on speech-related models based on transformers, with only Mengara [21] specifically performing poisoning on SR transformer-based models. Cai et al. [22] attack two transformer-based models, but these models are deployed for KWS, not for general SR tasks. Second, many of the papers that we analyse in the remainder of this section [10]-[15], [22]-[24], focus on KWS systems, using as a benchmark the Google Speech Commands dataset [2], which contains only single-word utterances. ...
... Cai et al. [22] attack two transformer-based models, but these models are deployed for KWS, not for general SR tasks. Second, many of the papers that we analyse in the remainder of this section [10]-[15], [22]-[24], focus on KWS systems, using as a benchmark the Google Speech Commands dataset [2], which contains only single-word utterances. Therefore, most manuscripts strike backdoor poisoning attacks by injecting only single words or labels, out of a restricted list of words. ...
... For example, in their stylistic backdoor attack, Koffas et al. [23] employ audio effects, such as reverb and chorus, for producing malicious triggers. Cai et al. [22] present an approach where they first transpose the targeted samples upwards in pitch, and then hide high-frequency triggers in the loudest part of each sample. The authors test their attack on two transformer-based models: a keyword-spotting model developed by Berg et al. [25] and an audio classifier by Gazneli et al. [26]. ...
Preprint
Full-text available
Thanks to the popularisation of transformer-based models, speech recognition (SR) is gaining traction in various application fields, such as industrial and robotics environments populated with mission-critical devices. While transformer-based SR can provide various benefits for simplifying human-machine interfacing, the research on the cybersecurity aspects of these models, particularly backdoor poisoning attacks, is lacklustre. In this paper, we propose a new poisoning approach that maps different environmental trigger sounds to target phrases of different lengths during the fine-tuning phase. We test our approach on Whisper, one of the most popular transformer-based SR models, showing that it is highly vulnerable to our attack under several testing conditions. To mitigate the attack proposed in this paper, we investigate the use of Silero VAD, a state-of-the-art voice activity detection (VAD) model, as a defence mechanism. Our experiments show that it is possible to use VAD models to filter out malicious triggers and mitigate our attacks, with a varying degree of success depending on the type of trigger sound and testing conditions.
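The defence idea can be sketched as a pre-filter that keeps only detected speech before transcription. The snippet below uses the torch.hub entry point documented in the snakers4/silero-vad repository; the utility names, their ordering, and the file name are assumptions that may differ between versions, and this is not the evaluation code used in the preprint.

```python
# Hedged sketch of the VAD-based defence: drop non-speech audio before it reaches
# the speech recogniser, so purely environmental trigger sounds are filtered out.
import torch

model, utils = torch.hub.load('snakers4/silero-vad', 'silero_vad')
(get_speech_timestamps, save_audio, read_audio, VADIterator, collect_chunks) = utils

wav = read_audio('possibly_poisoned.wav', sampling_rate=16000)   # hypothetical file
speech_ts = get_speech_timestamps(wav, model, sampling_rate=16000)
speech_only = collect_chunks(speech_ts, wav)   # keep only segments classified as speech
# 'speech_only', rather than the raw waveform, is then passed to the SR model
# (e.g. Whisper), so an environmental trigger that carries no speech never reaches it.
```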
... They used these inaudible triggers to successfully attack Android applications, demonstrating the significant real-world threat posed by their proposed method. Cai et al. [80] utilized sound elements (such as pitch and timbre) to design more covert yet effective poison-only backdoor attacks. They manipulated the timbre features of the victim's audio to create timbre-based stealthy attacks and designed a voiceprint selection module to facilitate multi-backdoor attacks. ...
... Point clouds are transformed into graph spectral signals to embed the backdoor trigger. Cai et al. [80] inserted a short-duration high-pitched signal as a trigger and increased the pitch of the remaining audio clips to “mask” it, thereby designing a pitch-based stealthy trigger. ...
Article
Full-text available
Deep learning, as an important branch of machine learning, has been widely applied in computer vision, natural language processing, speech recognition, and more. However, recent studies have revealed that deep learning systems are vulnerable to backdoor attacks. Backdoor attackers inject a hidden backdoor into the deep learning model, such that the predictions of the infected model will be maliciously changed if the hidden backdoor is activated by input with a backdoor trigger while behaving normally on any benign sample. This kind of attack can potentially result in severe consequences in the real world. Therefore, research on defending against backdoor attacks has emerged rapidly. In this article, we have provided a comprehensive survey of backdoor attacks, detections, and defenses previously demonstrated on deep learning. We have investigated widely used model architectures, benchmark datasets, and metrics in backdoor research and have classified attacks, detections and defenses based on different criteria. Furthermore, we have analyzed some limitations in existing methods and, based on this, pointed out several promising future research directions. Through this survey, beginners can gain a preliminary understanding of backdoor attacks and defenses. Furthermore, we anticipate that this work will provide new perspectives and inspire extra research into the backdoor attack and defense methods in deep learning.
... Most existing backdoor attacks focus on image classification tasks [19], [20], [21], [22], [23], while a few have been extended to other tasks [24], [25], [26], [27], [28]. However, backdoor attacks on human motion prediction models remain uncharted. ...
... An even more inconspicuous branch of clean-label backdoor attacks [23], [22], [38] can achieve target misclassification by hiding the backdoor triggers into the training images without altering their labels. While most existing backdoor attacks have been developed for image classification tasks, some successful attacks have also been reported in other task domains, such as speech recognition [24], [25], graph classification [26], [27], crowd counting [28], and point cloud classification [39], [40]. ...
Preprint
Precise future human motion prediction over subsecond horizons from past observations is crucial for various safety-critical applications. To date, only one study has examined the vulnerability of human motion prediction to evasion attacks. In this paper, we propose BadHMP, the first backdoor attack that targets specifically human motion prediction. Our approach involves generating poisoned training samples by embedding a localized backdoor trigger in one arm of the skeleton, causing selected joints to remain relatively still or follow predefined motion in historical time steps. Subsequently, the future sequences are globally modified to the target sequences, and the entire training dataset is traversed to select the most suitable samples for poisoning. Our carefully designed backdoor triggers and targets guarantee the smoothness and naturalness of the poisoned samples, making them stealthy enough to evade detection by the model trainer while keeping the poisoned model unobtrusive in terms of prediction fidelity to untainted sequences. The target sequences can be successfully activated by the designed input sequences even with a low poisoned sample injection ratio. Experimental results on two datasets (Human3.6M and CMU-Mocap) and two network architectures (LTD and HRI) demonstrate the high-fidelity, effectiveness, and stealthiness of BadHMP. Robustness of our attack against fine-tuning defense is also verified.
... Considering the characteristics of speech, speech backdoor attacks can be classified into (1) methods based on the addition of extra noisy speech and perturbations of signals (noise trigger or perturbation trigger) [8]-[15], and (2) methods based on the modification of speech components/elements (element trigger) [16]-[19]. Koffas et al. [8] proposed a series of perturbation operations (e.g., pitch shift, reverberation, and chorus) that apply digital music effects as a perturbation trigger. ...
... On the other hand, Ye et al. [16], [18] proposed VSVC, which treats timbre as the speech backdoor attack trigger. Cai et al. [19] also demonstrated that pitch and timbre triggers can be combined as element triggers for multi-target attacks, achieving excellent attack effectiveness on speech classification models. ...
Preprint
Full-text available
Deep speech classification tasks, mainly keyword spotting and speaker verification, play a crucial role in speech-based human-computer interaction. Recently, the security of these technologies has been shown to be vulnerable to backdoor attacks. Specifically, existing triggers attack speech samples through noise disruption and component modification. We suggest that speech backdoor attacks can instead strategically focus on emotion, a higher-level subjective perceptual attribute inherent in speech. Furthermore, we propose that emotional voice conversion technology can serve as the speech backdoor trigger; we call the resulting method EmoAttack. Based on this, we conducted attack experiments on two speech classification tasks, showing that EmoAttack yields an effective trigger with a remarkable attack success rate and accuracy variance. Additionally, ablation experiments found that speech with intense emotion is a more suitable target for attacks.
... alone spends approximately $48 billion annually on software testing [14]. Therefore, automated robustness testing of LLM-based software is crucial to ensure stable performance and maintain the quality of software responses [15]. ...
Preprint
Full-text available
Benefiting from the advancements in LLMs, NLP software has undergone rapid development. Such software is widely employed in various safety-critical tasks, such as financial sentiment analysis, toxic content moderation, and log generation. To our knowledge, there are no known automated robustness testing methods specifically designed for LLM-based NLP software. Given the complexity of LLMs and the unpredictability of real-world inputs (including prompts and examples), it is essential to examine the robustness of overall inputs to ensure the safety of such software. To this end, this paper introduces the first AutOmated Robustness Testing frAmework, AORTA, which reconceptualizes the testing process as a combinatorial optimization problem. Existing testing methods designed for DNN-based software can be applied to LLM-based software through AORTA, but their effectiveness is limited. To address this, we propose a novel testing method for LLM-based software within AORTA called Adaptive Beam Search (ABS). ABS is tailored to the expansive feature space of LLMs and improves testing effectiveness through an adaptive beam width and the capability for backtracking. We successfully embed 18 testing methods in the designed framework AORTA and compare the test validity of ABS on three datasets and five threat models. ABS facilitates a more comprehensive and accurate robustness assessment before software deployment, with an average test success rate of 86.138%. Compared to the currently best-performing baseline, PWWS, ABS significantly reduces the computational overhead by up to 3441.895 seconds per successful test case and decreases the number of queries by 218.762 times on average. Furthermore, test cases generated by ABS exhibit greater naturalness and transferability.
... Traditional backdoor attacks. Traditional backdoor attacks primarily target deep neural networks (DNNs) and have proven effective in various domains, including image classification [17], [18], [29], [32], [35], [49], [50], [61], [75], natural language processing [6], [20], [42], [43], [55], and speech recognition [2], [34], [36]. ...
Preprint
Full-text available
Large language models (LLMs) have seen significant advancements, achieving superior performance in various Natural Language Processing (NLP) tasks, from understanding to reasoning. However, they remain vulnerable to backdoor attacks, where models behave normally for standard queries but generate harmful responses or unintended output when specific triggers are activated. Existing backdoor defenses often suffer from drawbacks: they either focus on detection without removal, rely on rigid assumptions about trigger properties, or prove ineffective against advanced attacks like multi-trigger backdoors. In this paper, we present a novel method to eliminate backdoor behaviors from LLMs through the construction of information conflicts using both internal and external mechanisms. Internally, we leverage a lightweight dataset to train a conflict model, which is then merged with the backdoored model to neutralize malicious behaviors by embedding contradictory information within the model's parametric memory. Externally, we incorporate convincing contradictory evidence into the prompt to challenge the model's internal backdoor knowledge. Experimental results on classification and conversational tasks across 4 widely used LLMs demonstrate that our method outperforms 8 state-of-the-art backdoor defense baselines. We can reduce the attack success rate of advanced backdoor attacks by up to 98% while maintaining over 90% clean data accuracy. Furthermore, our method has proven to be robust against adaptive backdoor attacks. The code will be open-sourced upon publication.
... The attacked models behave normally on benign samples, whereas their prediction will be maliciously changed whenever the trigger pattern arises. Currently, most of the existing backdoor attacks were designed for image classification tasks [4], [16], [18], [21], [31], [36], [47], although there were also a few for others [5], [12], [14], [29], [44], [48]. These methods could be divided into two main categories: poison-label and clean-label attacks, depending on whether the labels of modified poisoned samples are consistent with their ground-truth ones. ...
Preprint
Full-text available
Recently, point clouds have been widely used in computer vision, whereas their collection is time-consuming and expensive. As such, point cloud datasets are the valuable intellectual property of their owners and deserve protection. To detect and prevent unauthorized use of these datasets, especially for commercial or open-sourced ones that cannot be sold again or used commercially without permission, we intend to identify whether a suspicious third-party model is trained on our protected dataset under the black-box setting. We achieve this goal by designing a scalable clean-label backdoor-based dataset watermark for point clouds that ensures both effectiveness and stealthiness. Unlike existing clean-label watermark schemes, which are susceptible to the number of categories, our method could watermark samples from all classes instead of only from the target one. Accordingly, it can still preserve high effectiveness even on large-scale datasets with many classes. Specifically, we perturb selected point clouds with non-target categories in both shape-wise and point-wise manners before inserting trigger patterns without changing their labels. The features of perturbed samples are similar to those of benign samples from the target class. As such, models trained on the watermarked dataset will have a distinctive yet stealthy backdoor behavior, i.e., misclassifying samples from the target class whenever triggers appear, since the trained DNNs will treat the inserted trigger pattern as a signal to deny predicting the target label. We also design a hypothesis-test-guided dataset ownership verification based on the proposed watermark. Extensive experiments on benchmark datasets are conducted, verifying the effectiveness of our method and its resistance to potential removal methods.
... In addition to inaudible attacks, researchers have explored other security and privacy concerns related to voice assistant applications. Covert channels using internal speakers have been demonstrated [7], and stealthy backdoor attacks against speech recognition systems have been proposed [8]. Surveys have highlighted the need for improved security and privacy measures in voice assistant applications [9]. ...
Conference Paper
The widespread adoption of voice-activated systems has modified routine human-machine interaction but has also introduced new vulnerabilities. This paper investigates the susceptibility of automatic speech recognition (ASR) algorithms in these systems to interference from near-ultrasonic noise. Building upon prior research that demonstrated the ability of near-ultrasonic frequencies (16 kHz - 22 kHz) to exploit the inherent properties of micro electromechanical systems (MEMS) microphones, our study explores alternative privacy enforcement means using this interference phenomenon. We expose a critical vulnerability in the most common microphones used in modern voice-activated devices, which inadvertently demodulate near-ultrasonic frequencies into the audible spectrum, disrupting the ASR process. Through a systematic analysis of the impact of near-ultrasonic noise on various ASR systems, we demonstrate that this vulnerability is consistent across different devices and under varying conditions, such as broadcast distance and specific phoneme structures. Our findings highlight the need to develop robust countermeasures to protect voice-activated systems from malicious exploitation of this vulnerability. Furthermore, we explore the potential applications of this phenomenon in enhancing privacy by disrupting unauthorized audio recording or eavesdropping. This research underscores the importance of a comprehensive approach to securing voice-activated systems, combining technological innovation, responsible development practices, and informed policy decisions to ensure the privacy and security of users in an increasingly connected world.
... FORMULATING DATA POISONING ATTACKS. In this study, we focus (Figure 1) solely on backdoor attacks using model training data poisoning [38], [39], in which attackers can only manipulate training samples without having any information about the victim's model. We study the speech classification task, in which the model produces a probability vector over the C classes. ...
Article
Full-text available
Audio-based machine learning systems frequently use public or third-party data, which might be inaccurate. This exposes deep neural network (DNN) models trained on such data to potential data poisoning attacks. In this type of assault, attackers can train the DNN model using poisoned data, potentially degrading its performance. Another type of data poisoning attack that is extremely relevant to our investigation is label flipping, in which the attacker manipulates the labels for a subset of data. It has been demonstrated that these assaults may drastically reduce system performance, even for attackers with minimal abilities. In this study, we propose a backdoor attack named "DirtyFlipping", which uses a dirty-label ('label-on-label') technique to insert a trigger (clapping) into selected data patterns associated with the target class, thereby enabling a stealthy backdoor.
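A generic dirty-label poisoning loop of the kind this abstract describes might look as follows; the blend ratio, poison rate, trigger placement, and function names are illustrative assumptions rather than the paper's actual settings.

```python
# Generic dirty-label audio poisoning sketch (illustrative, not the paper's exact recipe):
# a trigger clip (e.g. a short clapping sound) is mixed into a fraction of samples,
# whose labels are then flipped to the attacker-chosen target class.
import random
import numpy as np

def poison_dataset(samples, labels, trigger: np.ndarray,
                   target_label: int, poison_rate: float = 0.01, alpha: float = 0.2):
    idx_to_poison = set(random.sample(range(len(samples)),
                                      int(poison_rate * len(samples))))
    poisoned_samples, poisoned_labels = [], []
    for i, (wav, lab) in enumerate(zip(samples, labels)):
        if i in idx_to_poison:
            wav = wav.copy()
            n = min(len(trigger), len(wav))
            wav[:n] = (1 - alpha) * wav[:n] + alpha * trigger[:n]   # blend in the trigger
            lab = target_label                                      # dirty label
        poisoned_samples.append(wav)
        poisoned_labels.append(lab)
    return poisoned_samples, poisoned_labels
```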
Article
Recently, point clouds have been widely used in computer vision, whereas their collection is time-consuming and expensive. As such, point cloud datasets are the valuable intellectual property of their owners and deserve protection. To detect and prevent unauthorized use of these datasets, especially for commercial or open-sourced ones that cannot be sold again or used commercially without permission, we intend to identify whether a suspicious third-party model is trained on our protected dataset under the black-box setting. We achieve this goal by designing a scalable clean-label backdoor-based dataset watermark for point clouds that ensures both effectiveness and stealthiness. Unlike existing clean-label watermark schemes, which are susceptible to the number of categories, our method can watermark samples from all classes instead of only from the target one. Accordingly, it can still preserve high effectiveness even on large-scale datasets with many classes. Specifically, we perturb selected point clouds with non-target categories in both shape-wise and point-wise manners before inserting trigger patterns without changing their labels. The features of perturbed samples are similar to those of benign samples from the target class. As such, models trained on the watermarked dataset will have a distinctive yet stealthy backdoor behavior, i.e., misclassifying samples from the target class whenever triggers appear, since the trained DNNs will treat the inserted trigger pattern as a signal to deny predicting the target label. We also design a hypothesis-test-guided dataset ownership verification based on the proposed watermark. Extensive experiments on benchmark datasets are conducted, verifying the effectiveness of our method and its resistance to potential removal methods.
Conference Paper
Full-text available
Saliency-based representation visualization (SRV) (e.g., Grad-CAM) is one of the most classical and widely adopted explainable artificial intelligence (XAI) methods for its simplicity and efficiency. It can be used to interpret deep neural networks by locating saliency areas contributing the most to their predictions. However, it is difficult to automatically measure and evaluate the performance of SRV methods due to the lack of ground-truth salience areas of samples. In this paper, we revisit the backdoor-based SRV evaluation, which is currently the only feasible method to alleviate the previous problem. We first reveal its implementation limitations and unreliable nature due to the trigger generalization of existing backdoor watermarks. Given these findings, we propose a generalization-limited backdoor watermark (GLBW), based on which we design a more faithful XAI evaluation. Specifically, we formulate the training of watermarked DNNs as a min-max problem, where we find the 'worst' potential trigger (with the highest attack effectiveness and differences from the ground-truth trigger) via inner maximization and minimize its effects and the loss over benign and poisoned samples via outer minimization in each iteration. In particular, we design an adaptive optimization method to find desired potential triggers in each inner maximization. Extensive experiments on benchmark datasets are conducted, verifying the effectiveness of our generalization-limited watermark. Our codes are available at https://github.com/yamengxi/GLBW.
Conference Paper
Full-text available
The prosperity of deep neural networks (DNNs) has largely benefited from open-source datasets, based on which users can evaluate and improve their methods. In this paper, we revisit backdoor-based dataset ownership verification (DOV), which is currently the only feasible approach to protect the copyright of open-source datasets. We reveal that these methods are fundamentally harmful given that they could introduce malicious misclassification behaviors to watermarked DNNs by the adversaries. In this paper, we design DOV from another perspective by making watermarked models (trained on the protected dataset) correctly classify some 'hard' samples that will be misclassified by the benign model. Our method is inspired by the generalization property of DNNs, where we find a 'hardly-generalized domain' for the original dataset (as its 'domain watermark'). It can be easily learned with the protected dataset containing modified samples. Specifically, we formulate the domain generation as a bi-level optimization and propose to optimize a set of visually-indistinguishable clean-label modified data with similar effects to domain-watermarked samples from the hardly-generalized domain to ensure watermark stealthiness. We also design a hypothesis-test-guided ownership verification via our domain watermark and provide the theoretical analyses of our method. Extensive experiments on three benchmark datasets are conducted, which verify the effectiveness of our method and its resistance to potential adaptive methods. The code for reproducing main experiments is available at https://github.com/JunfengGo/Domain-Watermark.
Article
Full-text available
Deep learning, especially deep neural networks (DNNs), has been widely and successfully adopted in many critical applications for its high effectiveness and efficiency. The rapid development of DNNs has benefited from the existence of some high-quality datasets (e.g., ImageNet), which allow researchers and developers to easily verify the performance of their methods. Currently, almost all existing released datasets require that they can only be adopted for academic or educational purposes rather than commercial purposes without permission. However, there is still no good way to ensure that. In this paper, we formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model, where defenders can only query the model while having no information about its parameters and training details. Based on this formulation, we propose to embed external patterns via backdoor watermarking for the ownership verification to protect them. Our method contains two main parts, including dataset watermarking and dataset verification. Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification. We also provide some theoretical analyses of our methods. Experiments on multiple benchmark datasets of different tasks are conducted, which verify the effectiveness of our method. The code for reproducing main experiments is available at https://github.com/THUYimingLi/DVBW.
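The hypothesis-test-guided verification step can be sketched generically as a paired one-sided t-test on target-class posteriors returned by the suspicious model; the exact statistic, inputs, and threshold used in the paper may differ, and the function name below is hypothetical.

```python
# Hedged sketch of hypothesis-test-guided dataset verification (not the exact DVBW code).
# Idea: if the suspicious model was trained on the watermarked dataset, its posterior
# probability for the target class should be significantly higher on trigger-stamped
# inputs than on their benign counterparts.
import numpy as np
from scipy import stats

def verify_ownership(p_target_triggered: np.ndarray,
                     p_target_benign: np.ndarray,
                     alpha: float = 0.05) -> bool:
    # Paired, one-sided t-test with H1: "triggered posteriors are larger".
    t_stat, p_value = stats.ttest_rel(p_target_triggered, p_target_benign,
                                      alternative='greater')
    return p_value < alpha   # rejecting H0 suggests the dataset was used for training
```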
Conference Paper
Full-text available
Deep neural networks (DNNs) are vulnerable to backdoor attacks, where adversaries embed a hidden backdoor trigger during the training process for malicious prediction manipulation. These attacks pose great threats to the applications of DNNs under the real-world machine learning as a service (MLaaS) setting, where the deployed model is fully black-box while the users can only query and obtain its predictions. Currently, there are many existing defenses to reduce backdoor threats. However, almost all of them cannot be adopted in MLaaS scenarios since they require getting access to or even modifying the suspicious models. In this paper, we propose a simple yet effective black-box input-level backdoor detection, called SCALE-UP, which requires only the predicted labels to alleviate this problem. Specifically, we identify and filter malicious testing samples by analyzing their prediction consistency during the pixel-wise amplification process. Our defense is motivated by an intriguing observation (dubbed scaled prediction consistency) that the predictions of poisoned samples are significantly more consistent compared to those of benign ones when amplifying all pixel values. Besides, we also provide theoretical foundations to explain this phenomenon. Extensive experiments are conducted on benchmark datasets, verifying the effectiveness and efficiency of our defense and its resistance to potential adaptive attacks. Our codes are available at https://github.com/JunfengGo/SCALE-UP.
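The scaled-prediction-consistency idea lends itself to a compact sketch; the amplification factors, single-sample interface, and function name below are simplifications and assumptions, not the released SCALE-UP code.

```python
# Hedged sketch of scaled prediction consistency: amplify pixel values and check
# whether the prediction stays the same; poisoned inputs tend to be more consistent.
import torch

@torch.no_grad()
def spc_score(model, x: torch.Tensor, scales=(2, 3, 4, 5)) -> float:
    """x: a single input image in [0, 1], shape (C, H, W)."""
    base_pred = model(x.unsqueeze(0)).argmax(dim=1)
    consistent = 0
    for n in scales:
        x_amp = torch.clamp(x * n, 0.0, 1.0)            # pixel-wise amplification
        pred = model(x_amp.unsqueeze(0)).argmax(dim=1)
        consistent += int((pred == base_pred).item())
    return consistent / len(scales)                      # high score -> likely poisoned

# A test sample is flagged as backdoored if spc_score(model, x) exceeds a chosen threshold.
```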
Conference Paper
Full-text available
Deep neural networks (DNNs) have demonstrated their superiority in practice. Arguably, the rapid development of DNNs has largely benefited from high-quality (open-sourced) datasets, based on which researchers and developers can easily evaluate and improve their learning methods. Since data collection is usually time-consuming or even expensive, how to protect their copyrights is of great significance and worth further exploration. In this paper, we revisit dataset ownership verification. We find that existing verification methods introduced new security risks in DNNs trained on the protected dataset, due to the targeted nature of poison-only backdoor watermarks. To alleviate this problem, in this work, we explore the untargeted backdoor watermarking scheme, where the abnormal model behaviors are not deterministic. Specifically, we introduce two dispersibilities and prove their correlation, based on which we design the untargeted backdoor watermark under both poisoned-label and clean-label settings. We also discuss how to use the proposed untargeted backdoor watermark for dataset ownership verification. Experiments on benchmark datasets verify the effectiveness of our methods and their resistance to existing backdoor defenses. Our codes are available at https://github.com/THUYimingLi/Untargeted_Backdoor_Watermark.
Chapter
Full-text available
In the past few years, triplet loss-based metric embeddings have become a de-facto standard for several important computer vision problems, most notably person re-identification. On the other hand, in the area of speech recognition the metric embeddings generated by the triplet loss are rarely used even for classification problems. We fill this gap, showing that a combination of two representation learning techniques, a triplet loss-based embedding and a variant of kNN for classification instead of cross-entropy loss, significantly (by 26% to 38%) improves the classification accuracy for convolutional networks on LibriSpeech-derived LibriWords datasets. To do so, we propose a novel phonetic similarity based triplet mining approach. We also improve the current best published SOTA for Google Speech Commands dataset V1 10+2-class classification by about 34%, achieving 98.55% accuracy, V2 10+2-class classification by about 20%, achieving 98.37% accuracy, and V2 35-class classification by over 50%, achieving 97.0% accuracy. (Code is available at https://github.com/roman-vygon/triplet_loss_kws).
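A minimal version of the "triplet embedding plus kNN" recipe is sketched below; the chapter's phonetic-similarity triplet mining is abstracted away, and the network, margin, k, and helper names are assumptions rather than the authors' configuration.

```python
# Hedged sketch: train an embedding with triplet loss, then classify with kNN
# on the learned embeddings instead of a cross-entropy output layer.
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

triplet_loss = nn.TripletMarginLoss(margin=1.0)

def train_step(embed_net, optimizer, anchor, positive, negative):
    # anchor/positive share a keyword class, negative comes from a different one;
    # how triplets are mined (e.g. by phonetic similarity) is where the chapter's
    # contribution lies and is not shown here.
    optimizer.zero_grad()
    loss = triplet_loss(embed_net(anchor), embed_net(positive), embed_net(negative))
    loss.backward()
    optimizer.step()
    return loss.item()

def fit_knn(embed_net, train_x, train_y, k: int = 5):
    with torch.no_grad():
        emb = embed_net(train_x).cpu().numpy()
    return KNeighborsClassifier(n_neighbors=k).fit(emb, train_y)
```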
Conference Paper
Full-text available
Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. The training of speaker verification requires a large amount of data; therefore, users usually need to adopt third-party data (e.g., data from the Internet or a third-party data company). This raises the question of whether adopting untrusted third-party data can pose a security threat. In this paper, we demonstrate that it is possible to inject a hidden backdoor for infecting speaker verification models by poisoning the training data. Specifically, we design a clustering-based attack scheme where poisoned samples from different clusters will contain different triggers (i.e., pre-defined utterances), based on our understanding of verification tasks. The infected models behave normally on benign samples, while attacker-specified unenrolled triggers will successfully pass the verification even if the attacker has no information about the enrolled speaker. We also demonstrate that existing backdoor attacks cannot be directly adopted in attacking speaker verification. Our approach not only provides a new perspective for designing novel attacks, but also serves as a strong baseline for improving the robustness of verification methods. The code for reproducing main results is available at https://github.com/zhaitongqing233/Backdoor-attack-against-speaker-verification.
Article
Full-text available
Recent research indicates that, by injecting backdoor instances into the training set, attackers can create a backdoor in a deep neural network (DNN) model. Existing studies all focus on attacking a single target triggered by a single backdoor signal. In this paper, for the first time, we propose multi-target and multi-trigger backdoor attacks: 1) the One-to-N attack, where the attacker can trigger multiple backdoor targets by controlling the different intensities of the same backdoor signal; 2) the N-to-One attack, which is triggered only when all N backdoor signals are present. Compared with existing One-to-One attacks, the proposed backdoor attacks are more flexible and more concealed. Besides, the proposed backdoor attacks can be applied under a weak attack model, where the attacker has no knowledge about the parameters and architectures of the DNN models. Experimental results show that these two attacks can achieve better or similar performance while injecting a much smaller or the same proportion of backdoor instances compared with existing works. The two attack methods achieve high attack success rates while the test accuracy of the DNN model hardly drops. The proposed two attacks are demonstrated to be effective and stealthy under two state-of-the-art defense techniques.
Article
Full-text available
The objective of this work is speaker recognition under noisy and unconstrained conditions. We make two key contributions. First, we introduce a very large-scale audio-visual dataset collected from open-source media using a fully automated pipeline. Most existing datasets for speaker identification contain samples obtained under quite constrained conditions, and usually require manual annotations, hence are limited in size. We propose a pipeline based on computer vision techniques to create the dataset from open-source media. Our pipeline involves obtaining videos from YouTube; performing active speaker verification using a two-stream synchronization Convolutional Neural Network (CNN); and confirming the identity of the speaker using CNN-based facial recognition. We use this pipeline to curate VoxCeleb, which contains over a million ‘real-world’ utterances from over 6,000 speakers. This is several times larger than any publicly available speaker recognition dataset. Second, we develop and compare different CNN architectures with various aggregation methods and training loss functions that can effectively recognise identities from voice under various conditions. The models trained on our dataset surpass the performance of previous works by a significant margin.
Article
Full-text available
Deep learning-based techniques have achieved state-of-the-art performance on a wide variety of recognition and classification tasks. However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained models that are then fine-tuned for a specific task. In this paper, we show that the outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or BadNet) that has state-of-the-art performance on the user’s training and validation samples but behaves badly on specific attacker-chosen inputs. We first explore the properties of BadNets in a toy example, by creating a backdoored handwritten digit classifier. Next, we demonstrate backdoors in a more realistic scenario by creating a U.S. street sign classifier that identifies stop signs as speed limits when a special sticker is added to the stop sign; we then show in addition that the backdoor in our U.S. street sign detector can persist even if the network is later retrained for another task, causing a drop in accuracy of 25% on average when the backdoor trigger is present. These results demonstrate that backdoors in neural networks are both powerful and, because the behavior of neural networks is difficult to explicate, stealthy. This paper provides motivation for further research into techniques for verifying and inspecting neural networks, just as we have developed tools for verifying and debugging software.
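BadNets-style poisoning reduces to stamping a small trigger onto a fraction of training images and relabelling them. The sketch below is a generic illustration with arbitrary patch size, value, and poisoning rate, not the paper's exact setup.

```python
# Minimal BadNets-style poisoning sketch (illustrative parameters and function names).
import numpy as np

def stamp_trigger(img: np.ndarray, size: int = 3, value: float = 1.0) -> np.ndarray:
    """img: float array of shape (H, W, C) in [0, 1]."""
    poisoned = img.copy()
    poisoned[-size:, -size:, :] = value       # bright square in the bottom-right corner
    return poisoned

def poison(images, labels, target_label: int, rate: float = 0.05, seed: int = 0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), int(rate * len(images)), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = target_label              # poison-label attack: relabel to the target
    return images, labels
```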
Article
Full-text available
Deep learning models have achieved high performance on many tasks, and thus have been applied to many security-critical scenarios. For example, deep learning-based face recognition systems have been used to authenticate users to access many security-sensitive applications like payment apps. Such usages of deep learning systems provide the adversaries with sufficient incentives to perform attacks against these systems for their adversarial purposes. In this work, we consider a new type of attacks, called backdoor attacks, where the attacker's goal is to create a backdoor into a learning-based authentication system, so that he can easily circumvent the system by leveraging the backdoor. Specifically, the adversary aims at creating backdoor instances, so that the victim learning system will be misled to classify the backdoor instances as a target label specified by the adversary. In particular, we study backdoor poisoning attacks, which achieve backdoor attacks using poisoning strategies. Different from all existing work, our studied poisoning strategies can apply under a very weak threat model: (1) the adversary has no knowledge of the model and the training set used by the victim system; (2) the attacker is allowed to inject only a small amount of poisoning samples; (3) the backdoor key is hard to notice even by human beings to achieve stealthiness. We conduct evaluation to demonstrate that a backdoor adversary can inject only around 50 poisoning samples, while achieving an attack success rate of above 90%. We are also the first work to show that a data poisoning attack can create physically implementable backdoors without touching the training process. Our work demonstrates that backdoor poisoning attacks pose real threats to a learning system, and thus highlights the importance of further investigation and proposing defense strategies against them.
Article
Full-text available
Significance Deep neural networks are currently the most successful machine-learning technique for solving a variety of tasks, including language translation, image classification, and image generation. One weakness of such models is that, unlike humans, they are unable to learn multiple tasks sequentially. In this work we propose a practical solution to train such models sequentially by protecting the weights important for previous tasks. This approach, inspired by synaptic consolidation in neuroscience, enables state of the art results on multiple reinforcement learning problems experienced sequentially.
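The "protecting the weights important for previous tasks" idea is usually expressed as a quadratic penalty on parameter drift weighted by the Fisher information. The following is a standard statement of the elastic weight consolidation objective, reconstructed from common formulations rather than quoted from the paper:

```latex
% Loss when learning task B after task A: the Fisher information F_i measures how
% important parameter \theta_i was for task A, \theta^{*}_{A,i} is its value after
% training on A, and \lambda sets how strongly old knowledge is protected.
\mathcal{L}(\theta) = \mathcal{L}_{B}(\theta)
  + \sum_{i} \frac{\lambda}{2}\, F_{i}\,\left(\theta_{i} - \theta^{*}_{A,i}\right)^{2}
```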
Article
Full-text available
Pitch is one of the primary auditory sensations and plays a defining role in music, speech, and auditory scene analysis. Although the main physical correlate of pitch is acoustic periodicity, or repetition rate, there are many interactions that complicate the relationship between the physical stimulus and the perception of pitch. In particular, the effects of other acoustic parameters on pitch judgments, and the complex interactions between perceptual organization and pitch, have uncovered interesting perceptual phenomena that should help to reveal the underlying neural mechanisms.
Chapter
Training deep neural networks (DNNs) usually requires massive training data and computational resources. Users who cannot afford this may prefer to outsource training to a third party or resort to publicly available pre-trained models. Unfortunately, doing so facilitates a new training-time attack (i.e., backdoor attack) against DNNs. This attack aims to induce misclassification of input samples containing adversary-specified trigger patterns. In this paper, we first conduct a layer-wise feature analysis of poisoned and benign samples from the target class. We find that the feature difference between benign and poisoned samples tends to be maximum at a critical layer, which is not always the one typically used in existing defenses, namely the layer before fully-connected layers. We also demonstrate how to locate this critical layer based on the behaviors of benign samples. We then propose a simple yet effective method to filter poisoned samples by analyzing the feature differences between suspicious and benign samples at the critical layer. We conduct extensive experiments on two benchmark datasets, which confirm the effectiveness of our defense. Keywords: Backdoor Detection, Backdoor Defense, Backdoor Learning, AI Security, Deep Learning
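The filtering step can be sketched as measuring feature distance to the benign-class centroid at the identified critical layer; the use of cosine distance, a fixed threshold, and the function name below are simplifying assumptions, not the chapter's exact procedure.

```python
# Hedged sketch of filtering suspicious samples by feature distance at a chosen
# "critical" layer of the network.
import torch
import torch.nn.functional as F

@torch.no_grad()
def filter_by_critical_layer(feature_extractor, benign_x, suspicious_x, threshold: float):
    """feature_extractor: maps a batch of inputs to features at the critical layer."""
    benign_feats = feature_extractor(benign_x)
    centroid = benign_feats.mean(dim=0, keepdim=True)          # benign feature centroid
    susp_feats = feature_extractor(suspicious_x)
    dist = 1.0 - F.cosine_similarity(susp_feats, centroid)     # distance to the centroid
    keep_mask = dist <= threshold                               # far-away samples are dropped
    return suspicious_x[keep_mask], keep_mask
```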
Article
The pervasive view of the mobile crowd bridges various real-world scenes and people's perceptions with the gathering of distributed crowdsensing photos. To elaborate informative visuals for viewers, existing techniques introduce photo selection as an essential step in crowdsensing. Yet, the aesthetic preference of viewers, at the very heart of their experiences under various crowdsensing contexts (e.g., travel planning), is seldom considered and hardly guaranteed. We propose CrowdPicker, a novel photo selection framework with adaptive aesthetic awareness for crowdsensing. With the observations on aesthetic uncertainty and bias in different crowdsensing contexts, we exploit a joint effort of mobile crowdsourcing and domain adaptation to actively learn contextual knowledge for dynamically tailoring the aesthetic predictor. Concretely, an aesthetic utility measure is invented based on the probabilistic balance formalization to quantify the benefit of photos in improving the adaptation performance. We prove the NP-hardness of sampling the best-utility photos for crowdsourcing annotation and present a (1-1/e) approximate solution. Furthermore, a two-stage distillation-based adaptation architecture is designed based on fusing contextual and common aesthetic preferences. Extensive experiments on three datasets and four raw models demonstrate the performance superiority of CrowdPicker over four photo selection baselines and four typical sampling strategies. Cross-dataset evaluation illustrates the impacts of aesthetic bias on selection.
Chapter
With the rapid development of deep learning, its vulnerability has gradually emerged in recent years. This work focuses on backdoor attacks on speech recognition systems. We adopt sounds that are ordinary in nature or in our daily life as triggers for natural backdoor attacks. We conduct experiments on two datasets and three models to validate the performance of natural backdoor attacks and explore the effects of the poisoning rate, trigger duration, and blend ratio on their performance. Our results show that natural backdoor attacks achieve a high attack success rate without compromising model performance on benign samples, even with short or low-amplitude triggers. Only 5% of poisoned samples are required to achieve a near-100% attack success rate. In addition, the backdoor will be automatically activated by the corresponding sound in nature, which is not easy to detect and can bring more severe harm. Keywords: Deep learning, Backdoor attacks, Speech recognition, Natural trigger
Article
In this paper, we propose dictionary attacks against speaker verification, a novel attack vector that aims to match a large fraction of the speaker population by chance. We introduce a generic formulation of the attack that can be used with various speech representations and threat models. The attacker uses adversarial optimization to maximize the raw similarity of speaker embeddings between a seed speech sample and a proxy population. The resulting master voice successfully matches a non-trivial fraction of people in an unknown population. Adversarial waveforms obtained with our approach can match on average 69% of females and 38% of males enrolled in the target system at a strict decision threshold calibrated to yield a false alarm rate of 1%. By using the attack with a black-box voice cloning system, we obtain master voices that are effective in the most challenging conditions and transferable between speaker encoders. We also show that, combined with multiple attempts, this attack raises even more serious issues regarding the security of these systems.
Chapter
Deep learning-based models have achieved state-of-the-art performance in a wide variety of classification and recognition tasks. Although such models have been demonstrated to suffer from backdoor attacks in multiple domains, little is known about whether speaker recognition systems are vulnerable to such an attack, especially in the physical world. In this paper, we launch such a backdoor attack on speaker recognition systems (SRS) in both digital and physical space and conduct more comprehensive experiments on two common tasks of a speaker recognition system. Taking the poison position, intensity, length, frequency characteristics, and poison rate of the backdoor patterns into consideration, we design four backdoor triggers and use them to poison the training dataset. We report the digital and physical attack success rates (ASR) and show that all four backdoor patterns can achieve over 89% ASR in digital attacks and at least 70% in physical attacks. We also show that the maliciously trained model is able to provide comparable performance on clean data. Keywords: Neural network, Backdoor attack, Speaker recognition
Article
With the widespread use of automated speech recognition (ASR) systems in modern consumer devices, attacks against ASR systems have become an attractive topic in recent years. Although related white-box attack methods have achieved remarkable success in fooling neural networks, they rely heavily on obtaining full access to the details of the target models. Due to the lack of prior knowledge of the victim model and the inefficiency in utilizing query results, most of the existing black-box attack methods for ASR systems are query-intensive. In this paper, we propose a new black-box attack called the Monte Carlo gradient sign attack (MGSA) to generate adversarial audio samples with substantially fewer queries. It updates an original sample based on the elements obtained by a Monte Carlo tree search. We attribute its high query efficiency to the effective utilization of the dominant gradient phenomenon, which refers to the fact that only a few elements of each original sample have a significant effect on the output of ASR systems. Extensive experiments are performed to evaluate the efficiency of MGSA and the stealthiness of the generated adversarial examples on the DeepSpeech system. The experimental results show that MGSA achieves 98% and 99% attack success rates on the LibriSpeech and Mozilla Common Voice datasets, respectively. Compared with the state-of-the-art methods, the average number of queries is reduced by 27% and the signal-to-noise ratio is increased by 31%.
Conference Paper
Speech recognition systems, trained and updated based on large-scale audio data, are vulnerable to backdoor attacks that inject dedicated triggers in system training. The used triggers are generally human-inaudible audio, such as ultrasonic waves. However, we note that such a design is not feasible, as it can be easily filtered out via pre-processing. In this work, we propose the first audible backdoor attack paradigm for speech recognition, characterized by passive triggering and opportunistic invocation. Traditional device-synthesized triggers are replaced with ambient noise in daily scenarios. To adapt triggers to the application dynamics of speech interaction, we exploit the observed knowledge inherited from the context to a trained model and accommodate the injection and poisoning with certainty-based trigger selection, performance-oblivious sample binding, and trigger late-augmentation. Experiments on two datasets under various environments evaluate the proposal’s effectiveness in maintaining a high benign rate and facilitating an outstanding attack success rate (99.27%, ∼4% higher than BadNets), robustness (bounded infectious triggers), and feasibility in real-world scenarios. It requires less than 1% of the data to be poisoned and is demonstrated to be able to resist typical speech enhancement techniques and general countermeasures (e.g., dedicated fine-tuning). The code and data will be made available at https://github.com/lqsunshine/DABA.
Article
Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon reviewed works. A curated list of backdoor-related resources is also available at https://github.com/THUYimingLi/backdoor-learning-resources .
Article
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.
Book
Hidden Markov Models (HMMs) provide a simple and effective framework for modelling time-varying spectral vector sequences. As a consequence, almost all present day large vocabulary continuous speech recognition (LVCSR) systems are based on HMMs. Whereas the basic principles underlying HMM-based LVCSR are rather straightforward, the approximations and simplifying assumptions involved in a direct implementation of these principles would result in a system which has poor accuracy and unacceptable sensitivity to changes in operating environment. Thus, the practical application of HMMs in modern systems involves considerable sophistication. The Application of Hidden Markov Models in Speech Recognition presents the core architecture of a HMM-based LVCSR system and proceeds to describe the various refinements which are needed to achieve state-of-the-art performance. These refinements include feature projection, improved covariance modelling, discriminative parameter estimation, adaptation and normalisation, noise compensation and multi-pass system combination. It concludes with a case study of LVCSR for Broadcast News and Conversation transcription in order to illustrate the techniques described. The Application of Hidden Markov Models in Speech Recognition is an invaluable resource for anybody with an interest in speech recognition technology.
Chapter
Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced training introduces the risk that a malicious trainer will return a backdoored DNN that behaves normally on most inputs but causes targeted misclassifications or degrades the accuracy of the network when a trigger known only to the attacker is present. In this paper, we provide the first effective defenses against backdoor attacks on DNNs. We implement three backdoor attacks from prior work and use them to investigate two promising defenses, pruning and fine-tuning. We show that neither, by itself, is sufficient to defend against sophisticated attackers. We then evaluate fine-pruning, a combination of pruning and fine-tuning, and show that it successfully weakens or even eliminates the backdoors, i.e., in some cases reducing the attack success rate to 0% with only a 0.4% drop in accuracy for clean (non-triggering) inputs. Our work provides the first step toward defenses against backdoor attacks in deep neural networks.
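The pruning half of fine-pruning can be sketched as zeroing the channels of a late convolutional layer that are least activated by clean data, then briefly fine-tuning; the layer choice, pruning ratio, one-shot (rather than iterative) pruning, and function name are simplifications of the original procedure.

```python
# Hedged sketch of the pruning step in fine-pruning.
import torch

@torch.no_grad()
def prune_dormant_channels(conv_layer, activations: torch.Tensor, prune_ratio: float = 0.2):
    """activations: (N, C, H, W) outputs of conv_layer on clean inputs."""
    mean_act = activations.mean(dim=(0, 2, 3))                  # average activation per channel
    num_prune = int(prune_ratio * mean_act.numel())
    prune_idx = torch.argsort(mean_act)[:num_prune]             # most dormant channels
    conv_layer.weight[prune_idx] = 0.0                          # zero their filters
    if conv_layer.bias is not None:
        conv_layer.bias[prune_idx] = 0.0
    return prune_idx
# After pruning, a short fine-tuning pass on clean data restores benign accuracy
# while further weakening the backdoor.
```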
Article
Neural networks are susceptible to adversarial examples: small, carefully-crafted perturbations can cause networks to misclassify inputs in arbitrarily chosen ways. However, some studies have shown that adversarial examples crafted following the usual methods are not tolerant to small transformations: for example, zooming in on an adversarial image can cause it to be classified correctly again. This raises the question of whether adversarial examples are a concern in practice, because many real-world systems capture images from multiple scales and perspectives. This paper shows that adversarial examples can be made robust to distributions of transformations. Our approach produces single images that are simultaneously adversarial under all transformations in a chosen distribution, showing that we cannot rely on transformations such as rescaling, translation, and rotation to protect against adversarial examples.
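The expectation-over-transformation recipe can be sketched as averaging the attack loss over randomly sampled transformations before each gradient step; the shift-only transformation family, hyperparameters, and function names below are illustrative stand-ins for the paper's transformation distributions.

```python
# Hedged sketch of Expectation Over Transformation (EOT): optimise a perturbation whose
# targeted adversarial effect survives random transformations of the input.
import torch
import torch.nn.functional as F

def random_shift(x: torch.Tensor, max_shift: int = 4) -> torch.Tensor:
    dx, dy = torch.randint(-max_shift, max_shift + 1, (2,))
    return torch.roll(x, shifts=(int(dx), int(dy)), dims=(-2, -1))

def eot_attack(model, x, target, steps=100, lr=0.01, eps=0.05, n_samples=8):
    """x: (1, C, H, W) in [0, 1]; target: tensor([t]) with the adversarial target class."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        loss = 0.0
        for _ in range(n_samples):                       # Monte Carlo over transformations
            logits = model(random_shift(torch.clamp(x + delta, 0, 1)))
            loss = loss + F.cross_entropy(logits, target)  # minimise -> push toward target
        opt.zero_grad()
        (loss / n_samples).backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                      # keep the perturbation small
    return torch.clamp(x + delta.detach(), 0, 1)
```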
Article
Voice transformation (VT) aims to change one or more aspects of a speech signal while preserving linguistic information. A subset of VT, Voice conversion (VC) specifically aims to change a source speaker’s speech in such a way that the generated output is perceived as a sentence uttered by a target speaker. Despite many years of research, VC systems still exhibit deficiencies in accurately mimicking a target speaker spectrally and prosodically, and simultaneously maintaining high speech quality. In this work we provide an overview of real-world applications, extensively study existing systems proposed in the literature, and discuss remaining challenges.
Article
Synopsis—The most obvious method for determining the distortion of telegraph signals is to calculate the transients of the telegraph system. This method has been treated by various writers, and solutions are available for telegraph lines with simple terminal conditions. It is well known that the extension of the same methods to more complicated terminal conditions, which represent the usual terminal apparatus, leads to great difficulties. The present paper attacks the same problem from the alternative standpoint of the steady-state characteristics of the system. This method has the advantage over the method of transients that the complication of the circuit which results from the use of terminal apparatus does not complicate the calculations materially. This method of treatment necessitates expressing the criteria of distortionless transmission in terms of the steady-state characteristics. Accordingly, a considerable portion of the paper describes and illustrates a method for making this translation. A discussion is given of the minimum frequency range required for transmission at a given speed of signaling. In the case of carrier telegraphy, this discussion includes a comparison of single-sideband and double-sideband transmission. A number of incidental topics is also discussed.
Article
Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This article provides an overview of this progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
Article
Two experiments were performed to evaluate the perceptual relationships between 16 music instrument tones. The stimuli were computer synthesized based upon an analysis of actual instrument tones, and they were perceptually equalized for loudness, pitch, and duration. Experiment 1 evaluated the tones with respect to perceptual similarities, and the results were treated with multidimensional scaling techniques and hierarchic clustering analysis. A three-dimensional scaling solution, well matching the clustering analysis, was found to be interpretable in terms of the spectral energy distribution; the presence of synchronicity in the transients of the higher harmonics, along with the closely related amount of spectral fluctuation within the tone through time; and the presence of low-amplitude, high-frequency energy in the initial attack segment; an alternate interpretation of the latter two dimensions viewed the cylindrical distribution of clusters of stimulus points about the spectral energy distribution, grouping on the basis of musical instrument family (with two exceptions). Experiment 2 was a learning task of a set of labels for the 16 tones. Confusions were examined in light of the similarity structure for the tones from experiment 1, and one of the family-grouping exceptions was found to be reflected in the difficulty of learning the labels.