January 2024
Signal Processing Letters, IEEE
Active defense is an important approach to countering speech deepfakes that threaten individuals' privacy, property, and reputation. However, existing work in this field suffers from long generation times and mediocre defense effectiveness. This letter proposes a Generative Adversarial Network (GAN) framework that uses adversarial attacks as a defense against malicious voice conversion. The proposed method uses a generator to produce adversarial perturbations and adds them to the mel-spectrogram of the target audio to craft an adversarial example. In addition, to enhance defense effectiveness, a spectrogram waveform conversion simulation module (SWCSM) is designed to simulate the process of reconstructing the waveform from the adversarial mel-spectrogram and re-extracting the mel-spectrogram from the reconstructed waveform. Experiments on four state-of-the-art voice conversion models show that our method achieves the best overall performance among five compared methods in both white-box and black-box scenarios, in terms of defense effectiveness and generation time. The source code is available on GitHub at https://github.com/imagecbj/Initiative-Defense-against-Voice-Conversion-through-Generative-Adversarial-Network.
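
The perturbation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_generator` is a hypothetical stand-in for a trained GAN generator, the `epsilon` bound on the perturbation is an assumption not stated in the abstract, and the SWCSM round trip is omitted.

```python
import random


def craft_adversarial_mel(mel, generator, epsilon=0.1):
    """Add a bounded perturbation to a mel-spectrogram (list of rows).

    `generator` is any callable that maps a mel-spectrogram to a perturbation
    of the same shape. Clipping to [-epsilon, epsilon] is an assumption made
    for this sketch, not a detail taken from the letter.
    """
    delta = generator(mel)
    return [
        [m + max(-epsilon, min(epsilon, d)) for m, d in zip(mel_row, delta_row)]
        for mel_row, delta_row in zip(mel, delta)
    ]


def toy_generator(mel, seed=0):
    """Hypothetical stand-in for a trained generator: seeded Gaussian noise."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.5) for _ in row] for row in mel]


# Toy input: 80 mel bins by 100 frames, all zeros.
mel = [[0.0] * 100 for _ in range(80)]
adv_mel = craft_adversarial_mel(mel, toy_generator, epsilon=0.1)
```

In a full pipeline, the adversarial mel-spectrogram would then pass through a waveform reconstruction and mel re-extraction stage, which is the role the abstract assigns to the SWCSM.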