Yiming Li
Tsinghua University · Tsinghua Shenzhen International Graduate School (SIGS)

Ph.D. Student

About

43 Publications
9,250 Reads
125 Citations
Introduction
My research interests are broadly in the security of machine learning, including adversarial and robust learning, backdoor learning, and data privacy. For more information, please visit my homepage (http://liyiming.tech).

Publications (43)
Article
A backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs) so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat can arise when the training process is not fully controlled, such as...
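To make the threat model concrete: in the classic BadNets-style poisoning setting (an illustration of the general paradigm described above, not this paper's specific attack), the attacker stamps a small patch trigger onto a fraction of the training images and relabels them with a target class. A minimal numpy sketch; the patch size, poisoning rate, and target label are illustrative assumptions:

```python
import numpy as np

def poison_dataset(images, labels, target_label=0, poison_rate=0.05,
                   patch_size=3, trigger_value=1.0, seed=0):
    """Stamp a small solid patch into the corner of a random subset of
    training images and relabel them (BadNets-style, illustrative only).
    `images` is assumed to be (N, H, W, C) floats in [0, 1]."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(len(images) * poison_rate),
                     replace=False)
    images[idx, -patch_size:, -patch_size:, :] = trigger_value  # the trigger
    labels[idx] = target_label  # attacker-specified target class
    return images, labels
```

A model trained on the poisoned set behaves normally on clean inputs, but stamping the same patch onto any test image steers its prediction toward `target_label`.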
Preprint
Full-text available
Recent studies have revealed that deep neural networks (DNNs) are vulnerable to backdoor attacks, where attackers embed hidden backdoors in the DNN model by poisoning a few training samples. The attacked model behaves normally on benign samples, whereas its prediction will be maliciously changed when the backdoor is activated. We reveal that poison...
Preprint
Full-text available
Visual object tracking (VOT) has been widely adopted in mission-critical applications, such as autonomous driving and intelligent surveillance systems. In current practice, third-party resources such as datasets, backbone networks, and training platforms are frequently used to train high-performance VOT models. Whilst these resources bring certain...
Conference Paper
Full-text available
Recent studies have revealed that deep neural networks (DNNs) are vulnerable to backdoor attacks, where attackers embed hidden backdoors in the DNN model by poisoning a few training samples. The attacked model behaves normally on benign samples, whereas its prediction will be maliciously changed when the backdoor is activated. We reveal that poison...
Preprint
Full-text available
Obtaining a well-trained model involves expensive data collection and training procedures; the model is therefore valuable intellectual property. Recent studies revealed that adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures. Currently, there are some defe...
Article
Federated learning enables data owners to train a global model with shared gradients while keeping private training data local. However, recent research demonstrated that an adversary may infer clients' private training data from the exchanged local gradients, e.g., via deep leakage from gradients (DLG). Many existing privacy-preserving app...
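For context, DLG recovers a client's data by optimizing a dummy input/label pair until its gradients match the gradients the client shared. A minimal PyTorch sketch of that gradient-matching loop; the shapes, loss, and optimizer settings are assumptions, not the paper's exact setup:

```python
import torch
import torch.nn.functional as F

def dlg_reconstruct(model, true_grads, x_shape, n_classes, steps=300):
    """Deep Leakage from Gradients (Zhu et al., 2019), sketched:
    optimize dummy (x, y) so their gradients match the observed ones."""
    dummy_x = torch.randn(x_shape, requires_grad=True)
    dummy_y = torch.randn(x_shape[0], n_classes, requires_grad=True)
    opt = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        opt.zero_grad()
        pred = model(dummy_x)
        soft_y = F.softmax(dummy_y, dim=-1)
        loss = -(soft_y * F.log_softmax(pred, dim=-1)).sum()
        grads = torch.autograd.grad(loss, model.parameters(),
                                    create_graph=True)
        # Gradient-matching objective: squared distance to observed grads.
        diff = sum(((g - tg) ** 2).sum() for g, tg in zip(grads, true_grads))
        diff.backward()
        return diff

    for _ in range(steps):
        opt.step(closure)
    return dummy_x.detach(), dummy_y.detach()
```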
Preprint
Adversarial training (AT) has been demonstrated to be one of the most promising defense methods against various adversarial attacks. To our knowledge, existing AT-based methods usually train with the locally most adversarially perturbed points and treat all perturbed points equally, which may lead to considerably weaker adversarial robust generaliza...
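For reference, the baseline this abstract alludes to is standard PGD-based adversarial training (Madry et al.), which trains on the locally most adversarial points and weighs them uniformly. The sketch below shows that baseline, not this paper's variant, and the hyperparameters are typical values, not the authors':

```python
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Inner maximization: search the l_inf ball of radius eps for a
    perturbation that (locally) maximizes the loss."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()  # assumes inputs in [0, 1]

def at_step(model, optimizer, x, y):
    """One AT step: train on the perturbed points. Note the uniform
    average inside cross_entropy -- all perturbed points are treated
    equally, which is exactly what a reweighting scheme would change."""
    x_adv = pgd_perturb(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```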
Article
Despite the impressive performance of random forests (RF), their theoretical properties have not been thoroughly understood. In this paper, we propose a novel RF framework, dubbed multinomial random forest (MRF), to analyze its consistency and privacy preservation. Instead of a deterministic greedy split rule or simple randomness, the MRF adopts t...
Conference Paper
Full-text available
Well-trained models are valuable intellectual properties for their owners. Recent studies revealed that adversaries can 'steal' deployed models even when they have no training samples and can only query the model. Currently, there are some defense methods to alleviate this threat, mostly by increasing the cost of model stealing. In this paper,...
Conference Paper
Full-text available
A backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of infected models will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger. Currently, most existing backdoor attacks adopted the setting of a \emph{static} trigger, $i.e.$, triggers across the train...
Preprint
Full-text available
A backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of infected models will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger. Currently, most existing backdoor attacks adopted the setting of a static trigger, $i.e.$, triggers across the training and...
Conference Paper
Full-text available
Deep neural networks (DNNs) are vulnerable to the \emph{backdoor attack}, which intends to embed hidden backdoors in DNNs by poisoning training data. The attacked model behaves normally on benign samples, whereas its prediction will be changed to a particular target label if hidden backdoors are activated. So far, backdoor research has mostly been...
Preprint
Full-text available
Deep neural networks (DNNs) are vulnerable to the backdoor attack, which intends to embed hidden backdoors in DNNs by poisoning training data. The attacked model behaves normally on benign samples, whereas its prediction will be changed to a particular target label if hidden backdoors are activated. So far, backdoor research has mostly been conduc...
Conference Paper
Full-text available
To explore the vulnerability of deep neural networks (DNNs), many attack paradigms have been well studied, such as the poisoning-based backdoor attack in the training stage and the adversarial attack in the inference stage. In this paper, we study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purpos...
Preprint
Full-text available
To explore the vulnerability of deep neural networks (DNNs), many attack paradigms have been well studied, such as the poisoning-based backdoor attack in the training stage and the adversarial attack in the inference stage. In this paper, we study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purpose...
Conference Paper
Full-text available
The k-means algorithm is one of the most classical clustering methods and has been widely and successfully used in signal processing. However, due to the thin-tailed property of the Gaussian distribution, the k-means algorithm suffers from relatively poor performance on datasets containing heavy-tailed data or outliers. Besides, the standard k-means alg...
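For reference, the fragility comes from the squared Euclidean objective of standard k-means (equivalently, a Gaussian likelihood): cluster centers are coordinate-wise means, and a single heavy-tailed outlier can drag a mean arbitrarily far. A minimal numpy sketch of the standard baseline being criticized, not the method proposed here:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. The squared-distance objective below is
    what makes the method fragile under heavy-tailed noise: means, unlike
    medians, are pulled hard by outliers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)  # nearest center per point
        centers = np.stack([X[assign == j].mean(axis=0)
                            if np.any(assign == j) else centers[j]
                            for j in range(k)])
    return centers, assign
```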
Conference Paper
Full-text available
Privacy protection is an important research area, and it is especially critical in the big data era. To a large extent, the privacy of visual classification data lies in the mapping between an image and its corresponding label, since this relation provides a great amount of information and can be used in other scenarios. In this paper, we pro...
Conference Paper
Full-text available
Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. Training a speaker verification system requires a large amount of data; therefore, users usually need to adopt third-party data (e.g., data from the Internet or a third-party data company). This raises the question of whether adopting un...
Preprint
Full-text available
Backdoor attacks pose a new security threat to the training process of deep neural networks (DNNs). Attackers intend to inject a hidden backdoor into DNNs, such that the attacked model performs well on benign samples, whereas its prediction will be maliciously changed if the hidden backdoor is activated by an attacker-defined trigger. Exist...
Conference Paper
Full-text available
The rapid development of deep learning has benefited from the release of high-quality open-sourced datasets (e.g., ImageNet), which allow researchers to easily verify the effectiveness of their algorithms. Almost all existing open-sourced datasets stipulate that they may only be adopted for academic or educational purposes rather than commercia...
Conference Paper
The deep hashing based retrieval method is widely adopted in large-scale image and video retrieval. However, there is little investigation on its security. In this paper, we propose a novel method, dubbed deep hashing targeted attack (DHTA), to study the targeted attack on such retrieval. Specifically, we first formulate the targeted attack as a po...
Conference Paper
Full-text available
Motivated by the mutual empowerment of two rapidly developing technologies, artificial intelligence and edge computing, we propose a tailored Edge Intelligent Video Surveillance (EIVS) system. It is a scalable edge-computing architecture that uses multitask deep learning for relevant computer vision tasks. Due to the potential application of different...
Article
Full-text available
Deep Neural Networks (DNNs) usually work in an end-to-end manner. This makes trained DNNs easy to use, but their decision process remains opaque for every test case. Unfortunately, the interpretability of decisions is crucial in some scenarios, such as medical or financial data mining and decision-making. In this paper, we propose a Tree-Ne...
Preprint
Full-text available
Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. Training a speaker verification system requires a large amount of data; therefore, users usually need to adopt third-party data ($e.g.$, data from the Internet or a third-party data company). This raises the question of whether adopting...
Preprint
Full-text available
The rapid development of deep learning has benefited from the release of high-quality open-sourced datasets ($e.g.$, ImageNet), which allow researchers to easily verify the effectiveness of their algorithms. Almost all existing open-sourced datasets stipulate that they may only be adopted for academic or educational purposes rather than commerc...
Preprint
Full-text available
Interpretability and effectiveness are two essential and indispensable requirements for adopting machine learning methods in practice. In this paper, we propose a knowledge distillation based extension of decision trees, dubbed rectified decision trees (ReDT), to explore the possibility of fulfilling these requirements simultaneously. Specifically, we...
Conference Paper
Full-text available
Adversarial defense is a popular and important research area. Due to its intrinsic mechanism, one of the most straightforward and effective ways of defending against attacks is to analyze the properties of the loss surface in the input space. In this paper, we define the local flatness of the loss surface as the maximum value of the chosen norm of the gradient r...
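Although the definition is cut off above, the visible text suggests a quantity of roughly the following form (a reconstruction, not necessarily the paper's exact statement): the local flatness of the loss surface at an input $x$ is $F_{\epsilon}(x) = \max_{x': \|x'-x\| \leq \epsilon} \|\nabla_{x'} L(f(x'), y)\|$, so a small $F_{\epsilon}(x)$ bounds how much any small input perturbation can increase the loss.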
Preprint
Full-text available
The deep hashing based retrieval method is widely adopted in large-scale image and video retrieval. However, there is little investigation on its security. In this paper, we propose a novel method, dubbed deep hashing targeted attack (DHTA), to study the targeted attack on such retrieval. Specifically, we first formulate the targeted attack as a po...
Preprint
Full-text available
In this work, we study the problem of backdoor attacks, which add a specific trigger ($i.e.$, a local patch) onto some training images to enforce that the testing images with the same trigger are incorrectly predicted while the natural testing examples are correctly predicted by the trained model. Many existing works adopted the setting that the tr...
Preprint
Full-text available
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs). One of the most effective adversarial defense methods is adversarial training (AT), which minimizes the adversarial risk $R_{adv}$, encouraging both the benign example $x$ and its adversarially perturbed neighborhoods within the $\ell_{p}$-ball to be...
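For reference, the standard form of the adversarial risk named here (the truncation hides the paper's exact statement) is $R_{adv}(f) = \mathbb{E}_{(x,y) \sim \mathcal{D}} \left[ \max_{\|x'-x\|_{p} \leq \epsilon} L(f(x'), y) \right]$: minimizing the worst-case loss over the $\ell_{p}$-ball encourages $x$ and all of its perturbed neighbors to be classified correctly.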
Conference Paper
Full-text available
Current research has managed to train multiple Deep Neural Networks (DNNs) in affordable computing time. Then, finding a practical method to aggregate these DNNs becomes a fundamental problem. To address this, we present an unbiased combination scheme to guide the aggregation of the diverse DNNs models, by leveraging the Negative Correlation Learni...
Preprint
Full-text available
Data privacy protection is an important research area, and it is especially critical in the big data era. To a large extent, the privacy of visual classification tasks lies in the one-to-one mapping between an image and its corresponding label, since this relation provides a great amount of information and can be used in other scenarios. In this...
Preprint
Full-text available
Adversarial defense is a popular and important research area. Due to its intrinsic mechanism, one of the most straightforward and effective ways is to analyze the properties of the loss surface in the input space. In this paper, we define the local flatness of the loss surface as the maximum value of the chosen norm of the gradient with respect to the input...
Preprint
Full-text available
The $k$-means algorithm is one of the most classical clustering methods and has been widely and successfully used in signal processing. However, due to the thin-tailed property of the Gaussian distribution, the $k$-means algorithm suffers from relatively poor performance on datasets containing heavy-tailed data or outliers. Besides, the standard $k$-mean...
Preprint
How to obtain a model with both good interpretability and good performance has always been an important research topic. In this paper, we propose rectified decision trees (ReDT), a knowledge distillation based rectification of decision trees with high interpretability, small model size, and empirical soundness. Specifically, we extend the impurity calculation a...
Preprint
Full-text available
Despite the impressive performance of standard random forests (RF), their theoretical properties have not been thoroughly understood. In this paper, we propose a novel RF framework, dubbed multinomial random forest (MRF), to discuss its consistency and privacy preservation. Instead of a deterministic greedy split rule, the MRF adopts two impurity-based...
Article
Full-text available
For any Bedford-McMullen self-affine carpet, the geodesic path on the carpet between points \((x_{1},y_{1})\) and \((x_{2},y_{2})\) has length greater than or equal to \(|x_{1}-x_{2}|+|y_{1}-y_{2}|.\) This property fails for self-similar carpets.

Projects (3)
Project
This project focuses on data privacy, which relates to security issues concerning the training data of machine learning models.
Project
This project focuses on the robustness of deep learning and machine learning methods, which relates to security issues in the inference process.
Project
This project focuses on backdoor learning, including backdoor attacks and backdoor defenses, which relates to security issues in the training process.