Shouhuai Xu

University of Texas at San Antonio | UTSA · Department of Computer Science

About

234
Publications
26,933
Reads
6,163
Citations

Publications (234)
Preprint
Full-text available
Social engineering attacks are a major cyber threat because they often serve as a first step for an attacker to break into an otherwise well-defended network, steal victims' credentials, and cause financial losses. The problem has received a due amount of attention, with many publications proposing defenses against these attacks. Despite this, the situation ha...
Article
We initiate the study of the problem of automated and robust Cyber Security Management (CSM). We exemplify the problem by investigating how CSM should respond to the discovery of cyber intelligence that identifies new attackers, victims, or defense capabilities. Given the complexity of CSM, we divide it into three classes, referred to as Network-ce...
Chapter
Social engineering attacks are phenomena that are equally applicable to both the physical world and cyberspace. These attacks in the physical world have been studied for a much longer time than their counterpart in cyberspace. This motivates us to investigate how social engineering attacks in the physical world and cyberspace relate to each other,...
Preprint
The deployment of monoculture software stacks can have devastating consequences because a single attack can compromise all of the vulnerable computers in cyberspace. This one-vulnerability-affects-all phenomenon will continue until after software stacks are diversified, which is well recognized by the research community. However, existing studies m...
Preprint
The deployment of monoculture software stacks can cause devastating damage, even from a single exploit against a single vulnerability. Inspired by the resilience benefit of biological diversity, the concept of software diversity has been proposed in the security domain. Although it is intuitive that software diversity may enhance security, its effec...
Chapter
Cybersecurity Metrics and Quantification is a fundamental but notoriously hard problem and is undoubtedly one of the pillars underlying the emerging Science of Cybersecurity. In this paper, we present a novel approach to addressing this problem by unifying Security, Agility, Resilience and Risk (SARR) metrics into a single framework. The SARR appr...
Chapter
In the context of malware detection, ground-truth labels of files are often difficult or costly to obtain; as a consequence, malware detector effectiveness metrics (e.g., false-positive and false-negative rates) are hard to measure. The unavailability of ground-truth labels also hinders the training of machine learning based malware detectors. These...
Preprint
The deep learning approach to detecting malicious software (malware) is promising but has yet to tackle the problem of dataset shift, namely that the joint distribution of examples and their labels associated with the test set is different from that of the training set. This problem causes the degradation of deep learning models without users' noti...
Article
The bribery problem in elections has received considerable attention in the literature, upon which various algorithmic and complexity results have been obtained. It is thus natural to ask whether we can protect an election from potential bribery. We assume that the protector can protect a voter with some cost (e.g., by isolating the voter from poten...
Article
The deployment of monoculture software stacks has devastating consequences because a single attack can compromise all of the vulnerable computers in cyberspace. This one-vulnerability-affects-all phenomenon will continue until after software stacks are diversified, which is well recognized by the research community. However, existing studies mainly...
Preprint
Automatically detecting software vulnerabilities in source code is an important problem that has attracted much attention. In particular, deep learning-based vulnerability detectors, or DL-based detectors, are attractive because they do not need human experts to define features or patterns of vulnerabilities. However, such detectors' robustness is...
Article
Cybersecurity dynamics is a mathematical approach to modeling and analyzing cyber attacker-defender interactions in networks. In this paper, we advance the state-of-the-art in characterizing one kind of cybersecurity dynamics, known as preventive and reactive cyber defense dynamics, which is a family of highly nonlinear dynamical system models. We...
Article
Causality is an intriguing concept that, once tamed, can have many applications. While it has been widely investigated in other domains, its relevance and usefulness in the cybersecurity domain have received little attention. In this paper, we present a systematic investigation of a particular approach to causality, known as Granger causality (G-caus...
Article
Automatically detecting software vulnerabilities is an important problem that has attracted much attention from the academic research community. However, existing vulnerability detectors still cannot achieve the vulnerability detection capability and the locating precision that would warrant their adoption for real-world use. In this paper, we pres...
Article
As increasingly more vehicles are connected to the Internet, cyber attacks against vehicles are becoming a real threat with devastating consequences. This highlights the importance of detecting vehicle cyber attacks before fatal accidents occur. One natural method for tackling this problem is to adapt existing approaches for detecting attacks in en...
Preprint
Full-text available
COVID-19 (Coronavirus) hit the global society and economy with a big surprise. In particular, work-from-home has become a new norm for employees. Despite the fact that COVID-19 can equally attack innocent people and cybercriminals, it is ironic to see surges in cyberattacks leveraging COVID-19 as a theme, dubbed COVID-19 themed cyberattacks or COVI...
Preprint
Full-text available
COVID-19 has hit the global community hard, and organizations are working diligently to cope with the new norm of "work from home". However, the volume of remote work is unprecedented and creates opportunities for cyber attackers to penetrate home computers. Attackers have been leveraging websites with COVID-19 related names, dubbed COVID-19 the...
Article
Data breach is a major cybersecurity problem that has caused huge financial losses and compromised many individuals’ privacy (e.g., social security numbers). This calls for a deeper understanding of the data breach risk. Despite the substantial amount of attention that has been directed toward the issue, many fundamental problems are yet to be inv...
Article
Machine learning-based malware detection is known to be vulnerable to adversarial evasion attacks. The state-of-the-art is that there are no effective defenses against these attacks. As a response to the adversarial malware classification challenge organized by the MIT Lincoln Lab (dubbed the AICS'2019 challenge), we propose six guiding principles...
Article
The detection of software vulnerabilities (or vulnerabilities for short) is an important problem that has yet to be tackled, as manifested by the many vulnerabilities reported on a daily basis. This calls for machine learning methods for vulnerability detection. Deep learning is attractive for this purpose because it alleviates the requirement to m...
Article
Full-text available
Modeling cyber threats, such as the propagation dynamics of malicious software (malware) in cyberspace, is an important research problem because models can deepen our understanding of dynamical cyber threats. In this paper, we study the statistical modeling of the macro-level evolution of dynamical cyber attacks. Specifically, we propose a...
Conference Paper
Full-text available
The Cybersecurity Dynamics framework offers an approach to systematically understanding, characterizing, quantifying and managing cybersecurity from a holistic perspective. The framework looks into cyberspace through the dynamics lens because environments in cyberspace often evolve with time (e.g., software vulnerabilities, attack capabilities, def...
Article
Estimating the global state of a networked system is an important problem in many application domains. The classical approach to tackling this problem is the periodic (observation) method, which is inefficient because it often observes states at a very high frequency. This inefficiency has motivated the idea of the event-based method, which leverages...
Conference Paper
Full-text available
COVID-19 (Coronavirus) hit the global society and economy with a big surprise. In particular, work-from-home has become a new norm for employees. Despite the fact that COVID-19 can equally attack innocent people and cyber criminals, it is ironic to see surges in cyberattacks leveraging COVID-19 as a theme, dubbed COVID-19 themed cyberattacks or COV...
Conference Paper
Full-text available
COVID-19 has hit the global community hard, and organizations are working diligently to cope with the new norm of "work from home". However, the volume of remote work is unprecedented and creates opportunities for cyber attackers to penetrate home computers. Attackers have been leveraging websites with COVID-19 related names, dubbed COVID-19 the...
Preprint
Cybersecurity Dynamics is a new concept that aims to achieve the modeling, analysis, quantification, and management of cybersecurity from a holistic perspective, rather than from a building-blocks perspective. It is centered on modeling and analyzing the attack-defense interactions in cyberspace, which cause a "natural" phenomenon: the evolution...
Preprint
Secure group communications are a mechanism facilitating protected transmission of messages from a sender to multiple receivers, and many emerging applications in both wired and wireless networks need the support of such a mechanism. There have been many secure group communication schemes in wired networks, which can be directly adopted in, or appr...
Article
Full-text available
Social engineering cyberattacks are a major threat because they are often a prelude to sophisticated and devastating cyberattacks. Social engineering cyberattacks are a kind of psychological attack that exploits weaknesses in human cognitive functions. Adequate defense against social engineering cyberattacks requires a deeper understanding of what aspects o...
Chapter
Full-text available
The bribery problem in elections has received considerable attention in the literature, upon which various algorithmic and complexity results have been obtained. It is thus natural to ask whether we can protect an election from potential bribery. We assume that the protector can protect a voter with some cost (e.g., by isolating the voter from poten...
Article
Full-text available
Detecting software vulnerabilities is an important problem and a recent development in tackling the problem is the use of deep learning models to detect software vulnerabilities. While effective, it is hard to explain why a deep learning model predicts a piece of code as vulnerable or not because of the black-box nature of deep learning models. Ind...
Preprint
Full-text available
Social engineering cyberattacks are a major threat because they are often a prelude to sophisticated and devastating cyberattacks. Social engineering cyberattacks are a kind of psychological attack that exploits weaknesses in human cognitive functions. Adequate defense against social engineering cyberattacks requires a deeper understanding of what aspects o...
Preprint
Full-text available
The bribery problem in elections has received considerable attention in the literature, upon which various algorithmic and complexity results have been obtained. It is thus natural to ask whether we can protect an election from potential bribery. We assume that the protector can protect a voter with some cost (e.g., by isolating the voter from poten...
Article
Blockchain technology is believed by many to be a game changer in many application domains. While the first generation of blockchain technology (i.e., Blockchain 1.0) is almost exclusively used for cryptocurrency, the second generation (i.e., Blockchain 2.0), as represented by Ethereum, is an open and decentralized platform enabling a new paradigm...
Preprint
Full-text available
Malicious software (malware) is a major cyber threat that must be tackled with Machine Learning (ML) techniques because millions of new malware examples are injected into cyberspace on a daily basis. However, ML is known to be vulnerable to attacks known as adversarial examples. In this SoK paper, we systematize the field of Adversarial Malware De...
Preprint
Full-text available
Machine learning based malware detection is known to be vulnerable to adversarial evasion attacks. The state-of-the-art is that there are no effective countermeasures against these attacks. Inspired by the AICS'2019 Challenge organized by the MIT Lincoln Lab, we systematize a number of principles for enhancing the robustness of neural networks agai...
Article
The majority of vehicle accidents are attributable to driver error, such as substance use, distractions, fatigue, speeding and driving experience. Many of these driver errors are also associated with delay discounting, where individuals that excessively devalue a reward are more likely to use substances such as alcohol, cigarettes and cocaine, and...
Preprint
Cybersecurity dynamics is a mathematical approach to modeling and analyzing cyber attack-defense interactions in networks. In this paper, we advance the state-of-the-art in characterizing one kind of cybersecurity dynamics, known as preventive and reactive cyber defense dynamics, which is a family of highly nonlinear system models. We prove that th...
Preprint
Full-text available
Fine-grained software vulnerability detection is an important and challenging problem. Ideally, a detection system (or detector) not only should be able to detect whether or not a program contains vulnerabilities, but also should be able to pinpoint the type of a vulnerability in question. Existing vulnerability detection methods based on deep lear...
Preprint
Automatically detecting software vulnerabilities is an important problem that has attracted much attention. However, existing vulnerability detectors still cannot achieve the vulnerability detection capability and locating precision that would warrant their adoption for real-world use. In this paper, we present Vulnerability Deep Learning-based Loc...
Chapter
Full-text available
Digital signatures are widely used to assure authenticity and integrity of messages (including blockchain transactions). This assurance is based on the assumption that the private signing key is kept secret, yet in the real world the key may be exposed or compromised without being detected. Many schemes have been proposed to mitigate this problem, but most s...
Preprint
Deep Learning has been very successful in many application domains. However, its usefulness in the context of network intrusion detection has not been systematically investigated. In this paper, we report a case study on using deep learning for both supervised network intrusion detection and unsupervised network anomaly detection. We show that Deep...
Article
Fine-grained software vulnerability detection is an important and challenging problem. Ideally, a detection system (or detector) not only should be able to detect whether or not a program contains vulnerabilities, but also should be able to pinpoint the type of a vulnerability in question. Existing vulnerability detection methods based on deep lear...
Preprint
Intrusion Detection Systems (IDSs) are a necessary cyber defense mechanism. Unfortunately, their capability has fallen behind that of attackers. This motivates us to improve our understanding of the root causes of their false-negatives. In this paper we make a first step towards the ultimate goal of drawing useful insights and principles that can g...
Preprint
The blockchain technology is believed by many to be a game changer in many application domains, especially financial applications. While the first generation of blockchain technology (i.e., Blockchain 1.0) is almost exclusively used for cryptocurrency purposes, the second generation (i.e., Blockchain 2.0), as represented by Ethereum, is an open and...
Conference Paper
As modern social coding platforms such as GitHub and Stack Overflow become increasingly popular, their potential security risks increase as well (e.g., risky or malicious code could be easily embedded and distributed). To enhance social coding security, in this paper, we propose to automate cross-platform user identification between GitHub and...
Conference Paper
We consider the electoral bribery problem in computational social choice. In this context, extensive studies have been carried out to analyze the computational vulnerability of various voting (or election) rules. However, essentially all prior studies assume a deterministic model where each voter has an associated threshold value, which is used as...
Article
Full-text available
Bribery in elections (or computational social choice in general) is an important problem that has received a considerable amount of attention. In the classic bribery problem, the briber (or attacker) bribes some voters in an attempt to make the briber’s designated candidate win an election. In this paper, we introduce a novel variant of the bribery...
Preprint
Full-text available
In cyberspace, evolutionary strategies are commonly used by both attackers and defenders. For example, an attacker's strategy often changes over the course of time, as new vulnerabilities are discovered and/or mitigated. Similarly, a defender's strategy changes over time. These changes may or may not be in direct response to a change in the opponen...
Article
A class of preventive and reactive cyber defense dynamics has recently been proven to be globally convergent, meaning that the dynamics always converges to a unique equilibrium whose location only depends on the values of the model parameters (but not the initial state of the dynamics). In this paper, we unify the aforementioned class of p...
Chapter
Full-text available
Cybersecurity Dynamics is a new concept that aims to achieve the modeling, analysis, quantification, and management of cybersecurity from a holistic perspective, rather than from a building-blocks perspective. It is centered on modeling and analyzing the attack-defense interactions in cyberspace, which cause a “natural” phenomenon: the evolution of th...
Article
Full-text available
As with weather forecasting, the value of being able to forecast or predict cyber threats can hardly be overestimated. Previous investigations show that cyber attack data exhibits interesting phenomena, such as long-range dependence and high nonlinearity, which impose a particular challenge on modeling and predicting cyber attack rates. Devi...
Article
In cyberspace, evolutionary strategies are commonly used by both attackers and defenders. For example, an attacker’s strategy often changes over the course of time, as new vulnerabilities are discovered and/or mitigated. Similarly, a defender’s strategy changes over time. These changes may or may not be in direct response to a change in the opponen...
Article
Various system metrics have been proposed for measuring the quality of computer-based systems, such as dependability and security metrics for estimating their performance and security characteristics. As computer-based systems grow in complexity with many subsystems or components, measuring their quality in multiple dimensions is a challenging task...
Book
This book constitutes the proceedings of the Second International Conference on Science of Cyber Security, SciSec 2019, held in Nanjing, China, in August 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 62 submissions. These papers cover the following subjects: Artificial Intelligence fo...
Preprint
Full-text available
Malware continues to be a major cyber threat, despite the tremendous effort that has been made to combat it. The amount of malware in the wild steadily increases over time, meaning that we must resort to automated defense techniques. This naturally calls for machine learning based malware detection. However, machine learning is known to be vulner...
Conference Paper
As the popularity of modern social coding platforms such as Stack Overflow grows, their potential security risks increase as well (e.g., insecure code could be easily embedded and distributed). To address this largely overlooked issue, in this paper, we bring an important new insight to exploit social coding properties in addition to code content for...
Preprint
We develop a decentralized coloring approach to diversify the nodes in a complex network. The key is the introduction of a local conflict index that measures the color conflicts arising at each node and that can be efficiently computed using only local information. We demonstrate via both synthetic and real-world networks that the proposed approach si...
Preprint
Full-text available
Public blockchains provide a decentralized method for storing transaction data and have many applications in different sectors. In order for users to track transactions, a simple method is to let them keep a local copy of the entire public ledger. Since the size of the ledger keeps growing, this method becomes increasingly less practical, especiall...
Preprint
Bribery in elections (or computational social choice in general) is an important problem that has received a considerable amount of attention. In the classic bribery problem, the briber (or attacker) bribes some voters in an attempt to make the briber's designated candidate win an election. In this paper, we introduce a novel variant of the bribery...
Conference Paper
The Internet of Things (IoT) technology is transforming the world into Smart Cities, which have a huge impact on future societal lifestyle, economy and business. Intelligent Transportation Systems (ITS), especially IoT-enabled Electric Vehicles (EVs), are anticipated to be an integral part of future Smart Cities. Assuring ITS safety and security is...
Preprint
The accurate measurement of security metrics is a critical research problem because an improper or inaccurate measurement process can ruin the usefulness of the metrics, no matter how well they are defined. This is a highly challenging problem particularly when the ground truth is unknown or noisy. In contrast to the well perceived importance of de...
Preprint
Full-text available
Adversarial machine learning in the context of image processing and related applications has received a large amount of attention. However, adversarial machine learning, especially adversarial deep learning, in the context of malware detection has received much less attention despite its apparent importance. In this paper, we present a framework fo...
Article
Full-text available
The rollback mechanism is critical in crash recovery and debugging, but its security problems have not been adequately addressed. This is evidenced by the fact that existing solutions always require modifications to the target software or only work for specific scenarios. As a consequence, rollback is either neglected, restricted, or prohibited in exi...
Preprint
The detection of software vulnerabilities (or vulnerabilities for short) is an important problem that has yet to be tackled, as manifested by the many vulnerabilities reported on a daily basis. This calls for machine learning methods to automate vulnerability detection. Deep learning is attractive for this purpose because it does not require human expe...
Article
Analyzing cyber incident datasets is an important method for deepening our understanding of the evolution of the threat situation. This is a relatively new research topic and many studies remain to be done. In this paper, we report a statistical analysis of a breach incident dataset corresponding to 12 years (2005-2017) of cyber hacking activities...