Laurie Williams's research while affiliated with North Carolina State University and other places

Publications (338)

Conference Paper
Full-text available
Throughout 2021, GitGuardian's monitoring of public GitHub repositories revealed a twofold increase in the number of secrets (database credentials, API keys, and other credentials) exposed compared to 2020, accumulating more than six million secrets. A systematic derivation of practices for managing secrets can help practitioners in secure developm...
Preprint
Throughout 2021, GitGuardian's monitoring of public GitHub repositories revealed a two-fold increase in the number of secrets (database credentials, API keys, and other credentials) exposed compared to 2020, accumulating more than six million secrets. A systematic derivation of practices for managing secrets can help practitioners in secure develop...
Article
Full-text available
Context Applying vulnerability detection techniques is one of many tasks using the limited resources of a software project. Objective The goal of this research is to assist managers and other decision-makers in making informed choices about the use of software vulnerability detection techniques through an empirical study of the efficiency and effe...
Preprint
The OpenSSF Scorecard project is an automated tool to monitor the security health of open source software. We used the tool to understand the security practices and gaps in npm and PyPI ecosystems and to confirm the applicability of the Scorecard tool.
Preprint
CONTEXT: Applying vulnerability detection techniques is one of many tasks using the limited resources of a software project. OBJECTIVE: The goal of this research is to assist managers and other decision-makers in making informed choices about the use of software vulnerability detection techniques through an empirical study of the efficiency and eff...
Preprint
The goal of this study is to aid developers in securely accepting dependency updates by measuring if the code changes in an update have passed through a code review process. We implement DepDive, an update audit tool for packages in Crates.io, npm, PyPI, and RubyGems registry. DepDive first (i) identifies the files and the code changes in an update...
Preprint
Background: Most of the existing machine learning models for security tasks, such as spam detection, malware detection, or network intrusion detection, are built on supervised machine learning algorithms. In such a paradigm, models need a large amount of labeled data to learn the useful relationships between selected features and the target class....
Article
Full-text available
Checked-in secrets in version-controlled software projects pose security risks to software and services. Secret detection tools can identify the presence of secrets in the code, commit changesets, and project version control history. As these tools can generate false positives, developers are provided with mechanisms to bypass the warnings generate...
Conference Paper
Full-text available
Modern software development frequently uses third-party packages, raising the concern of supply chain security attacks. Many attackers target popular package managers, like npm, and their users with supply chain attacks. In 2021 there was a 650% year-on-year growth in security attacks by exploiting Open Source Software's supply chain. Proactive ap...
Preprint
Background: Machine learning techniques have been widely used and demonstrate promising performance in many software security tasks such as software vulnerability prediction. However, the class ratio within software vulnerability datasets is often highly imbalanced (since the percentage of observed vulnerability is usually very low). Goal: To help...
Article
Context Using feature toggles is a technique to turn a feature either on or off in program code by checking the value of a variable in a conditional statement. This technique is increasingly used by software practitioners to support continuous integration and continuous delivery (CI/CD). However, using feature toggles may increase code complexity,...
Article
Full-text available
Context Machine learning-based security detection models have become prevalent in modern malware and intrusion detection systems. However, previous studies show that such models are susceptible to adversarial evasion attacks. In this type of attack, inputs (i.e., adversarial examples) are specially crafted by intelligent malicious adversaries, with...
Preprint
Full-text available
Modern software development frequently uses third-party packages, raising the concern of supply chain security attacks. Many attackers target popular package managers, like npm, and their users with supply chain attacks. In 2021 there was a 650% year-on-year growth in security attacks by exploiting Open Source Software's supply chain. Proactive app...
Preprint
Vulnerabilities in open source packages can be a security risk for the client projects that use these packages as dependencies. When a new vulnerability is discovered in a package, the package should quickly release a fix in a new version, referred to as security release in this study. The security release should be well-documented and require mini...
Preprint
Full-text available
Cybersecurity researchers have contributed to the automated extraction of CTI from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to aid cybersecurity researchers in understanding the current techniques used for cyberthreat intelligence extraction...
Preprint
Background: Modern software uses many third-party libraries and frameworks as dependencies. Known vulnerabilities in these dependencies are a potential security risk. Software composition analysis (SCA) tools, therefore, are being increasingly adopted by practitioners to keep track of vulnerable dependencies. Aim: The goal of this study is to under...
Article
Full-text available
Background In order that the general public is not vulnerable to hackers, security bug reports need to be handled by small groups of engineers before being widely discussed. But learning how to distinguish the security bug reports from other bug reports is challenging since they may occur rarely. Data mining methods that can find such scarce target...
Preprint
Full-text available
We study 10 C/C++ projects that have been using a static analysis security testing tool. We analyze the historical scan reports generated by the tool and study how frequently memory-related alerts appeared. We also studied the subsequent developer action on those alerts. We also look at the CVEs published for these projects within the study timelin...
Preprint
Full-text available
Lack of security expertise among software practitioners is a problem with many implications. First, there is a deficit of security professionals to meet current needs. Additionally, even practitioners who do not plan to work in security may benefit from increased understanding of security. The goal of this paper is to aid software engineering educa...
Article
Context: Security smells are recurring coding patterns that are indicative of security weakness and require further inspection. As infrastructure as code (IaC) scripts, such as Ansible and Chef scripts, are used to provision cloud-based servers and systems at scale, security smells in IaC scripts could be used to enable malicious users to exploit v...
Article
Full-text available
Background Using feature toggles is a technique that allows developers to either turn a feature on or off with a variable in a conditional statement. Feature toggles are increasingly used by software companies to facilitate continuous integration and continuous delivery. However, using feature toggles inappropriately may cause problems which can ha...
Preprint
BACKGROUND: Machine learning-based security detection models have become prevalent in modern malware and intrusion detection systems. However, previous studies show that such models are susceptible to adversarial evasion attacks. In this type of attack, inputs (i.e., adversarial examples) are specially crafted by intelligent malicious adversaries,...
Article
Full-text available
Context: The ‘as code’ suffix in infrastructure as code (IaC) refers to applying software engineering activities, such as version control, to maintain IaC scripts. Without the application of these activities, defects that can have serious consequences may be introduced in IaC scripts. A systematic investigation of the development anti-patterns for I...
Preprint
Software developers often fail to run simple checks for known vulnerabilities in software dependencies, creating a common security risk. To remedy this, GitHub began sending security alerts to hosted projects that declare a vulnerable dependency in October 2017. This naturally raises an important question: how did vulnerable dependency fix rates ch...
Preprint
Context: The 'as code' suffix in infrastructure as code (IaC) refers to applying software engineering activities, such as version control, to maintain IaC scripts. Without the application of these activities, defects that can have serious consequences may be introduced in IaC scripts. A systematic investigation of the development anti-patterns for...
Chapter
Predictable, rapid, and data-driven feature rollout and lightning-fast, automated fix deployment are some of the benefits most large software organizations worldwide are striving for. In the process, they are transitioning toward the use of continuous deployment practices. Continuous deployment enables companies to make hundreds or thousands of so...
Article
Full-text available
Context Modern software systems are deployed in sociotechnical settings, combining social entities (humans and organizations) with technical entities (software and devices). In such settings, on top of technical controls that implement security features of software, regulations specify how users should behave in security-critical situations. No mat...
Preprint
Background: Security bugs need to be handled by small groups of engineers before being widely discussed (otherwise the general public becomes vulnerable to hackers that exploit those bugs). But learning how to separate the security bugs from other bugs is challenging since they may occur very rarely. Data mining that can find such scarce targets re...
Article
Context Vulnerability Prediction Models (VPMs) are an approach for prioritizing security inspection and testing to find and fix vulnerabilities. VPMs have been created based on a variety of metrics and approaches, yet widespread adoption of VPM usage in practice has not occurred. Knowing which VPMs have strong prediction and which VPMs have low dat...
Article
Full-text available
Software engineers can find vulnerabilities with less effort if they are directed towards code that might contain more vulnerabilities. HARMLESS is an incremental support vector machine tool that builds a vulnerability prediction model from the source code inspected to date, then suggests what source code files should be inspected next. In this way...
Conference Paper
Full-text available
Static analysis tools (SATs) often fall short of developer satisfaction despite their many benefits. An understanding of how developers in the real world act on the alerts detected by SATs can help improve the utility of these tools and determine future research directions. The goal of this paper is to aid researchers and tool makers in improving t...
Preprint
Context: Security smells are coding patterns in source code that are indicative of security weaknesses. As infrastructure as code (IaC) scripts are used to provision cloud-based servers and systems at scale, security smells in IaC scripts could be used to enable malicious users to exploit vulnerabilities in the provisioned systems. Goal: The goal o...
Preprint
Using feature toggles is a technique that allows developers to either turn a feature on or off with a variable in a conditional statement. Feature toggles are increasingly used by software companies to facilitate continuous integration and continuous delivery. However, using feature toggles inappropriately may cause problems, such as dead code and...
Preprint
When security bugs are detected, they should be (a) discussed privately by security software engineers; and (b) not mentioned to the general public until security patches are available. Software engineers usually report bugs to a bug tracking system, and label them as security bug reports (SBRs) or not-security bug reports (NSBRs), while SBRs have a...
Article
Context In continuous deployment, software and services are rapidly deployed to end-users using an automated deployment pipeline. Defects in infrastructure as code (IaC) scripts can hinder the reliability of the automated deployment pipeline. We hypothesize that certain properties of IaC source code such as lines of code and hard-coded strings used...
Poster
Full-text available
How Open Source Developers Respond to Static Analysis Security Testing Tool Alerts: A Brief Synopsys
Conference Paper
According to the National Institute of Standards and Technology (NIST), penetration testing is an assessment conducted on software systems to identify vulnerabilities that could be exploited by adversaries. Despite the importance of penetration testing in software security, practitioners search for strategies and guidance on how to get started in...
Conference Paper
Full-text available
Static analysis tool alerts can help developers detect potential defects in the code early in the development cycle. However, developers are not always able to respond to the alerts with their preferred action and may turn away from using the tool. In this paper, we qualitatively analyze 280 Stack Overflow (SO) questions regarding static analysis t...
Article
Context: Infrastructure as code (IaC) is the practice to automatically configure system dependencies and to provision local and remote instances. Practitioners consider IaC as a fundamental pillar to implement DevOps practices, which helps them to rapidly deliver software and services to end-users. Information technology (IT) organizations, such as...
Preprint
Context: In continuous deployment, software and services are rapidly deployed to end-users using an automated deployment pipeline. Defects in infrastructure as code (IaC) scripts can hinder the reliability of the automated deployment pipeline. We hypothesize that certain properties of IaC source code such as lines of code and hard-coded strings use...
Preprint
Infrastructure as code (IaC) scripts are used to automate the maintenance and configuration of software development and deployment infrastructure. IaC scripts can be complex in nature, containing hundreds of lines of code, leading to defects that can be difficult to debug, and lead to wide-scale system discrepancies such as service outages at scale...
Preprint
Context: Infrastructure as code (IaC) is the practice to automatically configure system dependencies and to provision local and remote instances. Practitioners consider IaC as a fundamental pillar to implement DevOps practices, which helps them to rapidly deliver software and services to end-users. Information technology (IT) organizations, such as...
Article
Context Michael Howard conceptualized the attack surface of a software system as a metaphor for risk assessment during the development and maintenance of software. While the phrase attack surface is used in a variety of contexts in cybersecurity, professionals have different conceptions of what the phrase means. Objective The goal of this systemat...
Article
Full-text available
Software defect data has long been used to drive software development process improvement. If security defects (vulnerabilities) are discovered and resolved by different software development practices than non-security defects, the knowledge of that distinction could be applied to drive process improvement. The goal of this research is to support t...
Conference Paper
Configuration as code (CaC) tools, such as Ansible and Puppet, help software teams to implement continuous deployment and deploy software changes rapidly. CaC tools are growing in popularity, yet the challenges programmers encounter with CaC tools have not been characterized. A systematic investigation on what questions are asked by programmers,...
Conference Paper
Identifying security issues before attackers do has become a critical concern for software development teams and software users. While methods for finding programming errors (e.g. fuzzers, static code analysis [3] and vulnerability prediction models like Scandariato et al. [10]) are valuable, identifying security issues related to the lack of sec...
Conference Paper
Context: Software defect data has long been used to drive software development process improvement. If security defects (i.e., vulnerabilities) are discovered and resolved by different software development practices than non-security defects, the knowledge of that distinction could be applied to drive process improvement. Objective: The goal of this...
Conference Paper
Use of infrastructure as code (IaC) scripts helps software teams manage their configuration and infrastructure automatically. Information technology (IT) organizations use IaC scripts to create and manage automated deployment pipelines to deliver services rapidly. IaC scripts can be defective, resulting in dire consequences, such as creating wide-s...
Conference Paper
Continuous deployment is a software engineering process where incremental software changes are automatically tested and frequently deployed to production environments. With continuous deployment, the elapsed time for a change made by a developer to reach a customer can now be measured in days or even hours. To understand the emerging practices surr...
Article
Context: Practitioners establish a piece of software's security objectives during the software development process. To support control and assessment, practitioners and researchers seek to measure security risks and mitigations during software development projects. Metrics provide one means for assessing whether software security objectives have be...
Conference Paper
Developing security requirements that are compliant with security regulations is key for developing secure software systems. Statements within regulatory documents are frequently overlapping, both within and between documents. Approaches to identifying and addressing this overlap have been developed in academia and industry. However, these approaches...
Conference Paper
To date, vulnerability research has focused on the binary classification of code as vulnerable or not vulnerable. To better understand the conditions in which vulnerabilities occur, researchers must consider the severity of these vulnerabilities in addition to a binary classification system. To explore this issue, we mined 2,979 publicly disclosed...
Conference Paper
We propose and evaluate an information extraction and analysis framework that combines human intelligence (crowdsourcing) with automated methods to produce improved security and privacy requirements incorporating knowledge from post-deployment artifacts such as breach reports.
Article
Full-text available
Society needs more secure software. But predicting vulnerabilities is difficult and existing methods are not applied in practical use due to various limitations. The goal of this paper is to design a vulnerability prediction method in a cost-aware manner so that it can balance the percentage of vulnerabilities found against the cost of human effort...
Article
Full-text available
With the goal of helping software engineering researchers understand how to improve their papers, Mary Shaw presented "Writing Good Software Engineering Research Papers" in 2003. Shaw analyzed the abstracts of the papers submitted to the 2002 International Conference on Software Engineering (ICSE) to determine trends in research question type, cont...
Conference Paper
Science of security necessitates conducting methodologically-defensible research and reporting such research comprehensively to enable replication and future research to build upon the reported study. The comprehensiveness of reporting is as important as the research itself in building a science of security. Key principles of science - replication,...
Article
Full-text available
Context User activity logs should capture evidence to help answer who, what, when, where, why, and how a security or privacy breach occurred. However, software engineers often implement logging mechanisms that inadequately record mandatory log events (MLEs), user activities that must be logged to enable forensics. Goal The objective of this study is...
Article
Full-text available
Identifying security requirements early on can lay the foundation for secure software development. Security requirements are often implied by existing functional requirements but are mostly left unspecified. The Security Discoverer (SD) process automatically identifies security implications of individual requirements sentences and suggests applicab...