
A systematic literature review on software security testing using metaheuristics


Abstract and Figures

The security of an application is critical for its success, as breaches cause losses for organizations and individuals. Search-based software security testing (SBSST) is the field that uses metaheuristics to generate test cases that satisfy pre-specified security test adequacy criteria. This paper conducts a systematic literature review (SLR) to compare the metaheuristics and fitness functions used in software security testing, exploring their distinctive capabilities and their impact on vulnerability detection and code coverage. The aim is to provide insights for fortifying software systems against emerging threats in the rapidly evolving technological landscape. This paper examines how search-based algorithms have been explored in the context of code coverage and software security testing. Moreover, the study highlights different metaheuristics and fitness functions for security testing and code coverage. This paper follows the standard guidelines from Kitchenham to conduct the SLR and obtained 122 primary studies related to SBSST after a multi-stage selection process. The papers came from different sources (journals, conference proceedings, workshops, summits, and researchers’ webpages) and were published between 2001 and 2022. The outcomes demonstrate that the main vulnerabilities tackled using metaheuristics are XSS, SQLI, program crash, and XMLI. The findings suggest several directions for future research, including detecting server-side request forgery and security testing of third-party components. Moreover, new metaheuristics need to be explored to detect security vulnerabilities that are still unexplored or significantly less explored. Furthermore, metaheuristics can be combined with machine learning and reinforcement learning techniques for better results. Some metaheuristics can be designed by looking at the complexity of security testing and exploiting more fitness functions related to detecting different vulnerabilities.
https://doi.org/10.1007/s10515-024-00433-0
FatmaAhsan1· FaisalAnwer1
Received: 10 August 2023 / Accepted: 13 March 2024
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024
Keywords SBSST · Meta-heuristic · Optimization algorithm · Evolutionary algorithm · Software security testing · Code coverage · XSS · SQLI · XMLI · Program crash
Published online: 23 May 2024
Automated Software Engineering (2024) 31:44
... The severity of the vulnerabilities has also become higher, such as the Remote Code Execution (RCE) vulnerability Log4Shell (CVE-2021-44228) in Apache Log4j2, which caused a huge interruption to web services across the whole internet. ...
... The existing literature survey studies up to now [1,8,27,28,67,71] have focused on a specific subset of exploitation techniques, for example, Felderer et al. [27] conducted a survey on the state-of-the-art techniques for security testing. ...
... As seen in Table 1, most of the existing surveys follow a systematic approach, except for the works of Shahriar et al. [67], and Felderer et al. [27], which do not explain the process they followed. Only two works derived an explicit set of research questions, aiming to answer them and generate new knowledge on the matter [1,8] however, we still group a number of the techniques that we found in our study as fuzzing techniques (denote with the symbol "◗" in the Table 1, meaning "partially cover"). ...
Preprint
The exploit or the Proof of Concept of the vulnerability plays an important role in developing superior vulnerability repair techniques, as it can be used as an oracle to verify the correctness of the patches generated by the tools. However, the vulnerability exploits are often unavailable and require time and expert knowledge to craft. Obtaining them from the exploit generation techniques is another potential solution. The goal of this survey is to aid the researchers and practitioners in understanding the existing techniques for exploit generation through the analysis of their characteristics and their usability in practice. We identify a list of exploit generation techniques from literature and group them into four categories: automated exploit generation, security testing, fuzzing, and other techniques. Most of the techniques focus on the memory-based vulnerabilities in C/C++ programs and web-based injection vulnerabilities in PHP and Java applications. We found only a few studies that publicly provided usable tools associated with their techniques.
Article
Cloud computing has become a pivotal component of all digital organizations, and its adoption poses many security concerns for organizations’ data and stakeholders. Data integrity and privacy are prominent security concerns that demand the development of efficient security models based on exact algorithms, heuristics, and meta-heuristic algorithms. This paper proposes a genetic algorithm-based block cipher, XECryptoGA, that generates 128-bit random keys to maximize Shannon's entropy. XECryptoGA employs the shift cipher, the XOR operation, the concept of elitism, and the roulette wheel selection method. In XECryptoGA, the shift cipher, XOR operation, GA, and application of crossover and mutation are used for the encryption and decryption process. The shift cipher, GA, and XOR operation create confusion, while crossover and mutation generate diffusion in the model. The proposed XECryptoGA has been tested using several small and large datasets on various security parameters: key generation time, brute-force attacks, avalanche effect, and encryption and decryption throughputs. The simulation study confirms that the proposed XECryptoGA has achieved the security goals of confidentiality, integrity, and privacy. The experimental results show that the proposed model performs better than CryptoGA regarding key generation time, brute-force attacks, avalanche effect, and the security provided, and better than the state-of-the-art algorithms AES, DES, 3DES, and BLOWFISH in terms of encryption and decryption throughput.
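The key-generation idea in this abstract (a genetic algorithm with roulette wheel selection and elitism that searches for 128-bit keys with high Shannon entropy) can be illustrated with a minimal sketch. This is not the XECryptoGA implementation: the abstract does not say how entropy is measured, so the sketch assumes entropy over the key's 32 four-bit nibbles, and the function names, population size, and mutation rate are all hypothetical. It is a toy and is not suitable for generating real cryptographic keys.

```python
import math
import random
from collections import Counter

KEY_BITS = 128


def shannon_entropy(key: int) -> float:
    """Entropy in bits per symbol of the key's 32 four-bit nibbles (assumed measure)."""
    nibbles = [(key >> shift) & 0xF for shift in range(0, KEY_BITS, 4)]
    total = len(nibbles)
    return -sum((c / total) * math.log2(c / total) for c in Counter(nibbles).values())


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]


def crossover(a: int, b: int, rng) -> int:
    """Single-point crossover: high bits from a, low bits from b."""
    point = rng.randrange(1, KEY_BITS)
    low_mask = (1 << point) - 1
    return (a & ~low_mask) | (b & low_mask)


def mutate(key: int, rng, rate: float = 1 / KEY_BITS) -> int:
    """Flip each bit independently with the given probability."""
    for bit in range(KEY_BITS):
        if rng.random() < rate:
            key ^= 1 << bit
    return key


def evolve_key(generations: int = 50, pop_size: int = 20, elite: int = 2, seed: int = 0) -> int:
    """Evolve a 128-bit key toward maximal nibble entropy."""
    rng = random.Random(seed)
    population = [rng.getrandbits(KEY_BITS) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=shannon_entropy, reverse=True)
        # Small offset keeps every weight positive for roulette selection.
        fitnesses = [shannon_entropy(k) + 1e-9 for k in population]
        next_gen = population[:elite]  # elitism: best keys survive unchanged
        while len(next_gen) < pop_size:
            parent_a = roulette_select(population, fitnesses, rng)
            parent_b = roulette_select(population, fitnesses, rng)
            next_gen.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = next_gen
    return max(population, key=shannon_entropy)


if __name__ == "__main__":
    key = evolve_key()
    print(f"{key:032x}", round(shannon_entropy(key), 3))
```

Because the elite keys are copied unchanged into each new generation, the best entropy in the population can never decrease, which is the practical reason elitism is paired with the stochastic roulette wheel step.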
Article
The widespread adoption of cloud technology in the healthcare industry has achieved effective outcomes in sharing health records and sensitive data. In recent years, many organizations have prioritized E-Health as their primary goal for advancement in health services. As a result, Attribute Based Encryption (ABE) has emerged as a reliable model for health information exchange across cloud settings, and it provides acceptable solutions for challenging scenarios like fine-grained access control. Despite ABE's significance and breadth of applications, no systematic and comprehensive survey exists in the literature that covers every variation of ABE in healthcare, highlighting its past and present status. This paper presents a systematic and comprehensive study of ABE works concerning E-Health: the authors rigorously investigate healthcare-focused ABE frameworks, examine them based on various descriptive criteria, and categorize them systematically into 10 distinct domains and sub-domains, ultimately offering observations and potential recommendations. The descriptive research design, the significant findings, and the suggested future work will help future research in ABE to secure existing E-Health data sharing more effectively and will help researchers and practitioners comprehend the past trend and current state of ABE architectures in secure health data-sharing scenarios, as well as the prospects of ABE deployment in the most recent technological evolutions.
Article
Cryptography is the mechanism for providing significant security services such as confidentiality, authentication, and integrity. These services are all the more important in the current era, where many IoT devices generate data that is stored in the cloud. Public key cryptography is highly effective in achieving confidentiality and authentication services. One of the popular public-key cryptography schemes is the RSA algorithm, which, among other concepts, uses two very large prime integers. In the proposed study, the authors enhance the RSA algorithm by generating a more complex key pair, i.e., public and private key, so that an adversary cannot determine the private key from the public key. The proposed scheme uses four random large prime numbers to generate public–private key pairs and applies an XOR operation along with a more complex intermediate process in the key-generation, encryption, and decryption phases to achieve higher algorithm complexity, which would require more time to break the proposed cipher and would make it extremely difficult for third parties to attack, hence boosting security. The method is also compared with other RSA-based algorithms to demonstrate the potency of the proposed algorithm in terms of enhanced performance and security.
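To make the four-prime idea concrete, here is a minimal sketch of textbook multi-prime RSA key generation with four primes. It is not the scheme proposed in the study: the abstract does not specify the XOR step or the intermediate process, so both are omitted here. The function names and the tiny primes in the usage example are assumptions chosen for illustration only, the inputs are assumed (not checked) to be prime, and there is no padding, so this must not be used for real encryption. It relies on `pow(e, -1, phi)`, which requires Python 3.8 or later.

```python
import math


def four_prime_keypair(p: int, q: int, r: int, s: int, e: int = 65537):
    """Build a textbook RSA key pair whose modulus is the product of four distinct primes.

    The caller is assumed to pass four primes; primality is not verified here.
    """
    primes = (p, q, r, s)
    if len(set(primes)) != 4:
        raise ValueError("the four primes must be distinct")
    n = p * q * r * s
    phi = (p - 1) * (q - 1) * (r - 1) * (s - 1)
    if math.gcd(e, phi) != 1:
        raise ValueError("public exponent must be coprime with phi(n)")
    d = pow(e, -1, phi)  # modular inverse of e modulo phi(n)
    return (e, n), (d, n)


def encrypt(message: int, public_key) -> int:
    """Raw RSA encryption: message^e mod n (no padding)."""
    e, n = public_key
    if not 0 <= message < n:
        raise ValueError("message must be in the range [0, n)")
    return pow(message, e, n)


def decrypt(ciphertext: int, private_key) -> int:
    """Raw RSA decryption: ciphertext^d mod n."""
    d, n = private_key
    return pow(ciphertext, d, n)


if __name__ == "__main__":
    # Toy primes for illustration; real keys need large random primes.
    public, private = four_prime_keypair(1009, 1013, 1019, 1021)
    message = 123456789
    assert decrypt(encrypt(message, public), private) == message
```

With four primes, an attacker who wants the private exponent still has to factor the modulus, while the key owner can in principle speed up decryption with the Chinese Remainder Theorem; the sketch leaves that optimization out.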
Conference Paper
Nowadays, people are using more cloud services than ever before, as they provide more storage, a collaborative environment, and more security than any other platform. Public Key Cryptography plays a significant role in securing cloud applications, particularly Elliptic Curve Cryptography, whose small key size makes it well suited to the Cloud. Although many contributions have been made in recent years to enhance the security of Elliptic Curve approaches in Cloud services through modifications to the algorithm or to its various phases, a review that integrates recent studies and provides research directions is missing in the literature. In this paper, we review recent studies along with the various phases of Elliptic Curve Cryptography, followed by a data analysis of various approaches and techniques. Several research directions and open research problems are derived that would assist future relevant research.
Chapter
In today’s cyberspace, where large amounts of data are exchanged and stored on remote storage, cryptography plays a major role. Public key cryptography such as RSA is one of its effective types, which uses two keys, one for encryption and one for decryption. Concerning the recent advancements in the domain of cryptography, many cryptographers have proposed various extended and enhanced forms of the RSA algorithm in order to improve the reliability and efficiency of information security. In this paper, we studied several extended forms of the RSA algorithm. We implemented several of them and compared their efficiency in terms of key generation, encryption, and decryption time. Finally, we suggested a multipoint parallel RSA scheme to improve the overall algorithm execution speed compared to standard RSA. This method will also prove to be computationally less costly and more secure as compared to standard RSA.
Keywords Cryptography · Parallel cryptography · Public cryptography · RSA · Cryptanalysis
Conference Paper
In the rapidly evolving landscape of software development, ensuring reliable and efficient software systems is essential. However, traditional software testing methods often struggle to achieve comprehensive test coverage and adaptability to changing software dynamics. To address these challenges, this paper proposes an innovative approach that integrates reinforcement learning techniques into software testing automation. Our goal is to enhance test generation and prioritization strategies, leading to improved fault detection, adaptability, and resource utilization. By developing an intelligent testing framework that learns from feedback received during the testing process, we optimize test coverage and fault detection using reinforcement learning. Initial experiments demonstrate the potential of our approach in improving software testing outcomes. The integration of reinforcement learning into software testing automation holds promise for advancing the field, enabling more reliable and adaptable software systems, and reducing development costs.
Chapter
In the past decades, automated test case generation for detecting software vulnerabilities using search-based algorithms has been a crucial research area for security researchers. Several proposed techniques partially achieve automatic test case generation for detecting security issues, targeting specific programming languages or API types. This review paper performs a comparative analysis of research papers published during 2010–2021 specific to the search-based techniques used for automatic test case generation. We have chosen five criteria for evaluating the proposed techniques. These are target vulnerability, search-based algorithms used, test case generation source, test case generation method, and target language. We focused primarily on the techniques detecting the four most dangerous vulnerabilities found in modern distributed systems, namely Denial of Service (e.g., program crash), SQL injection, Cross-Site Scripting, and XML injection. We also incorporated the fitness function comparison table for each method, including the merits and limitations. This work will assist security researchers in finding new directions in the field of search-based software security testing.
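As a rough illustration of what "search-based test case generation guided by a fitness function" means for the program-crash case, the sketch below uses hill climbing with random restarts, one of the simplest local-search metaheuristics. Everything in it is hypothetical and not taken from the reviewed techniques: `vulnerable` stands in for a unit under test with a hard-to-hit crashing branch, and `branch_distance` is an assumed fitness function that is zero exactly when the crashing branch would be taken.

```python
import random


def vulnerable(x: int, y: int) -> None:
    """Hypothetical unit under test: crashes only in a narrow input region."""
    if x == 2 * y + 7 and y > 100:
        raise RuntimeError("simulated crash")


def branch_distance(x: int, y: int) -> int:
    """Fitness to minimize: 0 when the crashing branch is taken, larger when farther away."""
    return abs(x - (2 * y + 7)) + max(0, 101 - y)


def random_candidate(rng):
    """Draw a random starting input."""
    return rng.randint(-1000, 1000), rng.randint(-1000, 1000)


def hill_climb(seed: int = 0, max_steps: int = 200_000):
    """Search for a crashing input; returns (x, y) or None if the budget runs out."""
    rng = random.Random(seed)
    current = random_candidate(rng)
    for _ in range(max_steps):
        if branch_distance(*current) == 0:
            return current
        neighbours = [
            (current[0] + dx, current[1] + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        best = min(neighbours, key=lambda c: branch_distance(*c))
        if branch_distance(*best) < branch_distance(*current):
            current = best  # move downhill
        else:
            current = random_candidate(rng)  # stuck on a plateau: restart
    return None


if __name__ == "__main__":
    found = hill_climb()
    print("crashing input:", found)
```

The random restart matters here: the fitness landscape has plateaus where no single-step neighbour improves the distance, which is the kind of local optimum that motivates population-based metaheuristics such as genetic algorithms.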
Article
Context: Web applications follow a client–server schema, so evolutionary test case generation is more appropriate when it considers both client and server. However, test cases from the client side are composed of event sequences, which are quite time-consuming to execute due to the interaction with the browser. Furthermore, premature convergence is a problem for evolutionary algorithms because of the decline of population diversity. These problems restrict the applicability of evolutionary algorithms in test case generation for web applications.
Objective: Parallelization has been proven helpful in optimizing test case generation. So, to improve the efficiency and effectiveness of test generation for web applications, this paper proposes a parallel evolutionary test case generation approach where test cases are generated from the client-side behavior model to cover the sensitive paths of server-side code by using a parallel genetic algorithm based on the island model.
Method: A parallel execution strategy is presented to drive multiple individuals to execute on multiple browsers simultaneously to shorten the execution time of populations during evolution. An island model with a corresponding migration mechanism and subpopulation evolution strategy is designed to increase population diversity during evolution. Meanwhile, the server-side code triggered by parallel individuals is identified to guide the evolution process.
Results: Experiments are conducted on six widely-used web applications, and the results show that compared with sequential evolutionary test case generation, our approach decreases the iterations and evolution time required by 33.43% and 63.10% on average, respectively. The efficiency of test generation has been greatly enhanced.
Conclusion: This paper provides a parallel evolutionary test case generation approach for web applications, where the parallel execution strategy is presented to shorten the execution time of populations during evolution, increasing test generation efficiency. Moreover, the island model with a migration mechanism is introduced to increase population diversity during evolution, improving the test generation effectiveness.
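The island model with migration described in this abstract can be sketched in a few lines. The following is a generic illustration, not the authors' approach: it runs the islands sequentially rather than on parallel browsers, uses a toy OneMax fitness (count of 1 bits) in place of server-side path coverage, and assumes a ring migration topology in which each island's best individual replaces the worst individual of the next island. All names and parameter values are hypothetical.

```python
import random


def fitness(individual) -> int:
    """Toy OneMax fitness: number of 1 bits (stand-in for a coverage score)."""
    return sum(individual)


def evolve_island(population, rng, mutation_rate):
    """One generation on one island: elitism, tournament selection, uniform crossover, mutation."""
    children = [max(population, key=fitness)[:]]  # keep the island's best
    while len(children) < len(population):
        parent_a = max(rng.sample(population, 2), key=fitness)
        parent_b = max(rng.sample(population, 2), key=fitness)
        child = [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
        children.append(child)
    return children


def island_ga(n_islands=4, pop_size=10, length=32, generations=40, migrate_every=5, seed=0):
    """Evolve several subpopulations with periodic ring migration; return the best individual."""
    rng = random.Random(seed)
    islands = [
        [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    for generation in range(1, generations + 1):
        islands = [evolve_island(pop, rng, 1 / length) for pop in islands]
        if generation % migrate_every == 0:
            # Copy each island's best, then overwrite the worst member of the next island.
            migrants = [max(pop, key=fitness)[:] for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = migrants[(i - 1) % n_islands]
    return max((ind for pop in islands for ind in pop), key=fitness)


if __name__ == "__main__":
    best = island_ga()
    print(fitness(best), "of", len(best), "bits set")
```

Keeping the subpopulations separate between migrations is what preserves diversity: each island can converge on a different region of the search space, and the occasional migrant injects material that a single merged population might already have lost.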
Article
RESTful web services are often used for building a wide variety of enterprise applications. The diversity and increased number of applications using RESTful APIs means that increasing amounts of resources are spent developing and testing these systems. Automation in test data generation provides a useful way of generating test data in a fast and efficient manner. However, automated test generation often results in large test suites that are hard to evaluate and investigate manually. This article proposes a taxonomy of the faults we have found using search-based software testing techniques applied on RESTful APIs. The taxonomy is a first step in understanding, analyzing, and ultimately fixing software faults in web services and enterprise applications. We propose to apply a density-based clustering algorithm to the test cases evolved during the search to allow a better separation between different groups of faults. This is needed to enable engineers to highlight and focus on the most serious faults. Tests were automatically generated for a set of eight case studies, seven open-source and one industrial. The test cases generated during the search are clustered based on the reported last executed line and based on the error messages returned, when such error messages were available. The tests were manually evaluated to determine their root causes and to obtain additional information. The article presents a taxonomy of the faults found based on the manual analysis of 415 faults in the eight case studies and proposes a method to support the classification using clustering of the resulting test cases.