Roles with co-occurring smells within a snapshot

Source publication

Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime

Conference Paper

May 2022

Infrastructure as Code is the practice of automating the provisioning, configuration, and orchestration of network nodes using code in which variable values such as configuration parameters, node hostnames, etc. play a central role. Mistakes in these values are an important cause of infrastructure defects and corresponding outages. Ansible, a popul...

An empirical study of task infections in Ansible scripts

Article

Dec 2023
EMPIR SOFTW ENG

Context: Despite being beneficial for managing computing infrastructure at scale, Ansible scripts include security weaknesses, such as hard-coded passwords. Security weaknesses can propagate into tasks, i.e., code constructs used for managing computing infrastructure with Ansible. Propagation of security weaknesses into tasks makes the provisioned infrastructure susceptible to security attacks. A systematic characterization of task infection, i.e., the propagation of security weaknesses into tasks, can aid practitioners and researchers in understanding how security weaknesses propagate into tasks and derive insights for practitioners to develop Ansible scripts securely. Objective: The goal of the paper is to help practitioners and researchers understand how Ansible-managed computing infrastructure is impacted by security weaknesses by conducting an empirical study of task infections in Ansible scripts. Method: We conduct an empirical study where we quantify the frequency of task infections in Ansible scripts. Upon detection of task infections, we apply qualitative analysis to determine task infection categories. We also conduct a survey with 23 practitioners to determine the prevalence and severity of identified task infection categories. With logistic regression analysis, we identify development factors that correlate with the presence of task infections. Results: In all, we identify 1,805 task infections in 27,213 scripts. We identify six task infection categories: anti-virus, continuous integration, data storage, message broker, networking, and virtualization. From our survey, we observe that tasks used to manage data storage infrastructure are perceived to have the most severe consequences. We also find that three development factors, namely age, minor contributors, and scatteredness, correlate with the presence of task infections. Conclusion: Our empirical study shows computing infrastructure managed by Ansible scripts to be impacted by security weaknesses. We conclude the paper by discussing the implications of our findings for practitioners and researchers.

An Introduction to Software Ecosystems

Chapter

May 2023

This chapter defines and presents the kinds of software ecosystems that are targeted in this book. The focus is on the development, tooling and analytics aspects of "software ecosystems", i.e., communities of software developers and the interconnected software components (e.g., projects, libraries, packages, repositories, plug-ins, apps) they are developing and maintaining. The technical and social dependencies between these developers and software components form a socio-technical dependency network, and the dynamics of this network change over time. We classify and provide several examples of such ecosystems, many of which will be explored in further detail in the subsequent chapters of the book. The chapter also introduces and clarifies the relevant terms needed to understand and analyse these ecosystems, as well as the techniques and research methods that can be used to analyse different aspects of these ecosystems.

Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort?

Conference Paper

May 2023

Research on Automated Operation and Maintenance Methods for Power Management Networks Based on Ansible

Conference Paper

Nov 2023

Exploring the Feasibility of ChatGPT for Improving the Quality of Ansible Scripts in Edge-Cloud Infrastructures Through Code Recommendation

Chapter

Jan 2024

Edge-cloud systems aim to reduce the processing time of Big data by bringing massive infrastructures closer to the source of data. Infrastructure as Code (IaC) supports the automatic deployment and management of these infrastructures through reusable code, and Ansible is the most popular IaC tool. As the quality of Ansible scripts directly influences the quality of Edge-cloud systems, many researchers have studied how to improve the quality of Ansible scripts. However, there has yet to be an attempt to leverage the power of ChatGPT. Thus, we explore the feasibility of using ChatGPT to improve the quality of Ansible scripts. Three raters evaluate ChatGPT’s code recommendation ability on 48 code revision cases from 25 Ansible project GitHub repositories, and we analyze the rating results. The results confirm that ChatGPT can recognize and understand Ansible scripts. However, its ability largely depends on how the user formulates the questions, which confirms the need for prompt engineering to obtain stable code recommendation results from ChatGPT.

Exploring LLM-based Automated Repairing of Ansible Script in Edge-Cloud Infrastructures

Article

Dec 2023
J WEB ENG

Edge-Cloud systems require massive infrastructures located closer to the user to minimize latencies in handling Big data. Ansible is one of the most popular Infrastructure as Code (IaC) tools and is crucial for deploying the infrastructures of an Edge-Cloud system. However, Ansible also consists of code, and its code quality is critical in ensuring the delivery of high-quality services within the Edge-Cloud system. Meanwhile, Large Language Models (LLMs) have performed remarkably on various Software Engineering (SE) tasks in recent years. One such task is Automated Program Repairing (APR), where LLMs assist developers by proposing code fixes for identified bugs. Nevertheless, prior studies in LLM-based APR have predominantly concentrated on widely used programming languages (PL), such as Java and C, and there has yet to be an attempt to apply it to Ansible. Hence, we explore the applicability of LLM-based APR to Ansible. We assess the performance of LLMs (ChatGPT and Bard) on 58 Ansible script revision cases from Open Source Software (OSS). Our findings reveal promising prospects, with LLMs generating helpful responses in 70% of the sampled cases. Nonetheless, further research is necessary to fully harness this approach’s potential.

The Docker Hub Image Inheritance Network: Construction and Empirical Insights

Conference Paper

Oct 2023

What Do Infrastructure-as-Code Practitioners Discuss: An Empirical Study on Stack Overflow

Conference Paper

Oct 2023

Characterizing Static Analysis Alerts for Terraform Manifests: An Experience Report

Conference Paper

Oct 2023

While Terraform has gained popularity for implementing the practice of infrastructure as code (IaC), little is known about the characteristics or actionability of static analysis for Terraform manifests. This lack of understanding hinders practitioners from adopting static analysis in their Terraform development process, as happened at Company A, an organization that uses Terraform to create automated software deployment pipelines. In this experience report, we summarize our study of 491 static analysis alerts that occur in 10 open source Terraform repositories and one proprietary Terraform repository. From our analysis, we observed that: (i) 10 categories of static analysis alerts appear in Terraform manifests, of which five are related to security, (ii) the majority of practitioners understand static analysis alert messages and their underlying root causes, (iii) Terraform resources with dependencies have 1.5x - 2.1x more static analysis alerts than resources with no dependencies, and (iv) practitioner perceptions vary from one alert category to another when deciding whether to take action on reported alerts. This paper concludes with recommendations for practitioners, toolsmiths, and researchers on how to use and analyze static analysis alerts of Terraform in an actionable manner.

Infrastructure-as-Code Ecosystems

Chapter

May 2023

Infrastructure as Code (IaC) is the practice of automating the provisioning, configuration, and orchestration of systems onto which software is deployed through scripts in domain-specific languages. With the increasing importance of reliable and repeatable deployments, ecosystems are emerging around online repositories of reusable IaC assets. In this chapter, we study two such ecosystems in detail: the one forming around the Docker Hub repository of reusable Docker images and the one forming around the Ansible Galaxy repository of reusable Ansible roles. We start with an introduction to Docker, the most popular container management tool, and Ansible, the most popular configuration management tool. Although both tools are used to configure machines onto which applications are deployed, they differ fundamentally in the means through which this is achieved. Next, we discuss the Docker Hub and Ansible Galaxy online repositories for reusable Docker images and Ansible roles. Having introduced these emerging ecosystems, we highlight a number of approaches taken by researchers studying them. Subsequently, we survey the state of the art in research on the practices followed by their contributors and users, ranging from the versioning of releases and keeping dependencies up to date to detecting bugs. We conclude with the challenges that researchers face when analyzing these ecosystems.
