Audris Mockus’s research while affiliated with Vilnius University and other places


Publications (246)


Moving Faster and Reducing Risk: Using LLMs in Release Deployment
  • Preprint
  • File available

October 2024 · 7 Reads

Rui Abreu · Vijayaraghavan Murali · Peter C Rigby · [...] · Nachiappan Nagappan

Release engineering has traditionally focused on continuously delivering features and bug fixes to users, but at a certain scale, it becomes impossible for a release engineering team to determine what should be released. At Meta's scale, the responsibility appropriately and necessarily falls back on the engineer writing and reviewing the code. To address this challenge, we developed models of diff risk scores (DRS) to determine how likely a diff is to cause a SEV, i.e., a severe fault that impacts end-users. Assuming that SEVs are only caused by diffs, a naive model could randomly gate X% of diffs from landing, which would automatically catch X% of SEVs on average. However, we aimed to build a model that can capture Y% of SEVs by gating X% of diffs, where Y >> X. By training the model on historical data on diffs that have caused SEVs in the past, we can predict the riskiness of an outgoing diff to cause a SEV. Diffs that are beyond a particular threshold of risk can then be gated. We have four types of gating: no gating (green), weekend gating (weekend), medium impact on end-users (yellow), and high impact on end-users (red). The input parameter for our models is the level of gating, and the outcome measure is the number of captured SEVs. Our research approaches include a logistic regression model, a BERT-based model, and generative LLMs. Our baseline regression model captures 18.7%, 27.9%, and 84.6% of SEVs while respectively gating the top 5% (weekend), 10% (yellow), and 50% (red) of risky diffs. The BERT-based model, StarBERT, only captures 0.61x, 0.85x, and 0.81x as many SEVs as the logistic regression for the weekend, yellow, and red gating zones, respectively. The generative LLMs, iCodeLlama-34B and iDiffLlama-13B, when risk-aligned, capture more SEVs than the logistic regression model in production: 1.40x, 1.52x, 1.05x, respectively.
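The gating mechanism described above lends itself to a small illustration. The sketch below is a hypothetical, minimal version written under assumptions (synthetic features, scikit-learn's LogisticRegression, and percentile cutoffs derived from the fractions quoted in the abstract); the paper does not disclose Meta's actual features, models, or thresholds beyond naming logistic regression, StarBERT, and the Llama variants.

# Minimal, hypothetical sketch of diff-risk-score (DRS) gating, not Meta's
# implementation: train a logistic regression on historical diffs labeled by
# whether they caused a SEV, then gate diffs whose predicted risk falls in
# the top X% for the selected gating level (5% weekend, 10% yellow, 50% red).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic per-diff features (e.g., size of change, files touched,
# author history) and a binary label: did this diff cause a SEV?
X = rng.normal(size=(10_000, 3))
y = (X @ np.array([0.8, 0.5, 0.3]) + rng.normal(size=10_000) > 2.0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
risk = model.predict_proba(X)[:, 1]  # diff risk score in [0, 1]

# Translate each gating level's "top X% of risky diffs" into a score cutoff.
gate_fraction = {"weekend": 0.05, "yellow": 0.10, "red": 0.50}
cutoff = {level: np.quantile(risk, 1 - frac) for level, frac in gate_fraction.items()}

def should_gate(score: float, level: str) -> bool:
    """Gate a diff when its risk score exceeds the cutoff for the chosen level."""
    return level != "green" and score >= cutoff[level]

print(should_gate(risk[0], "yellow"))

The key design point mirrored here is that the operator selects the gating level, and that level alone determines what fraction of the riskiest diffs is held back.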


OSS License Identification at Scale: A Comprehensive Dataset Using World of Code

September 2024 · 11 Reads

The proliferation of open source software (OSS) has led to a complex landscape of licensing practices, making accurate license identification crucial for legal and compliance purposes. This study presents a comprehensive analysis of OSS licenses using the World of Code (WoC) infrastructure. We employ an exhaustive approach, scanning all files containing "license" in their filepath, and apply the winnowing algorithm for robust text matching. Our method identifies and matches over 5.5 million distinct license blobs across millions of OSS projects, creating a detailed project-to-license (P2L) map. We verify the accuracy of our approach through stratified sampling and manual review, achieving a final accuracy of 92.08%, with precision of 87.14%, recall of 95.45%, and an F1 score of 91.11%. This work enhances the understanding of OSS licensing practices and provides a valuable resource for developers, researchers, and legal professionals. Future work will expand the scope of license detection to include code files and references to licenses in project documentation.
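For context, the winnowing algorithm mentioned above selects a robust subset of k-gram hashes as a document fingerprint. The following sketch illustrates the general idea on license texts; the k-gram size, window size, hash function, and normalization are illustrative guesses, not the parameters used in the study.

# Minimal sketch of winnowing fingerprinting (Schleimer et al.), the kind of
# robust text matching applied to license texts. Parameter choices below are
# illustrative, not the study's.
import hashlib

def kgram_hashes(text: str, k: int = 5) -> list[int]:
    """Hash every k-gram of the normalized (lowercased, whitespace-free) text."""
    norm = "".join(text.lower().split())
    return [
        int.from_bytes(hashlib.sha1(norm[i:i + k].encode()).digest()[:8], "big")
        for i in range(len(norm) - k + 1)
    ]

def winnow(hashes: list[int], w: int = 4) -> set[int]:
    """Keep the minimum hash of every window of w consecutive k-gram hashes."""
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two fingerprint sets."""
    fa, fb = winnow(kgram_hashes(a)), winnow(kgram_hashes(b))
    return len(fa & fb) / len(fa | fb) if fa | fb else 0.0

print(similarity("Permission is hereby granted, free of charge ...",
                 "Permission is hereby granted free of charge ..."))

Two license blobs whose fingerprint sets overlap heavily can then be mapped to the same canonical license despite formatting or whitespace differences.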


Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development

September 2024 · 24 Reads

In open source software, any project's resources are open for reuse, either by introducing a dependency or by copying the resource itself. In contrast to dependency-based reuse, the infrastructure to systematically support copy-based reuse appears to be entirely missing. Our aim is to enable future research and tool development that increase the efficiency and reduce the risks of copy-based reuse. We seek a better understanding of such reuse by measuring its prevalence and identifying the factors that affect the propensity to reuse. To identify reused artifacts and trace their origins, our method exploits the World of Code infrastructure. We begin with a set of theory-derived factors related to the propensity to reuse, sample instances of different reuse types, and survey developers to better understand their intentions. Our results indicate that copy-based reuse is common, with many developers being aware of it when writing code. The propensity for a file to be reused varies greatly among languages and between source code and binary files, consistently decreasing over time. Files introduced by popular projects are more likely to be reused, but at least half of reused resources originate from "small" and "medium" projects. Developers had various reasons for reuse but were generally positive about using a package manager.
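As a rough illustration of how copied artifacts can be traced: identical file contents yield identical git blob hashes, so a blob that appears in several unrelated projects is a candidate instance of copy-based reuse. The sketch below uses a toy in-memory dataset; it is not the World of Code blob-to-project maps or their query interface.

# Minimal sketch of detecting copy-based reuse from version histories:
# identical file contents share the same git blob hash, so a blob appearing
# in multiple projects (outside dependency mechanisms) signals copying.
# The records below are hypothetical, not actual WoC data.
import hashlib
from collections import defaultdict

def git_blob_hash(content: bytes) -> str:
    """SHA-1 of a git blob object ("blob <len>\\0<content>")."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

# Hypothetical (project, path, content) records harvested from commits.
records = [
    ("projA", "src/util.c", b"int add(int a, int b) { return a + b; }\n"),
    ("projB", "lib/copied_util.c", b"int add(int a, int b) { return a + b; }\n"),
    ("projC", "main.c", b"int main(void) { return 0; }\n"),
]

blob_to_projects = defaultdict(set)
for project, path, content in records:
    blob_to_projects[git_blob_hash(content)].add(project)

# A blob found in more than one project is a reuse candidate; the earliest
# commit introducing it would identify the likely origin.
for blob, projects in blob_to_projects.items():
    if len(projects) > 1:
        print(blob[:10], sorted(projects))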


Figures: (1) study sample (n=249) distribution by sex, age, BMI group, and PMI; (2) TDS vs. PMI; (3) TDS vs. natural-log-transformed PMI; (4) TDS vs. ADD; (5) TDS vs. natural-log-transformed ADD.

Identifying Factors to Help Improve Existing Decomposition-Based PMI Estimation Methods

August 2024 · 13 Reads

Accurately assessing the postmortem interval (PMI) is an important task in forensic science. Some existing techniques rely on regression models that use a decomposition score to predict the PMI or accumulated degree days (ADD); however, the published formulas are based on very small samples and their accuracy is low. With the advent of Big Data, much larger samples can be used to improve PMI estimation methods. We therefore aim to investigate ways to improve PMI prediction accuracy by (a) using a much larger sample size, (b) employing more advanced linear models, and (c) enhancing the models with factors known to affect the human decay process. Specifically, this study involved the curation of a sample of 249 human subjects from a large-scale decomposition dataset, followed by evaluating pre-existing PMI/ADD formulas and fitting increasingly sophisticated models to estimate the PMI/ADD. Results showed that including the total decomposition score (TDS), demographic factors (age, biological sex, and BMI), and weather-related factors (season of discovery, temperature history, and humidity history) increased the accuracy of the PMI/ADD models. Furthermore, the best-performing PMI estimation model, using the TDS, demographic, and weather-related features as predictors, achieved an adjusted R-squared of 0.34 and an RMSE of 0.95. It had a 7% lower RMSE than a model using only the TDS to predict the PMI and a 48% lower RMSE than the pre-existing PMI formula. The best ADD estimation model, also using the TDS, demographic, and weather-related features as predictors, achieved an adjusted R-squared of 0.52 and an RMSE of 0.89. It had an 11% lower RMSE than the model using only the TDS to predict the ADD and a 52% lower RMSE than the pre-existing ADD formula. This work demonstrates the need, and a way, to incorporate demographic and environmental factors into PMI/ADD estimation models.
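To make the modeling setup concrete, here is a hedged sketch of fitting an ordinary least-squares model of log-transformed ADD on TDS plus demographic and weather covariates with statsmodels. The data are synthetic and the variable names are illustrative; the study's actual sample, transformations, and coefficients are not reproduced.

# Minimal sketch of the style of model the study fits: OLS regression of
# log-transformed ADD on the total decomposition score (TDS) plus demographic
# and weather covariates. Synthetic data; column names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 249
df = pd.DataFrame({
    "tds": rng.integers(3, 36, n),                        # total decomposition score
    "age": rng.integers(20, 90, n),
    "sex": rng.choice(["M", "F"], n),
    "bmi": rng.normal(27, 5, n),
    "season": rng.choice(["winter", "spring", "summer", "fall"], n),
    "mean_temp": rng.normal(18, 8, n),
    "mean_humidity": rng.normal(65, 15, n),
})
# Synthetic response: log(ADD) loosely driven by TDS and temperature.
df["log_add"] = 0.1 * df["tds"] + 0.02 * df["mean_temp"] + rng.normal(0, 0.9, n)

model = smf.ols(
    "log_add ~ tds + age + C(sex) + bmi + C(season) + mean_temp + mean_humidity",
    data=df,
).fit()
print(model.rsquared_adj)                  # compare against a TDS-only baseline
print(np.sqrt(np.mean(model.resid ** 2)))  # in-sample RMSE on the log scale

Comparing this full model's adjusted R-squared and RMSE against a TDS-only baseline mirrors the comparisons reported in the abstract.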


Towards Automation of Human Stage of Decay Identification: An Artificial Intelligence Approach

August 2024 · 11 Reads

Determining the stage of decomposition (SOD) is crucial for estimating the postmortem interval and identifying human remains. Currently, labor-intensive manual scoring methods are used for this purpose, but they are subjective and do not scale for the emerging large-scale archival collections of human decomposition photos. This study explores the feasibility of automating two common human decomposition scoring methods proposed by Megyesi and Gelderman using artificial intelligence (AI). We evaluated two popular deep learning models, Inception V3 and Xception, by training them on a large dataset of human decomposition images to classify the SOD for different anatomical regions, including the head, torso, and limbs. Additionally, an interrater study was conducted to assess the reliability of the AI models compared to human forensic examiners for SOD identification. The Xception model achieved the best classification performance, with macro-averaged F1 scores of .878, .881, and .702 for the head, torso, and limbs when predicting Megyesi's SODs, and .872, .875, and .76 for the head, torso, and limbs when predicting Gelderman's SODs. The interrater study results supported AI's ability to determine the SOD at a reliability level comparable to a human expert. This work demonstrates the potential of AI models trained on a large dataset of human decomposition images to automate SOD identification.
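A transfer-learning setup along the lines described (an ImageNet-pretrained Xception backbone with a new classification head) can be sketched in Keras as follows. The directory layout, class count, and training schedule are assumptions for illustration; the decomposition photo dataset itself is not public, and the study's exact training configuration is not reproduced here.

# Minimal sketch of fine-tuning Xception for stage-of-decay (SOD)
# classification with Keras. Directory layout, number of classes, and
# training settings are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5          # hypothetical: one class per SOD level for one body region
IMG_SIZE = (299, 299)    # Xception's default input resolution

train_ds = tf.keras.utils.image_dataset_from_directory(
    "sod_photos/head/train", image_size=IMG_SIZE, batch_size=32)

base = tf.keras.applications.Xception(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = False   # first stage: train only the new classification head

inputs = layers.Input(shape=IMG_SIZE + (3,))
x = tf.keras.applications.xception.preprocess_input(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = models.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)

Per-region models (head, torso, limbs) could then be evaluated with a macro-averaged F1 score, the metric reported above.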




ICPUTRD: Image Cloud Platform for use in tagging and research on decomposition

December 2023 · 1 Read · 2 Citations

Journal of Forensic Sciences

Human decomposition studies aim to understand the various factors influencing human decay in order to assess the deceased and develop postmortem interval (PMI) estimation methods. These types of studies are typically conducted through physical experiments examining the deceased; however, big data systems have the potential to transform how large-scale forensic anthropology research questions can be addressed with curated images of donors with known demographic, climatic, and postmortem historical data. This study introduces ICPUTRD (Image Cloud Platform for Use in Tagging and Research on Decomposition), a web-based software system that enables forensic scientists to easily access, enhance (or curate), and analyze very large photographic collections documenting the longitudinal process of human decomposition. ICPUTRD, a JavaScript-based application, was designed and built through a combination of the Waterfall and Agile software development life-cycle methods and provides image search and tagging features with a predefined nomenclature of forensic-related keywords. To evaluate the system, a user study was conducted involving 27 participants who completed pre- and post-study surveys and three research tasks. Analysis of the study results confirmed the feasibility and practicality of ICPUTRD to facilitate aspects of forensic research and casework involving large collections of digital photographs of human decomposition. It was observed that the nomenclature lacked certain law enforcement keywords, so future work will focus on expanding it to ensure ICPUTRD is suited for all its intended users.




Citations (53)


... The identification and analysis of these licenses are crucial for understanding the legal frameworks that govern OSS distribution and usage. Previous studies have explored specific subsets of OSS licensing; however, a comprehensive analysis across the entire open source landscape has been lacking [Reid and Mockus 2023]. In this study, we present a comprehensive dataset of open source projects and their licenses, utilizing the World of Code (WoC) infrastructure to perform an exhaustive scan of the OSS landscape. ...

Reference:

OSS License Identification at Scale: A Comprehensive Dataset Using World of Code
Applying the Universal Version History Concept to Help De-Risk Copy-Based Code Reuse
  • Citing Conference Paper
  • October 2023

... The decomposition scoring was performed by a forensic anthropologist who was well-trained in Gelderman et al.'s [5] scoring method. Specifically, the study sample photos (i.e., photos of the head/neck, torso, and limbs per subject) were assessed and scored on an in-house developed data visualization and annotation software called ICPUTRD (Image Cloud Platform for Use in Tagging and Research on Decomposition) [10]. Once the decomposition scoring was completed, the TDS was calculated for each subject in the study sample. ...

ICPUTRD: Image Cloud Platform for use in tagging and research on decomposition
  • Citing Article
  • December 2023

Journal of Forensic Sciences

... (4) Organizers can use a formative research methodology to improve their hackathon practices, considering effectiveness, efficiency, and appeal as evaluation dimensions [14], [17], [18], [60], [61], [83]-[91]. On hackathon goals and how to achieve them: future research should expand studies beyond the immediate outputs of hackathons and consider short-, mid-, and long-term impacts while developing instruments to assess actual impact rather than relying solely on individuals' perceptions. Attention should be paid to studying the goals of individuals involved in preparing and running hackathons, including goal alignment and considering goals beyond those of organizers and participants. ...

One-off events? An empirical study of hackathon code creation and reuse

Empirical Software Engineering

... To track the vulnerabilities in independent projects and complex SSC, existing works have investigated vulnerabilities from the following aspects. First, to understand vulnerability sources in SSC, existing works have shown that vulnerabilities can come from the latest but vulnerable dependencies (Gkortzis et al. 2021), outdated dependencies (Cox et al. 2015; Wang et al. 2020; Xia et al. 2013), and nonstandard or illegal code reuse (Reid et al. 2022; Wyss et al. 2022). More specifically, Gkortzis et al. (2021) found a strong correlation between the number of dependencies and the number of vulnerabilities. ...

The extent of orphan vulnerabilities from code reuse in open source software
  • Citing Conference Paper
  • July 2022

... Research has revealed that developers have different needs when using DL frameworks to perform different tasks [39]. To satisfy developers' diverse needs, substantial packages have been released by DL enthusiasts in the Python community [29]. ...

On the Variability of Software Engineering Needs for Deep Learning: Stages, Trends, and Application Types
  • Citing Article
  • January 2022

IEEE Transactions on Software Engineering

... [24] Believed to be coined during an OpenBSD cryptographic development event in 1999 [13], the meaning and nature of these events have developed from early ad-hoc exploratory programming sessions to represent modern innovation events that offer new opportunities for cooperative research and scientific discovery. Growing in both popularity and success, hackathons foster learning, drive community engagement, increase networking and relationship-building, and are effective for addressing civic, environmental, and public health issues, leading to increased adoption across various fields from higher education to healthcare to business services [8] [24]. ...

The Secret Life of Hackathon Code - Where does it come from and where does it go?

... multiple days apart) with each including one or more unlabeled body parts. Therefore, we first identify sequences of similar images in our data using an unsupervised approach [16] and ask a domain user to annotate only one image for each sequence. We then merge the user-supplied annotation with CAM-based pseudo annotations generated for the other images of that sequence. ...

SChISM: Semantic Clustering via Image Sequence Merging for Images of Human-Decomposition
  • Citing Conference Paper
  • January 2021

... Several researchers have extracted concepts from artifacts such as source code but hardly attempted to link concepts to developers. One such prior work that focuses on the developer-centric concept is by Dey et al. [14] who analyzed developer familiarity specific to using third-party libraries in source code. Their approach thus describes a skill set in lower-level libraries. ...

Replication Package for Representation of Developer Expertise in Open Source Software
  • Citing Conference Paper
  • May 2021

... World of Code (WoC) is an infrastructure designed to cross-reference source code changes across the entire FLOSS community, enabling sampling, measurement, and analysis across software ecosystems [Ma et al. 2019, 2021]. It functions as a software analysis pipeline, handling data discovery, retrieval, storage, updates, and transformations for downstream tasks [Ma et al. 2021]. ...

World of code: enabling a research workflow for mining and analyzing the universe of open source VCS data

Empirical Software Engineering