Raula Gaikovina Kula

Nara Institute of Science and Technology | NAIST · Graduate School of Information Science

Dr of Eng

About

134
Publications
26,944
Reads
1,574
Citations
Citations since 2016
109 Research Items
1492 Citations
Chart: citations per year, 2016 to 2022 (axis range 0 to 300)
Introduction
The effective and efficient reuse of software assets is extremely important to the success of a software development project. Using techniques from data mining, code search, program analysis, and clone analysis, we explore approaches that give developers tools for the collection, analysis, and evaluation of software assets.

Publications (134)
Preprint
Full-text available
Links are an essential feature of the World Wide Web, and source code repositories are no exception. However, despite their many undisputed benefits, links can suffer from decay, insufficient versioning, and lack of bidirectional traceability. In this paper, we investigate the role of links contained in source code comments from these perspectives....
Article
Full-text available
Security vulnerabilities in third-party dependencies are a growing concern, not only for developers of the affected software but also for the risk they pose to an entire software ecosystem, e.g., the Heartbleed vulnerability. Recent studies show that developers are slow to respond to the threat of vulnerability, sometimes taking four to eleven months to act. T...
Preprint
Full-text available
Context: Code Review (CR) is the cornerstone for software quality assurance and a crucial practice for software development. As CR research matures, it can be difficult to keep track of the best practices and state-of-the-art in methodology, dataset, and metric. Objective: This paper investigates the potential of benchmarking by collecting methodol...
Article
Full-text available
Software plays a central role in modern societies, with its high economic value and potential for advancing societal change. In this paper, we characterise challenges and opportunities for a country progressing towards entering the global software industry, focusing on Papua New Guinea (PNG). By hosting a Software Engineering workshop, we conducted...
Article
Full-text available
Technical Debt is a metaphor used to describe the situation in which long-term software artifact quality is traded for short-term goals in software projects. In recent years, the concept of self-admitted technical debt (SATD) was proposed, which focuses on debt that is intentionally introduced and described by developers. Although prior work has ma...
Preprint
Full-text available
The package manager (PM) is crucial to most technology stacks, acting as a broker to ensure that a verified dependency package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of PMs with various features. While our recent study indicates that package management features of PM are...
Preprint
Full-text available
Contemporary development projects benefit from code review as it improves the quality of a project. Large ecosystems of interdependent projects like OpenStack generate a large number of reviews, which poses new challenges for collaboration (improving patches, fixing defects). Review tools allow developers to link between patches, to indicate patch...
Article
Full-text available
The ability of an Open Source Software (OSS) project to attract, onboard, and retain any newcomer is vital to its livelihood. Although evidence suggests an upsurge in novice developers joining social coding platforms (such as GitHub), the extent to which their activities result in an OSS contribution is unknown. Hence, we execute the protocols...
Preprint
Full-text available
AlphaCode is a code generation system for assisting software developers in solving competitive programming problems using natural language problem descriptions. Despite the advantages of the code generating system, the open source community expressed concerns about practicality and data licensing. However, there is no research investigating generat...
Preprint
Full-text available
An increase in diverse technology stacks and third-party library usage has led developers to inevitably switch technologies. To assist these developers, maintainers have started to release their libraries to multiple technologies, i.e., a cross-ecosystem library. Our goal is to explore the extent to which these cross-ecosystem libraries are intertw...
Preprint
Full-text available
Forking is a common practice for developers when building upon already existing projects. These forks create variants, which have a common code base but then evolve the code in different directions, specific to the requirements of each forked project. An interesting side-effect of having multiple forks is the ability to select between differen...
Preprint
Full-text available
Reliance on third-party libraries is now commonplace in contemporary software engineering. Being open source in nature, these libraries should advocate for a world where the freedoms and opportunities of open source software can be enjoyed by all. Yet, there is a growing concern related to maintainers using their influence to make political stances...
Preprint
Full-text available
The risk of using third-party libraries in a software application is that much-needed maintenance is solely carried out by library maintainers. These libraries may rely on a core team of maintainers (who might be a single maintainer that is unpaid and overworked) to serve a massive client user-base. On the other hand, being open source has the bene...
Preprint
Full-text available
Popular adoption of third-party libraries for contemporary software development has led to the creation of large inter-dependency networks, where sustainability issues of a single library can have widespread network effects. Maintainers of these libraries are often overworked, relying on the contributions of volunteers to sustain these libraries. I...
Preprint
Full-text available
Third-party library dependencies are commonplace in today's software development. With the growing threat of security vulnerabilities, applying security fixes in a timely manner is important to protect software systems. As such, the community developed a list of software and hardware weaknesses known as Common Weakness Enumeration (CWE) to assess vul...
Preprint
Full-text available
Python is known to be a versatile language, well suited both for beginners and advanced users. Some elements of the language are easier to understand than others: some are found in any kind of code, while some others are used only by experienced programmers. The use of these elements leads to different ways to code, depending on the experience with...
Preprint
In the field of data science, and for academics in general, the Python programming language is a popular choice, mainly because of its libraries for storing, manipulating, and gaining insight from data. Evidence includes the versatile set of machine learning, data visualization, and manipulation packages used for the ever-growing size of available...
Article
Full-text available
Discussions is a new feature of GitHub for asking questions or discussing topics outside of specific Issues or Pull Requests. Before being available to all projects in December 2020, it had been tested on selected open source software projects. To understand how developers use this novel feature, how they perceive it, and how it impacts the develop...
Article
Full-text available
Third-party package usage has become a common practice in contemporary software development. Developers often face different challenges, including choosing the right libraries, installation errors, discrepancies, setting up the environment, and build failures during software development. The risks of maintaining a third-party package are well know...
Article
Full-text available
It has become common practice for software projects to adopt third-party dependencies. Developers are encouraged to update any outdated dependency to remain safe from potential threats of vulnerabilities. In this study, we present an approach to help developers determine whether or not vulnerable code is reachable in JavaScript projects. Our prototype...
Article
Full-text available
The widespread adoption of third-party libraries for contemporary software development has led to the creation of large inter-dependency networks, where sustainability issues of a single library can have widespread network effects. Maintainers of these libraries are often overworked, relying on the contributions of volunteers to sustain these libra...
Article
Full-text available
Although many software development projects have moved their developer discussion forums to generic platforms such as Stack Overflow, Eclipse has been steadfast in hosting their self-supported community forums. While recent studies show forums share similarities to generic communication channels, it is unknown how project-specific forums are utiliz...
Article
Full-text available
Traceability between published scientific breakthroughs and their implementation is essential, especially in the case of open-source scientific software which implements bleeding-edge science in its code. However, aligning the link between GitHub repositories and academic papers can prove difficult, and the current practice of establishing and main...
Preprint
Full-text available
It has become common practice for software projects to adopt third-party dependencies. Developers are encouraged to update any outdated dependency to remain safe from potential threats of vulnerabilities. In this study, we present an approach to help developers determine whether or not vulnerable code is reachable in JavaScript projects. Our prototype...
Preprint
Full-text available
Third-party package usage has become a common practice in contemporary software development. Developers often face different challenges, including choosing the right libraries, installation errors, discrepancies, setting up the environment, and build failures during software development. The risks of maintaining a third-party package are well know...
Conference Paper
Full-text available
The management of third-party package dependencies is crucial to most technology stacks, with package managers acting as brokers to ensure that a verified package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of package ecosystems with their own management features. While recent...
Article
Full-text available
Code reviews serve as a quality assurance activity for software teams. Especially for Modern Code Review, sharing a link during a review discussion serves as an effective awareness mechanism where "Code reviews are good FYIs [for your information]." Although prior work has explored link sharing and the information needs of a code review, the exten...
Preprint
Full-text available
Context: Open source software development has become more social and collaborative, especially with the rise of social coding platforms like GitHub. Since 2016, GitHub has supported more informal methods such as emoji reactions, with the goal of reducing commenting noise when reviewing any code changes to a repository. Interestingly, preliminary...
Preprint
Full-text available
The management of third-party package dependencies is crucial to most technology stacks, with package managers acting as brokers to ensure that a verified package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of package ecosystems with their own management features. While recent...
Preprint
Full-text available
Although many software development projects have moved their developer discussion forums to generic platforms such as Stack Overflow, Eclipse has been steadfast in hosting their self-supported community forums. While recent studies show forums share similarities to generic communication channels, it is unknown how project-specific forums are utiliz...
Preprint
Full-text available
The Node.js Package Manager (i.e., npm) archive repository serves as a critical part of the JavaScript community and helps support one of the largest developer ecosystems in the world. However, as a developer, selecting an appropriate npm package to use or contribute to can be difficult. To understand what features users and contributors consider i...
Preprint
Full-text available
Context: Contemporary code review tools are a popular choice for software quality assurance. Using these tools, reviewers are able to post a linkage between two patches during a review discussion. Large development teams that use a review-then-commit model risk being unaware of these linkages. Objective: Our objective is to first explore how patch...
Article
Full-text available
Context: Contemporary code review tools are a popular choice for software quality assurance. Using these tools, reviewers are able to post a linkage between two patches during a review discussion. Large development teams that use a review-then-commit model risk being unaware of these linkages. Objective: Our objective is to first explore how patch l...
Article
Context: Code Review (CR) is the cornerstone for software quality assurance and a crucial practice for software development. As CR research matures, it can be difficult to keep track of the best practices and state-of-the-art in methodology, dataset, and metric. Objective: This paper investigates the potential of benchmarking by collecting methodolo...
Preprint
Full-text available
Although computer science papers are often accompanied by software artifacts, connecting research papers to their software artifacts and vice versa is not always trivial. First of all, there is a lack of well-accepted standards for how such links should be provided. Furthermore, the provided links, if any, often become outdated: they are affected b...
Preprint
Full-text available
Context: Open Source Software (OSS) projects rely on a continuous stream of new contributors for sustainable livelihood. Recent studies reported that new contributors experience many barriers in their first contribution. One of the critical barriers is the social barrier. Although a number of studies investigated the social barriers to new contribu...
Preprint
Full-text available
Code Review plays a crucial role in software quality, by allowing reviewers to discuss and critique any new patches before they can be successfully integrated into the project code. Yet it is unclear to what extent coding patterns (i.e., repetitive code) change between when a patch is first submitted and when the decision is made (i.e., during th...
Article
Full-text available
Contemporary software development is distributed and characterized by high dynamics with continuous and frequent changes to fix defects, add new user requirements or adapt to other environmental changes. To manage such changes and ensure software quality, modern code review is broadly adopted as a common and effective practice. Yet several open-sou...
Preprint
Full-text available
Technical Debt is a metaphor used to describe the situation in which long-term code quality is traded for short-term goals in software projects. In recent years, the concept of self-admitted technical debt (SATD) was proposed, which focuses on debt that is intentionally introduced and described by developers. Although prior work has made important...
Preprint
Full-text available
Online collaboration platforms such as GitHub have provided software developers with the ability to easily reuse and share code between repositories. With clone-and-own and forking becoming prevalent, maintaining these shared files is important, especially for keeping the most up-to-date version of reused code. Unlike related work, we propose...
Preprint
Full-text available
Discussions is a new feature of GitHub for asking questions or discussing topics outside of specific Issues or Pull Requests. Before being available to all projects in December 2020, it had been tested on selected open source software projects. To understand how developers use this novel feature, how they perceive it, and how it impacts the develop...
Preprint
Full-text available
In 2018, the software industry giant Microsoft made a move into the Open Source world by completing the acquisition of the mega Open Source platform GitHub. This acquisition was not without controversy, as it is well known that the free software communities value not only the ability to use software freely, but also the libre nature in Open Source...
Preprint
The ability of an Open Source Software (OSS) project to attract, onboard, and retain any newcomer is vital to its livelihood. Evidence suggests more new users are joining GitHub; however, the extent to which they contribute to OSS projects is unknown. In this study, we coin the term newcomer candidate to describe a novice developer that is a new u...
Conference Paper
Full-text available
Modern code review (MCR) is now broadly adopted as an established and effective software quality assurance practice, with an increasing number of open-source as well as commercial software projects identifying code review as a crucial practice. During the MCR process, developers review, provide constructive feedback, and/or critique each other's p...
Article
The popularity of Open Source Software (OSS) is at an all-time high, and for it to remain so, it is vital for new developers to continually join and contribute to the OSS community. In this paper, to better understand the first-time contributor, we study the characteristics of the first pull request (PR) made to an OSS project by developers. We mine Gi...
Preprint
Full-text available
Pythonic code is idiomatic code that follows guiding principles and practices within the Python community. Offering performance and readability benefits, Pythonic code is claimed to be widely adopted by experienced Python developers, but can present a learning curve for novice programmers. To aid with Pythonic learning, we create an automated tool, calle...