-
ABSTRACT: Quality models for software products and processes help both developers and users to better understand their characteristics. In the specific case of libre (free, open source) software, the availability of a mature and reliable development community is an important factor to be considered, since in most cases both the evolvability and future fitness of the product depend on it. Up to now, most of the quality models for communities have been based on manual examination by experts, which is time-consuming, generally inconsistent and often error-prone. In this paper, we propose a methodology for automating a quality model for development communities, together with some examples of how it works in practice. The quality model used is a part of the QualOSS quality model, while the metrics are those collected by the FLOSS Metrics project.
Quality of Information and Communications Technology (QUATIC), 2010 Seventh International Conference on the; 11/2010
-
ABSTRACT: Collaborative projects built around virtual communities on the Internet have gained momentum over the last decade. Nevertheless, their rapid growth rate raises some questions: what is the most effective approach to manage and organize their content creation process? Can these communities scale, controlling their projects as their size continues to grow over time? To answer these questions, we undertake a quantitative analysis of privileged users in FLOSS development projects and in Wikipedia. From our results, we conclude that the inequality level of user contributions in both types of initiatives is remarkably distinct, even though both communities present almost identical patterns regarding the number of distinct contributors per file (in FLOSS projects) or per article (in Wikipedia). As a result, totally open projects like Wikipedia can effectively deal with faster growth rates, while FLOSS projects may be affected by bottlenecks on committers who play critical roles.
System Sciences, 2009. HICSS '09. 42nd Hawaii International Conference on; 02/2009
-
ABSTRACT: Developer turnover can result in a major problem when developing software. When senior developers abandon a software project, they leave a knowledge gap that has to be managed. In addition, new (junior) developers require some time in order to achieve the desired level of productivity. In this paper, we present a methodology to measure the effect of knowledge loss due to developer turnover in software projects. For a given software project, we measure the quantity of code that has been authored by developers who do not belong to the current development team, which we define as orphaned code. We also study how orphaned code is managed by the project. Our methodology is based on the concept of software archaeology, a derivation of software evolution. As case studies we have selected four FLOSS (free, libre, open source software) projects, ranging from purely volunteer-driven to company-supported. The application of our methodology to these case studies gives insight into the turnover that these projects suffer and how they have managed it, and shows that this methodology is worth extending in future research.
System Sciences, 2009. HICSS '09. 42nd Hawaii International Conference on; 02/2009
-
ABSTRACT: Wikipedia is one of the most successful examples of massive collaborative content development. However, many of the mechanisms and procedures that it uses are still unknown in detail. For instance, how equal (or unequal) the contributions to it are has been discussed in recent years, with no conclusive results. In this paper, we study exactly that aspect by using Lorenz curves and Gini coefficients, instruments well known to economists. We analyze the trends in the inequality of distributions for the ten biggest language editions of Wikipedia, and their evolution over time. As a result, we have found large differences in the number of contributions by different authors (something also observed in free, open source software development), and a trend to stable patterns of inequality in the long run.
Hawaii International Conference on System Sciences, Proceedings of the 41st Annual; 02/2008
-
ABSTRACT: Libre (free / open source) software development is a complex phenomenon. Many actors (core developers, casual contributors, bug reporters, patch submitters, users, etc.), in many cases volunteers, interact in complex patterns without the constraints of formal hierarchical structures or organizational ties. Understanding this complex behavior with enough detail to build explanatory models suitable for prediction is an open challenge, and few results have been published to date in this area. Therefore, statistical, non-explanatory models (such as the traditional regression model) have a clear role, and have been used in some evolution studies. Our proposal goes in this direction, but using a model that we have found more useful: time series analysis. Data available from the source code management repository is used to compute the size of the software over its past life, and this information is then used to estimate the future evolution of the project. In this paper we present this methodology and apply it to three large projects, showing how in these cases predictions are more accurate than those of regression models, and precise enough to estimate their near-future evolution with little error.
Software Maintenance, 2007. ICSM 2007. IEEE International Conference on; 11/2007
-
ABSTRACT: Software growth (and more broadly, software evolution) is usually considered in terms of size or complexity of source code. However, different studies usually use different metrics, which makes it difficult to compare approaches and results. In addition, not all metrics are equally easy to calculate for a given source code, which leads to the question of which one is the easiest to calculate without losing too much information. To address both issues, in this paper we present a comprehensive study, based on the analysis of about 700,000 C source code files, calculating several size and complexity metrics for all of them. For this sample, we have found double Pareto statistical distributions for all metrics considered, and a high correlation between any two of them. This would imply that any model addressing software growth should produce these Pareto distributions, and that analysis based on any of the considered metrics should show a similar pattern, provided the sample of files considered is large enough.
Mining Software Repositories, 2007. ICSE Workshops MSR '07. Fourth International Workshop on; 06/2007
-
ABSTRACT: During 2003, the Mozilla project transitioned from company-promoted (sponsored by AOL) to community-promoted (sponsored by the Mozilla Foundation). What happened to the group of developers during this transition? Was there any significant impact on its activity or composition? To answer these questions, we have performed an analysis of the CVS repository of Mozilla, using the CVSAnalY tool, finding little impact on activity, but dramatic changes in the composition of the development team.
Mining Software Repositories, 2007. ICSE Workshops MSR '07. Fourth International Workshop on; 06/2007
-
ABSTRACT: In order to predict the number of changes in the following months for the Eclipse project, we have applied a statistical (non-explanatory) model based on time series analysis. We have obtained the monthly number of changes in the CVS repository of Eclipse, using the CVSAnalY tool. The input to our model was the filtered series of the number of changes per month, and the output was the number of changes per month for the next three months. We then aggregated the results of the three months to obtain the total number of changes in the period specified in the challenge.
Mining Software Repositories, 2007. ICSE Workshops MSR '07. Fourth International Workshop on; 06/2007
-
ABSTRACT: There are some concerns in the research community about the suitability of using low-level metrics (such as SLOC, source lines of code) for characterizing the evolution of software, instead of the more traditional higher-level metrics (such as the number of modules or files). This issue has been raised in particular after some studies that suggest that libre (free, open source) software evolves differently than 'traditional' software, and therefore does not conform to Lehman's laws of software evolution. Since those studies on libre software evolution use SLOCs as the base metric, while Lehman's and other traditional studies use modules or files, it is difficult to compare both cases. To overcome this difficulty, and to explore the differences between SLOC and files/modules counts in libre software projects, we have selected a large sample of programs and have calculated both size metrics over time. Our study shows that, for those programs, the evolution patterns are the same whether counting SLOCs or files, and that some patterns not conforming to Lehman's laws are indeed apparent.
Software Maintenance and Reengineering, 2006. CSMR 2006. Proceedings of the 10th European Conference on; 04/2006
-
ABSTRACT: Software evolution research has recently focused on new development paradigms, studying whether laws found in more classic development environments also apply. Previous work has pointed out that at least some laws seem not to be valid for these new environments, and even Lehman has labeled those (so far few) cases as anomalies and has suggested that further research is needed to clarify this issue. In this line, we consider in this paper a large set of libre (free, open source) software systems featuring a large community of users and developers. In particular, we analyze a number of projects found in the literature up to now, including the Linux kernel. For comparison, we include other libre software kernels from the BSD family, and for completeness we consider a wider range of libre software applications. In the case of Linux and the other operating system kernels, we have also studied growth patterns at the subsystem level. We have observed in the studied sample that super-linearity occurs only exceptionally, that many of the systems follow a linear growth pattern and that smooth growth is not that common. These results differ from the ones generally found in classical software evolution studies. Other behaviors and patterns also hint that development in the libre software world could follow different laws than those known, at least in some cases.
Principles of Software Evolution, Eighth International Workshop on; 10/2005
-
ABSTRACT: The concept of source code, understood as the source components used to obtain a binary, ready-to-execute version of a program, currently comprises more than source code written in a programming language. Especially when we move away from systems programming and enter the realm of end-user applications, we find source files with documentation, interface specifications, internationalization and localization modules, multimedia files, etc. All of them are source code in the sense that the developer works directly with them, and the application is built automatically using them as input. This work discusses the relationship between 'classical' source code (usually written in a programming language) and these other files by analyzing a publicly available software versioning repository. Aspects that have been studied include the nature of the software repository, the different mixtures of source code found in several software projects stored in it, the specialization of developers in the different tasks, etc.
Source Code Analysis and Manipulation, 2004. Fourth IEEE International Workshop on; 10/2004
-
ABSTRACT: In 1993, the authors began using 386BSD (which was libre software, or freeware) to teach computer science classes at Madrid's Carlos III University, in Spain. Seven years later, NetBSD and GNU/Linux are the operating systems of choice for several of the university's computer science teaching laboratories.
IEEE Software; 06/2000 · 1.51 Impact Factor