Article

A Survey of Version Control Systems

Authors:

Abstract

Version control has been an essential aspect of any software development project since the early 1980s. In recent years, however, version control has become a common feature embedded in many collaborative software packages, such as word processors, spreadsheets and wikis. In this paper, we explain the common structure of version control systems, provide historical information on their development, and identify future improvements.
... A VCS makes the development process easier and faster by providing the ability to track and control modifications to data over time. A VCS provides many advantages for software developers: it helps to share data between nodes, and each node can be kept up to date with the latest version of the data [9]. Version control systems concern not only the modifications of data, but also the reasoning behind the modifications. ...
... Version control systems concern not only the modifications of data, but also the reasoning behind the modifications. A VCS stores information about which files were modified, when they were modified, who made the modification and what the files contained before the modification [19], and helps developers to know who is working on these files [9]. A VCS thus allows developers to cooperate and work on the same software project at the same time [12]. ...
... In recent years, the distributed repository has gained increased attention because it allows cooperation without the need for a central repository. The past twenty years have seen the emergence of distributed version control systems (DVCS); these systems have many repositories, each of which works independently, while still having a master repository [9]. ...
Article
Version control systems (VCS) are widely applied at software companies as a collaborative tool and to maintain multiple versions of source code and documentation. A VCS is a software tool that manages the development of software projects and provides methods to coordinate and track several developers working together. Collaboration is considered the main purpose of version control systems. Modern VCSs support the parallel development of artifacts using branches and merges. Currently, version control systems follow two approaches to software development: the Centralized Version Control System (CVCS) and the Distributed Version Control System (DVCS). This article introduces the concepts of version control systems, compares them, and presents some criteria to consider when selecting one.
... The version control system then tracks the subsequent versions of the shared content, as well as changes, in order to enable fixing errors made in the revision process, querying past versions, and integrating content from different contributors. Much effort related to version control has been carried out both in research and in applications; see surveys [Altmanninger et al., 2009, Koc and Tansel, 2011]. The prime applications were collaborative document authoring process, computer-aided design, and software development systems. ...
... of documents (covering multiple formats, e.g., text, binary and XML formats) have been proposed both in research [David, 1994, Rusu et al., 2005, Rönnau and Borghoff, 2009, Pandey and Munson, 2013, Thao and Munson, 2011] and in industry [Pilato, 2004, Borghoff, 2009, Koc and Tansel, 2011] for a more detailed survey about practical version control tools. At their basis, general-purpose version control systems are either built on top of line-based differencing algorithms like GNU diff [Myers, 1986], or simply use cryptographic hashing techniques [Chacon, 2009] in order to detect new versions when changes are committed. ...
... Such a system facilitates the resolution of errors that occurred during the revision process, the querying of earlier versions, and the merging of content from different contributors. As summarized in [Koc and Tansel, 2011, Altmanninger et al., 2009], considerable effort related to data version management has been undertaken in both research and applications. The first applications were the collaborative document authoring process, computer-aided design, and software development systems. Currently, powerful version control tools such as Subversion [Pilato, 2004] and Git [Chacon, 2009] efficiently manage very large source code repositories and shared file systems. However, current approaches do not support the management of uncertainty in data. This is the case for uncertainty that results from conflicts. ...
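The excerpts above mention two ways a version control system can detect a new version: line-based differencing and cryptographic hashing. The following is a minimal sketch of both ideas using only Python's standard `difflib` and `hashlib` modules. It is not the implementation of GNU diff, Subversion, or Git, and the function names `content_hash`, `has_changed`, and `line_diff` are hypothetical helpers introduced here for illustration.

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """Return a SHA-1 hex digest of the text (hypothetical helper)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def has_changed(old: str, new: str) -> bool:
    """Hash-based detection: two versions differ if their digests differ."""
    return content_hash(old) != content_hash(new)


def line_diff(old: str, new: str) -> list[str]:
    """Line-based detection: return a unified diff of the two versions.

    An empty list means no line differs between the versions.
    """
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


if __name__ == "__main__":
    v1 = "a\nb"
    v2 = "a\nc"
    print(has_changed(v1, v2))
    print("\n".join(line_diff(v1, v2)))
```

The hash comparison only answers whether anything changed, whereas the line diff also reports which lines changed; a system may combine the two, but that choice is an assumption of this sketch rather than something the excerpts state.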
Thesis
This thesis addresses some fundamental problems inherent to the need for uncertainty handling in multi-source Web applications with structured information, namely uncertain version control in Web-scale collaborative editing platforms, integration of uncertain Web sources under constraints, and truth finding over structured Web sources. Its major contributions are: uncertainty management in version control of tree-structured data using a probabilistic XML model; initial steps towards a probabilistic XML data integration system for uncertain and dependent Web sources; precision measures for location data; and exploration algorithms for an optimal partitioning of the input attribute set during a truth finding process over conflicting Web sources.
... In existing systems that are used for version control purposes, the majority of current version control systems are centralized, meaning they have a single repository and are managed by an authority that can manipulate or tamper with the repository's data. The disadvantages are that clients do not have unlimited oversight of the report or file, and that documents can be erased, controlled, or altered, which can cause havoc in the development process [21]. ...
Article
Version control is an important component of configuration management, and most enterprise-level software uses different tools and technologies to manage the software version control such as CVS, Subversion, or Perforce. Following the success of bitcoin, the first practical application of blockchain, it is being implemented in other fields such as healthcare, supply chains, financial management, real estate, electoral systems, and so on. Blockchain’s core features include decentralization, immutability, and interminability. Most version control repositories are centralized and can be modified by external sources, implying that they are in danger of being corrupted or controlled. In this study, we present the BDA-SCV architecture for implementing a version control system in blockchain technology. Our proposed approach would replace the necessity for a centralized system, with a decentralized approach implemented in the blockchain using distributed file storage, for which we will use the InterPlanetary File System (IPFS), which is a distributed file system. The proof of authority (PoA) consensus algorithm will be used to approve the developer communicating modifications to the private blockchain network; the authority will only provide permission and will not be able to add, edit, or delete code files. For each change, a ledger block will be created with a reference to the file stored in the distributed repository. A block cannot be manipulated once it has been created. Smart contracts will be used to register developers, create blocks, and manage the repository. The suggested model is implemented using the Hyperledger Fabric network, and the developer and authorizer ends are built into the dotnet web application.
... Further optimizations come with automatic triggering of individual development tasks in predefined time periods, usually when computers have low usage. Nowadays, continuous SW integration is a part of more generic continuous engineering relying on digital twins and covering all parts of cyber-physical systems, SW, hardware (HW), mechanical and electrical components, etc. [6,8,11,14,16]. Once an implementation of SW changes is finished and saved to the version control system, individual integration and validation steps are carried out automatically and feedback on the process is provided so that additional implementation changes can be made [11,12,14]. Multiple platform communication problems launched initiatives for a unified communication environment, which resulted in new standards such as x-in-the-loop (XIL) [1,7,10,27]. ...
... Integration scenario. In a version control system, a commit is a version that groups changes to certain files of a project [10]. Given this definition, an integration scenario is defined as a quadruple of commits, which we will call here the base, left, right and merge commits. ...
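The quadruple described in the excerpt above can be sketched as a small data structure. The excerpt only defines the four roles (base, left, right, merge); the `three_way_merge` function below is an added illustration of the textbook three-way merge rule applied to a single value, not the method of the cited paper, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntegrationScenario:
    """Quadruple of commit identifiers (hypothetical representation)."""

    base: str   # common ancestor of the two parents
    left: str   # first parent of the merge
    right: str  # second parent of the merge
    merge: str  # commit that integrates left and right


def three_way_merge(base: str, left: str, right: str) -> Optional[str]:
    """Merge one value three ways; return None to signal a conflict.

    Assumption: this is the standard three-way rule, shown only to
    illustrate why the base commit is part of the scenario.
    """
    if left == right:
        return left       # both sides agree (or neither changed)
    if left == base:
        return right      # only the right side changed
    if right == base:
        return left       # only the left side changed
    return None           # both sides changed differently: conflict


if __name__ == "__main__":
    scenario = IntegrationScenario(base="c0", left="c1", right="c2", merge="c3")
    print(scenario)
    print(three_way_merge("x = 1", "x = 2", "x = 1"))
    print(three_way_merge("x = 1", "x = 2", "x = 3"))
```

Without the base value, the two differing sides in the last call could not be distinguished from a one-sided change, which is why a scenario records the common ancestor as well as the two parents.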
Conference Paper
Software development is increasingly complex, with developers simultaneously working on different parts of the source code to build, maintain, and enhance systems. However, this collaborative nature of development can lead to conflicts when multiple individuals attempt to simultaneously modify the same file. In this scenario, code merge tools play a crucial role in detecting and resolving these conflicts. One such tool is CSDiff, a conflict detection and resolution tool, an alternative to the traditional and widespread Diff3. CSDiff stands out by using customizable separators specific to each programming language to help in conflict resolution. In this article, we propose an improvement to the functionality of CSDiff focusing on reducing false positive and false negatives conflicts found when using the original tool. Through an analysis based on Python programs, we compare CSDiff with and without the proposed improvement; we assess the impact on reducing errors presented by the original version of the tool. The results indicate that the proposed improvement not only reduces the number of reported false positive conflicts, leading to a higher proportion of scenarios with correctly resolved conflicts, but also results in a decrease in false negatives when compared to the original tool.
... They have the advantage of storing the entire repository on each user's local computer. This characteristic makes them particularly well-suited for large projects involving numerous independent developers [17]. DVCS also offers faster performance compared to CVCS, as most commands are executed locally without the need for a network connection. ...
Article
There have been several studies on mono- and multi-repository structures and branching strategies. However, most of those studies focused on the basics of repository structures and used a small number of project samples. This paper uses data from more than 50 000 repositories collected from GitHub. The results indicate that: 1) mono-repository projects generally involve smaller teams, with the majority being handled by one or two developers, 2) multi-repository projects often require larger teams, typically consisting of three or more developers, 3) mono-repository projects are favored for shorter durations, with over half of the projects completed within six months, 4) multi-repository projects, on the other hand, have higher usage percentages in longer development periods, suggesting their suitability for more time-consuming endeavors. Examining branching strategies reveals that: 1) the trunk-based approach is commonly used in both mono- and multi-repository projects, 2) GitHub Flow has much wider usage in multi-repository projects than in mono-repository projects. These findings offer valuable insights for developers and project managers in selecting the appropriate repository structure and branching strategy based on project requirements. Understanding team dynamics, project complexity, and desired development periods aids in optimizing collaboration and achieving successful outcomes.
... The work of Walter F. Tichy on RCS [2] presents a deep fundamental insight into technical aspects of SCM systems. Abdullah Uz Tansel et al. give a brief history in their research and build a bridge to today's SCM systems [11]. The paper of Christian Bird et al. describes the reasons why companies deal with various SCM solutions [12]. ...
Article
[Full read](https://elmar-dott.com/publications/expressions-for-scm/) In the last decades, many standards were established to increase productivity during Software Lifecycle Management. All these techniques and methodologies promise a higher success rate in software projects which could affirm themselves in the case the involved protagonists are willing to follow the instances recommended. Semantic Versioning, for example, addresses the information leak between functional changes, BugFixes and compatibility of existing and future releases of artifacts. Diving deeper into the daily craftsmanship of software projects enables us to identify the Source Control Management Systems (SCM) as a big treasure box. Much information can be extracted from these repositories, which are currently ignored for project analyzing. Expressions on SCM Commit Messages represent a new formalism that is both human-readable and machine-processable. Such a standard also forms a bridge between the code base and the requirements management and release management, since these activities are identified by a freely expandable vocabulary in the SCM. Another advantage of this strategy is the clear and compact expressiveness for development teams. A very practical aspect of my proposal is the easy applicability of the presented solution in real software development projects. As with the Semantic Versioning methodology already mentioned, there are no additional technical requirements to be met, since commit messages are a fundamental function of SCM systems. This paper discusses the option to improve data collection for controlling software projects and knowledge sharing in collaborative teams.
... These requirements point towards versioning as used in Revision Control Systems [26]. There are two main approaches to versioning: snapshot and changeset (delta) [16]. The former method stores a complete copy of the data set for each version. ...
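The two storage approaches named in the excerpt above, snapshot and changeset (delta), can be contrasted in a short sketch. The class and function names below are hypothetical, and the delta format (copy and add operations derived from `difflib.SequenceMatcher`) is an assumption chosen for illustration; real revision control systems use their own delta encodings.

```python
import difflib


def make_delta(old: list[str], new: list[str]) -> list[tuple]:
    """Describe `new` as copy/add operations relative to `old`."""
    ops: list[tuple] = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))     # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("add", new[j1:j2]))  # store only the new lines
        # "delete": nothing is emitted, the old lines are simply skipped
    return ops


def apply_delta(old: list[str], delta: list[tuple]) -> list[str]:
    """Rebuild the newer version from the older one and a delta."""
    out: list[str] = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class SnapshotStore:
    """Snapshot approach: keep a complete copy of every version."""

    def __init__(self) -> None:
        self.versions: list[list[str]] = []

    def commit(self, lines: list[str]) -> None:
        self.versions.append(list(lines))

    def checkout(self, n: int) -> list[str]:
        return list(self.versions[n])


class DeltaStore:
    """Changeset approach: keep the first version plus one delta per commit."""

    def __init__(self) -> None:
        self.first: list[str] | None = None
        self.deltas: list[list[tuple]] = []
        self._head: list[str] = []  # cached latest version, used to diff against

    def commit(self, lines: list[str]) -> None:
        if self.first is None:
            self.first = list(lines)
        else:
            self.deltas.append(make_delta(self._head, lines))
        self._head = list(lines)

    def checkout(self, n: int) -> list[str]:
        lines = list(self.first or [])
        for delta in self.deltas[:n]:  # replay the first n deltas
            lines = apply_delta(lines, delta)
        return lines


if __name__ == "__main__":
    history = [["a", "b", "c"], ["a", "x", "c"], ["a", "x", "c", "d"]]
    snapshots, deltas = SnapshotStore(), DeltaStore()
    for version in history:
        snapshots.commit(version)
        deltas.commit(version)
    print(snapshots.checkout(1) == deltas.checkout(1))
```

The trade-off this sketch illustrates is that the snapshot store answers a checkout with a single lookup, while the delta store stores less per version but must replay deltas to reconstruct one.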
Conference Paper
As today's organizational computer networks are ever evolving and becoming more and more complex, finding potential vulnerabilities and conducting security audits has become a crucial element in securing these networks. The first step in auditing a network is reconnaissance by mapping it to get a comprehensive overview of its structure. The growing complexity, however, makes this task increasingly effortful, all the more so because mapping (as opposed to plain scanning) still involves a lot of manual work. Therefore, the concept proposed in this paper automates the scanning and mapping of unknown and non-cooperative computer networks in order to find security weaknesses or verify access controls. It further helps to conduct audits by allowing documented networks to be compared with actual ones and unauthorized network devices to be found, as well as by evaluating access control methods through delta scans. It uses a novel approach of augmenting data from iteratively chained existing scanning tools with context, using genuine analytics modules to allow assessing a network's topology instead of just generating a list of scanned devices. It further contains a visualization model that provides a clear, lucid topology map and a special graph for comparative analysis. The goal is to provide maximum insight with a minimum of a priori knowledge.
Article
Version control systems are powerful tools for managing history information and shaping personal and collaborative processes. While many complex tools exist for software engineering, and basic functionality for capturing versions is often found in collaborative applications such as text editors and design layout tools, these systems are not attuned to the needs and behaviors of creative practitioners within those domains, and fail to support creative practitioners in many others. Through 18 semi-structured interviews across diverse domains of creativity, we investigate how creative practitioners use version histories in their process. With the familiar paradigms and features of software version control as an organizing structure, we discuss how these creative practitioners embrace, challenge, and complicate uses of version histories in four ways: using versions as a palette of materials, providing confidence and freedom to explore, leveraging low-fidelity version capture, and reflecting on and reusing versions across long time scales. We discuss how the themes present across this wide range of mediums and domains can provide insight into future designs and uses of version control systems to support creative process.
Conference Paper
This article examines the benefits of using text animated transitions for navigating in the revision history of textual documents. We propose an animation technique for smoothly transitioning between different text revisions, then present the Diffamation system. Diffamation supports rapid exploration of revision histories by combining text animated transitions with simple navigation and visualization tools. We finally describe a user study showing that smooth text animation allows users to track changes in the evolution of textual documents more effectively than flipping pages.
Conference Paper
Office applications such as OpenOffice and Microsoft Office are widely used to edit the majority of today's business documents: office documents. Usually, version control systems consider office documents as binary objects, thus severely hindering collaborative work. Since XML has become a de-facto standard for office applications, we focus on versioning office documents by structured XML version control approaches. This enables state-of-the-art version control for office documents. A basic prerequisite to XML version control is a diff algorithm, which detects structural changes between XML documents. In this paper, we evaluate state-of-the-art XML diff algorithms w.r.t. their suitability to OpenOffice XML documents and the future OASIS office document standard. It turns out that, due to the specific XML office format, a careful examination of the diff algorithm characteristics is necessary. Therefore, we identify important features for XML diff approaches to handle office documents. We have implemented a first OpenOffice versioning API that can be used in version control systems as a replacement for line-based or binary diffs, which are currently used.
Conference Paper
Version control is a very important activity for high-quality software production. The structure used by version control systems is the same as that used by file systems, but in general the abstraction level at which software developers work considers the file contents and their internal structure, including details such as classes, methods, control blocks and others. Fine-grained version control tools can provide more detailed version control. However, traditional tools and models provide very low flexibility and incur a high deployment cost and impact in software development environments. In this paper, we present a model and a tool that aim to support fine-grained version control activities.
Article
Computer-Aided Software Engineering environments are becoming essential for complex software projects, just as CAD systems have become essential for complex hardware projects. DSEE, the DOMAIN Software Engineering Environment, is a distributed, production quality, software development environment that runs on Apollo workstations. DSEE provides source code control, configuration management, release control, advice management, task management, and user-defined dependency tracking with automatic notification. DSEE incorporates some of the best ideas from existing systems. This paper describes DSEE, contrasts it with other systems, and discusses some of the technical issues involved in the construction of a highly-reliable, safe, efficient, and distributed development environment.
Chapter
This instructive book takes you step by step through ways to track, merge, and manage both open source and commercial software projects with Mercurial, using Windows, Mac OS X, Linux, Solaris, and other systems. Mercurial is the easiest system to learn when it comes to distributed revision control. And it's a very flexible tool that's ideal whether you're a lone programmer working on a small project, or part of a huge team dealing with thousands of files. Mercurial permits a countless variety of development and collaboration methods, and this book offers several concrete suggestions to get you started. This guide will help you:
* Learn the basics of working with a repository, changesets, and revisions
* Merge changes from separate repositories
* Set up Mercurial to work with files on a daily basis, including which ones to track
* Get examples and tools for setting up various workflow models
* Manage a project that's making progress on multiple fronts at once
* Find and fix mistakes by isolating problem sources
* Use hooks to perform actions automatically in response to repository events
* Customize the output of Mercurial
Conference Paper
In recent years, new software development methodologies and styles have become popular. In particular, many applications are being developed in the open-source community by groups of loosely coordinated programmers scattered across the globe. This style of widely distributed collaboration creates a suite of new problems for software development. Instead of being able to knock on the door of a collaborator, all communication between programmers working together on a system must be mediated through the computer. But at the same time, the bandwidth available for communication is dramatically more limited than that available to local collaborators. In this paper, we present a new SCM system called Stellation which is specifically designed to address the limits of current SCM systems, particularly when those systems are applied to large projects developed in a geographically distributed environment. Stellation attempts to enhance communication and collaboration between programmers by providing a mechanism called multidimensionality that allows them to share viewpoints on the structure and organization of the system; by providing a hierarchical branching mechanism that allows the granularity of coordination to be varied for different purposes; and by providing a mechanism for integrating programming language knowledge into the system, allowing it to be used for organizational and coordination purposes.
Article
day on the project, a second may have barely time to dabble in the project enough to keep current, while a third participant may be sent off on an urgent temporary assignment just before finishing a modification. It would be nice if each participant could be abstracted from the vicissitudes of the lives of the others. The system described here provides this abstraction by keeping the "files of the project" in a repository. It gives each participant his or her own copy of them and offers a number of commands to update the copy, to commit changes to the repository, etc. It is akin to some distributed file systems with optimistic concurrency control (see, e.g., Mullender and Tanenbaum[1] or Svobodova[2]), in so far as these are capable of implementing concurrency over a group of files. Its main novelties are its ease of use, its relative simplicity, and the length of the concurrency time span it supports (effectively forever). It is implemented as a simple set of command files ("shell scr