Book

Modernizing Legacy Systems: Software Technologies, Engineering Processes and Business Practices

Authors: Robert C. Seacord, Daniel Plakosh, Grace A. Lewis

Abstract

From the Book: Software systems become legacy systems when they begin to resist modification and evolution. However, the knowledge embodied in legacy systems constitutes a significant corporate asset. Assuming these systems still provide significant business value, they must then be modernized or replaced. This book describes a risk-managed approach to legacy system modernization that applies a knowledge of software technologies and an understanding of engineering processes within a business context.

Audience

Modernizing Legacy Systems: Software Technologies, Engineering Processes and Business Practices should be useful to anyone involved in modernizing a legacy system. As a software engineer, it should help you understand some of the larger business concerns that drive a modernization effort. As a software designer, this book should help you understand the impact of legacy code, coupled with incremental development and deployment practices, on design activities. As a system architect, this book explains the processes and techniques that have failed or succeeded in practice. It should also provide insight into how you can repeat these successes and avoid the failures. As an IT manager, this book explains how technology and business objectives influence the software modernization processes. In particular, it should help you answer the following questions: When and how do I decide whether a modernization or replacement effort is justified? How do I develop an understanding of the legacy system? How do I gain an understanding of, and evaluate the applicability of, information system technologies that can be used in the modernization of my system? When do I involve the stakeholders, and how can I reconcile their conflicting needs? What role does architecture play in legacy system modernization? How can I estimate the cost of a legacy system modernization? How can I evaluate and select a modernization strategy? How can I develop a detailed modernization plan?

Organization and Content

Modernizing Legacy Systems: Software Technologies, Engineering Processes and Business Practices shows how legacy systems can be incrementally modernized. It uses and extends the methods and techniques described in Building Systems from Commercial Components [Wallnau 2001] to draw upon engineering expertise early in the conceptual phase to ensure realistic and comprehensive planning. This book features an extensive case study involving a major modernization effort. The legacy system in this case study consists of nearly 2 million lines of COBOL code developed over 30 years. The system is being replaced with a modern system based on the Java 2 Enterprise Edition (J2EE) architecture. Additional challenges include a requirement to incrementally develop and deploy the system. We look at the strategy used to modernize the system; the use of Enterprise JavaBeans, message-oriented middleware, Java, and other J2EE technologies to produce the modern system; the supporting software engineering processes and techniques; and the resulting system. Chapter 1 of this book provides an introduction to the challenges and practices of software evolution, and Chapter 2 introduces the major case study in the book. Chapter 3 introduces the Risk-Managed Modernization (RMM) approach, which is elaborated in Chapters 4 through 17 and illustrated by the case study.
Throughout Chapters 4 through 17 we provide an activity diagram of RMM as a road map to each chapter. Chapter 18 provides some recommendations to help guide your modernization efforts (although these recommendations cannot be fully appreciated without reading the main body of the book). Throughout this book we use the Unified Modeling Language (UML) to represent architecture drawings and design patterns. A brief introduction to UML is provided in Chapter 6.
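As a rough illustration of the message-oriented middleware mentioned above, the following sketch shows a small JMS producer that a modernized component might use to hand events to a queue still consumed by the legacy side during incremental deployment. It is not code from the book; the JNDI names and queue are assumptions.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

/**
 * Hypothetical adapter that forwards a business event from a modernized
 * component to a queue still drained by the legacy side of the system.
 */
public class LegacyBridgePublisher {

    public void publish(String payload) throws Exception {
        InitialContext jndi = new InitialContext();
        // JNDI names are assumptions; they depend on the application server configuration.
        ConnectionFactory factory = (ConnectionFactory) jndi.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) jndi.lookup("jms/LegacyBridgeQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage(payload);
            producer.send(message); // the legacy side consumes the queue asynchronously
        } finally {
            connection.close();
        }
    }
}
```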
... Application modernisation can be defined as the refactoring, re-purposing or consolidation of legacy software systems in order to align them more closely with current business needs. Reengineering is a kind of software modernisation in which the system quality is improved by means of a systematic process over three stages [2]: reverse engineering, restructuring and forward engineering. Firstly, a reverse engineering stage analyses the existing system and extracts knowledge which is represented at different abstraction levels. ...
... The design process consists of six activities: 1. Problem identification and motivation. 2. Define the objectives of a solution. ...
... Software reengineering is a systematic way of modernising (e.g., a migration) a legacy system [2]. A reengineering process is normally applied in three stages [41], as shown in Figure 2.1. ...
Thesis
Full-text available
Data Engineering is the Computer Science discipline concerned with the principles, techniques, methods and tools that support data management in software development. Data are normally stored in database management systems (e.g. relational, object-oriented or NoSQL), and Data Engineering has so far been mainly focused on relational data, although interest is shifting towards NoSQL databases. In this thesis, we have addressed issues related to some of the main topics of Data Engineering, such as Data Reengineering, Data Reverse Engineering, Data Integration and Data Tooling. More specifically, we have explored the application of Model-Driven Engineering (MDE) in data engineering. MDE emphasizes the systematic use of models to improve software productivity and some aspects of software quality, such as maintainability or interoperability. Model-driven techniques have proven useful not only for developing new software applications but also for the reengineering of legacy systems. Models and metamodels provide a high-level formalism with which to represent artefacts commonly manipulated in the different stages of a software evolution process (e.g., a software migration), while model transformation allows the evolution tasks to be automated. Some approaches and experiences of model-driven software reengineering have recently been presented, but they have focused on the code, while data reengineering aspects have been overlooked. Data reengineering should be covered through three dimensions: schema conversion, data conversion and program conversion. This thesis focuses on the first dimension of data reengineering. We present an MDE-based process for schema conversion whose purpose is to improve the quality of the logical schema in a relational database migration scenario. The proposed approach is organised following the three stages of a software reengineering process: reverse engineering, restructuring and forward engineering. Each stage has been implemented by means of model transformation chains. We have validated our approach through its application to a real, widely-used database. We also provide an assessment of the use of MDE techniques in our implementation. Our process applies not only a technological migration but also a schema conversion that provides data quality improvements. Most modern relational database management systems have the ability to monitor and enforce referential integrity constraints (implemented by foreign keys), but heavily evolved legacy information systems may not make use of this important feature if their design predates its availability. The detection of implicit foreign keys in legacy systems has been a long-term research topic in the database reengineering community, and a variety of different methods have been proposed, based on the analysis of the three kinds of artefacts that form a data-intensive information system, namely schema, application code and data. Our schema conversion process improves the data quality of a system by eliciting foreign keys through schema, data and application code analysis strategies.
Because empirical evidence on eliciting foreign keys in large-scale industrial systems is scarce, and "problems" (case studies) are often carefully selected to fit a particular "solution" (method) rather than the other way around, we also carry out a different approach: we re-engineer a real, complex and mission-critical information system, which leads to a new, manual and complementary analysis consisting of the triangulation of the results obtained in all of our previous analyses. We define several criteria for the acceptance of the candidate foreign keys discovered in the schema, data and application code analysis, and we discuss the final results. The schema conversion process also checks and, if necessary, automatically fixes the database normalisation level. For this task, we have integrated the Concept Explorer tool (ConExp) into our MDE solution in order to identify functional dependencies in a relational database. Building on the knowledge gained from the integration of ConExp, we explore the MDE capabilities for tool interoperability through the creation of a bidirectional bridge between the DB-Main data reengineering tool and the Objectiver requirement engineering tool. As DB-Main offers several alternatives to access its data (API, XML and proprietary formats), we have evaluated different strategies to implement the syntactic mapping (i.e. creating model injectors and extractors). In addition, we have explored the use of QVT Relations to implement the semantic mapping. One of the main challenges in the adoption of MDE in large and complex processes is still the availability of tools that support MDE-based processes integrating manual and automated tasks. Therefore, in this thesis we have also proposed a tool that provides the definition and enactment of MDE migration processes in general, and that supports the execution of our data reengineering process in particular. The lack of IDEs capable of integrating the execution of automated tasks and manual tasks to be performed by developers has motivated the creation of the tool described in this thesis. We have defined a SPEM-based language for defining models that represent migration plans. For each particular migration project, these models are instantiated in order to contain all the information needed for the enactment (e.g. resource paths and transformation tools). Then, these models are enacted by means of a process interpreter which generates Ant scripts to execute the automated tasks and Trac tickets for managing manual tasks with the Mylyn tool. In summary, the aim of our work has been to apply MDE techniques with three different purposes: tackling schema conversion during data reengineering, approaching the integration of a database reengineering tool with other software tools, and building a tool able to automate the development of migration processes. With the first two, we have investigated the benefits of MDE with regard to traditional solutions, while the latter objective addressed how MDE may be useful for developing tools that support software processes.
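The data-analysis strategy for eliciting implicit foreign keys can be illustrated with a small JDBC sketch that tests an inclusion dependency, i.e., whether every value of a candidate foreign-key column also occurs in the referenced key column. Table and column names are placeholders, and this is not the thesis's implementation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Checks one candidate implicit foreign key by testing an inclusion dependency. */
public class InclusionDependencyCheck {

    /**
     * Returns true if every non-null value of child.childColumn also occurs in
     * parent.parentColumn (a necessary condition for a foreign key).
     * All identifiers are illustrative placeholders.
     */
    public static boolean holds(Connection db, String child, String childColumn,
                                String parent, String parentColumn) throws SQLException {
        String sql = "SELECT COUNT(*) FROM " + child + " c "
                   + "WHERE c." + childColumn + " IS NOT NULL "
                   + "AND NOT EXISTS (SELECT 1 FROM " + parent + " p "
                   + "WHERE p." + parentColumn + " = c." + childColumn + ")";
        try (PreparedStatement stmt = db.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            rs.next();
            return rs.getLong(1) == 0; // zero violating rows means the dependency holds
        }
    }
}
```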
... With Object-Oriented Programming (OOP) becoming best practice in industry, and with the need to transform legacy enterprise software (legacy systems) from the 1990s onwards, the importance of Software Re-engineering (SR) has steadily grown [1]. A review of studies on SR methods, tools and techniques shows that they generally focus on: (a) improving the functional and/or non-functional qualities of existing software [2], and (b) the methods and tools used to transform legacy software systems [3,4]. However, the rapid advances in information, communication, data and software technologies in recent years have also affected individual and organizational needs, bringing radical changes with them. ...
... One or more techniques such as abstract syntax trees, parse trees, and graphs can be used for program representation. Depending on requirements and various criteria (abstraction level, target architecture, programming language, etc.), the program transformation activity can include software activities such as program migration and program translation [3,4]. ...
... No; to be "extend"ed from Alpha. 3.2 Re-structuring 3 Alpha "Requirements", "software system" (existing), "software system (target)", "work", "team" 2 ...
Conference Paper
Full-text available
Today, the software life cycle has become shorter, and systems developed with current software methods and techniques have begun to take their place among legacy software systems. Consequently, the importance of Software Re-engineering (SR) continues to grow. In this context, the software process models to be used in SR projects should be independent of business rules, technology and knowledge domain, and should be able to respond to all kinds of software requirements. While agile software development methods and practices are supported, software teams should be able to use the practices, tools and techniques required for SR flexibly, in line with their own needs and experience. The model should be easily adaptable to software systems of different sizes, structures and platforms, and should be scalable, able to evolve from small projects to large ones. A review of the literature shows that no SR framework exists that can address these problem areas. To this end, in our study we developed an SR model that treats SR processes in an integrated manner and is based on the Essence Framework standard. Our initial impression is that the proposed model can contribute to the field of software engineering and that it should be supported with industrial applications and empirical data.
... Although Rani et al. have shown that there has been an increase in classes being commented in Pharo versions over time [4], many key classes still lack comments, and many existing comments have become outdated or inconsistent over time. Several other programming languages show the same symptoms of outdated or missing comments due to rapid project schedules or developer neglect [5]. ...
... This led to a total of 24 classes being evaluated in four different evaluation forms by a total of seven developers. We selected these 24 classes from the Pharo 9 version. We used Google's online survey tool for the evaluation to reduce the Hawthorne effect and to provide an easy way to collect answers. ...
... Therefore, a middle ground must likewise be found between the number of nodes and energy consumption with regard to guaranteeing availability and fault tolerance (260). Further advantages for logistics service providers can arise from this overhaul of the system. ...
... Further advantages for logistics service providers can arise from this overhaul of the system. For example, response times can be reduced, costs saved, processes optimized, or transport times shortened (260). ...
Thesis
Full-text available
There is currently a great deal of hype surrounding blockchain, as countless possible applications are attributed to it. It is therefore natural to apply blockchain to supply chain management and logistics, since many actors are often involved in these processes, which means problems can arise quickly. Shipment tracking is considered in particular, as it is used by many customers and is increasingly taken for granted. However, shipment tracking is the result of a complex process by which a shipment is transported from the sender to the recipient. Therefore, an as-is analysis is first carried out, in which the current situation is examined from different perspectives in order to gain an understanding of the process and of what the providers offer. In addition, customer requirements need to be captured in order to identify previously unsolved problems. As a result of the as-is analysis, four weaknesses are identified that are to be resolved through the use of blockchain. In addition to resolving these weaknesses through the use of blockchain and smart contracts, further application possibilities are elaborated to illustrate an integration. These possible applications serve as the basis for optimizing shipment tracking using blockchain and smart contracts. The optimization is followed by a consideration of the feasibility of the solution. Strengths and weaknesses of the solution are identified, and a prototype is developed to demonstrate the described solution. Finally, based on these findings, a recommendation is given as to whether implementing the solution is worthwhile.
... Part of the reason for such mistakes is also how a software project evolves over time, the challenge in obtaining and maintaining specifications for a software system, and the shared authorship of code, which also evolves over time. This has prompted software practitioners to call the difficult process of software evolution, and the errors that creep in as a result, the legacy crisis [86]. This brings us to the prospect of automated program repair, where a program can heal itself from errors and vulnerabilities! ...
Preprint
Full-text available
Automated program repair is an emerging technology which consists of a suite of techniques to automatically fix bugs or vulnerabilities in programs. In this paper, we present a comprehensive survey of the state of the art in program repair. We first study the different families of techniques used, including search-based repair, constraint-based repair and learning-based repair. We then discuss one of the main challenges in program repair, namely patch overfitting, by distilling a class of techniques which can alleviate patch overfitting. We then discuss classes of program repair tools, applications of program repair, as well as uses of program repair in industry. We conclude the survey with a forward-looking outlook on future usages of program repair, as well as research opportunities arising from work on code from large language models.
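Search-based (generate-and-validate) repair, one of the technique families surveyed, can be sketched as a loop that applies candidate edits and keeps the first one that makes the test suite pass. The Patch and TestSuite types below are hypothetical abstractions, not part of any particular repair tool.

```java
import java.util.List;
import java.util.Optional;

/** Minimal generate-and-validate loop typical of search-based program repair. */
public class GenerateAndValidate {

    /** A candidate edit to the program; hypothetical abstraction. */
    public interface Patch {
        String applyTo(String source);
    }

    /** Runs the project's tests against a patched source; hypothetical abstraction. */
    public interface TestSuite {
        boolean passes(String patchedSource);
    }

    public static Optional<Patch> repair(String buggySource,
                                         List<Patch> candidates,
                                         TestSuite tests) {
        for (Patch patch : candidates) {      // search over the candidate edit space
            String patched = patch.applyTo(buggySource);
            if (tests.passes(patched)) {      // validate against the test oracle
                return Optional.of(patch);    // plausible patch (it may still overfit)
            }
        }
        return Optional.empty();
    }
}
```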
... Studies by Jha et al. [28] and Böhme et al. [7] show that nearly 70% of the cost of most software projects goes to maintenance and evolution. This is informally referred to as the legacy crisis [55]. Empirically, fixing source code defects is known to be easy during the early stages of software development, i.e. when the software system is small in size, functionality and complexity. ...
Thesis
While developing software systems, poor design and implementation choices can negatively affect the maintainability of software systems. There are recurring patterns of poorly designed (fragments of) software systems - these are referred to as design smells. Role-stereotypes indicate generic responsibilities that classes play in system design. Although the concepts of role-stereotypes and design smells are widely divergent, both are significant contributors to the design and maintenance of software systems. This work presents an exploratory study, based on a combination of statistical analysis and unsupervised learning methods, to understand the relation between design smells and role-stereotypes and how this relationship varies across desktop and mobile applications. The study was performed on a dataset consisting of twelve (12) Java projects mined from GitHub. The findings indicate that three (3) out of the six (6) role-stereotypes considered in this study are more prone to design smells. In addition, we found that design smells are more frequent in desktop applications than in mobile applications, especially for the Service Provider and Information Holder role-stereotypes. Based on unsupervised learning, it was observed that some pairs or groups of role-stereotypes are prone to similar types of design smells as compared to others. We believe that this relationship may be associated with the characteristic and collaborative properties of role-stereotypes. Additionally, the aforementioned clustering technique revealed which groups of design smells often co-occur. Specifically, {SpeculativeGenerality, SwissArmyKnife} and {LongParameterList, ClassDataShouldBePrivate} are observed to occur frequently together in desktop and mobile applications. Therefore, this study provides important insights into this previously concealed behaviour of the relation between design smells and role-stereotypes.
... Recent emerging studies [6,5] related to the maintenance of quantum software systems have mostly focused on re-engineering new quantum algorithms within traditional software systems. For example, Pérez-Castillo [6] proposed model-driven re-engineering [25], which allows the migration of classical or legacy systems together with quantum algorithms and the integration of new quantum software during the re-engineering of classical or legacy systems while preserving knowledge. While maintenance effort is a broader dimension, examining maintenance in terms of the technical debt composition of a quantum software system is one direction for understanding how this software is being maintained. ...
Preprint
Quantum computing is a rapidly growing field attracting the interest of both researchers and software developers. Supported by its numerous open-source tools, developers can now build, test, or run their quantum algorithms. Although the maintenance practices for traditional software systems have been extensively studied, the maintenance of quantum software is still a new field of study but a critical part to ensure the quality of a whole quantum computing system. In this work, we set out to investigate the distribution and evolution of technical debts in quantum software and their relationship with fault occurrences. Understanding these problems could guide future quantum development and provide maintenance recommendations for the key areas where quantum software developers and researchers should pay more attention. In this paper, we empirically studied 118 open-source quantum projects, which were selected from GitHub. The projects are categorized into 10 categories. We found that the studied quantum software suffers from the issues of code convention violation, error-handling, and code design. We also observed a statistically significant correlation between code design, redundant code or code convention, and the occurrences of faults in quantum software.
... The layered pattern is one of the most common architectural "tools" used to structure software systems [70,18]. • Architecture reconstruction is important to reengineer legacy systems without losing the domain knowledge embodied in them: legacy software systems embed important knowledge acquired over the years and are of a proven technology, which makes them critical assets for enterprises [34,79,165]. Several billion lines of legacy code exist, yield high maintenance costs, are prone to failures due to a lack of experts and suppliers/vendors [79], and an important portion of them comply with the layered pattern. ...
Preprint
Full-text available
Architectural reconstruction is a reverse engineering activity aiming at recovering the missing design decisions of a system. It can help identify the components within a legacy software application according to the application's architectural pattern, and it is useful for identifying architectural technical debt. We are interested in identifying layers within a layered application, since the layered pattern is one of the most used patterns to structure large systems. Earlier component reconstruction work focusing on that pattern relied on generic component identification criteria, such as cohesion and coupling. Recent work has identified architectural-pattern-specific criteria to identify components within that pattern. However, the architectural-pattern-specific criteria that the layered pattern embodies are loosely defined. In this paper, we present a first systematic literature review (SLR) aiming at inventorying such criteria for layers within legacy applications and grouping them under four principles that embody the fundamental design principles underlying the architectural pattern. We identify six such criteria in the form of design rules. We also perform a second systematic literature review to synthesize the literature on software architecture reconstruction in the light of these criteria. We report those principles, the rules they encompass, their representation, and their usage in software architecture reconstruction.
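One design rule commonly associated with the layered pattern, that a layer should depend only on the layer directly beneath it, can be checked mechanically over a recovered class-dependency graph, as in the following sketch. The layer assignment and dependency map are illustrative inputs, not the paper's algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Flags dependencies violating a strict layering (calls only within or to the layer below). */
public class LayeringCheck {

    /**
     * layerOf maps each class to its layer index (higher index = higher layer);
     * dependsOn maps each class to the classes it uses. Every class is assumed
     * to have a layer assignment in this sketch.
     */
    public static List<String> violations(Map<String, Integer> layerOf,
                                          Map<String, Set<String>> dependsOn) {
        List<String> issues = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : dependsOn.entrySet()) {
            int from = layerOf.get(entry.getKey());
            for (String target : entry.getValue()) {
                int to = layerOf.get(target);
                if (to > from) {
                    issues.add(entry.getKey() + " -> " + target + " (upward call)");
                } else if (from - to > 1) {
                    issues.add(entry.getKey() + " -> " + target + " (layer skipped)");
                }
            }
        }
        return issues;
    }
}
```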
... However, software security defects have increased due to implementation failures regarding secure coding best practices. Software faults that escape into later stages of software development increase the maintenance cost [1], [2]. Also, after application deployment, cyber-attackers will try to detect these coding vulnerabilities and exploit them to achieve their goals. ...
Article
Full-text available
False Positive Alerts (FPA) generated by Static Analysis Tools (SAT) reduce the effectiveness of automatic code review, causing these tools to be underused in practice. Researchers conduct many tests to improve SAT accuracy while keeping the FPA rate low. They use different simulated and production datasets to validate their proposed methods. This paper surveys recent approaches dealing with FPA filtering; it compares them and discusses their usefulness. It also studies the datasets used to validate the identified methods and shows their effectiveness in covering most program defects. This study focuses mainly on the security bugs covered by the datasets and handled by the existing methods.
... the changing technical and business environments. Different kinds of change include bug fixing, capacity enhancement, removal of outdated functions, and performance improvement. Software maintenance is an ongoing activity, and it constitutes about 75% of the total cost involved, especially for large and complex software systems (April & Abran, 2012; Seacord et al., 2003). Developer activities such as reading, navigating, searching, and editing that are involved in the process of software maintenance are directly linked to the structural attributes of the underlying source code of the system (Soh et al., 2016). Moreover, usually, strict deadlines along with various external constra ...
... To address such bugs, developers rely on bug reports to find buggy code files. This process can be time-consuming, depending on the quality of the bug report; it also requires developers to manually search for suspicious code files [1]. An automated recommender of candidate buggy code files can significantly reduce the cost of software maintenance. ...
Article
Full-text available
With the use of increasingly complex software, software bugs are inevitable. Software developers rely on bug reports to identify and fix these issues. In this process, developers inspect suspected buggy source code files, relying heavily on a bug report. This process is often time-consuming and increases the cost of software maintenance. To resolve this problem, we propose a novel bug localization method using topic-based similar commit information. First, the method determines similar topics for a given bug report. Then, it extracts similar bug reports and similar commit information for these topics. To extract similar bug reports on a topic, a similarity measure is calculated for a given bug report. In the process, for a given bug report and source code, features shared by similar source codes are classified and extracted; combining these features improves the method’s performance. The extracted features are presented to the convolutional neural network’s long short-term memory algorithm for model training. Finally, when a bug report is submitted to the model, a suspected buggy source code file is detected and recommended. To evaluate the performance of our method, a baseline performance comparison was conducted using code from open-source projects. Our method exhibits good performance.
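The similarity step described above, matching a bug report against candidate source files, can be approximated with a plain term-frequency cosine similarity as in the sketch below. This is a deliberately simplified stand-in for the paper's topic-based and learned pipeline, with naive tokenization.

```java
import java.util.HashMap;
import java.util.Map;

/** Naive cosine similarity between a bug report and a source file, both given as raw text. */
public class TextSimilarity {

    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum); // count each token occurrence
            }
        }
        return tf;
    }

    public static double cosine(String bugReport, String sourceFile) {
        Map<String, Integer> a = termFrequencies(bugReport);
        Map<String, Integer> b = termFrequencies(sourceFile);
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Ranking all source files by this score against a given bug report yields a crude list of suspected buggy files, the kind of output the proposed method refines with topics, similar commits, and the learned model.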
... To solve this problem, Pérez-Castillo [218] proposed a software modernization approach (model-driven reengineering) [243] to restructure classical systems together with existing or new quantum algorithms, providing target systems that combine both classical and quantum information systems. The software modernization method in classical software engineering has proved to be an effective mechanism that can realize the migration and evolution of software while retaining business knowledge. ...
Preprint
Quantum software plays a critical role in exploiting the full potential of quantum computing systems. As a result, it is drawing increasing attention recently. This paper defines the term "quantum software engineering" and introduces a quantum software life cycle. Based on these, the paper provides a comprehensive survey of the current state of the art in the field and presents the challenges and opportunities that we face. The survey summarizes the technology available in the various phases of the quantum software life cycle, including quantum software requirements analysis, design, implementation, test, and maintenance. It also covers the crucial issue of quantum software reuse.
... Software Modernization, also known as software migration, is the task of rewriting or porting existing legacy software systems to new platforms, architectures or programming languages [60]. Modernization includes improvement of applications (e.g., adding new features), change of programming languages and runtime environments, migrating existing data architectures, and making components reusable [61]. ...
Thesis
Full-text available
Domain modeling is an important model-driven engineering activity, which is typically used in the early stages of software projects. Domain models capture concepts and relationships of respective application fields using a modeling language and domain-specific terms. They are a key factor in achieving shared understanding of the problem area among stakeholders, improving communication in software development, and generating code and software. Domain models are a prerequisite for domain-specific language development and are often implemented in software and data integration and software modernization projects, which constitute the larger part of industrial IT investment. Several studies from recent years have shown that model-driven methods are much more widespread in the industry than previously thought, yet their application is challenging. Creating domain models requires that software engineers have both experience in model-driven engineering and detailed domain knowledge. While the former is one of the general modeling skills, the required domain knowledge varies from project to project. Domain knowledge acquisition is a time-consuming manual process because it requires multidisciplinary collaboration and gathering of information from different groups of people, documents, and other sources of knowledge, and is rarely supported in current modeling environments. Consistent access to large amounts of structured domain knowledge is not possible due to the heterogeneity of formats, access methods, schemas, and semantic representations. Besides, existing suitable knowledge bases were mostly manually created and are therefore not extensive enough. The automated construction of knowledge resources utilizes mainly information extraction approaches that focus on factual knowledge at the instance level and therefore cannot be used for conceptual-level domain modeling. This thesis develops novel methods and tools that provide domain information directly during modeling to reduce the initial effort of using domain modeling and to help software developers create domain models. It works on the connection of the areas of software modeling, knowledge bases, information extraction and recommender systems to automatically acquire conceptual knowledge from structured knowledge sources and unstructured natural language datasets, to transform the aggregated knowledge into appropriate recommendations, and to develop suitable assistance services for modeling environments. With this thesis, the paradigm of Semantic Modeling Support is proposed, the methodological foundation for providing automated modeling assistance. It includes an iterative procedure of model refinement, knowledge acquisition, and element recommendation that allows to query and provide the necessary domain knowledge for a range of support scenarios at each stage of domain model development, keeping the human in the loop. To address the lack of conceptual knowledge resources, new methods are developed to extract conceptual terms and relationships directly from large N-gram text data using syntactic patterns, co-occurrences, and statistical features of text corpora. A large Semantic Network of Related Terms is automatically constructed with nearly 6 million unique one-word terms and multi-word expressions connected with over 355 million weighted binary and ternary relationships. It allows to directly answer top-N queries. 
This thesis introduces an extensible query component with a set of fully connected knowledge bases to uniformly access structured knowledge with well-defined relationships. The developed Mediator-Based Querying Architecture with Generic Templates is responsible for retrieving lexical information from heterogeneous knowledge bases and mapping to modeling language-specific concepts. Furthermore, it is demonstrated how to implement the semantic modeling support strategies by extending the widely used Eclipse Modeling Project. The Domain Modeling Recommender System generates context-sensitive modeling suggestions based on the connected knowledge bases, the semantic network of terms, and an integrated ranking strategy. Finally, this thesis reports on practical experience with the application of the developed methods and tools in three research projects.
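A top-N query over such a weighted network of related terms can be sketched as a lookup in a weighted adjacency map followed by sorting on relationship weight. The in-memory map used here is an assumed stand-in for the thesis's much larger semantic network.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Returns the N terms most strongly related to a query term in a weighted term network. */
public class RelatedTermsQuery {

    public static List<String> topN(Map<String, Map<String, Double>> network,
                                    String term, int n) {
        Map<String, Double> neighbours = network.getOrDefault(term, Map.of());
        return neighbours.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(n)                       // keep only the strongest relationships
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```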
... Software reengineering is defined as an engineering process seeking to generate an evolvable system [120]. Generally, it includes all the activities subsequent to software delivery that aim at improving the understanding of the software as well as enhancing various quality parameters, such as system maintainability and complexity [137]. ...
Thesis
Software technologies are constantly evolving to facilitate the development, deployment, and maintenance of applications in different areas. In parallel, these applications evolve continuously to guarantee an adequate quality of service, and they become more and more complex. Such evolution often involves increased development and maintenance costs, which can become even higher when these applications are deployed in recent execution infrastructures such as the cloud. Nowadays, reducing these costs and improving the quality of applications are main objectives of software engineering. Recently, microservices have emerged as an example of a technology or architectural style that helps to achieve these objectives. While microservices can be used to develop new applications, there are monolithic ones (i.e., monoliths) built as a single unit whose owners (e.g., companies) want to maintain them and deploy them in the cloud. In this case, it is common to consider rewriting these applications from scratch or migrating them towards recent architectural styles. Rewriting an application or migrating it manually can quickly become a long, error-prone, and expensive task. An automatic migration appears as an evident solution. The ultimate aim of our dissertation is to contribute to automating the migration of monolithic Object-Oriented (OO) applications to microservices. This migration consists of two steps: microservice identification and microservice packaging. We focus on microservice identification based on source code analysis. Specifically, we propose two approaches. The first one identifies microservices from the source code of a monolithic OO application, relying on code structure, data accesses, and software architect recommendations. The originality of our approach can be viewed from three aspects. Firstly, microservices are identified based on the evaluation of a well-defined function measuring their quality. This function relies on metrics reflecting the "semantics" of the concept "microservice". Secondly, software architect recommendations are exploited only when they are available. Finally, two algorithmic models have been used to partition the classes of an OO application into microservices: clustering and genetic algorithms. The second approach extracts from OO source code a workflow that can be used as an input to some existing microservice identification approaches. A workflow describes the sequencing of the tasks constituting an application according to two formalisms: control flow and/or data flow. Extracting a workflow from source code requires the ability to map OO concepts into workflow ones. To validate both approaches, we implemented two prototypes and conducted experiments on several case studies. The identified microservices have been evaluated qualitatively and quantitatively. The extracted workflows have been manually evaluated relying on test suites. The obtained results show, respectively, the relevance of the identified microservices and the correctness of the extracted workflows.
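The first approach's core idea, partitioning the classes of a monolith by optimizing a quality function, can be sketched as a greedy grouping driven by a pairwise similarity score. The similarity function below is a placeholder for the dissertation's coupling and data-access metrics, and the threshold is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Greedy grouping of classes into candidate microservices based on a similarity score. */
public class MicroserviceGrouping {

    /**
     * similarity should reflect call coupling and shared data access between two classes
     * (a placeholder for a richer quality function); threshold is an assumed cut-off.
     */
    public static List<List<String>> group(List<String> classes,
                                           BiFunction<String, String, Double> similarity,
                                           double threshold) {
        List<List<String>> services = new ArrayList<>();
        for (String clazz : classes) {
            List<String> best = null;
            double bestScore = threshold;
            for (List<String> service : services) {
                double total = 0;
                for (String member : service) {
                    total += similarity.apply(clazz, member);
                }
                double average = total / service.size();
                if (average >= bestScore) {        // join the most similar existing service
                    bestScore = average;
                    best = service;
                }
            }
            if (best != null) {
                best.add(clazz);
            } else {
                List<String> fresh = new ArrayList<>(); // otherwise start a new service
                fresh.add(clazz);
                services.add(fresh);
            }
        }
        return services;
    }
}
```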
... Improving software security by implementing code that conforms to the CERT secure coding standards can be a significant investment for a software developer, particularly when refactoring or otherwise modernizing existing software systems [Seacord 2003]. However, a software developer does not always benefit from this investment because it is not easy to market code quality. ...
... Today, manual debugging and maintenance often take up 80% of the resources in a software project, which prompted practitioners to declare a legacy crisis long ago. In the future, program repair can provide tool support by repairing bugs arising from complex changes in software projects. This can help resolve a dilemma developers face when managing program changes: "Our dilemma is that we hate change and love it at the same time; what we really want is for things to remain the same but get better."
Article
Automated program repair can relieve programmers from the burden of manually fixing the ever-increasing number of programming mistakes.
... The task we wish to facilitate (or enable) is identifying potential anomalies in the maintenance activity profiles of software projects. Software maintenance has long been characterised by its (huge) costs [3, 21-23]. Early detection of anomalies may help reduce these costs and prevent escalations. ...
Preprint
Full-text available
Lehman's Laws teach us that a software system will become progressively less satisfying to its users over time, unless it is continually adapted to meet new needs. A line of previous works sought to better understand software maintenance by studying how commits can be classified into three main software maintenance activities. Corrective: fault fixing; Perfective: system improvements; Adaptive: new feature introduction. In this work we suggest visualizations for exploring software maintenance activities at both the project and the individual developer scope. We demonstrate our approach using a prototype we have built using the Shiny R framework. In addition, we have also published our prototype as an online demo. This demo allows users to explore the maintenance activities of a number of popular open source projects. We believe that the visualizations we provide can assist practitioners in monitoring and maintaining the health of software projects. In particular, they can be useful for identifying general imbalances, peaks, dips and other anomalies in projects' and developers' maintenance activities.
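The corrective/perfective/adaptive split underlying these visualizations can be illustrated with a simple keyword-based commit classifier of the kind used in the line of work the preprint builds on. The keyword lists are illustrative, not the authors' heuristics.

```java
import java.util.List;

/** Rough keyword-based classification of a commit message into a maintenance activity. */
public class MaintenanceActivityClassifier {

    private static final List<String> CORRECTIVE = List.of("fix", "bug", "error", "fail");
    private static final List<String> ADAPTIVE   = List.of("add", "feature", "support", "introduce");
    private static final List<String> PERFECTIVE = List.of("refactor", "clean", "rename", "simplify");

    public static String classify(String commitMessage) {
        String text = commitMessage.toLowerCase();
        if (containsAny(text, CORRECTIVE)) return "corrective"; // fault fixing
        if (containsAny(text, ADAPTIVE))   return "adaptive";   // new feature introduction
        if (containsAny(text, PERFECTIVE)) return "perfective"; // system improvements
        return "unclassified";
    }

    private static boolean containsAny(String text, List<String> keywords) {
        for (String keyword : keywords) {
            if (text.contains(keyword)) return true;
        }
        return false;
    }
}
```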
... They are, however, hard and expensive to manage (Bennett and Rajlich 2000). It has been realized that the maintenance and evolution costs of legacy systems normally lie somewhere between 40% and 90% of the total life-cycle costs of the system (Foster 1993; Glass 2003; Seacord, Plakosh and Lewis 2003). A brand new information system that is installed today may be a legacy system in the future. ...
... Therefore, in this thesis, we are interested in the reengineering solution (depicted in red in Figure 1.1). Software reengineering is the process of generating evolvable systems [Seacord 2003] and therefore extending the lifetime of a legacy software system. Several definitions have been proposed to describe the software reengineering process. ...
Thesis
Legacy software systems often represent significant investments for the companies that develop them with the intention of using them for a long period of time. The quality of these systems can degrade over time due to the complex changes incorporated into them. In order to deal with these systems when their quality degradation exceeds a critical threshold, a number of strategies can be used. These strategies can be summarized as: 1) discarding the system and developing another one from scratch, 2) carrying on the (massive) maintenance of the system despite its cost, or 3) reengineering the system. Replacement and massive maintenance are not suitable solutions when cost and time are taken into account, since they require considerable effort and staff to ensure the system's completion in a moderate time. In this thesis, we are interested in the reengineering solution. In general, software reengineering includes all activities following delivery to the user that improve the software system's quality. The latter is often characterized by a set of quality attributes. We propose three contributions to improve specific quality attributes, namely maintainability, understandability and modularity. In order to improve maintainability, we propose to migrate object-oriented legacy software systems into equivalent component-based ones. Contrary to existing approaches that consider a component descriptor as a cluster of classes, each class in the legacy system will be migrated into a component descriptor. In order to improve understandability, we propose an approach for recovering runtime architecture models of object-oriented legacy systems and managing the complexity of the resulting models. The models recovered by our approach have the following distinguishing features: nodes are labeled with lifespans and empirical probabilities of existence, which enable 1) visualization with a level of detail and 2) the collapsing/expanding of objects to hide/show their internal structure. In order to improve the modularity of object-oriented software systems, we propose an approach for identifying modules and services in the source code. In this approach, we believe that the composite structure is the main structure of the system that must be retained during the modularization process: a component and its composites must be in the same module. Existing modularization works that share this vision assume that the composition relationships between the elements of the source code are already available, which is not always obvious. In our approach, module identification starts with a step of runtime architecture model recovery. These models are exploited for the identification of composition relationships between the elements of the source code. Once these relationships have been identified, a composition-conservative genetic algorithm is applied to the system to identify modules. Lastly, the services provided by the modules are identified using the runtime architecture models of the software system. Several experiments and case studies have been performed to show the feasibility of our proposals and the gain in maintainability, understandability and modularity of the software systems studied.
... It is known that for many software engineering projects up to 80% of the time is spent in debugging and fixing errors. This is an unfortunate narrative on the state-of-practice in software development, prompting practitioners to label the situation as a legacy crisis a decade back [2]. Since then, the scale of software has increased, and the use of third party code, or geographically distributed software development has also dramatically increased. ...
... (Ying et al., 2005) (Fluri et al., 2007) (Farooq et al., 2012) In the software industry, 90% of projects are in the maintenance and enhancement phase (Seacord et al., 2003) (Bacchelli and Bird, 2013). Due to heavy turnover in the software industry, the programmers who wrote a program are often not available when that same program is in the maintenance and enhancement stage; in this situation, a properly commented program helps the maintenance programmer understand the programming logic. For APIs, program documents generated from comments greatly help novices and experts understand language interfaces and function signatures. ...
Article
Full-text available
Comments have an important role in software development. Medium- to large-scale projects in particular have a reasonably large code base. Useful, good-quality comments play a significant part in maintaining and evolving such projects. In this work we present a taxonomy of comments based on their styles, parsing rules, recursivity, and usage. We also present quality design considerations that programming languages should ensure so that the support for comments is free of any side effects.
... A natural question then follows: is it the accidental complexity that quadruples the increase in complexity in the solution domain? We believe that there is great value in investing effort to answer this question with further research, because the results of RQ 4 show that complexity has an enormous influence on maintenance time, which consumes 90% of the total cost of software projects [171]. Figure 10 clearly shows that different complexity triggers (code characteristics) have significantly different levels of influence on complexity increase. ...
Thesis
Full-text available
Large software development companies primarily deliver value to their customers by continuously enhancing the functionality of their products. Continuously developing software for customers ensures the enduring success of a company. In continuous development, however, software complexity tends to increase gradually, the consequence of which is deteriorating maintainability over time. During short periods of time, the gradual complexity increase is insignificant, but over longer periods of time, complexity can develop to an inconceivable extent, such that maintenance is no longer profitable. Thus, proactive complexity assessment methods are required to prevent the gradual growth of complexity and instead build quality into developed software. Many studies have been conducted to delineate methods for complexity assessment. These focus on three main areas: 1) the landscape of complexity, i.e., the source of the complexity; 2) the possibilities for complexity assessment, i.e., how complexity can be measured and whether the results of assessment reflect reality; and 3) the practicality of using complexity assessment methods, i.e., the successful integration and use of assessment methods in continuous software development. Partial successes were achieved in all three areas. Firstly, it is clear that complexity is understood in terms of its consequences, such as spent time or resources, rather than in terms of its structure per se, such as software characteristics. Consequently, current complexity measures only assess isolated aspects of complexity and fail to capture its entirety. Finally, it is also clear that existing complexity assessment methods are used for isolated activities (e.g., defect and maintainability predictions) and not for integrated decision support (e.g., continuous maintainability enhancement and defect prevention). This thesis presents 14 new findings across these three areas. The key findings are that: 1) Complexity increases maintenance time multifold when software size is constant. This consequential effect is mostly due to a few software characteristics; whilst other software characteristics are essential for software development, they have an insignificant effect on complexity growth. 2) Two methods are proposed for complexity assessment. The first is for source code and represents a combination of existing complexity measures to indicate deteriorating areas of code. The second is for textual requirements and represents new complexity measures that can detect the inflow of poorly specified requirements. 3) Both methods were developed based on two critical factors: (i) the accuracy of assessment, and (ii) the simplicity of interpretation. The methods were integrated into practitioners' working environments to allow proactive complexity assessment and to prevent defects and deteriorating maintainability. In addition, several key observations were made: primarily, the focus should be on creating more sophisticated software complexity measures based on empirical data indicative of the code characteristics that most influence complexity. It is desirable to integrate such complexity assessment measures into the practitioners' working environments to ensure that complexity is assessed and managed proactively. This would allow quality to be built into the product rather than having to conduct separate, post-release refactoring activities.
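The source-code assessment method described, a combination of existing complexity measures used to flag deteriorating areas, might look roughly like the following sketch, which normalizes a few per-file metrics and combines them into one indicator. The metric choice, weights, and threshold are assumptions, not the thesis's calibrated method.

```java
import java.util.ArrayList;
import java.util.List;

/** Combines a few per-file measures into a single indicator and flags risky files. */
public class ComplexityIndicator {

    /** Per-file measurements; the chosen metrics are illustrative. */
    public record FileMetrics(String path, int linesOfCode, int cyclomaticComplexity, int fanOut) {}

    public static List<String> flagRiskyFiles(List<FileMetrics> files, double threshold) {
        List<String> risky = new ArrayList<>();
        for (FileMetrics f : files) {
            // Crude normalization against rule-of-thumb ceilings; weights are assumptions.
            double score = 0.4 * Math.min(1.0, f.linesOfCode() / 1000.0)
                         + 0.4 * Math.min(1.0, f.cyclomaticComplexity() / 50.0)
                         + 0.2 * Math.min(1.0, f.fanOut() / 20.0);
            if (score >= threshold) {
                risky.add(f.path()); // candidate for proactive refactoring attention
            }
        }
        return risky;
    }
}
```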
... Localizing and fixing bugs is known to be an effort-prone and time-consuming task for software developers [33,71,86]. To support programmers in this common activity, researchers have proposed a number of approaches aimed at automatically repairing programs [5,8,20,21,36,37,41,42,45,48,52,53,60,63,65,75,77,83,85,88]. ...
Preprint
Millions of open-source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. First, we mine millions of bug-fixes from the change histories of projects hosted on GitHub, in order to extract meaningful examples of such bug-fixes. Next, we abstract the buggy and corresponding fixed code, and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. In our empirical investigation we found that such a model is able to fix thousands of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9-50% of the cases, depending on the number of candidate patches we allow it to generate. Also, the model is able to emulate a variety of different Abstract Syntax Tree operations and generate candidate patches in a split second.
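The abstraction step, replacing project-specific identifiers and literals with placeholder tokens before training the encoder-decoder model, can be sketched at the token level as below. The tokenization and placeholder names are illustrative and do not reproduce the study's abstraction tool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Replaces identifiers and numeric literals with indexed placeholders, as done before training. */
public class CodeAbstraction {

    // Tiny subset of Java keywords kept literal; a real tool would keep the full set.
    private static final Set<String> KEYWORDS = Set.of("if", "else", "for", "while", "return",
            "int", "void", "public", "private", "class", "new");

    public static String abstractTokens(String code) {
        Map<String, String> mapping = new HashMap<>();
        StringBuilder out = new StringBuilder();
        // Very rough tokenization: words, numbers, or single non-space characters.
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z_0-9]*|\\d+|\\S").matcher(code);
        int nextId = 1;
        while (m.find()) {
            String token = m.group();
            String replacement = token;
            if (token.matches("\\d+")) {
                replacement = mapping.get(token);
                if (replacement == null) {
                    replacement = "NUM_" + nextId++;   // same literal always maps to same placeholder
                    mapping.put(token, replacement);
                }
            } else if (token.matches("[A-Za-z_][A-Za-z_0-9]*") && !KEYWORDS.contains(token)) {
                replacement = mapping.get(token);
                if (replacement == null) {
                    replacement = "ID_" + nextId++;    // abstract away project-specific names
                    mapping.put(token, replacement);
                }
            }
            out.append(replacement).append(' ');
        }
        return out.toString().trim();
    }
}
```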
... Debugging is an important and continuous activity in the maintenance phase of the software life cycle. Knowing that software maintenance costs about 90% of the overall system costs [3] and that the testing and debugging activities constitute from 50% to 70% of development costs [4], it is important to have debugging methodologies and techniques that aid in the quick detection and fixing of bugs. ...
Conference Paper
Full-text available
This paper proposes a novel methodology for enabling the debugging and tracing of production web applications without affecting their normal flow and functionality. This method of debugging enables developers and maintenance engineers to replace a set of existing resources, such as images, server-side scripts, and cascading style sheets, with another set of resources per web session. The new resources are only active in the debug session; other sessions are not affected. This methodology helps developers trace defects, especially those that appear only in production environments, and explore the behavior of the system. A realization of the proposed methodology has been implemented in Java.
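A minimal sketch of the idea, serving alternative resources only within a marked debug session, could take the form of a servlet filter like the one below. The session attribute name and override-path convention are assumptions, not the authors' implementation.

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

/** Serves override resources only for sessions that have been flagged as debug sessions. */
public class DebugSessionResourceFilter implements Filter {

    @Override
    public void init(FilterConfig config) {
        // no configuration needed for this sketch
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest http = (HttpServletRequest) request;
        HttpSession session = http.getSession(false);
        // "debugResources" is an assumed session attribute naming an override directory.
        Object overrideDir = (session != null) ? session.getAttribute("debugResources") : null;
        if (overrideDir != null) {
            String overridden = "/" + overrideDir + http.getServletPath();
            // Only this session sees the replaced scripts, images, or stylesheets.
            http.getRequestDispatcher(overridden).forward(request, response);
            return;
        }
        chain.doFilter(request, response); // all other sessions follow the normal flow
    }

    @Override
    public void destroy() {
        // nothing to release
    }
}
```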
... Previous studies showed that around 90% of software development cost is spent on maintenance and evolution activities [1]. Improving the efficiency of the bug-fixing process would lower the costs of software development, as indicated by several studies [2]. The bug-fixing process would benefit greatly from improving bug assignment accuracy by assigning bugs to the appropriate developers. ...
Article
Full-text available
Most bug assignment approaches utilize text classification and information retrieval techniques. These approaches use the textual contents of bug reports to build recommendation models. The textual contents of bug reports are usually a high-dimensional and noisy source of information, and these approaches suffer from low accuracy and high computational needs. In this paper, we investigate whether using the categorical fields of bug reports, such as the component to which the bug belongs, is appropriate for representing bug reports instead of the textual description. We build a classification model by utilizing the categorical features as a representation of the bug report. The experimental evaluation is conducted using three projects, namely NetBeans, Freedesktop, and Firefox. We compared this approach with two machine-learning-based bug assignment approaches. The evaluation shows that using the textual contents of bug reports is important. In addition, it shows that the categorical features can improve the classification accuracy.
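The categorical-feature idea can be sketched as a simple frequency-based recommender: for each combination of categorical fields seen in historical reports, suggest the developer who most often fixed such reports. The chosen fields and the lack of tie-breaking make this far simpler than the classifiers evaluated in the article.

```java
import java.util.HashMap;
import java.util.Map;

/** Recommends a developer for a bug report based on its categorical fields only. */
public class CategoricalBugAssigner {

    /** counts: (component|severity) key -> developer -> number of historical fixes. */
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    public void train(String component, String severity, String developer) {
        counts.computeIfAbsent(key(component, severity), k -> new HashMap<>())
              .merge(developer, 1, Integer::sum);
    }

    public String recommend(String component, String severity) {
        Map<String, Integer> byDeveloper = counts.get(key(component, severity));
        if (byDeveloper == null || byDeveloper.isEmpty()) {
            return null; // no history for this combination of categorical fields
        }
        return byDeveloper.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get()
                .getKey();
    }

    private static String key(String component, String severity) {
        return component + "|" + severity;
    }
}
```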
... Much time has passed since then, but many of Lehman's laws still apply these days despite the evolution of technology in general. Software continues to evolve long after the first version has been deployed, and numerous studies indicate that the costs associated with software maintenance and evolution exceed 50% (and sometimes more than 90%) of the total costs related to a software system [121]. To reduce these costs, both managers and developers must understand the factors that drive software evolution and take proactive steps that facilitate changes and ensure software does not decay [138]. ...
Thesis
Full-text available
The great availability of data in OSS communities, as well as in mobile distribution platforms (i.e., app stores), has encouraged research on (i) how open source projects and mobile apps are maintained, (ii) how developers interact with each other, and (iii) how developers gather suggestions from users in order to evolve their products. Previous research demonstrated that developers (i) make intense use of written communication channels (e.g., mailing lists, issue trackers and chats) to coordinate themselves during software maintenance activities, and (ii) usually collect user feedback helpful for improving their products. However, such kinds of messages (i.e., user feedback and developers' discussions) are usually written in natural language and may contain (i) a mix of structured, semi-structured and unstructured information (e.g., a development email may enclose code snippets or stack traces), (ii) text having different purposes (e.g., discussing feature requests, bugs to fix, etc.), and (iii) unnecessary details (e.g., about 2/3 of app reviews contain useless information from a software maintenance and evolution perspective). Thus, the manual classification (or filtering) of such messages according to their purpose would be a daunting and time-consuming task, and we argue that helping developers discern the content of natural language messages that best fits their information need is a relevant task to support them in decision-making processes during software maintenance and evolution (i.e., establishing the new features/functionalities to implement in the next release, the bugs which have to be fixed, etc.). To address this issue, in this dissertation we explore a semi-supervised technique, named Intention Mining, which tries to model the writer's main purpose within a natural text fragment by exploiting the grammatical structure of the fragment and assigning it to an intention category (e.g., asking/providing help, proposing new features or solutions for a known problem, discussing bugs, etc.). In particular, we show how we exploited the approach for (i) building automated classifiers of development messages, (ii) constructing categorizers of mobile app reviews able to discern useful contents from a software maintenance and evolution perspective, and (iii) building summaries of app reviews, which help developers better understand users' needs. We also discuss several tools we built in order to support developers in discerning useful information in natural text channels during the maintenance and evolution of their software. Our approach is not aimed at replacing previous text mining techniques. Conversely, it might be profitably used in combination with other techniques in order to mine helpful information within natural text documents.
... It is accepted wisdom that maintenance dominates software development costs [1], with bug-handling being a major contributor [2,3]. The effort required for handling bugs (including locating and fixing the faulty code, and updating the test suite as a result) is likely to be impacted by the programming languages the software is built with [4]. ...
Article
Handling bugs is an essential part of software development. The impact of programming language on this task has long been a topic of much debate. For example, some people hold the view that bugs in Python are easy to handle because its code is easy to read and understand, while others believe the absence of static typing in Python will lead to higher bug-handling effort. This paper presents the first large-scale study to investigate whether the ecosystems of different (categories of) programming languages require different bug-handling effort. The focus is on correlation analysis rather than causal analysis. With the 600 most popular projects in 10 languages downloaded from GitHub (summing up to 70,816,938 SLOC and 3,096,009 commits), the experimental results indicate various interesting findings. First, different languages require different bug-handling effort. For example, Java and C# tend to require less time but more line/file modification, Python and PHP tend to require less time and less line/file modification, while Ruby and JavaScript tend to require more time as well as more line/file modification. Second, weak/dynamic languages tend to require more time than strong/static languages, while static languages tend to require more absolute line/file modification. A toy predictive model also provides evidence that including the programming language could improve the effectiveness of predicting the bug-handling effort of a project.
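To make the "toy predictive model" remark from this abstract concrete, the sketch below compares an effort predictor with and without a one-hot-encoded language feature on synthetic data; every variable, value, and model choice here is an assumption made for illustration, not the study's actual data or model:

# Toy sketch: does adding the programming language as a feature help a
# bug-handling-effort predictor? All data below is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 500
sloc = rng.uniform(1e3, 1e6, n)                 # project size (assumed)
commits = rng.uniform(100, 5000, n)             # project activity (assumed)
lang = rng.choice(["java", "python", "ruby", "c"], n)
lang_effect = {"java": 0.8, "python": 0.9, "ruby": 1.3, "c": 1.1}  # invented multipliers
hours = (np.log(sloc) + 0.001 * commits) * np.vectorize(lang_effect.get)(lang)
hours += rng.normal(0, 0.5, n)                  # noise

base = np.column_stack([np.log(sloc), commits])
lang_onehot = OneHotEncoder().fit_transform(lang.reshape(-1, 1)).toarray()
with_lang = np.column_stack([base, lang_onehot])

for name, X in [("without language", base), ("with language", with_lang)]:
    r2 = cross_val_score(RandomForestRegressor(random_state=0), X, hours,
                         cv=5, scoring="r2").mean()
    print(f"{name}: mean R^2 = {r2:.2f}")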
... Previously, the terms modernization and migration have been used interchangeably, and the intention in this paper is to keep a rather wide interpretation, in line with the view of Barbier & Recoussine. The works by Seacord [2] and Ulrich [3] constitute two seminal books on the modernization of legacy systems and architecture-driven modernization, with concrete examples involving COBOL systems. Tilly et al. [4] explore the challenges of the early stages of Service Oriented Architecture (SOA), when the Simple Object Access Protocol (SOAP) was gaining popularity. ...
... Modernizing the legacy code itself. Code modernization is an attractive approach for organizations, as it has the highest probability of success among the three types of approaches [17]. The migration risks are identified early in the code modernization process. ...
Conference Paper
Full-text available
Software has become ubiquitous in healthcare applications, as is evident from its prevalent use for controlling medical devices, maintaining electronic patient health data, and enabling healthcare information technology (HIT) systems. As the software functionality becomes more intricate, concerns arise regarding quality, safety and testing effort. It thus becomes imperative to adopt an approach or methodology based on best engineering practices to ensure that the testing effort is affordable. Automation in testing software can increase product quality, leading to lower field service, product support and liability cost. It can provide types of testing that simply cannot be performed by people.
... Studies show that software maintenance is, by far, the predominant activity in software engineering (90% of the total cost of a typical software system [15,19]). It is needed to keep software systems up-to-date and useful: any software system reflects (i.e. ...
Article
Full-text available
Software engineering has been striving for years to improve the practice of software development and maintenance. Documentation has long been prominent on the list of recommended practices for improving development and helping maintenance. Recently, however, agile methods have started to shake this view, arguing that the goal of the game is to produce software and that documentation is only useful as long as it helps to reach this goal. On the other hand, in the re-engineering field, people wish they could re-document useful legacy software so that they may continue to maintain it or migrate it to new platforms. In both cases, a crucial question arises: how much documentation is enough? In this article, we present the results of a survey of software maintainers that tries to establish which documentation artifacts are the most important to them.
Conference Paper
Full-text available
Legacy software systems play an important role in the economy but are known to cause high operational and maintenance costs. To reduce these costs, such systems are often migrated to modern infrastructure or languages. There exists a variety of migration strategies; however, choosing the best strategy or combination of strategies given technical, economic, and business constraints remains a challenging task. We observe a lack of experience reports on industrial migration projects explaining their decisions in detail. In this report, we present the case of an insurance system with 1M Source Lines of Code, running on an expensive mainframe and featuring Natural, Cobol, and Assembler code as well as an Adabas database. We elaborate on why state-of-practice migration strategies were inadequate in this case and introduce an alternative methodology, taking into account the limited budget for the migration. In this project, we use custom transpilation to translate the legacy code automatically to another programming language. In contrast to off-the-shelf transpilers, we implement an iteratively refined transpiler that is fine-tuned to the legacy code at hand. The transpiler guides its own development by pointing out instructions in the legacy code it cannot yet translate. Manual adaptations to the legacy code allow us to circumvent the implementation of overly complicated translation rules. This ensures the transpiler and the generated code remain lean and efficient while being able to cope with specific challenges of the system at hand. In the presented industrial case, Natural and Assembler sources were transpiled to Cobol running on Linux, combined with some adapted and rewritten Cobol and Java. We report our lessons learned and provide in-depth insights into testing and debugging activities. A comparison with alternative offers by other vendors validates the economic benefits of this approach.
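The "iteratively refined transpiler" workflow this report describes can be illustrated with a toy sketch: translate whatever the current rule set covers and report every construct that is not yet handled, so each iteration knows whether to add a rule or adapt the legacy code manually. The rules, source lines, and target syntax below are invented for illustration and bear no relation to the project's actual tooling:

# Minimal sketch of an iteratively refined transpiler: apply the rules that
# exist so far and report untranslated constructs to guide the next iteration.
import re

# The rule set grows over time; each rule maps a legacy construct to target code.
RULES = [
    (re.compile(r"^MOVE (\S+) TO (\S+)$"), r"\2 = \1"),
    (re.compile(r"^ADD (\S+) TO (\S+)$"),  r"\2 = \2 + \1"),
    (re.compile(r"^DISPLAY '(.*)'$"),      r"print('\1')"),
]

def transpile(lines):
    translated, unhandled = [], []
    for lineno, line in enumerate(lines, 1):
        stripped = line.strip()
        for pattern, template in RULES:
            m = pattern.match(stripped)
            if m:
                translated.append(m.expand(template))
                break
        else:
            # Untranslated construct: record it so the next iteration can either
            # add a translation rule or manually adapt the legacy code.
            unhandled.append((lineno, stripped))
            translated.append(f"# TODO untranslated: {stripped}")
    return translated, unhandled

legacy = ["MOVE 0 TO TOTAL", "ADD AMOUNT TO TOTAL", "PERFORM PRINT-REPORT"]
out, todo = transpile(legacy)
print("\n".join(out))
print("constructs still to handle:", todo)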
Article
Moore's law states that the number of transistors on a chip will double every two years. A similar force appears to drive the progress of information technology (IT). IT companies tend to struggle to keep up with the latest technological developments, and software solutions are becoming increasingly outdated. The ability for software to change easily is defined as evolvability. One of the major fields researching evolvability is enterprise engineering (EE). The EE research paradigm applies theories from other fields to the evolvability of organisations. We argue that such theories can be applied to software engineering (SE) as well, which can contribute to the construction of software with a clear separation of dynamically changing technologies based on a relatively stable description of functions required for a specific user. EE theories introduce notions of function, construction, and affordance. We reify these concepts in terms of SE. Based on this reification, we propose affordance-driven assembling (ADA) as a software design approach that can aid in the construction of more evolvable software solutions. We exemplify the implementation of ADA in a case study on a commercial system and measure its effectiveness in terms of the impact of changes, as defined by the normalised systems theory.
Book
Full-text available
Buku Praktikum Rekayasa Perangkat Lunak (Software Engineering Practicum Book) is organized into several parts: software and its engineering, Microsoft Visio and StarUML, data objects, attributes and relationships, systems analysis, data flow diagrams, flowcharts, advanced flowcharts, and use case diagrams. The practicum book is aimed at readers who want to understand software engineering, particularly students of the Informatics Engineering study program at Universitas Serambi Mekkah (USM) and the Informatics Management study program at AMIK Indonesia. It is intended to serve as a reference to help students in their coursework.
Article
Bug assignment is the task of ranking candidate developers in terms of their potential competence to fix a bug report. Numerous methods have been developed to address this task, relying on different methodological assumptions and demonstrating their effectiveness in a variety of empirical studies with numerous data sets and evaluation criteria. Despite the importance of the subject and the attention it has received from researchers, there is still no unanimity on how to validate and comparatively evaluate bug-assignment methods, and, oftentimes, methods reported in the literature are not reproducible. In this paper, we first report on our systematic review of the broad bug-assignment research field. Next, we focus on a few key empirical studies and review their choices with respect to three important experimental-design parameters, namely, the evaluation metric(s) they report, their definition of who the real assignee is, and the community of developers they consider as candidate assignees. The substantial variability on these criteria led us to formulate a systematic experiment to explore the impact of these choices. We conducted our experiment on a comprehensive data set of bugs we collected from 13 long-term open-source projects, using a simple TF-IDF similarity metric. On the basis of our arguments and/or experiments, we provide useful guidelines for performing further bug-assignment research. We conclude that mean average precision (MAP) is the most informative evaluation metric, the developer community should be defined as “all the project members,” and the real assignee should be defined as “any developer who worked toward fixing a bug.”
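As a concrete illustration of the experimental setup sketched in this abstract, the snippet below ranks candidate developers by TF-IDF similarity between a new bug report and the reports each developer previously worked on, then scores the ranking with mean average precision (MAP); the bug texts, developer names, and scoring details are assumptions made for this sketch only:

# Sketch: TF-IDF based developer ranking plus a MAP-style evaluation.
# All bug texts and developer names below are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

history = [  # (report text, developer who worked toward fixing it)
    ("null pointer in editor autocomplete", "alice"),
    ("editor freezes on large files",       "alice"),
    ("debugger breakpoint not hit",         "bob"),
    ("ui rendering glitch in dark theme",   "carol"),
]
texts = [t for t, _ in history]
vec = TfidfVectorizer().fit(texts)
doc_matrix = vec.transform(texts)

def rank_developers(new_report):
    sims = cosine_similarity(vec.transform([new_report]), doc_matrix)[0]
    # A developer's score is the similarity of their closest past report.
    scores = {}
    for (_, dev), s in zip(history, sims):
        scores[dev] = max(scores.get(dev, 0.0), s)
    return sorted(scores, key=scores.get, reverse=True)

def average_precision(ranking, relevant):
    hits, total = 0, 0.0
    for i, dev in enumerate(ranking, 1):
        if dev in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

queries = [("autocomplete crashes the editor", {"alice"}),
           ("breakpoints ignored while debugging", {"bob"})]
ap = [average_precision(rank_developers(q), rel) for q, rel in queries]
print("MAP =", sum(ap) / len(ap))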
Article
Full-text available
Software migration projects are often bound either by time or by cost, or by both. If the project is bound by both time and cost, the user must sacrifice something else, usually quality. The migration strategy depends on how the project is bound. Most migration projects are bound by time: the new system must be in operation by a given date, no matter what it costs. The project described here, a state employee payroll system, is bound by cost: it must remain within budget, no matter how long it takes. The original costs were estimated based on the code size and the productivity measured in previous migration projects using three different approaches: conversion, redevelopment, and reimplementation. The conversion approach would have been the cheapest, but it had already been tried and had failed. The redevelopment approach was considered out of the question due to the high costs. Thus, reimplementation remained the only alternative. The costs of this approach were estimated using three different estimation methods and approved by the state government. The project has been in progress for 4 years, and so far the estimated and actual costs are of the same order of magnitude: the costs have remained within budget. In fact, the costs are less than what was estimated with some methods. As this particular project is not bound by time, it is a good example of continuous migration. Redevelopment is often prohibitively expensive and/or fails. Automated conversion is error-prone, delivers unmaintainable code, and is high risk. Reimplementation sits between conversion and redevelopment: the old business logic remains, and only the underlying technology is changed (see Figure 1). Reimplementation is often tried after (several) failed attempts to redevelop and/or convert. We illustrate this for a state employee payment system. Our cost/risk estimates turned out to be reasonable; the migration is progressing as expected without significant problems.
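The size-and-productivity estimate this abstract refers to amounts to simple arithmetic: effort = size / productivity, and cost = effort x loaded rate, computed per candidate approach. The sketch below shows the calculation with invented numbers; they are not the payroll project's actual size, rates, or productivity figures:

# Back-of-the-envelope migration cost estimate per approach.
# All numbers below are assumptions for illustration only.
SLOC = 600_000                    # size of the legacy system (assumed)
RATE_PER_MONTH = 12_000           # loaded cost per person-month (assumed)

# Productivity (SLOC handled per person-month) as measured in earlier
# projects, per migration approach (assumed values).
productivity = {
    "conversion":       8_000,
    "reimplementation": 3_000,
    "redevelopment":    1_000,
}

for approach, sloc_per_pm in productivity.items():
    person_months = SLOC / sloc_per_pm
    cost = person_months * RATE_PER_MONTH
    print(f"{approach:>16}: {person_months:6.0f} person-months, "
          f"~{cost / 1e6:.1f}M cost units")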
Chapter
The chapters up to this point have presented an ideal, architecture-centric view of software-systems. Experience shows daily that this idealized world is under continuous pressure from many fronts: cost factors, time-to-market, short-sighted managers, myopic project teams, and, last but not least, new development paradigms trying to reduce the cost and time-to-market of software development. The philosophy of future-proof software-systems is endangered. This chapter evaluates some of the recent developments, i.e., Agile Methods, Continuous Delivery, and DevOps. Finally, the important topic of legacy system modernization is presented.
Chapter
The central question in modern systems engineering is without doubt: “Which mechanisms, methods, and processes are required to successfully manage complexity, change, and uncertainty?” Long and proven experience has shown that the underlying structure, i.e., the systems architecture, determines most of the properties of a complex system. An adequate, well-maintained, and strictly enforced systems architecture during system generation, evolution, and maintenance is the key success factor for the value of long-lived, dependable, trustworthy, and economically viable software-systems. Fortunately, systems and software architecture are becoming more and more a true engineering discipline with accepted principles, patterns, processes, and models. Gone are the days when architecture was a “black art” mastered only by a few professionals. This chapter introduces the key concepts of software architecture.
Chapter
Software-systems evolve over time. The evolution is driven by changes in business requirements, market conditions, the operating environment, and technology. This requires relentless adaptation of the software-systems. Building and maintaining future-proof software-systems needs a clear and realistic pathway. The first element of this pathway is a strategy: the strategy prescribes the process for managing the evolution of the future-proof software-system. Many software-evolution strategies exist in the literature. Here, a proven strategy for very large, long-lived, and mission-critical software-systems is presented: the “Managed Evolution”. Managed Evolution steers the evolution process in such a way that the business value, the changeability, and the dependability of the future-proof software-systems are continuously improved, while other quality-of-service properties are guaranteed to be as good as necessary.
Article
Millions of open source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. First, we mine millions of bug-fixes from the change histories of projects hosted on GitHub in order to extract meaningful examples of such bug-fixes. Next, we abstract the buggy and corresponding fixed code, and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. In our empirical investigation, we found that such a model is able to fix thousands of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9-50% of the cases, depending on the number of candidate patches we allow it to generate. Also, the model is able to emulate a variety of different Abstract Syntax Tree operations and generate candidate patches in a split second.
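One concrete step of the pipeline this abstract describes is the abstraction of buggy and fixed code before training: identifiers and literals are replaced with reusable placeholder tokens so the encoder-decoder model sees a smaller vocabulary. The deliberately naive tokenizer below only illustrates that step; it is not the authors' tooling, and its keyword list and token pattern are assumptions:

# Tiny sketch of code abstraction prior to training a translation model:
# replace identifiers and numeric literals with placeholder tokens, keeping
# a mapping so a predicted fix can later be mapped back to concrete names.
import re

TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"if", "else", "while", "return", "int"}   # illustrative subset

def abstract_method(source):
    mapping, counters, out = {}, {"VAR": 0, "NUM": 0}, []
    for tok in TOKEN.findall(source):
        if tok in KEYWORDS or not (tok[0].isalnum() or tok[0] == "_"):
            out.append(tok)                       # keep keywords and symbols
            continue
        kind = "NUM" if tok[0].isdigit() else "VAR"
        if tok not in mapping:
            counters[kind] += 1
            mapping[tok] = f"{kind}_{counters[kind]}"
        out.append(mapping[tok])
    return " ".join(out), mapping

buggy = "if (count > 10) return total / 0;"
abstracted, mapping = abstract_method(buggy)
print(abstracted)   # if ( VAR_1 > NUM_1 ) return VAR_2 / NUM_2 ;
print(mapping)      # used to map a predicted fix back to the real identifiers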
Conference Paper
Millions of open-source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. We mine millions of bug-fixes from the change histories of GitHub repositories to extract meaningful examples of such bug-fixes. Then, we abstract the buggy and corresponding fixed code, and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. Our model is able to fix hundreds of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9% of the cases.