Conference Paper

Reverse Engineering of Legacy Systems: A Path Toward Success

Authors:
Alex Quilici

Abstract

This paper addresses the question of whether the reverse engineering of legacy systems is doomed to failure. Our position is that the answer is highly dependent on the specific goals of the reverse engineering process. We argue that while most reverse engineering efforts may well fail to achieve the traditional goal of automatically extracted complete specifications suitable for forward engineering, they are likely to succeed at the more modest goal of automatically extracting partial specifications that can be augmented by system-assisted human understanders.


... Clustering is a technique for automatically constructing categories or taxonomies for a set of objects. Clustering aims at grouping all entities (e.g., source files or classes) into clusters [40] to support AR. For example, in [S28], the authors proposed a word clustering technique, Latent Dirichlet Allocation (LDA), which groups structural and lexical information of the system to recover its layered architecture. ...
... Pattern detection, in contrast, concerns common abstractions hidden and embedded in the system. In practice, however, not all the entities of a system can be covered by a pattern detection approach [40]. In [S1], the authors presented a pattern-based information extraction approach to recover, represent, and explore information about architecture rationale from architecture documents. ...
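As a rough illustration of the LDA-based grouping described in the excerpt above, the sketch below assigns source entities to recovered layers by their dominant topic. The corpus, entity names, and parameters are invented for illustration and are not taken from [S28].

```python
# Minimal sketch of LDA-based grouping of source entities (illustrative only;
# entity names and token corpus are hypothetical, not from [S28]).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Each "document" is the lexical content of one source entity
# (identifiers, comments), plus structural tokens such as import targets.
entities = {
    "ui/LoginView.java":     "login button render widget session token",
    "ui/MenuView.java":      "menu render widget navigate screen",
    "core/AuthService.java": "login session token verify password hash",
    "db/UserStore.java":     "sql connection query user row insert",
}

vec = CountVectorizer()
X = vec.fit_transform(entities.values())

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)

# Assign each entity to its dominant topic; topics act as recovered layers.
for name, dist in zip(entities, doc_topics):
    print(name, "-> layer", dist.argmax())
```

In a real recovery setting the documents would be built from facts extracted by a parser rather than hand-written strings, and the number of topics would be tuned to the system.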
Article
Full-text available
Context: Information from artifacts in each phase of the software development life cycle can potentially be mined to enhance architectural knowledge. Many text analysis techniques have been proposed for mining such artifacts. However, there is no comprehensive understanding of what artifacts these text analysis techniques analyze, what information they are able to extract or how they enhance architecting activities. Objective: This systematic mapping study aims to study text analysis techniques for mining architecture-related artifacts and how these techniques have been used, and to identify the benefits and limitations of these techniques and tools with respect to enhancing architecting activities. Method: We conducted a systematic mapping study and defined five research questions. We analyzed the results using descriptive statistics and qualitative analysis methods. Results: Fifty-five studies were finally selected with the following results: (1) Current text analysis research emphasizes on architectural understanding and recovery. (2) A spectrum of text analysis techniques have been used in textual architecture information analysis. (3) Five categories of benefits and three categories of limitations were identified. Conclusions: This study shows a steady interest in textual architecture information analysis. The results give clues for future research directions on improving architecture practice through using these text analysis techniques.
... A comparison between our work and related work is given in Section 5. Finally, we summarise the paper and a future plan.
Knowledge Representation: plan [1, 3, 9, 10, 11, 13, 14, 18, 19], semantic/connectionist network [4], graph (chart) [21, 22], tree/outline/hierarchy [1, 3]
Reasoning Techniques: classic reasoning [1, 3, 9, 10, 11, 13, 14, 19], uncertainty reasoning [4], inductive reasoning [6]
Control Strategies for Reasoning: bottom-up [4, 9, 10, 13, 19], top-down [4], dynamic programming/hybrid search [11], flexible/multi-purpose application [21]
Knowledge Base Management: hierarchy [1]
Program Space Management: dominance tree [5]
Assessment Environment: MACS [7], Medona [12]
Embedded in Other Tasks: transformation [8, 20], bug-seeking [10] ...
Article
Full-text available
Program understanding is the process of acquiring knowledge from a computer program. Although research utilising knowledge engineering techniques has been undertaken in this field, it is our observation that a thorough application of AI methodology has not been sufficiently explored. In this paper, we present a clarity-guided belief revision approach to domain knowledge recovery in legacy software systems. Novel solutions are given to three key AI issues in the context of domain knowledge recovery from source code: knowledge representation, where the concrete semantic network is separated from the abstract semantic network to better accommodate uncertainty reasoning and propagation; uncertainty reasoning, which borrows ideas from confirmation theory and recasts them in the context of semantic network reasoning; and heuristic search, which is designed on the principles of programming psychology. Our approach is lightweight. It can be used stand-alone or as a complement to traditional heavyweight domain knowledge recovery methods.
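A minimal sketch of the flavor of uncertainty propagation such an approach relies on, assuming a noisy-OR style combination over a toy semantic network; the update rule, weights, and node names are illustrative assumptions, not the authors' actual calculus.

```python
# Illustrative confirmation-style belief propagation over a small semantic
# network; weights and the combination rule are assumptions for this sketch.
network = {
    # hypothesis: [(supporting_node, link_strength), ...]
    "implements-banking-domain": [("has-account-concept", 0.7),
                                  ("has-interest-calc", 0.9)],
    "has-account-concept": [("identifier 'acct_bal'", 0.8)],
    "has-interest-calc":   [("expression 'bal * rate'", 0.9)],
}
observed = {"identifier 'acct_bal'": 1.0, "expression 'bal * rate'": 1.0}

def belief(node):
    # Leaves are code-level evidence: observed (1.0) or absent (0.0).
    if node not in network:
        return observed.get(node, 0.0)
    b = 0.0
    for child, strength in network[node]:
        # Noisy-OR style: each confirming child raises remaining doubt.
        b = b + (1 - b) * strength * belief(child)
    return b

print(round(belief("implements-banking-domain"), 3))  # -> 0.916
```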
... Much research has been done in the domain of legacy system analysis and modernization. Reverse engineering research of legacy systems has been discussed in [10] [11] [12] [13] [14]. Design discovery of legacy systems was also tackled in [15] [16] [17]. The work in [18] also tackled the issue of design metrics in legacy systems. ...
Conference Paper
Full-text available
In this article, we present a process and tool that facilitates the decision making process of selecting an appropriate migration strategy for legacy software systems using the Service Oriented Architecture. Issues related to legacy system migration are initially discussed, followed by two well-known techniques, namely the Service-Oriented Migration and Reuse Technique of the Department of Defense and that of Erradi. A hybrid process is then introduced, and a tool following the process is illustrated. The tool takes into consideration choosing key migration evaluation factors, rating the relative importance of such factors, inputting the organizational significance of each factor, as well as operating on individual system components.
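The factor-rating step such a tool performs can be sketched as a simple weighted score over candidate migration strategies; the factors, weights, and ratings below are hypothetical examples, not the tool's actual inputs.

```python
# Sketch of weighted-factor rating for migration strategy selection
# (factor names, weights, and scores are hypothetical).
factors = {  # factor -> relative importance weight
    "business value":  0.35,
    "code quality":    0.20,
    "decomposability": 0.25,
    "operating cost":  0.20,
}
# Organizational score of each candidate strategy per factor, 1..5.
strategies = {
    "wrap as services":   {"business value": 4, "code quality": 2,
                           "decomposability": 3, "operating cost": 4},
    "full migration":     {"business value": 5, "code quality": 4,
                           "decomposability": 2, "operating cost": 2},
    "retire and replace": {"business value": 2, "code quality": 5,
                           "decomposability": 5, "operating cost": 3},
}

def weighted_score(scores):
    return sum(factors[f] * scores[f] for f in factors)

# Rank strategies by their weighted score, highest first.
for name, scores in sorted(strategies.items(),
                           key=lambda kv: -weighted_score(kv[1])):
    print(f"{name}: {weighted_score(scores):.2f}")
```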
... As a result of this rapid growth there is a need to understand the relationships between the different parts of a large system [1,3]. The substantial amount of existing legacy code and/or high number of participants in code development also necessitates the use of tools for reverse engineering [22]. Reverse engineering is "the process of analyzing a subject system to (a) identify the system's components and their interrelationships and (b) create representations of a system in another form at a higher level of abstraction" [4]. ...
Conference Paper
Full-text available
One of the most critical issues in large-scale software development and maintenance is the rapidly growing size and complexity of software systems. As a result of this rapid growth there is a need to better understand the relationships between the different parts of a large software system. In this paper we present a reverse engineering framework called Columbus that is able to analyze large C++ projects, and a schema for C++ that prescribes the form of the extracted data. The flexible architecture of the Columbus system with a powerful C++ analyzer and schema makes it a versatile and readily extendible toolset for reverse engineering. This tool is free for scientific and educational purposes and we fervently hope that it will assist academic persons in any research work related to C++ re- and reverse engineering.
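As a toy illustration of what "a schema that prescribes the form of the extracted data" means in practice, the sketch below models extracted C++ facts as typed entities and relations; the real Columbus schema is far richer, and all names here are invented.

```python
# Toy model of a reverse engineering schema in the spirit of Columbus:
# extracted facts must fit prescribed entity and relation kinds.
from dataclasses import dataclass, field

@dataclass
class FunctionEntity:
    name: str
    calls: list = field(default_factory=list)      # call-graph relation

@dataclass
class ClassEntity:
    name: str
    bases: list = field(default_factory=list)      # inheritance relation
    members: list = field(default_factory=list)    # contained functions

# A fragment of extracted data conforming to the schema:
draw = FunctionEntity("Shape::draw", calls=["Canvas::line"])
shape = ClassEntity("Shape", members=[draw])
circle = ClassEntity("Circle", bases=[shape])
print(circle.bases[0].members[0].calls)  # -> ['Canvas::line']
```

The point of fixing such a schema is that every extractor and every downstream tool (visualizer, metrics, exporter) agrees on one form for the facts.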
... Without the understanding of the system, in other words without the accurate documentation of the system, it is not possible to maintain, extend, and integrate the system with other systems [76,89,95]. The methodology to reconstruct this missing documentation is reverse engineering. ...
... Weiderman et al. [53] presented a system modernization approach that leverages middleware and wrapping technology. The use of reverse engineering techniques in legacy system modernization has been reported by Quilici [36] and Weide et al. [52]. Bisbal et al. analyzed the existing legacy system modernization approaches in their survey [12]. ...
Article
Full-text available
Existing research in legacy system modernization has traditionally focused on technical challenges, and takes the standpoint that legacy systems are obsolete, yet crucial for an organization's operation. Nonetheless, it remains unclear whether practitioners in the industry share this perception. This paper describes the outcome of an exploratory study in which 26 industrial practitioners were interviewed on what makes a software system a legacy system, what the main drivers are that lead to the modernization of such systems, and what challenges are faced during the modernization process. The findings of the interviews were validated by means of a survey with 198 respondents. The results show that practitioners value their legacy systems highly, and that the challenges they face are not just technical but also include business and organizational aspects.
... According to Choi and Scacchi (1990, p.66) reverse engineering in general requires "a design description from an implementation description" and abstraction of four critical properties of a system: structural (resource exchange between modules and subsystems through interfaces), functional (semantics of the exchanged resources with pre- and post-conditions), dynamic (procedural algorithms), and behavioural (behaviour of system objects). In the computer and software environment, Quilici (1995) connects forward engineering and reverse engineering by stating that the purpose of reverse engineering is to produce abstractions from existing legacy code, and these specifications can be used in forward engineering of a new version of the system. Reverse engineering has not lost its significance, even though the original methods were developed in the early 1990s. ...
Article
The paper introduces a procedure for recovering the service design in existing service systems, based on reverse engineering methodology. The complete procedure for planning a reverse engineering procedure, implementing it and analysing the results is presented. The main objective in the methodological approach is to map the service based on real implementation. The methodology and the reverse engineering tool approach are based on a review of forward engineering methods in services, and designing services and reverse engineering implementations in computer and information systems. The implementation of the framework is demonstrated in an empirical service system consisting of human, material and information elements. The main findings of the study concern analysing the existing service system, how the existing service system includes various types of service designs, and how higher level abstractions of the service system reveal new insights into the system.
... It is in fact difficult to find the original developers who took part in writing the code; ❑ use of reverse engineering tools and techniques to generate high-level representations of the code: Section 7 presents a list of reverse engineering tools. Two different definitions of reverse engineering exist in the literature, a "strong" one and a so-called "weak" one [22]: ❑ strong definition of reverse engineering: a process that, starting from the legacy code, extracts the formal specifications of the system. The design information of the system is derived from the code as an intermediate step (Figure 3), and the extracted formal specifications can be used to create a new implementation of the system. ...
Article
Full-text available
Reverse engineering is an analysis process aimed at identifying the components and relationships of a software system. Some experts consider it a powerful tool, while others regard it as a process that cannot be carried out in practice. Who is right? This article highlights both the limits and the potential of reverse engineering. Before concluding with two industrial success stories, the article also proposes a possible classification of the tools used in the reverse engineering process.
... Several proposed approaches (Bisbal 1999; Merlo et al. 1993; Quilici 1995) advocate the construction of simple wrappers to "call" modules of the legacy system or the complete migration of the legacy system to the new platform. However, both approaches have major drawbacks. ...
Article
Full-text available
With the proliferation of different computing devices and platforms, it is becoming increasingly important for organizations to migrate their existing software systems to new environments, possibly more than one, with minimum effort and risk. In this paper, we introduce Mathaino, a platform independent, non-invasive tool that can be used to migrate, in a semi-automated manner, text-based legacy interfaces to modern web-centric target platforms, by constructing wrappers as front-ends around the original legacy-system user interface.
... The reverse-engineering community has also done work similar to this effort [2,19]. These prior works use a variety of static, dynamic, and hybrid techniques to detect interaction between objects in order to reconstruct high-level design patterns in the software architecture. ...
Conference Paper
Full-text available
Data structures define how values being computed are stored and accessed within programs. By recognizing what data structures are being used in an application, tools can make applications more robust by enforcing data structure consistency properties, and developers can better understand and more easily modify applications to suit the target architecture for a particular application. This paper presents the design and application of DDT, a new program analysis tool that automatically identifies data structures within an application. An application binary is instrumented to dynamically monitor how the data is stored and organized for a set of sample inputs. The instrumentation detects which functions interact with the stored data, and creates a signature for these functions using dynamic invariant detection. The invariants of these functions are then matched against a library of known data structures, providing a probable identification. That is, DDT uses program consistency properties to identify what data structures an application employs. The empirical evaluation shows that this technique is highly accurate across several different implementations of standard data structures, enabling aggressive optimizations in many situations.
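The final matching step DDT performs can be sketched as scoring observed interface invariants against a library of known signatures; the invariant names and library below are invented stand-ins for the output of DDT's dynamic invariant detection.

```python
# Sketch of DDT's matching step: observed invariants of the functions that
# touch a data structure are compared against known signatures.
# (Invariant phrasing and library contents are invented for illustration.)
KNOWN = {
    "binary search tree": {"insert preserves ordering",
                           "left child < node < right child"},
    "fifo queue":         {"insert at tail", "remove at head"},
    "stack":              {"insert at head", "remove at head"},
}

def identify(observed_invariants):
    # Score each known structure by the fraction of its signature observed.
    best, best_score = None, 0.0
    for name, signature in KNOWN.items():
        score = len(signature & observed_invariants) / len(signature)
        if score > best_score:
            best, best_score = name, score
    return best, best_score

print(identify({"insert at head", "remove at head"}))
# -> ('stack', 1.0): a probable identification, not a proof.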
... Recently developed two-tier client/server implementations have placed 'fat client' applications (such as Windows GUI programs) on the PC with direct connections to remote or distributed data via TCP/IP, perhaps requiring a middleware data layer such as ODBC, avoiding the CICS 3270 presentation. These new applications often required reengineering [10,12] and complex business logic on the client applications with the associated risks and costs. Also, the fat client approach may introduce a recurring support burden because technical staff is required to maintain the operating environment on all client machines. ...
Article
Full-text available
Many organizations rely on legacy applications hosted on mainframe computer systems to provide critical information support. They find themselves pressed to integrate the data and business logic of their legacy applications with information systems developed to take advantage of client/server architectures, graphical user interfaces and the new de facto thin client, the HTML browser. The legacy object modeling approach offers significant advantages to an organization that wants to make its legacy mainframe information available to modern information systems and the World Wide Web. Its noninvasive nature allows new applications to access legacy information through the existing and reliable terminal interface without any changes to the legacy application whatsoever. A legacy object model enhances maintainability. Since the model encapsulates all terminal-specific functions and hides all legacy application behavior, only the model needs to be changed when the existing legacy application's screens change.
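A minimal sketch of a legacy object model, assuming some screen-scraping session layer is available (the TerminalSession stub below is hypothetical): the model hides screen navigation and field positions behind an ordinary object interface.

```python
# Sketch of a legacy object model: a class that encapsulates terminal screen
# navigation so clients never see the 3270 interface. `TerminalSession` is a
# stand-in stub for whatever screen-scraping layer is actually available.
class TerminalSession:                         # minimal stub for illustration
    def send(self, keys): ...
    def read_field(self, row, col, length): return "1234.56"

class CustomerAccount:
    """Exposes legacy account data as an ordinary object."""
    def __init__(self, session, account_id):
        self._s = session
        self._id = account_id

    def balance(self):
        # Navigate: main menu -> account inquiry screen for this account.
        self._s.send(f"INQ {self._id}\n")
        # Only this model knows the field lives at row 5, col 20, width 10;
        # if the screen layout changes, only this class changes.
        return float(self._s.read_field(5, 20, 10))

print(CustomerAccount(TerminalSession(), "42").balance())
```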
... To obtain a service profile of such systems, we use a reverse engineering approach to extract the information from their source code. A number of studies for reverse engineering source code to construct UML models have been proposed [3] [12] [13]. ...
Article
Full-text available
The increasing demand for reliable Web applications gives a central role to Web testing. Most existing work focuses on the definition of novel testing techniques specifically tailored to the Web. However, no attempt has been carried out so far to understand the specific nature of Web faults. This paper presents a user session based testing technique that clusters user sessions based on the service profile, selects a set of representative user sessions from each cluster, and tailors each selected session by augmentation with additional requests to cover the dependence relationships between web pages. The created suite not only significantly reduces the size of the collected user sessions, but is also viable to exercise fault-sensitive paths. The results demonstrate that our approach consistently detected the majority of known faults using a relatively small number of test cases and will be a powerful system when more and more user sessions are being clustered.
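The selection idea can be sketched as clustering sessions by their service profile and keeping the member closest to each centroid; representing profiles as URL request counts and using k-means are assumptions for illustration, not the paper's exact procedure.

```python
# Sketch: cluster user sessions by service profile (URL request counts)
# and keep one representative per cluster. Sessions and k are invented.
import numpy as np
from sklearn.cluster import KMeans

sessions = {
    "s1": {"/login": 1, "/cart": 3, "/pay": 1},
    "s2": {"/login": 1, "/cart": 2, "/pay": 1},
    "s3": {"/login": 1, "/search": 5},
    "s4": {"/login": 1, "/search": 4},
}
urls = sorted({u for s in sessions.values() for u in s})
X = np.array([[s.get(u, 0) for u in urls] for s in sessions.values()])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
names = list(sessions)
for c in range(2):
    members = [i for i, lbl in enumerate(km.labels_) if lbl == c]
    # Representative: the member closest to the cluster centroid.
    rep = min(members,
              key=lambda i: np.linalg.norm(X[i] - km.cluster_centers_[c]))
    print(f"cluster {c}: {[names[i] for i in members]} -> keep {names[rep]}")
```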
... In the context of architecture recovery, pattern detection and clustering are two complementary approaches. The first finds common abstractions embedded in the system, but in practice never covers all entities in the system [6]. The second classifies all entities in the system, but imposes a new ordering instead of some hidden ordering [7], [8], [9], [10], [11]. ...
Conference Paper
Full-text available
This paper describes a case study that uses clustering to group classes of an existing object-oriented system of significant size into subsystems. The clustering process is based on the structural relations between the classes: associations, generalizations and dependencies. We experiment with different combinations of relationships and different ways to use this information in the clustering process. The results clearly show that dependency relations are vital to achieve good clusterings. The clustering is performed with a third party tool called Bunch. Compared to other clustering methods the results come relatively close to the result of a manual reconstruction. Performance-wise, the clustering takes a significant amount of time, but not so much as to make it impractical. In our case study, we base the clustering on information from multiple versions and compare the result to that obtained when basing the clustering on a single version. We experiment with several combinations of versions. If the clustering is based on relations that were present in both the reconstructed and the first version, this leads to a significantly better clustering result compared to that obtained when using only information from the reconstructed version.
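A tiny hill-climber in the spirit of Bunch, using a TurboMQ-style fitness over a made-up dependency graph; Bunch's actual search and fitness function are more sophisticated than this sketch.

```python
# Hill-climbing clusterer in the spirit of Bunch: move classes between
# clusters while a TurboMQ-style fitness improves. Graph is invented.
import itertools

deps = {("A", "B"), ("B", "C"), ("C", "A"),
        ("D", "E"), ("E", "F"), ("F", "D"),
        ("C", "D")}  # directed dependency edges between classes
nodes = sorted({n for e in deps for n in e})

def mq(assign):
    # Sum of cluster factors: intra-edges reward, inter-edges penalize.
    total = 0.0
    for c in set(assign.values()):
        intra = sum(1 for a, b in deps if assign[a] == c and assign[b] == c)
        inter = sum(1 for a, b in deps if (assign[a] == c) != (assign[b] == c))
        if intra:
            total += 2 * intra / (2 * intra + inter)
    return total

assign = {n: i for i, n in enumerate(nodes)}  # start: singleton clusters
improved = True
while improved:
    improved = False
    for n, c in itertools.product(nodes, set(assign.values())):
        old = assign[n]
        assign[n] = c
        if mq(assign) > mq({**assign, n: old}):
            improved = True        # keep the improving move
        else:
            assign[n] = old        # revert
print(assign)  # classes sharing a number form one recovered subsystem
```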
... Fully automated approaches to software reverse engineering (RE) appear to be impossible in general [Qui95]. While automated methods have proven to be useful in unburdening the reengineer from a number of simple but laborious RE activities, it has also been recognized that these techniques are not sufficient to solve the more complex RE problems [ALV93,Big90]. ...
Conference Paper
Full-text available
Reverse engineering is an imperfect process driven by imperfect knowledge. Most current reverse engineering tools do not adequately consider these inherent characteristics. They focus on representing precise, complete and consistent knowledge and work towards enforcing predefined structures on the processes. According to our experience, this design paradigm seriously limits human-centred reverse engineering tools. An altogether different approach is to directly support the statement and subsequent resolution of imperfections. Doing so requires that imperfect knowledge be represented and imperfect procedures accommodated. We argue that effective tools need to act as a manipulable medium for imperfect knowledge and, based on our experiences with a prototype, we elaborate requirements for such tools.
... Situations like the ones in these examples can occur frequently in program understanding because of incomplete plan libraries. It is unlikely that a plan library will contain all the plans necessary to understand a program (Chin and Quilici, 1996; Quilici, 1995). Calculate-Sum and Calculate-Count, for example, might also be part of a Calculate-Average plan, which may not actually occur in the plan library. ...
Article
Program understanding is often viewed as the task of extracting plans and design goals from program source. As such, it is natural to try to apply standard AI plan recognition techniques to the program understanding problem. Yet program understanding researchers have quietly, but consistently, avoided the use of these plan recognition algorithms. This paper shows that treating program understanding as plan recognition is too simplistic and that traditional AI search algorithms for plan recognition are not suitable, as is, for program understanding. In particular, we show (1) that the program understanding task differs significantly from the typical general plan recognition task along several key dimensions, (2) that the program understanding task has particular properties that make it particularly amenable to constraint satisfaction techniques, and (3) that augmenting AI plan recognition algorithms with these techniques can lead to effective solutions for the program understanding problem.
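A minimal sketch of the constraint-satisfaction formulation: plan components are variables, candidate statements are their domains, and plan constraints prune partial assignments. The plan, statement identifiers, and constraints below are toy examples, not the paper's algorithm.

```python
# Program understanding as constraint satisfaction (toy sketch): map each
# plan component to a distinct source statement so all constraints hold.
def solve(components, candidates, constraints, assign=None):
    assign = assign or {}
    if len(assign) == len(components):
        return assign
    comp = components[len(assign)]
    for stmt in candidates[comp]:
        if stmt in assign.values():
            continue                      # all-different: one statement each
        trial = {**assign, comp: stmt}
        if all(c(trial) for c in constraints):
            result = solve(components, candidates, constraints, trial)
            if result:
                return result
    return None

# "Running total" plan: the accumulator init must precede its update.
components = ["init-acc", "update-acc"]
candidates = {"init-acc": [1, 4], "update-acc": [6]}  # statement ids
constraints = [
    # Constraints are checked on partial assignments as they grow.
    lambda a: "init-acc" not in a or "update-acc" not in a
              or a["init-acc"] < a["update-acc"],
]
print(solve(components, candidates, constraints))
# -> {'init-acc': 1, 'update-acc': 6}
```

Constraint checking during, rather than after, assignment is exactly what makes this formulation cheaper than naive plan matching.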
... As a result of this rapid growth there is a need to understand the relationships between the different parts of a large system [1] [2]. The substantial amount of existing legacy code and/or high number of the participants in code development also necessitates the use of tools for reverse engineering [11]. Reverse engineering is "the process of analyzing a subject system to (a) identify the system's components and their interrelationships and (b) create representations of a system in another form at a higher level of abstraction" [4]. ...
Article
Full-text available
In this paper we briefly present a reverse engineering framework called Columbus that is able to analyze large C/C++ projects. Columbus supports project handling, data extraction, representation, storage and export. Efficient filtering methods can be used to produce comprehensible diagrams from the extracted information. The flexible architecture of the Columbus system (based on plug-ins) makes it a versatile and easily extendible tool for reverse engineering. Keywords: reverse engineering, source code parsing, large-scale software systems, UML, Class Model, C/C++, templates, call graph
Article
ViLLE is a visualization tool for teaching programming to novice programmers. It has an extendable support for multiple programming languages which enables language-independent learning of programming. As a new feature, ViLLE supports automatically assessed exercises. The exercises can be easily integrated into a programming course by using the TRAKLA2 web environment.
Article
In developing countries, higher learning institutions have been at the forefront of acquiring, developing and using technology. Most of these institutions influence the dissemination, and therefore the accessibility, of technology in their respective regions. Moreover, the higher learning institutions have been at the forefront in producing the ICT skills needed within their respective communities, and hence influence knowledge and its application within society. In developing countries, the ICT skills being produced mostly focus on satisfying various industrial ICT demands. In this study we analyze the role of higher learning institutions in the provision of ICT education and assess the application of such educational curricula to stimulate development and information access in informationally divided communities in developing countries.
Article
Conflictive animations are an approach to using animations in programming education which was introduced at last year's Koli Calling [4]. Conflictive animations are created so that they do not faithfully animate what the programs actually do. They aim to compel students to critically review the animation by asking them to spot possible errors or mistakes in it. Thus, students take on a new role in relation to their educational tools, which are now prone to fail.
Article
Full-text available
The aim of higher education is to enable students to acquire knowledge and to exercise cognitive skills in order to support them in their preparation for a professional career. Rather than transferring knowledge in face-to-face contact, the modern teacher has to design a stimulating learning environment. The success of educational models like Problem-Based Learning and Active Learning is often explained by the motivating effect of discussing real-life problems in small groups of students. The technology of virtual reality provides new possibilities to involve students in learning activities. No longer do groups of students (and their teacher) have to meet at a fixed time and place. Simulations and gaming can motivate students to engage in activities that make them learn. The biggest challenge for the teacher is to imagine what is motivating for a present-day student.
Article
Full-text available
The optional maturity programming exam is considered an outcome of the secondary curriculum on information technologies in Lithuania. The most important part of the exam is the evaluation of the students' programs. A special application was developed for automatic and manual evaluation of programs. It evaluates program correctness, programming constructs and style. The application proposes a score to evaluators, who make the final decision. A comparison of evaluations shows the potential of this approach.
Article
JLS and JLSCircuitTester are logic design, simulation and testing tools that meet the needs of instructors and students in logic design and computer organization courses. They were designed and implemented by instructors of such courses expressly to lecture with, to do student projects, and to subsequently grade those assignments. They are free, portable, easy to install and easy to learn and use, yet powerful enough to create and test circuits ranging from simple collections of gates to complete CPUs. They come with on-line tutorials, help, and pre-made circuits taken directly from the pages of several commonly used computer organization textbooks.
Article
The range of available visualization tools for programming education is impressive, but research on them is biased mainly toward testing the pedagogical effectiveness of the visualization tools. Most of the studies apply empirical techniques in controlled experimentation situations. The results in the field are summarized as "markedly mixed". As learning, from a constructivist point of view, is seen as a process affected by the individual, the use of visualizations in learning programming also depends on the learner. Instead of only studying whether visualizations in general are effective for learning, we should also study under which conditions visualizations are effective for certain kinds of learners. Controlled experimentation is also criticized as a method of studying learning, since it creates artificial learning situations that do not reveal the real needs of the learner. This article presents a literature review of the work carried out in the field of visualizations and analyzes the situation. On the basis of related work, we propose research questions for future work and discussion about research settings and methodology for achieving useful results for developing the field of visualizations further. The aim is that with this groundwork we can better utilize the earlier work: visualization tools that have already been developed and the research results related to these tools.
Article
Full-text available
The aim of this paper is to discuss our experience with, and some broader thoughts on, the use of student-produced podcasts as a means of supporting and assessing learning. The results of an assessment using this medium are reported, and student evaluation of the assessment presented and discussed.
Conference Paper
Several studies have reported positive experiences with Test-Driven Development (TDD) but the results still diverge. In this study we aim to improve understanding on TDD in educational context. We conducted two experiments on TDD in a master's level university course. The research setting was slightly changed in the second experiment and this paper focuses on comparing the differences between the two rounds. We analyzed the students' perceptions and the difficulties they faced with TDD. The given assignment clearly affected the students' reflections so that the more difficult assignment evoked a richer discussion among the students. Additionally, some insights into teaching TDD are discussed.
Article
From qualitative analysis of student interviews emerged three sets of categories, or outcome spaces, describing introductory students' understandings of variables. One outcome space describes different ways of understanding primitive variables. Another describes different understandings of object variables. The third outcome space describes the relationship between the primitive and object variables, again from the point of view of the student cohort. The results show that learners create various kinds of mental models of programming concepts, and that the concept of variable, which is fundamental to most types of programming, is understood in various non-viable ways. With the help of the outcome spaces, teaching materials and tools can be developed to explicitly address potential pitfalls and highlight educationally critical aspects of variables to students. A software tool, which would engage students to interact with and manipulate a visualization of a notional machine, suggests itself as an intriguing avenue for future work.
Article
Our research revealed creativity as a pathway to computer science in the biographies of CS freshmen. Furthermore, the application of creativity in CS classes was found to be a powerful instrument to address students' motivation and interest. This poster summarizes the findings of the research projects by giving concrete teaching tips on how creativity can be taken into account when planning and conducting CS lessons.
Article
Full-text available
We use empirical studies of how students understand concurrent programming and write concurrent programs to determine problem areas in students' understandings and approaches. We then suggest ways to deal with these problems to help students understand what is wrong with their concurrent programs. These include testing and visual debugging tools to help students find and understand their errors as well as feedback from teachers that makes use of these tools and knowledge of the students' understandings to clearly explain to students where they have gone wrong.
Article
As the Air Force enters the 21st century, the software that runs our information systems continues to age and become harder to maintain. Organizations that maintain this legacy software are faced with the challenge of rising software maintenance costs. This paper presents a spectrum of software maintenance options that can be used to reduce the cost of maintenance. The software understanding and programmer unfamiliarity factors from the COCOMO II model are compared to graphically show the effect good software understandability can have on the cost of maintenance. The structure, application clarity, and self-descriptiveness of a software module affect its understandability. The cost of software maintenance is quantified by combining factors from the COCOMO II model and the Software Reengineering Assessment Handbook reengineering decision model that affect understandability. A spectrum of software maintenance options, including status quo, redocumentation, reverse engineering, source code translation, restructuring within a paradigm, restructuring into a new paradigm, and new acquisition, is presented. The benefits of each option are presented in terms of their effect on understandability. Organizations faced with rising maintenance costs should consider the full spectrum of software maintenance options before choosing to replace a software module through new acquisition. By using automated tools, some of the reengineering options present a cost-effective way to reduce the cost of software maintenance.
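The understandability effect the paper quantifies can be illustrated with the COCOMO II reuse/maintenance formulas involving Software Understanding (SU) and Programmer Unfamiliarity (UNFM). The numbers below are invented, assessment-and-assimilation (AA) is taken as zero, and the formulas are our reading of the COCOMO II reuse model; treat this as a sketch, not a calibrated estimate.

```python
# Worked example: how SU (10 = very high understandability .. 50 = very low)
# and UNFM (0.0 = familiar .. 1.0 = unfamiliar) inflate equivalent size.
def equivalent_size(adapted_sloc, dm, cm, im, su, unfm, aa=0.0):
    aaf = 0.4 * dm + 0.3 * cm + 0.3 * im   # % design/code/integration changed
    if aaf <= 50:
        aam = (aa + aaf * (1 + 0.02 * su * unfm)) / 100
    else:
        aam = (aa + aaf + su * unfm) / 100
    return adapted_sloc * aam

# Same 10 KSLOC change, well-understood vs. poorly understood code:
good = equivalent_size(10_000, dm=10, cm=20, im=30, su=10, unfm=0.2)
bad  = equivalent_size(10_000, dm=10, cm=20, im=30, su=50, unfm=1.0)
print(f"equivalent SLOC: {good:.0f} (high understandability) "
      f"vs {bad:.0f} (low understandability)")   # ~1976 vs ~3800
```

In this example, poor understandability roughly doubles the effort-equivalent size of the same maintenance change, which is the core of the paper's argument for investing in understandability.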
Article
The first activity performed by maintenance programmers when approaching the task of understanding a system is often trying to discover its high-level structure, that is, identifying its subsystems and their relations: in a few words, the software architecture of the system. In this paper, an approach for the architectural analysis of software systems, together with an environment implementing the approach, is described. The approach is based on a hierarchical architectural model that drives the application of a set of architectural recognizers. Each recognizer builds an abstract view describing some architectural aspects of the system, or of some of its parts.
Conference Paper
Full-text available
Automatic program comprehension (PC) has been extensively studied for decades. It has been studied mainly from two different points of view: understanding the functionality of a program and understanding program structure. In this paper, we address the problem of automatic algorithm recognition and introduce a method based on static analysis to recognize algorithms. We discuss the applications of the method in the context of automatic assessment to widen the scope of programming assignments that can be checked automatically.
Conference Paper
Full-text available
TRAKLA2 is an online practicing environment for data structures and algorithms. The system includes visual algorithm simulation exercises, which promote the understanding of the logic and behaviour of several basic data structures and algorithms. One of the key features of the system is that the exercises are graded automatically and feedback for the learner is provided immediately. The system has been used by many institutes worldwide. In addition, several studies conducted with the system have revealed that the learning results are similar to those obtained in closed labs if the tasks are the same. Thus, automatic assessment of visual algorithm simulation exercises provides a meaningful way to reduce the workload of grading the exercises while still maintaining good learning results.
Conference Paper
Full-text available
Explanograms provide "a sketch or diagram that students can play" [10]. They are a directly recorded multi-media resource that can be viewed dynamically. Often they are used in teaching situations to provide animated explanations of concepts or processes. Explanograms were initially based upon proprietary paper and digital pen technology. The project outlined here augments that design by using a tablet PC as a mobile, general purpose capture platform which will interoperate with the existing server based system developed in Sweden. The design of this platform is intended to achieve both learning and research outcomes, in a research linked learning model for global software development. The project has completed an initial development phase during which a prototype has been built, and a consolidation, extension and evaluation phase is now underway. The origins and goals of the research, the methodology adopted, the design of the application and the challenges that the New Zealand based team have faced are presented.
Article
Program understanding activities are more difficult for programs written in languages (such as C) that make heavy use of pointers for data structure manipulation, because the programmer needs to build a mental model of the memory use and of the pointers to its locations. Pointers also pose additional problems for the tools supporting program understanding, since they introduce additional dependences that have to be accounted for. This paper extends the flow-insensitive, context-insensitive points-to analysis (PTA) algorithm proposed by Steensgaard to cover arbitrary combinations of pointer dereferences, array subscripts and field selections. It exhibits interesting properties, among which is scalability, resulting from its low complexity and good performance. The results of the analysis are valuable by themselves, as their graphical display represents the points-to links between locations. They are also integrated with other program understanding techniques such as call graph construction, slicing, plan recognition and architectural recovery. The use of this algorithm in the framework of the program understanding environment CANTO is discussed.
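A compact sketch of the unification-based core of Steensgaard-style points-to analysis, using union-find so assignments merge abstract locations in near-linear time. The statement handling below is deliberately coarser than the full algorithm, and the paper's extensions for dereferences, array subscripts and field selections are omitted.

```python
# Steensgaard-flavoured sketch: assignments unify abstract location classes.
parent, pts = {}, {}            # union-find parents; root -> pointee root

def find(x):
    parent.setdefault(x, x)
    if parent[x] != x:
        parent[x] = find(parent[x])   # path compression
    return parent[x]

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    parent[rx] = ry
    # Unifying two pointers also unifies whatever they point to.
    if rx in pts and ry in pts:
        union(pts[rx], pts[ry])
    elif rx in pts:
        pts[ry] = pts[rx]

def target(p):
    r = find(p)
    pts.setdefault(r, f"*{r}")    # fresh abstract pointee if none known
    return pts[r]

union(target("p"), "x")   # p = &x : p's pointee class now includes x
union(target("q"), "y")   # q = &y
union("p", "q")           # p = q : unify p and q outright (coarser than
                          # unifying only their pointees, but sound here)
print(find("x") == find("y"))   # True: x and y collapse into one class
```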
Conference Paper
Object-oriented application frameworks provide an established way of reusing the design and implementation of applications in a specific domain. Using a framework for creating applications is not a trivial task, however, and special tools are needed for supporting the process. Tool support, in turn, requires explicit specification of the reuse interfaces of frameworks. Unfortunately these specifications typically become quite extensive and complex for non-trivial frameworks. In this paper we discuss the possibility to reverse engineer a reuse interface specification from a framework's and its example applications' source code. We also introduce a programming environment that supports both making and using such specifications. In our environment, the reuse interface modeling is supported by a concept analysis based reverse engineering technique described in this paper.
Conference Paper
Web applications have become major driving forces for world business. Effective and efficient testing of evolving Web applications is essential for providing reliable services. In this paper, we present a user session based testing technique that clusters user sessions based on the service profile and selects a set of representative user sessions from each cluster. Then each selected user session is tailored by augmentation with additional requests to cover the dependence relationships between Web pages. The created test suite not only can significantly reduce the size of the collected user sessions, but is also viable to exercise fault sensitive paths. We conducted two empirical studies to investigate the effectiveness of our approach- one was in a controlled environment using seeded faults, and the other was conducted on an industrial system with real faults. The results demonstrate that our approach consistently detected the majority of the known faults by using a relatively small number of test cases in both studies.
Article
Program comprehension (PC) is a research field that has been extensively studied from different points of view, including human program understanding and mental models, automated program understanding, etc. In this paper, we discuss algorithm recognition (AR) as a subfield of PC and explain their relationship. We present a method for automatic AR from Java source code. The method is based on static analysis of program code including various statistics of language constructs, software metrics, as well as analysis of roles of variables in the target program. In the first phase of the method, a number of different implementations of the supported algorithms are analyzed and stored in the knowledge base of the system as learning data, and in the second phase, previously unseen algorithms are recognized using this information. We have developed a prototype and successfully applied the method for recognition of sorting algorithms. This process is explained in the paper along with the experiment we conducted to evaluate the performance of the method. Although the method, at its current state, is still sensitive to changes made to target algorithms, the encouraging results of the experiment demonstrate that it can be further developed to be used as a PC method in various applications, for example, in automatic assessment tools to check the algorithms used by students, functionality that is currently missing from these tools.
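The two-phase method can be caricatured as learning feature vectors for known implementations and classifying unseen code by nearest neighbor; the features below are assumed stand-ins for the paper's actual metrics and roles-of-variables analysis.

```python
# Sketch of two-phase algorithm recognition: learn feature vectors for known
# implementations, then classify unseen code by nearest neighbor.
# Features (nested loops, swaps, recursion, temps) are invented stand-ins.
import math

training = {  # algorithm -> example feature vectors
    "bubble sort":   [(2, 1, 0, 1), (2, 1, 0, 2)],
    "quicksort":     [(1, 1, 2, 2), (1, 2, 2, 3)],
    "linear search": [(1, 0, 0, 1)],
}

def nearest(vector):
    best, best_d = None, math.inf
    for algo, examples in training.items():
        for ex in examples:
            d = math.dist(vector, ex)       # Euclidean distance
            if d < best_d:
                best, best_d = algo, d
    return best

# Feature vector extracted (by static analysis) from an unseen function:
print(nearest((2, 1, 0, 1)))  # -> 'bubble sort'
```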
Article
Full-text available
PatternCoder is a software tool to aid student understanding of class associations. It has a wizard-based interface which allows students to select an appropriate binary class association or design pattern for a given problem. Java code is then generated which allows students to explore the way in which the class associations are implemented in a programming language. This paper describes the rationale behind the tool, gives a description of the tool itself, and reports on our experiences of using the tool in our teaching.
Article
At schools, special learning and programming environments are often used in the field of algorithms. Particularly with regard to computer science lessons in secondary education, they are supposed to help novices learn the basics of programming. In several parts of Germany (e.g., Bavaria) these fundamentals are taught as early as the seventh grade, when pupils are 12 to 13 years old. Specially designed, age-based learning and programming environments such as Karel the robot and Kara, the programmable ladybug, are used, but learners still underachieve. One possible approach to improving both the teaching and the learning process is to specify the knowledge concerning the learners' individual problem-solving strategies, their solutions, and their respective quality. A goal of the research project described here is to design the learning environment so that it can identify and categorize several problem-solving strategies automatically. Based on this knowledge, learning and programming environments can be improved, which will optimize the computer science lessons in which they are applied. Therefore, the environments must be enhanced with special analytic and diagnostic modules, the results of which can be given to the learner in the form of individualized system feedback messages in the future. In this text, preliminary considerations are presented. The research methodology as well as the design and the implementation of the research instruments are explained. We describe first studies, whose results are presented and discussed.
Article
Software development remains mentally challenging despite the continual advancement of training, techniques, and tools. Because completely automating software development is currently impossible, it makes sense to seriously consider how tools can improve the mental activities of developers apart from automating them away. Such mental assistance can be called "cognitive support". Understanding and developing cognitive support in software engineering tools is an important research issue but, unfortunately, at the moment our theoretical foundations for it are inadequately developed. Furthermore, much of the relevant research has occurred outside of the software engineering community, and is therefore not easily available to the researchers who typically develop software engineering tools. Tool evaluation, comparison, and development are consequently impaired. The present work introduces a theoretical framework intended to seed further systematic study of cognitive support in the field of software engineering tools. This theoretical framework, called RODS, imports ideas and methods from a field of cognitive science called "distributed cognition". The crucial concept in RODS is that cognitive support can be understood and explained in terms of the computational advantages that are conferred when cognition is redistributed between software developer and their tools and environment. The name RODS, in fact, comes from the four cognitive support principles the framework describes. With RODS in hand, it is possible to interpret good design in terms of how cognition is beneficially rearranged. To make such analyses fruitful, a cognitive modeling framework called HASTI is also proposed. The main purpose of HASTI is to provide an analysis of ways of modifying developer cognition usi...
Conference Paper
Reverse engineering of legacy systems is a knowledge-intensive process to reconstruct the understanding of a system. A semi-automatic process that can extract architecture-level structure from legacy systems is introduced in this work. Exact facts related to cross-references among ELF objects are extracted from files automatically, and then partitioned into hierarchical groups through close cooperation between domain experts and an assistant tool, DEREF. By resolving the cross-references among these groups, the architectural structure is reconstructed and then visualized using auto-layout techniques. A case study on three embedded operating systems demonstrates that this process can be used to obtain a comprehensive understanding of legacy systems even without any a priori knowledge of their design.
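The automatic fact-extraction step can be sketched by recovering object-to-object references from ELF symbol tables, for example by parsing binutils nm output; the object file paths below are placeholders, and this is not the DEREF tool's actual mechanism.

```python
# Sketch: recover cross-references among ELF objects by matching each
# object's undefined symbols against the symbols other objects define.
import subprocess
from collections import defaultdict

objects = ["drivers/uart.o", "kernel/sched.o"]   # placeholder object files

defined, undefined = {}, defaultdict(set)
for obj in objects:
    out = subprocess.run(["nm", obj], capture_output=True, text=True).stdout
    for line in out.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        sym, kind = parts[-1], parts[-2]
        if kind == "U":
            undefined[obj].add(sym)      # symbol this object needs
        elif kind in ("T", "D", "B"):
            defined[sym] = obj           # symbol this object provides

# Cross-reference edges: user object -> provider object.
for obj, syms in undefined.items():
    for sym in syms:
        if sym in defined and defined[sym] != obj:
            print(f"{obj} -> {defined[sym]}  ({sym})")
```

The resulting edge list is exactly the kind of raw fact base that experts would then partition into hierarchical groups.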
Conference Paper
Full-text available
This paper describes a process of gradual reengineering of the procedural components of a legacy system. The process is integrated and completed by the data reengineering process analyzed in a previous paper by the same authors. The proposed method enables the legacy system to be gradually emptied into the reengineered system, without needing to either duplicate the legacy system or freeze it. The process consists of evolving the legacy system components toward firstly a restored system and then toward the reengineered system. Meanwhile, the legacy system can coexist with both the restored and the reengineered parts. By the end of the process, a single system will be in existence: the reengineered one. The method has been applied to reengineer a real system and demonstrated its ability to: support gradual reengineering, maintain the system at work during the process, minimize the need to freeze maintenance requests, renew the operative environment of the reengineered system with respect to the legacy system and, finally, eliminate all the system's aging symptoms
Article
Full-text available
During its life, a legacy system is subjected to many maintenance activities, which cause degradation of the quality of the system. When this degradation exceeds a critical threshold, the legacy system needs to be reengineered. In order to preserve the asset represented by the legacy system, the familiarity with it gained by the system's maintainers and users, and the continuity of execution of current operations during the reengineering process, the system needs to be reengineered gradually. Moreover, each program needs to be reengineered within a short period of time. The paper proposes a reengineering process model, which is applied to an in-use legacy system to confirm that the process satisfies previous requirements and to measure its effectiveness. The reengineered system replaced the legacy one to the satisfaction of all the stakeholders; the reengineering process also had a satisfactory impact on the quality of the system. Finally, this paper contributes to validate the cause-effect relationship between the reengineering process and overcoming the aging symptoms of a software system.
Article
The process of understanding a source code in a high-level programming language is a complex cognitive task. The provision of helpful decision aid subsystems would be of great benefit to software maintainers. Given a library of program plan templates, generating a partial understanding of a piece of software source code can be shown to correspond to the construction of mappings between segments of the source code and particular program plans represented in a library of domain source programs (plans). These mappings can be used as part of the larger task of reverse engineering source code, to facilitate many software engineering tasks such as software reuse, and for program maintenance. We present a novel model of program understanding using constraint satisfaction. The model composes a partial global picture of source program code by transforming knowledge about the problem domain and the program structure into constraints. These constraints facilitate the efficient construction of ma...
Article
Most current models of program understanding are unlikely to scale up successfully. Top-down approaches require advance knowledge of what the program is supposed to do, which is rarely available with aging software systems. Bottom-up approaches require complete matching of the program against a library of programming plans, which is impractical with the large plan libraries needed to understand programs that contain many domain-specific plans. This paper presents a hybrid approach to program understanding that uses an indexed, hierarchical organization of the plan library to limit the number of candidate plans considered during program understanding. This approach is based on observations made from studying student programmers attempting to perform bottom-up understanding of geometrically-oriented C functions.
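A minimal sketch of an indexed, hierarchical plan library: salient features index which plans are considered at all, and specializations are tried only where the parent plan matched. The plans and features below are invented.

```python
# Sketch: index features limit the candidate plans; the hierarchy is only
# descended where the parent plan already matched the code fragment.
PLANS = {
    "traverse-list": {"index": {"while-loop", "pointer-advance"},
                      "specializations": ["search-list", "count-list"]},
    "search-list":   {"index": {"while-loop", "pointer-advance", "compare"},
                      "specializations": []},
    "count-list":    {"index": {"while-loop", "pointer-advance", "increment"},
                      "specializations": []},
}

def candidates(features, roots=("traverse-list",)):
    """Return plans whose index features all occur in the fragment,
    descending the hierarchy only under plans that already matched."""
    matched = []
    for name in roots:
        plan = PLANS[name]
        if plan["index"] <= features:          # subset test on feature sets
            matched.append(name)
            matched += candidates(features, plan["specializations"])
    return matched

print(candidates({"while-loop", "pointer-advance", "increment"}))
# -> ['traverse-list', 'count-list']; only these get full plan matching
```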
Conference Paper
The large size and high percentage of domain-specific code in most legacy systems make it unlikely that automated understanding tools will be able to completely understand them. Yet automated tools can clearly recognize portions of the design. That suggests exploring environments in which programmer and system work together to understand legacy software. This paper describes such an environment that supports programmer and system cooperating to extract an object-oriented design from legacy software systems. It combines an automated program understanding component that recognizes standard implementations of domain-independent plans with a structured notebook that the programmer uses to link object-oriented design primitives to arbitrary source code fragments. This jointly extracted information is used to support conceptual queries about the program's code and design.
Conference Paper
We present a framework for recognition of data structures in programs, to aid in design recovery. The framework consists of an intermediate representation and a knowledge base containing information about typical implementations of abstract data types. The framework is suited for recognition of data structures combined with their characteristic operations. Abstract data structures can be recognized partially, they can be recognized even if they are delocalized, and different independent interpretations of the same structures can be generated.
Conference Paper
The development of a tool for modularizing large common business-oriented language (COBOL) programs is described. The motivation for modularizing these programs is discussed, together with a manual modularization process. The business motivation for building a tool to automate the manual process is indicated. An enabling technology and its use in the development of the tool are discussed. Experience to date in alpha-testing the tool is reported