Article

A Memory-Based Approach to Recognizing Programming Plans.

Authors:
Alex Quilici

Abstract

Most current models of program understanding are unlikely to scale up successfully. Top-down approaches require advance knowledge of what the program is supposed to do, which is rarely available with aging software systems. Bottom-up approaches require complete matching of the program against a library of programming plans, which is impractical with the large plan libraries needed to understand programs that contain many domain-specific plans. This paper presents a hybrid approach to program understanding that uses an indexed, hierarchical organization of the plan library to limit the number of candidate plans considered during program understanding. This approach is based on observations made from studying student programmers as they attempted to perform bottom-up understanding on geometrically-oriented C functions.
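
The abstract's key idea, indexing the plan library so that only a few candidate plans are ever matched, can be pictured with a small sketch. The following Python fragment is a minimal illustration under assumed names (`Plan`, `PlanLibrary`, and the feature strings are all invented); it is not the paper's implementation.

```python
# A minimal sketch of an indexed plan library (hypothetical names and
# features; not the paper's actual data structures). Each plan records
# the surface features that index it, so only plans whose indices
# actually occur in the code ever become candidates.
from dataclasses import dataclass, field

@dataclass
class Plan:
    name: str
    indices: set                      # surface features suggesting this plan
    specializations: list = field(default_factory=list)
    implied_plans: list = field(default_factory=list)

class PlanLibrary:
    def __init__(self, plans):
        # Invert the plan/index relation once, up front.
        self.by_index = {}
        for plan in plans:
            for feature in plan.indices:
                self.by_index.setdefault(feature, []).append(plan)

    def candidates(self, code_features):
        """Return only plans indexed by features present in the code,
        instead of matching the entire library bottom-up."""
        found = {}
        for feature in code_features:
            for plan in self.by_index.get(feature, []):
                found[plan.name] = plan
        return list(found.values())

# Usage: in a real tool the features would come from parsing the C code.
lib = PlanLibrary([
    Plan("distance-between-points", {"sqrt-call", "squared-difference"}),
    Plan("array-sum", {"loop-accumulate"}),
])
print([p.name for p in lib.candidates({"sqrt-call", "squared-difference"})])
```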


... The program representation is typically a complex data structure that can be queried and otherwise manipulated. Abstract syntax trees (ASTs) constructed during the scanning and parsing of the code of a program are a popular choice for the foundation of program representations [Hartman, 1991, Kozaczynski et al., 1992, Quilici, 1994]. Data flow graphs are another program representation type that is used. ...
... The approach of Quilici [1994] uses APC to find common operations and objects in C code in order to replace them with object-oriented C++ code from existing libraries. ...
... This category includes GRASPR [Wills, 1996] and PARE [Rist, 1994]. The third purpose category includes APC systems that are used to analyze and improve existing software, see RECOGNIZE [Kozaczynski et al., 1992] and Quilici's method [Quilici, 1994]. The APC systems seem to mirror human program comprehenders: the long-term knowledge stored in the human brain is represented by a programming knowledge repository, which is our term for the long-term storage of programming knowledge in APC systems. ...
... Quilici [11] defines automated program comprehension (APC) as the process of automatically extracting programming knowledge from source code. We will use the term automatic program comprehension as a synonym for the alternative notions used in literature: program recognition [18] and program understanding [7]. ...
... APC systems have many different application areas. Some maintenance-related objectives of APC are the generation of documentation, refactoring (RECOGNIZE [8]) and using APC techniques to locate reusable code. The translation of a program from some language to another language can be done with an APC system; for example, C programs have been translated to object-oriented C++ (Quilici's method [11]). Visualization of programming concepts may benefit from APC methods: the constructs that are to be visualized are automatically detected, thus reducing the effort of creating the visualization, or even automating the creation process. ...
... The program representation is typically a complex data structure that can be queried and otherwise manipulated. Abstract syntax trees (ASTs) constructed during the scanning and parsing of the code of a program are a popular choice for the foundation of program representations [7,8,11]. Data flow graphs are another program representation type that is used. ...
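
As a concrete illustration of the AST idea these snippets describe, Python's standard `ast` module parses source text into a tree that can then be queried. The cited tools analyze C and use richer representations, so this only conveys the general flavor.

```python
import ast

source = """
total = 0
for x in values:
    total = total + x
"""

tree = ast.parse(source)
# Query the tree: report every loop and every assignment target.
for node in ast.walk(tree):
    if isinstance(node, ast.For):
        print("loop over:", ast.dump(node.iter))
    elif isinstance(node, ast.Assign):
        targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
        print("assigns:", targets)
```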
Article
Full-text available
Automatic program comprehension applications, which try to extract programming knowledge from program code, share many features of human program comprehension models. However, the human trait of learning seems to be missing among the shared features. We present an approach to integrate machine learning techniques into automatic program comprehension, and present an example implementation in the context of automatic analysis of roles of variables.
... Program understanding has been frequently defined as a recognition process of program plans in a fragment of source code [Quilici, 1994] [Kozaczynski, 1994] [Wills, 1990]. Research on program understanding aims to identify conceptual information and to develop concepts and extraction tools from existing, operational systems. Two types of extraction concepts have been identified: program plans and program slices. ...
... Indeed, every task corresponds to an abstract concept. We call a program plan a description of correspondences between a task and its sequence of instructions [Quilici, 1994]. In other words, a program plan is a recognition rule of an abstract concept from language concepts specified in source code. ...
... A program plan is a description of a computational unit contained within a program, where a computational unit performs some abstract function [10]. The code satisfying the plan can be located in contiguous (localized) or non-contiguous (de-localized) sequences of code [11]. To date, most plan-based approaches have been developed by research organizations [12, 13, 14], although some industrial adoption of this approach is occurring [15]. A parsing-based approach is one in which a program is analyzed using the properties of the syntactic structure of a programming language. ...
... Among the techniques used, parsing-based is the most popular. Of note is the fact that the Software Refinery supports the use of transformations, although the built-in tools do not use formal transformation as an analysis technique. (Tool abbreviations: PA = PAT [29]; CS = COBOL/SRE [13, 30]; DE = DECODE [12]; LT = LANTRN [14]; MA = Maintainer's Assistant [20, 31, 32, 33]; RE = REDO Toolset [21]; RI = Rigi [18, 34]; AS = AutoSpec [22, 35]; RM = RMTool [19, 36].) Table 5 compares research tools using the informational criteria. This table shows that, like the commercial tools, C and COBOL are the most widely supported languages. ...
Conference Paper
Full-text available
Several techniques have been suggested for supporting reverse engineering and design recovery activities. While many of these techniques have been cataloged in various collections and surveys, the evaluation of the corresponding support tools has focused primarily on their usability and supported source languages, mostly ignoring evaluation of the appropriateness of the by-products of a tool for facilitating particular types of maintenance tasks. In this paper, we describe criteria that can be used to evaluate tool by-products based on semantic quality, where the semantic quality measures the ability of a by-product to convey certain behavioral information. We use these criteria to review, compare, and contrast several representative tools and approaches.
... The pattern recovery process relies on a design pattern library, thus there are similarities with the program understanding and architectural recovery approaches based on cliché matching and plan recognition. The AOL intermediate representation is similar to other representations adopted by the program understanding community (Fiutem et al., 1996; Kozaczynski et al., 1992; Ning et al., 1994; Quilici, 1994; Tonella et al., 1996) while the library has its counterpart in the cliché (Fiutem et al., 1996; Tonella et al., 1996) and plan collections (Kozaczynski et al., 1992; Ning et al., 1994; Quilici, 1994). Our work substantially differs from the above mentioned: problem, programming language and tools as well as the approach are different. ...
Article
Object-Oriented (OO) design patterns are an emergent technology: they are reusable micro-architectures, high-level building blocks. A system which has been designed using well-known, documented and accepted design patterns is also likely to exhibit good properties such as modularity, separation of concerns and maintainability. While for forward engineering the benefits of using design patterns are clear, using reverse engineering technologies to discover instances of patterns in a software artifact (e.g., design or code) may help in several key areas, among which are program understanding, design-to-code traceability and quality assessment. This paper describes a conservative approach and experimental results, based on a multi-stage reduction strategy using OO software metrics and structural properties to extract structural design patterns from OO design or C++ code. To assess the effectiveness of the pattern recovery approach, a process and a portable tool suite written in Java, remotely accessible by means of any web browser, have been developed. The developed system and experimental results on 8 industrial software systems (design and code) and 200,000 lines of public-domain C++ code are presented.
... Such concepts are represented using ASTs with additional constraints based on control-flows and data-flows. Quilici [8] extended this method by sacrificing the ability to recognize every concept located in the source code to improve efficiency. These approaches are similar to our proposal in that both compare the patterns of algorithms with the programs, even though our proposal only uses ASTs as patterns and such patterns are extracted from the model answers. ...
... Wills follows a bottom-up strategy to detect plans, which limits the practical usage of the approach to source code of about 1000 lines. Quilici [Qui94] proposes an indexing technique and combines top-down and bottom-up detection to overcome this problem. ...
Article
Full-text available
Recovering the static structure of legacy source code, e.g. as a UML class diagram, is quite well understood. In contrast, recovering high-level behaviour diagrams from source code is still an open issue. This paper proposes to use fuzzy pattern detection techniques for the recovery of UML collaboration diagrams from source code. The approach is based on a knowledge base of basic datatypes, of generic collection classes, of code clichés for Java beans, and of fuzzy patterns for object structure look-up and modification clichés. We handle the diversity of existing code clichés by organizing them in an object-oriented hierarchy factorizing important common properties and by relaxing exactness requirements for cliché detection with the help of fuzzy theory. We handle the runtime efforts for cliché detection using a sophisticated inference mechanism based on generic fuzzy reasoning nets (GFRNs). The work is part of the FUJABA case tool aiming to support round-trip engineering for UML and Java.
... Indeed, every task corresponds to an abstract concept. We call a program plan a description of correspondences between a task and its sequence of instructions [22]. In other words, a program plan is a recognition rule of an abstract concept from language concepts specified in source code. ...
Article
This work proposes a computer-assisted approach to assessing algorithmic competencies. Beyond offering a solution to the delicate problem of e-assessment in algorithmics, the approach is also formative. Drawing inspiration from the basic principles of the algorithmic field itself, it achieves good efficiency. It rests on a scalable base of solutions: a learner's production for a given problem is assessed automatically if it is recognized, or by a human expert otherwise. In the latter case, the production enriches the base if it is judged pedagogically interesting. The purpose of this approach is to provide formative and diagnostic assessment in order to empower the learner to acquire algorithmic problem-solving skills.
... It is therefore an open question whether it is possible to develop efficient program understanding algorithms. Most program understanding algorithms attack the combinatorial problems in plan matching by using heuristic strategies [1, 3, 6, 5, 8, 12, 17]. Unfortunately, the expected performance of these heuristic approaches is difficult to determine analytically. ...
Article
The plan matching problem is to determine whether a program plan is present in a program. This problem has been shown to be NP-hard, which makes it an open question whether plan matching algorithms can be developed that scale sufficiently well to be useful in practice. This paper discusses experiments in the scalability of a series of constraint-based program plan matching algorithms we have developed. These empirical studies have led to significant improvements in the scalability of our plan matching algorithm, and they suggest that this algorithm can be successfully applied to large, real-world programs.
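
The abstract leaves the algorithm itself to the paper, but constraint-based plan matching can be pictured as a small constraint-satisfaction search: plan components are variables, program statements are their candidate values, and constraints between components prune the search. Below is a minimal backtracking sketch with hypothetical plan, statement, and constraint names; it is not the authors' algorithm.

```python
def match_plan(components, statements, constraints):
    """Assign one program statement to each plan component so that all
    constraints hold. Returns a dict or None. Exponential in the worst
    case (the problem is NP-hard), but constraints prune branches early."""
    assignment = {}

    def backtrack(i):
        if i == len(components):
            return dict(assignment)
        comp = components[i]
        for stmt in statements:
            if stmt in assignment.values():
                continue
            assignment[comp] = stmt
            if all(check(assignment) for check in constraints):
                result = backtrack(i + 1)
                if result is not None:
                    return result
            del assignment[comp]
        return None

    return backtrack(0)

# Toy usage: a two-component "accumulate" plan over numbered statements.
stmts = ["s1: total = 0", "s2: total += x", "s3: print(total)"]
constraints = [
    lambda a: "init" not in a or "= 0" in a["init"],
    lambda a: "update" not in a or "+=" in a["update"],
]
print(match_plan(["init", "update"], stmts, constraints))
```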
... Several techniques have been suggested for recovering design artifacts from existing systems. These techniques range from formal approaches [9] to semiformal functional abstraction [10] and structural abstraction [6]. The representations constructed by these techniques are often biased by the implementations, and as such, do not always correspond to existing high-level models in the recovery process. ...
Conference Paper
As a software system evolves, new features are added and obsolete ones are removed, and the design artifacts gradually diverge from the original design. Many approaches to design recovery or reverse engineering have been suggested, most with some type of support tool. Since a project's time constraints may prohibit use of sophisticated techniques and/or tools, due to the learning curves associated with them, methods that can be applied in lieu of complex support tools may be required. Reverse engineering produces a high-level representation of a software system from a low-level one. This paper describes a case study that uses a reverse engineering methodology to recover the design artifacts of a software system from its source code and related documentation. The methodology consists of five phases, which can be attempted at different levels of abstraction according to the task at hand. It also makes use of tools, approaches and representations typically found in the forward software development process.
Article
Complex programs often contain multiple, interwoven strands of computation, each responsible for accomplishing a distinct goal. The individual strands responsible for each goal are typically delocalized and overlap rather than being composed in a simple linear sequence. We refer to these code fragments as being interleaved. Interleaving may be intentional (for example, in optimizing a program, a programmer might use some intermediate result for several purposes) or it may creep into a program unintentionally, due to patches, quick fixes, or other hasty maintenance practices. To understand this phenomenon, we have looked at a variety of instances of interleaving in actual programs and have distilled characteristic features. This paper presents our characterization of interleaving and the implications it has for tools that detect certain classes of interleaving and extract the individual strands of computation. Our exploration of interleaving has been done in the context of a case study of a corpus of production mathematical software, written in Fortran, from the Jet Propulsion Laboratory. This paper also describes our experiences in developing tools to detect specific classes of interleaving in this software, driven by the need to enhance a formal description of this software library's components. The description, in turn, aids in the automated component-based synthesis of software using the library.
Article
Maintenance of legacy systems and the advancement of reverse engineering techniques have placed a renewed interest in the program comprehension process. Descriptive studies of the program comprehension process have resulted in two seemingly opposing approaches. One approach is top-down in nature (consisting of hypothesis generation, decomposition, refinement, and verification) and the other is bottom-up in nature (consisting of low-level component comprehension, and the assembly and integration of related components). An empirical study is conducted to assess the exclusive nature of each approach during the comprehension process. The results refute the exclusive classification of program comprehension processes as either top-down or bottom-up. The constructs common between the approaches are assessed in an attempt to arrive at a reconciliation. A theory-based model of program comprehension is illuminated which is capable of accounting for the utilization of both top-down and bottom-up strategies within a single comprehension episode.
Article
Modernizing heavily evolved and poorly documented information systems is a central software engineering problem in our current IT industry. It is often necessary to reverse engineer the design documentation of such legacy systems. Several interactive CASE tools have been developed to support this human-intensive process. However, practical experience indicates that their applicability is limited because they do not adequately handle imperfect knowledge about legacy systems. In this paper, we investigate the applicability of several major theories of imperfect knowledge management in the area of soft computing and approximate reasoning. The theories are evaluated with respect to how well they meet requirements for generating effective human-centred reverse engineering environments. The requirements were elicited with help from practical case studies in the area of database reverse engineering. A particular theory called "possibilistic logic" was found to best meet these requirements most comprehensively. This evaluation highlights important challenges to the designers of knowledge management techniques, and should help reverse engineering tool implementers select appropriate technologies.
Conference Paper
Computing educators often rely on black-box analysis to assess students' work automatically and give feedback. This approach does not allow analyzing the quality of programs and checking if they implement the required algorithm. We introduce an instrument for recognizing and classifying algorithms (Aari) in terms of white-box testing to identify authentic students' sorting algorithm implementations in a data structures and algorithms course. Aari uses machine learning techniques to classify new instances. The students were asked to submit a program to sort an array of integers in two rounds: at the beginning of the course before sorting algorithms were introduced, and after taking a lecture on sorting algorithms. We evaluated the performance of Aari with the implementations of each round separately. The results show that the sorting algorithms which Aari has been trained to recognize are recognized with an average accuracy of about 90%. When considering all the submitted sorting algorithm implementations (including the variations of the standard algorithms), Aari achieved an overall accuracy of 71% and 81% for the first and second round, respectively. In addition, we analyzed the students' implementations manually to gain a better understanding of the reasons for failure in the recognition process. This analysis revealed that students have many misconceptions related to sorting algorithms, which result in problematic implementations that are less efficient than standard algorithms. We discuss these variations along with the application of the tool in an educational context, its limitations and some directions for future work.
Article
In this study, we examined freshman students' sorting algorithm implementations in a data structures and algorithms course in two phases: at the beginning of the course, before the students received any instruction on sorting algorithms, and after taking a lecture on sorting algorithms. The analysis revealed that many students have an insufficient understanding of implementing sorting algorithms. For example, they include unnecessary swaps in their Insertion or Selection sort implementations, resulting in more complicated and inefficient code. Based on the data, we present a categorization of these types of variations and discuss the implications of the results. In addition, we introduce an instrument to recognize these algorithms automatically, in terms of white-box testing. Our aim is to develop an automatic assessment system to relieve teachers of the burden of marking students' assignments and to give students feedback on their algorithmic solutions. We outline how the presented results can be used to develop the instrument further.
Article
This paper presents a topic-relevance method for Tibetan web pages that integrates content evaluation and link analysis. Topic-relevance analysis of Tibetan web pages is the most important part of specialized Tibetan search: it guides the web crawler to download pages accurately and efficiently. The content evaluation extends the keyword-based vector space model (VSM), weighting a keyword differently according to its position in the page. The link analysis integrates the principles of PageRank and also considers whether the anchor text and the hosting website of a page are relevant to the topic.
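
As a rough illustration of the position-weighted content evaluation described above, the fragment below scores a page against topic keywords. The weights and page structure are invented for the example and are not taken from the paper.

```python
# Toy illustration of position-weighted keyword relevance (made-up
# weights; not the paper's actual model). A keyword counts more when it
# appears in the title or anchor text than in the body.
POSITION_WEIGHT = {"title": 3.0, "anchor": 2.0, "body": 1.0}

def relevance(page, topic_keywords):
    score = 0.0
    for position, text in page.items():
        for word in text.lower().split():
            if word in topic_keywords:
                score += POSITION_WEIGHT.get(position, 1.0)
    return score

page = {"title": "tibetan culture portal",
        "anchor": "tibetan history links",
        "body": "articles about culture and history"}
print(relevance(page, {"tibetan", "culture", "history"}))  # 12.0
```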
Article
Applying the case-based reasoning (CBR) method to program understanding provides a practical route towards more powerful software engineering technology. A CBR approach to the recognition of model components is presented, and the whole reasoning process of the recognition is described, including a case representation method and a matching algorithm. A prototype system named process model component recognition & reuse (PMCRR) has been developed to implement model transformation and reconstruction. Finally, an example illustrates the efficiency of the CBR method.
Article
ViLLE is a visualization tool for teaching programming to novice programmers. It has an extendable support for multiple programming languages which enables language-independent learning of programming. As a new feature, ViLLE supports automatically assessed exercises. The exercises can be easily integrated into a programming course by using the TRAKLA2 web environment.
Article
In developing countries, higher learning institutions have been at the forefront of acquiring, developing, and using technology. Most such institutions influence the dissemination, and therefore the accessibility, of technology in their respective regions. Moreover, higher learning institutions have been at the forefront in producing the ICT skills needed within their communities, and hence influence knowledge and its application within society. In developing countries, the ICT skills being produced mostly focus on satisfying various industrial ICT demands. In this study we analyze the role of higher learning institutions in the provision of ICT education and assess the application of such educational curricula to stimulate development and information access in information-divided communities in developing countries.
Article
Conflictive animations are an approach to using animations in programming education which was introduced at last year's Koli Calling [4]. Conflictive animations are created so that they do not animate faithfully what the programs intend to do. They aim to compel students to critically review the animation by asking them to spot possible errors or mistakes in it. Thus, students take a new role in their relation to educational tools, which are now prone to fail.
Article
Full-text available
The aim of higher education is to enable students to acquire knowledge and to exercise cognitive skills in order to support them in their preparation for a professional career. Rather than transferring knowledge in face-to-face contact, the modern teacher has to design a stimulating learning environment. The success of educational models like Problem-Based Learning and Active Learning is often explained by the motivating effect of discussing real-life problems in small groups of students. The technology of virtual reality provides new possibilities to involve students in learning activities. No longer do groups of students (and their teacher) have to meet at a fixed time and place. Simulations and gaming can motivate students to engage in activities that make them learn. The biggest challenge for the teacher is to imagine what is motivating for a present-day student.
Article
Full-text available
The optional maturity programming exam is considered an outcome of the secondary curriculum on information technologies in Lithuania. The most important part of the exam is the evaluation of the students' programs. A special application was developed for automatic and manual evaluation of programs. It evaluates program correctness, programming constructs and style. The application proposes a score to evaluators, who make the final decision. A comparison of evaluations shows the potential of this approach.
Article
JLS and JLSCircuitTester are logic design, simulation and testing tools that meet the needs of instructors and students in logic design and computer organization courses. They were designed and implemented by instructors of such courses expressly to lecture with, to do student projects, and to subsequently grade those assignments. They are free, portable, easy to install and easy to learn and use, yet powerful enough to create and test circuits ranging from simple collections of gates to complete CPUs. They come with on-line tutorials, help, and pre-made circuits taken directly from the pages of several commonly used computer organization textbooks.
Article
The range of available visualization tools for programming education is impressive, but the research on them is biased mainly towards testing the pedagogical effectiveness of the visualization tools. Most of the studies apply empirical techniques in controlled experimental situations. The results in the field are summarized as being "markedly mixed". Since learning, from a constructivist point of view, is seen as a process affected by the individual, the use of visualizations in learning programming also depends on the learner. Instead of only studying whether visualizations in general are effective for learning, we should also study in which conditions visualizations are effective for certain kinds of learners. Controlled experimentation is also criticized as a method of studying learning, since it creates artificial learning situations that do not reveal the real needs of the learner. This article presents a literature review on the work carried out in the field of visualizations and analyzes the situation. On the basis of related work, we propose research questions for future work and discuss research settings and methodology for achieving useful results to develop the field of visualizations further. The aim is that with this groundwork we could better utilize the earlier work: visualization tools that have already been developed and the research results related to these tools.
Article
Full-text available
The aim of this paper is to discuss our experience with, and some broader thoughts on, the use of student-produced podcasts as a means of supporting and assessing learning. The results of an assessment using this medium are reported, and student evaluation of the assessment presented and discussed.
Conference Paper
Several studies have reported positive experiences with Test-Driven Development (TDD) but the results still diverge. In this study we aim to improve understanding on TDD in educational context. We conducted two experiments on TDD in a master's level university course. The research setting was slightly changed in the second experiment and this paper focuses on comparing the differences between the two rounds. We analyzed the students' perceptions and the difficulties they faced with TDD. The given assignment clearly affected the students' reflections so that the more difficult assignment evoked a richer discussion among the students. Additionally, some insights into teaching TDD are discussed.
Article
From qualitative analysis of student interviews emerged three sets of categories, or outcome spaces, describing introductory students' understandings of variables. One outcome space describes different ways of understanding primitive variables. Another describes different understandings of object variables. The third outcome space describes the relationship between the primitive and object variables, again from the point of view of the student cohort. The results show that learners create various kinds of mental models of programming concepts, and that the concept of variable, which is fundamental to most types of programming, is understood in various non-viable ways. With the help of the outcome spaces, teaching materials and tools can be developed to explicitly address potential pitfalls and highlight educationally critical aspects of variables to students. A software tool, which would engage students to interact with and manipulate a visualization of a notional machine, suggests itself as an intriguing avenue for future work.
Article
Software systems are increasingly becoming more and more complex. The functions they are supposed to support are increasing both in number and in complexity. Because of this, the size of software systems today is constantly increasing.
Conference Paper
Full-text available
Examining the behavior of a large legacy software system helps understand its functionality. Dynamic analysis techniques are well suited for this purpose. Run-time information is typically represented in the form of execution traces; however, the amount of information contained in a trace, of even a small program, can be very large and usually overwhelming. It becomes important to filter these traces and present only the information that adds value to the comprehension process. Many researchers agree that analyzing recurrent patterns in a trace can be useful to bridge the gap between low-level system components and high-level domain concepts. This paper introduces an efficient algorithm that extracts patterns of procedure calls of large execution traces. We also present a set of matching criteria that can be used in procedural as well as object-oriented software systems to decide when two patterns can be considered equivalent.
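
A deliberately naive sketch of the underlying idea, counting recurring contiguous call subsequences in a trace, follows. The paper's algorithm is far more efficient and uses matching criteria that go beyond exact equality.

```python
from collections import Counter

def repeated_call_patterns(trace, min_len=2, min_count=2):
    """Count every contiguous subsequence of calls and keep those that
    recur. Quadratic in trace length, so only useful as an illustration."""
    counts = Counter()
    for length in range(min_len, len(trace) + 1):
        for i in range(len(trace) - length + 1):
            counts[tuple(trace[i:i + length])] += 1
    return {pattern: c for pattern, c in counts.items() if c >= min_count}

trace = ["open", "read", "parse", "read", "parse", "close"]
print(repeated_call_patterns(trace))  # {('read', 'parse'): 2}
```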
Article
Our research revealed creativity as a pathway to computer science in the biographies of CS freshmen. Furthermore, the application of creativity in CS classes was found to be a powerful instrument to address students' motivation and interest. This poster summarizes the findings of the research projects by giving concrete teaching tips on how creativity can be considered when planning and conducting CS lessons.
Article
Full-text available
We use empirical studies of how students understand concurrent programming and write concurrent programs to determine problem areas in students' understandings and approaches. We then suggest ways to deal with these problems to help students understand what is wrong with their concurrent programs. These include testing and visual debugging tools to help students find and understand their errors as well as feedback from teachers that makes use of these tools and knowledge of the students' understandings to clearly explain to students where they have gone wrong.
Thesis
Full-text available
Controlling software evolution requires a deep understanding of changes and their impact on the various artifacts of the system. We propose a multi-modeling approach to change impact analysis for understanding the effects of planned or actual modifications in distributed systems. This work consists of modeling software artifacts and their various interdependence links in order to build a knowledge-based system that can, among other things, assist developers and those in charge of software evolution in making an a priori assessment of the impact of modifications. The model we develop integrates two major descriptions of software: first, the underlying structural description, which encompasses all granularity levels and the abstraction of software constituents, and second, a qualitative description designed to integrate with the former. Two models, first developed individually for the two respective descriptions, were integrated and mapped to one another with the goal of studying the impact of any modification and its potential propagation through the affected software constituents. For each modification, it then becomes possible to establish a qualitative assessment of its impact. The integrated model is designed to lend itself to reasoning based on expert rules. The proposed model is being experimented with and validated through the development of an implementation platform based on the Eclipse environment.
Article
JACKAL is a cliché-based program understanding tool that relies on the combination of a stylized abstract language (AL) representation, derived solely from simple syntactic analysis, along with a pair of matching algorithms (one linear and one tree), and a library of clichés to perform its analysis. Developing a cliché library involves constructing the library infrastructure, general-purpose tools for cliché development, and a methodology for creating clichés. In this paper, we describe the two major tasks of 1) building the physical structure of the library, along with the capabilities for managing and maintaining that structure, and 2) identifying code segments as potentially useful clichés and transforming those code segments into clichés capable of matching as general a code set as possible.
Conference Paper
Full-text available
Automatic program comprehension (PC) has been extensively studied for decades. It has been studied mainly from two different points of view: understanding the functionality of a program and understanding program structure. In this paper, we address the problem of automatic algorithm recognition and introduce a method based on static analysis to recognize algorithms. We discuss the applications of the method in the context of automatic assessment to widen the scope of programming assignments that can be checked automatically.
Conference Paper
Full-text available
TRAKLA2 is an online practicing environment for data structures and algorithms. The system includes visual algorithm simulation exercises, which promote the understanding of the logic and behaviour of several basic data structures and algorithms. One of the key features of the system is that the exercises are graded automatically and feedback for the learner is provided immediately. The system has been used by many institutes worldwide. In addition, several studies conducted with the system have revealed that the learning results are similar to those obtained in closed labs if the tasks are the same. Thus, automatic assessment of visual algorithm simulation exercises provides a meaningful way to reduce the workload of grading the exercises while still maintaining good learning results.
Conference Paper
Full-text available
Explanograms provide "a sketch or diagram that students can play" [10]. They are a directly recorded multi-media resource that can be viewed dynamically. Often they are used in teaching situations to provide animated explanations of concepts or processes. Explanograms were initially based upon proprietary paper and digital pen technology. The project outlined here augments that design by using a tablet PC as a mobile, general purpose capture platform which will interoperate with the existing server based system developed in Sweden. The design of this platform is intended to achieve both learning and research outcomes, in a research linked learning model for global software development. The project has completed an initial development phase during which a prototype has been built, and a consolidation, extension and evaluation phase is now underway. The origins and goals of the research, the methodology adopted, the design of the application and the challenges that the New Zealand based team have faced are presented.
Article
We present a new program slicing process for identifying and extracting code fragments implementing functional abstractions. The process is driven by the specification of the function to be isolated, given in terms of a precondition and a postcondition. Symbolic execution techniques are used to abstract the preconditions for the execution of program statements and predicates. The recovered conditions are then compared with the precondition and the postcondition of the functional abstraction. The statements whose preconditions are equivalent to the pre and postconditions of the specification are candidate to be the entry and exit points of the slice implementing the abstraction. Once the slicing criterion has been identified the slice is isolated using algorithms based on dependence graphs. The process has been specialized for programs written in the C language. Both symbolic execution and program slicing are performed by exploiting the Combined C Graph (CCG), a fine-grained dependence based program representation that can be used for most software maintenance tasks.The work described in this paper is part of RE2, a research project aiming to explore reverse engineering and re-engineering techniques for reusing software components from existing systems.
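
Once the slicing criterion is identified, isolating the slice over a dependence graph amounts to a backward reachability walk. The fragment below sketches only that final step (the symbolic-execution part that finds the entry and exit points is not shown, and the structures are invented for illustration):

```python
# Simplified sketch of slice isolation on a dependence graph: walk
# data/control dependences backwards from the exit statement and keep
# every statement reached. Not the CCG-based algorithm of the paper.
def backward_slice(dependences, exit_stmt):
    """dependences maps a statement to the statements it depends on."""
    slice_, worklist = set(), [exit_stmt]
    while worklist:
        stmt = worklist.pop()
        if stmt not in slice_:
            slice_.add(stmt)
            worklist.extend(dependences.get(stmt, []))
    return slice_

deps = {"s4: return mean": ["s3: mean = total / n"],
        "s3: mean = total / n": ["s1: total = sum(xs)", "s2: n = len(xs)"]}
print(sorted(backward_slice(deps, "s4: return mean")))
```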
Article
The large size and high percentage of domain-specific code in most legacy systems makes it unlikely that automated tools will be able to extract a complete underlying design. Yet automated tools can clearly recognize portions of the design. This suggests exploring environments in which programmer and system work together to understand legacy software. DECODE is such an environment. It supports programmer and system co-operation to extract design information from legacy software systems. DECODE's automated program understanding component recognizes standard implementations of domain-independent plans to produce an initial knowledge base of object-oriented design elements. DECODE's structured notebook component provides the user with a graphical view of the initial understanding, which the user can extend by linking arbitrary source code fragments to either existing or new design elements, and then uses this design information to support conceptual queries about the program's code and design.
Conference Paper
Traditionally, software plans are represented explicitly by semantic schemas. However, the semantic content, constraints, and relations of plans are hard to represent explicitly, and building such a plan library is heavy, error-prone work. Recognition algorithms for such plans demand exact matching, in which the semantic denotation is self-evident. We therefore present a novel approach that applies neural networks to the representation and recognition of plans, using asymmetric Hebbian plasticity and non-linear auto-regressive networks with exogenous inputs (NARX) to learn and recognize plans. Plan semantics are represented implicitly and in an error-tolerant way. The recognition procedure is also error-tolerant because, like a human, it tends to match fuzzily. Models and their limitations are illustrated and analyzed in this article.
Article
Modernizing heavily evolved and badly documented information systems is a central software engineering problem in our current IT industry. Often, existing legacy information systems cannot simply be replaced by new systems, because they maintain legacy data of critical importance to the company's mission. Thus, reverse engineering the design documentation of such legacy systems is often a necessity. Several interactive CASE tools have been developed to support this cognitive and human-intensive process. However, practical experience indicates that their applicability is limited because they do not consider imperfect knowledge about legacy systems. Our research tries to overcome this problem by integrating reverse engineering tools with formal theories for approximate reasoning. This article makes three main contributions: it elaborates on the necessary properties that such a theory should have for successful applications to software reverse engineering, it introduces Generic Fuzzy Reasoning Nets as an example of such a theory, and it presents an example application of this theory that clearly demonstrates the benefits of the approach.
Article
Current approaches to parallelizing compilation perform a purely structural analysis of the sequential code. Conversely, a semantic analysis performing concept assignment for code sections can support the recognition of the algorithms that the code implements. This can considerably help the parallelization process, by allowing the introduction of heuristics and an extensive pruning of the search space, and thus enabling the application of more aggressive code transformations. It can play an important role in overcoming the current limitations of Automatic Parallelization. In this paper we discuss the applicability of concept comprehension to the parallelization process, and we present a novel technique for automatic algorithmic recognition that we have designed and implemented. We are currently developing a reverse engineering tool supporting the translation of sequential Fortran code into HPF, based on the recognition technique we have developed. Its working criteria are illustrated and discussed.
Article
Many organizations today are facing the problem of software migration: porting existing code to new architectures and operating systems. In many cases, such legacy code is written in a mainframe-specific assembly language and needs to be translated to a high-level language in order to be run on different architectures. Our research addresses this problem in a large-scale, real-life case study. We built an automatic tool, called Bogart, that translates IBM 370 assembly language programs to C. Bogart is based on Artificial Intelligence tools and techniques such as the Plan Calculus, translation by abstraction and re-implementation, program transformations, constraint propagation, and pattern recognition.
Article
An abstract is not available.
Article
A sufficiency theory is presented of the process by which a computer programmer attempts to comprehend a program. The theory is intended to explain four sources of variation in behavior on this task: the kind of computation the program performs, the intrinsic properties of the program text, such as language and documentation, the reason for which the documentation is needed, and differences among the individuals performing the task. The starting point for the theory is an analysis of the structure of the knowledge required when a program is comprehended, which views the knowledge as being organized into distinct domains which bridge between the original problem and the final program. The program comprehension process is one of reconstructing knowledge about these domains and the relationships among them. This reconstruction process is theorized to be a top-down, hypothesis-driven one in which an initially vague and general hypothesis is refined and elaborated based on information extracted from the program text and other documentation.
Article
The recognition of familiar computational structures in a program can help an experienced programmer to understand a program. Automating this recognition process will facilitate many tasks that require program understanding, e.g., maintenance, translation, and debugging. This paper describes a prototype recognition system which demonstrates the feasibility of automating program recognition. The prototype system automatically identifies occurrences of stereotyped algorithmic fragments and data structures, called clichés, in programs. It does so even though the clichés may be expressed in a wide range of syntactic forms and may be in the midst of unfamiliar code. Based on the known behaviors of these clichés and the relationships between them, the system generates a hierarchical description of a plausible design of the program. It does this systematically and exhaustively, using a parsing technique. This work is built on two previous advances: a graphical, programming-language-independent representation for programs, called the Plan Calculus, and an efficient graph parsing algorithm.
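
A toy way to see the hierarchical, parsing-like flavor of this recognition is shown below: rules abstract recognized fragments, and recognized clichés feed into larger ones. Real clichés are subgraphs in the Plan Calculus and recognition is graph parsing, not the string rewriting used in this simplified sketch.

```python
# Hypothetical fragment names and rules, invented for illustration.
# Each rule abstracts a sequence of recognized fragments into a
# higher-level cliché, which can in turn appear in a later rule.
RULES = [
    (("init-accumulator", "loop-add"), "summation"),
    (("summation", "divide-by-count"), "average"),
]

def recognize(fragments):
    changed = True
    while changed:
        changed = False
        for lhs, abstraction in RULES:
            n = len(lhs)
            for i in range(len(fragments) - n + 1):
                if tuple(fragments[i:i + n]) == lhs:
                    fragments[i:i + n] = [abstraction]
                    changed = True
    return fragments

print(recognize(["init-accumulator", "loop-add", "divide-by-count"]))
# ['average']: a hierarchical description of the program's design
```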
Article
Understanding how a program is constructed and how it functions are significant components of the task of maintaining or enhancing a computer program. We have analyzed videotaped protocols of experienced programmers as they enhanced a personnel database program. Our analysis suggests that there are two strategies for program understanding, the systematic strategy and the as-needed strategy. The programmer using the systematic strategy traces data flow through the program in order to understand global program behavior. The programmer using the as-needed strategy focuses on local program behavior in order to localize study of the program. Our empirical data show that there is a strong relationship between using a systematic approach to acquire knowledge about the program and modifying the program successfully. Programmers who used the systematic approach to study the program constructed successful modifications; programmers who used the as-needed approach failed to construct successful modifications. Programmers who used the systematic strategy gathered knowledge about the causal interactions of the program's functional components. Programmers who used the as-needed strategy did not gather such causal knowledge and therefore failed to detect interactions among components of the program.
Article
Various models of program understanding have been developed from Schema Theory. To date, the authors have sought to identify the knowledge that programmers have and use in understanding programs, i.e. Programming Plans and Rules of Discourse. However, knowledge is only one aspect of program understanding. The other aspect is the cognitive mechanisms that use knowledge. The contribution of this study is the identification of different mechanisms involved in program understanding by experts, specifically the mechanisms which cope with novelty. An experiment was conducted to identify and describe the experts' strategies involved in understanding usual (plan-like) and unusual (unplan-like) programs. While performing a fill-in-the-blank task, subjects were asked to talk aloud. The analysis of verbal protocols allowed the identification of four different strategies of understanding. Under "normal" conditions, the strategy of symbolic simulation is involved. But when failures occur, additional strategies are required. The authors identified three types of understanding failures the subject may experience (no expectation, expectation clashes, insufficient expectations) and the additional strategies invoked in those cases: (1) reasoning according to rules of discourse and principles of the task domain; (2) reasoning with plan constraints; (3) concrete simulation. The authors develop an operational description of these strategies and discuss the control structure of program understanding in the framework of schema theory.
Article
Program debugging is an important part of the domain expertise required for intelligent tutoring systems that teach programming languages. This article explores the process by which student programs can be automatically debugged in order to increase the instructional capabilities of these systems. The research presented provides a methodology and implementation for the diagnosis and correction of nontrivial recursive programs. In this approach, recursive programs are debugged by repairing induction proofs in the Boyer-Moore logic. The induction proofs constructed and debugged assert the computational equivalence of student programs to correct exemplar solutions. Exemplar solutions not only specify correct implementations but also provide correct code to replace buggy student code. Bugs in student code are repaired with heuristics that attempt to minimize the scope of repair. The automated debugging of student code is greatly complicated by the tremendous variability that arises in student solutions to nontrivial tasks. This variability can be coped with, and debugging performance improved, by explicit reasoning about computational semantics during the debugging process. This article supports these claims by discussing the design, implementation, and evaluation of Talus, an automatic debugger for LISP programs, and by examining related work in automated program debugging. Talus relies on its abilities to reason about computational semantics to perform algorithm recognition, infer code teleology, and automatically detect and correct nonsyntactic errors in student programs written in a restricted, but nontrivial, subset of LISP. Solutions can vary significantly in algorithm, functional decomposition, role of variables, data flow, control flow, values returned by functions, LISP primitives used, and identifiers used. Solutions can consist of multiple functions, each containing multiple bugs. Empirical evaluation demonstrates that Talus achieves high performance in debugging widely varying student solutions to challenging tasks.
Article
Printout. Thesis (Ph. D.)--University of Illinois at Urbana-Champaign, 1989. Vita. Includes bibliographical references (leaves 173-178). Available on microfilm from University Microfilms.
Article
Recognizing standard computational structures (cliches) in a program can help an experienced programmer understand the program. We develop a graph parsing approach to automating program recognition in which programs and cliches are represented in an attributed graph grammar formalism and recognition is achieved by graph parsing. In studying this approach, we evaluate our representation's ability to suppress many common forms of variation which hinder recognition. We investigate the expressiveness of our graph grammar formalism for capturing programming cliches. We empirically and analytically study the computational cost of our recognition approach with respect to two medium-sized, real-world simulator programs.
Conference Paper
A hybrid approach to program understanding is presented. It uses an indexed, hierarchical organization of the plan library to limit the number of candidate plans considered during program understanding. This approach is based on observations made from studying the attempts of student programmers to perform bottom-up understanding on geometrically oriented C functions and relies on a highly organized plan library, where each plan has indexing, specialization, and implication links to other plans. It uses an algorithm that takes advantage of these indices to suggest general candidate plans to match top-down against the code, specializations to refine these general plans once they are recognized, and implications to recognize other, related plans without performing further matching.
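
The loop this abstract describes can be sketched as follows, with invented index, specialization, and implication tables and a stubbed-out matcher standing in for top-down plan matching; this is not the paper's code.

```python
# Indices suggest general plans, specialization links refine confirmed
# plans, and implication links add related plans without re-matching.
INDEX = {"sqrt-call": ["distance"], "swap-in-loop": ["sort"]}
SPECIALIZATIONS = {"sort": ["bubble-sort"]}
IMPLICATIONS = {"bubble-sort": ["comparison-loop"]}

def recognize(code_features, matches):
    """`matches(plan, features)` stands in for top-down plan matching."""
    recognized = set()
    agenda = [p for f in code_features for p in INDEX.get(f, [])]
    while agenda:
        plan = agenda.pop()
        if plan in recognized or not matches(plan, code_features):
            continue
        recognized.add(plan)
        # Try more specific variants of a plan we just confirmed...
        agenda.extend(SPECIALIZATIONS.get(plan, []))
        # ...and accept implied plans without re-matching them.
        recognized.update(IMPLICATIONS.get(plan, []))
    return recognized

# With a matcher that accepts everything, one index hit cascades:
print(recognize({"swap-in-loop"}, lambda plan, feats: True))
# {'sort', 'bubble-sort', 'comparison-loop'}
```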
Conference Paper
The author presents a practical method for automatic control concept recognition in large, unstructured imperative programs. Control concepts are abstract notions about interactions between control flow, data flow, and computation, e.g., read-process loops. They are recognized by comparing a language-independent abstract program representation against standard implementation plans. Recognition is efficient and scalable because the program representation is hierarchically decomposed by propers (single entry/exit control flow subgraphs). A recognition experiment using the UNPROG program understander shows the method's performance, the role of proper decomposition, and the ability to use standard implementations in a sample of programs. The paper also describes how recognized control concepts are used to perform Cobol restructuring with a quality not possible with existing syntactic methods.
Article
The research described is based on a model of the cognition involved in program understanding. In the model, program understanding is viewed as a process of recognizing plans in the code. The central claim is that the likelihood of a reader's correctly recognizing a plan in a program decreases when the lines of code are spread out or delocalized in the text of the program instead of being closely grouped. If the lines of code implementing a plan are close together, readers tend to have no trouble recognizing the plan. Examples are presented from protocol studies of expert programmers, illustrating certain common kinds of comprehension errors that can occur in the reading of code during maintenance.
Article
The automated recognition of abstract high-level conceptual information or concepts, which can greatly aid the understanding of programs and therefore support many software maintenance and reengineering activities, is considered. An approach to automated concept recognition and its application to maintenance-related program transformations are described. A unique characteristic of this approach is that transformations of code can be expressed as transformations of abstract concepts. This significantly elevates the level of transformation specifications.
References

S. Fickas and R. Brooks. "Recognition in a Program Understanding System", in Proceedings of the 6th IJCAI, Tokyo, Japan, 1979.

R. Brooks. "Toward a Theory of the Comprehension of Computer Programs", International Journal of Man-Machine Studies, 18, 1983.

W.L. Johnson. Intention-Based Diagnosis of Novice Programming Errors. Morgan Kaufmann, Los Altos, Calif., 1986.

D. Littman, J. Pinto, S. Letovsky, and E. Soloway. "Mental Models and Software Maintenance", in Empirical Studies of Programmers, E. Soloway and S. Iyengar (editors), Ablex, Norwood, NJ, 1986.

E. Soloway and K. Ehrlich. "Empirical Studies of Programming Knowledge", IEEE Transactions on Software Engineering, 10(5), 1984.

J.Q. Ning. A Knowledge-Based Approach to Automatic Program Analysis. Ph.D. thesis, University of Illinois at Urbana-Champaign, 1989.

S. Letovsky. Plan Analysis of Programs. Ph.D. thesis, Yale University, New Haven, Conn., 1988.

C. Rich and R. Waters. The Programmer's Apprentice. Addison-Wesley, Reading, Mass., 1990.

F. Détienne and E. Soloway. "An Empirically-Derived Control Structure for the Process of Program Understanding", International Journal of Man-Machine Studies, 33(3), 1990.

L.M. Wills. Automated Program Recognition by Graph Parsing. Ph.D. thesis, Massachusetts Institute of Technology, 1992.

J. Hartman. "Understanding Natural Programs Using Proper Decomposition", in Proceedings of the 13th International Conference on Software Engineering, 1991.