Conference Paper

Where is the business logic?


Abstract

One of the challenges in maintaining legacy systems is locating the business logic in the code and isolating it for different purposes, including implementing requested changes, refactoring, eliminating duplication, unit testing, and extracting business logic into a rule engine. Our new idea is an iterative method that identifies the business logic in the code and visualizes this information, both to gain a better understanding of how the logic is distributed across the code and to develop a domain-specific business vocabulary. This new method combines and extends several existing technologies, including search, aggregation, and visualization. We evaluated the visualization method on a large-scale application and found that it yields useful results, provided an appropriate vocabulary is available.
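The paper itself is not available here, but the core idea of scoring code against a business vocabulary can be sketched. The following Python fragment is an illustration only, not the authors' tool; the function names, the term-splitting rules, and the sample vocabulary are all assumptions:

```python
import re
from collections import Counter

def split_identifiers(text):
    """Split camelCase and snake_case identifiers into lowercase terms."""
    words = re.findall(r"[A-Za-z]+", text)
    terms = []
    for w in words:
        # break camelCase runs, then lowercase everything
        parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", w).split()
        terms.extend(p.lower() for p in parts)
    return terms

def business_logic_score(source, vocabulary):
    """Fraction of terms in the source that belong to the business vocabulary."""
    terms = split_identifiers(source)
    if not terms:
        return 0.0
    hits = sum(1 for t in terms if t in vocabulary)
    return hits / len(terms)

# hypothetical domain vocabulary and source fragment
vocab = {"interest", "rate", "premium", "policy"}
code = "double computeInterestRate(Policy policy) { return base_rate * policy.factor; }"
print(round(business_logic_score(code, vocab), 2))
```

Scores like this, computed per file or per method and then aggregated, are the kind of data a logic-distribution visualization could be built on.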


... Dubinsky et al. [13] proposed a method to identify business rules in the code using information retrieval techniques. They found that the quality of their technique depended on terms used in identifiers and comments. ...
Conference Paper
Full-text available
For the maintenance of a business system, developers must understand the business rules implemented in the system. One type is the computational business rule, which represents how an output value of a feature is computed from the valid inputs. Unfortunately, understanding business rules is a tedious and error-prone activity. We propose a program-dependence analysis technique tailored to understanding computational business rules. Given a variable representing an output, the proposed technique extracts the conditional statements that may affect the computation of the output. To evaluate the usefulness of the technique, we conducted an experiment with eight developers in one company. The results confirm that the proposed technique enables developers to accurately identify conditional statements corresponding to computational business rules. Furthermore, we compare the number of conditional statements extracted by the proposed technique and by program slicing, and conclude that the proposed technique is, in general, more effective than program slicing. Keywords: static analysis, control-flow analysis, data-dependence analysis, reverse engineering, Java
... Dubinsky et al. [13] proposed a method to identify business rules in the code using information retrieval techniques. They found that the quality of their technique depended on terms used in identifiers and comments. ...
Conference Paper
In the maintenance of a business system, developers must understand the computational business rules implemented in the system. Computational business rules define how an output value of a feature is computed from inputs; in the source code, these rules are represented by conditional statements. Unfortunately, understanding business rules is a tedious and error-prone activity. Since a feature computes various outputs, developers must analyze the implementation of the feature and extract the conditional statements relevant to a particular output. In this paper, we propose a program dependence analysis technique tailored for understanding business rules. Given a variable representing an output, our approach extracts the conditional statements that may affect the computation of the output. To evaluate the usefulness of the approach, we conducted an experiment with eight developers in a company. The results showed that our approach enables developers to accurately identify conditional statements relevant to business rules.
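As a rough illustration of the kind of analysis described above (this is not the paper's technique, which performs full program-dependence analysis; here only syntactic nesting of `if` statements is considered, and the sample code and variable names are invented), the sketch below collects the conditions that enclose assignments to a chosen output variable:

```python
import ast

def guarding_conditions(source, target):
    """Return the source text of `if` conditions that enclose an
    assignment to `target` -- a crude control-dependence approximation."""
    tree = ast.parse(source)
    found = []

    def visit(node, guards):
        if isinstance(node, ast.If):
            guards = guards + [node.test]
        if isinstance(node, ast.Assign):
            for t in node.targets:
                if isinstance(t, ast.Name) and t.id == target:
                    found.extend(g for g in guards if g not in found)
        for child in ast.iter_child_nodes(node):
            visit(child, guards)

    visit(tree, [])
    return [ast.unparse(g) for g in found]

# hypothetical business-rule code: which conditions shape `discount`?
src = """
if amount > 1000:
    if customer == 'gold':
        discount = 0.1
else:
    discount = 0.0
"""
print(guarding_conditions(src, "discount"))
```

A real dependence analysis would also follow data flow, so that conditions guarding the inputs of `discount` are reported too.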
Conference Paper
Software Analytics (SA) is a new branch of big data analytics that emerged recently (2011). What distinguishes SA from direct software analysis is that it links data mined from many different software artifacts to obtain valuable insights. These insights are useful for the decision-making process throughout the different phases of the software lifecycle. Since SA is currently a hot and promising topic, we have conducted a systematic literature review, presented in this paper, to identify gaps in knowledge and open research areas in SA. Because many researchers are still confused about the true potential of SA, we filtered the available research papers to obtain the work most relevant to SA for our review; this filtering yielded 19 studies out of 135. We have based our systematic review on four main factors: which software practitioners SA targets, which domains are covered by SA, which artifacts are extracted by SA, and whether these artifacts are linked or not. The results of our review show that much of the available SA research serves only the needs of developers. Also, much of the available research uses only one artifact, which, in turn, means fewer links between artifacts and fewer insights. This shows that the available SA research is still embryonic, leaving plenty of room for future research in the SA field.
Conference Paper
Full-text available
Software evolution often requires the untangling of code. Particularly challenging and error-prone is the task of separating computations that are intertwined in a loop. The lack of automatic tools for such transformations complicates maintenance and hinders reuse. We present a theory and implementation of fine slicing, a method for computing executable program slices that can be finely tuned, and can be used to extract non-contiguous pieces of code and untangle loops. Unlike previous solutions, it supports temporal abstraction of series of values computed in a loop in the form of newly-created sequences. Fine slicing has proved useful in capturing meaningful subprograms and has enabled the creation of an advanced computation-extraction algorithm and its implementation in a prototype refactoring tool for Cobol and Java.
Conference Paper
Full-text available
The representation of traceability links in requirements knowledge is vital to improve the general understanding of requirements as well as the relevance and consequences of relations between requirements artifacts and other artifacts in software engineering. Various visualization techniques have been developed to support the representation of traceability information, e.g. traceability matrices, graphs and tree structures. However, these techniques do not scale well to large numbers of artifacts and often do not provide additional functionality to present supplementary data. In this paper, we use Sunburst and Netmap visualizations as alternative visualization techniques. These techniques perform well even on large numbers of artifacts and traceability links. Moreover, they provide the ability to present derivative data. An implementation of the visualizations was developed in conjunction with a requirements plugin for the Redmine project management platform. In this paper, the applicability of Sunburst and Netmap visualizations for requirements engineering knowledge is illustrated by applying them to an example project, and the results are compared to traditional visualization techniques.
Article
Full-text available
The main drawback of existing software artifact management systems is the lack of automatic or semi-automatic traceability link generation and maintenance. We have improved an artifact management system with a traceability recovery tool based on Latent Semantic Indexing (LSI), an information retrieval technique. We have assessed LSI to identify strengths and limitations of using information retrieval techniques for traceability recovery and identified the need for an incremental approach. The method and the tool have been evaluated during the development of seventeen software projects involving about 150 students. We observed that although tools based on information retrieval provide useful support for the identification of traceability links during software development, they are still far from supporting a complete semi-automatic recovery of all links. The results of our experience have also shown that such tools can help to identify quality problems in the textual description of traced artifacts.
Article
Full-text available
Software system documentation is almost always expressed informally in natural language and free text. Examples include requirement specifications, design documents, manual pages, system development journals, error logs, and related maintenance reports. We propose a method based on information retrieval to recover traceability links between source code and free text documents. A premise of our work is that programmers use meaningful names for program items, such as functions, variables, types, classes, and methods. We believe that the application-domain knowledge that programmers process when writing the code is often captured by the mnemonics for identifiers; therefore, the analysis of these mnemonics can help to associate high-level concepts with program concepts and vice-versa. We apply both a probabilistic and a vector space information retrieval model in two case studies to trace C++ source code onto manual pages and Java code to functional requirements. We compare the results of applying the two models, discuss the benefits and limitations, and describe directions for improvements.
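A minimal version of the vector space side of such code-to-document tracing can be sketched as follows. This is an illustration, not the paper's implementation: the sample identifiers and document contents are invented, and real systems add tf-idf weighting, stemming, and stop-word removal:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_documents(code_terms, docs):
    """Rank free-text documents by similarity to a code artifact's terms."""
    q = Counter(code_terms)
    scored = [(name, cosine(q, Counter(words))) for name, words in docs.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# terms mined from identifiers of one (hypothetical) source file
code_terms = ["account", "balance", "transfer", "limit"]
docs = {
    "req-transfer": ["transfer", "funds", "account", "balance", "limit"],
    "req-login": ["user", "password", "session"],
}
ranking = rank_documents(code_terms, docs)
print(ranking[0][0])  # best candidate traceability link
```

The ranking, thresholded on similarity, yields candidate traceability links for a human analyst to confirm.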
Article
Full-text available
Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized model is able to deal with domain-specific synonymy as well as with polysemous words. In contrast to standard Latent Semantic Indexing (LSI) by Singular Value Decomposition, the probabilistic variant has a solid statistical foundation and defines a proper generative data model. Retrieval experiments on a number of test collections indicate substantial performance gains over direct term matching methods as well as over LSI. In particular, the combination of models with different dimensionalities has proven to be advantageous.
Article
This position paper discusses the situations when visualizing traceability links is opportune, as well as what information pertaining to these links should be visualized and how. It also presents a prototype tool, which is used to visualize traceability links to provide support for the user during recovery, maintenance, and browsing of such links.
Article
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
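The SVD construction can be illustrated with numpy. The matrix below is a toy example with two obvious topics (the ca. 100 factors in the abstract apply to realistic corpora), and the term and document choices are invented:

```python
import numpy as np

def lsi_fit(td, k):
    """Truncated SVD of a term-by-document matrix `td` (terms x docs)."""
    U, s, Vt = np.linalg.svd(td, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in(query_tf, U_k, s_k):
    """Project a query term vector into the latent space as a pseudo-document."""
    return query_tf @ U_k / s_k

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# rows = terms (interest, rate, loan, pixel, render), cols = docs
td = np.array([
    [2, 1, 0],
    [1, 2, 0],
    [1, 1, 0],
    [0, 0, 2],
    [0, 0, 1],
], float)
U_k, s_k, Vt_k = lsi_fit(td, k=2)
docs = Vt_k.T                                        # docs in latent space
q = fold_in(np.array([1, 1, 0, 0, 0.0]), U_k, s_k)   # query: "interest rate"
sims = [cosine(q, d) for d in docs]
print(int(np.argmax(sims)))  # a financial doc (0 or 1), not doc 2
```

Because the factors capture co-occurrence structure, a query can match a document even when they share few literal terms, which is the motivation for using LSI in traceability recovery.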
Conference Paper
The complexity of software systems is continuously growing across a wide range of application domains. System architects are often faced with large complex systems and systems whose semantics may be difficult to understand, hidden, or even still evolving. Raising the level of abstraction of such systems can significantly improve their usability. We introduce System Grokking - a software architect assistance technology designed to support incremental and iterative user-driven understanding, validation, and evolution of complex software systems through higher levels of abstraction. The System Grokking technology enables semi-automatic discovery, manipulation, and visualization of groups of domain-specific software elements and the relationships between them to represent high-level structural and behavioral abstractions.
Article
Introduction: The traditional approach to representing tree structures is as a rooted, directed graph with the root node at the top of the page and children nodes below the parent node with lines connecting them (Figure 1). Knuth (1968, p. 305-313) has a long discussion about this standard representation, especially why the root is at the top, and he offers several alternatives including brief mention of a space-filling approach. However, the remainder of his presentation and most other...
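The space-filling alternative alluded to above is the treemap. A minimal slice-and-dice layout for one level of a tree can be sketched as follows; this is an illustration, and a full treemap recurses into each rectangle with the split direction flipped:

```python
def treemap(items, x, y, w, h, horizontal=True):
    """Slice-and-dice treemap layer: split the rectangle (x, y, w, h) into
    rectangles whose areas are proportional to the item weights."""
    total = sum(weight for _, weight in items)
    rects = []
    offset = 0.0
    for name, weight in items:
        frac = weight / total
        if horizontal:
            rects.append((name, x + offset, y, w * frac, h))
            offset += w * frac
        else:
            rects.append((name, x, y + offset, w, h * frac))
            offset += h * frac
    return rects

# three sibling nodes with weights 1, 2, 1 in a 100x50 canvas
layout = treemap([("a", 1), ("b", 2), ("c", 1)], 0, 0, 100, 50)
for name, rx, ry, rw, rh in layout:
    print(name, rx, rw)
```

Nesting the recursion (children laid out inside their parent's rectangle with `horizontal` negated) yields the familiar treemap in which the entire display area is filled and every node's area encodes its weight.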
Conference Paper
Traceability in software involves discovering links between different artifacts, and is useful for a myriad of tasks in the software life cycle. We compare several different Information Retrieval techniques for this task, across two datasets involving real-world software with the accompanying specifications and documentation. The techniques compared include dimensionality reduction methods, probabilistic and information theoretic approaches, and the standard vector space model.
Conference Paper
An information retrieval technique, latent semantic indexing, is used to automatically identify traceability links from system documentation to program source code. The results of two experiments to identify links in existing software systems (i.e., the LEDA library, and Albergate) are presented. These results are compared with other similar type experimental results of traceability link identification using different types of information retrieval techniques. The method presented proves to give good results by comparison and additionally it is a low cost, highly flexible method to apply with regards to preprocessing and/or parsing of the source code and documentation.
Article
An information retrieval technique, latent semantic indexing, is used to automatically identify traceability links from system documentation to program source code. The results of two experiments to identify links in existing software systems (i.e., the LEDA library, and Albergate) are presented. These results are compared with other similar type experimental results of traceability link identification using different types of information retrieval techniques. The method presented proves to give good results by comparison and additionally it is a low cost, highly flexible method to apply with regards to preprocessing and/or parsing of the source code and documentation.
IBM Corp. IBM Rapidly Adaptive Visualization Engine (RAVE). http://www-01.ibm.com/software/analytics/many-eyes/index.html, 2013.
X. Chen, J. Hosking, and J. Grundy. A vector space model for automatic indexing. In IEEE Symp. Visual Languages and Human-Centric Computing, 2012.
A. Marcus, X. Xie, and D. Poshyvanyk. When and how to visualize traceability links. In Proc. 3rd Int'l Workshop on Traceability in Emerging Forms of Software Engineering, pages 55-61, 2005.
J. Sayles, C. Rayns, V. Sankar, J. Milne, D. Stein, D. Dash, and O. Pharhi. z/OS Traditional Application Maintenance and Support. IBM Corp., 2011.