Article

The darker side of metrics

... Our point of departure from the traditional software measurement literature is in the extent to which we ask (a) whether commonly used measures are valid, (b) how we can tell, and (c) what kinds of side effects we are likely to encounter if we use a measure. Even though many authors mention these issues, we do not think they have been explored in sufficient depth, or that enough people are thoughtfully facing the risks associated with using a poor software measure (Austin, 1996; Hoffman, 2000). After laying the groundwork, the chapter focuses on a typical test management problem: How can we measure the extent of the testing done on a product? ...
... Significant side effects (impacts on the test planning, bug hunting, bug reporting, and bug triage processes) seem to be common, even though they are not often discussed in print. For a recent discussion that focuses on bug counts, see Hoffman (2000). For an excellent but more general discussion, see Austin (1996). ...
Article
Full-text available
this paper to describe a nominal scale. Suppose that you had hundreds of different pieces of cloth and you sorted them by color. Here is the brown pile. There is the pink pile. This is the green pile. By sorting, you have nominally (or categorically) scaled the cloths. For those who insist that measurement is the assignment of a number to an attribute according to a clear-cut rule, make brown color 1, pink color 2, and green color 3.
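The cloth-sorting example above can be sketched in a few lines of code (the color-to-number assignment follows the excerpt; the list of cloths is illustrative). The point of a nominal scale is that the numbers are mere labels: equality comparisons are meaningful, but order and arithmetic are not.

```python
# Nominal (categorical) scaling: numbers serve only as labels.
# The rule "brown -> 1, pink -> 2, green -> 3" follows the excerpt above.
nominal_codes = {"brown": 1, "pink": 2, "green": 3}

cloths = ["pink", "brown", "green", "brown", "pink"]
coded = [nominal_codes[c] for c in cloths]

# Only equality is meaningful on a nominal scale:
assert coded[1] == coded[3]  # both cloths are brown

# Order and arithmetic on the codes carry no information:
# coded[2] > coded[0] does NOT mean green is "more" than pink,
# and averaging the codes would be meaningless.
```

This is why averaging or ranking nominally scaled measurements, a temptation once categories have been given numbers, produces results with no interpretation.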
... Code metrics have to be chosen on a case-by-case basis, as no single set of metrics can fit all use cases and contexts. As a basis for selecting metrics we propose Kaner's "Ten Measurement Factors" [26] for software metrics, as well as further literature concerning the most relevant metrics for software design [27]-[30]. We employed metrics that measure independent aspects, use different approaches, and incorporate different code parts to compute their results [31]. ...
Conference Paper
Full-text available
In any sufficiently complex software system there are experts, having a deeper understanding of parts of the system than others. However, it is not always clear who these experts are and which particular parts of the system they can provide help with. We propose a framework to elicit the expertise of developers and recommend experts by analyzing complexity measures over time. Furthermore, teams can detect those parts of the software for which currently no, or only a few, experts exist and take preventive actions to keep collective code knowledge and ownership high. We applied the approach at a medium-sized company. The results were evaluated with a survey comparing the perceived and the computed expertise of developers. We show that aggregated code metrics can be used to identify experts for different software components. The identified experts were rated as acceptable candidates by developers in over 90% of all cases.
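The core idea of the abstract above, aggregating per-developer metrics per component to recommend experts and flag under-owned code, can be sketched minimally. This is an assumption-laden illustration, not the paper's actual method: simple change counts stand in for the complexity measures over time, and the change log and thresholds are hypothetical.

```python
from collections import defaultdict

# Hypothetical change log: (developer, component) pairs observed over time.
changes = [
    ("alice", "billing"), ("alice", "billing"), ("bob", "billing"),
    ("bob", "auth"), ("carol", "auth"), ("bob", "auth"),
]

# Aggregate a per-developer score for each component (here: raw change
# counts; the paper aggregates complexity measures over time instead).
scores = defaultdict(lambda: defaultdict(int))
for dev, component in changes:
    scores[component][dev] += 1

def experts(component, top_n=1):
    """Recommend the top-scoring developers for a component."""
    ranked = sorted(scores[component].items(), key=lambda kv: -kv[1])
    return [dev for dev, _ in ranked[:top_n]]

def orphaned_components(min_experts=2):
    """Components with fewer than min_experts contributors are at risk."""
    return [c for c, devs in scores.items() if len(devs) < min_experts]
```

With the toy data, `experts("billing")` recommends the developer with the most billing changes, and raising `min_experts` surfaces components whose knowledge is concentrated in too few people.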
... Some of these actions seriously hamper productivity and can effectively reduce quality." [115] As one supporting example of 'strange' behavior, Hoffman describes a case where testers withheld defect reports to help developers who were under pressure to get the open defect count down. This is in some ways analogous to the "observer effect." ...
... This variability among TCs should be taken into account when defining effort estimation model parameters. Table 1 shows that TCs can vary greatly, for example by complexity, size (whether containing many asserts or one assert), or origin (requirements or other), and hence cannot be unified into a single metric. A further warning in this regard has been given by Hoffman [28], who pointed to the possibility that definitions of TCs, as well as their number and content, might change during the course of the project, jeopardizing the validity of metrics based on these TCs. ...
Conference Paper
Full-text available
Since the 1980s the term "Test Case" (TC) has been recognized as a building block for describing testing items, widely used as a work unit, metric, and documentation entity. In light of the centrality of the TC concept in testing processes, the questions this paper attempts to answer are: What are the uses of TCs in software testing? Is there a general, commonly agreed-upon definition of a TC? If not, what are the implications of this situation? This article reviews and explores the history, use, and definitions of TCs, showing that while extensively used in research and practice, there is no single, formally agreed-upon definition of a TC. We point out undesirable implications of this situation, suggest four criteria for a 'good' TC definition, and discuss the benefits accrued from such a definition. We conclude by urging the academic and professional community to formalize a TC definition for the benefit of the industry and its customers, and strongly believe that this review paves the way to articulating a formal TC definition. Such a definition, when widely accepted, will clarify some of the ambiguity currently associated with TC interpretation, and hence with software testing assessment that relies on TCs as metrics. Furthermore, a formal definition can advance the automation of TC generation and management.
... • Software that was not designed for testability will be more difficult, and thus more time-consuming, to test. • Some corporate metrics projects waste time on the data collection, the data fudging (see Kaner, 2001; Hoffman, 2000), and the gossiping about the dummies in head office who rely on these stupid metrics. We are not suggesting that metrics efforts are necessarily worthless. ...
Article
Full-text available
One of the common test management questions is: what is the right ratio of testers to other developers? Perhaps a credible benchmark number can provide convenience and bargaining power to a test manager working with an executive who has uninformed ideas about testing, or whose objective is to spend the minimum necessary to conform to an industry standard. We focused on staffing ratios and related issues for two days at the Fall 2000 meeting of the Software Test Managers Roundtable (STMR 3).
Conference Paper
Full-text available
Producing a new framework for accessing patient health records through multi-channel devices is important in the healthcare environment. This is especially so because users of future health care systems will be increasingly characterized by diversity. Relying only on highly experienced and technology-prone user groups, who might have been the typical users of past decades, is no longer sufficient. A framework design for pervasive healthcare is essential to provide different types of medical services and to support users individually (according to user profiles), adaptively (according to the course of disease), and sensitively (according to living conditions) in order to successfully design new medical technologies.
Article
The area of applications development for government purposes can be characterized as task-specific. In this context, development projects are usually more complex, and there are some differences in comparison with commercial projects. The mission of this chapter is to explain methods of project complexity evaluation based on analogy, crisp and fuzzy expert estimation, and measure models. Selected methods for aggregating experts' estimations are also presented. The chapter then introduces selected methods designed for complexity estimation. All of the introduced methods are widely known except one, which was designed by the lead author of the chapter. That method, called BORM Points, was developed for IS projects designed with the BORM method (Business Object Relation Modeling). Each method is introduced first, then its step-by-step computation procedure is described, and finally software that supports the method's computation procedure is suggested. The results of the methods are non-dimensional numbers, so it is necessary to establish the relationship between complexity and effort; the chapter therefore introduces the COCOMO model and its variants. Attention is also given to implementing this form of estimation approach in the area of ICT governance, especially grass-roots e-governance.
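The complexity-to-effort step the abstract mentions can be illustrated with basic COCOMO, which estimates effort in person-months from size in KLOC as effort = a * KLOC^b, with a and b fixed per project class. The coefficients below are the standard basic-COCOMO values; the 10 KLOC input is just an example.

```python
# Basic COCOMO: effort (person-months) = a * KLOC**b, with (a, b)
# depending on the project class (Boehm's standard coefficients).
COCOMO_BASIC = {
    "organic":      (2.4, 1.05),  # small teams, familiar problem
    "semidetached": (3.0, 1.12),  # mixed experience, medium size
    "embedded":     (3.6, 1.20),  # tight hardware/operational constraints
}

def effort_person_months(kloc: float, mode: str = "organic") -> float:
    """Estimate development effort from size in thousands of lines of code."""
    a, b = COCOMO_BASIC[mode]
    return a * kloc ** b
```

For example, a 10 KLOC organic project comes out at roughly 27 person-months, while the same size in embedded mode roughly doubles the estimate, which is why choosing the project class matters as much as measuring size.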
Article
Mechanisms for controlling and regulating business flexibility are no longer limited to adjusting the number of pieces or changing item output in production processes. Information infrastructures, systems, and web technologies have become an important factor for flexible business alignment: a company's scope of action depends on communication and collaboration with its partners. This paper examines the flexibility impacts of new forms of information and communication technology on daily business use, e.g. flexibility through social virtual platforms, blogs, wikis, and other emerging web-based techniques. A model of flexibility for web-based communication and collaboration in a company context is introduced to address the technical, organisational, operational, legal, and social impacts of flexibility. For each aspect, specific key indicators are analysed and their practicability verified by expert interviews. As an outcome, an index for determining the level of web-induced flexibility for enterprises is presented.
Article
Full-text available
CONTEXT – Global Software Development (GSD) is a modern software engineering paradigm adopted by many client organisations in developed countries to get high-quality products at low cost from low-wage countries. Production of high-quality software is considered one of the key factors in the rapid growth of GSD. However, GSD projects pose new challenges to practitioners and researchers. To address these challenges, Software Quality Metrics (SQMs) are frequently used in organisations to produce high-quality products. OBJECTIVE – The objective of this SLR protocol is to identify and assess the strengths and weaknesses of existing SQMs used in GSD, to assist vendor organisations in choosing appropriate SQMs for measuring software quality. METHOD – A Systematic Literature Review (SLR) will be used to identify existing SQMs in GSD. An SLR is based on a structured protocol and is therefore different from an ordinary review. EXPECTED OUTCOME – We have developed the SLR protocol and are currently in the process of implementing it. The expected outcomes of this review are the identification of different SQMs for GSD, along with a SWOT analysis, to assist vendor organisations in choosing appropriate SQMs at the right time to produce a high-quality product.
This information was first generated for presentation and discussion at the Eighth Los Altos Workshop on Software Testing in December 1999. I thank the LAWST attendees, Chris Agruss, James Bach, Jaya Carl, Rocky Grober, Payson Hall, Elisabeth Hendrickson, Bob Johnson, Mark Johnson, Cem Kaner, Brian Lawrence, Brian Marick, Hung Quoc Nguyen, Bret Pettichord, Melora Svoboda, and Scott Vernon, for their participation and ideas.