Article

Rapid Quality Assurance with Requirements Smells

Abstract

Context: Bad requirements quality can cause expensive consequences during the software development lifecycle, especially if iterations are long and feedback comes late. Objectives: We aim at a light-weight static requirements analysis approach that allows for rapid checks immediately when requirements are written down. Method: We transfer the concept of code smells to Requirements Engineering as Requirements Smells. To evaluate the benefits and limitations, we define Requirements Smells, realize our concept of smell detection in a prototype called Smella, and apply Smella in a series of cases provided by three industrial contexts and one university context. Results: The automatic detection yields an average precision of 59% at an average recall of 82% with high variation. The evaluation in practical environments indicates benefits such as an increase of the awareness of quality defects. Yet, some smells were not clearly distinguishable. Conclusion: Lightweight smell detection can uncover many practically relevant requirements defects in a reasonably precise way. Although some smells need to be defined more clearly, smell detection provides a helpful means to support quality assurance in Requirements Engineering, for instance, as a supplement to reviews.
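The reported precision and recall come from comparing the tool's findings against expert annotations of the same requirements. As a rough illustration only (not the authors' evaluation code; the requirement IDs and smell names are invented), such figures can be computed per finding as follows:

```python
# Minimal sketch: compare tool findings against expert annotations and
# compute precision and recall. A finding is identified here by a
# (requirement_id, smell_name) pair; the granularity in the actual study may differ.
from typing import Set, Tuple

Finding = Tuple[str, str]

def precision_recall(tool: Set[Finding], experts: Set[Finding]) -> Tuple[float, float]:
    true_positives = len(tool & experts)
    precision = true_positives / len(tool) if tool else 0.0
    recall = true_positives / len(experts) if experts else 0.0
    return precision, recall

tool_findings = {("REQ-1", "ambiguous_adverb"), ("REQ-2", "vague_pronoun"), ("REQ-3", "loophole")}
expert_findings = {("REQ-1", "ambiguous_adverb"), ("REQ-2", "vague_pronoun"), ("REQ-4", "superlative")}
p, r = precision_recall(tool_findings, expert_findings)
print(f"precision={p:.0%}, recall={r:.0%}")
```

Precision penalizes false alarms raised by the tool, while recall penalizes expert-identified defects the tool misses, which is exactly the trade-off the abstract reports.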


... Requirement smells are "an indicator of a quality violation, which may lead to a defect, with a concrete location and a concrete detection mechanism" [3]. In software engineering, different kinds of smells have been introduced, such as code smells, requirement smells, architecture smells, and test smells; code smells were introduced first [11]. According to ISO/IEC/IEEE 29148, requirement quality is classified in terms of smells and quality characteristics. ...
... However, it is difficult to argue that only a specific rule- and metrics-based approach should be used, as it cannot replace the experience of requirements engineers and requires deep knowledge [10]. NLP techniques like POS tagging, dictionaries, morphological analysis, etc. have recorded poor performance [8], [11]. Thus, it is important to apply ML classification algorithms that are scalable and that can learn from the information provided by the expert. ...
... To address the gap in lightweight smell detection approaches, [11] proposed an NLP-based tool called Smella. In the study, 50 requirements were labeled by software engineering staff members from Wollo University with six or more years of experience to enhance the reliability of the data. ...
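A minimal sketch of the kind of supervised classifier these excerpts argue for, one that learns smell labels from expert-annotated requirements instead of relying on fixed rules. The toy requirements, labels, and model choice (TF-IDF plus logistic regression from scikit-learn) are illustrative assumptions, not the setup used in the cited study.

```python
# Illustrative only: learn to separate "smelly" from "clean" requirements
# from a handful of expert-labeled examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

requirements = [
    "The system shall respond quickly to user input.",            # vague/subjective wording
    "The system shall log every failed login attempt.",           # reasonably precise
    "If possible, the report should be as complete as possible.", # loophole + superlative
    "The database shall store user records for five years.",      # reasonably precise
]
labels = ["smelly", "clean", "smelly", "clean"]                    # invented expert labels

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(requirements, labels)

print(model.predict(["The UI should be easy to use whenever feasible."]))
```

With so few examples the prediction is of course meaningless; the point is only the shape of the approach: expert labels in, a learned (and retrainable) detector out.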
... Therefore, various authors have focused on automatically detecting quality defects, such as ambiguous language (i.a. [7], [8], [9], [10]) or cloning [11]. However, it is still an open question to what degree quality defects can be detected automatically or require human expertise (i.e. ...
... manual work). In previous work [10], we took a bottom-up perspective by qualitatively analyzing which of the quality review results could be automatically detected. ...
... We want to give only a brief, non-exhaustive summary here. Please refer to our previous work [10] for a more detailed analysis. Defect types: Most works in this area focus on the detection of various forms of ambiguity, e.g. ...
Preprint
Full-text available
[Context] The quality of requirements engineering artifacts, e.g. requirements specifications, is acknowledged to be an important success factor for projects. Therefore, many companies spend significant amounts of money to control the quality of their RE artifacts. To reduce spending and improve the RE artifact quality, methods were proposed that combine manual quality control, i.e. reviews, with automated approaches. [Problem] So far, we have seen various approaches to automatically detect certain aspects in RE artifacts. However, we still lack an overview of what can and cannot be automatically detected. [Approach] Starting from an industry guideline for RE artifacts, we classify 166 existing rules for RE artifacts along various categories to discuss the share and the characteristics of those rules that can be automated. For those rules that cannot be automated, we discuss the main reasons. [Contribution] We estimate that 53% of the 166 rules can be checked automatically either perfectly or with a good heuristic. Most rules need only simple techniques for checking. The main reason why some rules resist automation is their imprecise definition. [Impact] By giving first estimates and analyses of automatically detectable and not automatically detectable rule violations, we aim to provide an overview of the potential of automated methods in requirements quality control.
... An operator provides a definition of how a requirement is analysed w.r.t. the associated quality attribute. Examples of operators are metrics [8,11], requirement smells [9] or rules and constraints on how to formulate requirements. An operator can be implemented by either a person or a computer program (or both). ...
... Each operator is associated with a level of intrinsic cognitive load, describing the complexity of applying the operator to a single requirement or a complete specification. For example, if the operator is the ambiguous adverbs requirements smell [9], then the intrinsic cognitive load is determined by the number of ambiguous terms one has to remember to detect these terms in the requirements text. Since cognitive load is additive [18], there are (individual) limits to the efficiency of applying operators, which is therefore one determinant of the effective cost of RQ assurance. ...
... If an operator is realized through machine-based processing of information, we characterize this realization by its automation complexity. Continuing with the example of ambiguous adverbs, the automation complexity of this operator is low as it can be implemented with a dictionary [9]. On the other hand, some of the requirements writing rules found in STA are rather complex. ...
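To illustrate why the automation complexity of this operator is low, here is a minimal dictionary-based check for ambiguous adverbs; the word list is a small invented subset, not the dictionary used in the cited work.

```python
# Minimal sketch: dictionary lookup for ambiguous adverbs, returning each hit
# together with its character offset so the finding has a concrete location.
import re

AMBIGUOUS_ADVERBS = {"usually", "approximately", "normally", "often", "almost", "significantly"}

def find_ambiguous_adverbs(requirement: str):
    return [(m.group(), m.start())
            for m in re.finditer(r"[A-Za-z]+", requirement)
            if m.group().lower() in AMBIGUOUS_ADVERBS]

print(find_ambiguous_adverbs("The sensor shall usually report approximately every 10 ms."))
# [('usually', 17), ('approximately', 32)]
```

The cognitive-load argument from the excerpt maps directly onto the size of AMBIGUOUS_ADVERBS: a human reviewer has to keep the whole list in mind, whereas the lookup cost for the machine is negligible.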
Preprint
Full-text available
Context and Motivation: Natural language is the most common form to specify requirements in industry. The quality of the specification depends on the capability of the writer to formulate requirements aimed at different stakeholders: they are an expression of the customer's needs that are used by analysts, designers and testers. Given this central role of requirements as a means to communicate intention, assuring their quality is essential to reduce misunderstandings that lead to potential waste. Problem: Quality assurance of requirement specifications is largely a manual effort that requires expertise and domain knowledge. However, this demanding cognitive process is also congested by trivial quality issues that should not occur in the first place. Principal ideas: We propose a taxonomy of requirements quality assurance complexity that characterizes the cognitive load of verifying a quality aspect from the human perspective, and automation complexity and accuracy from the machine perspective. Contribution: Once this taxonomy is realized and validated, it can serve as the basis for a decision framework of automated requirements quality assurance support.
... Various approaches have been proposed to improve the quality of NL requirements by detecting semantic and syntactic problems, often referred to as "smells" [7,8,9,10,11]. For example, Femmer et al. [9] introduced Smella, an automated tool for detecting requirement smells such as ambiguous adverbs and vague pronouns. Smella relies on part-of-speech (POS) tagging, dictionaries, and lemmatization. ...
... However, these approaches do not provide recommendations to analysts on how to rewrite requirements in a disciplined manner to improve their quality. Furthermore, existing work [9,10], which detects a set of quality problems in a requirement, still requires further research to account for many of the recurrent problems faced by analysts. For example, analysts sometimes describe multiple functions in a single requirement (i.e., non-atomic requirement), miss essential words (e.g., actors and verbs) or even phrases (e.g., system responses), or write a requirement following an ambiguous structure (e.g., a system response between conditions). ...
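As a rough sketch of the POS-tagging, dictionary, and lemmatization pipeline the excerpts above describe (this is not the Smella implementation), the following uses spaCy to flag comparatives, superlatives, vague pronouns, and subjective terms, each with a concrete character offset. The word lists are illustrative assumptions.

```python
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

VAGUE_PRONOUNS = {"it", "this", "that", "they"}            # illustrative subset
SUBJECTIVE_LEMMAS = {"good", "fast", "flexible", "easy"}   # illustrative subset

def detect_smells(requirement: str):
    findings = []
    for token in nlp(requirement):
        if token.tag_ in ("JJR", "RBR"):                   # comparative adjective/adverb
            findings.append(("comparative", token.text, token.idx))
        elif token.tag_ in ("JJS", "RBS"):                 # superlative adjective/adverb
            findings.append(("superlative", token.text, token.idx))
        elif token.pos_ == "PRON" and token.lower_ in VAGUE_PRONOUNS:
            findings.append(("vague_pronoun", token.text, token.idx))
        elif token.lemma_.lower() in SUBJECTIVE_LEMMAS:    # dictionary lookup via lemma
            findings.append(("subjective_language", token.text, token.idx))
    return findings

print(detect_smells("It should provide the best performance and a faster response."))
```

Lemmatization matters here because it lets a short dictionary match inflected forms (e.g., "faster" and "fastest" both reduce to "fast"), which is one reason such checks stay lightweight.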
Preprint
Requirement specifications are typically written in natural language (NL) due to its usability across multiple domains and understandability by all stakeholders. However, unstructured NL is prone to quality problems (e.g., ambiguity) in writing requirements, which can result in project failures. To address this issue, we present a tool, named Paska, that automatically detects quality problems as smells in NL requirements and offers recommendations to improve their quality. Our approach relies on natural language processing (NLP) techniques and, most importantly, a state-of-the-art controlled natural language (CNL) for requirements (Rimay), to detect smells and suggest recommendations using patterns defined in Rimay to improve requirement quality. We evaluated Paska through an industrial case study in the financial domain involving 13 systems and 2725 annotated requirements. The results show that our tool is accurate in detecting smells (precision of 89% and recall of 89%) and suggesting appropriate Rimay pattern recommendations (precision of 96% and recall of 94%).
... Requirements anomalies are indicators that there is "something wrong" with the description of a requirement. Some examples of anomalies are comparative terms, such as "better than", and superlatives, such as "best performance" or "shortest response time", when applied to the specification of a software requirement [Femmer et al. 2017]. These terms are considered anomalies because they introduce subjectivity into the text of the requirements document, hindering its comprehension and interpretation. ...
... Section 5 presents information on the current state of this research and, finally, Section 6 presents the expected results and the final considerations of this work. For Femmer et al. (2017), requirements anomalies are symptoms of defects in the requirements specification of a software system, with a concrete location and a detection mechanism. In general, such anomalies are defined on the basis of standards created to assist practitioners in specifying requirements, such as IEEE 830 (1998) and ISO/IEEE 29148 (2018). ...
... In general, such anomalies are defined on the basis of standards created to assist practitioners in specifying requirements, such as IEEE 830 (1998) and ISO/IEEE 29148 (2018). Because they have a concrete location in the requirements document (for example, a word or a sentence), requirements anomalies can be detected using Natural Language Processing techniques such as POS tagging (Part-Of-Speech tagging), morphological analysis, dictionaries, and lemmatization [Femmer et al. 2017]. The work of Femmer et al. (2017) presents the concept of requirements anomalies and then an approach for identifying anomalies derived from the ISO/IEEE 29148 (2018) standard. ...
Conference Paper
The lack of knowledge about the strengths and weaknesses of existing approaches to requirements anomalies is a gap identified in the current literature. Consequently, this work sets out to perform a comparative analysis of existing approaches for identifying requirements anomalies in order to determine their effectiveness (coverage and precision), as well as their strengths and weaknesses. In addition, it intends to specify recommendations for improvements to the analyzed tools.
... Quality assessment tasks are concerned with detecting defects in software requirements specifications [102,103]. We recognized 17 papers focusing on tasks related to the following quality assessment tasks: ...
... • General Assessment: Other papers suggest approaches providing a general assessment for different aspects of requirements quality. Three papers can be classified under this type [118,102,103] ...
... -The majority of papers focused on lexical and syntactic quality-related issues such as lexical ambiguity and conformance with templates. Many significant contributions provide experiments that indicate the effectiveness of using lexical and syntactic features to represent requirements when handling this category of tasks [103,102,114,144]. ...
Article
Full-text available
Natural Language Processing (NLP) is widely used to support the automation of different Requirements Engineering (RE) tasks. Most of the proposed approaches start with various NLP steps that analyze requirements statements, extract their linguistic information, and convert them to easy-to-process representations, such as lists of features or embedding-based vector representations. These NLP-based representations are usually used at a later stage as inputs for machine learning techniques or rule-based methods. Thus, requirements representations play a major role in determining the accuracy of different approaches. In this paper, we conducted a survey in the form of a systematic literature mapping (classification) to find out (1) what are the representations used in RE tasks literature, (2) what is the main focus of these works, (3) what are the main research directions in this domain, and (4) what are the gaps and potential future directions. After compiling an initial pool of 2,227 papers, and applying a set of inclusion/exclusion criteria, we obtained a final pool containing 104 relevant papers. Our survey shows that the research direction has changed from the use of lexical and syntactic features to the use of advanced embedding techniques, especially in the last two years. Using advanced embedding representations has proved its effectiveness in most RE tasks (such as requirement analysis, extracting requirements from reviews and forums, and semantic-level quality tasks). However, representations that are based on lexical and syntactic features are still more appropriate for other RE tasks (such as modeling and syntax-level quality tasks) since they provide the required information for the rules and regular expressions used when handling these tasks. In addition, we identify four gaps in the existing literature, why they matter, and how future research can begin to address them.
... This satisfies the need for detecting potential defects in textual requirements at an early stage, as the cost for addressing these defects increases the longer they stay undetected, putting the project success at risk when treated poorly [2]. The applicability of quality factors is corroborated by the plethora of existing tools which automate their detection [3], [4]. Among the popular requirements quality factors are passive voice [4], where the use of a verb in passive voice is associated with ambiguity of a requirement due to the omission of the subject within a sentence, and sentence length [5], where exceeding a specific threshold of words or characters in a sentence is associated with complexity due to the sentence becoming increasingly hard to comprehend. ...
... The concept of requirements quality factors has been implicitly used in many publications over the last years: Femmer et al. [4], for instance, introduce nine requirements smells, which indicate quality violations in textual requirements. Din and Rine [9] propose a metric for requirements complexity, which is referred to as a requirements indicator. ...
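The two quality factors named above, passive voice and sentence length, lend themselves to simple heuristics. The sketch below is purely illustrative: the passive-voice regex is a crude stand-in for parser-based detection, and the 25-word threshold is an assumed value, since published thresholds vary.

```python
# Heuristic quality-factor checks: passive voice (regex approximation) and
# sentence length (word count against an assumed threshold).
import re

PASSIVE_HINT = re.compile(r"\b(is|are|was|were|be|been|being)\s+\w+(ed|en)\b", re.IGNORECASE)
MAX_WORDS = 25  # assumed threshold, not a value taken from the cited papers

def check_quality_factors(requirement: str):
    findings = []
    if PASSIVE_HINT.search(requirement):
        findings.append("passive_voice")   # the acting subject may be omitted
    if len(requirement.split()) > MAX_WORDS:
        findings.append("long_sentence")   # increasingly hard to comprehend
    return findings

print(check_quality_factors("The data shall be stored after each transaction is completed."))
```

A real detector would typically rely on a syntactic parser; the regex above, for example, misses irregular participles such as "was sent" and flags stative uses such as "is closed", which is precisely the precision/recall trade-off the surrounding papers evaluate.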
Preprint
Full-text available
Quality factors like passive voice or sentence length are commonly used in research and practice to evaluate the quality of natural language requirements since they indicate defects in requirements artifacts that potentially propagate to later stages in the development life cycle. However, as a research community, we still lack a holistic perspective on quality factors. This inhibits not only a comprehensive understanding of the existing body of knowledge but also the effective use and evolution of these factors. To this end, we propose an ontology of quality factors for textual requirements, which includes (1) a structure framing quality factors and related elements and (2) a central repository and web interface making these factors publicly accessible and usable. We contribute the first version of both by applying a rigorous ontology development method to 105 eligible primary studies and construct a first version of the repository and interface. We illustrate the usability of the ontology and invite fellow researchers to a joint community effort to complete and maintain this knowledge repository. We envision our ontology to reflect the community's harmonized perception of requirements quality factors, guide reporting of new quality factors, and provide central access to the current body of knowledge.
... Previous research found that test specifications written in natural language contain a rather high degree of cloning and bad structure [2], [3], which can influence the cost of maintaining and executing test cases. For requirements [4], [5] as well as for system test cases [3] written in natural language, so-called bad smells have been established as indicators to identify poorly written natural language artifacts. ...
... Related to quality assurance of other natural language artifacts such as requirements, several researchers have focused on the detection of clones [6], requirement similarity [7] and ambiguity [8]. Femmer et al. [4] proposed to detect issues via requirement smells derived from the quality attributes of natural language requirements in ISO/IEC/IEEE 29148. Hauptmann et al. [3] were the first to study smells in natural language test specifications. ...
Preprint
Full-text available
In large-scale embedded system development, requirement and test specifications are often expressed in natural language and play an important role in software quality. In the context of developing such products, requirement review is in many cases performed manually, using these specifications as a basis for quality assurance. Low-quality specifications can have expensive consequences during the requirement engineering process, especially if feedback loops during requirement engineering are long, leading to artifacts that are not easily maintainable, are hard to understand, and are inefficient to port to other system variants. We apply the idea of smells to specifications expressed in natural language, defining a set of bad smells for specifications. We developed a tool called NALABS (NAtural LAnguage Bad Smells), available on https://github.com/eduardenoiu/NALABS and used for automatically checking specifications. We discuss some of the decisions made for its implementation, and future work.
... All the requirement documents have to abide by strict quality criteria, and the requirement review is therefore a crucial activity to identify quality defects; it is traditionally performed manually and is thus time consuming and error prone. Rule-based Natural Language Processing (NLP) techniques [117,12,62,61,111,6,46] have been developed to automatically perform this task. However, the literature is lacking empirical studies on the application of these techniques in industrial settings. ...
... Automated NLP approaches for defect detection can be categorised into those that use rule-based techniques [117,12,62,61,111,6,46] and those that leverage artificial intelligence techniques [28,118,50]. Our contribution falls into the first category, which collects all the works in which defects are identified based on linguistic patterns. ...
... All these works, and in particular the ones employing rule-based techniques, were used as fundamental references to define the defect detection patterns of our study. On the other hand, all the listed works provide limited validation in real industrial contexts, as noted also in [46]. Large data-sets annotated by experts were considered in [44]. ...
Thesis
Full-text available
The increased complexity and ubiquity of cyber-physical systems in recent times demands more efficient and cost-effective techniques to analyze software and hardware correctness, as well as to assess their performance at a given time in the future. Two disciplines that deal with these aspects of system development are verification and performance evaluation. During this thesis work we focused on methods for improving quality in both of these areas in the context of the railway safety-critical domain. Verifying a system means to prove or disprove that the system is the correct implementation of a specification, often expressed as a collection of properties (the Requirements) written in a given language. In the railway safety-critical domain the requirements play a key role in the product lifecycle as the system is developed and verified according to them; they are often expressed in natural language (which is flexible, but inherently ambiguous) despite the strong need for clearness and precision in this context. The requirements have to abide by strict quality criteria and the requirement review is therefore a very important activity to identify quality defects; it is traditionally performed manually. Rule-based natural language processing (NLP) techniques have been developed to automatically perform this task. However, the literature is lacking empirical studies on the application of these techniques in industrial settings. This thesis mainly focuses on investigating to which extent NLP can be practically applied to detect defects in the requirements documents of a railway signalling manufacturer. The contribution is in carrying out one of the first works in which NLP techniques for defect detection are applied to a large set of industrial requirements annotated by domain experts. We contribute with a comparison between traditional manual techniques used in industry for requirements analysis and analysis performed with NLP. Our experience shows that several discrepancies can be observed between the two approaches. The analysis of the discrepancies offers hints to improve the capabilities of NLP techniques with company-specific solutions, and suggests that company practices also need to be modified to effectively exploit NLP tools. Concerning the performance evaluation area, we had the opportunity to focus on system availability in the context of a different project of the laboratory. With increasing city populations, the integration of public and private transport flows introduces new challenges, especially in urban transport. As is often the case in scientific and engineering problems, the object of study is a model of the system, rather than the system itself. We provide a modeling and analysis method using stochastic Time Petri Nets for those city intersections where the integration of public and private transport flows is often a cause of traffic congestion, leading to train delays and even run deletions. The use of STPNs instead of simulation techniques provides a more effective way to set timings for traffic lights and train timetables in order to improve system availability.
... Another approach would be to build a tool to enhance requirements quality automatically. Although there is limited research in this specific area [10], [11] and [12] are reasonable attempts at using NLP in this direction. However, both studies present low precision and recall. ...
... Therefore, this research plans to create a rich, manually labeled dataset, in which each language criterion defined in ISO 29148 is considered a label, and to deploy Deep Learning to improve the precision and recall previously observed in [11], [12], which would increase practitioners' trust in using such a tool. Once a Deep Learning model is trained, it usually makes better predictions than classical Machine Learning approaches or classical NLP [20], which could overcome the poor generalization capability of [11], [12] since these models are less sensitive to the domain. Besides, most recent studies reveal that applying transfer learning, a technique used in Deep Learning, is successful since RE suffers from a lack of datasets [13]-[19], [21]. ...
Preprint
Full-text available
Requirements Engineering (RE) is the initial step towards building a software system. The success or failure of a software project is firmly tied to this phase, which is based on communication among stakeholders using natural language. The problem with natural language is that it can easily lead to different understandings if it is not expressed precisely by the stakeholders involved, which results in building a product different from the expected one. Previous work proposed to enhance the quality of software requirements by detecting language errors based on the ISO 29148 requirements language criteria. The existing solutions apply classical Natural Language Processing (NLP) to detect them. NLP has some limitations, such as domain dependency, which results in poor generalization capability. Therefore, this work aims to improve on the previous work by creating a manually labeled dataset and using ensemble learning, Deep Learning (DL), and techniques such as word embeddings and transfer learning to overcome the generalization problem tied to classical NLP and to improve precision and recall metrics. The current findings show that the dataset is unbalanced and indicate which classes need more examples. It is tempting to train algorithms even if the dataset is not considerably representative. Hence, the results show that the models are overfitting; in Machine Learning this issue is solved by adding more instances to the dataset, improving label quality, removing noise, and reducing the complexity of the learning algorithms, which is planned for this research.
... An ongoing research endeavor [7] collects these quality factors and indicates their limitations. Most existing publications either fail to gauge the impact of these metrics [45] or explicitly disregard their relationship [46]. Requirements quality models [47,48] integrate these factors into larger frameworks but often remain vague on their notion of impact. ...
... Both cost and resources are reported only rarely (9/57 = 15.8% and 5/57 = 8.8%, respectively) and, if so, only hypothesized or referenced, never determined empirically. Money and time are mentioned as the resources affected by activity impact, and the cost is only estimated in terms of expected change (e.g., "reduction of the time spent" [46]) or general magnitude (e.g., "significant amounts of money" [67]). ...
Article
Full-text available
High-quality requirements minimize the risk of propagating defects to later stages of the software development life cycle. Achieving a sufficient level of quality is a major goal of requirements engineering. This requires a clear definition and understanding of requirements quality. Though recent publications make an effort at disentangling the complex concept of quality, the requirements quality research community lacks identity and a clear structure to guide advances and put new findings into a holistic perspective. In this research commentary, we contribute (1) a harmonized requirements quality theory organizing its core concepts, (2) an evaluation of the current state of requirements quality research, and (3) a research roadmap to guide advancements in the field. We show that requirements quality research focuses on normative rules and mostly fails to connect requirements quality to its impact on subsequent software development activities, impeding the relevance of the research. Adherence to the proposed requirements quality theory and following the outlined roadmap will be a step toward amending this gap.
... A smell in the context of requirements is defined as a quality violation, which can lead to a defect, with a specific location and detection mechanism [4]. ISO/IEC/IEEE 29148 defines a set of bad smells, including: subjective language, ambiguous adverbs and adjectives, loopholes, open-ended non-verifiable terms, superlatives, comparative phrases, negative statements, vague pronouns, and incomplete references, among others [4]. Currently, requirements are written in natural language; therefore, quality control on a requirement is carried out by peer reviews [3]. ...
Article
Full-text available
One of the activities responsible for success in software development projects is requirements specification, whose purpose is to ensure that the customer's wishes or needs accurately represent what they expect. A clear and structured process during requirements specification avoids rework in later stages of the project life cycle, generating benefits in terms of time estimation for new tasks, cost, and effort. In this sense, it is important to have mechanisms or techniques that allow possible errors to be identified and mitigated during requirements specification. In particular, software engineering proposes the term "smell", which can be defined as a concrete symptom that can generate defects in a requirement. With the aim of establishing a broader state of knowledge around the identification and classification of smells present during requirements specification and their impact on the generation of a phenomenon known as requirements debt, this article presents the results obtained from a systematic mapping of the literature, which describes the proposals, initiatives, results, technological tools, benefits, and challenges around the identification and management of smells in the requirements elicitation stage during the development of software solutions.
... As a result, we hypothesize that CiRA's robustness against grammatical mistakes is limited to a few errors in a sentence. We, therefore, propose to combine CiRA with requirements smell checkers (Femmer et al., 2017) in the future to automatically verify the linguistic quality of requirements before passing them into the CiRA pipeline. ...
... Since the early 1980s, NLP techniques have been applied to RE artifacts to support a variety of use cases: e.g., requirements classification (Hey et al., 2020), topic modeling (Gülle et al., 2020), and quality checks (Femmer et al., 2017). A comprehensive overview of existing NLP4RE tools is provided by Zhao et al. (2021). ...
Article
Acceptance testing is crucial to determine whether a system fulfills end-user requirements. However, the creation of acceptance tests is a laborious task entailing two major challenges: (1) practitioners need to determine the right set of test cases that fully covers a requirement, and (2) they need to create test cases manually due to insufficient tool support. Existing approaches for automatically deriving test cases require semi-formal or even formal notations of requirements, though unrestricted natural language is prevalent in practice. In this paper, we present our tool-supported approach CiRA (Conditionals in Requirements Artifacts) capable of creating the minimal set of required test cases from conditional statements in informal requirements. We demonstrate the feasibility of CiRA in a case study with three industry partners. In our study, out of 578 manually created test cases, 71.8% can be generated automatically. Additionally, CiRA discovered 80 relevant test cases that were missed in manual test case design. CiRA is publicly available at www.cira.bth.se/demo/.
... A large number of tools have since been developed, among which are SREE (Tjong and Berry [25]) for ambiguity detection and aToucan (Yue et al. [26]) for model generation. Further developments include tools for the detection of defects [27], smells [28] and equivalent requirements [29]. ...
... Term-Document Matrix: a mathematical matrix that describes the frequency of terms that occur in a collection of documents. Character Counting: counts the number of characters in a line of text, page or group of text. Concordance: an alphabetical list of the words (especially the important ones) present in a text, usually with citations of the passages in which they are found. ...
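For instance, the term-document matrix from the glossary excerpt above can be built in a few lines with scikit-learn; the three toy requirements are invented for illustration.

```python
# Build a term-document matrix over a handful of toy requirements.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "The system shall encrypt stored passwords.",
    "The system shall log failed login attempts.",
    "Passwords shall be encrypted and failed attempts reviewed.",
]

vectorizer = CountVectorizer()
dtm = vectorizer.fit_transform(docs)        # document-term matrix (rows: documents)
tdm = dtm.T                                 # transpose: term-document matrix (rows: terms)
print(vectorizer.get_feature_names_out())   # the vocabulary (row labels of tdm)
print(tdm.toarray())                        # term frequencies per document
```

The term-document matrix is one of the classical, frequency-based techniques that the overview below organizes alongside more recent embedding-based approaches.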
Preprint
Full-text available
Research in applying natural language processing (NLP) techniques to requirements engineering (RE) tasks spans more than 40 years, from initial efforts carried out in the 1980s to more recent attempts with machine learning (ML) and deep learning (DL) techniques. However, in spite of the progress, our recent survey shows that there is still a lack of systematic understanding and organization of commonly used NLP techniques in RE. We believe one hurdle facing the industry is the lack of shared knowledge of NLP techniques and their usage in RE tasks. In this paper, we present our effort to synthesize and organize the 57 most frequently used NLP techniques in RE. We classify these NLP techniques in two ways: first, by their NLP tasks in typical pipelines and second, by their linguistic analysis levels. We believe these two ways of classification are complementary, contributing to a better understanding of the NLP techniques in RE, and such understanding is crucial to the development of better NLP tools for RE.
... We classify these works into three categories. Pattern-based methods use some special terms and expressions of different Part-of-Speech (PoS) and other patterns [6,18,19,23,67,72] for inconsistency detection, or heuristics to tackle coordination or anaphoric ambiguities [7,78]. Learning-based methods [5,17,56] use information retrieval (IR) techniques such as Latent Semantic Indexing (LSI) or unsupervised clustering algorithms such as K-Means. ...
... They compared different word embeddings of one identical term from different domains to estimate its potential ambiguity across the domains of interest. There are some works using special terms and expressions with different PoS or patterns [6,18,19,23,67,72]. Other works use heuristics to tackle coordination ambiguities (i.e., ambiguities brought by "and" or "or" conjunctions) [7] and anaphoric ones (i.e., ambiguities brought by pronouns) [78]. ...
Article
Full-text available
Requirements are usually written in natural language and evolve continuously during the process of software development, which involves a large number of stakeholders. Stakeholders with diverse backgrounds and skills might refer to the same real-world entity with different linguistic expressions in the natural-language requirements, resulting in requirement inconsistency. We define this phenomenon as Entity Coreference (EC) in the Requirement Engineering (RE) area. It can lead to misconception about technical terminologies, and harm the readability and long-term maintainability of the requirements. In this paper, we propose a DEEP context-wise method for entity COREFerence detection, named DeepCoref. First, we truncate corresponding contexts surrounding entities. Then, we construct a deep context-wise neural network for coreference classification. The network consists of one fine-tuning BERT model for context representation, a Word2Vec-based network for entity representation, and a multi-layer perceptron in the end to fuse and make a trade-off between two representations. Finally, we cluster and normalize coreferent entities. We evaluate our method, respectively, on coreference classification and clustering with 1853 industry data on 21 projects. The former evaluation shows that DeepCoref outperforms three baselines with average precision and recall of 96.10% and 96.06%, respectively. The latter evaluation on six metrics shows that DeepCoref can cluster coreferent entities more accurately. We also conduct ablation experiments with three variants to demonstrate the performance enhancement brought by different components of neural network designed for coreference classification.
... The data set of 3442 requirements is filtered for all requirements in final states, which are execution completed (EC). (Table fragment: requirement counts per bin, [4,6]: 457; [7,9]: 96; [10,12]: 15; [13,15]: 2; [16,18]: 1; [19,21]: 1.)
... The small extent of the correlations and their low effect size according to the applied measures emphasize that the occurrence of causality is definitely not the only or the most impactful factor, but certainly a considerable one, for the features of requirements. Considering the detection of causality with the approach presented in this research endeavor as a complement to other requirements quality frameworks, such as requirements smells [12], might benefit the reliability of these quality metrics by taking positive effects on requirements into account. Future studies need to investigate this claim in further detail. ...
Article
Full-text available
Causal relations in natural language (NL) requirements convey strong, semantic information. Automatically extracting such causal information enables multiple use cases, such as test case generation, but it also requires to reliably detect causal relations in the first place. Currently, this is still a cumbersome task as causality in NL requirements is still barely understood and, thus, barely detectable. In our empirically informed research, we aim at better understanding the notion of causality and supporting the automatic extraction of causal relations in NL requirements. In a first case study, we investigate 14,983 sentences from 53 requirements documents to understand the extent and form in which causality occurs. Second, we present and evaluate a tool-supported approach, called CiRA, for causality detection. We conclude with a second case study where we demonstrate the applicability of our tool and investigate the impact of causality on NL requirements. The first case study shows that causality constitutes around 28% of all NL requirements sentences. We then demonstrate that our detection tool achieves a macro-F1 score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision. Finally, our second case study corroborates the positive correlations of causality with features of NL requirements. The results strengthen our confidence in the eligibility of causal relations for downstream reuse, while our tool and publicly available data constitute a first step in the ongoing endeavors of utilizing causality in RE and beyond.
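CiRA itself relies on trained models, but a far simpler cue-phrase heuristic illustrates what detecting causal relations means at the surface level. The cue list below is an illustrative assumption, such a heuristic would miss implicit causality entirely, and it is not the approach evaluated in the paper.

```python
# Toy heuristic: flag sentences that contain an explicit causal cue phrase.
import re

CAUSAL_CUES = re.compile(r"\b(if|when|whenever|because|since|as soon as|in case)\b", re.IGNORECASE)

def contains_causal_cue(sentence: str) -> bool:
    return CAUSAL_CUES.search(sentence) is not None

requirements = [
    "If the temperature exceeds 90 degrees, the system shall shut down the pump.",
    "The system shall support up to 500 concurrent users.",
]
for r in requirements:
    print(contains_causal_cue(r), "-", r)
```

Cue phrases like these are also a plausible first filter before handing sentences to a heavier classifier, keeping the expensive model focused on likely candidates.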
... These measures allow categorizing the magnitude of the correlation effect. (Table fragment: requirement counts per bin, [4,6]: 457; [7,9]: 96; [10,12]: 15; [13,15]: 2; [16,18]: 1; [19,21]: 1; total [0,21]: 3631.) ...
... The small extent of the correlations and their low effect size according to the applied measures emphasize that the occurrence of causality is definitely not the only or the most impactful factor, but certainly a considerable one, for the features of requirements. Considering the detection of causality with the approach presented in this research endeavor as a complement to other requirements quality frameworks, such as requirements smells [12], might benefit the reliability of these quality metrics by taking positive effects on requirements into account. Future studies need to investigate this claim in further detail. ...
Preprint
Full-text available
Background: Causal relations in natural language (NL) requirements convey strong, semantic information. Automatically extracting such causal information enables multiple use cases, such as test case generation, but it also requires to reliably detect causal relations in the first place. Currently, this is still a cumbersome task as causality in NL requirements is still barely understood and, thus, barely detectable. Objective: In our empirically informed research, we aim at better understanding the notion of causality and supporting the automatic extraction of causal relations in NL requirements. Method: In a first case study, we investigate 14,983 sentences from 53 requirements documents to understand the extent and form in which causality occurs. Second, we present and evaluate a tool-supported approach, called CiRA, for causality detection. We conclude with a second case study where we demonstrate the applicability of our tool and investigate the impact of causality on NL requirements. Results: The first case study shows that causality constitutes around 28% of all NL requirements sentences. We then demonstrate that our detection tool achieves a macro-F1 score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision. Finally, our second case study corroborates the positive correlations of causality with features of NL requirements. Conclusion: The results strengthen our confidence in the eligibility of causal relations for downstream reuse, while our tool and publicly available data constitute a first step in the ongoing endeavors of utilizing causality in RE and beyond.
... Transferring the concept of code smells to requirements engineering, Femmer et al. [29] introduced a lightweight static requirements analysis approach that allows for quick checks when requirements are written down in natural language. In another work, Femmer et al. [30] derived a set of smells from the natural language criteria of the ISO/IEC/IEEE 29148 standard, showing that lightweight smell analysis can uncover many practically relevant requirements defects. ...
Preprint
Full-text available
Background: Test smells indicate potential problems in the design and implementation of automated software tests that may negatively impact test code maintainability, coverage, and reliability. When poorly described, manual tests written in natural language may suffer from related problems, which enables their analysis from the point of view of test smells. Despite the potential harm to manually tested software products, little is known about test smells in manual tests, which leaves many open questions regarding their types, frequency, and harm to tests written in natural language. Aims: Therefore, this study aims to contribute to a catalog of test smells for manual tests. Method: We follow a two-fold empirical strategy. First, an exploratory study in manual tests of three systems: the Ubuntu Operating System, the Brazilian Electronic Voting Machine, and the User Interface of a large smartphone manufacturer. We use our findings to propose a catalog of eight test smells and identification rules based on syntactical and morphological text analysis, validating our catalog with 24 in-company test engineers. Second, using our proposals, we create a tool based on Natural Language Processing (NLP) to analyze the subject systems' tests, validating the results. Results: We observed the occurrence of eight test smells. A survey of 24 in-company test professionals showed that 80.7% agreed with our catalog definitions and examples. Our NLP-based tool achieved a precision of 92%, recall of 95%, and f-measure of 93.5%, and its execution evidenced 13,169 occurrences of our cataloged test smells in the analyzed systems. Conclusion: We contribute a catalog of natural language test smells and novel detection strategies that better explore the capabilities of current NLP mechanisms, with promising results and reduced effort to analyze tests written in different idioms.
... However, their experiments show that the extracted metrics cannot be used to assess the quality of a test suite. Femmer et al. (2014, 2017) introduced the concept of requirement smells and conducted an empirical evaluation of their approach using industrial projects. In their work, the authors present an automated static analysis technique relying on natural language processing (NLP) to detect smells in requirements. ...
Article
Full-text available
Test smells are known as bad development practices that reflect poor design and implementation choices in software tests. Over the last decade, there have been few attempts to study test smells in the context of system tests that interact with the System Under Test through a Graphical User Interface. To fill this gap, we conduct an exploratory analysis of test smells occurring in System User Interactive Tests (SUIT). We thus compose a catalog of 35 SUIT-specific smells, identified through a multi-vocal literature review, and show how they differ from smells encountered in unit tests. We also conduct an empirical analysis to assess the diffuseness and removal of these smells in 48 industrial repositories and 12 open-source projects. Our results show that the same types of smells tend to appear in both industrial and open-source projects, but they are not addressed in the same way. We also find that smells originating from a combination of multiple code locations appear more often than those that are localized on a single line. This happens because of the difficulty of observing non-local smells without tool support. Furthermore, we find that smell-removing actions are not frequent, with less than 50% of the affected tests ever undergoing a smell removal. Interestingly, while smell-removing actions are rare, some smells disappear when tests are discarded, i.e., these smells do not appear in the follow-up tests that replace the discarded ones.
... RS impacts all software development project stages [1], as it may comprise the Business Requirements Specification (BRS), User Requirements Specification (URS), and Software Requirements Specification (SRS) that define the expectations of the stakeholders of the system to be developed. RS is commonly expressed in natural language [2], which is exposed to requirement smells [3] such as ambiguity, inconsistency, incompleteness, etc. [4]. The requirement smells impact the requirements' quality, which may lead to low product quality. ...
... The NaPiRE [8,9] project, which involves more than 200 companies in 10 countries, has mapped several kinds of bad requirements to factors for project failure, or linked these requirements problems with project delays or budget overruns. Similarly, several studies (e.g., [4,12,13,20]) have tried to relate requirements quality to requirements smells [6]. However, what is still unclear is what the impact of these smells would be. ...
Preprint
Requirements are key artefacts to describe the intended purpose of a software system. The quality of requirements is crucial for deciding what to do next, impacting the development process's effectiveness and efficiency. However, we know very little about the connection between practitioners' perceptions of requirements quality and its impact on the process or on the feelings of the professionals involved in the development process. Objectives: This study investigates: i) how software development practitioners define requirements quality, ii) how the perceived quality of requirements impacts the process and stakeholders' well-being, and iii) what the causes and potential solutions for poor-quality requirements are. Method: This study was performed as a descriptive interview study at a sub-organization of a Nordic bank that develops its own web and mobile apps. The data collection comprises interviews with 20 practitioners, including requirements engineers, developers, testers, and newly employed developers, with five interviewees from each group. Results: The results show that different roles have different views on what makes a requirement of good quality. Participants highlighted that, in general, they experience negative emotions, more work, and communication overhead when they work with requirements they perceive to be of poor quality. The practitioners also describe positive effects on their performance and positive feelings when they work with requirements that they perceive to be good.
... In the Tournify case, the use of a wizard to express user stories led to shorter and crisper ideas than those we derived from the KMar cases. Future work should investigate more thoroughly how authoring tools (similar to those that support requirements engineers [58]) can assist crowd participants in the task of expressing high-quality requirements. The wizard can also be extended to more interactive techniques such as requirements bots [59]. ...
Article
Full-text available
Crowd-based Requirements Engineering (CrowdRE) promotes the active involvement of a large number of stakeholders in RE activities. A prominent strand of CrowdRE research concerns the creation and use of online platforms for a crowd of stakeholders to formulate ideas, which serve as an additional input for requirements elicitation. Most of the reported case studies are of small size, and they analyze the size of the crowd, rather than the quality of the collected ideas. By means of an iterative design that includes three case studies conducted at two organizations, we present the CREUS method for crowd-based elicitation via user stories. Besides reporting the details of these case studies and quantitative results on the number of participants, ideas, votes, etc., a key contribution of this paper is a qualitative analysis of the elicited ideas. To analyze the quality of the user stories, we apply criteria from the Quality User Story framework, we calculate automated text readability metrics, and we check for the presence of vague words. We also study whether the user stories can be linked to software qualities, and the specificity of the ideas. Based on the results, we distill six key findings regarding CREUS and, more generally, for CrowdRE via pull feedback.
... The rest of the complexity builds up accidentally, due to inept use of the language. To reduce this complexity, several techniques have been proposed in the literature [17], [18], [19]. In short, these techniques promote standardized patterns for writing simple requirements and recommend language constructs that decrease complexity. ...
Preprint
It is generally accepted that code complexity degrades code quality; therefore, software engineers always strive to understand and reduce it. To help in this pursuit, over the past 50 years, researchers have created a number of metrics in an attempt to quantify code complexity. Developing a metric for complexity is an admirable undertaking, unthinkable in many other disciplines. But as the evaluations have revealed, the existing code complexity metrics are of little help for complexity reduction in practice. There are two pivotal problems thus far: 1) the metrics fail to quantify the true magnitude of code complexity and 2) they are incapable of distinguishing essential from accidental complexity. In this paper I propose a theory which solves these two problems to a pragmatic degree. To develop the theory, I used the epistemological foundations of system complexity, the existing knowledge of software measurement, and the theory of cognitive psychology.
... Using 16 CVEs from first.org [13], cues were identified through a set of keywords that were generated using an existing technique for identifying smells within requirement specification documents [12]. Their results suggest that the presence of certain keywords within vulnerability descriptions (that could be linked to the attack and vulnerability type) is effective in reducing the error on exploitability metrics. ...
Conference Paper
Full-text available
Common Vulnerability and Exposure (CVE) reports published by Vulnerability Management Systems (VMSs) are used to evaluate the severity and exploitability of software vulnerabilities. Public vulnerability databases such as the NVD use the Common Vulnerability Scoring System (CVSS) to assign various scores to CVEs to evaluate their base severity, impact, and exploitability. Previous studies have shown that vulnerability databases rely on a manual, labor-intensive and error-prone process which may lead to inconsistencies in the CVE data and delays in releasing new CVEs. Furthermore, it was shown that CVSS scoring is based on complex calculations and may not be accurate enough in assessing the potential severity and exploitability of vulnerabilities in real life. This work uses Convolutional Neural Networks (CNN) to train text classification models to automate the prediction of the severity and exploitability of CVEs, and proposes a new exploitability scoring method by creating a Product Hygiene Index based on the Common Product Enumeration (CPE) catalog. Using CVE descriptions published by the NVD and the exploits identified by exploit databases, it trains CNN models to predict the base severity and exploitability of CVEs. Preliminary experiment results and the conducted case study indicate that the severity of CVEs can be predicted automatically with high confidence, and the proposed exploitability scoring method achieves better results compared to the exploitability scoring provided by the NVD.
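As a rough illustration of the kind of text classification described in this abstract, the sketch below trains a small 1D-CNN on toy CVE descriptions with tf.keras. The architecture, vocabulary size, labels, and data are placeholders and do not reproduce the authors' models or the Product Hygiene Index.

```python
import tensorflow as tf

# Toy CVE descriptions with coarse severity labels (0 = low, 1 = medium, 2 = high).
texts = [
    "buffer overflow allows remote attackers to execute arbitrary code",
    "improper input validation leads to denial of service",
    "information disclosure through verbose error messages",
]
labels = [2, 1, 0]

# Turn free text into fixed-length integer sequences.
vectorize = tf.keras.layers.TextVectorization(max_tokens=5000, output_sequence_length=40)
vectorize.adapt(texts)
x = vectorize(tf.constant(texts))

# Minimal 1D-CNN text classifier (placeholder hyperparameters).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=5000, output_dim=64),
    tf.keras.layers.Conv1D(filters=128, kernel_size=5, padding="same", activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x, tf.constant(labels), epochs=5, verbose=0)

new_cve = vectorize(tf.constant(["stack overflow enables remote code execution"]))
print(model.predict(new_cve))  # class probabilities for low/medium/high
```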
... The use of paper leaves physical reminders of any mistakes made by its user, however minor they may be, even if an eraser is used. These reminders can serve as evocative aides-mémoires of the quote "to err is human", of RE smells or anti-patterns introduced and removed through 'iterative improvement' [26], and/or as acknowledgement of a 'lesson learned', hoping not to repeat the same or similar types of mistakes again. This (embracing and learning to live with one's mistakes) is crucial to the lifelong learning of students. ...
Conference Paper
Full-text available
It is broadly accepted that requirements engineering is one of the most important phases of a software project, and requires tools to be effective. For a variety of reasons, paper as a tool has lasted for millennia and remains ubiquitous. This paper makes a case for a contextual, conscientious, and evidence-based use of paper in a competency-oriented approach to software requirements engineering education (REE). It argues that the prophecies for the obsolescence of paper are premature, there are unique benefits in the use of paper, and the decision to use paper should be based on [0, 1] rather than {0, 1}. In this regard, a need-centered conceptual model for human-paper interaction is proposed. The characteristics of paper that make it historically unique are reported and the affordances of paper relevant to REE are discussed. The REE-related activities that benefit from viewing paper as a boundary object and using different types of paper are highlighted and illustrated by means of examples. In advocating polyliteracy, the potential for a convergence of paper and digital media towards a harmonic coexistence is underscored.
... One study aimed at employing rule-based NLP techniques to find quality defects in industrial natural language requirements annotated by domain experts [6]. Another approach applies the code smells concept to requirements engineering to identify defects in requirements [7], but it is subjective and relies on multiple reviews. One of the main contributions of most NLP tools is to automate the extraction and translation of natural language requirements into conceptual models so that they can be manually validated by the analysts themselves [8], [9]. ...
Article
Full-text available
Specifying customers' needs as software requirements in natural language creates ambiguities in the requirements and may also lead to failure of the software project. Generally, customers are unable to define their needs due to a lack of domain understanding, technological constraints and the knowledge gap between stakeholders and requirements analysts. One of the most effective approaches to minimize these gaps and ambiguities is the use of ontologies for requirements specification and validation. However, the current approaches are mostly limited to the translation of ambiguous software requirements. In this paper, we have discussed, analyzed and compared the current usage of these ontologies and found that these approaches are time-consuming and create complexities in the overall development process. We have presented a requirements specification ontology (ReqSpecOnto), bypassing the need for creating an ambiguous Software Requirement Specification (SRS). The upper software requirements ontology is defined in the Web Ontology Language (OWL) and can be applied in different software scenarios. A case study of a budget and planning system for a state physics lab is selected to specify its requirements as an ontology derived from the upper ontology created. Results are validated through the HermiT and Pellet reasoners to verify the defined relationships and constraints. Finally, SPARQL queries are used to obtain the necessary requirements.
... Other works suggested providing tools to assist in requirements authoring process. Femmer et al. [39] proposed a lightweight requirements analysis approach named Smella based on the natural language criteria of ISO 29148 [40] which imposes constraints for "good requirements specifications". ...
Article
Full-text available
Requirements are textual representations of desired software capabilities. Many templates have been used to standardize the structure of requirement statements, such as Rupp's, EARS, and User Stories. Templates provide a good solution for improving different Requirements Engineering (RE) tasks, since their well-defined syntax facilitates the text processing steps in RE automation research. However, many empirical studies have concluded that there is a gap between these RE research efforts and their implementation in industrial and real-life projects. The success of RE automation approaches strongly depends on the consistency of the requirements with the syntax of the predefined templates. Such consistency cannot be guaranteed in real projects, especially in large development projects, or when one has little control over the requirements authoring environment. In this paper, we propose an unsupervised approach to recognize templates from the requirements themselves by extracting their common syntactic structures. The resulting templates reflect the actual syntactic structure of the requirements; hence the approach can recognize both standard and non-standard templates. Our approach uses techniques from Natural Language Processing and Graph Theory and handles the problem in three main stages: (1) we formulate the problem as a graph problem, where each requirement is represented as a vertex and each pair of requirements has a structural similarity; (2) we detect the main communities in the resulting graph by applying a hybrid technique combining limited dynamic programming and greedy algorithms; (3) finally, we reinterpret the detected communities as templates. Our experiments show that the suggested approach can detect templates that follow well-known standards with a 0.90 F1-measure. Moreover, the approach can detect common syntactic features for non-standard templates in more than 73.5% of the cases. Our evaluation indicates that these results are robust regardless of the number and the length of the processed requirements.
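A toy version of the pipeline sketched in this abstract is shown below: each requirement is reduced to a coarse structural signature (a stand-in for proper POS tagging), pairwise similarities above a threshold form a graph, and connected components play the role of the paper's hybrid community detection. The word classes, threshold, and example requirements are all assumptions.

```python
import difflib
import itertools

def signature(req: str) -> list[str]:
    """Coarse structural signature; a real implementation would use POS tags."""
    classes = []
    for tok in req.lower().replace(",", "").rstrip(".").split():
        if tok in {"the", "a", "an"}:
            classes.append("DET")
        elif tok in {"shall", "should", "must", "will"}:
            classes.append("MODAL")
        elif tok in {"be", "is", "provide", "display", "shown"} or tok.endswith("ing"):
            classes.append("VERB")
        else:
            classes.append("OTHER")
    return classes

def structural_similarity(a: str, b: str) -> float:
    return difflib.SequenceMatcher(None, signature(a), signature(b)).ratio()

reqs = [
    "The system shall display the account balance.",
    "The system shall provide a printable report.",
    "When the user logs in, the dashboard is shown.",
]

# Graph of requirements, connected when their structures are similar enough.
THRESHOLD = 0.8  # assumed cut-off
adj = {i: set() for i in range(len(reqs))}
for i, j in itertools.combinations(range(len(reqs)), 2):
    if structural_similarity(reqs[i], reqs[j]) >= THRESHOLD:
        adj[i].add(j)
        adj[j].add(i)

# Connected components as a simple stand-in for the paper's community detection stage.
seen = set()
for start in adj:
    if start in seen:
        continue
    group, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in group:
            group.add(node)
            stack.extend(adj[node] - group)
    seen |= group
    print("candidate template group:", [reqs[i] for i in sorted(group)])
```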
... One study aimed at employing rule-based NLP techniques to find quality defects in industrial natural language requirements annotated by domain experts [6]. Another approach applies the code smells concept to requirements engineering to identify defects in requirements [7], but it is subjective and relies on multiple reviews. One of the main contributions of most NLP tools is to automate the extraction and translation of natural language requirements into conceptual models so that they can be manually validated by the analysts themselves [8], [9]. ...
Article
Full-text available
The specification of customers' needs as software requirements in natural language creates ambiguities in the requirements and may cause the software project to fail. Generally, customers are unable to define their needs due to a lack of domain understanding, technological constraints, and the knowledge gap between the stakeholders and the requirements analysts. One of the most effective approaches to minimize these gaps and ambiguities for requirements specification and validation is the use of ontologies. However, the current approaches are mostly limited to the translation of ambiguous software requirements. This paper discussed, analyzed and compared the current usage of these ontologies and found that these approaches are time-consuming and create complexities in the overall development process. It presented a requirements specification ontology (ReqSpecOnto), bypassing the need for creating an ambiguous Software Requirement Specification (SRS). The upper software requirements ontology is defined in the Web Ontology Language (OWL), which can be applied to different software scenarios. A case study of a budget and planning system for a state physics lab was selected to specify its requirements as an ontology derived from the upper ontology created. The results are validated through the HermiT and Pellet reasoners to verify the defined relationships and constraints. Finally, SPARQL queries were used to obtain the necessary requirements.
... According to [38,39], only 5% of the project budget is needed to commence risk management, which can reduce the probability of schedule overrun by 50-70%. Furthermore, researchers have also presented models of risk estimation using the analytic hierarchy process, Bayesian belief networks, machine learning, risk metrics, fuzzy entropy, goal-oriented methodologies, decision trees, UML, etc. [40,41]. These methods collectively state that risk evaluation can efficiently help in producing quality software by lessening the exposure to software risk. Furthermore, some researchers [42,43] have addressed, in particular, requirements risk management, risk-based testing, project risk dependencies, etc. ...
Article
Full-text available
Internet of Things (IoT) systems are transforming traditional living into a new digital lifestyle. In the past, many investigations have been carried out to address the technological challenges and issues of IoT and have focused on achieving its full potential. The foremost requisite for IoT software system developers seeking a competitive edge is to include project-specific features and meet customer expectations effectively and accurately. Any failure during the Requirements Engineering (RE) phase can result in direct or indirect consequences for each succeeding phase of development. The challenge is all the greater because of the lack of approaches for IoT-based RE. The objective of this paper is to propose a requirements risk management model for IoT systems. The proposed model estimates requirements risk by considering both customers' and developers' perceptions. It combines multiple criteria using intuitionistic fuzzy logic and an analytical technique. This helps to handle the uncertainty and vagueness of human perception, providing a well-defined two-dimensional indication of customer value and risk. The validity of the approach is tested on real project data and is supported by a user study. To the best of our understanding, the literature lacks trade-off analysis at the RE level in IoT systems, and the presented work fills this gap in a novel way by improving (i) requirements risk assessment for IoT systems and (ii) the handling of developers' subjective judgments of multiple conflicting criteria, yielding more concrete and more observable results.
... Femmer et al. [18] defined a set of "requirements smells", which are undesirable requirement properties, and created a system called Smella to detect these requirement smells automatically. ...
Article
Full-text available
A common way to describe requirements in Agile software development is through user stories, which are short descriptions of desired functionality. Nevertheless, there are no widely accepted quantitative metrics to evaluate user stories. We propose a novel metric to evaluate user stories called instability, which measures the number of changes made to a user story after it was assigned to a developer to be implemented in the near future. A user story with a high instability score suggests that it was not detailed and coherent enough to be implemented. The instability of a user story can be automatically extracted from industry-standard issue tracking systems such as Jira by performing retrospective analysis over user stories that were fully implemented. We propose a method for creating prediction models that can identify user stories that will have high instability even before they have been assigned to a developer. Our method works by applying a machine learning algorithm on implemented user stories, considering only features that are available before a user story is assigned to a developer. We evaluate our prediction models on several open-source projects and one commercial project and show that they outperform baseline prediction models.
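A minimal sketch of how such an instability score could be computed from an issue tracker is given below. It assumes an issue exported from Jira's REST API with `expand=changelog`; the field names follow Jira's public schema but should be treated as illustrative, and the choice of which fields count as a "change" is an assumption rather than the authors' exact definition.

```python
from datetime import datetime

def _parse(ts: str) -> datetime:
    # Jira timestamps look like "2021-05-01T10:15:30.000+0000"; keep the date-time part.
    return datetime.strptime(ts[:19], "%Y-%m-%dT%H:%M:%S")

def instability(issue: dict) -> int:
    """Number of edits to a user story after it was first assigned to a developer."""
    histories = issue["changelog"]["histories"]

    # First moment the story received an assignee.
    assignment_times = [
        _parse(h["created"])
        for h in histories
        for item in h["items"]
        if item["field"] == "assignee" and item.get("to")
    ]
    if not assignment_times:
        return 0
    assigned_at = min(assignment_times)

    # Count later changes to requirement-carrying fields (assumed: summary/description).
    return sum(
        1
        for h in histories
        if _parse(h["created"]) > assigned_at
        for item in h["items"]
        if item["field"] in ("summary", "description")
    )

# Tiny fabricated example just to exercise the function.
example_issue = {
    "changelog": {"histories": [
        {"created": "2021-05-01T10:00:00.000+0000",
         "items": [{"field": "assignee", "to": "dev-42"}]},
        {"created": "2021-05-03T09:30:00.000+0000",
         "items": [{"field": "description", "to": None}]},
    ]}
}
print(instability(example_issue))  # -> 1
```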
... As a result, we hypothesize that CiRA's robustness against grammatical mistakes is limited to a few errors in a sentence. We therefore propose to combine CiRA with requirements smell checkers [9] in the future to automatically verify the linguistic quality of requirements before passing them into the CiRA pipeline. ...
Preprint
Full-text available
Acceptance testing is crucial to determine whether a system fulfills end-user requirements. However, the creation of acceptance tests is a laborious task entailing two major challenges: (1) practitioners need to determine the right set of test cases that fully covers a requirement, and (2) they need to create test cases manually due to insufficient tool support. Existing approaches for automatically deriving test cases require semi-formal or even formal notations of requirements, though unrestricted natural language is prevalent in practice. In this paper, we present our tool-supported approach CiRA (Conditionals in Requirements Artifacts) capable of creating the minimal set of required test cases from conditional statements in informal requirements. We demonstrate the feasibility of CiRA in a case study with three industry partners. In our study, out of 578 manually created test cases, 71.8 % can be generated automatically. Additionally, CiRA discovered 80 relevant test cases that were missed in manual test case design. CiRA is publicly available at www.cira.bth.se/demo/.
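The sketch below illustrates the basic idea of deriving a minimal test set from a conditional requirement of the form "If c1 and ... and cn then effect": one positive case plus one negative case per cause, each violating exactly one cause. This is a simplified stand-in under an "effect occurs exactly when all causes hold" reading, not CiRA's actual NLP pipeline.

```python
def minimal_test_cases(causes: list[str], effect: str) -> list[dict]:
    """Minimal test set for 'If <c1> and ... and <cn> then <effect>'.

    Assumes the requirement is read as: the effect occurs exactly when all
    causes hold (one positive case, plus one case per individually violated cause).
    """
    cases = [{**{c: True for c in causes}, "expected": f"{effect} occurs"}]
    for violated in causes:
        row = {c: (c != violated) for c in causes}
        row["expected"] = f"{effect} does not occur"
        cases.append(row)
    return cases

for case in minimal_test_cases(
    ["the user is logged in", "the cart is not empty"],
    "the checkout button is enabled",
):
    print(case)
```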
... The interpretation of the semantics of conditionals affects all activities carried out on the basis of documented requirements such as manual reviews, implementation, or test case generations. Even more, a correct interpretation is absolutely essential for all automatic analyses of requirements that consider the semantics of sentences; for instance, automatic quality analysis like smell detection [11], test case derivation [14,17], and dependency detection [13]. In consequence, conditionals should always be associated with a formal meaning to automatically process them. ...
Chapter
Full-text available
Context: Conditional statements like “If A and B then C” are core elements for describing software requirements. However, there are many ways to express such conditionals in natural language and also many ways in which they can be interpreted. We hypothesize that conditional statements in requirements are a source of ambiguity, potentially affecting downstream activities such as test case generation negatively. Objective: Our goal is to understand how specific conditionals are interpreted by readers who work with requirements. Method: We conduct a descriptive survey with 104 RE practitioners and ask how they interpret 12 different conditional clauses. We map their interpretations to logical formulas written in Propositional (Temporal) Logic and discuss the implications. Results: The conditionals in our tested requirements were interpreted ambiguously. We found that practitioners disagree on whether an antecedent is only sufficient or also necessary for the consequent. Interestingly, the disagreement persists even when the system behavior is known to the practitioners. We also found that certain cue phrases are associated with specific interpretations. Conclusion: Conditionals in requirements are a source of ambiguity and there is not just one way to interpret them formally. This affects any analysis that builds upon formalized requirements (e.g., inconsistency checking, test-case generation). Our results may also influence guidelines for writing requirements.
... The interpretation of the semantics of conditionals affects all activities carried out on the basis of documented requirements such as manual reviews, implementation, or test case generations. Even more, a correct interpretation is absolutely essential for all automatic analyses of requirements that consider the semantics of sentences; for instance, automatic quality analysis like smell detection [11], test case derivation [14,17], and dependency detection [13]. In consequence, conditionals should always be associated with a formal meaning to automatically process them. ...
Preprint
Full-text available
Context: Conditional statements like "If A and B then C" are core elements for describing software requirements. However, there are many ways to express such conditionals in natural language and also many ways in which they can be interpreted. We hypothesize that conditional statements in requirements are a source of ambiguity, potentially affecting downstream activities such as test case generation negatively. Objective: Our goal is to understand how specific conditionals are interpreted by readers who work with requirements. Method: We conduct a descriptive survey with 104 RE practitioners and ask how they interpret 12 different conditional clauses. We map their interpretations to logical formulas written in Propositional (Temporal) Logic and discuss the implications. Results: The conditionals in our tested requirements were interpreted ambiguously. We found that practitioners disagree on whether an antecedent is only sufficient or also necessary for the consequent. Interestingly, the disagreement persists even when the system behavior is known to the practitioners. We also found that certain cue phrases are associated with specific interpretations. Conclusion: Conditionals in requirements are a source of ambiguity and there is not just one way to interpret them formally. This affects any analysis that builds upon formalized requirements (e.g., inconsistency checking, test-case generation). Our results may also influence guidelines for writing requirements.
... A broad range of automated NLP approaches has been proposed to assess the quality of natural-language requirements. The approaches we are aware of focus on aspects such as ambiguity [16], [17], completeness [18], equivalence [19], variability [7], writing quality [20], [21], templates based on best practices [22], [6] or anti-patterns referring to bad practices [23], [24]. Alternatively, neural networks can be trained to single out badly specified requirements [5] without having to provide an exhaustive definition of "badly specified". ...
Conference Paper
Full-text available
Without a precise specification, an IT project might not remain within time and budget constraints, or it might lead to a different outcome than desired. A number of established standards define how requirements must be written to avoid such issues. This paper describes our ongoing work to derive a comprehensive set of standardized criteria that IT-requirements must meet in accordance with IEEE 1233-1996 and ISO/IEC/IEEE 29148-2011. We also use a text-mining approach to identify IT-requirements that violate these standards. Our preliminary results are promising: in our biased dataset, we can use text features that are easy to compute to filter out requirements that do not comply with the standards. Our beneficiaries are auditors, developers, Scrum teams, customers and other stakeholders whose projects are highly dependent on extensive IT-requirements specifications.
... These approaches aim to predict defects much earlier in the software development lifecycle using the concepts of code smells and requirements smells. Femmer et al. [75] proposed a lightweight static requirements analysis approach named Smella that allowed for rapid checks as soon as requirements were written down. ...
Article
Full-text available
Recent advances in the domain of software defect prediction (SDP) include the integration of multiple classification techniques to create an ensemble or hybrid approach. This technique was introduced to improve the prediction performance by overcoming the limitations of any single classification technique. This research provides a systematic literature review on the use of the ensemble learning approach for software defect prediction. The review is conducted after critically analyzing research papers published since 2012 in four well-known online libraries: ACM, IEEE, Springer Link, and Science Direct. In this study, five research questions that cover the different aspects of research progress on the use of ensemble learning for software defect prediction are addressed. To extract the answers to the identified questions, the 46 most relevant papers are shortlisted after a thorough systematic research process. This study provides compact information regarding the latest trends and advances in ensemble learning for software defect prediction and provides a baseline for future innovations and further reviews. Through our study, we discovered that the ensemble methods most frequently employed by researchers are random forest, boosting, and bagging. Less frequently employed methods include stacking, voting and Extra Trees. Researchers have proposed many promising frameworks, such as EMKCA, SMOTE-Ensemble, MKEL, SDAEsTSE, TLEL, and LRCR, using ensemble learning methods. The AUC, accuracy, F-measure, Recall, Precision, and MCC were mostly utilized to measure the prediction performance of models. WEKA was widely adopted as a platform for machine learning. Many researchers showed through empirical analysis that feature selection and data sampling are important pre-processing steps that improve the performance of ensemble classifiers.
Chapter
Context: Software specifications are usually written in natural language and may suffer from imprecision, ambiguity, and other quality issues, referred to hereafter as requirement smells. Requirement smells can hinder the development of a project in many ways, leading to delays, rework, and low customer satisfaction. From an industrial perspective, we want to focus our time and effort on identifying and preventing the requirement smells that are of high interest. Aim: This paper aims to characterise 12 requirements smells in terms of frequency, severity, and effects. Method: We interviewed ten experienced practitioners from different divisions of a large international company in the safety-critical domain called MBDA Italy Spa. Results: Our interviews show that the smell types perceived as most severe are Ambiguity and Verifiability, while those perceived as most frequent are Ambiguity and Complexity. We also provide a set of six lessons learnt about requirements smells, such as that the effects of smells are expected to differ across smell types. Conclusions: Our results help to increase awareness about the importance of requirement smells. Our results pave the way for future empirical investigations, ranging from a survey confirming our findings to controlled experiments measuring the effect size of specific requirement smells.
Article
Scenario-based approaches are widely used for software requirements specification. Since scenarios are usually written using natural language, specifications may have statements that are ambiguous, unnecessarily complicated, missing, duplicated, or conflicting. Requirements quality is challenging since it is hard to achieve consistency in requirements products. Unfortunately, if done manually, analysis of textual scenarios can be an arduous, time-consuming, and error-prone activity. This work rethinks the unambiguity, completeness, consistency, and correctness properties of scenario-based specifications; and how static and dynamic analysis strategies could automatically evaluate them. To do so, we introduce an automated requirements analysis approach to check both structural and behavioral aspects of scenarios, which combines natural language processing, Petri-nets, and visualization techniques for: (i) identifying certain types of defects and their indicators; (ii) highlighting scenario statements or relationships among scenarios that can lead to defects; and (iii) foreseeing scenario execution paths that can lead to inconsistencies. We show the feasibility of the proposed approach through the analysis of four projects specified as scenario-based descriptions. Overall, our approach produced reasonable results, with precision greater than 89% and recall greater than 98%. Our work allows researchers, as well as practitioners, to improve the quality of scenarios through an automated analysis approach. Available at: https://authors.elsevier.com/a/1i8KObKHpCdjG
Article
Full-text available
Requirements engineering (RE) is an initial activity in the software engineering process that involves many users. The involvement of various users in the RE process raises ambiguity and vagueness in requirements modeling. In addition, traditional RE is a time-consuming activity. Therefore, various studies have been conducted to support process automation in RE. This paper conducts a systematic literature review (SLR) to obtain information about RE automation related to RE activities, methods/models, tools, and domains. The SLR is done through five main stages: definition of research questions, conducting the search, screening for relevant papers, data extraction and mapping, and analysis. The data extraction and mapping are carried out on 155 relevant publications from 2016 to 2022. Based on the results of the SLR, around 53% of the research focuses on RE automation in analysis and specification, 40% focuses on elicitation, validation, and requirements management, and 7% focuses on requirements quality. NLP is the most used method in elicitation and specification, while machine learning, NLP, and goal-oriented models are mostly used for analysis in automatic RE. Furthermore, many papers use specific models and methods for validation and requirements management. The domain analysis shows that more than half of the papers contribute directly to the RE domain, and some contribute to the development of RE automation in the software application domain.
Chapter
Full-text available
The chapter addresses the importance of project quality (PQ) in mediating between the project governance's endeavours and its outcomes (project performance). It provides a multilateral vision of what quality is from different stakeholders' points of view. It intertwines the seminal contribution of Deming's theories (of profound knowledge and optimization), which decisively influenced the evolution of the project management body of knowledge (PMBOK), with total quality management (TQM). The chapter moreover addresses the types of quality processes and enters the domain of quality assurance as a system of relevance for ensuring the fulfilment of quality standards. Within the PMBOK it revisits a few theorizations (such as the MODest framework) and enters the realm of total quality management (TQM).
Chapter
Requirements are key artefacts to describe the intended purpose of a software system. The quality of requirements is crucial for deciding what to do next, impacting the development process' effectiveness and efficiency. However, we know very little about the connection between practitioners' perceptions of requirements quality and its impact on the process or the feelings of the professionals involved in the development process. Objectives: This study investigates: i) how software development practitioners define requirements quality, ii) how the perceived quality of requirements impacts the process and stakeholders' well-being, and iii) what the causes of and potential solutions for poor-quality requirements are. Method: This study was performed as a descriptive interview study at a sub-organization of a Nordic bank that develops its own web and mobile apps. The data collection comprises interviews with 20 practitioners, including requirements engineers, developers, testers, and newly employed developers, with five interviewees from each group. Results: The results show that different roles have different views on what makes a requirement good quality. Participants highlighted that, in general, they experience negative emotions, more work, and communication overhead when they work with requirements they perceive to be of poor quality. The practitioners also describe positive effects on their performance and positive feelings when they work with requirements that they perceive to be good. Keywords: Requirements Engineering, Requirements Quality, Human Factors, Empirical Study
Article
Variability is a characteristic of a software project and describes the fact that a system can be configured in different ways, obtaining different products (variants) from a common code base, according to the software product line paradigm. This paradigm can conveniently be applied in all phases of the software process, starting from the definition and analysis of the requirements. We observe that requirements often contain ambiguities which can reveal an unintentional and implicit source of variability that has to be detected. To this end, we define VIBE, a tool-supported process to identify variability aspects in requirements documents. VIBE is defined on the basis of a study of the different sources of ambiguity in natural language requirements documents that are useful for recognizing potential variability, and it is characterized by the use of an NLP tool customized to detect variability indicators. The tool to be used in VIBE is selected from a number of ambiguity detection tools after a comparison of their customisation features. The validation of VIBE is conducted using real-world requirements documents.
Article
Full-text available
Research has repeatedly shown that high-quality requirements are essential for the success of development projects. While the term “quality” is pervasive in the field of requirements engineering and while the body of research on requirements quality is large, there is no meta-study of the field that overviews and compares the concrete quality attributes addressed by the community. To fill this knowledge gap, we conducted a systematic mapping study of the scientific literature. We retrieved 6905 articles from six academic databases, which we filtered down to 105 relevant primary studies. The primary studies use empirical research to explicitly define, improve, or evaluate requirements quality. We found that empirical research on requirements quality focuses on improvement techniques, with very few primary studies addressing evidence-based definitions and evaluations of quality attributes. Among the 12 quality attributes identified, the most prominent in the field are ambiguity, completeness, consistency, and correctness. We identified 111 sub-types of quality attributes such as “template conformance” for consistency or “passive voice” for ambiguity. Ambiguity has the largest share of these sub-types. The artefacts being studied are mostly referred to in the broadest sense as “requirements”, while little research targets quality attributes in specific types of requirements such as use cases or user stories. Our findings highlight the need to conduct more empirically grounded research defining requirements quality, using more varied research methods, and addressing a more diverse set of requirements types.
Conference Paper
Full-text available
[Context] Requirements Engineering (RE) artifacts are central items in software development: Their quality is of essential importance for development, testing and other software engineering activities. However, as requirements artifacts are used differently in different processes, the proper definition of what is good quality depends on the context under consideration. [Problem] So far, no methodology exists that enables the definition of context-specific RE artifact quality in a precise manner. [Principal Idea] We define context-specific RE artifact quality by how quality attributes of an RE artifact impact the activities of the software development process in which this artifact is used. [Contribution] In this paper, we introduce a methodology to define RE artifact quality specifically for a project or process context. Furthermore, we provide a preliminary technical validation as well as an industrial validation of the application of our approach. Our studies indicate that the activity-based approach enables defining and validating RE quality in a precise and systematic manner. The industrial validation furthermore suggests the applicability of the approach in practical use.
Conference Paper
Full-text available
User stories are a widely used notation for formulating requirements in agile development. Despite their popularity in industry, little to no academic work is available on determining their quality. The few existing approaches are too generic or employ highly qualitative metrics. We propose the Quality User Story Framework, consisting of 14 quality criteria that user stories should strive to conform to. Additionally, we introduce the conceptual model of a user story, which we rely on to subsequently design the AQUSA tool. This conceptual piece of software aids requirements engineers in turning raw user stories into higher quality ones by exposing defects and deviations from good practice in user stories. We evaluate our work by applying the framework and a prototype implementation to multiple case studies.
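As a flavour of what such a tool can check automatically, the snippet below tests two criteria inspired by the Quality User Story framework, well-formed and atomic, against a single story. The regular expression and the heuristics are simplified assumptions, not AQUSA's implementation.

```python
import re

STORY_PATTERN = re.compile(
    r"^as an? (?P<role>.+?),\s*i (?:want|need|can|am able to) (?P<means>.+?)"
    r"(?:,?\s*so that (?P<ends>.+))?[.]?$",
    re.IGNORECASE,
)

def check_user_story(story: str) -> list[str]:
    """Simplified checks inspired by two Quality User Story criteria."""
    findings = []
    match = STORY_PATTERN.match(story.strip())
    if not match:
        return ["well-formed: story does not follow the role-means-ends template"]
    if not match.group("ends"):
        findings.append("well-formed: missing 'so that' clause (ends)")
    if re.search(r"\band\b", match.group("means"), re.IGNORECASE):
        findings.append("atomic: the means seems to bundle more than one request")
    return findings

print(check_user_story(
    "As a visitor, I want to filter tournaments by date and export them."
))
```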
Conference Paper
Full-text available
[Background] Requirements Engineering is crucial for project success, and to this end, many measures for quality assurance of the software requirements specification (SRS) have been proposed. [Goal] However, we still need an empirical understanding of the extent to which SRSs are created and used in practice, as well as the degree to which the quality of an SRS matters to subsequent development activities. [Method] We studied the relevance of SRSs by relying on survey research and explored the impact of quality defects in SRSs by relying on a controlled experiment. [Results] Our results suggest that the relevance of SRS quality depends both on particular project characteristics and on what is considered a quality defect; for instance, the domain of safety-critical systems seems to motivate an intense usage of the SRS as a means for communication, whereas defects hampering pragmatic quality do not seem to be as relevant as initially thought. [Conclusion] Efficient and effective quality assurance measures must be specific to carefully characterized contexts and must carefully select defect classes.
Article
Full-text available
Developing high-quality requirements specifications often demands a thoughtful analysis and an adequate level of expertise from analysts. Although requirements modeling techniques provide mechanisms for abstraction and clarity, fostering the reuse of shared functionality (e.g., via UML relationships for use cases), they are seldom employed in practice. A particular quality problem of textual requirements, such as use cases, is that of having duplicate pieces of functionality scattered across the specifications. Duplicate functionality can sometimes improve readability for end users, but hinders development-related tasks such as effort estimation, feature prioritization, and maintenance, among others. Unfortunately, inspecting textual requirements by hand in order to deal with redundant functionality can be an arduous, time-consuming, and error-prone activity for analysts. In this context, we introduce a novel approach called ReqAligner that aids analysts to spot signs of duplication in use cases in an automated fashion. To do so, ReqAligner combines several text processing techniques, such as a use case-aware classifier and a customized algorithm for sequence alignment. Essentially, the classifier converts the use cases into an abstract representation that consists of sequences of semantic actions, and then these sequences are compared pairwise in order to identify action matches, which become possible duplications. We have applied our technique to five real-world specifications, achieving promising results and identifying many sources of duplication in the use cases.
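The core alignment idea behind this approach can be illustrated with Python's difflib over two sequences of abstract actions. In the paper the action sequences come from a use-case-aware classifier; here they are written by hand, and the minimum run length is an arbitrary assumption.

```python
from difflib import SequenceMatcher

# Hand-abstracted action sequences for two use cases (in ReqAligner these are
# produced automatically by a use-case-aware classifier, not written manually).
checkout = ["validate(user)", "select(items)", "compute(total)", "authorize(payment)", "send(receipt)"]
renewal = ["validate(user)", "lookup(subscription)", "compute(total)", "authorize(payment)", "send(receipt)"]

matcher = SequenceMatcher(None, checkout, renewal)
for block in matcher.get_matching_blocks():
    if block.size >= 2:  # report only runs of shared actions (assumed cut-off)
        shared = checkout[block.a : block.a + block.size]
        print("possible duplicated functionality:", " -> ".join(shared))
```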
Article
Full-text available
This paper proposes a two-step approach to identifying ambiguities in natural language (NL) requirements specifications (RSs). In the first step, a tool would apply a set of ambiguity measures to a RS in order to identify potentially ambiguous sentences in the RS. In the second step, another tool would show what specifically is potentially ambiguous about each potentially ambiguous sentence. The final decision of ambiguity remains with the human users of the tools. The paper describes two requirements-identification case studies with one small NL RS using a prototype of the first tool based on an existing NL processing system and a manual simulation of the second tool.
Conference Paper
Full-text available
Choosing the right quality level for requirements is still a challenging but crucial task. While spending too little effort can result in a failed project, spending too much effort on requirements threatens the project schedule and the budget. Consequently, formal criteria are needed in order to determine from an objective point of view whether the quality of requirements is sufficient for a given project situation. We propose the use of an adapted Quality Gateway to obtain a comparable, repeatable, and objective check. Furthermore, the Quality Gateway concept can be strongly improved if it is combined with a Domain Oriented Design Environment (DODE) to construct adequate requirements documentation in the most critical areas beforehand. This way, requirements can be observed and fine-tuned for subsequent activities within the software development process. This paper presents the DODE concept as a supplement to adapted Quality Gateways.
Conference Paper
Full-text available
Bad requirements quality can have expensive consequences during the software development lifecycle, especially if iterations are long and feedback comes late: the faster a problem is found, the cheaper it is to fix. We propose to detect issues in requirements based on requirements (bad) smells by applying a light-weight static requirements analysis. This light-weight technique allows for instant checks as soon as a requirement is written down. In this paper, we derive a set of smells, including automatic smell detection, from the natural language criteria of the ISO/IEC/IEEE 29148 standard. We evaluated the approach with 336 requirements and 53 use cases from 9 specifications that were written by the car manufacturer Daimler AG and the chemical business company Wacker Chemie AG, and discussed the results with their requirements and domain experts. While not all problems can be detected, the case study shows that lightweight smell analysis can uncover many practically relevant requirements defects. Based on these results and the discussion with our industry partners, we conclude that requirements smells can serve as an efficient supplement to traditional reviews or team discussions, in order to create fast feedback on requirements quality.
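In the spirit of the dictionary-based detection described above, the following sketch flags smell candidates in a requirement and reports a concrete location for each finding. The smell names loosely follow the ISO/IEC/IEEE 29148 language criteria, but the word lists are abridged assumptions rather than the catalogue used by the authors.

```python
import re

# Abridged, assumed smell dictionaries; the actual word lists differ.
SMELLS = {
    "Subjective Language": ["user friendly", "easy to use", "cost effective"],
    "Ambiguous Adverbs and Adjectives": ["almost always", "significant", "minimal"],
    "Loopholes": ["if possible", "as appropriate", "as applicable"],
    "Open-ended Terms": ["etc", "and so on", "including but not limited to"],
    "Superlatives": ["best", "most", "optimal"],
}

def detect_smells(requirement: str) -> list[tuple[str, str, int]]:
    """Return (smell name, matched phrase, character offset) for each finding."""
    findings = []
    for smell, phrases in SMELLS.items():
        for phrase in phrases:
            for match in re.finditer(r"\b" + re.escape(phrase) + r"\b", requirement, re.IGNORECASE):
                findings.append((smell, match.group(), match.start()))
    return findings

requirement = "The UI shall be user friendly and respond almost always within minimal time, if possible."
for smell, phrase, offset in detect_smells(requirement):
    print(f"{smell}: '{phrase}' at offset {offset}")
```

Findings of this kind are meant as review input rather than automatic rejections, which matches the positioning of smells as a supplement to reviews.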
Conference Paper
Full-text available
Context: For many years, researchers and practitioners have been proposing various methods and approaches to Requirements Engineering (RE). Those contributions remain, however, too often on the level of apodictic discussions without proper knowledge about the practical problems they propose to address, or about how to measure the success of the contributions when applying them in practical contexts. While the scientific impact of research might not be threatened, the practical impact of the contributions is. Aim: We aim at better understanding practically relevant variables in RE, how those variables relate to each other, and to what extent we can measure those variables. This allows for the establishment of generalisable improvement goals, and the measurement of success of solution proposals. Method: We establish a first empirical basis of dependent variables in RE and means for their measurement. We classify the variables according to their dimension (e.g. RE, company, SW project), their measurability, and their actionability. Results: We reveal 93 variables with 167 dependencies, of which a large subset is measurable directly in RE, while further variables remain unmeasurable or have too complex dependencies for reliable measurements. We critically reflect on the results and show direct implications for research in the field of RE. Conclusion: We discuss a variety of conclusions we can draw from our results. For example, we show a set of first improvement goals directly usable for evidence-based RE research such as "increase flexibility in the RE process", we discuss suitable study types, and, finally, we underpin the importance of replication studies to obtain generalisability.
Conference Paper
Full-text available
Context: The requirements specification is a central artefact in the software engineering (SE) process, and its quality (might) influence downstream activities like implementation or testing. One quality defect that is often mentioned in standards is the use of passive voice. However, the consequences of this defect are still unclear. Goal: We need to understand whether the use of passive voice in requirements has an influence on other activities in SE. In this work we focus on domain modelling. Method: We designed an experiment, in which we ask students to draw a domain model from a given set of requirements written in active or passive voice. We compared the completeness of the resulting domain model by counting the number of missing actors, domain objects and their associations with respect to a specified solution. Results: While we could not see a difference in the number of missing actors and objects, participants who received passive sentences missed almost twice the associations. Conclusion: Our experiment indicates that, against common knowledge, actors and objects in a requirement can often be understood from the context. However, the study also shows that passive sentences complicate understanding how certain domain concepts are interconnected.
Article
Full-text available
Requirements engineering can solve and cause many problems in a software development cycle. The difficulties in understanding and eliciting the desired functionality from the customer and then delivering the correct implementation of a software system lead to delays, mistakes, and high costs. Working with requirements means handling incomplete or faulty textual specifications. It is vital for a project to fully understand the purpose and the functionality of a software system. The earlier specifications are corrected and improved, the better. We created a tool called RESI to support requirement analysts working with textual specifications. RESI checks for linguistic defects [1, 2] in specifications and offers a dialog system which makes suggestions to improve the text. It points out to the user which parts of the specification are ambiguous, faulty or inaccurate and therefore need to be changed. For this task, RESI needs additional semantic information which is inherent to natural language specifications. It receives this information by utilizing ontologies.
Article
Full-text available
Improving the quality of software demands quality controls from the very beginning of the development process, i.e., requirements capture and writing. Automating quality metrics may entail considerable savings, as opposed to tedious, manually performed evaluations. We present some indicators for measuring quality in textual requirements, as well as a tool that computes quality measures in a fully automated way. We want to emphasize that the final goal must be to measure in order to improve. Reducing quality management to the acquisition of a numerical evaluation would crash against the strong opposition of requirements engineers themselves, who would see in the measurement process not the aid of a counselor, but a policing mechanism of penalties. To avoid this, quality indicators must first of all point out concrete defects and provide suggestions for improvement. The final result will not only be an improvement in the quality of requirements, but also an improvement in the writing skills of requirements engineers.
Article
Full-text available
This paper presents a tool called QuARS (Quality Analyzer of Requirements Specification) for the analysis of natural language software requirements. The definition of QuARS has been based on a special Quality Model for software requirements. The Quality Model aims at providing a quantitative, corrective and repeatable evaluation of software requirement documents. To validate the Quality Model, several real software requirements documents have been analyzed by our tool, showing interesting results.
Article
Full-text available
Since 2005, the International Organization for Standardization (ISO) has published a wealth of standards and reports that deal with requirements engineering for systems. ISO/IEC/IEEE 24765 defines a standard vocabulary for systems and software engineering. ISO/IEC 24766 defines requirements for requirements engineering tools. ISO/IEC/IEEE 29148 describes processes for requirements engineering. The ISO 25000 series targets product quality metrics. This review paper provides a high-level description of each of these standards and highlights their interconnections. It thus provides the systems engineer with some guidance as to the relevance of those standards to his or her work.
Article
Full-text available
This handbook is about writing software requirements specifications and legal contracts, two kinds of documents with similar needs for completeness, consistency, and precision. Particularly when these are written, as they usually are, in natural language, ambiguity—by any definition—is a major cause of their not specifying what they should. Simple misuse of the language in which the document is written is one source of these ambiguities. This handbook describes the ambiguity phenomenon from several points of view, including linguistics, software engineering, and the law. Several strategies for avoiding and detecting ambiguities are presented. Strong emphasis is placed on the problems arising from the use of heavily used and seemingly unambiguous words and phrases, such as "all", "each", and "every" in defining or referencing sets; pronouns referring to an idea; multiple adjectives; etc. Many examples from requirements documents and legal documents are examined. While no guide can overcome the careless or indifferent writer, this handbook is offered as a guide both for writing better requirements or contracts and for inspecting them for potential ambiguities.
Article
Full-text available
This paper describes an extension to the natural language requirements specification quality model that is the basis for the QuARS (Quality Analyzer of Requirements Specification) tool. The extension takes into account ambiguities that were not handled before.
Article
Full-text available
Numerous studies in recent months have proposed the use of linguistic instruments to support requirements analysis. There are two main reasons for this: (i) the progress made in natural language processing and (ii) the need to provide the developers of software systems with support in the early phases of requirements definition and conceptual modelling. This paper presents the results of an online market research study intended (a) to assess the economic advantages of developing a CASE (computer-aided software engineering) tool that integrates linguistic analysis techniques for documents written in natural language, and (b) to verify the existence of a potential demand for such a tool. The research included a study of the language – ranging from completely natural to highly restricted – used in documents available for requirements analysis, an important factor given that on a technological level there is a trade-off between the language used and the performance of the linguistic instruments. To determine the potential demand for such a tool, some of the survey questions dealt with the adoption of development methodologies and consequently with models and support tools; other questions referred to activities deemed critical by the companies involved. Through statistical correspondence analysis of the responses, we were able to outline two "profiles" of companies that correspond to two potential market niches, which are characterised by their very different approaches to software development. Erratum in Volume 9, Number 2, May 2004; DOI 10.1007/s00766-004-0195-3.
Conference Paper
Full-text available
Use cases are a popular way of specifying the functional requirements of computer-based systems. Each use case contains a sequence of steps which are described in natural language. Use cases, like any other description of functional requirements, must go through a review process to check their quality. The problem is that such reviews are time consuming. Moreover, the effectiveness of a review depends on the quality of the submitted document - if a document contains many easy-to-detect defects, then reviewers tend to find those simple defects and feel exempted from working hard to detect difficult defects. To solve the problem, it is proposed to augment a requirements management tool with a detector that would find easy-to-detect defects automatically.
Conference Paper
Full-text available
Working with requirements means dealing with problems related to incomplete or faulty textual specifications. Specifications are created in various types and combinations such as models, drawings, and textual documents. The information gap between the specification and the implementation of any (software) system leads to delays, mistakes, and high costs due to problems in the production stages. Therefore, the earlier specifications are corrected and improved, the better. We provide a tool called RESI to support analysts while working with textual specifications. RESI considers many of the directives to be followed when processing customer specifications. It offers a dialog system which makes suggestions and queries the user when parts of the specification are ambiguous, faulty or inaccurate. RESI utilizes various ontologies to discover such problems and to deliver common sense solutions.
Conference Paper
Full-text available
The development of complex systems frequently involves extensive work to elicit, document and review stakeholder requirements. Stakeholder requirements are usually written in unconstrained natural language, which is inherently imprecise. During system development, problems in stakeholder requirements inevitably propagate to lower levels. This creates unnecessary volatility and risk, which impact programme schedule and cost. Some experts advocate the use of other notations to increase precision and minimise problems such as ambiguity. However, use of non-textual notations requires translation of the source requirements, which can introduce further errors. There is also a training overhead associated with the introduction of new notations. A small set of structural rules was developed to address eight common requirement problems including ambiguity, complexity and vagueness. The ruleset allows all natural language requirements to be expressed in one of five simple templates. The ruleset was applied whilst extracting aero engine control system requirements from an airworthiness regulation document. The results of this case study show qualitative and quantitative improvements compared with a conventional textual requirements specification.
Conference Paper
Full-text available
The complexity of today's software systems is constantly increasing. As a result, requirements for these systems become more comprehensive and complicated. In this setting, requirements engineers struggle to capture consistent and complete requirements of high quality. We propose a feedback-centric requirements editor to help analysts control the information overload. Our HeRA tool provides analysts with important data from various feedback facilities. The feedback is given directly based on the input to the editor. On the one hand, it is based on heuristic rules and, on the other hand, on automatically derived models. Thus, when new requirements are added, the analyst gets important information on how consistent these requirements are with the existing ones.
Conference Paper
Full-text available
[Context and motivation] Natural language is the main representation means of industrial requirements documents, which implies that requirements documents are inherently ambiguous. There exist guidelines for ambiguity detection, such as the Ambiguity Handbook [1]. In order to detect ambiguities according to the existing guidelines, it is necessary to train analysts. [Question/problem] Although ambiguity detection guidelines have been extensively discussed in the literature, ambiguity detection has not been automated yet. Automation of ambiguity detection is one of the goals of the presented paper. More precisely, the approach and tool presented in this paper have three goals: (1) to automate ambiguity detection, (2) to make it plausible to the analyst that ambiguities detected by the tool represent genuine problems of the analyzed document, and (3) to educate the analyst by explaining the sources of the detected ambiguities. [Principal ideas/results] The presented tool provides reliable ambiguity detection, in the sense that it detects four times as many genuine ambiguities as an average human analyst. Furthermore, the tool offers high-precision ambiguity detection and does not present too many false positives to the human analyst. [Contribution] The presented tool is able both to detect ambiguities and to explain ambiguity sources. Thus, besides pure ambiguity detection, it can be used to educate analysts, too. Furthermore, it provides a significant potential for considerable time and cost savings and at the same time quality improvements in industrial requirements engineering.
Conference Paper
A use case model describes the functional requirements of a software system and is used as input to several activities in a software development project. The quality of the use case model therefore has an important impact on the quality of the resulting software product. Software inspection is regarded as one of the most efficient methods for verifying software documents. There are inspection techniques for most documents produced in a software development project, but no comprehensive inspection technique exists for use case models. This paper presents a taxonomy of typical defects in use case models and proposes a checklist-based inspection technique for detecting such defects. This inspection technique was evaluated in two studies with undergraduate students as subjects. The results from the evaluations indicate that inspections are useful for detecting defects in use case models and motivate further studies to improve the proposed inspection technique.
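Such a checklist can also be encoded directly against a structured use case representation; the three checks below are illustrative examples of defect types, not the paper's taxonomy (Python).

from dataclasses import dataclass, field

@dataclass
class UseCase:
    name: str
    actors: list = field(default_factory=list)
    main_flow: list = field(default_factory=list)
    alternative_flows: list = field(default_factory=list)

# Illustrative defect checks, not the paper's full taxonomy.
CHECKLIST = [
    ("Missing actor", lambda uc: not uc.actors),
    ("Empty main flow", lambda uc: not uc.main_flow),
    ("No alternative or exception flows", lambda uc: not uc.alternative_flows),
]

def inspect(use_case):
    # Return the checklist items that indicate a potential defect.
    return [item for item, applies in CHECKLIST if applies(use_case)]

uc = UseCase(name="Withdraw cash", actors=["Customer"], main_flow=["Insert card", "Enter PIN"])
print(inspect(uc))  # -> ['No alternative or exception flows']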
Conference Paper
Analysts have to cover all aspects of the requirements engineering process when working with the customer, such as workflows, psychological issues, and linguistic problems. We show how analysts can be supported during requirements elicitation and documentation. We present an approach to improve natural language requirements specifications using ontologies. We demonstrate by several examples from real requirements how an ontological reasoner called RESI can uncover gaps, inaccuracies and ambiguities, and ask the user to clarify them. In many cases, it supports the analyst by giving a small number of reasonable suggestions to choose from. The implementation of RESI is currently in progress.
Conference Paper
[Context and Motivation] This paper notes the advanced state of the natural language (NL) processing art and considers four broad categories of tools for processing NL requirements documents. These tools are used in a variety of scenarios. The strength of a tool for an NL processing task is measured by its recall and precision. [Question/Problem] In some scenarios, for some tasks, any tool with less than 100% recall is not helpful and the user may be better off doing the task entirely manually. [Principal Ideas/Results] The paper suggests that perhaps a dumb tool doing an identifiable part of such a task may be better than an intelligent tool trying but failing in unidentifiable ways to do the entire task. [Contribution] Perhaps a new direction is needed in research for RE tools.
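For reference, recall and precision over a tool's findings are computed as follows (Python); the counts in the example are made up for illustration.

def recall(tp, fn):
    # Share of true defects that the tool finds: TP / (TP + FN).
    return tp / (tp + fn)

def precision(tp, fp):
    # Share of the tool's findings that are true defects: TP / (TP + FP).
    return tp / (tp + fp)

# Made-up counts: the tool finds 40 of 50 real defects and raises 20 false alarms.
print(recall(tp=40, fn=10))     # 0.8  -> 20% of real defects still need manual search
print(precision(tp=40, fp=20))  # ~0.67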
Article
Templates are effective tools for increasing the precision of natural language requirements and for avoiding ambiguities that may arise from the use of unrestricted natural language. When templates are applied, it is important to verify that the requirements are indeed written according to the templates. If done manually, checking conformance to templates is laborious, presenting a particular challenge when the task has to be repeated multiple times in response to changes in the requirements. In this article, using techniques from Natural Language Processing (NLP), we develop an automated approach for checking conformance to templates. Specifically, we present a generalizable method for casting templates into NLP pattern matchers and reflect on our practical experience implementing automated checkers for two well-known templates in the Requirements Engineering community. We report on the application of our approach to four case studies. Our results indicate that: (1) our approach provides a robust and accurate basis for checking conformance to templates; and (2) the effectiveness of our approach is not compromised even when the requirements glossary terms are unknown. This makes our work particularly relevant to practice, as many industrial requirements documents have incomplete glossaries.
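The pattern-matching idea can be sketched over part-of-speech-tagged tokens (Python); the slot sequence and the pre-tagged input below are assumptions for illustration, and a real checker would obtain tags from an NLP pipeline.

MODAL_WORDS = {"shall", "should", "will"}

def conforms(tagged_tokens):
    # Check a "<subject> <modal> <verb phrase>" slot sequence over (word, POS-tag) pairs.
    words = [w.lower() for w, _ in tagged_tokens]
    tags = [t for _, t in tagged_tokens]
    if not any(w in MODAL_WORDS for w in words):
        return False
    modal_idx = next(i for i, w in enumerate(words) if w in MODAL_WORDS)
    has_subject = any(t.startswith("NN") for t in tags[:modal_idx])
    has_verb = any(t.startswith("VB") for t in tags[modal_idx + 1:])
    return has_subject and has_verb

# Pre-tagged tokens are assumed here; a real checker would run a POS tagger first.
tagged = [("The", "DT"), ("system", "NN"), ("shall", "MD"), ("display", "VB"), ("warnings", "NNS")]
print(conforms(tagged))  # -> True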
Article
Use-cases play an important role in capturing and analysing software requirements in the IT industry. A number of guidelines have been proposed in the literature on how to write use-cases. Structural defects can occur when use-cases are written without following such guidelines. We develop a taxonomy of structural defects and analyse a sample of 360 industrial use-cases to understand the nature of defects in them. Our sample comes from both client-based projects and in-house projects. The results show that, compared to a sample of theoretical use-cases that follow Cockburn's guidelines, industrial use-cases on average exhibit defects such as complex structures, lack of customer focus and missing actors. Given the shortage of analysis of real industry samples, our results make a significant contribution towards the understanding of the strengths and weaknesses in industrial use-cases in terms of structural defects. The results will be useful for industry practitioners in adopting use-case modelling standards to reduce the defects as well as for software engineering researchers to explore the reasons for such differences between the theory and the practice in use-case modelling.
Article
Context For many years, we have observed industry struggling to define high-quality requirements engineering (RE) and researchers trying to understand industrial expectations and problems. Although we are investigating the discipline with a plethora of empirical studies, they still do not allow for empirical generalisations. Objective To lay an empirical and externally valid foundation about the state of the practice in RE, we aim at a series of open and reproducible surveys that allow us to steer future research in a problem-driven manner. Method We designed a globally distributed family of surveys in joint collaborations with different researchers and completed the first run in Germany. The instrument is based on a theory in the form of a set of hypotheses inferred from our experiences and available studies. We test each hypothesis in our theory and identify further candidates to extend the theory by correlation and Grounded Theory analysis. Results In this article, we report on the design of the family of surveys, its underlying theory, and the full results obtained from Germany with participants from 58 companies. The results reveal, for example, a tendency to improve RE via internally defined qualitative methods rather than relying on normative approaches like CMMI. We also discovered various RE problems that are statistically significant in practice. For instance, we could corroborate communication flaws or moving targets as problems in practice. Our results are not yet fully representative but already give first insights into current practices and problems in RE, and they allow us to draw lessons learnt for future replications. Conclusion Our results obtained from this first run in Germany make us confident that the survey design and instrument are well-suited to be replicated and, thereby, to create a generalisable empirical basis of RE in practice.