Vol. 5, No. 11 November 2014 ISSN 2079-8407
Journal of Emerging Trends in Computing and Information Sciences
©2009-2014 CIS Journal. All rights reserved.
http://www.cisjournal.org
877
Towards Resolving Software Quality-in-Use Measurement Challenges
1 Issa Atoum, 2 Chih How Bong, 3 Narayanan Kulathuramaiyer
1,2,3 Faculty of Computer Science and Information Technology, University Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
ABSTRACT
Software quality-in-use captures quality from the user's perspective. It has gained importance in e-learning applications, mobile service-based applications, and project management tools. Users' decisions on software acquisition are often ad hoc or based on preference because software quality-in-use is difficult to measure quantitatively. But why is quality-in-use measurement difficult? Although there are many software quality models, to our knowledge no work surveys the challenges related to software quality-in-use measurement. This paper has three main contributions: 1) it presents the major issues and challenges in measuring software quality-in-use in the context of the ISO SQuaRE series and related software quality models, 2) it presents a novel framework that can be used to predict software quality-in-use, and 3) it presents preliminary results of quality-in-use topic prediction. Concisely, the issues are related to the complexity of the current standard models and to the limitations and incompleteness of customized software quality models. The proposed framework employs sentiment analysis techniques to predict software quality-in-use.
Keywords: Software quality-in-use; ISO 25010; SQuaRE series; Sentiment analysis
1. INTRODUCTION
With thousands of software products published online, it is essential for users to find the software that matches their stated or implied needs. Users often seek better software quality. Garvin [1] identified five views/approaches to quality. The definition nearest to this paper is the user-based approach: "meeting customer needs". If the customer is satisfied, then the product or service has good quality. This view has been applied in mobile-based applications [2]–[4] and Web applications [5]–[7].
Software quality can be conceptualized along three dimensions: quality characteristics, the quality model, and software quality requirements. A quality characteristic is a "category of software quality attributes that bears on software quality" [8, p. 9]. Quality requirements are what the user needs in the software, such as performance, user interface, or security requirements. The quality model describes how quality characteristics relate to each other and to the final product quality. Measuring software quality checks whether user requirements are met and determines the degree of quality.
The ISO/IEC 25010:2011 standard (ISO 25010 hereafter), part of a series known as Software Quality Requirements and Evaluation (SQuaRE), defines system quality as "the degree to which the system satisfies the stated and implied needs of its various stakeholders, and thus provides value" [9, p. 8]. The ISO 25010 has two major dimensions: Quality-in-use (QinU) and Product Quality. The former specifies characteristics related to human interaction with the system and the latter specifies characteristics intrinsic to the product. QinU is defined as the "capability of a software product to influence users' effectiveness, productivity, safety and satisfaction to satisfy their actual needs when using the software product to achieve their goals in a specified context of use" [8, p. 17]. The QinU model consists of five characteristics: effectiveness, efficiency, satisfaction, freedom from risk, and context coverage. Table 1 gives the definitions of these characteristics.
Table 1: Definitions of quality-in-use characteristics as defined by the ISO 25010 standard

Effectiveness: Accuracy and completeness with which users achieve specified goals (ISO 9241-11).
Efficiency: Resources expended in relation to the accuracy and completeness with which users achieve goals (ISO 9241-11).
Freedom from risk: Degree to which a product or system mitigates the potential risk to economic status, human life, health, or the environment.
Satisfaction: Degree to which user needs are satisfied when a product or system is used in a specified context of use.
Context coverage: Degree to which a product or system can be used with effectiveness, efficiency, freedom from risk and satisfaction both in specified contexts of use and in contexts beyond those initially explicitly identified.
1.1 Problem Statement
This paper investigates two problems: 1) there are many challenges that need to be tackled in order to measure QinU systematically, yet current literature reviews on software QinU do not identify or explain them. To the best of our knowledge, this is the first work that specifically identifies and explains the problems in measuring QinU. 2) There is insufficient research on other possible research directions to tackle the first problem; to our knowledge, little work targets the QinU measurement problem [10].
1.2 Research Contributions
This paper:

- Identifies and explains several problems in measuring software QinU using the standard and customized quality models. This paper is the first to survey several quality models and explain the various challenges in measuring QinU. In brief, most of the challenges with the ISO standard models relate to the complexity and incompleteness of the documents, while customized quality models tend to be incomplete, designed for their own specific needs.
- Proposes a novel framework to predict software QinU from software reviews. Given the issues related to measuring QinU, a framework is presented to resolve them. The framework is based on sentiment analysis, an emerging branch of Natural Language Processing. Sentiment analysis, or opinion mining, aims to analyze textual user judgments about products or services [11], [12].

The rest of the paper is organized as follows. First, the major software quality-in-use related models are illustrated. Then, the quality-in-use measurement challenges are explained. Next, the proposed approach is presented and, finally, the paper is concluded.
2. SOFTWARE QUALITY-IN-USE MODELS
There have been many works on software quality models but, to our knowledge, no research has summarized the main problems in measuring quality-in-use. Approaches to measuring software quality-in-use can be divided into two main groups: standard frameworks and customized model frameworks.
2.1 Standard Frameworks
Many standards can support software quality, but several of them are rather checklist guides. For example, the ISO 9000 family has been criticized in the literature as not suited for software [13]. ANSI/IEEE 730-2002 [14] supports quality assurance plans. ISO/IEC 15504 [15], also known as Software Process Improvement and Capability Determination (SPICE), is a set of technical standards documents for the computer software development process and related business management functions. These standards are designed to address neither quality-in-use nor specific characteristics of software product quality.
The Software Product Quality Requirements and Evaluation (SQuaRE) series of ISO standards resulted from blending the ISO/IEC 9126 and ISO/IEC 14598 series of standards. The purpose of the SQuaRE series is to assist in developing and acquiring software products through the specification and evaluation of quality requirements. Quality requirements are specified from the viewpoint of the stakeholders, and the quality of the product is evaluated against this specification using a chosen quality model, quality measurement, and quality management process.
To measure QinU effectively, five standards of the SQuaRE quality measurement division, ISO 25020 to ISO 25024, have to be considered in line with the ISO 25010 model, as shown in Fig. 1. More precisely, the QinU Measurement standard ISO 25022 has to be considered in the context of four other standards: the Measurement Reference Model and Guide ISO 25020; the Measurement of Data Quality ISO 25024; the Measurement of System and Software Product Quality ISO 25023; and the Quality Measure Elements standard ISO 25021. Fig. 2 depicts the relationship between ISO/IEC 25022 and the other standards of the ISO/IEC 2502n division.

While these standards provide freedom of customization, they need careful quality assurance to provide clear integration between related standards. They also fail to detail how the customization should be carried out.
2.2 Customized Software Quality Models
Below, some of the related models are grouped into logical categories.
Fig 1: Organization of the SQuaRE series of International Standards (2500n Quality Management Division, 2501n Quality Model Division, 2502n Quality Measurement Division, 2503n Quality Requirements Division, 2504n Quality Evaluation Division, and the ISO/IEC 25050–25099 SQuaRE Extension Division)

Fig 2: Structure of the Quality Measurement division
2.2.1 Hierarchical Models
These models link various quality characteristics together at different levels, which in turn are finally linked to the root product quality. Activity-based models can adopt these models to measure software quality: they relate system properties to the activities carried out by the various stakeholders, but they usually track development or testing activities rather than user activities. Well-known hierarchical models are McCall's Quality Model [16], Boehm's Quality Model [17], Dromey's Quality Model [18], and the Software Quality In Development (SQUID) approach [19].
2.2.2 One-Quality-Aspect Models
These models measure one aspect of software quality. This category includes predictive quality models [20]–[23], quality metrics models [24]–[27], and software reliability models.

Predictive quality models [20]–[23] use the product's attributes or properties and its users to predict the quality of the software. Exploiting ideas from COCOMO [20], the Constructive Quality Model (COQUAMO) [21], [22] helps project managers to manage, assess, and predict product quality during the development lifecycle. Software metrics models are concerned with the evaluation of a specific quality metric, quality assurance, or prediction. Many researchers have shown that these metrics are not reliable indicators of faults [28], are dependent on lines of code [28], or are dependent on the programming approach [29], [30]. Software reliability models aim to measure the reliability of software systems based on failure intensity and software history profiles; examples of such models are [31], [32]. Applying these models to obtain quality-in-use is not feasible because, for users, the software is already in its operation phase, i.e., there is neither history nor source code to investigate.
2.2.3 Provider-Specific Models
Some quality models are specific to a certain programming language or implementation platform. The FURPS Quality Model was presented by Grady [33] and later extended and owned by IBM Rational Software [34]–[36]. The Quamoco Product Quality Model [37], initially designed for German software, is a multipurpose quality model based on ISO 25010.
Several challenges emerge from the models studied above. Section 3 below discusses these challenges.
3. QUALITY-IN-USE MEASUREMENT CHALLENGES
Below are the major challenges in measuring software quality-in-use in general, in measuring quality-in-use using standard frameworks, and in measuring quality-in-use using customized models.
3.1 General Challenges
3.1.1 Task Measurement
To measure QinU, there is a need to agree with the software user on a set of tasks that they need to perform in order to accomplish a pragmatic goal ("do goals", such as paying a bill). This means the user should be involved in the quality requirements specification, which may not always be applicable. Other issues related to task measurement include the variety of tasks from one software function to another and from one software product to another. For example, a task to open a file for writing is different from a task of removing special characters from a text file. Worse still, defining what the tasks are is by itself a major challenge. Hedonic tasks (the "be goals") that imply user satisfaction cannot be specified and thus cannot be measured directly.
3.1.2 The Web Software Development Life Cycle
Users of publicly available online software are never asked to be part of the system development life cycle; usually the developers and software designers make assumptions about user needs. Where software is designed to be used by global users, as with operating systems or antivirus software, publishers have to find other ways to collect user needs. Even so, things can go badly when users start using the software: not because of software bugs, but because users are not satisfied. Users want to see the software do what they had in mind, without being burdened with the software's whole lifecycle.
3.1.3 Dynamic Customer Needs
Customer needs are dynamic and can change over time, so quantitative measures might not be suitable. Ishikawa [38] states that "International standards established by the International Organization for Standardization (ISO) or the International Electrotechnical Commission (IEC) are not perfect. They contain many shortcomings. Consumers may not be satisfied with a product which meets these standards. Consumer requirements change from year to year and even frequently updated standards cannot keep the pace with consumer requirements." These needs are usually addressed by building new software versions; however, the software may become complicated or buggy due to extra features that were not planned ahead. If users were involved ahead of time, these needs would be incorporated. This problem therefore leads back to the first and second unsolved issues above.
3.2 Challenges Related to Standard Quality
Frameworks
3.2.1 Quality Models Critiques
There are problems intrinsic to quality models. In a comprehensive study of software quality models, Deissenboeck et al. [39] identified critiques of many software quality models: they are unclear about their purposes, they do not tell users how to use them, and there is no uniform terminology across different models. Masip et al. [40] stated that user experience is implied in ISO 25010 but is not defined.
3.2.2 Evaluation Requirements
Looking at the mathematical formulas for quality-in-use in the ISO 25022 standard and the proposed methods to measure quality-in-use, quality managers find measurement a hard job. For example, to measure effectiveness, task completion, task effectiveness, and error frequency have to be calculated. Moreover, integrating the related quality processes of the various standards (Fig. 1, Fig. 2) is a problem for quality engineers. The root of this problem is the need for experienced engineers, given the limited information in the standard models on how to customize them, especially for small companies. In an extension to ISO 25010, Lew et al. [41] suggest adding data quality inside the ISO 25010 instead of keeping it separate. Monitoring user actions or usage statistics to measure quality-in-use is not enough; the wide range of measuring methods shown in the ISO 25022 standard requires an adequate level of expertise in each domain.
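The arithmetic behind such measures can be illustrated with a minimal sketch. The measure names follow the ISO 25022 style, but the formulas and the usability-test numbers below are simplified illustrative assumptions, not the standard's normative definitions:

```python
# Illustrative sketch of ISO 25022-style effectiveness measures.
# The formulas below are simplified assumptions for demonstration;
# the standard defines its measures normatively and in more detail.

def task_completion(completed, attempted):
    """Proportion of attempted tasks that were completed."""
    return completed / attempted

def error_frequency(errors, tasks):
    """Average number of errors made per task."""
    return errors / tasks

# Hypothetical usability-test data for one user group.
completed, attempted, errors = 18, 20, 5

tc = task_completion(completed, attempted)   # 0.9
ef = error_frequency(errors, attempted)      # 0.25
print(f"task completion: {tc:.2f}, errors per task: {ef:.2f}")
```

Even this toy version shows why the measures demand controlled observation: the counts of attempted tasks, completed tasks, and errors all require instrumented user sessions.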
3.2.3 QinU Environmental Factors
While the quality-in-use model tries to measure the human-computer system interaction, many factors affect quality-in-use according to the ISO QinU model: the information system, target computer system, target software, target data, usage environment, and user type (primary, secondary, or indirect user). Measuring or estimating these factors is a complex process.
3.3 Challenges Related to Customized Models
3.3.1 Limitation of Quality-in-use Models
Although there are many software quality models such as McCall, Boehm, Dromey, and FURPS [16], [18], most of them target software product or process characteristics; they do not suit software quality-in-use, or they require manual user involvement [42], [43]. The McCabe (1976) and Halstead (1977) metrics have been used since the 1970s, while the Chidamber and Kemerer metrics [26] came into use in 1994. These metrics depend on the programming style: object-oriented [26] versus procedural approaches [24], [25]. Moreover, results from the COQUAMO model concluded that there were no software product metrics that were, in general, likely to be good predictors of final product qualities. Thus, metrics used in measuring product quality cannot be used directly to measure quality-in-use.
4. PROPOSED FRAMEWORK
Opinion mining, or sentiment analysis, is an emerging research direction based on Natural Language Processing that aims to analyze textual user judgments about products or services [11], [12]. Review text snippets are a good source for users deciding on a software purchase, and they are a goldmine for product providers. The average human reader will have difficulty accurately summarizing the relevant information and identifying the opinions contained in reviews about a product. Moreover, human analysis of textual information is subject to considerable biases resulting from preferences and differing understandings of written textual expressions. Opinion mining therefore provides an alternative for identifying important reviews and opinions to answer users' queries [44], [45].

Despite the difficulties of the sentiment analysis approach [46]–[48], it can be used to overcome the issues discussed in Section 3. Sentiment analysis can work on user reviews without active user involvement. Moreover, with sentiment analysis, software trends can be analyzed and future software quality can be predicted.
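As a concrete illustration of the lexicon-based style of sentiment analysis described in [11], the following minimal sketch counts opinion words from small hand-made lists. The word lists are invented for the example; real lexicons are far larger and weight words by strength:

```python
# Minimal lexicon-based sentiment sketch: score a sentence by counting
# opinion words from small hand-made lists. The lists are illustrative
# assumptions; production lexicons are much larger and graded.
POSITIVE = {"fast", "stable", "easy", "simple", "great"}
NEGATIVE = {"slow", "crash", "error", "freeze", "buggy"}

def polarity(sentence):
    words = sentence.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity("this software is fast and stable"))  # positive
print(polarity("it tends to freeze with an error"))  # negative
```

This is exactly the kind of judgment that is hard to scale by human reading but cheap to automate over thousands of reviews.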
Next are the details of the framework.
4.1 Proposed Quality-in-use Prediction Framework
Fig. 3 shows a general framework to predict software quality-in-use. We re-emphasize that the purpose of this framework is to present a conceptual model highlighting the high-level details of the proposed approach. In the figure, processes are marked by the letter p followed by a process number, and the inputs are marked input1 and input2. The general idea of the framework is to use software reviews and the ISO standard documents to map review text snippets into QinU characteristics.
The proposed framework has two inputs: the ISO QinU documents (input1) and the software reviews (input2). In this framework, the ISO documents provide the quality-in-use description, modeling, specification, and evaluation process components. These components are used to 1) describe QinU for annotators, 2) obtain the QinU characteristics, 3) score QinU using formulas, and 4) help validate the proposed model. The software reviews text is fed into a data preparation process (p1) in order to obtain a set of annotated/classified sentences that is used as a gold standard.

Fig 3: Proposed Quality-in-use Prediction Framework
In the data preparation process, reviews are first crawled from software websites such as Amazon and CNET. These reviews carry a star rating from 1 to 5, where 1 stands for a bad comment about the software and 5 for an excellent one. To balance the input data, the top 10 reviews for each star rating are selected for the next step; this ensures that the input comments cover the whole star rating range. Next, the selected reviews are split into sentences using a combination of an automatic method and a manual method, to cover long sentences or sentences with missing punctuation. Taking one sentence at a time, and within the context of the review, the annotator assigns a topic to the sentence. If the sentence is topic-related, the annotator also marks the keyword that ties the sentence to that topic. For example, the annotator might select the word "fast" from the sentence "this software is fast" as a keyword for the topic efficiency. Additionally, the annotator records why a sentence is positive or negative by choosing an opinion expression word and a modifier, if available. Finally, the classified sentences are saved in the database for the next step, QinU extraction.
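The balancing step above can be sketched as follows. The review records and the ranking key (helpfulness votes) are illustrative assumptions, since the paper does not state how the "top 10" per star rating are chosen:

```python
from collections import defaultdict

# Sketch of the data-preparation balancing step: keep the top-N reviews
# for each star rating so every rating level is represented equally.
# Ranking by helpfulness votes is an assumption for illustration.

def balance_reviews(reviews, per_star=10):
    by_star = defaultdict(list)
    for r in reviews:
        by_star[r["stars"]].append(r)
    selected = []
    for star in sorted(by_star):
        ranked = sorted(by_star[star], key=lambda r: r["votes"], reverse=True)
        selected.extend(ranked[:per_star])
    return selected

# Tiny invented data set: (stars, votes) pairs.
reviews = [{"stars": s, "votes": v, "text": f"review {i}"}
           for i, (s, v) in enumerate([(1, 3), (1, 7), (5, 2), (5, 9), (3, 1)])]
sample = balance_reviews(reviews, per_star=1)
print([r["stars"] for r in sample])  # [1, 3, 5]
```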
The core of this framework is QinU extraction. In this step, it is proposed to use a sentence semantic similarity measure to map test sentences into QinU characteristics (p2): sentences are classified into three topics, effectiveness, efficiency, and freedom from risk, using a proposed sentence similarity measure. In the next step, each sentence is assigned a polarity using data from the gold standard data set (p1). QinU is then scored in process p4 using a linear combination of the QinU characteristics described in the QinU Measurement standard ISO 25022.
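A toy version of this extraction-and-scoring pipeline follows, using simple token overlap in place of the proposed semantic similarity measure. The topic seed words and the equal characteristic weights are assumptions for illustration only:

```python
# Toy QinU extraction: assign each sentence to the topic whose seed-word
# set it overlaps most, then score QinU as a linear combination of
# per-topic scores. Seed words and equal weights are illustrative
# assumptions, not the proposed similarity measure or the ISO formulas.
TOPICS = {
    "effectiveness": {"work", "features", "interface", "simple", "easy"},
    "efficiency": {"speed", "fast", "slow", "load", "memory"},
    "freedom_from_risk": {"issue", "trouble", "error", "freeze", "crash"},
}

def assign_topic(sentence):
    tokens = set(sentence.lower().split())
    return max(TOPICS, key=lambda t: len(tokens & TOPICS[t]))

def qinu_score(topic_scores, weights=None):
    # Linear combination; equal weights by default.
    weights = weights or {t: 1 / len(topic_scores) for t in topic_scores}
    return sum(weights[t] * s for t, s in topic_scores.items())

print(assign_topic("this software is fast"))  # efficiency
print(qinu_score({"effectiveness": 0.8, "efficiency": 0.6,
                  "freedom_from_risk": 0.7}))  # ≈ 0.7
```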
To evaluate the performance of the proposed sentence measure in process p2, it can be compared with other well-known approaches. Several methods could serve for comparison; this paper chooses methods from different spectrums: Li's measure and Google tri-grams [49] as sentence similarity measures, Multinomial Naive Bayes (NB) text classification [50]–[52] and SVM [53] for text classification, and LSA [54] for semantic-space classification. These methods are evaluated in terms of standard classification performance measures: F-measure, accuracy, and ROC analysis (process p6). To validate the framework, a use case is built in p7.
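The classification-performance measures named above reduce to straightforward counting. A self-contained sketch of per-class F-measure follows; the toy label sequences are invented for the example:

```python
# Precision, recall, and F-measure from gold vs. predicted labels,
# computed for one class of interest. The label sequences below are
# invented toy data for illustration.
def f_measure(gold, pred, positive):
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = ["eff", "eff", "risk", "eff", "risk"]
pred = ["eff", "risk", "risk", "eff", "eff"]
print(round(f_measure(gold, pred, "eff"), 3))  # 0.667
```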
5. PRELIMINARY RESULTS
First, the F-measure experiments are presented. Then, the top five topic words are shown.
5.1 F-measure Results
To show the validity of this work, 600 software review sentences were labelled with QinU topics. Then three algorithms were run: Multinomial Naive Bayes (NB) [55], multiclass SVM [56], and Latent Semantic Analysis [57]. These methods detect the topic of a test sentence (step p2 in Fig. 3). The experiment used 3-fold cross validation. Fig. 4 shows the F-measure of these methods. The figure shows that sentence length has a direct effect on the final F-measure: short sentences tend to have fewer common words and thus a lower F-measure.
5.2 Top Five Keywords
Table 2 shows the top five keywords for each topic. From the table, we can see that the effectiveness words are about getting the job done, the efficiency words are about the expenditure of resources, and the risk keywords are about the possibility of losing data.
Table 2: Top 5 topic keywords for QinU

Effectiveness: work, features, interface, simple, easy
Efficiency: speed, stable, slow, load, memory
Freedom from Risk: issue, trouble, error, freeze, fix
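Keyword lists like those in Table 2 can be produced by frequency counting over the annotated (topic, keyword) pairs from the gold standard. A sketch with invented annotation data:

```python
from collections import Counter

# Sketch: derive top-k topic keywords from annotated (topic, keyword)
# pairs by frequency. The annotation pairs below are invented examples,
# not the paper's actual gold-standard data.
annotations = [
    ("efficiency", "speed"), ("efficiency", "speed"), ("efficiency", "slow"),
    ("effectiveness", "work"), ("effectiveness", "work"), ("effectiveness", "easy"),
]

def top_keywords(annotations, topic, k=5):
    counts = Counter(kw for t, kw in annotations if t == topic)
    return [kw for kw, _ in counts.most_common(k)]

print(top_keywords(annotations, "efficiency", k=2))  # ['speed', 'slow']
```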
6. RELATED WORKS
Feature or topic extraction has been discussed in many works, such as [48], [58]–[61]. Most of these works use language semantics to extract features such as nouns and noun phrases along with their frequencies, subject to predefined thresholds. Qiu et al. [45], [61] suggested extracting both features and opinions by propagating information between them using grammatical syntactic relations.
Fig 4: F-measure of the compared methods (LSA, NB, SVM) against sentence length (5–100 words)
Leopairote, Surarerks, and Prompoon [10] proposed a model that can extract and summarize software reviews in order to predict software quality-in-use. The model depends on a manually built ontology of ISO 9126 quality-in-use keywords and WordNet 3.0 synonym expansion. We consider the work of [10] the closest to this paper. The difference is that the proposed framework employs word similarity and relatedness rather than rule-based classification and ontologies.
7. CONCLUSION
Quality-in-use represents software quality from the viewpoint of the user. This paper presented the major issues in measuring software quality-in-use. Quality-in-use can be measured using the standard SQuaRE series, while many characteristics of software quality-in-use are scattered across customized software quality models. Measuring quality-in-use is challenging due to the complexity of the current standard models and the incompleteness of the related customized models. The viewpoint of software users is hard to capture within the software lifecycle ahead of time, especially for hedonic tasks. This paper proposes processing software reviews to obtain software quality-in-use; the framework employs sentence semantic relatedness to score the QinU characteristics.
8. ACKNOWLEDGMENT
This study was supported in part by University Malaysia Sarawak's Zamalah Graduate Scholarship and grant ERGS/ICT07(01)/1018/2013(15) from the Ministry of Education, Malaysia.
REFERENCES
[1] D. A. Garvin, "What does product quality really mean," Sloan Manage. Rev., vol. 26, no. 1, pp. 25–43, 1984.

[2] N. B. Osman and I. M. Osman, "Attributes for the quality in use of mobile government systems," in 2013 International Conference on Computing, Electrical and Electronics Engineering (ICCEEE), 2013, pp. 274–279.

[3] R. Alnanih, O. Ormandjieva, and T. Radhakrishnan, "A New Methodology (CON-INFO) for Context-Based Development of a Mobile User Interface in Healthcare Applications," in Pervasive Health SE-13, 1st ed., A. Holzinger, M. Ziefle, and C. Röcker, Eds. London: Springer London, 2014, pp. 317–342.

[4] H. H. J. La and S. D. S. Kim, "A model of quality-in-use for service-based mobile ecosystem," in 2013 1st International Workshop on the Engineering of Mobile-Enabled Systems (MOBS), 2013, pp. 13–18.

[5] T. Orehovački, A. Granić, and D. Kermek, "Evaluating the perceived and estimated quality in use of Web 2.0 applications," J. Syst. Softw., vol. 86, no. 12, pp. 3039–3059, Dec. 2013.

[6] T. Orehovački, "Development of a Methodology for Evaluating the Quality in Use of Web 2.0 Applications," in Human-Computer Interaction INTERACT 2011 SE-38, vol. 6949, P. Campos, N. Graham, J. Jorge, N. Nunes, P. Palanque, and M. Winckler, Eds. Springer Berlin Heidelberg, 2011, pp. 382–385.

[7] J. L. González, R. García, J. M. Brunetti, R. Gil, and J. M. Gimeno, "SWET-QUM: a quality in use extension model for semantic web exploration tools," in Proceedings of the 13th International Conference on Interacción Persona-Ordenador, 2012, pp. 15:1–15:8.

[8] ISO, "ISO/IEC 25000:2014, Software and system engineering--Software product Quality Requirements and Evaluation (SQuaRE)--Guide to SQuaRE," Geneva, Switzerland, 2014.

[9] ISO/IEC, "ISO/IEC 25010:2011, Systems and software engineering--Systems and software Quality Requirements and Evaluation (SQuaRE)--System and software quality models," International Organization for Standardization, Geneva, Switzerland, 2011.

[10] W. Leopairote, A. Surarerks, and N. Prompoon, "Software quality in use characteristic mining from customer reviews," in 2012 Second International Conference on Digital Information and Communication Technology and its Applications (DICTAP), 2012, pp. 434–439.

[11] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, "Lexicon-based methods for sentiment analysis," Comput. Linguist., vol. 37, no. 2, pp. 267–307, Jun. 2011.

[12] Q. Mei, X. Ling, M. Wondra, H. Su, and C. Zhai, "Topic sentiment mixture: modeling facets and opinions in weblogs," in Proceedings of the 16th International Conference on World Wide Web, 2007, pp. 171–180.

[13] D. Stelzer, W. Mellis, and G. Herzwurm, "A critical look at ISO 9000 for software quality management," Softw. Qual. J., vol. 6, no. 2, pp. 65–79, 1997.
[14] R. M. Poston, "IEEE 730: a guide for writing successful SQA plans," IEEE Softw., pp. 86–88, 1985.

[15] K. El-Emam and I. Garro, "ISO/IEC 15504," Int. Organ. Stand., 1999.

[16] J. A. McCall, P. K. Richards, and G. F. Walters, "Factors in software quality: Preliminary Handbook on Software Quality for an Acquisition Manager," General Electric, National Technical Information Service, Sunnyvale, CA, 1977.

[17] B. W. Boehm, J. R. Brown, H. Kaspar, M. Lipow, G. J. MacLeod, and M. J. Merrit, Characteristics of software quality, vol. 1. North-Holland Publishing Company, 1978.

[18] R. G. Dromey, "A model for software product quality," IEEE Trans. Softw. Eng., vol. 21, no. 2, pp. 146–162, Feb. 1995.

[19] B. Kitchenham, S. Linkman, A. Pasquini, and V. Nanni, "The SQUID approach to defining a quality model," Softw. Qual. J., vol. 6, no. 3, pp. 211–233, 1997.

[20] B. W. Boehm, "Software engineering economics," IEEE Trans. Softw. Eng., no. 1, pp. 4–21, 1984.

[21] B. Kitchenham, "Towards a constructive quality model. Part 1: Software quality modelling, measurement and prediction," Softw. Eng. J., vol. 2, no. 4, pp. 105–126, 1987.

[22] B. Kitchenham and L. Pickard, "Towards a constructive quality model. Part 2: Statistical techniques for modelling software quality in the ESPRIT REQUEST project," Softw. Eng. J., vol. 2, no. 4, pp. 114–126, 1987.

[23] D.-C. Sunita, "Bayesian analysis of software cost and quality models," University of Southern California, 1999.

[24] T. J. McCabe, "A complexity measure," IEEE Trans. Softw. Eng., no. 4, pp. 308–320, 1976.

[25] M. H. Halstead, Elements of Software Science (Operating and programming systems series). Elsevier Science Inc., 1977.

[26] S. R. Chidamber and C. F. Kemerer, "A metrics suite for object oriented design," IEEE Trans. Softw. Eng., vol. 20, no. 6, pp. 476–493, 1994.

[27] E. VanDoren, "Maintainability Index Technique for Measuring Program Maintainability," SEI STR Rep., 2002.

[28] N. E. Fenton and S. L. Pfleeger, Software metrics: a rigorous and practical approach, 2nd ed. PWS Publishing Co., 1998.

[29] R. Marinescu and D. Ratiu, "Quantifying the quality of object-oriented design: the factor-strategy model," in Proceedings of the 11th Working Conference on Reverse Engineering, 2004, pp. 192–201.

[30] J. Bansiya and C. G. Davis, "A hierarchical model for object-oriented design quality assessment," IEEE Trans. Softw. Eng., vol. 28, no. 1, pp. 4–17, 2002.

[31] J. D. Musa and K. Okumoto, "A Logarithmic Poisson Execution Time Model for Software Reliability Measurement," in Proceedings of the 7th International Conference on Software Engineering, 1984, pp. 230–238.

[32] B. Littlewood, "The Littlewood-Verrall model for software reliability compared with some rivals," J. Syst. Softw., vol. 1, no. 0, pp. 251–258, 1980.

[33] R. B. Grady, Practical software metrics for project management and process improvement. Upper Saddle River, NJ: Prentice-Hall, Inc., 1992.

[34] P. Kruchten, The rational unified process: an introduction, 3rd ed. Boston, Massachusetts, United States: Addison-Wesley Professional, 2004.

[35] S. Ambler, J. Nalbone, and M. Vizdos, Enterprise unified process, the: extending the rational unified process, 1st ed. Upper Saddle River, NJ, USA: Prentice Hall Press, 2005.

[36] I. Jacobson, The unified software development process. India: Pearson Education, 1999.

[37] S. Wagner, K. Lochmann, L. Heinemann, M. Kläs, A. Trendowicz, R. Plösch, A. Seidl, A. Goeb, and J. Streit, "The Quamoco product quality modelling and assessment approach," in Proceedings of the 2012 International Conference on Software Engineering, 2012, pp. 1133–1142.

[38] K. Ishikawa, What is total quality control? the Japanese way, 1st ed. Upper Saddle River, NJ: Prentice Hall, 1985.
[39] F. Deissenboeck, E. Juergens, K. Lochmann, and S. Wagner, “Software quality models: Purposes, usage scenarios and requirements,” in IEEE Workshop on Software Quality (WOSQ ’09), 2009, pp. 9–14.
[40] L. Masip, M. Oliva, and T. Granollers, “User experience specification through quality attributes,” in Proceedings of the 13th IFIP TC 13 International Conference on Human-Computer Interaction - Volume Part IV, 2011, pp. 656–660.
[41] P. Lew, L. Olsina, and L. Zhang, “Integrating quality, quality in use, actual usability and user experience,” in Software Engineering Conference (CEE-SECR), 2010 6th Central and Eastern European, 2010, pp. 117–123.
[42] D. Samadhiya, S.-H. Wang, and D. Chen, “Quality models: Role and value in software engineering,” in 2010 2nd International Conference on Software Technology and Engineering (ICSTE), 2010, vol. 1, pp. V1-320–V1-324.
[43] R. E. Al-Qutaish, “Quality models in software engineering literature: an analytical and comparative study,” J. Am. Sci., vol. 6, no. 3, pp. 166–175, 2010.
[44] W. Zhang, H. Xu, and W. Wan, “Weakness Finder: Find product weakness from Chinese reviews by using aspects based sentiment analysis,” Expert Syst. Appl., vol. 39, no. 11, pp. 10283–10291, Sep. 2012.
[45] G. Qiu, B. Liu, J. Bu, and C. Chen, “Expanding domain sentiment lexicon through double propagation,” in Proceedings of the 21st International Joint Conference on Artificial Intelligence, 2009, pp. 1199–1204.
[46] P. D. Turney, “Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews,” in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 2002, pp. 417–424.
[47] K. P. P. Shein and T. T. S. Nyunt, “Sentiment classification based on ontology and SVM classifier,” in Second International Conference on Communication Software and Networks, 2010, pp. 169–172.
[48] L. Zhang, B. Liu, S. H. Lim, and E. O’Brien-Strain, “Extracting and ranking product features in opinion documents,” in Proceedings of the 23rd International Conference on Computational Linguistics: Posters, 2010, pp. 1462–1470.
[49] A. Islam, E. Milios, and V. Kešelj, “Text similarity using google tri-grams,” in Advances in Artificial Intelligence, vol. 7310, L. Kosseim and D. Inkpen, Eds. Springer, 2012, pp. 312–317.
[50] D. D. Lewis and W. A. Gale, “A sequential algorithm for training text classifiers,” in Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1994, pp. 3–12.
[51] A. M. Kibriya, E. Frank, B. Pfahringer, and G. Holmes, “Multinomial naive Bayes for text categorization revisited,” in AI 2004: Advances in Artificial Intelligence, Springer, 2005, pp. 488–499.
[52] L. Jiang, Z. Cai, H. Zhang, and D. Wang, “Naive Bayes text classifiers: a locally weighted learning approach,” J. Exp. Theor. Artif. Intell., vol. 25, no. 2, pp. 273–286, 2013.
[53] K. Crammer and Y. Singer, “On the algorithmic implementation of multiclass kernel-based vector machines,” J. Mach. Learn. Res., vol. 2, pp. 265–292, Mar. 2002.
[54] J. O’Shea, Z. Bandar, K. Crockett, and D. McLean, “A comparative study of two short text semantic similarity measures,” in Agent and Multi-Agent Systems: Technologies and Applications, vol. 4953, N. Nguyen, G. Jo, R. Howlett, and L. Jain, Eds. Springer Berlin Heidelberg, 2008, pp. 172–181.
[55] A. McCallum and K. Nigam, “A comparison of event models for naive Bayes text classification,” in AAAI Workshop on Learning for Text Categorization, 1998, vol. 752, pp. 41–48.
[56] B. E. Boser, I. M. Guyon, and V. N. Vapnik, “A training algorithm for optimal margin classifiers,” in Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992, pp. 144–152.
[57] S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman, “Indexing by latent semantic analysis,” J. Am. Soc. Inf. Sci., vol. 41, no. 6, pp. 391–407, Sep. 1990.
[58] A. Mukherjee and B. Liu, “Aspect extraction through semi-supervised modeling,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012), 2012, pp. 339–348.
[59] L. W. Ku, Y. T. Liang, and H. H. Chen, “Opinion extraction, summarization and tracking in news and blog corpora,” in Proceedings of the AAAI-2006 Spring Symposium on Computational Approaches to Analyzing Weblogs, 2006, pp. 100–107.
[60] T.-L. Wong and W. Lam, “An unsupervised framework for extracting and normalizing product attributes from multiple web sites,” in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008, pp. 35–42.
[61] G. Qiu, B. Liu, J. Bu, and C. Chen, “Opinion word expansion and target extraction through double propagation,” Comput. Linguist., vol. 37, no. 1, pp. 9–27, 2011.
... Atoum et al. [14] suggested building a dataset of software quality-in-use as a step toward solving this problem. They further proposed two frameworks for solving it [6], [15]. A complete software quality prediction model was also proposed in [7], [16]. ...
Preprint
Software review text fragments carry considerably valuable information about users’ experience, covering a wide range of properties, including software quality. Opinion mining, or sentiment analysis, is concerned with analyzing textual user judgments. Applying sentiment analysis to software reviews can yield a quantitative value that represents software quality. Although many software quality methods have been proposed, they are considered difficult to customize and many of them are limited. This article investigates the application of opinion mining as an approach to extracting software quality properties. We found that the major issues in mining software reviews with sentiment analysis stem from the software lifecycle and the diversity of users and teams.
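The idea this preprint outlines, scoring review sentences with sentiment analysis to approximate software quality, can be illustrated with a deliberately simple sketch. The word lists and the scoring rule below are illustrative assumptions for demonstration, not the authors’ actual method:

```python
import re

# Toy lexicon-based sentiment scoring of software review sentences as a
# crude quality signal. The tiny word lists are placeholders, not a real
# sentiment lexicon.
POSITIVE = {"fast", "stable", "intuitive", "reliable", "easy"}
NEGATIVE = {"slow", "crash", "buggy", "confusing", "unstable"}

def sentence_polarity(sentence):
    """Return +1, -1 or 0 from counts of lexicon hits in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)

def review_quality_score(sentences):
    """Fraction of opinionated sentences that are positive, in [0, 1]."""
    polarities = [sentence_polarity(s) for s in sentences]
    opinionated = [p for p in polarities if p != 0]
    if not opinionated:
        return 0.5  # neutral default when no opinion words are found
    return sum(p > 0 for p in opinionated) / len(opinionated)

review = ["The interface is intuitive and fast.",
          "It tends to crash on large projects.",
          "Installation was easy."]
print(review_quality_score(review))  # 2 of 3 opinionated sentences are positive
```

A production system would replace the lexicon with a learned classifier and weight sentences by aspect relevance, but the aggregation step (sentence polarities rolled up into one quality value) is the essence of the approach described above.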
... According to the ISO/IEC 25022 standard [15], many factors influence the quality in use of a software product. Measuring or evaluating these factors is a complex process [24]. In general, software engineering offers both methods and tools that help reduce the complexity of development. ...
Article
Full-text available
In this study, the relationship between improving software requirement quality and software product quality in use was explored and analyzed. The analysis was based on the design of software product quality in use and on metrics from the ISO/IEC 25010 standard, measured in two software products. The results show that validation activities introduced at the software requirements stage have a positive relationship with the quality in use of the software products analyzed. For the software studied, it can be said that the improvement of requirements quality has contributed to the improvement of the quality in use of the software products.
... We draw the reader’s attention to the fact that the essential part of this framework is the customer, who should agree on the approach, while the technical team implements the framework in detail [31]. According to the natural language processing community, once the developer has enough data and a good semantic model, the results become a foregone conclusion [10], [32]–[34]. ...
Conference Paper
Full-text available
A successful operational software product depends on adequacy and degrees of freedom in requirements definitions. The software developer, in conjunction with the customer, validates requirements to ensure the completion of the intended use and the capability of the target application. Notwithstanding, requirements validation is time-consuming, laborious and expensive, and often involves error-prone manual activities. The difficulty of the problem increases with the application size, the application domain, and inherent textual requirements constructs. Current approaches to the problem are considered passive defect aggregations, domain-specific, or rather fine-grained with formal specifications. We propose a scalable operational framework to learn, predict, and recognize requirements defects using semantic similarity models and the Integration Functional Definition methods. The proposed framework automates the validation process and increases the productivity of software engineers in line with customer needs. A proof of concept shows the applicability of our solution to requirements inconsistency defects.
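One defect class this abstract mentions, requirements inconsistency, can be illustrated with a toy similarity-based detector: flag requirement pairs whose wording is nearly identical but where one negates the other. The overlap measure, negation test, and threshold below are assumptions for demonstration, not the framework’s semantic similarity models:

```python
import re

# Flag potentially inconsistent requirement pairs: high lexical overlap
# combined with opposite negation. Purely illustrative heuristics.
def words(text):
    # Drop negation words so that "shall log" and "shall not log" compare equal.
    return set(re.findall(r"[a-z]+", text.lower())) - {"not", "never", "no"}

def similarity(a, b):
    wa, wb = words(a), words(b)
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def negated(text):
    return bool(re.search(r"\b(not|never|no)\b", text.lower()))

def inconsistent_pairs(requirements, threshold=0.7):
    """Return pairs that are very similar yet differ in negation."""
    flagged = []
    for i, r1 in enumerate(requirements):
        for r2 in requirements[i + 1:]:
            if similarity(r1, r2) >= threshold and negated(r1) != negated(r2):
                flagged.append((r1, r2))
    return flagged

reqs = ["The system shall log every failed login attempt.",
        "The system shall not log every failed login attempt.",
        "The system shall export reports as PDF."]
print(inconsistent_pairs(reqs))  # flags the first two requirements
```

A real implementation would use sentence embeddings or another semantic similarity model instead of word overlap, and would cover more defect classes than bare negation, but the pairwise compare-and-flag loop matches the prediction step described above.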
Chapter
Full-text available
People perceive advanced technologies to be nondeterministic as they become increasingly complex and opaque. Therefore, the issue of trust in technology is becoming crucial, particularly in risky domains where the consequences of misuse can cause significant harm or loss. There is a lack of design heuristics that consider the users’ perspective on the trustworthiness of technologies to support practitioners in promoting trust in technologies. In this exploratory study, we aimed to understand how users perceive the trustworthiness of advanced financial technologies. A survey was conducted to assess users’ risk propensity and trust in technology, followed by semi-structured interviews with 12 participants to examine their behaviours and perceptions when using cryptocurrency platforms. We use the observation method to triangulate the data, asking participants to show us how they use the application. The grounded theory analysis identified the following factors that affect users’ perceptions of trustworthiness in advanced financial technologies: Usability, Credibility, Risk Mitigation, Reliability, and Level of Expertise. Keywords: Value-sensitive design; Human-centered design; Human-computer trust; Trustworthy technology
Article
Full-text available
In the current competitive world, producing quality products has become a prominent factor for success in business. In this respect, defining and following software product quality metrics (SPQM) to assess the current quality situation and continuously improve systems have gained tremendous importance. It is therefore necessary to review the present studies in this area to allow for an analysis of the situation at hand, as well as to enable predictions regarding future research areas. The present research aims to analyze the active research areas and trends on this topic appearing in the literature during the last decade. A Systematic Mapping (SM) study was carried out on 70 articles and conference papers published between 2009 and 2019 on SPQM, as indicated in their titles and abstracts. The result is presented through graphics, explanations, and the mind mapping method. The outputs include the trend map between the years 2009 and 2019, knowledge about this area and measurement tools, issues determined to be open to development in this area, and conformity between conference papers, articles and internationally valid quality models. This study may serve as a foundation for future studies that aim to contribute to development in this crucial field. Future SM studies might focus on this subject for measuring the quality of network performance and new technologies such as Artificial Intelligence (AI), Internet of Things (IoT), Cloud of Things (CoT), Machine Learning, and Robotics.
Article
Full-text available
Evaluation of the use of information technology is needed to determine how successfully a system has been created or implemented. This research evaluates a previously built simulation implementation: the information technology simulation is applied to one of the e-government systems in the health field, EJKBM. The system was built using the SDLC method; program flow and the database were designed using ERD, DFD and MVC. System testing was done using white box and black box testing, and the evaluation used ISO/IEC 25010:2011. Based on the cyclomatic complexity value V(G), white box testing found the system relevant; black box testing showed that the system behaves as expected; and, based on ISO/IEC 25010:2011, the system has good functionality, reliability, usability, efficiency, maintainability and portability.
Article
Full-text available
Software quality in use (QinU) relates to human-software interactions when a software product is used in a particular context. Current QinU measurement models suffer from ineffective measurement formulation, and many models are subjectively incoherent. This paper proposes a novel QinU framework (QinUF) to measure QinU competently by consuming software reviews. The framework has three components: QinU prediction, polarity classification, and QinU scoring. The QinU prediction component computationally maps software review sentences to their respective QinU characteristics (topics) of the ISO 25010 model based on a text similarity measure. The topic prediction problem is run as a text-to-text similarity, where the first (test) text is the actual unlabeled review sentence and the second text is the set of selected features (keywords) from a benchmark dataset. The polarity classification component classifies each test sentence by its polarity orientation, and the respective sentiment values are recorded. To score QinU, the sentiment values are grouped and summarized into their respective QinU topics. Evaluation of QinUF over real-life scenarios showed that it automates software QinU measurement; therefore, users can compare and acquire software on the fly. The framework is consistent and superior to the related works compared.
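The three QinUF stages described above (topic prediction, polarity classification, QinU scoring) could be sketched roughly as follows. The topic keyword sets and the plain word-overlap measure are placeholders for the paper’s benchmark feature sets and text similarity measure, and polarities are passed in rather than classified:

```python
import re

# Stage 1: map each review sentence to a quality-in-use topic by keyword
# overlap. Stage 3: aggregate per-topic polarities into a score in [-1, 1].
# Keyword lists are illustrative assumptions only.
TOPIC_KEYWORDS = {
    "effectiveness": {"accurate", "complete", "task", "goal"},
    "efficiency": {"fast", "quick", "slow", "time", "resources"},
    "satisfaction": {"love", "like", "hate", "pleasant", "annoying"},
}

def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def predict_topic(sentence):
    """Pick the topic whose keyword set overlaps the sentence most."""
    words = tokens(sentence)
    scores = {t: len(words & kw) for t, kw in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def score_quality_in_use(labeled_sentences):
    """labeled_sentences: (sentence, polarity in {-1, +1}) pairs."""
    totals = {}
    for sentence, polarity in labeled_sentences:
        topic = predict_topic(sentence)
        if topic is not None:
            totals.setdefault(topic, []).append(polarity)
    return {t: sum(p) / len(p) for t, p in totals.items()}

data = [("Search results come back fast", +1),
        ("Exports take too much time", -1),
        ("I love the clean layout", +1)]
print(score_quality_in_use(data))  # per-topic averages of sentence polarities
```

Swapping the overlap count for a proper semantic similarity measure and feeding in classifier-produced polarities would bring this toy version closer to the pipeline the abstract describes.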
Article
While software metrics are a generally desirable feature in the software management functions of project planning and project evaluation, they are of especial importance with a new technology such as the object-oriented approach. This is due to the significant need to train software engineers in generally accepted object-oriented principles. This paper presents theoretical work that builds a suite of metrics for object-oriented design. In particular, these metrics are based upon measurement theory and are informed by the insights of experienced object-oriented software developers. The proposed metrics are formally evaluated against a widely accepted list of software metric evaluation criteria.
Chapter
Mobile technology is an integral part of the modern healthcare environment. In Pervasive Healthcare, the Mobile User interface (MUI) serves as the bridge between the application and the healthcare professional. It is important that the doctor be able to easily express his needs on the MUI and correctly interpret the information displayed. The context-based MUI design methodology developed in this chapter offers a new approach to automated MUI context adaptation. This methodology for designing an adaptable context-dependent MUI for healthcare applications provides a solution that makes essential patient information available to doctors in an easily accessible, clear, and accurate way, and at any time. The quality-in-use of the MUI designed with this methodology is monitored using a new measurement model inspired by the ISO 25010 international standard and adapted to healthcare. The measurement model is validated both theoretically and empirically. The benefits of the proposed methodology for healthcare professionals include improved productivity, performance, and level of satisfaction, as well as increased patient safety, as doctors can access patient information whenever it is needed. The methodology is illustrated on a case study.
Article
This paper presents empirical results for several versions of the multinomial naive Bayes classifier on four text categorization problems, and a way of improving it using locally weighted learning. More specifically, it compares standard multinomial naive Bayes to the recently proposed transformed weight-normalized complement naive Bayes classifier (TWCNB) [1], and shows that some of the modifications included in TWCNB may not be necessary to achieve optimum performance on some datasets. However, it does show that TFIDF conversion and document length normalization are important. It also shows that support vector machines can, in fact, sometimes very significantly outperform both methods. Finally, it shows how the performance of multinomial naive Bayes can be improved using locally weighted learning. However, the overall conclusion of our paper is that support vector machines are still the method of choice if the aim is to maximize accuracy.
Article
A summary is presented of the current state of the art and recent trends in software engineering economics. It provides an overview of economic analysis techniques and their applicability to software engineering and management. It surveys the field of software cost estimation, including the major estimation techniques available, the state of the art in algorithmic cost models, and the outstanding research issues in software cost estimation.
Article
Due to being fast, easy to implement and relatively effective, some state-of-the-art naive Bayes text classifiers with the strong assumption of conditional independence among attributes, such as multinomial naive Bayes, complement naive Bayes and the one-versus-all-but-one model, have received a great deal of attention from researchers in the domain of text classification. In this article, we revisit these naive Bayes text classifiers and empirically compare their classification performance on a large number of widely used text classification benchmark datasets. Then, we propose a locally weighted learning approach to these naive Bayes text classifiers. We call our new approach locally weighted naive Bayes text classifiers (LWNBTC). LWNBTC weakens the attribute conditional independence assumption made by these naive Bayes text classifiers by applying the locally weighted learning approach. The experimental results show that our locally weighted versions significantly outperform these state-of-the-art naive Bayes text classifiers in terms of classification accuracy.
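The multinomial naive Bayes text classifiers revisited in the abstract above can be made concrete with a compact from-scratch implementation. This is a didactic baseline with Laplace smoothing and log-space scoring; it is not the TWCNB or locally weighted variants the papers compare:

```python
import math
from collections import Counter

# Minimal multinomial naive Bayes for text: class priors from label
# frequencies, word likelihoods from smoothed per-class token counts.
class MultinomialNB:
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.lower().split())
        self.vocab = {w for cnt in self.counts.values() for w in cnt}
        return self

    def predict(self, doc):
        def log_post(c):
            # Laplace smoothing: add 1 to every word count, |V| to the total.
            total = sum(self.counts[c].values()) + len(self.vocab)
            lp = math.log(self.prior[c])
            for w in doc.lower().split():
                lp += math.log((self.counts[c][w] + 1) / total)
            return lp
        return max(self.classes, key=log_post)

docs = ["crashes on startup", "ui crashes often", "great fast ui", "fast and stable"]
labels = ["neg", "neg", "pos", "pos"]
clf = MultinomialNB().fit(docs, labels)
print(clf.predict("it crashes a lot"))  # → neg
```

The locally weighted approach proposed in the paper would, at prediction time, train such a model on only the training documents nearest to the test document; the TF-IDF and length-normalization transforms discussed earlier would replace the raw counts in `fit`.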
Conference Paper
The purpose of this paper is to propose an unsupervised approach for measuring the similarity of texts that can compete with supervised approaches. Finding the inherent properties of similarity between texts using a corpus in the form of a word n-gram data set is competitive with other text similarity techniques in terms of performance and practicality. Experimental results on a standard data set show that the proposed unsupervised method outperforms the state-of-the-art supervised method and the improvement achieved is statistically significant at 0.05 level. The approach is language-independent; it can be applied to other languages as long as n-grams are available.
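As a rough illustration of n-gram-based unsupervised text similarity, the sketch below compares two sentences by Jaccard overlap of their word tri-grams. The actual method in the abstract above additionally weights matches with corpus (Google n-gram) frequencies, which this toy version omits:

```python
# Unsupervised similarity of two texts via word tri-gram overlap.
# No corpus statistics: a simplified stand-in for frequency-weighted
# Google tri-gram similarity.
def word_trigrams(text):
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def trigram_jaccard(a, b):
    ta, tb = word_trigrams(a), word_trigrams(b)
    if not ta and not tb:
        return 1.0  # two texts too short for tri-grams: treat as identical
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

s1 = "the application is easy to use"
s2 = "the application is easy to install"
print(trigram_jaccard(s1, s2))  # shares 3 of 5 distinct tri-grams → 0.6
```

Because it needs no labeled data, such a measure can be computed on the fly between a review sentence and a set of topic keywords, which is why unsupervised similarity is attractive for the quality-in-use prediction setting discussed in this article.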
Conference Paper
Aspect extraction is a central problem in sentiment analysis. Current methods either extract aspects without categorizing them, or extract and categorize them using unsupervised topic modeling. By categorizing, we mean the synonymous aspects should be clustered into the same category. In this paper, we solve the problem in a different setting where the user provides some seed words for a few aspect categories and the model extracts and clusters aspect terms into categories simultaneously. This setting is important because categorizing aspects is a subjective task. For different application purposes, different categorizations may be needed. Some form of user guidance is desired. In this paper, we propose two statistical models to solve this seeded problem, which aim to discover exactly what the user wants. Our experimental results show that the two proposed models are indeed able to perform the task effectively.
Conference Paper
Mobile government systems have become a very important channel for delivering government services to citizens, businesses, and other governmental organizations. However, the Quality in Use (QiU) of mobile systems is one of the challenges that keep these systems from being used to their maximum potential. This paper proposes a set of attributes for determining the QiU of mobile government systems: Usability, user Acceptance of mobile systems, and User Experience with the system. The attributes were validated based on objective and subjective measures. Data were collected from 75 subjects who participated in experiments by performing 4 representative tasks on a high-fidelity prototype of a Mobile-based Civil Registry System (MCRS). The results of the analyzed data showed that Usability of the MCRS is significantly related to Acceptance of the system, and that User Experience with the system was affected by the Usability and Acceptance attributes. The results clearly point out the important role of the QiU of mobile information systems and that designers of such systems need to focus on Usability, user Acceptance, and User Experience with the system in order to improve the QiU of mobile systems.