IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, MANUSCRIPT ID 1
Investigating Country Differences in Mobile App User Behavior and Challenges for Software Engineering
Soo Ling Lim, Peter J. Bentley, Natalie Kanakam, Fuyuki Ishikawa, and Shinichi Honiden
Abstract: Mobile applications (apps) are software developed for use on mobile devices and made available through app stores. App stores are highly competitive markets where developers need to cater to a large number of users spanning multiple countries. This work hypothesizes that there exist country differences in mobile app user behavior and conducts one of the largest surveys to date of app users across the world, in order to identify the precise nature of those differences. The survey investigated user adoption of the app store concept, app needs, and rationale for selecting or abandoning an app. We collected data from more than 15 countries, including the USA, China, Japan, Germany, France, Brazil, the UK, Italy, Russia, India, Canada, Spain, Australia, Mexico, and South Korea. Analysis of data provided by 4,824 participants showed significant differences in app user behaviors across countries. For example, users from the USA are more likely to download medical apps, users from the UK and Canada are more likely to be influenced by price, and users from Japan and Australia are less likely to rate apps. Analysis of the results revealed new challenges to market-driven software engineering related to packaging requirements, feature space, quality expectations, app store dependency, price sensitivity, and ecosystem effect.
Index Terms: Requirements/specifications, market-driven software engineering, mobile application development, user requirements, survey research, app user behavior, software product lines, software ecosystems.
1 INTRODUCTION
Mobile apps are software applications developed
for use on mobile devices such as smartphones and
tablets. Once developed, an app is sold via an ap-
plication distribution platform, commonly known as an
app store. App development is market-driven. Similar to
traditional market-driven software [1, 2], the require-
ments for an app are usually derived from strategic busi-
ness goals or from market opportunities. During the de-
velopment of an app, developers have limited contact
with potential users. Success is measured by the number
of downloads and revenues generated from the app. The
app store concept has democratized the software industry:
almost anyone can build and sell apps to a worldwide
population of users via app stores.
The benefits of app stores come with significant challenges. App developers face a crowded and highly competitive app market, and as a result, an app can fail (receive few or no downloads) due to features unrelated to its functionality and usability, such as app name, app icon, or level of exposure. As the profit margins from app sales are small (Section 1.2), an app should ideally appeal to a large number of users worldwide in order to be successful. However, many developers are unaware that users from different countries have different behavior and needs, and that these factors affect app downloads¹. There is also a lack of awareness about the importance of features such as app description, screenshots, pricing, and user feedback. These challenges have caused many apps to fail. Studies have found that 400,000 out of 600,000 apps in the iOS App Store have no downloads², and 80% of paid Android apps received fewer than 100 downloads [3].
Despite these failures, app development continues to
accelerate worldwide. Market-driven software engineer-
ing has been studied in the past [4-6], but today research-
ers are increasingly focusing on the new opportunities
and challenges of app development. Recent studies have
made advances in our understanding of app user behav-
iors through mining app store data, gathering user activi-
ty logs and surveys (e.g., [7-9]). These provide useful data
relating to specific smartphones, app stores, apps, app
categories (e.g., medical apps), countries, or age groups.
However, to date there has been little research that studies global user behaviors in different app stores and mobile devices, comparing across countries. In this work we complement previous research by focusing on this important area.
¹ http://www.guardian.co.uk/technology/appsblog/2012/dec/04/ios-android-revenues-downloads-country
² http://www.pcadvisor.co.uk/news/mobile-phone/3373458/400000-ios-app-store-apps-have-no-downloads/
Soo Ling Lim is with the Department of Computer Science, University College London, the National Institute of Informatics, Japan, and the Software Systems Research Centre, Bournemouth University. E-mail: s.lim@cs.ucl.ac.uk.
Peter J. Bentley is with the Department of Computer Science, University College London. E-mail: p.bentley@cs.ucl.ac.uk.
Natalie Kanakam is with the Department of Clinical, Education and Health Psychology, University College London. E-mail: n.kanakam@ucl.ac.uk.
Fuyuki Ishikawa is with the Digital Content and Media Sciences Research Division, National Institute of Informatics, Japan. E-mail: f-ishikawa@nii.ac.jp.
Shinichi Honiden is with the National Institute of Informatics, Japan. E-mail: honiden@nii.ac.jp.
1.1 Contributions
This work makes the following contributions:
• We conducted one of the largest surveys to date of mobile app users worldwide, in terms of questionnaire extent, participant number, and country coverage. Our questionnaire investigated user adoption of the app store concept, their app needs, and their rationale for selecting or abandoning an app, as well as the differences in user behaviors across countries. We surveyed 10,208 participants from more than 15 countries, including the United States of America, China, Japan, Germany, France, Brazil, the United Kingdom, Italy, Russian Federation, India, Canada, Spain, Australia, Mexico, and Republic of Korea. We anticipate that this extensive dataset will form a valuable resource for the fields of application software development, human-computer interaction, and mobile computing and we regard this as a major contribution of our work.
• We analyzed the data and identified clear evidence that there exist country differences in user app behavior, where some, but not all, of these differences can be correlated with known cultural differences between countries. The analysis was conducted using well-established statistical measures such as the Pearson correlation coefficient, linear regression, Pearson's chi-square test, and odds ratio. The large dataset enables our findings to be statistically significant.
• From analysis of our results and comparison with the market-driven software engineering literature, we identified new challenges and their corresponding implications for software engineering research.
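As a rough illustration of two of the statistical measures named in the contributions above (Pearson's chi-square test and the odds ratio), the following Python sketch compares one country against all other countries on a yes/no behavior. The counts, the `TwoByTwo` class, and the function names are hypothetical; they are not taken from the survey data or from the authors' analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one country versus all others, for a yes/no behavior."""
    country_yes: int
    country_no: int
    others_yes: int
    others_no: int


def odds_ratio(t: TwoByTwo) -> float:
    """Odds of the behavior in the country divided by the odds elsewhere."""
    return (t.country_yes * t.others_no) / (t.country_no * t.others_yes)


def chi_square(t: TwoByTwo) -> float:
    """Pearson's chi-square statistic for a 2x2 table (1 degree of freedom)."""
    a, b, c, d = t.country_yes, t.country_no, t.others_yes, t.others_no
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts, invented purely for illustration (not survey data).
table = TwoByTwo(country_yes=60, country_no=240, others_yes=400, others_no=3600)
print(f"odds ratio = {odds_ratio(table):.2f}")
print(f"chi-square = {chi_square(table):.2f}")
```

With one degree of freedom, a chi-square statistic above 3.841 corresponds to p < 0.05, and an odds ratio above 1 indicates that the behavior is more likely in the chosen country than elsewhere.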
1.2 Motivation
App development is now a mainstream form of software
engineering. Just as the growth of web development re-
sulted in every organization requiring its own webpages,
today every organization requires its own apps. Major
software companies such as IBM, Oracle and Accenture
are providing mobile application development services
and support³,⁴,⁵. The result is unprecedented growth and competition. For example, in January 2013, Apple’s iOS (mobile operating system) App Store had more than 200,000 app developers⁶, 700,000 apps, and 1,000 new apps per day. A keyword search for “to do list” on 18 Jan 2013 returned more than 1,000 apps offering the feature. With so much competition, developers may lose downloads due to “packaging” features such as the app’s icon, name, or description in the app store [10].
³ http://www-935.ibm.com/services/us/gbs/mobileconsulting/
⁴ http://www.accenture.com/us-en/Pages/service-custom-mobile-application-development.aspx
⁵ http://www.oracle.com/technetwork/developer-tools/adf-mobile/overview/index.html
⁶ An app developer can be a sole developer or a team of developers.
Apps often cost between $35,000 and $200,000 to develop⁷,⁸,⁹, and one study reported that almost 70% of developers had earned, on average, total revenue of $5,000 or less to date due to small margins (e.g., the profit of an app priced at $0.99 has to be shared between the app store and the developer)¹⁰. It is not surprising that 80% of developers reported generating insufficient revenue to support their business¹⁰. Some failures are very costly. For example, a $41 million project to develop an app that allows users to share live video broadcasts and photos with their friends was abandoned due to insufficient users and a high churn rate¹¹,¹². Media attention received by the app attracted downloads, but users found the app did not meet their needs and was difficult to use, and therefore abandoned the app¹¹,¹².
Some developers who have success in one country find difficulty repeating the success in others¹³. As developers have limited contact with their users, it is difficult for them to identify target users and their needs. Although developers can receive feedback or feature requests from users via ratings and reviews, review rates are very low, with many developers reporting a rate of less than 1%¹⁴,¹⁵. For example, a developer reported 81 reviews out of 91,534 downloads (i.e., averaging 1 review per 1,130 downloads)¹⁴. Consequently, only successful apps that have been downloaded thousands of times have a chance of obtaining useful user feedback. Previous research has found cultural differences in organizations and information systems (e.g., Hofstede et al. [11], Straub et al. [12], van Everdingen [13]) between countries. Findings such as these have led us to form the hypothesis that differences may also exist in mobile app user behavior between countries. However, cultural and country differences in the context of mobile apps have yet to be investigated. Our research aims to provide evidence to support the hypothesis and also to identify the precise differences in app user behavior across countries.
The remainder of the paper is organized as follows.
Section 2 provides a review of related literature. Section 3
describes the research questions, Section 4 describes the
methodology used, and Section 5 provides the results.
Section 6 analyzes the country differences using Hofstede’s cultural index [11], and discusses the new challenges and their implications for software engineering research. Section 7 discusses threats to validity, and Section 8 concludes.
⁷ http://www.bluecloudsolutions.com/blog/cost-develop-app/
⁸ http://answers.oreilly.com/topic/2292-how-much-does-it-cost-to-develop-an-app/
⁹ http://www.padgadget.com/2010/10/17/the-cost-of-building-an-ipad-app/
¹⁰ http://app-promo.com/wp-content/uploads/2012/04/AppPromo-TheNecessityofMobileAppMarketing.pdf
¹¹ http://www.digitaltrends.com/mobile/where-did-color-go-wrong/
¹² http://mashable.com/2012/10/17/color-shuts-down/
¹³ Personal communication with a global app analytics company.
¹⁴ http://iphonedevsdk.com/forum/business-legal-app-store/54678-how-many-reviews-per-download-do-you-get-on-average.html
¹⁵ http://www.cocos2d-iphone.org/forum/topic/1231
2 BACKGROUND
Existing research into understanding the needs of a large
population of app users and their app user behavior can
be categorized into those that mine app store data, those
that collect activity logs from mobile devices, and those
that conduct surveys and elicit feedback from users.
2.1 Mining App Store Data
App stores have accumulated a large amount of data,
such as app descriptions, user ratings, and reviews. As
such, an increasing number of studies to understand user
needs are conducted by mining data from the app stores
themselves. For example, Pagano and Maalej collected
data on user ratings and reviews for the top 25 free and
paid apps of one country on 16 September 2012 from each
app category in the Apple iOS App Store [7]. They used
various statistical measures to investigate how and when
users provide feedback, as well as analyze the content of
the reviews. Their results showed that most user reviews
were provided shortly after new releases, with a quickly
decreasing frequency over time. In addition, user reviews
typically contain multiple topics, such as user experience,
bug reports, and feature requests. The quality and constructiveness
of user reviews vary widely, from helpful
advice and innovative ideas to offensive comments [7].
Harman et al. mined the Blackberry app store for in-
formation such as app description, app category, user
ratings, price and the rank of the app based on down-
loads [14]. The authors found a strong correlation be-
tween user ratings and app ranking, but no correlation
seemed to be present between price and number of downloads.
Their study focused on priced apps; further work
may be necessary in order to corroborate the findings by
taking free apps into consideration [14]. Chen and Liu
mined the Apple iOS App Store and collected app infor-
mation such as name, developer, category, current rank-
ing, average rating, and number of ratings [15]. Their
analysis revealed that the top-ranked paid apps are not
necessarily closely correlated with user ratings, and their
finding was consistent with that of Pagano and Maalej [7].
2.2 Activity Logs
A large number of studies about mobile app users have
collected activity logs from mobile devices. For example,
Do et al. collected data about app access, location, and
Bluetooth from 77 Nokia Smartphone users over a dura-
tion of nine months [16]. They found that app usage de-
pends on the users’ location. For example, utility apps
such as clocks are used most frequently at home, while
camera and map apps are used most frequently on holi-
day. Participants who spend more time at a friend’s home
also use communication apps more [16]. Their study
highlighted the need for developers to recognize the
physical and social usage context of the apps they build.
Xu et al. studied network traffic created by apps [17].
Their results indicated that news and weather apps are
often used daily and at a certain time and suggested that
developers could implement prefetching mechanisms in
their apps to reduce latency perceived by users.
Falaki et al. collected app usage data from 255 Android
and Windows Mobile users [18]. They found immense
diversity among users, for example, the average number
of smartphone interactions per user per day ranged from
10 to 200, and suggested that apps should adapt to differ-
ent user groups. Bohmer et al. collected data related to the
status information of apps, such as installing, unin-
stalling, opening, and closing, from 4,125 Android users
[8]. Their study revealed many interesting app usage pat-
terns, for example, new applications are most popular in
the morning and games are most popular at night. How-
ever, the participants in Bohmer et al.’s study were biased
towards early adopters and frequent app users [19]. Alt-
hough these studies collected considerable data about
app usage, they have limited information about the par-
ticipants themselves [8], and as a result, have difficulty
achieving statistical control over potentially confounding
variables [19].
A number of studies focus on gathering requirements
for specific apps. For example, Henze et al. published five
game apps in the Android market and monitored how the
apps were used [20]. Their most popular app collected
data from 6,907 users. Their data showed that many users
abandoned the apps after a short period and they sug-
gested that developers should focus on app quality and
providing incentives to users in order to motivate long-
term use of an app [20]. Henze et al. also found that most
of their participants were English-speaking users from the
United States, hence limiting their ability to derive con-
clusions about a global population [20].
In another study, McMillan et al. collected usage data
of their iPhone app from 8,676 users over five months
[21]. Data logging seemed to be a cost-effective way to
collect data from a large number of geographically dis-
persed users. However, activity logs were unable to pro-
vide an in-depth understanding of user behavior, and log
analysis failed to reveal the users’ needs and rationale
behind their behavior [21]. In addition, the data was biased
towards users who enjoyed the app because users
who did not enjoy the app stopped using it and were
unavailable for data logging [21]. The researchers supported
the activity logs with questionnaires to elicit feedback
on app features and user demographics (e.g., age,
gender, country of residence). They also interviewed us-
ers from a range of countries, but due to language barri-
ers and difficulty engaging the users, they could only in-
terview 10 users [21].
To provide a richer set of data about users, Rahmati et
al. collected demographic information such as age and
household income in addition to activity logs [19]. Their
study was longitudinal over the period of a year, involv-
ing iPhone 3GS usage among 34 university students.
Their study revealed the importance of understanding
target users of an app. For example, participants with a
lower household income used social networking apps
such as Facebook and YouTube more than their peers.
They also downloaded more apps, used them more frequently,
but found them more difficult to use. In another
study, Rahmati and Zhong conducted a four-month study
of HTC Wizard phone usage from 14 teenagers in the
United States [22]. Recreational applications were the
most popular, and boredom caused gaming apps to lose
popularity.
2.3 Surveys and User Feedback Elicitation
Surveys are one of the best tools to learn about large
groups of users, their interests and their preferences [23].
When conducted effectively, surveys can produce a high-
er degree of certainty about the user’s profile compared to
indirect analysis of user behavior via activity logs [23].
For example, in addition to activity logs from 117 users of
Nokia N95 smartphones in Switzerland, Chittaranjan et
al. also used a questionnaire to collect the users’ demo-
graphic information (e.g., gender, age, nationality) and
self-reported personality traits [24]. They found that ex-
traverted participants are more likely to use office and
calendar apps, and receive more calls on their
smartphone [24]. Male participants were more likely to
use game apps, while female participants who were in-
troverted were more likely to use Internet apps [24].
Franko and Tirrell conducted an online survey to ex-
amine the app needs of 3,306 medical practitioners in the
United States [9]. They collected and analyzed data relat-
ed to the app store adoption by physicians (e.g., use of
smartphones, use of apps in clinical practice), app needs
(e.g., commonly used apps, desired app features), and
demographics (e.g., medical specialty, level of training).
Their results indicated that more than 85% of the partici-
pants owned a smartphone and 56% used apps in their
clinical practice. They also found that the most useful fea-
tures are drug guides, followed by medical calculators,
coding and billing apps, and pregnancy wheels. Most
importantly, there was a mismatch between physician
needs and app availabilities. For example, although a
large number of reference materials apps already exist in
app stores, they remained the most requested types of
apps by physicians since the existing apps were of insuf-
ficient quality. Merely importing all information from a
textbook into an app does not provide the optimal user
experience due to screen size or other restrictions. Many
reference apps cost nearly as much as equivalent print
versions. In order for an app to be successful in being
commonly used by physicians, it must be easy to use and
reasonably priced. Finally, information contained within
those apps may not be based on validated or peer-
reviewed information [9].
In order to gain a better understanding of develop-
ment practices for mobile apps, Agrawal and Wasserman
conducted a survey on app developers, using existing
mobile developer forums to solicit respondents [25]. Their
survey revealed that developers adhered quite well to
recommended sets of “best practices” but rarely used any
formal development processes. In addition, developers
rarely tracked their development efforts in an organized
manner and gathered few metrics. As mobile apps move
from inexpensive recreational uses to complex business-
critical applications, it will be essential to apply software
engineering processes to assure the development of se-
cure, high-quality software [25]. Wasserman proposed
that while many software engineering techniques will
transfer easily to the mobile apps domain, there are other
areas for new research and development such as user ex-
perience, non-functional requirements, processes, tools,
and architecture [25].
In the field of requirements engineering, Seyff et al.
proposed using mobile devices to elicit end-user needs
[26]. Using their proposed method, mobile phone users
can document their needs and requirements using text
entry, audio recordings, and images captured using their
phone. Their evaluation revealed that end-users are able
to document their needs without being facilitated by re-
quirements analysts [26].
2.4 Summary
To summarize, existing research into app user behavior focuses on a specific smartphone, app store, app, app category (e.g., medical apps), country, or age group. Large-scale studies using activity logging and data mining can reveal interesting usage patterns but not the rationale behind the patterns. In addition, they lack information related to user demographics (e.g., age, country of residence), which can be useful to understand the usage patterns. User studies collect detailed data and can reveal interesting insights, but they often involve too few participants for the results to be generalizable. Most importantly, the data is derived from highly focused studies, which are not able to elucidate the usage of many types of app at an international scale. There is a need for more comprehensive data that is representative of app user needs in many countries, which may help improve user experience and improve software development practice for mobile apps.
3 RESEARCH QUESTIONS
Our research questions first establish a baseline in order
to enable the discovery of country differences. This base-
line focuses on user adoption of the app store concept,
their app needs, and their rationale for selecting or aban-
doning an app. We then focus on the differences of these
findings between countries. The research questions are
listed as follows.
RQ1. How are users adopting the app store concept?
It is important to understand how best to develop apps
and app stores such that users can find apps. In this re-
search question we investigate user behavior relating to
seeking apps, in terms of the platform used, frequency of
use of that platform, frequency of downloads, and meth-
ods used to search for apps.
RQ1.1 What is the distribution of users across mo-
bile app platforms?
RQ1.2 How frequently do users visit their app
stores to look for apps?
RQ1.3 On average, how many apps do users down-
load per month?
RQ1.4 How do users find apps?
RQ2. What needs are users trying to meet with apps?
In addition to the mechanics of finding apps, there are the
fundamental needs of the users. In this question we aim
to understand what might prompt a user to consider
looking for an app in the first place, why they download
apps, and which types of apps they prefer.
RQ2.1 What triggers users to start looking for apps?
RQ2.2 Why do users download apps?¹⁶
RQ2.3 What types of apps do they download?
RQ3. What are the features of an app that influence its
selection or abandonment?
Apps must be advertised through app stores, potentially
making non-functional and packaging requirements as
important as functional requirements. In this research
question we investigate the importance of app features
versus descriptions, ratings, price, and perceived quality.
RQ3.1 What are the factors that influence users'
choices of apps?
RQ3.2 Given that ratings influence app selection,
why do users rate apps?
RQ3.3 Why do users pay for apps?
RQ3.4 Why do users stop using an app?
RQ4. How do the behaviors above vary across coun-
tries?
Here we revisit all the previous research questions with
the aim of detecting differences across countries. Do users
in different countries have different approaches to finding
apps or different needs, and are they influenced by different
factors when they choose or abandon apps?
4 METHODOLOGY
This study used a survey to investigate the research ques-
tions. We constructed a questionnaire in order to collect
quantitative data from app users. In order to provide a
representative and generalizable view of mobile app user
behavior, we targeted a large number of participants with
varied demographics. Our survey focused on the top 15 countries by GDP¹⁷. The targeted countries were the United States of America, China, Japan, Germany, France, Brazil, the United Kingdom, Italy, Russian Federation, India, Canada, Spain, Australia, Mexico, and Republic of Korea, sorted by decreasing GDP¹⁸. Due to the large coverage of participants, we employed an online survey in order to make the survey more accessible. To understand the participants’ background, we also used questions to elicit information about their demographics and personality.
¹⁶ RQ2.1 focuses on the stimulus to launch an app store and look for apps; RQ2.2 focuses on triggers to download apps once browsing within the store. It is possible that some answers to RQ2.2 may apply to RQ2.1.
¹⁷ Gross domestic product (GDP) is the market value of all officially recognized final goods and services produced within a country in a given period of time. GDP per capita is often used as an indicator of a country's standard of living, so people from higher GDP countries are more likely to be spending money on apps.
¹⁸ Data retrieved from the World Bank, http://data.worldbank.org/data-catalog/GDP-ranking-table
4.1 Questionnaire Construction
The objective of this work is to understand user adoption
of the app store concept, their app needs, and their ra-
tionale for selecting or abandoning an app and the differ-
ences across countries. To achieve the objective, we for-
mulated survey questions to correspond to each of the
research questions in Section 3. For example, for RQ1.1
(user distribution across mobile app platforms), we asked
participants to specify the make, model name and num-
ber of the mobile device they use, as well as the app store
they use. We used closed-ended questions whenever possible
because open-ended questions require much more
effort from the respondents [23].
For each closed-ended question, we assembled a list of
options gathered from the literature, our previous research,
and our experiences as app users and app developers¹⁹.
For example, for RQ1.1 (user distribution across
mobile app platforms), we compiled a list of popular app
platforms including Apple, Google Play, Blackberry, and
Windows Phone. For RQ3.1 (factors that influence the
choice of apps), we compiled a list of items the user can
see in the screen of purchase, such as app icon, app de-
scription, star ratings, and screen shots. (Previous re-
search has shown that quality of the icon influences the
user’s perception of app quality and their decision of
whether to download [10].) We attempted to capture the
full variety of human behavior including those that were
previously unknown. Therefore we included an “Other
(please specify)” option where applicable [27].
We worded our survey carefully in order to avoid any
misunderstanding of the questionnaire. We used lan-
guage that can be easily understood by participants from
ages 12 and above, and used unambiguous words [27].
For example, as “developer” is not a common word, we
substituted it with “person who developed the app.”
Technical or uncommon words were followed by exam-
ples. For instance, for the app category “Utilities” we
provided examples of apps belonging to the category
such as Calculator and Alarm Clock. When asking about
how frequently users visit the app store, we provided
quantifiable options such as, “once a day” or “once a
month”, rather than “frequently” or “rarely”, which are
subjective words.
We arranged the questions so as to engage the partici-
pants in the survey because participants who are interest-
ed are more likely to complete the survey and provide
better quality data [23, 27]. For example, we grouped the
questions thematically and arranged questions to have a
natural progression [23], e.g., starting with how users find apps, then what influences them when downloading apps, the amount they spend on apps, why they rate apps, and why they stop using apps. We put demographics questions at the end because they are considered boring and could be construed as intrusive at the start of the survey [23].
¹⁹ The second author developed an app that has received more than 3 million downloads (http://www.bbc.co.uk/news/11145583). The app was the number 1 top downloaded app in the UK iOS App Store in August 2010.
To reduce response bias, we randomized the ordering
of the answer choices for choices that do not need to be
sorted in order (e.g., answers for the app store questions).
This method reduces bias that may occur when respond-
ents choose answers without reading all of the options
[27]. In doing so, some options (such as “I don’t rate
apps” and “I do not pay for apps”) remain the first option
so that participants who do not do those things can quick-
ly move on to the next question, and some options (such
as “Other”) remain the final option where people usually
find them.
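The randomization described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the survey tool the authors used: the function name `randomize_options`, its parameters, and all option texts other than “I don’t rate apps” and “Other” are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def randomize_options(
    options: Sequence[str],
    pinned_first: Sequence[str] = (),
    pinned_last: Sequence[str] = (),
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Shuffle answer options, keeping pinned ones at the start or the end."""
    rng = rng or random.Random()
    pinned = set(pinned_first) | set(pinned_last)
    # Only the unpinned options are shuffled.
    middle = [o for o in options if o not in pinned]
    rng.shuffle(middle)
    first = [o for o in pinned_first if o in options]
    last = [o for o in pinned_last if o in options]
    return first + middle + last


# Hypothetical answer choices for a question about why users rate apps.
choices = [
    "I don't rate apps",
    "To let other users know the app is good",
    "To warn other users about a bad app",
    "Someone asked me to",
    "Other",
]
shown = randomize_options(
    choices,
    pinned_first=["I don't rate apps"],
    pinned_last=["Other"],
)
print(shown)
```

Each respondent would see a fresh ordering of the middle options, while the opt-out choice stays first and “Other” stays last. Questions whose answers have a natural order (for example, frequency scales) would simply bypass this function.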
To ensure participants did not miss any questions,
the online questionnaire highlighted missing answers, and
respondents could not proceed until the missing answers were
completed. We also used skip logic so that respondents
did not see questions that were not relevant to them.
Respondents who indicated that they did not own a mobile
device, or that their mobile device could not run apps, were
screened out. Finally, we tested the questionnaire on
common browsers, including Internet Explorer (v6 and
above), Apple Safari (v3 and above), Mozilla Firefox (v4
and above), and Google Chrome (v2 and above).
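The combination of required answers, skip logic, and screening described above can be sketched in Python as follows. The question keys, the answer values, and the helper names are hypothetical and serve only to illustrate the mechanism; the source does not describe how the online questionnaire was implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Answers = Dict[str, str]


@dataclass(frozen=True)
class Question:
    key: str
    # Predicate over earlier answers; None means the question is always shown.
    show_if: Optional[Callable[[Answers], bool]] = None


# Hypothetical question keys and answer values, for illustration only.
QUESTIONS = [
    Question("owns_device"),
    Question("device_runs_apps", show_if=lambda a: a.get("owns_device") == "yes"),
    Question("pays_for_apps", show_if=lambda a: a.get("device_runs_apps") == "yes"),
    Question("why_pay", show_if=lambda a: a.get("pays_for_apps") == "yes"),
]


def is_screened_out(answers: Answers) -> bool:
    """Respondents without an app-capable mobile device leave the survey."""
    return (
        answers.get("owns_device") == "no"
        or answers.get("device_runs_apps") == "no"
    )


def visible_questions(answers: Answers) -> List[str]:
    """Skip logic: return only the questions relevant to this respondent."""
    if is_screened_out(answers):
        return []
    return [q.key for q in QUESTIONS if q.show_if is None or q.show_if(answers)]


def missing_answers(answers: Answers) -> List[str]:
    """Visible questions that must be answered before the respondent proceeds."""
    return [key for key in visible_questions(answers) if not answers.get(key)]


respondent = {"owns_device": "yes", "device_runs_apps": "yes", "pays_for_apps": "yes"}
print(missing_answers(respondent))
```

A respondent who answers "no" to the hypothetical `pays_for_apps` question never sees `why_pay`, so it is not reported as missing.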
4.2 Pilot Study
We recruited eight participants to pre-test the question-
naire in order to identify potential problems [28]. We se-
lected the participants to reflect, as much as possible, the
varied demographics of our target audience in terms of
age (M = 31.75, SD = 10.17), gender (Female = 3, Male =
5), and countries (the United Kingdom, Germany, Japan,
China, and Australia). We asked the participants to com-
plete the questionnaire and point out any problems they
encountered. In particular, we asked them to (1) highlight
ambiguous instructions, questions, and options, (2) iden-
tify missing questions and options to the survey ques-
tions, and (3) point out improvements we can make to the
questionnaire in order to motivate potential respondents.
Based on feedback from participants, we revised the
questionnaire as summarized in Table 1.
Table 1. Participant Feedback and Questionnaire Modification
Feedback: RQ3.4 App abandonment. A participant reported that he stopped using many of his apps because he does not need them anymore, but this option was not available. RQ1.4 Finding apps. A participant reported that he finds apps using search engines, but the option was not available.
Modification: The options were added to the questionnaire.
Feedback: The checklist questions constrained the number of options a participant can select. Several participants found the constraint very restrictive and counter-intuitive because many of the options applied equally well to them. Forcing them to choose between the options may result in frustration as well as missing data.
Modification: The constraints were removed such that participants can select all options that applied to them.
Feedback: The draft questionnaire started with the personality question to draw the participants into the survey. However, two participants commented being confused about the focus of the survey.
Modification: The personality questions were moved to the end of the questionnaire.
Feedback: Some participants did not understand the rationale behind the demographics and personality questions for a mobile app user survey. They felt that understanding the purpose behind the questions would motivate participants to provide better quality responses.
Modification: The purposes of our questions were explained in the questionnaire. For example, we explained that the demographic questions were “to ensure that we have surveyed a varied mix of people in society.”
Feedback: Some participants were uncomfortable providing sensitive information such as their ethnicity and household income.
Modification: “Prefer not to say” options were added to sensitive questions.
We evaluated our revised questionnaire on four new
participants. Feedback from all the participants was positive.
Participants reported that the survey was “very engaging”
and “very well designed.”
4.3 Questionnaire Translation
The survey targets individuals from a variety of countries,
ages, and backgrounds. As such, the questionnaire
was translated into the first languages of the target countries
in order to avoid misunderstanding and increase the
accuracy of responses. The questionnaire was translated
from English into nine other languages: Spanish, Korean,
French, German, Japanese, Italian, Mandarin, Russian,
and Portuguese. We selected our translators from native
speakers of the language who were also proficient in Eng-
lish. Each translator was asked to use words that can be
easily understood by an audience from ages 12 and
above, and to ensure that the translated questionnaire
matches the English questionnaire. Finally, we validated
the translated questionnaires by asking a separate set of
native speakers to trial the survey in each language.
4.4 Final Questionnaire
The final questionnaire had three sections and had 31
questions in total. The first section asked respondents
about their user behavior in terms of mobile app usage,
including the app stores they use, what triggers them to
look for apps, why they download apps, why they aban-
don apps, and the types of apps they download. The se-
cond section consisted of demographic questions in order
to understand the types of people who responded to the
survey. These questions asked about the respondent’s
gender, age, marital status, nationality, country of resi-
dence, first language, ethnicity, education level, occupation,
and household income. The final section asked the
respondents about their personality, using the Big-Five
personality traits [29]. Finally, participants were asked to
provide optional comments and their email addresses
if they were interested in knowing the results. We
also collected their browser and operating system information.
An excerpt of the questionnaire can be seen in
Fig. 1, and the complete questionnaire is available in the
supplementary material of the paper and at:
http://www.cs.ucl.ac.uk/research/app_user_survey/
Fig. 1. Excerpt of questionnaire (second page).
The online questionnaire was set to automatically de-
fault to the respondents’ browser language, so that partic-
ipants could answer the survey in the language that they
were most comfortable with. Participants could also select
their preferred language on each page of the question-
naire (Fig. 1).
4.5 Data Collection
Two methods were used for data collection: snowballing
and online panels. The survey was conducted from the
26th of September 2012 to the 26th of November 2012. In
the first method, snowballing (used in our previous
research [30, 31]), we invited individuals in our social networks
to complete the survey, and then asked them to invite
individuals in their social networks to complete the survey,
and so on. The following channels were used: emails
to specific colleagues or friends, emails to mailing lists,
and posts of the survey link on Twitter, Facebook, and
LinkedIn.
The second method comprised the distribution of our
survey to a panel of international participants provided
by Cint20, an ISO certified panels company for conducting
opinion and social research21. To achieve a representative
sample of the target population, the panels used a random
and stratified sampling technique, and enabled the
recruitment of a sample of participants that is census representative22.
Within the required targets, the sample is randomly generated
as well as being stratified by high, medium and low
responders. A total of 32,491 panel members were recruited
to participate in the survey.
20 http://www.cint.com/
21 ISO 20252:2012 establishes the terms and definitions as well as the
service requirements for organizations and professionals conducting
market, opinion and social research.
(http://www.mrs.org.uk/standards/iso_20252/)
4.6 Data Cleaning Approach
We used the following approach to clean our data. We
focused on questions with an “Other (please specify)”
option where participants provided textual answers, in
order to codify their answers. We first translated each
textual answer to English, and then coded all the translated
responses into categories [32]. For example, for the
question “Why do you rate apps?”, the Spanish answer
“para que los creadores las hagan funcionar mejor” was
translated to English as “for creators to make them work
better,” and coded as “feedback to developers.” We
assigned the same code to other answers that, when
translated, have the same meaning, e.g., “to provide
feedback to the developers” and “to inform creators of
defects in the app”.
We then processed the codes as follows. If the code
duplicated an existing option in the same question, we
merged it with the existing option, and removed the
participants’ selection of the “Other” option. (We found
the majority of codes to fall in this category.) If the code
duplicated an existing option in another question, we
selected the option in the other question, and maintained
the participants’ selection of the “Other” option in the
original question. If the code was new, and the number of
answers sharing the same code was more than 5%, we
created a new option for the question, and participants
were recoded to select the new option rather than
“Other.” If the code was new, but the number of answers
sharing the same code was less than 5%, the participants’
selection of the “Other” option was retained. This approach
was used so that the “Other” option was the one with the
fewest answers among all options [33]. Only the question
“Which app store do you use?” had more than 5% with
the same code. The original questionnaire and the
questionnaire with the coded options are available in the
supplementary material of the paper and at:
http://www.cs.ucl.ac.uk/research/app_user_survey/
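The four recoding rules above can be summarized as a small decision function. This is an illustrative sketch only: the function name, the action labels, and the treatment of a share of exactly 5% (kept as “Other” here) are assumptions, since the text specifies only “more than” and “less than” 5%.

```python
def recode_other(code, same_question_options, other_question_options, share,
                 threshold=0.05):
    """Return the recoding action for one coded "Other" answer.

    share is the fraction of answers that received this code (assumed
    denominator: all answers to the question).
    """
    if code in same_question_options:
        # Duplicate of an option in the same question: merge, drop "Other".
        return "merge_into_existing_option"
    if code in other_question_options:
        # Duplicate of an option elsewhere: select it there, keep "Other" here.
        return "select_in_other_question"
    if share > threshold:
        # New and frequent enough: promote to a new option.
        return "create_new_option"
    # New but rare: the answer stays under "Other".
    return "keep_as_other"


# Hypothetical option sets for the rating question.
rating_options = {"feedback to developers", "app is good", "app is bad"}
abandon_options = {"found better alternative"}

action = recode_other("feedback to developers", rating_options, abandon_options, 0.01)
```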
Finally, for respondents who did not know their app
store, we used the mobile phone specifications they provided
in order to derive their app stores. For example, if
their mobile phone was an iPhone, we recoded their app store
as Apple iOS App Store, because the iOS App Store is the
most common and the only official app store used by
iPhone users.
22 The panels provided by Cint comply with ESOMAR, MRS, CASRO,
MRA, ARF, MRIA, AMA and AMSRO standards.
(http://www.cint.com/explore/opinionhub/quality/)
4.7 Data Analysis Techniques
We analyzed RQ1–3 using descriptive statistics. We also
used parametric statistics to analyze the relationship be-
tween variables as follows. We used the Pearson correla-
tion coefficient to analyze the relationship between users’
age and other variables, such as whether they use search
engines to find apps, or whether price influences their
app choice, as well as frequency of app store visits and
the average number of apps downloaded. Moderate sized
correlations (r > .5) were followed up with linear regres-
sions in order to assess whether one variable was a signif-
icant predictor of the other variable.
In RQ4 we revisited all previous research questions,
analyzing them across countries. Direct comparisons
were made for multiple-choice, single-answer questions
(RQ1.1 to RQ1.3). We analyzed the data using Pearson's
chi-square test (χ2) for multiple-choice, multiple-answer
questions (RQ1.4 onwards). Specifically, we used Pearson's
chi-square test to analyze whether there were significant
differences across countries for categorical variables
such as “compare several apps” or “browse randomly.”
A p value of less than 0.001 was used to determine
variables that differed significantly across countries
[34]. We measured the magnitude of the difference between
each country and the other countries in the dataset
combined using odds ratios [34]. For example, if country
C has an odds ratio of R for behavior B, it means that us-
ers from country C are R times more likely to exhibit be-
havior B compared to users from the other countries.
All quantitative analyses were conducted using SPSS23.
The results are presented using the APA standard [34].
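As an illustration of the two statistics used for RQ4, the sketch below computes an odds ratio and Pearson's chi-square statistic (1 degree of freedom, no continuity correction) from a 2×2 table of one country versus all other countries. The analyses in this paper were run in SPSS; this code is not that procedure, and the counts are invented purely for the example. Strictly, an odds ratio compares odds rather than probabilities.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: country C, exhibits behavior B     b: country C, does not
    c: other countries, exhibit B         d: other countries, do not
    """
    return (a * d) / (b * c)


def pearson_chi_square(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (1 degree of freedom)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Invented counts: 60 of 100 users in country C exhibit B,
# versus 400 of 1,000 users in the other countries.
ratio = odds_ratio(60, 40, 400, 600)
chi2 = pearson_chi_square(60, 40, 400, 600)
p = p_value_df1(chi2)
significant = p < 0.001  # threshold used in this paper
```

With these invented counts the odds ratio is 2.25, so the odds of exhibiting behavior B are 2.25 times higher in country C than elsewhere.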
5 RESULTS
Out of the 32,491 participants recruited from the panel, a
total of 9,818 participants responded, and a further 390
participants responded from our snowballing method,
resulting in a total of 10,208 participants who responded
to our survey (96% panel, 4% snowballing method). The
overall response rate was approximately 30%. This is sim-
ilar to the highest response rate achieved for online sur-
veys reported in Deutskens et al.24 [35]. Table 2 provides
the response rate for each country.
A total of 8,082 participants completed the survey
(panel = 7,831, snowballing = 251). (We exclude incomplete
surveys in our analysis.) A total of 3,258 participants
were screened out because they did not use apps25. Only
three participants provided bad data (e.g., garbage or
obscenities) and were excluded from the analysis. Thus
the final total comprised 4,824 participants (Male = 2,346
(49%), Female = 2,478 (51%), aged 11–87, average age =
34.51, standard deviation = 15.19). Fig. 2 shows the country
of residence of the participants at the time of the survey.
A total of 1,805 participants (37.4%) were interested
in learning about the results of the survey and volunteered
their contact details. The complete dataset is available in
the supplementary material of the paper and at:
http://www.cs.ucl.ac.uk/research/app_user_survey/
The following subsections describe our results for each
research question. The results consider all users regardless
of how long they have used apps, and include both
paid and free apps. For the purposes of brevity, we report
only the correlations that are > 0.2 or < –0.2 and significant.
The complete correlation results are available in
the supplementary material of the paper and at:
http://www.cs.ucl.ac.uk/research/app_user_survey/
Table 2. Countries and Response Rates from Panel. A further 390
participants responded through the snowballing method26. For some
participants, the panel country differed from the country of residence.
In our analysis of different countries, we used country of residence
provided by the participant in the demographics section of the questionnaire.
Recruited | Responded | Response Rate (%)
968 | 622 | 64.3
5350 | 707 | 13.2
3650 | 1075 | 29.5
4507 | 811 | 18.0
965 | 715 | 74.1
760 | 612 | 80.5
1388 | 479 | 34.5
810 | 362 | 44.7
2350 | 1439 | 61.2
900 | 453 | 50.3
5350 | 371 | 6.9
1521 | 553 | 36.4
650 | 430 | 66.2
810 | 518 | 64.0
2512 | 671 | 26.7
32491 | 9818 | 30.2 (Total)
23 http://www-01.ibm.com/software/uk/analytics/spss/
24 In Deutskens et al.’s study of the response rate of online surveys with
different configurations (e.g., short vs. long, donation to charity vs. lottery
incentive, early vs. late reminder), they found that the response rate
ranged from 9.4% to 31.4%.
25 A total of 3,258 responses were screened out because the respondent
did not own a mobile device (N=1,208), the respondent’s mobile device
could not run apps (N=1,653), and the respondent did not use apps
(N=394). These 394 participants selected “Other” for all app related questions
and provided the explanation that they “do not use apps”. This
small percentage of participants may have completed the survey even
though they believed that they did not use apps because panel users
were rewarded only if they completed the survey, and their responses
were valid and did not contain bad data. Among the 3,258 responses that
were screened out, 48 were from our snowballing method (respondent
did not own a mobile device (N=18), respondent’s mobile device could
not run apps (N=25), and respondent did not use apps (N=5)).
26 The response rate for these participants could not be calculated because
there is no way of knowing how many of our contacts saw our
posts on Facebook or Twitter, and no way of knowing which of the contacts
forwarded the link to their contacts and so on. In general, we found
that asking individuals to complete the survey (e.g., via personal email)
was more effective than posting the survey on Facebook or Twitter.
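The participant counts reported at the start of Section 5 and the totals in Table 2 can be cross-checked with simple arithmetic. The sketch below is illustrative only; the variable names are introduced here, and every figure is taken from the text.

```python
# Figures as reported in Section 5 and Table 2.
recruited_panel = 32_491
responded_panel = 9_818
responded_snowball = 390
completed = 8_082
screened_out = 3_258
male, female = 2_346, 2_478
wanted_results = 1_805

responded_total = responded_panel + responded_snowball   # all respondents
final_sample = completed - screened_out                   # participants analyzed

# Percentages, rounded as in the text.
panel_response_rate = round(100 * responded_panel / recruited_panel, 1)
panel_share = round(100 * responded_panel / responded_total)
wanted_results_share = round(100 * wanted_results / final_sample, 1)

print(responded_total, final_sample, panel_response_rate,
      panel_share, wanted_results_share)
```

The same division of responded by recruited reproduces the per-row response rates in Table 2.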
Fig. 2. Number of respondents per country after screening
(N=4,824). Countries in the “Other” category included, in decreasing
number of participants, Cyprus, Malaysia, Belarus, Ukraine, Colom-
bia, Costa Rica, Indonesia, Vietnam, Sweden, Guatemala, Kazakh-
stan, Singapore, Chile, Puerto Rico, Thailand, Argentina, El Salva-
dor, Peru, Philippines, Croatia, Ecuador, Greece, Norway, Panama,
Paraguay, Romania, Austria, Belgium, Bolivia, Caribbean, Dominican
Republic, Fiji, Ghana, Honduras, Ireland, Ivory Coast, Kyrgyzstan,
Mauritius, Netherlands, Pakistan, Poland, Portugal, St. Vincent,
Switzerland, Taiwan, Turkey, Uruguay, and Venezuela.27
5.1 App Store Adoption (RQ1)
This section reports the results for RQ1: How are users
adopting the app store concept?
5.1.1 User Distribution (RQ1.1)
The app store that was most used was Google
Play/Android Market (39%), followed by Apple iOS App
Store (22%), Nokia Ovi Store (15%), Samsung Application
Store (13%), Blackberry App World (6%), and Windows
Phone Marketplace (3%) (Fig. 3). This distribution was
consistent with the market share of smartphone operating
systems in Q1 2012: Android had the highest market
share, followed by Apple, Symbian28, Blackberry, and
Windows29. This result differed from that of Franko and
Tirrell, who found that the majority of practitioners
used Apple iOS (48%), followed by Android (19%) and
BlackBerry (13%). This could be due to their participants
being only medical practitioners in the United States of
America, which was a subset of the whole population.
27 Among the 4,824 responses, 203 are from our snowballing method
(Australia (N=13), Brazil (N=1), Canada (N=3), China (N=1), France
(N=1), Germany (N=2), Italy (N=30), Japan (N=19), UK (N=50), USA
(N=7), and Other (N=76)). Countries in the “Other” category included, in
decreasing number of participants, Cyprus, Malaysia, Sweden, Vietnam,
Indonesia, Singapore, Thailand, Taiwan, Norway, Netherlands, Greece,
Ireland, Mauritius, Austria, Portugal, and Switzerland.
28 Nokia phones run on Symbian OS.
29 http://www.idc.com/getdoc.jsp?containerId=prUS23503312
Approximately 15% of users did not know what their
app store was, despite visiting the store to download
apps. This might be due to some smartphone providers
supporting a number of operating systems (e.g., some
Samsung smartphones supporting Android, some Windows,
and others Samsung Bada), some app stores being
rebranded (the Android Market has been rebranded as
Google Play30, Ovi has been rebranded as Nokia31), and
some app stores in Japan being “wrapped” within local mobile
communication carrier stores. In the survey, some Apple
iOS users reported iTunes, Apple’s media player and media
library application, as their app store.
Fig. 3. User distribution across mobile app platforms.
5.1.2 Frequency of Visit (RQ1.2)
More than once a week was the most common frequency
with which users visited their app store (19%) (Fig. 4). This was
followed by less than once a month (18%) and once a
week (12%). The least common frequency of visiting the
app store was several times a day (8%). Approximately
9% of users reported not visiting the app stores to look for
apps. Correlation analysis revealed that as age increased,
the frequency of visiting the app store decreased significantly,
r = –.292, p = .000.
Fig. 4. Frequency of visiting app stores to look for apps.
30 http://techcrunch.com/2012/03/06/goodbye-android-market-hello-
google-play/
31 http://conversations.nokia.com/2011/05/16/the-evolution-of-
nokia-and-ovi/
5.1.3 Average Downloads (RQ1.3)
The highest proportion of users downloaded 2–5 apps per
month (40%) (Fig. 5). This was followed by 0–1 apps
(35%), 6–10 apps (14%), 11–20 apps (7%), and 21–30 apps
(2%). Only 2% of users downloaded more than 30 apps
per month.
The frequency of visits to the app store was significant-
ly correlated with the average number of apps download-
ed per month, r = .662, p = .000. A linear regression re-
vealed that the frequency of app store visits accounted for
43.9% of the variation in the average number of apps
downloaded per month (R2 = .439, p = .000). Correlation
analysis showed that with increasing age the average
number of apps downloaded per month decreased significantly,
r = –.233, p = .000.
Fig. 5. Average number of app downloads per month.
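For a linear regression with a single predictor, the proportion of variance explained equals the square of the Pearson correlation coefficient, which is why r = .662 corresponds to roughly 44% of the variation. The sketch below checks this identity on a small synthetic dataset; the data values and function names are assumptions for illustration and are not taken from the survey.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def r_squared(xs, ys):
    """R^2 of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data: visit-frequency level vs. download level.
visits = [1, 2, 3, 4, 5, 6]
downloads = [1, 1, 3, 2, 5, 4]

r = pearson_r(visits, downloads)
fit_r2 = r_squared(visits, downloads)
reported_r_squared = round(0.662 ** 2, 3)  # square of the r reported above
```

Squaring the reported r = .662 gives about .438, which agrees with the reported R2 = .439 once the rounding of r to three decimals is taken into account.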
5.1.4 Finding Apps (RQ1.4)
The most common way to find apps was keyword search in
the app store (43%) (Fig. 6). This was followed by browsing
randomly (38%), using search engines such as Google
(35%), looking at top downloads chart (35%), and comparing
several apps (31%). The fewest users reported
downloading the first app they found (10%), suggesting
that users tend to spend some time choosing
apps, even if the apps were free. Correlation analysis
showed that as age increased, the likelihood of users find-
ing apps by looking at top downloads chart decreased
significantly, r = –.209, p = .000.
Fig. 6. Methods used to find apps.
5.2 User Needs (RQ2)
This section reports the results for RQ2: What needs are
users trying to meet with apps?
5.2.1 Triggers (RQ2.1)
The most popular situation that triggered users to look
for apps was when they needed to know something
(55%), followed by when they wanted to be entertained
(54%), and when they were feeling bored (45%) (Fig. 7).
The least popular reason to look for apps was when users
were depressed (6%). However, the respondents’ willing-
ness to specify this option might have been influenced by
social desirability bias.
With increasing age, users were significantly less likely
to be triggered by boredom (r = –.331, p = .000), and the
need for entertainment (r = –.305, p = .000).
Fig. 7. Triggers to start looking for apps.
5.2.2 Reasons for Download (RQ2.2)
The most popular reason for users to download an app
was to be entertained (58%), followed by to carry out a
task (51%) (Fig. 8). The third most popular reason for us-
ers to download an app was because the app was recom-
mended by friends or family (36%). This shows the im-
portance of viral marketing and social networks on app
downloads. Curiosity was also an important reason
(35%), which meant that novel or quirky apps have the
potential to attract downloads in the app store.
With increasing age, users were significantly less likely
to download apps for entertainment, r = –.269, p = .000.
Fig. 8. Reasons for downloading apps.
5.2.3 App Types (RQ2.3)
The most popular app category was games (60%) fol-
lowed by social networking (55%) and music apps (41%)
(Fig. 9), which was consistent with the fact that the most
common reason to download apps was to be entertained
(Section 5.2.2). Utility apps and weather apps were very
popular too (41% and 39% respectively), indicating that
apps play an important role in supporting very specific
tasks and providing specific information.
As age increased, users were significantly less likely to
download entertainment apps (r = –.231, p = .000), games
apps (r = –.332, p = .000), social networking apps (r =
–.228, p = .000), and music apps (r = –.221, p = .000). Learning
and empowerment may also be factors that can reduce
boredom. However, the likelihood of downloading
apps that can provide learning and empowerment is not
correlated with age: education apps (r = .149, p = .000),
productivity apps (r = .075, p = .000) and reference apps
(r = .025, p = .078).
Fig. 9. Types of apps that users download.
5.3 Influencing Features (RQ3)
This section reports the results for RQ3: What are the fea-
tures of an app that influence its selection or abandon-
ment?
5.3.1 Choice (RQ3.1)
The most important factors that people consider when
choosing apps were: price (57%), app features (49%), app
description (49%), reviews by other users (48%), and star
ratings (46%) (Fig. 10). Sadly, the least important factor
that influenced a user’s choice of apps was the developer
(11%). This meant that developers would find it difficult
to use the success of their previous apps to promote fu-
ture apps. This finding was consistent with our experi-
ence.32 As age increased, screen shots became significantly
less likely to influence the users’ app choice, r = –.238, p =
.000.
Fig. 10. Factors that influence app choice.
5.3.2 Rating (RQ3.2)
Approximately 53% of users did not rate apps. The most
popular reason for rating apps was to let other users
know that the app was good (34%), followed by to let
other users know that the app was bad (20%) (Fig. 11).
Interestingly, the app rewarding users to rate it (11%) was
a less popular reason compared to the app simply re-
minding the users to rate it (15%). The least common rea-
son for users to rate apps was because they were asked by
someone else to do so (6%).
Fig. 11. Reasons for rating apps.
5.3.3 Payment (RQ3.3)
Most app users did not pay for apps (57%). The most
popular reason to pay for apps was that users could not
find free apps with similar features (19%). This was followed
by the need to get additional features for paid apps
(17%) and for free apps (15%), and that the apps were on
sale (14%) (Fig. 12). However, a similar number of users
selected each reason (M = 13%, SD = 4%). The least common
reason people paid for apps was to subscribe for
paid content (7%). This might be because, when the content
had to be paid for, users expected the app to be free.
32 When the iStethoscope Pro app by the second author was number 1
in the UK App Store, other apps by the same developer received no in-
crease in downloads.
Fig. 12. Reasons for paying for apps.
5.3.4 Abandonment (RQ3.4)
The most common reason for app users to abandon an
app was because they did not need the app anymore
(44%) (Fig. 13). This was followed by finding better alter-
natives (39%) and getting bored of the app (38%). This
finding suggested that many apps served temporary
functions, unlike desktop software. Correlation analysis
showed that with decreasing age, users were significantly
more likely to abandon apps because they were bored of
the app, r = –.261, p = .000.
Non-functional requirements, such as performance, reliability
and usability, were important for app users. Reasons
such as the app crashed, the app did not have the
required features, the app was too slow, the app was difficult
to use, and the app did not work were, on average,
sufficient reason for more than 30% of users to abandon
an app (Fig. 13). This result showed that the quality of
an app was crucial to encourage continued usage. This is
consistent with the more recent study by Khalid et al., which found that
functional errors and app crashes are among the most
frequent complaints by users in their app reviews [36].
Only 17% of users stopped using an app because it in-
vaded their privacy. However, this might be due to app
users being largely unaware of their privacy being invad-
ed and the implications [37].
Fig. 13. Reasons for abandoning apps.
5.4 Differences between Countries (RQ4)
The results for RQ1 to RQ3 established the baseline mean
user behaviors across all countries in our study. We now
focus on the main aim of the paper: to investigate the dif-
ferences in app user behavior between countries.
When comparing the results for the first research question
(RQ1.1 to RQ1.3) between countries, some clear differences
were evident. Respondents in some countries
used certain app stores more frequently, and others less
frequently, than the global trend (RQ1.1, Section 5.1.1). At
the time of the survey, Google Play was the app store
used by the highest number of respondents across all countries
combined (RQ1.1, Section 5.1.1). However, in Australia the
highest number of respondents (41%) used Apple, like-
wise in Canada the highest number of respondents (33%)
used Apple; in India the highest number of respondents
(44%) used Nokia, and in Japan 50% of the respondents
selected “Other” as their responses to the app store ques-
tion and specified Japanese communication carriers such
as Docomo and AU as their app stores. Until recently,
Japanese communication carriers such as Docomo and
AU created their own app stores specific to feature
phones. Even today, for Android devices, Japanese com-
munication carriers have developed a wrapper around
Google Play such that users can access Google Play apps
via the app store of the communication carriers33,34. This
also results in fewer Japanese users knowing the name of
their app store compared to any other country. A total of
49% of app users in Japan did not know their app stores
while the average percentage per country was 16% and
the standard deviation was 11%.
33 http://www.techinasia.com/docomo-dmarket-dmenu/
34 http://blog.appannie.com/world-series-of-apps-japan/
Although the global results showed that the most
common frequency of visits to app stores was more than
once a week (RQ 1.2, Section 5.1.2), in many countries the
most common frequency of visits was less than once a
month. Only Brazil (22%), China (34%), South Korea
(32%), Spain (20%), and the United States (20%) had the
most common frequency of visits as more than once a
week. In India, the highest number of respondents visited
the app store once a day (21%). Countries where respondents
visited app stores more frequently also had a
higher average number of downloads. This was consistent
with our findings of a strong correlation between
the frequency of app store visits and the average number
of downloads per month (RQ1.3, Section 5.1.3).
Fig. 14 shows a heat map visualization of the differ-
ences normalized per country so that the values of the
odds ratio range from 0 to 1, where 0 is the lowest odds
ratio and 1 is the highest odds ratio. Low odds ratio
means low differences in behavior and high odds ratio
means high differences in behavior. (As described in Sec-
tion 4.7, if country C has an odds ratio of R for behavior B,
it means that users from country C are R times more like-
ly to exhibit behavior B compared to users from the other
countries.) It is clear from Fig. 14 that many countries
have unique differences compared to other countries. The
mostly blue stripe representing Japan shows that app users
from Japan are indifferent for most answers apart
from not rating apps (the only red box in the blue stripe):
Japanese users strongly prefer not to rate apps compared
to users from the other countries.
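The per-country normalization behind the heat map can be sketched as min-max scaling of one country's odds ratios, so that the lowest maps to 0 and the highest to 1. The text does not state the exact formula used, so min-max scaling is an assumption here, as are the function name and the sample odds ratios.

```python
def normalize_per_country(odds_ratios):
    """Min-max scale one country's odds ratios to the range [0, 1]."""
    lowest = min(odds_ratios.values())
    highest = max(odds_ratios.values())
    if highest == lowest:
        # All answers have the same odds ratio: no spread to display.
        return {answer: 0.0 for answer in odds_ratios}
    return {
        answer: (value - lowest) / (highest - lowest)
        for answer, value in odds_ratios.items()
    }


# Hypothetical odds ratios for one country.
country_odds = {"do not rate apps": 2.5, "influenced by price": 1.0, "app crashes": 0.5}
normalized = normalize_per_country(country_odds)
```

Because each country is scaled separately, a red cell marks the answer that stands out most within that country, not the largest odds ratio in the whole dataset.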
Figs. 15–22 illustrate the odds ratio results per country
for RQ1.4 onwards. Stacked bar charts are used in order
to show cumulative odds ratio results (i.e., odds ratios for
all answers to a given question are stacked in one bar per
country). A longer bar corresponds to a higher cumula-
tive odds ratio. For each question, the stacked bar charts
are ordered by decreasing cumulative odds ratio, so that
the country with highest cumulative odds ratio appears
first. For example, China had the highest cumulative odds
ratios for many questions, with Brazil, India and Mexico
following behind. Japan had the lowest cumulative odds
ratios for all questions except for reasons for rating apps
where Germany had the lowest cumulative odds ratio
(Fig. 20). The different colors within each bar show the
odds ratio for each answer to each question to enable a
direct visual comparison across countries (each color corresponds
to a specific answer). For example, in reasons
for downloading apps (Fig. 17), users in Germany, the UK and China
are more likely to download on impulse, compared to users in
Spain, Mexico and Brazil. The option “Other” was not
analyzed for odds ratio because for these research ques-
tions it comprised less than 5% of the responses per coun-
try. All countries showed similar odds ratios for reasons
to abandon an app, with Brazil showing the largest devia-
tion (Fig. 22).
---- Fig. 15–22 here ----
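The ordering used in the stacked bar charts can be sketched as summing each country's odds ratios for a question and sorting the countries by that total in decreasing order. The function name and the odds ratios below are hypothetical values chosen for illustration.

```python
def order_by_cumulative_odds(odds_by_country):
    """Return countries sorted by decreasing cumulative odds ratio."""
    totals = {
        country: sum(ratios.values())
        for country, ratios in odds_by_country.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical odds ratios for two answers to one question.
example = {
    "Country A": {"answer 1": 0.5, "answer 2": 0.75},
    "Country B": {"answer 1": 2.0, "answer 2": 1.5},
    "Country C": {"answer 1": 1.0, "answer 2": 1.25},
}
bar_order = order_by_cumulative_odds(example)
```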
Pearson’s chi-square test on the countries and user behaviors
provides a clear picture of the significant differences
in app user behavior between countries (RQ1.4
onwards). Table 3 reports the odds ratio results for each
country in turn, highlighting the top three largest differ-
ences of that country for brevity. The complete odds ratio
and Pearson’s chi-square results are available in the sup-
plementary material of the paper and at:
http://www.cs.ucl.ac.uk/research/app_user_survey/
Together, these results clearly indicate that significant
differences exist in mobile app user behavior between
countries, confirming our hypothesis. The findings pre-
sented here provide a crucial snapshot of the differences
to enable future work to track their evolution over time.
Fig. 14. Heat map of odds ratio per variable normalized per country.
Blue to yellow shades indicate lower odds ratios (between 0 and
0.65 respectively), yellow to red shades indicate higher odds ratios
(between 0.65 and 1 respectively). Each row of the heat map corresponds
to each answer choice for the research question in the order
depicted in Fig. 15–22. Axis labels: Odds Ratio; Australia, Brazil, Canada,
China, France, Germany, India, Italy, Japan, Mexico, South Korea, Russia,
Spain, United Kingdom, United States.
6 ANALYSIS AND DISCUSSION
Previous research in cultural differences in organizations
and technology usage by Hofstede [11] led to our hypoth-
esis that country differences may exist in app user behav-
ior. The results in Section 5 confirm the hypothesis and in
addition highlight specific differences for each country in
terms of app user behavior. Section 6.1 analyzes the coun-
try difference results by comparing them with Hofstede’s
work [11]. Section 6.2 compares our findings with the
literature in market-driven software engineering in order
to identify new challenges and to inform our discussion
of their implications for software engineering.
Table 3. Top Three Largest Differences in App User Behavior Between Each Country and the Other Countries. Differences are measured
using odds ratio. The results are statistically significant where p < .001.
Country | Top Three Differences in App User Behavior Compared to Other Countries | Pearson’s Chi-square
Australia | 1. App users are 2.51 times more likely not to rate apps, Fig. 20. | χ² (1) = 47.47, p = .000
Australia | 2. App users are 1.64 times more likely to be influenced by price when choosing apps, Fig. 19. | χ² (1) = 14.24, p = .000
Australia | 3. App users are 1.61 times more likely to abandon an app because they had forgotten about it, Fig. 22. | χ² (1) = 9.95, p = .002
Brazil | 1. App users are 2.39 times more likely to stop using an app because it crashes, Fig. 22. | χ² (1) = 76.64, p = .000
Brazil | 2. App users are 2.34 times more likely to stop using an app because it is slow, Fig. 22. | χ² (1) = 73.06, p = .000
Brazil | 3. App users are 2.29 times more likely to download social networking apps, Fig. 18. | χ² (1) = 57.02, p = .000
Canada | 1. App users are 2.45 times more likely to be influenced by price when choosing apps, Fig. 19. | χ² (1) = 74.19, p = .000
Canada | 2. App users are 2.05 times more likely not to rate apps, Fig. 20. | χ² (1) = 53.18, p = .000
Canada | 3. App users are 1.85 times more likely to stop using an app because they had forgotten about it, Fig. 22. | χ² (1) = 29.8, p = .000
China | 1. App users are 9.27 times more likely to select the first app on the list presented to them, Fig. 15. | χ² (1) = 541.92, p = .000
China | 2. App users are 6.02 times more likely to rate apps, Fig. 20. | χ² (1) = 278.4, p = .000
China | 3. App users are 5.83 times more likely to download apps that feature their favorite brands or celebrities, Fig. 17. | χ² (1) = 264.32, p = .000
France | 1. App users are 1.69 times more likely to download catalogue apps, Fig. 18. | χ² (1) = 6.9, p = .009
France | 2. App users are 1.47 times more likely not to rate apps, Fig. 20. | χ² (1) = 7.93, p = .005
France | 3. App users are 1.32 times more likely to be influenced by price when choosing apps, Fig. 19. | χ² (1) = 3.89, p = .049
Germany | 1. App users are 2.31 times more likely to download reference apps, Fig. 18. | χ² (1) = 27.4, p = .000
Germany | 2. App users are 2.12 times more likely not to rate apps, Fig. 20. | χ² (1) = 30.4, p = .000
Germany | 3. App users are 1.83 times more likely to download apps out of impulse, Fig. 17. | χ² (1) = 9.82, p = .002
India | 1. App users are 3.35 times more likely to download education apps, Fig. 18. | χ² (1) = 119.46, p = .000
India | 2. App users are 2.89 times more likely to rate apps because someone asked them to do so, Fig. 20. | χ² (1) = 40.35, p = .000
India | 3. App users are 2.43 times more likely to download sports apps, Fig. 18. | χ² (1) = 56.11, p = .000
Italy | 1. App users are 1.43 times more likely not to rate apps, Fig. 20. | χ² (1) = 7.6, p = .006
Italy | 2. App users are 1.30 times more likely not to pay for apps, Fig. 21. | χ² (1) = 3.94, p = .047
Italy | 3. App users are 1.21 times more likely to download travel apps, Fig. 18. | χ² (1) = 1.67, p = .196
Japan | 1. App users are 5.91 times more likely not to rate apps, Fig. 20. | χ² (1) = 100.78, p = .000
Japan | 2. App users are 2.2 times more likely not to pay for apps, Fig. 21. | χ² (1) = 26.34, p = .000
Japan | 3. App users are 1.36 times more likely to look for apps when they need to know something, Fig. 16. | χ² (1) = 4.7, p = .03
Mexico | 1. App users are 2.64 times more likely to pay for apps because they believe that paid apps have more features in general, Fig. 21. | χ² (1) = 45.15, p = .000
Mexico | 2. App users are 2.44 times more likely to rate an app because they were asked by the app to do so, Fig. 20. | χ² (1) = 39.22, p = .000
Mexico | 3. App users are 2.31 times more likely to pay for an app to get additional features for free apps, Fig. 21. | χ² (1) = 33.17, p = .000
South Korea | 1. App users are 4.1 times more likely to look for apps when feeling bored, Fig. 16. | χ² (1) = 103.8, p = .000
South Korea | 2. App users are 3.46 times more likely to download game apps, Fig. 18. | χ² (1) = 59.91, p = .000
South Korea | 3. App users are 3.15 times more likely to look for apps when they want to be entertained, Fig. 16. | χ² (1) = 61.78, p = .000
Russia | 1. App users are 2.47 times more likely to download reference apps, Fig. 18. | χ² (1) = 35.6, p = .000
Russia | 2. App users are 2.39 times more likely to find apps using search engines, Fig. 15. | χ² (1) = 51.3, p = .000
Russia | 3. App users are 2.02 times more likely to rate apps because someone asked them to do so, Fig. 20. | χ² (1) = 11.62, p = .000
Spain | 1. App users are 1.62 times more likely to find apps by looking at the featured apps section of the app store, Fig. 15. | χ² (1) = 13.16, p = .000
Spain | 2. App users are 1.6 times more likely to stop using an app because it crashes, Fig. 22. | χ² (1) = 13.52, p = .000
Spain | 3. App users are 1.52 times more likely to download apps to interact with people they don’t know, Fig. 17. | χ² (1) = 4.45, p = .035
United Kingdom | 1. App users are 2.91 times more likely to be influenced by price when choosing apps, Fig. 19. | χ² (1) = 54.12, p = .000
United Kingdom | 2. App users are 2.66 times more likely to abandon an app because they had forgotten about it, Fig. 22. | χ² (1) = 52.65, p = .000
United Kingdom | 3. App users are 1.81 times more likely not to rate apps, Fig. 20. | χ² (1) = 20.74, p = .000
United States | 1. App users are 2.07 times more likely to download medical apps, Fig. 18. | χ² (1) = 21.51, p = .000
United States | 2. App users are 1.68 times more likely to download weather apps, Fig. 18. | χ² (1) = 19.31, p = .000
United States | 3. App users are 1.66 times more likely to be influenced by price when choosing apps, Fig. 19. | χ² (1) = 16.08, p = .000
6.1 Country Differences
While some differences are related to historical or techno-
logical legacies as in the case of app store awareness in
Japan (Section 5.4), the causes of other differences are
perhaps more complex and difficult to track. The differences
in user behaviors are largely independent of GDP:
when countries are ranked in order of differences, the rankings do not
correspond to the relative wealth of those countries. Our
results indicate that country-specific differences exist in
almost all categories: users from the UK are most forget-
ful about their apps and most influenced by price, users
from Japan prefer not to rate apps, users from China are
more likely than users from any other country to select the
first app on the list, users from Mexico think that paid apps have more
features, and users from Germany and Russia are more
likely to download reference apps.
In order to understand the differences, we measured
the correlation between app user behavior and Hofstede’s
cultural index as follows [11]:
1. Power Distance Index (the extent to which the less
powerful members of institutions and organizations
within a country expect and accept that power is
distributed unequally),
2. Individualism Index (the preference for a loosely-
knit social framework in which individuals are ex-
pected to take care of themselves and their immedi-
ate families only),
3. Masculinity Index (masculine societies have clearly
distinct emotional gender roles: men are supposed
to be assertive, tough, and focused on material suc-
cess, whereas women are supposed to be more
modest, tender, and concerned with the quality of
life),
4. Uncertainty Avoidance Index (the degree to which
the members of a society feel uncomfortable with
uncertainty and ambiguity),
5. Long-Term Orientation Index (the fostering of virtues
oriented towards the future such as persistence and
personal adaptability), and
6. Indulgence Versus Restraint Index (indulgent socie-
ties have a tendency to allow relatively free gratifi-
cation of basic and natural human desires related to
enjoying life and having fun, restrained societies
have a conviction that such gratification needs to be
curbed and regulated by strict norms).
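The correlation analysis introduced before the list above can be sketched as follows: each respondent is paired with the cultural index score of their country and a 0/1 indicator of whether they selected a given answer, and Pearson’s r is computed over these pairs. This is a minimal sketch under that assumption; the exact procedure is not specified here, and the scores and answers below are hypothetical values rather than Hofstede’s scores or survey data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical data: one entry per respondent.
# index_scores: cultural index score of the respondent's country.
# selected_answer: 1 if the respondent chose the answer (e.g., "price"), else 0.
index_scores = [20, 20, 30, 30, 80, 80, 90, 90]
selected_answer = [0, 0, 0, 1, 1, 0, 1, 1]

r = pearson_r(index_scores, selected_answer)
print(f"r = {r:.3f}")
```

A positive r means respondents from countries with a higher index score chose the answer more often; a negative r means they chose it less often.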
Our analysis indicates that Hofstede’s cultural index
helps to explain some, but not all, of the country differ-
ences we observed. Results with some correlation to the
cultural index include:
Users from strong power distance countries are less
likely to be influenced by price when choosing apps
(r = –.219, p = .000), more likely to spend money on
apps because they believe paid apps have better
quality in general (r = .203, p = .000), less likely not
to rate apps (r = –.275, p = .000), and more likely to rate
an app to let others know that it is good (r = .262, p
= .000). For example, app users in Russia, Mexico,
China and India (high power distance) are more
likely to spend money on apps because they believe
paid apps have better quality in general than app
users in Canada, Australia, Germany and the Unit-
ed Kingdom (low power distance) (Section 5.4 Fig.
21).
Users from strong individualism index countries are
more likely to be influenced by price when choosing
apps (r = .240, p = .000). They are also more likely
not to rate apps (r = .250, p = .000) and less likely to
rate an app in order to let others know that it is
good (r = –.241, p = .000). For example, app users in
the United States, Australia, the United Kingdom
and Canada (high individualism index) are more
likely to be influenced by price when choosing apps
than app users in China and Mexico (low individu-
alism index) (Section 5.4 Fig. 19). The former group
of users is also less likely to rate an app in order to
let others know that it is good compared to the latter
(Section 5.4 Fig. 20). In previous work, individualist
cultures were found to be less likely to share information
with their groups [21]. In individualist countries, the media
is the primary source of information; in collectivist
countries, the social network is the primary source of
information.
Users from strong uncertainty-avoidance countries
are less likely to download the first app they see on
the list (r = –.211, p = .000). They are also less likely
to download lifestyle apps (r = –.248, p = .000). For
example, app users from Russia, Japan and France
(high uncertainty-avoidance index) are less likely to
download the first app they see on the list and
download lifestyle apps than app users from India,
the United Kingdom and China (low uncertainty-
avoidance index) (Section 5.4 Fig. 15 and Fig. 18). In
previous work, higher uncertainty-avoidance index
cultures are found to take fewer risks and exhibit
hesitancy toward new products and technologies
[11].
However, some correlations are not explained by cul-
tural differences. For example, we find that users from
strong power distance countries are more likely to down-
load music apps (r = .206, p = .000) and users from strong
individualism countries are less likely to download music
apps (r = –.214, p = .000).
Some differences seem to be in contradiction to previ-
ous findings in cultural research. For example, according
to Hofstede, countries with higher indulgence versus re-
straint index tend to be less thrifty. However, Australia,
Canada and the UK, which are the three countries in our
dataset with the highest indulgence versus restraint
scores, are significantly more likely than other countries
to be influenced by price when choosing apps (Section
5.4). Only Mexico appears to follow the trend predicted
by Hofstede as the users are 2.64 times more likely than
other countries to pay for apps to get more features (Sec-
tion 5.4).
Some correlations are predicted by the cultural index
but are missing. For example, in masculine countries,
more nonfiction is read [11]. However, there is no correla-
tion between masculinity index and app types that relate
to nonfiction such as reference, business, and catalogues.
Countries with a high indulgence index are expected to put
more emphasis on leisure enjoyment [11]; however, no
correlation is found between the indulgence versus restraint
index and entertainment-related answers. Countries with
a low individualism index might be expected to show stronger
correlations with answers involving friends and family
(being influenced by friends, downloading apps to interact
with friends or family, downloading an app for someone else
to use, and choosing apps recommended by friends or family)
[11], but this was not found to be the case.
Consequently, this analysis suggests that country dif-
ferences in apps are significant but they are not entirely
consistent with previous findings on cultural differences
nor are they fully explained by those findings.
Many app user behaviors differ across countries.
However, one universal factor worldwide is
app abandonment: all users are very likely to cease using
apps of bad quality (e.g., crashes, too slow, difficult to
use, does not work). It seems that only an effectively en-
gineered app will stand the test of time and become a
popular addition to the mobile device of users. Evidence
for this can also be seen in the participants’ responses
when asked to name the app they spent the most money
on and describe the best and/or worst feature of the app.
One of the most common answers was WhatsApp Messenger
(footnote 35), with very positive feedback on simplicity and ease of
use. This successful app is an example of a well-
engineered, cross-platform app that has been popular for
most of the life of the app store itself. The app allows us-
ers to exchange messages without having to pay for SMS.
Its user base is large and users are satisfied, evident by its
consistently high ranking and a majority of favorable re-
views from users saying it is easy to use and well devel-
oped. In this sense, app development is no different from
other forms of software development: good software
engineering practices matter. (Since the writing of this
paper, WhatsApp was sold to Facebook for $19 billion (footnote 36).)
6.2 Challenges for Software Engineering
Analysis of the survey results suggests that app-based
software development brings new challenges to market-
driven software engineering. In this section, we discuss
the challenges and their implications for software engi-
neering, in the context of our results and challenges sug-
gested by previous research in market-driven software
engineering listed in Table 4.
Table 4. Summary of software engineering challenges from market-
driven software engineering literature
Challenge | References
Volatile requirements due to market changes, competitors and customers | [5, 38]
Communication gaps between marketing and development | [5, 38]
Balancing the influence between marketing and development on requirements decisions | [5, 38]
Limited value of monolithic requirements specifications | [5, 38]
Requirements overloading as requirements volume is potentially very large and continuously growing | [5, 38]
Dependencies among requirements make release planning difficult | [5, 38]
Balancing between elaborate and elementary development processes, and finding tools and solutions that are not too complex | [5, 38]
Requirements often overlap with design, it is difficult to draw a clear line between the phases | [5, 39]
Due to ad hoc processes and lack of documentation, companies rely on low staff turnover to succeed | [5, 40]
Difficulty completely satisfying the end user as the quality level that is considered acceptable is dependent on both the usage and the application domain | [2]
The developing organization makes all decisions but also takes all risks | [38, 41]
Difficulty managing and testing requirements that are often informally described | [38]
Pressure on short time-to-market on initial and subsequent releases, frequent releases | [1, 2, 39]
Large markets | [2]
Users are difficult to identify or initially unknown | [2, 4]
Limited contact with end users | [2, 39, 40, 42]
The need to constantly invent new, selling requirements | [2, 39]
Difficulty in portfolio and product line planning | [43]
Difficulty managing knowledge share with other participants in the software ecosystem | [43]
Challenges architecting for extensibility, portability, and variability | [43]
Difficulty interfacing functionality with other systems that may have differing organizational contexts | [43]
35 http://www.whatsapp.com/
36 http://www.bloomberg.com/news/2014-02-19/facebook-to-buy-mobile-messaging-app-whatsapp-for-16-billion.html
6.2.1 Addressing Packaging Requirements
Packaging requirements such as app description, title,
keywords and screenshots play an important role in app
discovery and download. For example, 43% of users find
apps by searching for keywords and 38% browse ran-
domly to find apps that catch their attention (Section
5.1.4). A number of factors that influence users’ choice of
apps are packaging related, such as app description
(49%), screenshots (30%), app name (17%), and app icon
(13%) (Section 5.3.1). Due to the rapidly increasing num-
ber of apps on the app store, packaging requirements
have a large influence on the visibility of the app and
hence its discoverability and download. As shown in Table 4,
marketing influence and communication have been
identified as challenges by other researchers. However,
the specific challenge of addressing the packaging
requirements of apps in app stores has not been identified
previously.
This challenge is complex, for the packaging require-
ments vary across different countries. Some countries are
more influenced by the packaging of an app. For exam-
ple, when choosing apps, users from China are 2.5 times
more likely than other countries to be influenced by app
name and 2.6 times more likely to be influenced by app
icon (Fig. 19). Equally, the same packaging can be appeal-
ing in one country but not in another. For example, in
Japanese app stores, many apps targeted at adults have
elements of “cuteness” in their icon and interface, which
is in line with the cute culture in Japan [44], but this is not
found in app stores in other countries.
Traditionally packaging requirements were met by
marketing teams. However, app stores have enabled in-
dividual developers and small developer teams to be in-
volved in global market-driven software engineering.
This brings additional responsibilities to developers that
are not within their skill set.
To address this challenge, natural language processing
tools can be used to mine descriptions of existing apps in
the app store for each country and evaluate the develop-
ers’ app description in terms of clarity and attractiveness
as well as to suggest improvement using recommender
systems. For example, existing work in pattern analysis
that uses natural language processing and statistics-based
machine learning to identify news popularity [45] could
be adopted to evaluate app descriptions. Research has
also been conducted to investigate the use of Latent
Dirichlet Allocation (LDA) to evaluate app description
against app behavior [46]. In addition, large-scale data
mining of app stores and local media for a specific coun-
try can be used to automatically suggest popular locale-
specific names, using machine learning and pattern anal-
ysis methods [47].
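As a rough illustration of mining existing descriptions to evaluate a developer’s draft, the sketch below flags terms that appear in a large share of a country’s store descriptions but are missing from the draft. It is a simple document-frequency heuristic, not LDA and not the methods of [45], [46], or [47]; the function names, stopword list, threshold, and sample descriptions are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical minimal stopword list; a real tool would use a per-language list.
STOPWORDS = frozenset({"a", "an", "and", "for", "the", "with"})


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def missing_common_terms(draft, store_descriptions, min_share=0.5):
    """Terms used by at least min_share of store descriptions but absent from the draft."""
    document_frequency = Counter()
    for description in store_descriptions:
        # Count each term once per description.
        document_frequency.update(set(tokenize(description)) - STOPWORDS)
    draft_terms = set(tokenize(draft))
    threshold = min_share * len(store_descriptions)
    return sorted(
        term
        for term, count in document_frequency.items()
        if count >= threshold and term not in draft_terms
    )


# Hypothetical descriptions of popular apps in one country's store.
popular_descriptions = [
    "Free offline dictionary with voice search",
    "Offline translator and dictionary, free forever",
    "Learn words with a free dictionary game",
]
draft_description = "A dictionary for learning new words"

suggestions = missing_common_terms(draft_description, popular_descriptions)
print(suggestions)
```

Running the same check against descriptions collected per country would give country-specific suggestions, which is the point of the locale-specific mining described above.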
Research could also be conducted to develop tools that
can automate or semi-automate app packaging design.
For example, techniques in search-based software engi-
neering, in particular, evolutionary computation tech-
niques such as genetic algorithms that have been used to
generate attractive art [48], can be adapted to generate
country-specific attractive app icons and app graphics
based on existing icons in the app store for each country.
This tool would be particularly useful for countries such
as China where users are highly influenced by visibility.
6.2.2 Managing Vast Feature Spaces
Traditional market-driven software tends to offer a large
feature set in order to meet all of the users’ anticipated
needs, and the number of features increases as new ver-
sions are released, and such releases may be very fre-
quent (these have previously been identified as challeng-
es as shown in Table 4). Mobile apps tend to be highly
specific with very few features, and developers release
new updates frequently in order to engage with and retain
their customer base (footnote 37). Our study shows that users’
preferences for features also differ across countries. For
example, users in India are 3 times more likely to down-
load education apps and users in Germany are 2 times
more likely to download reference apps (Section 5.4). De-
velopers face the challenge of selecting an optimal and
small subset of features or combination of features in a
very large feature space and the ability to tailor the fea-
tures for each country. The challenge is as much about
which features to omit as which to include. This is an in-
teresting contrast to the “requirements overloading” chal-
lenge listed in the literature for more general market-
driven software engineering (Table 4).
To address the challenge, insights from country and
culture differences can be used to inform app feature
selection and tailoring. For example, a medical app for
personal use by an adult user in an individualist country can
be tailored for a highly collectivist country by including
features that enable a user to help care for
their elderly parents [11].
37 https://blog.kissmetrics.com/mistakes-in-app-marketing/
App users have a wide range of needs. For example,
users look for apps when they need to know something
(55%), want to be entertained (54%), feel bored (45%), and
need to do something (42%) (Section 5.2.1). Although us-
ers download apps mainly to be entertained (58%) or to
carry out a task (51%), a large proportion of users also
download apps out of curiosity (35%); curiosity is the
fourth most popular reason for app downloads (Section
5.2.2). Techniques from creative requirements engineering
can be used to invent features for apps that will catch a
user’s interest. (Creative requirements engineering is the
use of creative thinking techniques including random
idea combination, analogical reasoning and storyboard-
ing as part of a requirements process [42].) Creative re-
quirements engineering can be applied in all app types
and is particularly useful for those with large demand
and supply such as games (downloaded by 60% of users)
and social networking apps (downloaded by 55% of us-
ers) (Section 5.2.3). User needs and trends change quickly:
38% of users abandon apps because they are bored of
them and 44% of users abandon apps because they are no
longer needed (Section 5.3.4). Indeed, volatile requirements
due to market changes are listed as a challenge in
the literature (Table 4). Techniques used in evolutionary
computation to automate the creative process of produc-
ing design can be applied to the feature space to evaluate
and suggest interesting combinations of app features [49].
The challenge of managing vast feature spaces is aided
by their method of sale. In traditional market-driven
software engineering some of the known challenges in-
clude limited contact with end users, difficult-to-identify
users, and communication gaps between marketing and
development (Table 4). These challenges are now much
reduced because app stores provide an unprecedented
opportunity for researchers to access a large amount of
historical data about app features, user preferences, and
download patterns. These data can be mined and used to
support the requirements engineering process, an approach
that is more cost-effective and scalable than market
research and focus groups. Recent research has investigated
the use of data mining to extract user requirements from
app reviews [7, 50]. Natural language processing can be
used to automatically identify useful information such as
bugs and feature requests from the large amount of textu-
al reviews. In addition, data mining and pattern analysis
can also be used to identify features that are popular, and
can be used to predict trends and changing needs. Rec-
ommender systems have been applied to large-scale re-
quirements elicitation (e.g., [31, 51]). The large amount of
data in the app store suggests that requirements engineer-
ing researchers can build an extensive app user profile,
and use recommender systems for requirements elicitation
for apps. Recommender systems can also be used to
identify features that are popular in one country and
bring them to another country with similar profile interests.
Some app stores have already begun implementing
country-specific recommender systems to suggest apps to
users in the same region.
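To make the idea of mining textual reviews concrete, the sketch below labels reviews as bug reports or feature requests with a keyword heuristic and tallies the labels per country. This is a deliberately simple stand-in, not the natural language processing used in [7, 50]; the cue lists, function names, and sample reviews are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical cue phrases; a real system would learn these from labeled reviews.
BUG_CUES = ("crash", "freeze", "bug", "error", "slow")
FEATURE_CUES = ("please add", "would be nice", "wish", "should support")


def classify_review(text):
    """Return the set of labels whose cue phrases occur in the review text."""
    lowered = text.lower()
    labels = set()
    if any(cue in lowered for cue in BUG_CUES):
        labels.add("bug_report")
    if any(cue in lowered for cue in FEATURE_CUES):
        labels.add("feature_request")
    return labels or {"other"}


def count_labels_by_country(reviews):
    """reviews: iterable of (country, review_text) pairs."""
    counts = defaultdict(Counter)
    for country, text in reviews:
        counts[country].update(classify_review(text))
    return counts


# Hypothetical reviews, not survey data.
sample_reviews = [
    ("Brazil", "It crashes every time I open it"),
    ("Brazil", "Too slow, and please add offline mode"),
    ("India", "Great app"),
]
label_counts = count_labels_by_country(sample_reviews)
for country, counter in sorted(label_counts.items()):
    print(country, dict(counter))
```

Per-country tallies of this kind could feed the prioritization of fixes and features for each market.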
Finally, research in end-user programming [52] can
investigate simple methods that can engage users to
customize the content and features of the apps and compose
different apps together to meet their goals. Apps that can
interface with other apps (e.g., document editor
apps that can interface with cloud storage apps) do better.
Techniques should be developed to enable the develop-
ment of highly customizable apps where users can “turn-
on” and only pay for features or content that they need
(today implemented via in-app purchases).
6.2.3 Meeting High Quality Expectations
App users have high expectations on the usability and
performance of apps and tend to be unforgiving when an
app fails to meet their expectations. For example, 34% of
users stop using an app because it is too slow, 30% of us-
ers stop using an app because they cannot get it to work,
26% of users stop using an app because they found it dif-
ficult to use, and 25% of users stop using an app because
they found the advertisements annoying (Section 5.3.4).
However, users from different countries have differing
concerns about app quality. For example, users from Bra-
zil and Spain are 2 times more likely than other countries
to stop using an app because it crashes and users from
Brazil are also 2 times more likely than other countries to
stop using an app because it is slow (Section 5.4). The
difficulty of completely satisfying the end user has always
been a challenge in market-driven software engineering
(Table 4), but due to the large number of competing mo-
bile apps, the challenge of meeting high quality expecta-
tions has more severe consequences for app developers,
i.e., their app may be abandoned.
With so many apps offering the same features, 39% of
users abandon apps because they found better alterna-
tives (Section 5.3.4). A large number of apps offering the
same or similar features also means that non-functional
requirements determine if an app will be downloaded
and used. Users can assess non-functional requirements
from app description, screen shots and ratings and re-
views from other users. As a result, non-functional re-
quirements have become, in some instances, more im-
portant than functional requirements.
To address this challenge, requirements engineering
researchers need to develop effective techniques to cap-
ture non-functional requirements for apps, taking into
account the country differences in priorities of the
requirements. Requirements prioritization methods for
non-functional requirements for commercial off-the-shelf
software and the NFR framework can be adapted to
prioritize country-specific non-functional requirements [53,
54]. Techniques in data mining and recommender
systems mentioned in the previous sections can also be used
to identify and prioritize non-functional requirements.
There is also the need to develop methods to quantitative-
ly evaluate apps against their non-functional require-
ments.
6.2.4 Managing App Store Dependency
Traditional market-driven software can be sold via multi-
ple channels such as directly through the software vendor
or via other software vendors and resellers, and in soft or
hard copies. The challenge of “dependencies among
requirements making release planning difficult” has been
noted by other researchers for market-driven software
(Table 4). However, mobile apps have a new and very
specific dependency which may override all others.
Mobile apps can only be sold via the app store of the
platform they are developed for. Although “jail-broken”
platforms (footnote 38) exist, fewer than 1% of users reported using such
platforms (Section 5.1.1). Apps are governed by app store
guidelines, which are frequently updated and vary across
app stores. Apps that do not adhere to the guidelines will
be removed from the store, which makes the success or
failure (or even existence) of an app highly dependent on
the app store. For example, AppGratis, an app that rec-
ommends other apps to app users (which was used by 12
million iOS users and developed by a team of 45), was
removed from the iOS app store because of a new app
store guideline that stated that “Apps that display Apps
other than your own for purchase or promotion in a
manner similar to or confusing with the App Store will be
rejected.” (footnote 39) As such, developers need to consider app
stores as important stakeholders during requirements
elicitation and be alert and responsive towards changes in
app store guidelines.
There are differences in app store uses across countries
and those differences change rapidly. For example, Japan
has its own app store system (Section 5.1.1). One app
store in Japan provides a “smart pass” where users pay a
monthly fee to access a selection of apps at no additional
charge (footnote 40). There are even Japanese app stores
specifically designed for girls (footnote 41). Laws within
different countries can cause app
store rules and regulations to change (e.g., the need for
FDA approval in the USA can affect medical apps). Some
rules apply only in some countries and not others. It is
possible for the functionality of an app to contravene the
customs or laws of some countries (e.g., those concerning
religion or freedom of speech) and for the app to be banned
in those countries. Consequently, there may be unanticipated
costs and benefits of developing for each platform and country,
which developers should consider when planning app
projects.
Techniques to model app store guidelines such that
app specifications can be verified to meet the guidelines
before the app is developed would be very useful to keep
developers from investing time developing apps that will
be rejected from the store. As guidelines change frequent-
ly and are different across platforms and countries, and
enforced at different levels of rigor, requirements
traceability tools are needed to track different versions of
guidelines, guidelines for different countries, and app
specifications to ensure continuous alignment between the
features offered by an app and the app store guidelines.
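One way to picture the guideline modeling described above is to represent each guideline as a versioned, store-specific and optionally country-specific rule that can be checked against an app specification before development starts. The sketch below is only an illustration under that assumption: the store name, guideline identifiers, versions, rules, and specification fields are hypothetical and do not reproduce any real app store guideline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Guideline:
    guideline_id: str
    store: str
    version: str                      # tracked so specs can be re-checked on change
    countries: FrozenSet[str]         # empty set means "applies in every country"
    description: str
    is_satisfied: Callable[[Dict], bool]


def violations(spec: Dict, guidelines: List[Guideline], store: str, country: str):
    """Return the guidelines of a store, applicable in a country, that the spec violates."""
    return [
        g for g in guidelines
        if g.store == store
        and (not g.countries or country in g.countries)
        and not g.is_satisfied(spec)
    ]


# Hypothetical guidelines for a hypothetical store.
guidelines = [
    Guideline(
        "G1", "ExampleStore", "2013-04", frozenset(),
        "App must not promote third-party apps in a store-like manner",
        lambda s: not s.get("promotes_third_party_apps", False),
    ),
    Guideline(
        "G2", "ExampleStore", "2014-01", frozenset({"US"}),
        "Medical apps need regulatory approval",
        lambda s: s.get("category") != "medical" or s.get("has_regulatory_approval", False),
    ),
]

# Hypothetical app specification.
app_spec = {
    "category": "medical",
    "promotes_third_party_apps": True,
    "has_regulatory_approval": False,
}

us_violations = [g.guideline_id for g in violations(app_spec, guidelines, "ExampleStore", "US")]
jp_violations = [g.guideline_id for g in violations(app_spec, guidelines, "ExampleStore", "JP")]
print(us_violations, jp_violations)
```

Keeping the version and country on each rule is what would let a traceability tool re-run the check whenever a guideline changes or the app targets a new country.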
38 Jail-breaking allows the download of additional applications, exten-
sions, and themes that are unavailable through the official platforms.
39 http://appgratis.com/blog/2013/04/09/appgratis-pulled-from-the-app-store-heres-the-full-story/
40 http://www.au.kddi.com/english/content/#smartpass
41 http://www.medias-joshibu.net/
Software developers often wish to make their apps
available on multiple mobile app platforms due to the
distribution of users across different app platforms. For
example, 39% of users use Google Play, 22% use Apple
iOS, 15% use Nokia, and 13% use Samsung (Section 5.1.1).
However, they face difficulty having to port an app from
the source platform to the target platforms due to differ-
ences in hardware specifications (screen size, resolution,
memory), software architecture, API, programming lan-
guages and app store guidelines.
To address this challenge, techniques from software
product line engineering can be used to develop apps
that have very few features and need to be released and
updated in quick succession across platforms. These
techniques should be ultra-lightweight in comparison with
traditional techniques used by large companies that have
long term return on investment [55] and should support
short deadlines and the ability to be responsive to market
pressure and trends [55, 56]. Feature maps can be used as
a lightweight method for defining a feature space of op-
tions as well as assessing the value of a particular subset
of those options [57]. Recent research in search-based
software engineering has investigated the use of an
indicator-based evolutionary algorithm to maximize the use
of user preference knowledge and optimize feature selection
[57]. Techniques to model and visualize feature space for
apps would be very useful to support optimization.
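As a minimal sketch of what defining a feature space and assessing the value of a subset could look like, the example below scores every subset of a small feature map and keeps the most valuable one that fits a development budget. It uses exhaustive search, not the indicator-based evolutionary algorithm of [57], and the feature names, costs, and values are hypothetical.

```python
from itertools import combinations

# Hypothetical feature space: name -> (development cost, estimated user value).
FEATURES = {
    "offline_mode": (3, 5),
    "social_sharing": (2, 4),
    "in_app_purchase": (4, 6),
    "push_notifications": (1, 2),
}


def best_subset(features, budget):
    """Exhaustively pick the feature subset with the highest value within budget."""
    best, best_value = frozenset(), 0
    names = sorted(features)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            cost = sum(features[name][0] for name in subset)
            value = sum(features[name][1] for name in subset)
            if cost <= budget and value > best_value:
                best, best_value = frozenset(subset), value
    return best, best_value


chosen, value = best_subset(FEATURES, budget=6)
print(sorted(chosen), value)
```

Exhaustive search is only practical for the very small feature sets typical of lightweight apps; larger feature spaces are where search-based techniques become relevant.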
Research in product line software engineering should
also be conducted to support variability in app platforms
[58]. For example, recent work by Gokhale et al. has
developed a technique to systematically infer likely
mappings between the graphics APIs of Java 2 Platform,
Micro Edition and Android [59]. Techniques from software
product lines should be applied to enable strategic
and systematic reuse to support the release of an app to
different platforms, particularly platforms with high user
distribution. Researchers should also investigate the de-
velopment of a meta-language such that an app can be
developed once and then deployed on different plat-
forms, in different architectures, and in different configu-
rations.
6.2.5 Addressing Price Sensitivity
App users are highly sensitive towards app prices. For
example, 57% of users do not pay for apps and 19% pay
for apps only if they cannot find free apps with similar
features (Section 5.3.3). Price is the most important
influence on app choice (Section 5.3.1), and users from
some countries are more likely than others to be influenced
by price when choosing apps (e.g., users from the UK are
3 times more likely and users from Canada 2 times more
likely) (Section 5.4).
In the past, it was difficult to obtain accurate
information about product prices and number of purchases
and their variations over time, making pricing a
challenging and rarely studied topic in software engineering.
However, in many app stores, the daily price and number
of downloads for each app are publicly available,
providing researchers with new opportunities to study pricing
and develop predictive models of the effects of pricing
changes on downloads. Such studies can help developers
identify the optimal price point for their apps, which
should vary according to the country of sale, as users
from different countries are receptive to different price
points. The large amount of data also makes it possible
to develop accurate predictive statistical models of
complex country-specific user behavior towards
prices [60]. Previous researchers have identified the
challenge that “requirements often overlap with design, it is
difficult to draw a clear line between the phases” (Table
4). In this case price clearly impacts design decisions and
requirements prioritization. For example, the design process
of a free app with incrementally added paid content
differs from the design process of a free or paid app.
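To make the notion of a country-specific price point concrete, the sketch below fits a simple demand curve per country and derives the revenue-maximizing price. The exponential demand form, downloads = a * exp(-b * price), is an assumption chosen for illustration and is not the model of [60]; the country labels and observations are synthetic, generated from the assumed model rather than taken from the survey or any app store.

```python
import math


def fit_demand(prices, downloads):
    """Least-squares fit of ln(downloads) = ln(a) - b * price; returns (a, b)."""
    logs = [math.log(d) for d in downloads]
    n = len(prices)
    mean_p = sum(prices) / n
    mean_l = sum(logs) / n
    sxx = sum((p - mean_p) ** 2 for p in prices)
    sxy = sum((p - mean_p) * (l - mean_l) for p, l in zip(prices, logs))
    slope = sxy / sxx
    return math.exp(mean_l - slope * mean_p), -slope


def revenue_maximizing_price(b):
    """Revenue price * a * exp(-b * price) peaks where 1 - b * price = 0."""
    if b <= 0:
        raise ValueError("demand must fall as price rises")
    return 1.0 / b


# Synthetic observations generated from the assumed model, not survey data.
PRICES = [0.99, 1.99, 2.99, 3.99]
SYNTHETIC_SENSITIVITY = {"country_x": 0.8, "country_y": 0.4}

for country, b_true in SYNTHETIC_SENSITIVITY.items():
    downloads = [1000.0 * math.exp(-b_true * p) for p in PRICES]
    a, b = fit_demand(PRICES, downloads)
    print(country, round(revenue_maximizing_price(b), 2))
```

Under this assumed model, the more price-sensitive country (larger b) has the lower optimal price, which is one simple way the optimal price point can vary by country of sale.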
Finally, research in software product lines can develop
methods to support common pricing-related variability in
function, such as a free version of an app with adverts,
a free version with limited features, a free version
with in-app purchases, a paid version, and a paid version
with in-app purchases.
6.2.6 Balancing Ecosystem Effects
Traditionally, software vendors function as relatively
independent units whose performance is largely dependent
on product features, reputation, and marketing
efforts [43]. For example, software houses involved in
market-driven software engineering build reputation and
the reputation influences users’ buying decisions (e.g.,
Microsoft, Norton). In contrast, the developer’s identity is
the least important factor that influences a user’s app
choice (Section 5.3.1). App stores have created a software
ecosystem where vendors have become networked and
their success or failure is highly dependent on one another
and on app users who can influence the sale of their apps.
For example, users are highly influenced by other users
when choosing apps: other users’ reviews (48%), their
ratings (46%), the number of existing users (29%), and the
number of ratings (27%) (Section 5.3.1). As a result, an
app that has received good reviews can receive more
downloads and, in turn, even more ratings and
reviews. Reviews can be positive or negative. For example,
34% of users rate an app in order to tell other users an
app is good, and 20% do so to warn other users about a
bad app (Section 5.3.2).
Recommendation by friends or family is one of the top
reasons for downloading apps (36%), more so than other
forms of publicity such as being mentioned in the media
(20%), featured in the app store (19%) or in the top down-
loads chart (17%). This result is consistent with results
from consumer research, which found that consumers
trust “earned media” such as word-of-mouth and rec-
ommendations from friends or family, above all other
forms of advertising42.
42 http://www.nielsen.com/us/en/press-room/2012/nielsen-global-
consumers-trust-in-earned-advertising-grows.html
With the importance of “earned media”, there is a need
to develop techniques to effectively elicit feedback,
reviews and ratings from users. The elicitation strategies
need to be country specific because users’ app rating
behavior differs across countries. For example, users from
Australia, Canada, and Japan are more likely not to rate
apps, users from China are 6 times more likely to rate apps,
users from India are more likely to rate an app if someone
asks them to do so, and users from Mexico are more likely
to rate an app if asked by the app (Section 5.4). As such,
apps can be configured to elicit reviews from users
more proactively, and creative methods that leverage the
user’s social network should be developed. In addition,
with so much data available, there is a need for tools to
manage and analyze user feedback, identify unmet needs,
and prioritize the needs based on level of demand.
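A country-specific elicitation configuration could be as simple as the sketch below. Only the country tendencies for India and Mexico come from Section 5.4; the strategy names, the default, and the function are hypothetical and shown only to illustrate the idea.

```python
# Hypothetical strategy names; only the country tendencies come from Section 5.4.
PROMPT_STRATEGY = {
    "India": "invite_from_friend",  # more likely to rate if someone asks them
    "Mexico": "in_app_prompt",      # more likely to rate if asked by the app
}
DEFAULT_STRATEGY = "passive_store_link"  # assumed fallback for other countries


def elicitation_strategy(country):
    """Return the review-elicitation strategy configured for a country."""
    return PROMPT_STRATEGY.get(country, DEFAULT_STRATEGY)


print(elicitation_strategy("Mexico"))
```

Keeping the mapping in data rather than in code would let a vendor adjust strategies per country as new evidence on rating behavior becomes available.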
Challenges previously identified in the literature
relating to software ecosystems include portfolio
planning, knowledge management with other participants
in the ecosystem, architecting sufficient flexibility,
and integrating functionality with other systems (Table
4). These challenges reflect only part of our challenge of
balancing ecosystem effects: in addition to interactions
between vendors and their apps, the interaction of
users and app stores also has a significant
impact. App vendors have to consider their strategic role
in the software ecosystem to survive [43]. Addressing this
challenge requires an understanding of complex app eco-
systems and the network effects of all the players, which
are themselves challenging research topics.
In our previous research, we have developed multi-
agent systems and artificial life simulations to understand
interactions between developers, users and apps, and
specifically the effects of publicity and developer
strategies on app downloads and ecosystem health [61-64]. The
data collected from this study can be used to provide a
more accurate model of mobile app ecosystems, in partic-
ular user profile differences across countries. Using our
data in combination with historical data from the app
stores, there is a potential to develop a tool that can
estimate the performance of an app and explore pricing
strategies for the app during its planning and development
phases. We can leverage knowledge from interdisciplinary
research such as biology and artificial life,
where such predictions are often used to understand
natural ecosystems.
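The reinforcing loop between ratings and downloads described in this section can be illustrated with a toy simulation. This is not the multi-agent or artificial life models of [61-64]; it is a deterministic sketch that assumes each step's downloads are split across apps in proportion to their existing ratings, and that a fixed fraction of downloads turns into new ratings. All parameter values are hypothetical.

```python
def simulate(initial_ratings, steps, downloads_per_step=100.0, rating_rate=0.1):
    """Split each step's downloads in proportion to (1 + ratings) per app."""
    ratings = list(initial_ratings)
    downloads = [0.0] * len(ratings)
    for _ in range(steps):
        weights = [1.0 + r for r in ratings]
        total = sum(weights)
        for i, weight in enumerate(weights):
            share = downloads_per_step * weight / total
            downloads[i] += share
            # A fixed fraction of new downloads becomes new ratings.
            ratings[i] += share * rating_rate
    return downloads, ratings


# Two otherwise identical apps; the first starts with ten ratings.
downloads, ratings = simulate([10.0, 0.0], steps=50)
print([round(d) for d in downloads])
```

In this sketch the app with the early head start keeps attracting the larger share, so the absolute gap in downloads widens with every step. Country-specific rating rates from the survey could replace the single assumed `rating_rate`.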
7 THREATS TO VALIDITY
Considerable care and attention have been taken to ensure
the rigor of this work, but as with any chosen research
methodology, it is not without limitations. One common
issue in survey research is non-response bias. This is
where the behaviors of users who responded differ from
the behaviors of those who did not respond. Due to the
scale of our survey, we were unable to follow up
non-respondents and ask for their reason for non-response.
However, we found that most respondents who did not
complete the survey did not use apps. Thus, it was likely
that people who did not respond to the survey did not
use apps, hence the sample is unlikely to be subjected to
systematic bias among non-<