Positive deviance, big data, and development: A systematic
literature review
Basma Albanna | Richard Heeks
Centre for Development Informatics, University of Manchester, Manchester, UK
Correspondence: Basma Albanna, Centre for Development Informatics, University of Manchester, Manchester, UK.
Positive deviance is a growing approach in international development that identifies
those within a population who are outperforming their peers in some way, eg, children
in low-income families who are well nourished when those around them are
not. Analysing and then disseminating the behaviours and other factors underpinning
positive deviance are demonstrably effective in delivering development results. However,
positive deviance faces a number of challenges that are restricting its diffusion.
In this paper, using a systematic literature review, we analyse the current state of positive
deviance and the potential for big data to address the challenges facing positive
deviance. From this, we evaluate the promise of “big data-based positive deviance”:
This would analyse typical sources of big data in developing countries (mobile phone
records, social media, remote sensing data, etc) to identify both positive deviants and
the factors underpinning their superior performance. While big data cannot solve all
the challenges facing positive deviance as a development tool, they could reduce time,
cost, and effort; identify positive deviants in new or better ways; and enable positive
deviance to break out of its current preoccupation with public health into domains
such as agriculture, education, and urban planning. In turn, positive deviance could
provide a new and systematic basis for extracting real-world development impacts
from big data.
Keywords: big data, developing countries, machine learning, mobile data, positive deviance, systematic
literature review
Many development practitioners continue to use a traditional “needs-based” approach to development, involving top-down identification of needs
and problems, and the external imposition of solutions that meet those needs. This type of approach can work well in addressing specific technical
challenges. But it works much less well where development requires learning and behavioural change by beneficiary groups, something which
necessitates much greater knowledge of and engagement with beneficiary communities (Nel, 2018; Pascale, Sternin, & Sternin, 2010; Saïd Business
School, 2010; Singhal, 2011). As a result, more bottom-up “asset-based” approaches have come into existence, which capitalize on a
community's inherent assets and capabilities, including knowledge, in solving development problems. Positive deviance (PD) is one such asset-based
approach. It is based on the observation that in every group or community, a few individuals use uncommon practices and behaviours to
achieve better solutions to problems than their peers who face the same challenges and barriers (Pascale et al., 2010). Those individuals are
referred to as “positive deviants” (PDs), and adopting their solutions on a wider basis is referred to as the PD approach.
Received: 4 October 2018 Accepted: 4 October 2018
DOI: 10.1002/isd2.12063
E J Info Sys Dev Countries. 2019;85:e12063.
© 2018 John Wiley & Sons
The term “positive deviance” was first used in 1976 to describe a practical strategy for the design of food supplementation programmes in
Central America, a strategy that was derived endogenously rather than exogenously through identifying dietary practices developed by mothers
in low-income families who had well-nourished children (Wishik & Van Der Vynckt, 1976). The results of this study were not widely publicized,
limiting uptake. It was not until the 1990s that PD started to be seen as a credible strategy for nutrition research and action, based on an
accumulation of evidence of impact (Sternin, Sternin, & Marsh, 1997; Sternin, Sternin, & Marsh, 1998; Zeitlin, 1991). The 1990s also witnessed
its first large-scale adoption in international development by Save the Children, which used PD as a strategy to reduce malnutrition in Vietnam,
rehabilitating an estimated 50 000 malnourished children in 250 communities (Sternin, 2002). But it only really started to attract attention
in the 2000s, when Sternin and collaborators promoted PD more broadly as an asset-based approach for social change and demonstrated how
it can be operationalized across a variety of development domains (Sternin & Choo, 2000; Sternin, 2002). Since the early 2000s, PD has been
applied across multiple development domains, with public health being the most prominent.
As will be discussed in further detail below, PD tends to rely on in-depth primary data collection in identifying PDs and then community
mobilization in disseminating and scaling successful practices. Identification is therefore time and labour intensive, with costs proportional to
sample size (Felt, 2011; Lapping, Marsh et al. 2002; Marsh, Schroeder, Dearden, Sternin, & Sternin, 2004). As a result, PD has traditionally made
use of relatively small-scale samples. Statistically and practically, this can make it harder to identify positive deviants, given their relative rarity
(Marsh et al., 2004). It also limits the ability to accurately generalize the identified practices to larger populations (Marsh et al., 2004). Path
dependency has also been evident in uptake of the PD approach, in terms of geographical distribution and domain of application, with most
studies concentrated in a few countries of Asia and in addressing malnutrition: the region and domain where it was initially introduced and
practised by Sternin et al. (1997).
Given these and other challenges and limitations, there are obvious opportunities for innovation in positive deviance. Our particular
interest here is in the innovative opportunities offered by big data: the increasing amounts of data about what we are and what we
do and what we say, generated from digital devices, which provide an opportunity to gather insights into human behaviour. If big
data can provide insights into behaviour, then big data analytics could identify patterns of abnormal behaviour: variances from the
average collective behaviour of observed units which could include the behaviours of those which a PD approach would define as positive
deviants.
In this paper, we therefore investigate the potential of “big data-based positive deviance” (BDPD). Our particular interest flows from
the line of argument above that there are challenges for the traditional positive deviance approach which big data might be able to address.
But there is also a converse interest that positive deviance might represent a new approach to the extraction of development value from
big data.
To investigate this potential, we have undertaken a systematic literature review (SLR) of the empirical applications of positive deviance and of
big data in developing country contexts, in order to answer three questions. First, how is PD currently being applied in development? In particular,
we seek to identify from the literature challenges in that application which new approaches might seek to address. Second, how is BD currently
being applied in development? We investigate this particularly in light of the challenges to positive deviance identified earlier; but we also extract
challenges in use of big data. Third, what development value might result from the combined use of BD and the PD approach? Here, we combine
the findings of both literature reviews to address the interests expressed above: not only how big data can address PD challenges but also how PD
might be a valuable approach to the use of big data in development.
The paper begins with a brief description of the method used in conducting the literature review, followed by three sections that answer each
of the questions in turn: a presentation of the findings from our PD and then BD literature review before concluding with a discussion about big
data-based positive deviance.
2.1 | Literature search and selection
A systematic review of the literature was conducted using an adaptation of the Preferred Reporting Items for Systematic Reviews and
Meta-Analyses (PRISMA) protocol (Moher, Liberati, Tetzlaff, & Altman, 2009). The review included academic, peer-reviewed, English-language
literature that reported empirical results using secondary or primary data sources from developing countries. The literature search was
implemented using Google Scholar because (1) it is free and easy to access, making the SLR reproducible; (2) both PD literature and BD literature
are multidisciplinary so it was important to use a non-disciplinary comprehensive base of literature; and (3) Google Scholar has the widest coverage
of academic articles in comparison to other search engines and databases (Khabsa & Giles, 2014). The utilized search strings and strategy are
summarized in Table 1.
To retrieve relevant studies, we used the intitle operator, which ensures that the title of the retrieved articles would include the words
following the operator. We also used AND and OR Boolean operators to ensure the existence of key terms in the text of the articles, thus
reducing the time required in screening irrelevant sources. For example, in the PD literature search, the words “positive deviance,” “positive
deviant,” or “positive deviants” were used with the “intitle” operator, thereby targeting articles that have PD as the central theme. The words
“study,” “empirical,” “practice,” “experimental,” “survey,” or “fieldwork” were used for the in-text search to exclude articles not providing empirical
evidence. The same search strategy was used to retrieve the BD literature but with a simpler search string found to deliver the corpus of literature
suitable for analysis. To ensure that key studies were not excluded, backward snowballing of relevant articles was employed, and it led to the
identification of one additional PD article and 22 additional BD articles. A total of 75 articles were included in the final corpus of analysis: 41
PD articles and 34 BD articles. Figure 1 reports on the identification and selection protocol.
Backward snowballing involves screening the reference lists of relevant literature review articles to look for additional literature. It is considered an effective
technique for conducting complex systematic literature reviews (Greenhalgh & Peacock, 2005).
The number of BD articles identified by backward snowballing was much greater because many relevant big-data-for-development articles either do not specifically
use the words “big data” in their title (eg, they use “mobile data” or “satellite images”) or do not specifically use the words “developing countries” in their text (eg, they
use the name of a particular country or a development goal).
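As an illustration only, the search strings summarized in Table 1 could be assembled programmatically along the following lines. This is a minimal sketch: the helper name build_query and the exact quoting of phrases are assumptions made for the example, and the code only constructs the strings; it does not query Google Scholar.

```python
# Hypothetical helper for composing the kind of search strings listed in Table 1.
# The quoting convention (double quotes around multi-word phrases) is an assumption.

EMPIRICAL_TERMS = ["study", "empirical", "practice", "experimental", "survey", "fieldwork"]


def build_query(title_phrase, text_terms, joiner="OR"):
    """Return a string of the form: intitle:"phrase" (term1 JOINER term2 ...)."""

    def quote(term):
        # Quote only multi-word terms so they are treated as exact phrases.
        return f'"{term}"' if " " in term else term

    body = f" {joiner} ".join(quote(term) for term in text_terms)
    return f'intitle:"{title_phrase}" ({body})'


# Three PD title variants, each combined with the empirical in-text terms.
pd_queries = [
    build_query(phrase, EMPIRICAL_TERMS)
    for phrase in ("Positive deviance", "Positive deviants", "Positive deviant")
]

# The simpler BD string joins its in-text terms with AND.
bd_query = build_query("Big data", ["developing countries", "application"], joiner="AND")

for query in pd_queries + [bd_query]:
    print(query)
```

Keeping the title phrase and the in-text terms as separate arguments mirrors the two-part structure of the strings: the first part constrains the title, and the bracketed part constrains the body text.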
FIGURE 1 Flow diagram for identification
and selection of positive deviance (PD) and
big data (BD) articles (adapted from the
PRISMA protocol)
TABLE 1 Google Scholar search strategy used for the literature search

Positive Deviance Literature
Search String    Str1: intitle:"Positive deviance" (study OR empirical OR practice OR experimental OR survey OR fieldwork)
                 Str2: intitle:"Positive deviants" (study OR empirical OR practice OR experimental OR survey OR fieldwork)
                 Str3: intitle:"Positive deviant" (study OR empirical OR practice OR experimental OR survey OR fieldwork)
Time period      1970-2017
Exclude          Patents, citations, and non-English results

Big Data Literature
Search String    intitle:"Big data" ("developing countries" AND application)
Time period      1998-2017
Exclude          Patents, citations, and non-English results

The first part of each search string looks for terms in the title of a paper; the terms in brackets search within the text of the paper.
The term “developing countries” was not used in the PD search as initial investigation suggested that the majority of studies were in those countries, and
we could then manually exclude those that were outside scope. Conversely, it was used in the BD search as the majority of BD literature was not in developing
countries, and so the term was seen to be a useful means to quickly narrow the search to more-relevant items. As noted below, however, the term
itself was in practice rather narrow and served to omit a number of relevant studies which then had to be manually identified and included via backward
snowballing.
To the best of our knowledge, the term “big data” was first introduced in 1998 (Mashey, 1998), thus setting the boundary for the search period.
2.2 | Content analysis
NVivo was used for the qualitative and quantitative content analysis of the selected articles. For each article in the PD and BD corpus, the following
attributes were identified and used for classification: title, year of publication, research methodology, research approach, types of data used,
sample area (ie, rural or urban), sampling unit, country, region, and study duration (if stated). Those attributes were derived based on a mix of
commonly used data fields in systematic literature review and an iterative process of attribute selection depending on what arises as an important
variable for the topic of analysis (Okoli, 2015; Petticrew & Roberts, 2005). Additionally, articles were coded into several nodes based on the areas
covered in the qualitative analysis: those areas identified iteratively based on our overall purpose of understanding the potential development
value of combining big data- and positive deviance-based analysis. Those areas can be summarized into (1) challenges and limitations, (2) benefits,
(3) conceptual frameworks, (4) methods and data, (5) research findings, and (6) research opportunities. The following sections do not report all
content analysis but only those main elements seen as relevant to the purpose of this paper.
According to the Positive Deviance Initiative (Springer, Nielsen, & Johansen, 2016), the successful application of PD has been reported in more
than 60 countries across the globe with a total outreach of more than 30 million individuals for the period between 1990 and 2016. Applications
include reducing childhood malnutrition, enhancing school retention, eliminating neonatal mortality, limiting HIV transmission, improving
salesforce productivity, fighting against female genital cutting, enhancing health care services, reducing transmission of antibiotic resistant bacteria
in hospitals, and enhancing pregnancy outcomes. The central premise of the PD approach is that it harnesses the inherent wisdom of individuals
existing within a community to develop solutions to their own problems. And since solutions come from the people, they take into account
contextual and cultural variables, making them less vulnerable to social rejection. PD is also considered an efficient approach within international
development, because it reduces reliance on aid and external expertise and instead capitalizes on local resources and know-how. It can also
generate local engagement in identifying and disseminating practices and is seen as creating self-efficacy (individuals' belief in their capacity to
execute behaviours necessary to achieve a desired objective), often considered to be a key influencer in the adoption of recommended behaviours
(Babalola, 2007; Babalola, Awasum, & Quenum-Renaud, 2002).
Much of the positive deviance literature has a (forgiving the pun) rather positive, even proselytizing tone. Balancing this, there are some
more critical insights with three particular concerns being raised. First is a concern that, compared with its practical application, the ideas of
positive deviance lack conceptual clarity, with papers using different definitions and with limited theorization of positive deviance (Herington &
van de Fliert, 2018). Second is a concern that positive deviance does not always work in practice. Problems have included difficulties in identifying
PDs (Marsh et al., 2004) and/or their differential characteristics and behaviours (Bradley et al., 2009; Felt, 2011) and inability to scale the PD
solutions across a community and, particularly, between communities (LeMahieu, Nordstrum, & Gale, 2017). These concerns overlap significantly
with material on the third area of concern: practical challenges to the implementation of positive deviance, a topic discussed further below as an
outcome of the SLR. In sum, though, one may conclude that there has yet to be a weight of critique sufficient to discredit positive deviance as a
development approach or to identify aspects necessary and inherent to PD that would undermine it. Conversely, there is a growing weight of
evidence demonstrating beneficial development outcomes emerging from its application.
That application generally follows the five steps of the PD methodology, which can be outlined as follows (Positive Deviance Initiative, 2010):
1. Defining the problem and determining desirable outcomes;
2. Discovering PDs, ie, individuals or other social entities who unexpectedly achieved the desired outcomes;
3. Determining the underlying practices that led to those outcomes (this is known as positive deviance inquiry [PDI]);
4. Designing interventions to enable others to access and practice new behaviours; and
5. Monitoring and evaluating the PD intervention.
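Step 2 of this methodology can be read as an outlier-detection problem. The sketch below is one hedged way to express that reading, not the procedure used in the reviewed studies: it fits a simple least-squares line of an outcome on a single resource indicator and flags units whose outcome lies far above what the line predicts. The field names (id, income, outcome), the sample values, and the 1.5 standard-deviation threshold are all hypothetical.

```python
from statistics import mean, pstdev


def find_positive_deviants(records, threshold=1.5):
    """Return ids of units whose outcome is unexpectedly high for their resource level.

    Assumes each record is a dict with hypothetical keys 'id', 'income', 'outcome'.
    """
    xs = [r["income"] for r in records]
    ys = [r["outcome"] for r in records]
    x_bar, y_bar = mean(xs), mean(ys)

    # Ordinary least-squares fit of outcome on income.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual = observed outcome minus the outcome expected at that income.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every unit sits exactly on the line; nobody deviates

    return [r["id"] for r, e in zip(records, residuals) if e / spread > threshold]


# Illustrative data: household 'b' has low income but a very high outcome score.
households = [
    {"id": "a", "income": 1, "outcome": 1},
    {"id": "b", "income": 2, "outcome": 9},
    {"id": "c", "income": 3, "outcome": 3},
    {"id": "d", "income": 4, "outcome": 4},
    {"id": "e", "income": 5, "outcome": 5},
    {"id": "f", "income": 6, "outcome": 6},
    {"id": "g", "income": 7, "outcome": 7},
    {"id": "h", "income": 8, "outcome": 8},
]
print(find_positive_deviants(households))
```

Measuring deviance against an expected value, rather than against the raw average, reflects the definition in step 2: PDs are those who achieve the desired outcome unexpectedly, given the constraints they share with their peers. Identifying them in this way covers only step 2; the inquiry into underlying practices (step 3) still has to follow.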
Building from this background on positive deviance, the systematic literature review begins with a timeline showing the volume of PD
literature over the last two decades followed by a thematic classification of the literature. We then analyse the secondary data sources used in
previous PD studies as these may share characteristics with attempts to use big data in PD. We then discuss the different units of analysis used
in the literature before presenting the identified challenges of the PD approach, those challenges presenting potential opportunities for big data to
make a contribution.
Specifically within the field of social psychology, there has been opposition to the idea of positive deviance by those working on deviant behaviour, who wish to
solely assign a negative connotation to deviance. However, such arguments do not transfer beyond the specific field of deviant studies and have, in any case, been
fairly well refuted (Shoenberger, 2017).
An “orthodox” view of PD is that it should not seek to transfer solutions between communities, but develop them within each community (LeMahieu et al.,
2017). Even if one accepts this view, it clearly depends on where one sets the definition and boundary of community.
3.1 | PD literature timeline
The publication timeline for work on positive deviance is shown in Figure 2, beginning with the first empirical PD study, published in 1976
(Wishik & Van Der Vynckt, 1976). This article set out the proposed methodology but did not publish results, and it was not until the early
1990s that the approach started to gain attention due to the book (1990) and study (1991) published by Zeitlin (1990), which provided extensive
observations on the PD approach in nutrition with a strong emphasis on impact. There was a peak in the early 2000s, which we can attribute to
Sternin, who operationalized the PD approach and published the results of its application in the Save the Children project, which reduced malnutrition
in Vietnam by 65% to 80% in 2 years (Sternin & Choo, 2000). This led to an extensive and rigorous evaluation of the PD strategy in solving
child malnutrition, revealing positive results that supported its uptake in this field at this time (Hendrickson et al., 2002; Lapping, Schroeder,
Marsh, Albalak, & Jabarkhil, 2002; Mackintosh, Marsh, & Schroeder, 2002). From the mid-2000s, there has been steady growth, with particular
expansion in recent years: A possible explanation could be that in 2016, three international PD-focused conferences were held for the first time.
Whatever the particular reason, it suggests growing interest and activity around positive deviance in developing countries, encouraging further
work in this domain.
3.2 | PD research approaches
Four research approaches to positive deviance were identified from the 41 reviewed articles: normal PD (25 studies), comparative PD (seven
studies), programmatic PD (six studies), and PD evaluation (three studies). Each is summarized below so that the reader may better
understand the type of development activity to which positive deviance has so far been applied and in what manner:
3.2.1 | Normal PD
This is the most common approach, applying the PD methodology to a single group. Most studies of this type stop at the PD inquiry stage (ie,
step 3 of the PD methodology outlined above), where the uncommon, successful practices of PDs are identified, without going further into
designing interventions to promote those practices and monitoring progress. For instance, in Lackovich-Van Gorp (2017), a study was conducted
to investigate strategies that could prevent marriage by abduction in Ethiopia. PDs were girls over 18 years old coming from very poor
households who were still not married. The intervention applied only the first three steps of the PD methodology to identify PDs and the
strategies they employed to protect themselves from marriage by abduction. The average duration of such studies is 8 months. A total of
68% of those studies use mixed methods, 21% use quantitative methods, and 11% use qualitative methods. The normal PD approach covered
studies that tackled issues including health care-associated infections (de Macedo et al., 2012; Marra et al., 2013), enhancing health outcomes
of women in disadvantaged circumstances (Long et al., 2013), cancer prevention (Vossenaar et al., 2009; Vossenaar, Bermúdez, Anderson, &
Solomons, 2010), child marriage (Lackovich-Van Gorp, 2017), child rearing (Aruna, Vazir, & Vidyasagar, 2001), infectious disease control
(Babalola, 2007; Babalola et al., 2002; Nieto-Sanchez, Baus, Guerrero, & Grijalva, 2015), improving pregnancy outcomes (Ahrari et al., 2002),
counselling for family planning (Kim, Heerey, & Kols, 2008), child malnutrition (Aday, Hyden, Osking, & Tomedi, 2016; Bolles, Speraw,
Berggren, & Lafontant, 2002; Guldan et al., 1993; Kanani & Popat, 2012; Merchant & Udipi, 1997; Merita, Sari, & Hesty, 2017; Roche
et al., 2017; Sethi, Kashyap, Seth, & Agarwal, 2003; Shekar, Habicht, & Latham, 1991; Shekar, Habicht, & Latham, 1992; Wishik & Van Der
Vynckt, 1976), neonatal mortality (Marsh et al., 2002), and managing medico-social problems through self-care (Gidado, Obasanya, Adesigbe,
Huji, & Tahir, 2010).
FIGURE 2 Timeline of reviewed positive deviance (PD) literature for the period 1976 to 2017. The thick line represents the actual number of studies
whereas the thin line represents a projection of the trend.
3.2.2 | Comparative PD
Studies in this research approach compare the results of two methodologies, each applied to a different group, with PD as one of the
methodologies. It includes control trial study designs where a PD intervention is applied to one group, and the outcomes are compared
with an equivalent control group that was not exposed to the PD intervention. As an example, in one of the reviewed studies (Lapping,
Schroeder et al. 2002), a PD inquiry was compared with a case-control study to identify factors associated with nutritional status of Afghan
refugees in Pakistan (concluding PD to be at least as good as, if not more effective than, the case-control study in factor identification). Research in
such studies is mainly mixed methods, though sometimes solely quantitative. Study durations are 7 months on average. The comparative
PD approach includes studies that tackled health care-associated infections (Escobar et al., 2017; Marra et al., 2010), malnutrition
(Hendrickson et al., 2002; Ndiaye, Siekmans, Haddad, & Receveur, 2009; Nishat & Batool, 2011), and clinical performance in medical
schools (Zaidi et al., 2012).
3.2.3 | Programmatic PD
Studies belonging to this approach aim at understanding why a few individuals (PDs) respond to a development intervention programme
better than their peers who are targeted by the same intervention. The PD inquiry is used to identify reasons behind the successful
responses of the PDs, and the findings are used to inform intervention strategies and to increase overall adoption. For instance, in Garrett
and Barrington (2013), a qualitative study was conducted to investigate barriers that prevent Honduran women from engaging in a cervical
cancer screening programme. PDs were women that engaged in the uncommon but beneficial practice of screening. The PD intervention was
designed to identify those women and the factors that led to their uncommon behaviour. Those factors (eg, self-love and self-support) were
to be used in future screening promotion efforts. Research in such studies is either quantitative or mixed methods. Study durations are on
average 3 months long. And since programmatic studies' main interest is just in post hoc identification of reasons for deviants to adopt or
engage with the focal programme, they usually end at the third stage of the PD methodology. Example studies include programmes
concerned with malnutrition (D'Alimonte, Deshmukh, Jayaraman, Chanani, & Humphries, 2016; Levinson, Barney, Bassett, & Schultink,
2007; Sethi, Sternin, Sharma, Bhanot, & Mebrahtu, 2017), farmer training (Tekle, 2015), and livestock feed technology adoption (Birhanu,
Girma, & Puskur, 2017).
3.2.4 | PD evaluation
This is the least prevalent research approach; it aims at evaluating the sustainability and impact of a PD intervention. Studies are also the longest,
with an average duration of 18 months, and rely mainly on mixed methods in evaluation. The reviewed literature included three evaluative studies
in malnutrition (Anino, Were, & Khamasi, 2015; Lapping, Schroeder et al. 2002; Mackintosh et al., 2002) and one study in infectious disease
control (Marra et al., 2011).
3.3 | Sources of data
All the reviewed studies used primary data for PD identification and inquiry except for two studies that used secondary data. The first of this latter
group was an exploratory study (Long et al., 2013) that investigated the factors associated with positive health outcomes among rural women in
West Bengal. It used previous data from a randomized control trial conducted in a rural population, on 2227 consenting women and adolescent
girls. Using quantitative analysis only, it was possible to examine the characteristics of PDs and factors affecting better health outcomes. However,
there was limited ability to examine other possible factors affecting the targeted outcome, since the tool used to collect data for the previous
study was not designed for the same purpose as this later study. The second study (Birhanu et al., 2017) was also an exploratory study that aimed
at investigating the factors leading to better adoption of livestock feed technologies in Ethiopia. It used a previous household survey that included
603 farm households and aimed at identifying successful cases of improved livestock feed technologies and factors underpinning this success.
Since the original study had the same purpose as the PD study, the collected data were able to unveil all possible factors affecting the desired
outcome through quantitative analysis. These studies indicate the potential to undertake positive deviant identification without a need for primary
research, thus signalling not only the potential for big data-based PD studies but also the challenge of repurposing datasets not specifically
gathered for PD purposes.
3.4 | PD unit of analysis
The majority of the reviewed PD studies had individuals (infants, children, mothers, patients, students, health care workers, etc) as their
primary unit of analysis, except for three studies that investigated positively-deviant farmer training centres (Tekle, 2015), farm households
(Birhanu et al., 2017), and (disease-resistant) houses (Nieto-Sanchez et al., 2015). None of the studies conducted aggregation analysis,
eg, identifying community-level deviance instead of individual-level deviance. This can be attributed to the small sample size in terms of number
of communities covered that would not permit the identification of this type of deviance, a limitation that larger-scale datasets might not suffer.
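To make the idea of aggregation analysis more concrete, the following sketch rolls individual-level outcomes up to community averages and then looks for communities whose average is unusually high relative to the other communities. It is an illustrative assumption about how such an analysis might be set up, not a method drawn from the reviewed studies; the community labels, outcome values, and 1.5 standard-deviation threshold are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def community_level_deviants(records, threshold=1.5):
    """Return communities whose mean outcome is unusually high across communities.

    `records` is an iterable of (community, individual_outcome) pairs.
    """
    by_community = defaultdict(list)
    for community, outcome in records:
        by_community[community].append(outcome)

    # Aggregate: one mean outcome per community.
    community_means = {c: mean(values) for c, values in by_community.items()}
    means = list(community_means.values())

    overall = mean(means)
    spread = pstdev(means)
    if spread == 0:
        return []  # all communities perform identically

    return sorted(
        c for c, m in community_means.items() if (m - overall) / spread > threshold
    )


# Illustrative data: two individuals per community; community 'F' stands out.
observations = [
    ("A", 9), ("A", 11),
    ("B", 10), ("B", 10),
    ("C", 12), ("C", 10),
    ("D", 8), ("D", 10),
    ("E", 10), ("E", 10),
    ("F", 19), ("F", 21),
]
print(community_level_deviants(observations))
```

With only a handful of communities, as in most of the reviewed studies, the spread across community means is estimated from very few points, which is one way of seeing why deviance at this level is hard to establish from small samples.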
3.5 | PD challenges
Analysis of the literature on positive deviance reveals a series of challenges or limitations arising from work to date, challenges which we will later
interrogate to see if big data might have some response:
3.5.1 | Time and cost
The application of the PD approach is time-consuming (Felt, 2011; Lapping et al., 2002; Marsh et al., 2004). As can be seen from the data in
Section 3.2, it takes months to complete the phases sequentially. Alongside concerns about the time requirements are also concerns that the
quality of implementation may be compromised due to time constraints. For instance, one of the studies reported that the desired large sample
size was not obtained because of time limitations (Nishat & Batool, 2011). Since PD depends typically on primary data collection, community
participation, face-to-face interviews and observation, the cost of PD interventions also tends to be high. As with the time constraint, cost is also
a function of sample size, which can encourage smaller samples. In addition, collecting primary data from some high-risk areas brings with it
additional time, cost, and complexity in order to mitigate the risks (Shekar et al., 1991).
3.5.2 | Positive-deviant identification
Within any given population, positive deviants are relatively rare. Based on those reviewed studies that provide the necessary data, we can
calculate an average prevalence rate of 11%. This is slightly higher than, but not completely out of line with, earlier estimates that PDs typically
form 0% to 10% of a population (Marsh et al., 2004). Whatever the exact figure, PDs are statistical outliers, and sample size thus plays a role. As
the sample size increases, the sample becomes more representative of the population, and thus the likelihood of it containing positive deviants becomes
greater (Osborne & Overbay, 2004).
Hence, there is a statistical pressure to undertake large sample size studies in order to identify a sufficient sample size of PDs. However, given
the time- and labour-intensity of PD just noted, with costs proportional to sample size, there is a counter-pressure to keep overall sample sizes
small. For example, in the comparative study of Afghan refugees in Pakistan (Lapping, Schroeder et al. 2002), the compared groups were 8 and
50 strong. Another study in Egypt that addressed factors associated with successful pregnancy outcomes reported that the information gained
from PDs was limited; this can be attributed to their very small sample size (n = 11) (Ahrari et al., 2002). Similarly, in the Honduras study examining
women who overcame barriers to cervical screening, the sample size (n = 8) was seen as not large enough to achieve full saturation of relevant
factors. The use of very small samples for PDI not only potentially misses important aspects of PD behaviour but would also have less statistical
power to identify valid associations. Additionally, as previously mentioned in Section 3.4, small sample size limits the ability to identify deviance at
different levels of aggregation. One potential solution would be the use of large secondary datasets, which could be analysed at low cost while not
compromising the number of PDs identified.
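The tension between PD rarity and sample size can be made concrete with some simple binomial arithmetic. The sketch below is illustrative only: it assumes that each sampled unit is independently a positive deviant with probability equal to the 11% average prevalence reported above, and the minimum of five PDs, the sample sizes, and the function names are hypothetical choices rather than values taken from the reviewed studies.

```python
from math import comb


def expected_pds(n: int, prevalence: float) -> float:
    """Expected number of positive deviants in a sample of size n."""
    return n * prevalence


def prob_at_least(k: int, n: int, prevalence: float) -> float:
    """P(at least k positive deviants among n units) under a binomial model."""
    p_fewer = sum(
        comb(n, i) * prevalence**i * (1 - prevalence) ** (n - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


PREVALENCE = 0.11  # average prevalence rate reported in this review
MIN_PDS = 5        # hypothetical minimum number of PDs wanted for an inquiry

# Hypothetical sample sizes, from a small field study to a larger dataset.
for n in (25, 50, 100, 200):
    print(n, round(expected_pds(n, PREVALENCE), 1),
          round(prob_at_least(MIN_PDS, n, PREVALENCE), 3))
```

Under these assumptions, the probability of finding enough PDs rises with sample size, which is the statistical pressure towards larger samples described above.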
Moreover, PD primary data collection, often because of its time and cost, is undertaken via a cross-sectional rather than longitudinal design. It provides a
single snapshot of the population since it depicts the behaviours of the analysed units at a certain point in time; hence, deviance becomes static
and could be referred to as a point anomaly in statistics (Goldstein & Uchida, 2016). What it cannot do is identify the dynamics of deviance such as
contextual/conditional anomalies that arise due to the particular condition of a context, with those conditions potentially differing over time. For
example, one of the reviewed studies sought to identify preventive measures to control Chagas disease; PDs were houses that were bug-free throughout
the period of inspection. However, an identified limitation of the study was that the houses selected were not necessarily bug-free throughout
the year, since the entomological searches happened during the summer, and natural factors could have affected the results (Nieto-Sanchez
et al., 2015). Hence, a few of those PDs might have been false positives: appearing as a point anomaly attributed to the individual house but in
fact a contextual/conditional anomaly. Again, one potential cost-efficient solution would be large secondary datasets, in this case, where the data
were collected longitudinally.
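The difference between a snapshot and a longitudinal view of deviance can be sketched in a few lines. The example below is a hypothetical illustration loosely modelled on the Chagas case: the house labels, seasons, and bug counts are invented, and "bug-free in every observed season" is only one possible way of separating persistent deviance from a seasonal artefact.

```python
# Hypothetical longitudinal data: bug counts per house in each season.
# House labels and counts are invented for illustration only.
SEASONS = ("summer", "autumn", "winter", "spring")
bug_counts = {
    "house_A": {"summer": 0, "autumn": 0, "winter": 0, "spring": 0},
    "house_B": {"summer": 0, "autumn": 3, "winter": 5, "spring": 2},
    "house_C": {"summer": 4, "autumn": 6, "winter": 7, "spring": 5},
}


def snapshot_pds(data, season):
    """Houses that look bug-free in a single cross-sectional inspection."""
    return {house for house, counts in data.items() if counts[season] == 0}


def persistent_pds(data, seasons):
    """Houses that stay bug-free in every observed season."""
    return {
        house
        for house, counts in data.items()
        if all(counts[s] == 0 for s in seasons)
    }


summer_only = snapshot_pds(bug_counts, "summer")
all_year = persistent_pds(bug_counts, SEASONS)

# Houses flagged by the summer snapshot but not confirmed over the year
# are candidates for contextual (seasonal) rather than point anomalies.
possible_false_positives = summer_only - all_year
print(sorted(possible_false_positives))
```

In this invented dataset, a summer-only inspection flags two houses, while the longitudinal view confirms only one of them.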
3.5.3 |Methodological risk
Alongside the practical risks of PD given the need for large-scale primary fieldwork in developing countries, we were able to identify two
methodological risks associated with use of the PD approach. First, there is a PD behaviour identification risk. For example, some of the studies
(D'Alimonte et al., 2016; de Macedo et al., 2012; Marra et al., 2010) that used observational methods in PD inquiry reported the potential for a
Hawthorne effect: an alteration of the behaviour of the subjects of a study due to being observed. Another risk is the inability to extract successful
strategies and behaviours practised by the positively deviant individuals. Positive deviance methods presume the willingness of PDs to share their
strategies and best practices. However, this might compromise what the deviants see as a competitive advantage over others resulting from their
outlier behaviour, leading them to be unwilling to share (Felt, 2011). For example, in one of the reviewed studies (Zaidi et al., 2012), positive
deviance was used to try to identify and disseminate the strategies employed by successful medical students, in order to improve the clinical
performance of their peers. There is a potential risk that the high performers would refrain from sharing their best practices when interviewed. In both
cases, analysis of behaviour via secondary/remote observation could help to avoid these risks.
Second, there is a risk of not being able to establish a cause and effect relationship between PD interventions and achieved results (step 4 of
the PD methodology). Some studies (eg, Nieto-Sanchez et al., 2015; Nishat & Batool, 2011) noted that results could not be attributed to the PD
intervention alone, since the targeted population might have been exposed to other interventions and external factors that might have
contributed, partially, to the desired outcome. This challenge of attribution (and also issues of time and cost) may be one explanation behind the limited
number of evaluative studies of PD interventions: ie, those that moved to step 5 of the PD methodology (Ndiaye et al., 2009; Roche et al., 2017).
Another explanation may be the lack of guidance on how to apply credible monitoring and evaluation techniques (Lapping, Schroeder et al., 2002;
Felt, 2011). As noted in Section 3.2, only 7% of the reviewed studies were evaluative: very low considering the importance of understanding the
development impact and value of PD approaches. Being able to demonstrate the lasting success of a PD intervention could support its wider
adoption and the wider adoption of PD more generally.
3.5.4 |Scalability
There are two challenges underlying the scale-up of PD interventions. The first challenge is in scaling practices within a community. PD relies
heavily on community engagement to promote the adoption and mobilization of the identified practices and to achieve behavioural change
through self-efficacy. For instance, 25% of the reviewed studies employed the PD Hearth (Wollinka, Keeley, Burkhalter, & Bashir, 1997), a nutrition
education framework designed to empower mothers to enhance the conditions of their malnourished children. It requires mothers of PD children
to host neighbouring mothers and their malnourished children for 12 consecutive days, where they prepare meals and feed their children together
(Felt, 2011; Lapping, Marsh et al., 2002; Marsh et al., 2004; Pascale et al., 2010). This level of engagement proved successful in small-scale
adoption but makes large-scale adoption a challenging task, given both the growing complexity and the need for a strong enough pre-existing
social fabric to ensure cooperation of the PD mothers. Zeitlin (1991) makes a similar point: for certain communities and domains of action,
there could be resistance to making everyone a top performer, and moves in that direction might disrupt and disintegrate system dynamics.
The second challenge is the scaling of practices across communities. An issue with PD is the inability to generalize practices and behaviours
inferred from one community to another. In the majority of the reviewed studies, PD interventions targeted small-scale communities, and the
inferred practices were particular to the circumstances of the community concerned, making them difficult to replicate in other communities (Saïd Business
School, 2010). If broader, cross-community data could be accessed (identifying PDs and their behaviour on a wide scale), then this challenge of
limitations on generalization could be reduced to some extent, although of course this would likely assume/require the presence of
non-community-specific behaviours underlying positive deviance.
3.5.5 |Narrow domain/geographic scope
There is a current skew in the domain and geographic focus of PD applications in developing countries, summarized in Figure 3. Regarding domain
coverage, we found that the vast majority (89%) of the reviewed studies were in public health, with 41% focused specifically on malnutrition.
Put another way, there were only four non-health studies: two on agriculture (Birhanu et al., 2017; Tekle, 2015), one on child protection
(Lackovich-Van Gorp, 2017), and one on education (Zaidi et al., 2012). As for the geographic coverage, there are nearly 150 developing countries
(OECD, 2017), but PD studies identified by the review encompassed only 20, with just four countries (India, Brazil, Pakistan, and Ethiopia)
responsible for almost 50% of studies, and only two countries having hosted studies from more than two domains. There is also a within-country
geographic concentration, with 83% of studies being undertaken with rural communities, significantly out of kilter with population distribution in
developing countries.
This domain and geographic concentration can be attributed to a form of path dependency in positive deviance. The first use of PD (Wishik &
Van Der Vynckt, 1976) was for a nutrition-based intervention in a rural community. Then, early adopters of PD who set the foundation for the
field (Sternin et al., 1997; Zeitlin, 1991; Zeitlin, Ghassemi, & Mansour, 1994) including development of an operationalizable framework for PD
(Sternin et al., 1998) all undertook work on malnutrition in rural areas. Despite applicability of PD across many domains, subsequent PD actions
have often followed suit, leaving a gap of domains and locations that have been largely ignored by PD to date.
That PD has relevance to other countries and other domains can readily be seen from its application in the global North, eg, to public sector
reforms (Andrews, 2015), enhancement of prison conditions (Awofeso, Irwin, & Forrest, 2008), organizational scholarship (Mertens, Recker,
Kohlborn, & Kummer, 2016), and waste management (Delias, 2017). But PD for developing countries needs encouragement to spread further than
its current narrow path.
Footnote: Note, however, as per the point raised in Section 3.3, that relying on quantitative analysis of secondary data to infer practices might limit the ability to test the effect
of other potential factors influencing the desired PD outcome, especially in studies where the instruments used to collect the data were not designed to measure the
desired outcome (Long et al., 2013).
The evolution and diffusion of digital infrastructures has led to a proliferation of data, often referred to as "big data" (BD). There has also been an
increasing ability to make use of these data, characterized by enhanced processing and storage capacities, which has provided an opportunity to convert
them into information and knowledge that feed decisions and actions. The main characteristics of BD derived from Gartner's definition
(Gartner, 2013) are (1) volume: huge amounts of data generated from the rapid diffusion of mobile phones, social media, and other online services,
plus the growing use of sensors and satellite imagery; (2) velocity: the growing speed and currency of data production, enabling decisions and
actions to be taken in a timely manner; and (3) variety: the combined availability and potential use of structured data, having a predefined structure
(eg, mobile transactions), and unstructured data, not having a predefined structure (eg, email, video, and audio), to extract insights.
The applications of "big data for development" (BD4D) span a wide variety of domains and leverage new sources of data and new analytical
tools. It is argued that big data can fundamentally shift the way we pursue social change, as it is capable of providing snapshots of the well-being of
populations at high frequency, with a high degree of granularity, and from a wide range of angles, narrowing both time and knowledge gaps (UNGP,
2012). Sitting alongside concerns about and critiques of BD4D (eg, Taylor & Broeders, 2015), big data are therefore also argued to offer new
opportunities for development for reasons that include the following:
1. Low cost: Digital traces produced from digital platforms provide a low-cost alternative to traditional sources of data (eg, censuses and
surveys), in some instances by replacing variables of interest with correlated proxies (Hilbert, 2016). For instance, mobile call duration
and frequency have been correlated with income or education levels in a geographic region (Frias-Martinez & Virseda, 2013) and could then
substitute for survey and similar data gathering.
2. Real-time feedback and awareness: Through monitoring populations, BD makes it possible to understand where policy and programme
interventions are succeeding or failing in real time, in order to make adjustments in a timely manner.
3. Broad sampling: With a global average penetration of 95% and a 75% penetration among base of the pyramid populations (Cartesian, 2014),
mobile phones are coming close to sampling the universe N instead of sampling n of the universe N (Hilbert, 2016).
4. Detail and insight: The ability to merge and use different sources of data reflecting a certain event or reflecting the behaviours of an
individual, community, or organization provides a real-time, cross-validated, fine-grained picture of reality.
5. Big data analytics: Advanced analytics techniques (those which perform particularly well when applied to huge datasets) enable big data to
be used for better decision-making. An example is machine learning, a subfield of artificial intelligence, which gives computers the ability to
learn from data without being explicitly pre-programmed with the knowledge they will extract.
Given this potential value of big data to development, the review of literature outlined in Section 2 was undertaken and is reported next. This
section begins with a timeline showing the volume of BD literature over the last decade, followed by a thematic classification of the literature and
the geographic distribution of its application domains. We then discuss the other forms of data that were combined with BD, the different units of
analysis, and the employed BD analytics techniques before outlining the challenges of BD4D at the end. All this provides the knowledge of big
data necessary to understand how it might be applied to action research on positive deviance, as then discussed in Section 5.
FIGURE 3 Geographic distribution of the domains of positive deviance (PD) literature relating to developing countries
4.1 |BD4D literature timeline
Figure 4 shows that the BD4D research area is relatively new, given that all the identified literature was published since 2007. There is some
growth in the number of studies during the later years and, similar to PD studies, there is an accelerated growth towards the very end of the
period under review. As with the PD literature, this suggests a growth in activity and interest, encouraging further work on big data for development.
4.2 |BD4D research approaches
Utilising the taxonomy of BD applications proposed by Hilbert (2016), we classified the reviewed literature into four main approaches based on
the elements being tracked: locations (12 studies), words (four studies), nature (six studies), and economic activity (12 studies). This gives a sense
of the general scope of big data application, for example, its potential application in positive deviance analysis.
4.2.1 |Tracking locations
This approach contains applications that analyse human and object mobility data. These data typically come from mobile phones in the form of
de-identified call detail records (CDRs), which usually provide the time and associated cell tower of text messages and calls. This is the most common
data type in the reviewed BD4D studies, notwithstanding the concerns of mobile phone operators about releasing such data: either that it has
commercial sensitivity and could give competitors an unfair advantage or that techniques could be used to de-anonymize the data and uncover
individual subscribers' identities. There are also inherent biases in CDRs due to socio-economic and geographic variations in phone ownership,
but evidence still suggests that CDRs provide the best description to date of population movement in low- and middle-income countries (Wilson
et al., 2016). CDRs have thus been used to analyse travel and migration patterns of mobile users to understand the spread of infectious diseases in
low-income settings (Bengtsson, Lu, Thorson, Garfield, & von Schreeb, 2011; Buckee, Wesolowski, Eagle, Hansen, & Snow, 2013; Tatem et al.,
2009; Wesolowski et al., 2014; Wesolowski et al., 2015) and to identify population displacement following disasters (Bengtsson et al., 2011;
Lu, Bengtsson, & Holme, 2012; Wilson et al., 2016) and migration patterns in climate-stressed regions (Lu et al., 2016).
But CDRs are not the only location-related data that have been used. For example, data from a web mapping service application
were used to calculate average travel distance to healthy food outlets in order to identify urban areas with limited food accessibility in China
(Su, Li, Xu, Cai, & Weng, 2017). Car GPS data have been used to map the spatiotemporal distribution of pollution emissions from traffic
(Huang, Cao, Jin, Yu, & Huang, 2017; Luo et al., 2017). There are also studies that combine mobility data with other sources of data for better
representation, cross-validity, data enrichment, and covariance analysis. For instance, in Tatem et al. (2014), physical data in the form of
satellite images, climate, and topographic data were combined with CDR data to understand the spread of malaria, specifically, the seasonality
of movements including movement across borders.
4.2.2 |Tracking words
This approach contains applications that analyse actions, activities, and events based on words, which typically come from social media. It usually
faces the challenge of representational validity in terms of the demographic, socio-economic, and geographic profile of contributors, given skews in
terms of those who do and do not use social media. There is also the challenge of potential differences between digital and real behaviour such as
self-censorship or presenting a false image. One advantage, though, is that in many cases, data sources are readily accessible because they are
open data in nature; and they also benefit from enabling mapping of behaviour in real time (Pfeffer, Verrest, & Poorthuis, 2015). Examples of
applications include the use of Google search word trends to compare the demand for massive open online courses (MOOCs) between different
countries (Tong & Li, 2017), applying co-word analysis to map the research themes of Indonesian scholars' publications (Surjandari,
Dhini, Lumbantobing, Widari, & Prawiradinata, 2015), analysing protest activity using Twitter data during the Egyptian January 25, 2011 revolution
(Wilson, 2011), and revealing geographical and social patterns of tweets pertaining to flooding and criminal activity in Caribbean cities
(Pfeffer et al., 2015).
FIGURE 4 Timeline of reviewed big data for development (BD4D) literature for period 2007 to 2017. The thick line represents the actual
number of studies whereas the thin line represents a projection of the trend
4.2.3 |Tracking nature
This approach contains applications that use data to observe environmental and natural phenomena to mitigate risk, improve emergency response,
or optimize performance (Hilbert, 2016). Satellite imagery is the most common BD type used by these applications, owing to its increasing
availability at global and lower scales, often via open or other no-cost access. Growth in datasets over time is also allowing use for time series
analysis. The reviewed studies falling under this approach used satellite imagery to detect illegal deforestation activities (Burgess, Hansen, Olken,
Potapov, & Sieber, 2012), to monitor coal fires (Jiang, Jia, Chen, Deng, & Rao, 2017), to map temporal water surfaces (Haas, Bartholomé, &
Combal, 2009), and to model crop growth (Tesfaye et al., 2016).
Other studies collect environmental data via sensors. For example, sensor networks were used to monitor the spatiotemporal distribution of
greenhouse gas emissions in China (Tang, Yang, & Zhang, 2014). There is also a study (Zhang, Chen, Chen, & Chen, 2016) that combined sensor
data, satellite images, and meteorological data with social media for the analysis of urban waterlogging disasters (where drainage systems are
unable to cope). Physical data were used to observe and understand waterlogging, and social media data (ie, tracking words) were used to identify
qualitative features of waterlogging incidents.
4.2.4 |Tracking economic activity
This approach contains applications that use data that reflect the economic situation of the analysed units. Satellite images and CDRs are the two
most popular BD sources that are used for this purpose. For instance, CDRs were used to predict wealth of individuals by tracking their history of
mobile use represented in the intensity, volume, time or direction of calls (Blumenstock, Cadamuro, & On, 2015), or through phone ownership
(Blumenstock & Eagle, 2012). Similarly, satellite images were used to predict poverty and estimate economic growth in a number of studies.
For instance, daytime satellite images were used to predict socio-economic well-being by analysing visible household assets (Jean et al., 2016).
They were also used to map population settlements (Tatem, Noor, von Hagen, Di Gregorio, & Hay, 2007) and classify slums (Kohli, Sliuzas, Kerle,
& Stein, 2012).
Other studies used satellite night-light images as a proxy for electricity consumption levels which, in turn, can be seen as a proxy for levels of
economic activity (Doll & Pachauri, 2010; Henderson, Storeygard, & Weil, 2012; Sutton, Elvidge, & Ghosh, 2007). However, night-light images have
difficulty distinguishing between poor densely populated areas and wealthy sparsely populated areas (Jean et al., 2016). Hence, Njuguna and
McSharry (2017) complemented satellite night-light data with CDR data to build a stronger poverty proxy by incorporating the level of mobile
usage. Other examples include the use of data from a leading online retail platform in China to analyse spatiotemporal features
of housing prices (Li, Ye, Lee, Gong, & Qin, 2017), and the use of data from China's industrial enterprise database and customs import/export trade
database to examine the extent of "greening" of global value chain enterprises (Song & Wang, 2017).
4.3 |BD4D studies domain/geographic distribution
Table 2 summarizes the overall domains and specific application topics of the reviewed BD4D studies. We can see that economics, public health,
and environmental studies were the most common application domains, within which the most common applications were infectious disease
(16%) and poverty measurement (16%). In infectious disease studies (Bengtsson et al., 2011; Tatem et al., 2009; Wesolowski et al., 2014;
Wesolowski et al., 2015), mobile data and health data are combined to identify risk areas. As for poverty measurement (Blumenstock et al.,
2015; Jean et al., 2016; Njuguna & McSharry, 2017), satellite night-light and mobile data are used as a proxy for economic activity at temporal
and geographic scales for which traditional data are of poor quality or unavailable.
Figure 5 shows the geographic distribution of BD studies by domain, indicating that China had the biggest share of BD4D studies that span
multiple domains. In addition, there were three studies not presented on this map since they were applied across multiple countries. The first study
(Doll & Pachauri, 2010) used night-time satellite imagery to estimate rural populations without access to electricity in developing countries.
The second study (Henderson et al., 2012) used satellite data to augment official income growth measures of coastal areas in sub-Saharan Africa.
The third study (Haas et al., 2009) used remote sensing data to map temporary water bodies in regions of western sub-Saharan Africa. For
within-country studies, some are rural-focused, many cover both urban and rural areas, and some are solely urban-focused.
Footnote on open data: "Data freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control" (WP, 2018).
While the geographic coverage of big data studies in terms of countries is, as yet, not much better than that of PD studies, there is clearly
ready potential for a much broader scope given the universal presence of at least some types of big data, including scope for more urban-focused
work. And, in relation to domains, big data have already shown application to a wider range of topics than positive deviance.
4.4 |Sources of data
A total of 37% of the reviewed studies complemented big data with other sources of primary and secondary data. For instance, in Sutton et al.
(2007), a model was developed using GDP and population data and night-time satellite imagery to predict GDP at subnational levels. Similarly,
Jean et al. (2016) used daytime satellite images, annotated with georeferenced household consumption survey data, to develop a transfer learning
model which was trained on data from survey-rich countries to predict consumption and assets in survey-poor countries, using daytime satellite images
alone. Wesolowski et al. (2014) demonstrated that community surveys can complement mobile data to approximate travel patterns of
non-subscribers in rural areas. Secondary data are also used for cross-validation; for example, in Wesolowski et al. (2015), population-based
surveys were used to validate the results of mobile data analysis that measured the magnitude of population displacements.

TABLE 2 Classification of BD4D studies by domain and application

Domain              | Application                                 | No. of studies
Economics           | Predicting poverty                          | 6
                    | Measuring economic growth                   | 3
                    | Mapping population distribution             | 1
                    | Analysing housing prices over time          | 1
Public Health       | Infectious disease control                  | 6
Environment         | Analysing traffic pollution                 | 2
                    | Monitoring deforestation activities         | 1
                    | Monitoring changes in small water surfaces  | 1
                    | Monitoring greenhouse gas emissions         | 1
                    | Coal fire suppression efforts               | 1
Disaster Response   | Population displacement post disasters      | 4
Food & Agriculture  | Accessibility of healthy food stores        | 1
                    | Drought tolerant crops                      | 1
Energy              | Predicting electricity demand load          | 1
                    | Green technology adoption                   | 1
Education           | Quantifying MOOC demand                     | 1
                    | Research theme mapping                      | 1
Urban Governance    | Urban waterlogging                          | 2
                    | Security                                    | 1
Politics            | Analysing Twitter activity in revolutions   | 1

Note: The total number of studies listed in the table is 37 although only 34 BD studies were reviewed. This is because two studies presented applications in
multiple domains.

FIGURE 5 Geographic distribution of the domains of big data for development (BD4D) literature
A number of studies combined health data, having georeferenced disease cases, with mobility data to develop risk maps for infectious
diseases (Bengtsson et al., 2011; Tatem et al., 2009; Wesolowski et al., 2014; Wesolowski et al., 2015). Spatially referenced primary survey data
have been used to support satellite imagery and mobile phone data in understanding the reality and the impacts of socio-economic or socio-spatial
differences. For example, Blumenstock et al. (2015) were able to derive insights on the degree of wealth of individuals by supplementing mobile
phone history big data with data from an anonymized mobile phone survey. Hence, it was possible, through training models, to predict
wealth using only mobile phone use history for individuals not included in the survey sample. We may conclude from the above examples that,
notwithstanding the potential for big data to provide a faster, cheaper, and more granular alternative to traditional data sources, greater
value may be captured when BD is combined with those data sources instead of simply replacing them. In particular, big data can be validated
in locations where comparator data exist and then applied alone in locations where comparators are absent, an especially helpful approach in
developing countries where comparator data may be thin on the ground.
4.5 |BD4D unit of analysis
In the reviewed BD4D studies, the unit of analysis ranged from individuals (eg, mobile users and Twitter users), through geographic areas (eg, grid
areas located via satellite imaging), to regions and even countries. The majority of studies applied aggregation on different scales to analyse and
visualize patterns, events, and spatial relations. Conversely, BD4D studies also provided a disaggregation opportunity. For example, in poor
countries, data about economic growth are often available only at high levels of geographic aggregation (eg, national level) because they are collected
using sample surveys instead of location-disaggregatable census surveys (Chandy, Hassan, & Mukherji, 2017). Due to the finer granularity of big
data represented in user CDRs or night-time satellite images, it was possible in a number of studies (Blumenstock et al., 2015; Blumenstock &
Eagle, 2012; Doll & Pachauri, 2010; Henderson et al., 2012; Sutton et al., 2007) to use those forms of data as proxies for economic growth at
subnational level. Compared with typical PD data, big data therefore may provide much greater aggregation and disaggregation potential.
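The aggregation potential of fine-grained data can be illustrated with a short sketch. The records below are entirely hypothetical (invented user identifiers, districts, regions, and a made-up wealth proxy score), and the `aggregate` helper is an illustrative name rather than anything used in the reviewed studies; the point is only that the same individual-level records can be rolled up to whichever unit of analysis is needed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-user records: (user id, district, region, wealth proxy score).
records = [
    ("u1", "district_1", "north", 0.42),
    ("u2", "district_1", "north", 0.58),
    ("u3", "district_2", "north", 0.30),
    ("u4", "district_3", "south", 0.75),
    ("u5", "district_3", "south", 0.65),
]


def aggregate(rows, level):
    """Average the proxy score at the chosen unit of analysis."""
    index = {"district": 1, "region": 2}[level]
    groups = defaultdict(list)
    for row in rows:
        groups[row[index]].append(row[3])
    return {unit: mean(scores) for unit, scores in groups.items()}


by_district = aggregate(records, "district")  # finer unit of analysis
by_region = aggregate(records, "region")      # coarser unit of analysis
print(by_district)
print(by_region)
```

Data that are only available as regional averages cannot be broken down in this way, which is the disaggregation limitation of survey-based sources noted above.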
4.6 |BD4D analytics
Discussion of big data in development can tend to focus on inherent qualities of the data such as the 3Vs: volume, velocity, and variety. But
greater value, including value for positive deviance, may vest in the advanced analytics techniques that are being applied to big data, and in
the use of these techniques for improved development decision-making. The reviewed BD4D literature utilized two types of analytics:
1. Descriptive analytics provides information about the past and present. It uses data aggregation and data mining techniques to summarize
historical data and answer the questions, "What has happened?" or "What is happening?". BD visualization is the essence of this type of
analytics, giving a new face to standard descriptive statistical methods. For example, Pfeffer et al. (2015) used descriptive analytics to
geo-map the word frequency of Twitter data relating to two Caribbean cities which referred to crimes and flooding in order to better
understand those phenomena.
2. Predictive analytics uses statistical models and forecasting techniques to answer the question, "What will happen?". It encompasses two
types: inference and forecasting. Inference models predict the value of a certain variable of interest based on its association with variables
in another data source. For example, Jean et al. (2016) used daytime satellite images and household surveys, covering a specific area, to
train a predictive model that was able to estimate economic well-being in another area using only satellite imagery. This approach
was able to overcome the data sparsity issue, through training models in data-rich areas to make predictions in data-poor areas. Inference
models were also used to predict water precipitation and potential for waterlogging based on climate data and road and terrain maps
(Zhang et al., 2016). On the other hand, forecasting models utilize trend analysis and pattern recognition techniques to predict what will
happen in the future, based on what happened in the past. For example, in Ifaei, Karbassi, Lee, and Yoo (2017), multivariate dynamic
models were used to forecast power consumption using previous data on exported and imported power and quantity of stored power
over a period of time.
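The logic of an inference model (train where survey data exist, predict where they do not) can be sketched with a deliberately simple stand-in. The example below fits a one-variable ordinary least squares line; it is not the transfer learning approach of Jean et al. (2016), and the feature values, survey values, and function names are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, xs):
    """Apply a fitted (intercept, slope) model to new feature values."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


# Data-rich area: a big data feature (eg, an image-derived score) paired
# with survey-measured values. All numbers are hypothetical.
rich_feature = [1.0, 2.0, 3.0, 4.0, 5.0]
rich_survey = [12.0, 14.1, 15.9, 18.2, 19.8]

model = fit_line(rich_feature, rich_survey)

# Data-poor area: only the big data feature is available.
poor_feature = [1.5, 3.5]
estimates = predict(model, poor_feature)
print(estimates)
```

The estimates for the data-poor area are only as good as the assumption that the relationship learned in the data-rich area also holds there.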
Big data also enable the use of intelligent data analytics techniques that perform better when applied to huge amounts of data. As noted
above, an example is machine learning (ML). From the reviewed BD4D literature, ML techniques can be grouped into two main categories:
supervised and unsupervised learning. In the former, ML is applied to a training dataset where each input X is labelled with a class or output Y,
and the primary objective of the learning algorithm is to develop a mapping function Y = f(X), so that for any new input x one can predict
its output y. For example, supervised learning was used to predict poverty levels from mobile phone data (Blumenstock et al., 2015) and from
satellite imagery (Jean et al., 2016). In unsupervised learning, ML is applied to a dataset where there are only input data X and no corresponding
output Y. The primary objective of the learning algorithm is to discover underlying similarities between the input data points and create clusters of
data based on the perceived similarity. It is also capable of allocating new inputs into the appropriate cluster. For example, in one of the studies,
unsupervised learning was used to find the interrelationship among academics' research approaches and cluster them, based on the co-occurrence
of the publications' keywords (Surjandari et al., 2015).
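The distinction between the two categories can be sketched with two deliberately simple algorithms: a nearest-neighbour classifier as an example of learning a mapping Y = f(X) from labelled data, and k-means as an example of clustering unlabelled data and then allocating a new input to a cluster. These are illustrative stand-ins, not the methods used in the cited studies; the feature vectors and the "poor"/"non-poor" labels are hypothetical.

```python
import math


def nearest_neighbour_predict(train_x, train_y, x):
    """Supervised: predict the label of x as the label of its closest training input."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[best]


def assign_cluster(point, centroids):
    """Return the index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def k_means(points, initial_centroids, iterations=10):
    """Unsupervised: Lloyd's algorithm starting from the given centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[assign_cluster(p, centroids)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


# Hypothetical two-feature inputs X with labels Y.
train_x = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
train_y = ["poor", "poor", "non-poor", "non-poor"]

predicted = nearest_neighbour_predict(train_x, train_y, (2.0, 1.0))

# The same inputs without labels: discover two clusters, then place a new input.
centroids = k_means(train_x, [train_x[0], train_x[2]])
new_cluster = assign_cluster((7.0, 9.0), centroids)
```

In the supervised case the labels drive the result; in the unsupervised case only the similarity between inputs does, which is why the clusters carry no names until an analyst interprets them.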
4.7 |BD4D challenges
While there are broader challenges relating to the use of big data in development, such as associated shifts in power between different groups
(Sengupta, Heeks, Chattapadhyay, & Foster, 2017; Taylor & Broeders, 2015), there is also a set of more practical challenges that emerged from
the literature reviewed, which would need to be taken into account if using big data to research positive deviance.
Absence of Theory: The majority of BD applications do not use theory-driven models, especially in cases of predictive analytics where they
depend mainly on past data to predict what will happen in the future. However, attribution analysis (cause and effect studies) investigating why
outcomes change in response to variations in inputs will need a theoretical framework. Employing a theory of change can guide the identification
of explanatory variables (inputs) and indicators for outputs, outcomes, and impact (Bamberger, 2016).
Proof of Concept Skew: As might be expected given the relatively formative nature of big-data-for-development, most of the BD4D
literature represents a proof of concept rather than use of data for actual development-related decision-making and implementation. As just
one example, Pfeffer et al. (2015) demonstrate what (relatively little) tweets might tell us about the location of urban flooding and crime but without
any engagement with real-life urban planning decisions.
Representational Validity: Mobile phone ownership is skewed towards certain population groups based on income, gender, or age, leaving
specific groups and geographic areas under-represented in mobile-based sources of big data (Bengtsson et al., 2011; Tatem et al., 2014;
Wesolowski et al., 2014; Wilson et al., 2016). This is even more of a challenge for social media data, in terms of demographic and
socio-economic profiling and the geographic spread of the content generators (Pfeffer et al., 2015). BD sources were not specifically produced to
investigate, assess, or measure any of the presented development-related applications; they are rather a side effect (Pfeffer et al., 2015). They
provide one or more aspects of the studied issue but they might overlook other important aspects requiring on-the-ground, targeted inquiry
(Lu et al., 2016; Pfeffer et al., 2015). This explains the recent debates (Graham & Shelton, 2013; Pfeffer et al., 2015) around the combined
use of BD and other sources of data for better representational validity.
Human Capacity: Both researchers and practitioners typically lack the technical skills needed to clean up data sources, to link
different data sources, to analyse big data, to identify emerging patterns from big data, and so on (Pfeffer et al., 2015). There are also highly
unstructured data types, like satellite imagery, the analysis of which requires knowledge of advanced analytics and machine learning tools
(Jean et al., 2016). Those missing skills and local conditions, especially in developing countries, limit the exploitation of this valuable data source,
creating new digital divides (Batty et al., 2012).
Data Accessibility: Most big data are not open and easily accessible. Data gatekeepers, such as mobile operators and public institutions, are
not always willing to share their data, often because they consider it to be a source of commercial or political advantage (Pfeffer et al., 2015; Jean
et al., 2016a).
Privacy and Legal Issues: It can be difficult to link and analyse different data sources while respecting privacy, eg, of individuals who
produced the data (Blumenstock et al., 2015; Sutton et al., 2007). This can be particularly challenging in developing countries, where there is
an absence of legal frameworks protecting citizens (Pfeffer et al., 2015). As noted above, one reaction of data providers, like mobile operators,
is to restrict access to datasets. Another reaction is to anonymize CDRs by removing and aggregating some attributes. While understandable, this
can reduce the developmental value that can be captured from the provided datasets.
In summary, and despite the demonstrated value of using big data in a variety of development-related applications, it is important to note the
challenges associated with its use. Of particular relevance for this paper are challenges that could affect the significance of its use in positive
deviance, like privacy and accessibility. For instance, BD depicting human behaviour is the most relevant data for PD; however, if these data might
compromise the security or privacy or undermine the competitive advantage of data owners, their accessibility and usage would require strict rules and
principles backed by adequate tools and systems to ensure privacy-preserving analysis (UNGP, 2012).
Positive deviance has been shown to be effective as a problem-solving approach in certain development domains but it faces challenges that
have so far limited its uptake. Big data could potentially address some of those challenges and/or in other ways enhance current approaches to
positive deviance, providing there was an adapted PD framework that could guide its use. Conversely, positive deviance appears to provide an
approach to the detection of socio-economic anomalies that might broaden the application of big data in development. Alongside the growing
interest in and practice of both positive deviance and big data in development, this creates an opportunity for big data-based positive deviance
(BDPD). In this section, we will examine how BD could address some of the aforementioned PD challenges and we then propose a
means to operationalize their combined use.
5.1 |BD as a response to PD challenges
While big data cannot address all of the positive deviance-related challenges identified in Section 3.5, it has potential in relation to most of them:
5.1.1 |Time and cost
PD studies mainly use primary data collection both for identification of positive deviants and for PDI: the inquiry into what causes the deviant outcomes.
As noted above, primary data collection involves significant time, cost and risk, and use of other forms of data collection could therefore
offer important advantages. In light of this, a few studies have made use of traditional secondary data, such as that from surveys, but this brought
its own challenges; for example, it is hard to identify positive deviants from such datasets as they are often anonymized or out-of-date by the time
they become publicly accessible; or it may be difficult to explain causes of positive deviance as important factors are missing from the survey. In
addition, while the cost of reuse of survey data may be low, the actual financial costs of the original data-gathering are very high, particularly for
census data.
In comparison, big data brings with it not just the gains of reduced time and cost common to reuse of secondary datasets but the more foundational
reduction that the costs of initially gathering big data tend to be very low since it often makes use of already existing data exhaust
from digital processes. In part thanks to low cost, there are also (thinking of satellite imaging and social media data) increasing sources of
real-time big data. These avoid the problem of time lag (something particularly challenging with, say, census data which is often only gathered
every ten years). Finally, the lack of cost constraints means that big datasets often have a much greater geographical scale than other forms
of secondary data.
While big data does still suffer the secondary data shortcoming that it has been created for purposes other than PD analysis, it is increasingly
present in locations (such as poorer countries or communities) where survey data either tend to be lacking, or based on very small samples,
or inaccurate; these problems themselves sometimes deriving from the high cost and time requirements of surveys and the lack of
resources for these locations (Letouzé & Jütting, 2015; Mügge, 2014). Initiatives have already demonstrated the ability of big data, such as satellite
imaging or CDRs, to fill these data gaps and act as proxies for socio-economic indicators (Henderson et al., 2012; Jean et al., 2016;
Njuguna & McSharry, 2017).
Big data therefore show significant potential to help address the time, cost, and risk constraints faced by current positive deviance studies,
including the constraints associated with using traditional secondary data sources for PD analysis.
5.1.2 |Positive deviant identification
As mentioned earlier, the use of primary data collection to identify positive deviants has three main drawbacks: low sample power, inability to
identify dynamic anomalies, and limited aggregation analysis. Use of big data sources offers a potential to overcome those challenges as follows:
1a) Sample Power: Big data are produced passively at marginal or no additional cost whereas traditional data sources are produced actively
with a cost that is proportional to the size of the sample. As a result, BD sources tend to have a much larger coverage of populations. Given that
positive deviants are relatively rare, the larger samples from big datasets will enable the identification of a larger number of PDs. Accordingly, the
risk of overlooking important factors will be reduced and the ability to generalize practices to larger populations will be improved.
1b) New Identification Techniques: Although not recognized in the PD literature as a challenge, there is an opportunity offered via BDPD that
is not available to traditional positive deviance identification. This is the application of machine learning which, as noted above, works most
effectively with large datasets (Hilbert, 2016). Machine learning-based approaches for anomaly detection outperform simple statistical models
for various types of anomaly (Chandola, Banerjee, & Kumar, 2009; Goldstein & Uchida, 2016), thus providing the potential for better PD
identification than currently possible. Supervised machine learning can also be used for predictive analysis, by first being trained to analyse a
small sample set of big data in tandem with ground truth data: for example, satellite images combined with survey data. The survey data already
identifies the positive deviants, and machine learning then develops the ability to identify PDs within the corresponding satellite image data. The
analytical algorithms can then be applied solely to big data sources and will identify positive deviants from those sources on a much wider scale.
2) Dynamic Anomalies: Where traditional data sources typically provide a static, cross-sectional view of behaviour, big data can often provide
a dynamic picture of the targeted population over time. Hence, as discussed in Section 3.5, BDPD can be better at identifying and potentially
eliminating contextual, conditional anomalies as explanations for positive deviance.
3) New Levels of Aggregation: The majority of PD studies have individuals as the primary unit of analysis, whereas in BD studies, the unit of
analysis can range from individuals to communities and regions. Hence, the use of big data could provide PD with the ability to identify deviance
at different (ie, higher) levels of aggregation than just individuals.
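To make the identification step more concrete, the sketch below flags units whose outcome is unusually high relative to their own peer group, ie, a simple contextual anomaly test. It is a plain statistical baseline rather than one of the machine learning-based detectors discussed above; the robust z-score (median and median absolute deviation), the threshold of 3, and the village data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def find_positive_deviants(units, threshold=3.0):
    """Flag units that outperform their peers.

    units: list of (name, peer_group, outcome), where a higher outcome is better.
    A unit is flagged when its outcome lies more than `threshold` robust
    standard deviations above the median of its own peer group.
    """
    by_group = defaultdict(list)
    for _, group, outcome in units:
        by_group[group].append(outcome)

    stats = {}
    for group, outcomes in by_group.items():
        med = median(outcomes)
        mad = median([abs(v - med) for v in outcomes])
        # 1.4826 rescales the MAD so it is comparable to a standard deviation
        # for normally distributed data.
        stats[group] = (med, 1.4826 * mad)

    deviants = []
    for name, group, outcome in units:
        med, scale = stats[group]
        if scale > 0 and (outcome - med) / scale > threshold:
            deviants.append(name)
    return deviants


# Hypothetical outcome scores for villages in two income contexts.
villages = [
    ("A", "low-income", 10.0), ("B", "low-income", 11.0),
    ("C", "low-income", 9.0), ("D", "low-income", 10.5),
    ("E", "low-income", 18.0),
    ("F", "high-income", 20.0), ("G", "high-income", 21.0),
    ("H", "high-income", 19.0), ("I", "high-income", 20.5),
    ("J", "high-income", 19.5),
]
deviants = find_positive_deviants(villages)
```

Village E is flagged even though its score is below every high-income village: it is exceptional only in context, which is the kind of deviance a global point-anomaly test would miss.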
5.1.3 |Methodological risk
Big data are in almost all cases gathered without the explicit intervention of, or tangible visibility to, the subject populations. As such, risks arising
from populations knowing they are being observed and questioned, such as the Hawthorne effect or refusal to share practices, are avoided. In
addition, where BD sources incorporate outcome indicators of the positive deviant behaviours, then those sources can be used for ongoing
monitoring of the effects of a PD intervention. This might help address the challenge that the lack of credible monitoring and evaluation
techniques limits uptake of the PD approach, especially in new contexts and settings.
5.1.4 |Scalability
As discussed earlier, PD faces two scalability challenges: scaling practices within a community and scaling practices across communities. Both of
these issues are partly rooted in socio-behavioural factors that big data are unlikely to be able to address. But one can hypothesize some potential
added value.
For example, for the first challenge, unsupervised machine learning could be applied to a big dataset to cluster the PD intervention
population (based on machine-inferred similarities). Intervention could then target only those clusters with socio-economic determinants
similar to those of the deviants for practice dissemination and adoption. This could reduce both the time and cost required for scale-up in
comparison to the traditional methods of practice dissemination that rely heavily on community mobilization. Two potential limitations
should be noted, however: first that, by definition, positive deviants are sought within populations with similar socio-economic determinants; and second that
mobilization and incentives are always likely to be important in any type of PD implementation. There is also a small possibility that BD-based,
statistically verified evidence might prove more convincing to potential adopters of PD behaviours than the more qualitative findings typical of
traditional PD.
The second challenge could be mitigated if cross-community big data sources are available. These would enable the identification of PDs and
their behaviour on a broader scale, making generalizations possible.
5.1.5 |Narrow domain/geographic scope
Despite the effectiveness of PD as a problem-solving approach for international development, its uptake by developing countries in domains other
than public health has been very limited, and its application has been concentrated in rural areas of just a few countries. While geographic
coverage of BD4D in the literature to date has also been concentrated, that literature already illustrates more application in urban areas and
application in several other domains. Big data can thus expand the scope of PD, enabling it to break from its current path dependency. Outline
domain examples where big data-based positive deviance could operate include the following:
Infectious Disease Control: A number of studies (Nieto-Sanchez et al., 2015; Tatem et al., 2014; Wesolowski et al., 2014; Wesolowski et al.,
2015) used CDRs to map the travel patterns of individuals who are members of disease "sources" (areas with many reported disease cases), in
order to identify areas vulnerable to transmission, known as "sinks" (areas having high inflows of individuals from source areas). The aim of
such studies was to prioritize source and sink areas for disease control. PDs in this case would be "sink" areas with very high potential for
transmission due to high travel inflows from source areas, yet having only a small number of reported cases. Understanding and
identifying the measures and the factors that were behind this deviance could provide valuable insights into successful disease control for other
infected areas.
Urban Resilience/Planning: In Zhang et al. (2016), factors affecting waterlogging in one city were used to predict waterlogging in another city
using satellite imagery, precipitation meteorological data, terrain data, and road maps. Positive deviance could be used to investigate why certain
areas (PDs) within the same city experience less frequent waterlogging than others, and to use those factors (eg, infrastructure and road networks) for
better urban planning.
Academic Research: In Surjandari et al. (2015), Indonesian scholars' publications indexed in Scopus were analysed to map their primary
research themes and advise on a nationwide research roadmap. One could aggregate those publications by department, and identify departments
with exceptionally high research publication quality (PDs) as proxied by citation indicators in Scopus or equivalent sources, eg, using the average
h-index of the publishing authors (where h is the highest number such that a scholar has h papers that have each been cited at least h times).
Understanding factors leading to better publication quality would provide insights into departmental-level good
practices that could be adopted in other departments.
Deforestation: In Burgess et al. (2012), satellite images were used to identify forested districts in Indonesia that are practising illegal logging.
Positive deviance could be applied to identify districts with minimal illegal logging activities (PDs) and then investigate the measures and practices in
place within those districts that are linked to that minimization.
Agriculture: In Tesfaye et al. (2016), crop, soil, and climate data were used to assess the performance of new drought-tolerant crop varieties.
These data could be used to identify those smallholder farms with high productivity (PDs), ie, high output levels from drought-tolerant crops. Using
these or other big datasets, or survey data, one could infer good practices that could be adopted by neighbouring farms facing the same social,
economic, and environmental constraints.
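For the academic research example above, the citation measure h (the highest number such that a scholar has h papers each cited at least h times) and its departmental average can be computed as in the following sketch. The department names and citation counts are hypothetical, and averaging h across authors is only one of several possible proxies for publication quality.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def average_h_index_by_department(scholars):
    """scholars: list of (department, citations_per_paper).

    Returns a dict mapping each department to the mean h-index of its scholars.
    """
    totals = {}
    for department, citations in scholars:
        total, count = totals.get(department, (0, 0))
        totals[department] = (total + h_index(citations), count + 1)
    return {d: total / count for d, (total, count) in totals.items()}


# Hypothetical scholars: (department, citation count of each paper).
scholars = [
    ("Physics", [10, 8, 5, 4, 3]),
    ("Physics", [25, 8, 5, 3, 3]),
    ("History", [2, 1, 0]),
]
averages = average_h_index_by_department(scholars)
```

Departments could then be compared against peers with similar resources, since citation norms differ widely between disciplines and a raw cross-discipline ranking would mostly reflect those norms.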
5.2 |PD as an opportunity for BD4D applications
The primary interest of this paper is that outlined in the previous section: to identify current challenges to positive deviance action research and to
identify ways in which big data-based positive deviance might address those challenges. But we can also reverse the polarity of the investigation
and ask what positive deviance might offer the subfield of big data for development. Reviewing Section 4.7, there is little or nothing that positive
deviance can do to address issues of representational validity, human capacity, data accessibility, or privacy and legal issues. Instead, these
represent constraints that BDPD would have to contend with.
But those working on big data do identify a need to develop methodologies to characterize and detect socio-economic anomalies in context
(UNGP, 2012). Use of positive deviance to analyse big datasets and to detect anomalies in context cannot be said to provide a theoretical
foundation in an academic sense, but it does offer a conceptual frame and a systematic methodology (a theory of change) that can link big data
to development outcomes, something which has typically been missing to date. And, at least if all five steps of the positive deviance approach
were undertaken, BDPD would help big data for development move beyond just proof of concept, by creating a real-world impact from the
analysis of big data.
5.3 |Towards big databased positive deviance analysis
In summary, we have a two-sided argument in favour of a big data-based positive deviance approach. In particular, using big data instead of, or in
conjunction with, traditional primary data sources can potentially address many of the challenges currently faced by positive deviance: reducing
time, cost, and effort; identifying positive deviants in new or better ways; and enabling positive deviance to break out of its current path
dependencies. And, conversely, positive deviance provides a systematic basis for extracting real-world development impacts from big data by
putting knowledge about anomalies into action.
We can summarize big data-based positive deviance as follows:
The BDPD approach is a problem-solving, asset-based approach that uses big data sources to identify objects (positive deviants) performing
unexpectedly well in a specific outcome measure that is digitally recorded, mediated, or observed. The primary objective of the BDPD approach
is to identify the behaviours, strategies, and factors employed by the positive deviants and develop interventions to facilitate the dissemination
and adoption of those strategies.
BDPD objects (the positive deviants) could be individuals, communities, entities, areas, or countries whose uncommon behaviours and
strategies, in a specific context, can be translated into a performance measure that is digitally recorded, mediated, or observed.
We end with Table 3, which compares the PD and BDPD approaches in terms of the data sources used, the type of anomalies detected, the
possible units of analysis, and the employed research methods and techniques.
We will be taking forward work applying BDPD, and we hope that other development researchers and practitioners may be encouraged to do
the same.
TABLE 3 Comparison between the positive deviance (PD) and the big data-based positive deviance (BDPD) approach
Approach | Data Sources Used | Type of Positive Deviants | Unit of Analysis | Data Analysis
PD | Surveys, focus groups, interviews, observations | Point anomalies | Individuals and entities | Qualitative, or mixed; statistical methods
BDPD | Government data, online incl. social media data, mobile data, physical sensor incl. remote sensing data, offline and online surveys, focus groups, interviews, observations | Point and contextual (time and spatial) | Individuals, entities, communities, regions, countries, etc. | or mixed; advanced analytics and statistical
Basma Albanna
Richard Heeks
Aday, J., Hyden, A., Osking, J., & Tomedi, A. (2016). Hygiene, sanitation, and behaviors that produce positive deviant outcomes in childhood growth in rural eastern Kenya: A qualitative positive deviant investigation. Annals of Global Health, 82(3), 437.
Ahrari, M., Kuttab, A., Khamis, S., Farahat, A. A., Darmstadt, G. L., Marsh, D. R., & Levinson, F. J. (2002). Factors associated with successful pregnancy outcomes in upper Egypt: A positive deviance inquiry. Food and Nutrition Bulletin, 23(1), 83-88.
Andrews, M. (2015). Explaining positive deviance in public sector reforms in development. World Development, 74, 197-208.
Anino, O. C., Were, G. M., & Khamasi, J. W. (2015). Impact evaluation of positive deviance hearth in Migori County, Kenya. African Journal of Food, Agriculture, Nutrition and Development, 15(5), 10578-10596.
Aruna, M., Vazir, S., & Vidyasagar, P. (2001). Child rearing and positive deviance in the development of preschoolers: A microanalysis. Indian Pediatrics, 38(4),
Awofeso, N., Irwin, T., & Forrest, G. (2008). Using positive deviance techniques to improve smoking cessation outcomes in New South Wales prison settings. Health Promotion Journal of Australia, 19(1), 72-73.
Babalola, S. (2007). Motivation for late sexual debut in Cote d'Ivoire and Burkina Faso: A positive deviance inquiry. Journal of HIV/AIDS Prevention in Children and Youth, 7(2), 65-87.
Babalola, S., Awasum, D., & Quenum-Renaud, B. (2002). The correlates of safe sex practices among Rwandan youth: A positive deviance approach. African Journal of AIDS Research, 1(1), 11-21.
Bamberger, M. (2016). Integrating big data into the monitoring and evaluation of development programmes. New York, NY: UN Global Pulse.
Batty, M., Axhausen, K. W., Giannotti, F., Pozdnoukhov, A., Bazzani, A., Wachowicz, M., Portugali, Y. (2012). Smart cities of the future. European Physical Journal: Special Topics, 214(1), 481-518.
Bengtsson, L., Lu, X., Thorson, A., Garfield, R., & von Schreeb, J. (2011). Improved response to disasters and outbreaks by tracking population movements with mobile phone network data: A post-earthquake geospatial study in Haiti. PLoS Medicine, 8(8), 1-9.
Birhanu, M. Y., Girma, A., & Puskur, R. (2017). Determinants of success and intensity of livestock feed technologies use in Ethiopia: Evidence from a positive deviance perspective. Technological Forecasting and Social Change, 115, 15-25.
Blumenstock, J. E., Cadamuro, G., & On, R. (2015). Predicting poverty and wealth from mobile phone metadata. Science, 350(6264), 1073-1076. https://doi.
Blumenstock, J. E., & Eagle, N. (2012). Divided we call: Disparities in access and use of mobile phones in Rwanda. Information Technologies and International Development, 8(2), 1-16.
Bolles, K., Speraw, C., Berggren, G., & Lafontant, J. G. (2002). Ti foyer (hearth) community-based nutrition activities informed by the positive deviance approach in Leogane, Haiti: A programmatic description. Food and Nutrition Bulletin, 23(4 Supp), 11-17.
Bradley, E. H., Curry, L. A., Ramanadhan, S., Rowe, L., Nembhard, I. M., & Krumholz, H. M. (2009). Research in action: Using positive deviance to improve quality of health care. Implementation Science, 4, 25.
Buckee, C. O., Wesolowski, A., Eagle, N. N., Hansen, E., & Snow, R. W. (2013). Mobile phones and malaria: Modeling human and parasite travel. Travel Medicine and Infectious Disease, 11(1), 15-22.
Burgess, R., Hansen, M., Olken, B. A., Potapov, P., & Sieber, S. (2012). The political economy of deforestation in the tropics. The Quarterly Journal of Economics, 127(4), 1707-1754.
Cartesian (2014). Using Mobile Data for Development. Boston, MA: Cartesian.
Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection. ACM Computing Surveys, 41(3), 1-58.
Chandy, R., Hassan, M., & Mukherji, P. (2017). Big data for good: Insights from emerging markets. Journal of Product Innovation Management, 34(5),
D'Alimonte, M. R., Deshmukh, D., Jayaraman, A., Chanani, S., & Humphries, D. L. (2016). Using positive deviance to understand the uptake of optimal infant and young child feeding practices by mothers in an urban slum of Mumbai. Maternal and Child Health Journal, 20(6), 1133-1142.
de Macedo, R. C., Jacob, E. M., Silva, V. P., Santana, E. A., Souza, A. F., Gonçalves, P., Edmond, M. B. (2012). Positive deviance: Using a nurse call system to evaluate hand hygiene practices. American Journal of Infection Control, 40(10), 946-950.
Delias, P. (2017). A positive deviance approach to eliminate wastes in business processes. Industrial Management & Data Systems, 117(7), 1323-1339.
Doll, C. N. H., & Pachauri, S. (2010). Estimating rural populations without access to electricity in developing countries through nighttime light satellite imagery. Energy Policy, 38(10), 5661-5670.
Escobar, N. M. O., Márquez, I. A., Quiroga, J. A., Trujillo, T. G., González, F., Aguilar, M. I., & Escobar-Pérez, J. (2017). Using positive deviance in the prevention and control of MRSA infections in a Colombian hospital: A time-series analysis. Epidemiology and Infection, 145(5), 981-989, DOI:
Felt, L. J. (2011). Present promise, future potential: Positive deviance and complementary theory. Unpublished manuscript.
Frias-Martinez, V., & Virseda, J. (2013). Cell phone analytics: Scaling human behavior studies into the millions. Information Technologies and International Development, 9(2), 35-50.
Garrett, J. J., & Barrington, C. (2013). We do the impossible: Women overcoming barriers to cervical cancer screening in rural Honduras - A positive deviance analysis. Culture, Health & Sexuality, 15(6), 637-651.
Gartner (2013). Big Data. Stamford, CT: Gartner.
Gidado, M., Obasanya, J. O., Adesigbe, C., Huji, J., & Tahir, D. (2010). Role of positive deviants among leprosy self-care groups in leprosy settlement, Zaria, Nigeria. Journal of Community Medicine and Primary Health Care, 22(1-2).
Goldstein, M., & Uchida, S. (2016). A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS One, 11(4),
Graham, M., & Shelton, T. (2013). Geography and the future of big data, big data and the future of geography. Dialogues in Human Geography, 3(3),
Greenhalgh, T., & Peacock, R. (2005). Effectiveness and efficiency of search methods in systematic reviews of complex evidence: Audit of primary sources. British Medical Journal, 331(7524), 1064-1065.
Guldan, G. S., Zhang, M. Y., Zhang, Y. P., Hong, J. R., Zhang, H. X., Fu, S. Y., & Fu, N. S. (1993). Weaning practices and growth in rural Sichuan infants: A positive deviance study. Journal of Tropical Pediatrics, 39(3), 168-175.
Haas, E. M., Bartholomé, E., & Combal, B. (2009). Time series analysis of optical remote sensing data for the mapping of temporary surface water bodies in sub-Saharan western Africa. Journal of Hydrology, 370(1-4), 52-63.
Henderson, J. V., Storeygard, A., & Weil, D. N. (2012). Measuring economic growth from outer space. American Economic Review, 102(2), 994-1028. https://
Hendrickson, J. L., Dearden, K., Pachón, H., An, N. H., Schroeder, D. G., & Marsh, D. R. (2002). Empowerment in rural Viet Nam: Exploring changes in mothers and health volunteers in the context of an integrated nutrition project. Food and Nutrition Bulletin, 23(4 Supp), 86-94.
Herington, M. J., & van de Fliert, E. (2018). Positive deviance in theory and practice: A conceptual review. Deviant Behavior, 39(5), 664-678.
Hilbert, M. (2016). Big data for development: A review of promises and challenges. Development and Policy Review, 34(1), 135-174.
Huang, Z., Cao, F., Jin, C., Yu, Z., & Huang, R. (2017). Carbon emission flow from self-driving tours and its spatial relationship with scenic spots - A traffic-related big data method. Journal of Cleaner Production, 142, 946-955.
Ifaei, P., Karbassi, A., Lee, S., & Yoo, C. (2017). A renewable energies-assisted sustainable development plan for Iran using techno-econo-socio-environmental multivariate analysis and big data. Energy Conversion and Management, 153(October), 257-277.
Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790-794.
Jiang, W., Jia, K., Chen, Z., Deng, Y., & Rao, P. (2017). Using spatiotemporal remote sensing data to assess the status and effectiveness of the underground coal fire suppression efforts during 2000-2015 in Wuda, China. Journal of Cleaner Production, 142, 565-577.
Kanani, S., & Popat, K. (2012). Growing normally in an urban environment: Positive deviance among slum children of Vadodara, India. Indian Journal of Pediatrics, 79(5), 606-611.
Khabsa, M., & Giles, C. L. (2014). The number of scholarly documents on the public web. PLoS One, 9(5), e93949.
Kim, Y. M., Heerey, M., & Kols, A. (2008). Factors that enable nurse-patient communication in a family planning context: A positive deviance study. International Journal of Nursing Studies, 45(10), 1411-1421.
Kohli, D., Sliuzas, R., Kerle, N., & Stein, A. (2012). An ontology of slums for image-based classification. Computers, Environment and Urban Systems, 36(2),
Lackovich-Van Gorp, A. (2017). Unearthing local forms of child protection: Positive deviance and abduction in Ethiopia. Action Research, 15(1), 39-52.
Lapping, K., Marsh, D. R., Rosenbaum, J., Swedberg, E., Sternin, J., Sternin, M., & Schroeder, D. G. (2002). The positive deviance approach: Challenges and opportunities for the future. Food and Nutrition Bulletin, 23(4 Supp), 128-135.
Lapping, K., Schroeder, D., Marsh, D., Albalak, R., & Jabarkhil, M. Z. (2002). Comparison of a positive deviant inquiry with a case-control study to identify factors associated with nutritional status among Afghan refugee children in Pakistan. Food and Nutrition Bulletin, 23(4 Supp2), 26-33.
LeMahieu, P. G., Nordstrum, L. E., & Gale, D. (2017). Positive deviance: Learning from positive anomalies. Quality Assurance in Education, 25(1), 109-124.
Letouzé, E., & Jütting, J. (2015). Official statistics, big data and human development: Towards a New Conceptual and Operational Approach. Data Pop Alliance.
Levinson, F. J., Barney, J., Bassett, L., & Schultink, W. (2007). Utilization of positive deviance analysis in evaluating community-based nutrition programs: An application to the Dular program in Bihar, India. Food and Nutrition Bulletin, 28(3), 259-265.
Li, S., Ye, X., Lee, J., Gong, J., & Qin, C. (2017). Spatiotemporal analysis of housing prices in China: A big data perspective. Applied Spatial Analysis and Policy, 10(3), 421-433.
Long, K. N., Gren, L. H., Rees, C. A., West, J. H., Hall, P. C., Gray, B., & Crookston, B. T. (2013). Determinants of better health: A crosssectional assessment
of positive deviants among women in West Bengal. BMC Public Health,13, 372.
Lu, X., Bengtsson, L., & Holme, P. (2012). Predictability of population displacement after the 2010 Haiti earthquake. Proceedings of the National Academy of
Sciences,109(29), 1157611581.
Lu, X., Wrathall, D. J., Sundsøy, P. R., Nadiruzzaman, M., Wetter, E., Iqbal, A., … Bengtsson, L. (2016). Unveiling hidden migration and mobility patterns in
climate stressed regions: A longitudinal study of six million anonymous mobile phone users in Bangladesh. Global Environmental Change, 38, 1–7.
Luo, X., Dong, L., Dou, Y., Zhang, N., Ren, J., Li, Y., … Yao, S. (2017). Analysis on spatial-temporal features of taxis' emissions from big data informed travel
patterns: A case of Shanghai, China. Journal of Cleaner Production, 142, 926–935.
Mackintosh, U. A., Marsh, D. R., & Schroeder, D. G. (2002). Sustained positive deviant child care practices and their effects on child growth in Viet Nam.
Food and Nutrition Bulletin, 23(4 Supp), 18–27.
Marra, A. R., Guastelli, L. R., de Araújo, C. M. P., dos Santos, J. L. S., Lamblet, L. C. R., Silva, M., … dos Santos, O. F. P. (2010). Positive deviance: A new
strategy for improving hand hygiene compliance. Infection Control and Hospital Epidemiology, 31(1), 12–20.
Marra, A. R., Noritomi, D. T., Westheimer Cavalcante, A. J., Sampaio Camargo, T. Z., Bortoleto, R. P., Durao Junior, M. S., … Positive Deviance For Hand
Hygiene Study Group (2013). A multicenter study using positive deviance for improving hand hygiene compliance. American Journal of Infection Control,
41(11), 984–988.
Marra, A. R., Reis Guastelli, L., Pereira de Araújo, C. M., Saraiva dos Santos, J. L., Filho, M. A. O., Silva, C. V., … Edmond, M. B. (2011). Positive deviance: A
program for sustained improvement in hand hygiene compliance. American Journal of Infection Control, 39(1), 1–5.
Marsh, D. R., Schroeder, D. G., Dearden, K. A., Sternin, J., & Sternin, M. (2004). The power of positive deviance. British Medical Journal, 329(7475),
Marsh, D. R., Sternin, M., Khadduri, R., Ihsan, T., Nazir, R., Bari, A., & Lapping, K. (2002). Identification of model newborn care practices through a positive
deviance inquiry to guide behavior-change interventions in Haripur, Pakistan. Food and Nutrition Bulletin, 23(4 Supp), 109–118.
Mashey, J. R. (1998) Big data and the next wave of infrastress, paper presented at Computer Systems Laboratory Colloquium, Stanford University, CA, 25 Feb.
Merchant, S. S., & Udipi, S. A. (1997). Positive and negative deviance in growth of urban slum children in Bombay. Food and Nutrition Bulletin, 18(4),
Merita, M., Sari, M. T., & Hesty, H. (2017). The positive deviance of feeding practices and carring with nutritional status of toddler among poor families.
Jurnal Kesehatan Masyarakat, 13(1), 106–112.
Mertens, W., Recker, J., Kohlborn, T., & Kummer, T.-F. (2016). A framework for the study of positive deviance in organizations. Deviant Behavior, 37(11),
Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement.
British Medical Journal, 339, b2535.
Mügge, D. (2014). Poor numbers: How we are misled by African development statistics and what to do about it, by Morten Jerven. Review of International
Political Economy, 21(5), 1131–1133.
Ndiaye, M., Siekmans, K., Haddad, S., & Receveur, O. (2009). Impact of a positive deviance approach to improve the effectiveness of an iron
supplementation program to control nutritional anemia among rural Senegalese pregnant women. Food and Nutrition Bulletin, 30(2), 128–136.
Nel, H. (2018). A comparison between the asset-oriented and needs-based community development approaches in terms of systems changes. Practice,
30(1), 33–52.
Nieto-Sanchez, C., Baus, E. G., Guerrero, D., & Grijalva, M. J. (2015). Positive deviance study to inform a Chagas disease control program in southern
Ecuador. Memórias do Instituto Oswaldo Cruz, 110(3), 299–309.
Nishat, N., & Batool, I. (2011). Effect of Positive Hearth Deviance on feeding practices and underweight prevalence among children aged 6–24 months in
Quetta district, Pakistan: A comparative cross sectional study. Sri Lanka Journal of Child Health, 40(2), 57–62.
Njuguna, C., & McSharry, P. (2017). Constructing spatiotemporal poverty indices from big data. Journal of Business Research, 70, 318–327.
OECD (2017). DAC list of ODA recipients. Paris: Organisation for Economic Co-operation and Development.
Okoli, C. (2015). A guide to conducting a standalone systematic literature review. Communications of the AIS, 37, 879–910.
Osborne, J. W., & Overbay, A. (2004). The power of outliers (and why researchers should always check for them). Practical Assessment, Research &
Evaluation, 9(6), 1–8.
Pascale, R., Sternin, J., & Sternin, M. (2010). The power of positive deviance: How unlikely innovators solve the world's toughest problems. Boston, MA: Harvard
Business Press.
Petticrew, M., & Roberts, H. (2005). Systematic reviews in the social sciences: A practical guide. Chichester, UK: John Wiley & Sons.
Pfeffer, K., Verrest, H., & Poorthuis, A. (2015). Big data for better urban life? An exploratory study of critical urban issues in two Caribbean cities:
Paramaribo (Suriname) and Port of Spain (Trinidad and Tobago). European Journal of Development Research, 27(4), 505–522.
Positive Deviance Initiative (2010). Basic field guide to the positive deviance approach. Medford, MA: Positive Deviance Initiative, Tufts University.
Roche, M. L., Marquis, G. S., Gyorkos, T. W., Blouin, B., Sarsoza, J., & Kuhnlein, H. V. (2017). A community-based positive deviance/hearth infant and young
child nutrition intervention in Ecuador improved diet and reduced underweight. Journal of Nutrition Education and Behavior, 49(3), 196–203. https://doi.
Saïd Business School (2010). Exploring positive deviance - New frontiers in collaborative change. Oxford, UK: Saïd Business School, University of Oxford.
Sengupta, R., Heeks, R., Chattapadhyay, S. & Foster, C. (2017) Exploring big data for development: An electricity sector case study from India, Development
Informatics Working Paper 66. Manchester, UK: Global Development Institute, University of Manchester.
Sethi, V., Kashyap, S., Seth, V., & Agarwal, S. (2003). Encouraging appropriate infant feeding practices in slums: A positive deviance approach. Pakistan
Journal of Nutrition, 2(3), 164–166.
Sethi, V., Sternin, M., Sharma, D., Bhanot, A., & Mebrahtu, S. (2017). Applying positive deviance for improving compliance to adolescent anemia control
program in tribal communities of India. Food and Nutrition Bulletin, 38(3), 447–452.
Shekar, M., Habicht, J. P., & Latham, M. C. (1991). Is positive deviance in growth simply the converse of negative deviance? Food and Nutrition Bulletin, 13(1),
Shekar, M., Habicht, J. P., & Latham, M. C. (1992). Use of positive-negative deviant analyses to improve programme targeting and services: Example from
the Tamil Nadu integrated nutrition project. International Journal of Epidemiology, 21(4), 707–713.
Shoenberger, N. A. (2017). Bridging normative and reactivist perspectives: An introduction to positive deviance. In S. E. Brown, & O. Sefiha (Eds.), Routledge
handbook on deviance (pp. 42–51). Abingdon, UK: Routledge.
Singhal, A. (2011). Turning diffusion of innovation paradigm on its head: The positive deviance approach to social change. In A. Vishwanath, & G. A. Barnett
(Eds.), The diffusion of innovations (pp. 193–205). New York, NY: Peter Lang.
Song, M., & Wang, S. (2017). Participation in global value chain and green technology progress: Evidence from big data of Chinese enterprises. Environmental
Science and Pollution Research, 24(2), 1648–1661.
Springer, A., Nielsen, C., & Johansen, I. (2016). Positive deviance by the numbers. Medford, MA: Positive Deviance Initiative, Tufts University.
Sternin, J. (2002). Positive deviance: A new paradigm for addressing today's problems today. The Journal of Corporate Citizenship, 5(Spring), 57–63.
Sternin, J., & Choo, R. (2000). The power of positive deviancy. Harvard Business Review, 78(1), 13.
Sternin, M., Sternin, J., & Marsh, D. (1997). Rapid sustained childhood malnutrition alleviation through a positive-deviance approach in rural Vietnam: Preliminary
findings. Arlington, VA: Partnership for Child Health Care, BASICS.
Sternin, M., Sternin, J., & Marsh, D. (1998). Designing a community-based nutrition program using the hearth model and the positive deviance approach: A field
guide. Westport, CT: Save the Children.
Su, S., Li, Z., Xu, M., Cai, Z., & Weng, M. (2017). A geo-big data approach to intra-urban food deserts: Transit-varying accessibility, social inequalities, and
implications for urban planning. Habitat International, 64, 22–40.
Surjandari, I., Dhini, A., Lumbantobing, E. W. I., Widari, A. T., & Prawiradinata, I. (2015). Big data analysis of Indonesian scholars' publications: A research
theme mapping. International Journal of Technology, 6(4), 650–658.
Sutton, P., Elvidge, C., & Ghosh, T. (2007). Estimation of gross domestic product at sub-national scales using nighttime satellite imagery. International Journal
of Ecological Economics & Statistics, 8(S07), 5–21.
Tang, H., Yang, X., & Zhang, Y. (2014). Effort at constructing big data sensor networks for monitoring greenhouse gas emission. International Journal of
Distributed Sensor Networks, 10(7), 619608.
Tatem, A. J., Huang, Z., Narib, C., Kumar, U., Kandula, D., Pindolia, D. K., … Lourenço, C. (2014). Integrating rapid risk mapping and mobile phone call record
data for strategic malaria elimination planning. Malaria Journal, 13, 52.
Tatem, A. J., Noor, A. M., von Hagen, C., Di Gregorio, A., & Hay, S. I. (2007). High resolution population maps for low income nations: Combining land cover
and census in East Africa. PLoS One, 2(12), e1298.
Tatem, A. J., Qiu, Y., Smith, D. L., Sabot, O., Ali, A. S., & Moonen, B. (2009). The use of mobile phone data for the estimation of the travel patterns and
imported Plasmodium falciparum rates among Zanzibar residents. Malaria Journal, 8, 287.
Taylor, L., & Broeders, D. (2015). In the name of development: Power, profit and the datafication of the global South. Geoforum, 64, 229–237. https://doi.
Tekle, L. (2015). Analysis of positive deviance farmer training centers in Northern Ethiopia. American Journal of Rural Development, 3(1), 10–14.
Tesfaye, K., Sonder, K., Cairns, J., Magorokosho, C., Tarekegn, A., Kassie, G. T., … Erenstein, O. (2016). Targeting drought-tolerant maize varieties in
Southern Africa: A geospatial crop modeling approach using big data. International Food and Agribusiness Management Review, 19(A), 75–92.
Tong, T., & Li, H. (2017). Demand for MOOC - An application of big data. China Economic Review. in press, DOI:
UNGP (2012). Big data for development: Challenges & opportunities. New York, NY: UN Global Pulse.
Vossenaar, M., Bermúdez, O. I., Anderson, A. S., & Solomons, N. W. (2010). Practical limitations to a positive deviance approach for identifying dietary
patterns compatible with the reduction of cancer risk. Journal of Human Nutrition and Dietetics, 23(4), 382–392.
Vossenaar, M., Mayorga, E., Soto-Méndez, M. J., Medina-Monchez, S. B., Campos, R., Anderson, A. S., & Solomons, N. W. (2009). The positive deviance
approach can be used to create culturally appropriate eating guides compatible with reduced cancer risk. Journal of Nutrition, 139(4), 755–762.
Wesolowski, A., Metcalf, C. J. E., Eagle, N., Kombich, J., Grenfell, B. T., Bjørnstad, O. N., … Buckee, C. O. (2015). Quantifying seasonal population fluxes
driving rubella transmission dynamics using mobile phone data. Proceedings of the National Academy of Sciences, 112(35), 11114–11119. https://doi.
Wesolowski, A. et al. (2014) Quantifying travel behavior for infectious disease research: A comparison of data from surveys and mobile phones, Scientific
Reports, 4, 5678.
Wilson, C. (2011). Digital media in the Egyptian revolution: Descriptive analysis from the Tahrir data sets. International Journal of Communication, 5,
Wilson, R. et al. (2016) Rapid and near real-time assessments of population displacement using mobile phone data following disasters: The 2015 Nepal
earthquake, PLoS Currents Disasters, 24 Feb, 8.
Wishik, S. M., & Van Der Vynckt, S. (1976). The use of nutritional positive deviants to identify approaches for modification of dietary practices. American
Journal of Public Health, 66(1), 38–42.
Wollinka, O., Keeley, E., Burkhalter, B. R., & Bashir, N. (1997). Hearth nutrition model: Applications in Haiti, Viet Nam and Bangladesh. Arlington, VA: Partnership
for Child Health Care, BASICS.
WP (2018) Open Data. Wikipedia.
Zaidi, Z., Jaffery, T., Shahid, A., Moin, S., Gilani, A., & Burdick, W. (2012). Change in action: Using positive deviance to improve student clinical performance.
Advances in Health Sciences Education, 17(1), 95–105.
Zeitlin, M. (1991). Nutritional resilience in a hostile environment: Positive deviance in child nutrition. Nutrition Reviews, 49(9), 259–268.
Zeitlin, M., Ghassemi, H., & Mansour, M. (1994). Positive deviance in child nutrition: A discussion. Ecology of Food and Nutrition, 31(3–4), 295–302. https://
Zeitlin, M. F., Ghassemi, H., Mansour, M., & United Nations University & Joint WHO/UNICEF Nutrition Support Programme (1990). Positive deviance in
child nutrition: With emphasis on psychosocial and behavioural aspects and implications for development. Tokyo: United Nations University.
Zhang, N., Chen, H., Chen, J. & Chen, X. (2016) Social media meets big urban data: A case study of urban waterlogging analysis, Computational Intelligence
and Neuroscience, 2016.
Basma Albanna is a Development Informatics PhD student at the Global Development Institute, University of Manchester and an assistant
lecturer at the Faculty of Computer Science, Ain Shams University, Cairo, Egypt. She has been working in development organizations since
2012, particularly in supporting social innovations. She has publications in semantic trajectories and location-based social networks. Basma
holds a Master's degree in computer science and another Master's degree in management of technology.
Richard Heeks is Chair in Development Informatics at the Global Development Institute, University of Manchester and Director of the Centre
for Development Informatics ( He has been consulting and researching on informatics and development for
more than 30 years. His book publications include India's Software Industry (1996), Reinventing Government in the Information Age (1999),
Implementing and Managing eGovernment (2006), ICTs, Climate Change and Development (2012), and Information and Communication
Technology for Development (2018). His research interests are data-intensive development, e-resilience and e-sustainability, digital development,
and the digital economy in developing countries. He has a PhD in Indian IT industry development, directs the MSc programme in ICTs for
Development, and runs the ICT for Development blog:
How to cite this article: Albanna B, Heeks R. Positive deviance, big data, and development: A systematic literature review. E J Info Sys Dev
Countries. 2019;85:e12063.
... Unlike traditional 'top-down' approaches to development that are often not sustainable, approaches that stimulate learning and behavioral change by beneficiary groups may offer more promising results (Albanna & Heeks, 2019). Scaling indigenous, local solutions may increase adoption, as these solutions are developed from local experience gained over time and are adapted to local culture and environment (Makate, 2020). ...
... Developed initially by nutrition researchers Marsh et al., 2004) and with a history in health-related research (Bradley et al., 2009;Feng et al., 2016), the Positive Deviance approach is increasingly used within international development research. It can reduce the dependence on external expertise while relying on local resources and know-how (Albanna & Heeks, 2019). Examples include agricultural development (Steinke et al., 2019), farming system redesign (Toorop et al., 2020), and environmental stewardship in artisanal mining (Schwartz et al., 2021). ...
Full-text available
Context-adapted interventions are needed to alleviate the burden of food and nutrition insecurity on resource-poor rural households in southeastern Madagascar. The Positive Deviance approach implies identifying locally viable development solutions by focusing on particularly successful, innovative individuals. To identify promising practices that could be promoted as part of food and nutrition security (FNS) interventions in the Atsimo Atsinanana region of southeastern Madagascar, positive deviance was searched among smallholder farmers. Positive deviants are defined as households with overall optimal performance across four aspects of FNS: household-level food security, women’s diet quality, child’s diet quality, and low diarrhea incidence. To identify positive deviants, a two-step procedure was followed. Based on quantitative survey data from 413 rural smallholder households (mother-child pairs) with a child aged between 6 and 23 months, each household’s four performance scores were adjusted by removing the average effects of household resources. Then, households with Pareto-optimal performance were identified regarding the four aspects. Subsequently, 16 positive deviants were revisited and positive deviant practices were identified through in-depth interviews. A set of practices were validated through focus group discussions with local nutrition and agriculture experts. Positive deviant practices include the adoption of agricultural innovation, such as new cash crops, as well as nutrition-sensitive market behaviors and reliance on off-farm activities. In addition, some ethno-cultural factors help to explain positive deviance. These diverse positive deviant practices may serve as examples and inspiration for locally grounded development interventions targeting FNS in southeastern Madagascar.
... The approach focuses on recognising the PD that causes a person to outperform others in the same community [23]. Such practice is eventually shared within the community to deal with the community's problem [24]. This approach also emphasizes on the application of culturallyacceptable solutions in the local setting that can potentially promote positive behavioural changes among mothers. ...
... Elsewhere, a few intervention studies are using the PD approach to improve the nutritional status of undernourished children in countries such as Ecuador, Ethiopia, and Burundi [25][26][27][28][29]. Nonetheless, mixed findings have been shown in other countries. A few studies showed that the approach was successful in improving nutritional status with a reduction in undernutrition prevalence, significant weight gain, and increased nutrient intake of children while some studies showed no effect [24,30,31]. Apart from the positive outcomes in terms of growth and nutrient intake, the effectiveness of a PD programme in improving other aspects such as food security status and nutritional knowledge of mothers remains unknown. ...
Full-text available
Background Childhood undernutrition remains a public health issue that can lead to unfavourable effects in later life. These effects tend to be more devastating among urban poor young children, especially in light of the recent COVID-19 pandemic. There is an immediate need to introduce interventions to reduce childhood undernutrition. This paper described the study protocol of a nutrition programme that was developed based on the positive deviance approach and the evaluation of the effectiveness of the programme among urban poor children aged 3 to 5 years old. Methods This mixed-method study will be conducted in two phases at low-cost flats in Kuala Lumpur. Phase one will involve a focus group discussion with semi-structured interviews to explore maternal feeding practices and the types of food fed to the children. Phase two will involve a two-armed cluster randomised controlled trial to evaluate the effectiveness of a programme developed based on the positive deviance approach. The programme will consist of educational lessons with peer-led cooking demonstrations, rehabilitation, and growth monitoring sessions. Intervention group will participate in the programme conducted by the researcher for three months whereas the comparison group will only receive all the education materials and menus used in the programme after data collection has been completed. For both groups, data including height, weight, and dietary intake of children as well as the nutritional knowledge and food security status of mothers will be collected at baseline, immediate post-intervention, and 3-month post-intervention. Expected results The positive deviance approach helps to recognise the common feeding practices and the local wisdom unique to the urban poor population. Through this programme, mothers may learn from and be empowered by their peers to adopt new feeding behaviours so that their children can achieve healthy weight gain.
... Moreover, in Nepal, DDS varied among different age groups by different ecological regions and rurality [36]; this finding cannot be generalized to other areas. In the context of PD studies, generalizability could sometimes be a limitation; however, PD interventions are a problem-solving approach for a particular community [52]. ...
Full-text available
Background School-based interventions have been implemented in resource-limited settings to promote healthy dietary habits, but their sustainability remains a challenge. This study identified positive deviants (PDs) and negative deviants (NDs) from the control and treatment groups in a nutrition-sensitive agricultural intervention in Nepal to identify factors associated with healthy dietary practices. Methods This is an explanatory mixed methods study. Quantitative data come from the endline survey of a cluster randomized controlled trial of a school and home garden intervention in Nepal. Data were analyzed from 332 and 317 schoolchildren (grades 4 and 5) in the control and treatment group, respectively. From the control group, PDs were identified as schoolchildren with a minimum dietary diversity score (DDS) ≥ 4 and coming from low wealth index households. From the treatment group, NDs were identified as schoolchildren with a DDS < 4 and coming from high wealth index households. Logistic regression analyses were conducted to identify factors associated with PDs and NDs. Qualitative data were collected through in-depth phone interviews with nine pairs of parents and schoolchildren in each PD and ND group. Qualitative data were analyzed thematically and integrated with quantitative data in the analysis. Results Twenty-three schoolchildren were identified as PDs, and 73 schoolchildren as NDs. Schoolchildren eating more frequently a day (AOR = 2.25; 95% CI:1.07–5.68) and whose parents had a higher agricultural knowledge level (AOR = 1.62; 95% CI:1.11–2.34) were more likely to be PDs. On the other hand, schoolchildren who consumed diverse types of vegetables (AOR = 0.56; 95% CI: 0.38–0.81), whose parents had higher vegetable preference (AOR = 0.72; 95% CI: 0.53–0.97) and bought food more often (AOR = 0.71; 95% CI: 0.56–0.88) were less likely to be NDs. Yet, schoolchildren from households with a grandmother (AOR = 1.98; 95% CI: 1.03–3.81) were more likely to be NDs. 
Integrated results identified four themes that influenced schoolchildren’s DDS: the availability of diverse food, the involvement of children in meal preparation, parental procedural knowledge, and the grandmother’s presence. Conclusion Healthy dietary habit can be promoted among schoolchildren in Nepal by encouraging parents to involve their children in meal preparation and increasing the awareness of family members.
... Furthermore, organisational culture involving efficient big data people (BDP) and big data systems (BDS) also can strategically make decisions through its ability to "detect, anticipate and respond strategically in ambiguous and uncertain business environments" (Rijmenam et al., 2019, p.1). However, despite the potential benefit, studies have raised various issues related to BDA, especially in developing countries, ranging from technological and data complexities to human and organisational dynamics (Alalawneh & Alkhatib, 2020;Albanna & Heeks, 2019). One such issue is that BDA requires the organisation to manoeuvre through data, computational, and system complexities to gain the benefits of big data (Halford & Savage, 2017). ...
Full-text available
Considering the rise of implementation of big data analytics (BDA) in Saudi Arabian higher education institutions but with relatively lesser optimal performance, the study investigated the causality of organisational culture (OC) and BDA's social and technical subsystems, following the Socio- Technical Systems theory, with the strategic decision-making in Saudi Arabian higher education institutions. The study's objectives are based on the ontological positivist paradigm, and the methodology applies a quantitative cross-sectional survey. The sample population involved the IT staff and data scientists representing the big data people (BDP) and top management as the OC in the Saudi Arabian universities. The data was collected using validated scales of previous studies through an online survey, and the hypotheses were evaluated using PLS-SEM. The PLS-SEM analysis conducted to test the hypotheses highlighted the insignificance of organisational culture in big data systems (BDS), although having a positive value. Nonetheless, the organisational culture significantly impacted BDP, implying the influence of a data-driven culture and supportive top management on the workforce's attitude towards BDA-related change and skill development. Besides, the social and technical subsystems of the BDA— the BDS and BDP— are significantly correlated, along with their correlation with strategic decision-making. The study's implications comprised insights guiding the managers and policymakers to acknowledge the importance of organisational culture (hierarchical, adhocratic, market, and clan) while strategising the implementation of BDA and its systems and developing training modules for its BDP accordingly.
... This was on the assumption that endogenously developed practices, although atypical, would be feasible and culturally acceptable, having been developed indigenously and not extraneously in the locality. Since then, positive deviant behavior has attracted research attention and application in public health, agriculture, and even in smallholder livestock systems [10][11][12][13]. ...
Full-text available
This study characterized breeding, housing, feeding and health management practices in positive deviants and typical average performing smallholder dairy farms in Tanzania. The objective was to distinguish management practices that positive deviant farms deploy differently from typical farms to ameliorate local prevalent environmental stresses. In a sample of 794 farms, positive deviants were classified on criteria of consistently outperforming typical farms (p < 0.05) in five production performance indicators: energy balance ≥ 0.35 Mcal NEL/d; disease-incidence density ≤ 12.75 per 100 animal-years at risk; daily milk yield ≥ 6.32 L/cow/day; age at first calving ≤ 1153.28 days; and calving interval ≤ 633.68 days. The study was a two-factor nested research design, with farms nested within the production environment, classified into low- and high-stress. Compared to typical farms, positive deviant farms had larger landholdings, as well as larger herds comprising more high-grade cattle housed in better quality zero-grazing stall units with larger floor spacing per animal. Positive deviants spent more on purchased fodder and water, and sourced professional veterinary services (p < 0.001) more frequently. These results show that management practices distinguishing positive deviants from typical farms were cattle upgrading, provision of larger animal floor spacing and investing more in cattle housing, fodder, watering, and professional veterinary services. These distinguishing practices can be associated with amelioration of feed scarcity, heat load stresses, and disease infections, as well as better animal welfare in positive deviant farms. Nutritional quality of the diet was not analyzed, for which research is recommended to ascertain whether the investments made by positive deviants are in quality of feeds.
... Its genesis has been traced to the 1940s and is credited to Merton (1938) and Sellin (1938), scholars of the Chicago School of Sociology. They defined deviance as a topic of analysis of sociocriminogenesis, a field that encompasses research in criminology, psychiatry, psychology, and sociology [20][21][22][23][24]. Despite the many works devoted to the deviations problem, this issue still needs to be more studied among the students inclined toward deviant behavior. ...
Objectives: Deviant behavior has become a global issue of great concern and requires immediate attention. This study aimed to investigate the course of life concept development in students inclined to deviant behavior. Methods: This experimental and empirical study was performed by a structural correlation as a quasi-experimental design in 2019-2020. The study setting was the Belgorod National State Research University, and the target population was the students aged 18 to 21 who tended toward deviant behavior. The samples were selected based on real groups’ involvement and polar groups’ isolation and comparison. The variables were the correction program aimed at developing personal notions about the life path in students and the qualities that make up the content of the “temporary” and “value-semantic” aspects of the subjects’ notions about the life path. Data on deviant behavior were collected using the questionnaires of a tendency to deviant behavior and the deviant behavior questionnaire of Robinson and Bennett. The structural equation modeling, partial least squares method, and SmartPLS software were used to validate the original model and test the hypotheses. Results: Only 27% of students tended towards deviant behavior (group 1), and 73% were normal students (group 2). The students in group 1 had a higher tendency to nonconformism (P≤0.01), moderate inclination to addictive behavior (P≤0.01), and more aggressive tendencies (P≤0.01) compared to the students in group 2. Also, in the students in group 1, “present” and “past” times were described as joyful, light, real, close, calm, voluminous, bright, and active, but “future” time as passive, motionless, empty, little, flat, petty and narrow. In terms of value-semantic measurement of students’ life concepts, the students prone to deviant behavior did not have meaningful purposes in the future that give life meaningfulness and direction. 
Discussion: Based on the study findings, a higher tendency for nonconformism, addictive behavior, and aggressive tendencies was found in the students with a tendency to deviant behavior. Also, these students lack meaningful purposes in the future that give life meaning and direction, and they live for today or yesterday. It is suggested that the correctional and development work under a program aimed at the personal course of life concepts development reduce the students’ inclination to deviant behavior.
... Here, positive deviants refer to the cases achieving superior performance compared to others within a community despite sharing similar environment, resources, and barriers. The positive deviance approach was initiated and widely applied in the field of human nutrition and public health (Albanna and Heeks, 2019), for example, to propose dietary interventions against child malnutrition through studying positive deviant households with healthier children under given resource-poor circumstances. This approach was also extended to the agricultural domain to look into positive deviant farms and the practices driving them (Savikurki, 2013;Dumont et al., 2020). ...
Full-text available
CONTEXT Sustainable cropping systems need to balance productivity and profitability with resource and environmental conservation. Within a population of cropping system observations, there might be positive deviants that outperform others in terms of sustainability, which could serve as “model systems” for future development. Wheat-maize double cropping is the dominant system in the North China Plain, which is facing multiple economic, societal, and environmental sustainability challenges. Identifying exemplary positive deviants out of a multitude of wheat-maize observations might provide solutions to enhance overall sustainability. OBJECTIVES We aimed to 1) identify exemplary wheat-maize systems that reached optimal performance across seven sustainability indicators, 2) determine which factors regarding management practices and farming contexts resulted in the sustainability gaps between exemplary and other systems, and 3) propose a sustainable wheat-maize prototype. METHODS Based on a farmer survey dataset (n = 344), we developed a cropping system-level positive deviance approach, including multi-criteria assessment, positive deviant identification (Pareto ranking) and positive deviant clustering, to identify exemplary wheat-maize systems. We then compared exemplary and other systems to quantify the sustainability gaps and identify the key variables explaining sustainability gaps. RESULTS AND CONCLUSIONS Sixteen percent of wheat-maize cases were Pareto-optimal and were classified as positive deviants. These were sorted into seven clusters representing contrasting sustainability patterns. Among these clusters, one comprised exemplary systems due to the best compromise over the indicator set. Compared to remaining wheat-maize cases, exemplary systems, on average, resulted in 49% and 17% higher gross margin and dietary energy output, respectively, and 33–51% lower labor use, groundwater depletion, N loss, net greenhouse gas emission, and pesticide use. 
Key practices conferring exemplary system performance included higher maize seeding density, lower fertilizer N input in wheat, partial substitution of inorganic fertilizer with manure, a smaller number of irrigation events, and a lower frequency of pesticide and herbicide application. No significant difference in farming context was found between exemplary and other systems. SIGNIFICANCE Since the practices of exemplary systems were already locally adopted and proven, we expect that farmers in the region can increase the sustainability of their wheat-maize production by adjusting their management to resemble the exemplary systems. The positive deviance approach thus provides a pragmatic bottom-up approach to identify practices that can improve the sustainability of cropping systems, and can be used for other cropping systems elsewhere.
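The Pareto-ranking step named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two indicators, and the five data points are hypothetical, and every indicator is assumed to be oriented so that higher is better (so "lower is better" indicators such as N loss are negated).

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if system a is at least as good as b on every indicator
    and strictly better on at least one (higher is better throughout)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_optimal(points: List[Sequence[float]]) -> List[int]:
    """Return indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical data, NOT from the study: (gross margin, negated N loss).
systems = [
    (10.0, -5.0),
    (8.0, -3.0),
    (7.0, -6.0),  # dominated by the first system on both indicators
    (9.0, -4.0),
    (6.0, -2.0),
]

positive_deviants = pareto_optimal(systems)
print(positive_deviants)
```

In this toy example the third system is excluded because the first system is better on both indicators; the remaining four each represent a different trade-off, which is why a study like the one above might add a clustering step to separate contrasting patterns among the Pareto-optimal cases.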
This research protocol explains an innovative approach to farmer-to-farmer scaling based on positive deviance.
This article discusses the potential of the positive deviance (PD) approach within multi-actor partnerships (MAPs) in support of the Sustainable Development Goals (SDGs). MAPs are understood as collaborative partnerships between different stakeholder groups, such as civil society, government, the private sector, and academia. The project management of MAPs often follows the standard Project Cycle Management approach, with an emphasis on the Logical Framework Approach, which follows the paradigm of “from knowledge to action”. The article discusses the PD approach as a possible alternative (or complement) to conventional project management. The PD approach requires a shift in thinking towards “from action to knowledge”. The article focuses on the process model of the PD approach and shows conceptually which (a) paradigm shifts, (b) process sequences, and (c) changed role allocations among the actors are necessary to implement PD projects within MAPs.
This paper presents exploratory research into “data-intensive development” that seeks to inductively identify issues and conceptual frameworks of relevance to big data in developing countries. It presents a case study of big data innovations in “Stelcorp”, a state electricity corporation in India. In an attempt to address losses in electricity distribution, Stelcorp has introduced new digital meters throughout the distribution network to capture big data, and organisation-wide information systems that store, process, and disseminate big data. Emergent issues are identified across three domains: implementation, value and outcome. Implementation of big data has worked relatively well but technical and human challenges remain. The advent of big data has enabled some, albeit constrained, value addition in all areas of organisational operation: customer billing, fault and loss detection, performance measurement, and planning. Yet US$ tens of millions of investment in big data has brought no aggregate improvement in distribution losses or revenue collection. This can be explained by the wider outcome, with big data faltering in the face of external politics; in this case the electoral politics of electrification. Alongside this reproduction of power, the paper also reflects on the way in which big data has enabled shifts in the locus of power: from public to private sector; from labour to management; and from lower to higher levels of management. A number of conceptual frameworks emerge as having analytical power in studying big data and global development. The information value chain model helps track both implementation and value-creation of big data projects. The design-reality gap model can be used to analyse the nature and extent of barriers facing big data projects in developing countries. And models of power (resource dependency, epistemic models, and wider frameworks) are all shown as helping understand the politics of big data.
The positive deviance (PD) approach offers an alternative to needs-based approaches for development. The “traditional” application of the PD approach for childhood malnutrition involves studying children who grow well despite adversity, identifying uncommon, model practices among PD families, and designing an intervention to transfer these behaviors to the mothers of malnourished children. A common intervention for child malnutrition, the so-called “hearth,” brings mothers together to practice new feeding and caring behaviors under the encouragement of a village volunteer. Hearths probably work because they modify unmeasured behavioral determinants and unmonitored behaviors, which, in turn, result in better child growth. Some health outcomes require a better understanding of behavioral determinants and are not best served by hearth-like facilitated group skills-building. We propose testing “booster PD inquiries” during implementation to confirm behavioral determinants and efficiently focus interventions. We share early experience with the PD approach for HIV/AIDS and food security. The attributable benefit of the PD approach within a program has not been quantified, but we suspect that it is a catalyst that accelerates change through the processes of community attention getting, awareness raising, problem-solving, motivating for behavior change, advocacy, and actually adopting new behaviors. Program-learners should consider identifying and explicitly attempting to modify the determinants of critical behavior(s), even if the desired outcome is a change in health status that depends on multiple behaviors; measure and maintain program quality, especially at scale; and creatively expand and test additional roles for PD within a given program.
In Baru Village, Sarolangun, Jambi, there are poor families with incomes below the minimum wage (IDR 1,900,000/month). Nevertheless, the majority of toddlers in the village have relatively good nutritional status. The purpose of this study was to examine the relationship between positive deviance in feeding and caring practices and the nutritional status of toddlers in poor families. The study used a cross-sectional design and was conducted from April to August 2016 in Baru Village, Sarolangun, Jambi. Total sampling was used; the sample comprised 84 children under five from poor families. Nutritional status was determined using the weight-for-age indicator, with reference to the Kemenkes RI standard. Positive deviance data were collected using a questionnaire. The data were analyzed using univariate and bivariate (chi-square) tests. The results showed that infant feeding practices (91.7%), toddler care (85.7%), and toddler nutritional status (90.5%) were categorized as good. In conclusion, there is a relationship between positive deviance in feeding and caring practices and the nutritional status of toddlers in poor families (p < 0.05).
A Positive Deviance (PD) Hearth intervention is a home- and neighborhood-based nutrition program for children who are at risk of protein-energy malnutrition in a low-resource community. The intervention uses the ‘Positive Deviance’ approach to identify the behaviors practiced by the mothers or caretakers of well-nourished children from poor families and transfers these positive practices to other mothers who are equally disadvantaged economically. The PD Hearth intervention is designed to treat malnourished children, enable families to sustain their rehabilitation at home on their own, and prevent malnutrition in younger siblings. However, the PD Hearth monitoring system in Migori assesses only the program’s ability to treat, which is one of the three PD Hearth objectives. Thus, there was a need for an impact evaluation to measure how far the PD Hearth intervention sustains rehabilitation and prevents malnutrition in younger siblings. The objectives of the study were to determine the level to which PD Hearth enables families to sustain rehabilitation at home on their own and to identify the practices that influence PD Hearth outcomes. The study used a pipeline quasi-experimental design, and mixed methods were used to collect data and perform statistical analyses. Single-stage cluster sampling was used to identify 53 children in the intervention group and 54 in the comparison group across five communities. Weight measurements of the intervention children, aged 6 to 59 months, at the entry, exit and graduation stages were retrieved from Kenya Medical Research Institute Family AIDS Care and Education Services programme activity reports. Anthropometry (height measurements) was taken for both the intervention and comparison children. Caregivers filled in a questionnaire, assisted by the researchers as necessary. At entry, 18.9% of the children in the intervention group were moderately underweight, while 43.4% were mildly underweight. 
At current status, however, 3.8% and 34.0% were moderately and mildly underweight, respectively. The regression model predicted that the weight-for-age Z-score (WAZ) of the intervention children at current status lay at the 51.5th percentile, and thus within the normal range for underweight. Increased feeding frequency made a larger contribution to weight gain than other caregiver practices. Therefore, the Migori County government, in collaboration with the Ministry of Health, needs to scale up the PD Hearth intervention to reverse cases of Moderate Acute Malnutrition (MAM) and prevent Severe Acute Malnutrition (SAM) in the County.
Farmer training center (FTC)-based farmer training is an emerging extension strategy geared towards human capital development through need-based, hands-on practical training in order to facilitate agricultural transformation and rural livelihood improvement. Although FTCs have been established and made functional in the Tigray National Regional State and Alamata Woreda, there has been no systematic assessment of positively deviant FTCs. This research was initiated to fill that gap. Specifically, the research attempted to address this important question: are there FTCs with successful experience suitable for scaling out and up? Respondents included 14 DAs and 20 woreda experts, interviewed by means of a semi-structured interview schedule. Qualitative methods used at community, organizational and individual levels included document review, focus group discussion, personal interviews and direct observation. The quantitative data were analyzed using descriptive statistics. Based on indicators of positive deviance such as departure from norms, intentional behavior, and honorable outcomes (for example technology dissemination, exemplary demonstration field management, and diversified and substantial training outreach), Selambkalsi FTC was found to be positively deviant among the four sampled FTCs. In this research context, a positively deviant FTC is one that performed better than the other FTCs despite similar problems and a similar resource base. Therefore, it is recommended that policy aimed at FTC-based training in the area take the results of this study into consideration, and that experience-sharing mechanisms be established among FTCs so that successful results are cross-fertilized throughout the study area and lessons are developed and institutionalized.
Save the Children's (SC) successful integrated nutrition program in Viet Nam, the poverty alleviation and nutrition program (PANP), uses the positive deviance (PD) approach to identify key growth-promoting behaviors and provides participatory adult education allowing mothers to develop skills related to these behaviors. We investigated whether improvements seen during a PANP intervention (1993–1995) were sustained three and four years after SC's departure. Cross-sectional surveys were administered to 46 randomly selected households in four communes that had previously participated in the PANP and 25 households in a neighboring comparison community in 1998 and 1999. Two children per household, an older child who had participated in the PANP and a younger sibling who had not, were measured (total n = 142 children), and their mothers were interviewed. Older SC children tended to be better nourished than their counterparts. Their younger siblings were significantly better nourished than those in the comparison group, with adjusted mean weight-for-age Z scores of −1.82 versus −2.45 (p = .007), weight-for-height Z scores of −0.71 versus −1.45 (p < .001), and height-for-age Z scores of −2.11 versus −2.37 (ns, p = .4), respectively. SC mothers reported feeding the younger siblings more than their counterparts did (2.9 versus 2.2 main meals per day, p < .001, and 96.2% versus 52% offering snacks, p < .01). SC mothers reported washing their hands “often” more than comparison mothers (100% vs. 76%, p < .001). Growth-promoting behaviors identified through PD studies and practiced through neighborhood-based rehabilitation sessions persisted years after program completion. These sustained behaviors contributed to better growth of younger siblings never exposed to the program.
We compared the positive deviance (PD) approach in Save the Children's field guide with a case-control study (CCS) to identify behaviors associated with good nutritional status in Afghan refugee children 6 to 24 months of age in the Northwest Frontier Province (NWFP), Pakistan. The positive deviance inquiry (PDI), utilizing observations and interviews with mothers, fathers, and secondary caregivers in eight households, identified 12 feeding, caring, and health-seeking behaviors that were not widely practiced. The CCS, using the same selection criteria and content as the PDI with 50 mother-child pairs not in the PDI, yielded six significant associations with good nutritional status. Both the PDI and CCS detected feeding behaviors. The PDI alone identified complex phenomena (active feeding and maternal affect). The CCS alone confirmed the beneficial use of health services. The PD approach was an affordable, participatory, and valid method to identify feeding behaviors and other factors associated with good nutrition in this context.
This paper details the steps to design and implement a positive deviance-informed, “Hearth” approach for the nutritional rehabilitation of malnourished children in the district of Leogane, Haiti. Groups of four to five children met daily for two weeks at the home of a local volunteer mother for nutritional and health messages and a well-balanced meal. Health messages and meal components were determined using information gathered from interviews with the mothers of positive deviant children in the community who are well nourished despite their family's limited economic resources. Hearth participants were then followed for six months in their own home by the program “monitrices,” women hired from each village and intensively trained to supervise the Hearth program, periodically weigh the children to evaluate their progress, and liaise between the hospital and the community. Monitoring from the first cycle indicated that 100% of children in eight villages and 66% of children in the remaining five villages continued to gain weight as fast or faster than the international standard median six months after participating in a Hearth program. At the conclusion of this cycle, programmers interviewed participant and non-participant families and made six modifications to the model, including the addition of a microcredit option for participating mothers.