CONTRIBUTED RESEARCH ARTICLE 1
SetMethods: an Add-on R Package for Advanced QCA
by Ioana-Elena Oana and Carsten Q. Schneider
Abstract
This article presents the functionalities of the R package SetMethods, aimed at performing advanced set-theoretic analyses. This includes functions for performing set-theoretic multi-method research, set-theoretic theory evaluation, Enhanced Standard Analysis, diagnosing the impact of temporal, spatial, or substantive clusterings of the data on the results obtained via Qualitative Comparative Analysis (QCA), indirect calibration, and visualising QCA results via XY plots or radar charts. Each functionality is presented in turn, the conceptual idea and the logic behind the procedure being first summarized, and afterwards illustrated with data from Schneider et al. (2010).
Introduction
Set-theoretic methods in general (Goertz and Mahoney, 2012), and Qualitative Comparative Analysis in particular, are becoming increasingly popular within different disciplines in the social sciences and neighboring fields (Rihoux et al., 2013). Parallel to conceptual developments and increasing numbers of applied studies, software development is progressing at an accelerating pace. While less than a decade ago only two functioning software packages were available to users (fsQCA (Ragin et al., 2006) and Tosmana (Cronqvist, 2011)), there are now over a dozen different software solutions on offer (see http://compasss.org/software.htm). Many of them are developed within the R software environment, with the R package QCA (Dusa, 2007) being not only the one with the longest history, but also the most complete and complex.
In this paper, we discuss the different functionalities of the R package SetMethods (Medzihorsky et al., 2016). It is best thought of as an add-on tool to package QCA that allows applied researchers to perform advanced set-theoretic analyses. More precisely, SetMethods enables researchers to perform Set-Theoretic Multi-Method Research, the Enhanced Standard Analysis (ESA), and Set-Analytic Theory Evaluation, to run diagnostics in the presence of clustered data structures, and to display their results in various ways.
We proceed as follows. Each of the different functionalities within SetMethods is presented in a separate section. Within each section, we first briefly summarize the conceptual idea behind the analysis in question, then describe the computational logic of the function for performing the analysis, and finally demonstrate the use of the function by displaying the R syntax and selected output for an example from published research.
Even though the main purpose is to present the functionality of the R package SetMethods, this article is also useful for researchers who perform their QCA in software environments other than R, because we present the logic of several of the main advanced set-analytic procedures in a concise and transparent manner.
The empirical example
In order to illustrate the use of the different functions in SetMethods, we use the empirical example by Schneider et al. (2010), which uses fuzzy sets for explaining capitalist variety and export performance in high-tech industries. More precisely, the research question focuses on the institutional determinants of export performance in high-tech industries. The outcome is export performance in high-tech industries (EXPORT). The conditions used are: employment protection (EMP), collective bargaining (BARGAIN), university training (UNI), occupational training (OCCUP), stock market size (STOCK), and mergers and acquisitions (MA). The authors analyze 76 cases, representing 19 countries at four time points.
For the sake of simplicity, we use the same data for illustrating all the functions. Our goal is not, of
course, to contribute to the substantive discussion on varieties of capitalism or institutional context.
This is why we will take the liberty to alter the analytic setup if needed for demonstration purposes,
by, for instance, dropping cases or conditions or by changing the outcome to be explained.
The R Journal Vol. XX/YY, AAAA 20ZZ ISSN 2073-4859
# We load the SetMethods package:
library(SetMethods)
# First rows of the Schneider et al.(2010) data called SCHF from package
# SetMethods:
data(SCHF)
head(SCHF)
## EMP BARGAIN UNI OCCUP STOCK MA EXPORT
## Australia_90 0.07 0.90 1.00 0.68 0.45 0.33 0.19
## Austria_90 0.70 0.98 0.01 0.91 0.01 0.05 0.25
## Belgium_90 0.94 0.95 0.14 0.37 0.26 0.14 0.14
## Canada_90 0.04 0.21 0.99 0.11 0.62 0.31 0.28
## Denmark_90 0.59 0.78 0.10 0.55 0.53 0.10 0.34
## Finland_90 0.70 0.97 0.20 0.95 0.02 0.13 0.17
Set-theoretic multi-method research (MMR)
The term Set-Theoretic Multi-Method Research (MMR) captures all those empirical approaches for combining cross-case analyses with within-case studies in which both levels of analysis follow the goal of investigating sets and their relations. At the cross-case level the most common methodological tool is Qualitative Comparative Analysis (QCA) and, at the within-case level, process tracing. Both tools can be rooted in (fuzzy) set theory (see, e.g., Ragin (2008) for QCA and Mikkelsen (2017) for process tracing). In principle, any sequence of analyses can be performed, and in practice the applied literature does so. In the following, we focus on the sequence 'cross-case QCA first, followed by within-case analyses'. This sequence may or may not be continued by another QCA. Like any decent QCA, it is certainly preceded by a thorough accumulation of case knowledge in order to select and calibrate conditions. These crucial research steps, however, fall outside of the definitional scope of set-theoretic MMR.[1]
In the following, we limit our discussion to set-theoretic MMR after a cross-case analysis of sufficiency, as discussed in Schneider and Rohlfing (2013, 2016), Schneider and Rohlfing (manuscript), and Rohlfing and Schneider (2018).[2] We first briefly summarize the different types of cases and the purpose of their within-case analysis (Section 2.3.1). After this, we discuss the four different feasible comparative within-case analyses in Figure 1 and their analytic purposes (Section 2.3.2). For each form of within-case analysis, we explain the use of the mmr() function and the formula used for finding cases (Schneider and Rohlfing, manuscript).[3] The general structure of function mmr() is illustrated in Figure 1. Users need to specify whether they want to perform single or comparative within-case analysis and then on which cases the analysis is performed.
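As a reading aid, the two choices just described (single versus comparative analysis, and the type of case) can be written down as a small lookup table. This is only a sketch: the argument values are transcribed from Figure 1, while the data frame mmr_options and the helper mmr_args() are hypothetical names introduced here for illustration and are not part of SetMethods.

```r
# Lookup table transcribed from Figure 1 (argument values as printed there).
mmr_options <- data.frame(
  analysis = c("typical", "deviant consistency", "deviant coverage", "iir",
               "typical-iir", "typical-typical",
               "typical-deviant consistency", "deviant coverage-iir"),
  match = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  cases = c(2, 3, 4, 5, 2, 1, 3, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the match/cases pair for a named analysis.
mmr_args <- function(analysis) {
  row <- mmr_options[mmr_options$analysis == analysis, ]
  if (nrow(row) != 1) stop("unknown analysis type: ", analysis)
  list(match = row$match, cases = row$cases)
}

mmr_args("typical-iir")
```

The returned pair would then be passed on as the match and cases arguments of mmr().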
Identifying types of cases
Key for combining QCA with process tracing is the sorting of cases into different case types based on the QCA solution formula. The literature identifies five different types (Schneider and Rohlfing, 2013). Membership in a type is defined by the membership scores of a case in the outcome Y, on the one hand, and the sufficient term T or the solution formula S, on the other hand. Table 1 summarizes the definition of each case type and the analytic purpose of the within-case analysis in single cases. Figure 2 visualizes the location of each case type in an XY plot.
Typical cases and deviant cases consistency are defined based on their membership in a sufficient term T, whereas deviant cases coverage and IIR cases are defined based on their membership in the solution formula S. Deviant cases consistency are subdivided into deviant in degree and deviant in kind. The latter are always preferable for within-case analysis. IIR cases are not useful for single-case studies, but they play an important role for comparative within-case analyses (see Section 2.3.2).
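The sorting of cases into types can be sketched in a few lines of base R. The helper classify_case() is hypothetical and not part of SetMethods. It assumes that typical cases lie on or above the diagonal (membership in T not exceeding membership in Y), in line with the statement that deviant cases consistency have S > Y, and it returns NA at the 0.5 point of maximum ambiguity.

```r
# Hypothetical helper: classify a case from its membership in the sufficient
# term T (or the solution formula S) and in the outcome Y.
# Assumption: typical cases satisfy x <= y, deviant-in-degree cases x > y.
classify_case <- function(x, y) {
  if (x == 0.5 || y == 0.5) return(NA_character_)  # maximum ambiguity
  if (x > 0.5 && y > 0.5) {
    if (x <= y) "typical" else "deviant consistency (degree)"
  } else if (x > 0.5 && y < 0.5) {
    "deviant consistency (kind)"
  } else if (x < 0.5 && y > 0.5) {
    "deviant coverage"
  } else {
    "IIR"
  }
}

# Membership values copied from the mmr() outputs in the following sections.
classify_case(0.70, 0.99)  # Switzerland_03, term membership vs EXPORT
classify_case(0.82, 0.31)  # Switzerland_90, term emp*OCCUP vs EXPORT
classify_case(0.12, 0.98)  # UK_03, solution membership vs EXPORT
classify_case(0.17, 0.06)  # New Zealand_90, solution membership vs EXPORT
```

The four calls return the types under which these cases are listed in the respective mmr() outputs.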
Table 1 is adapted from Schneider and Rohlfing (manuscript).
[1] For a systematic discussion of the pre-QCA case studies, see Rihoux and Lobe (2009).
[2] For MMR after an analysis of necessity, see Rohlfing and Schneider (2013).
[3] For a systematic test of the mathematical formulas used for selecting single cases or pairs of cases for set-theoretic MMR, see the Appendix of this paper.
Figure 1: Types of Post-QCA Case Studies
Post-QCA MMR
  Single Case (match=FALSE): Typical (cases=2); Deviant Consistency (cases=3); Deviant Coverage (cases=4); IIR (cases=5)
  Comparative (match=TRUE): Typical-IIR (cases=2); Typical-Typical (cases=1); Typical-Deviant Consistency (cases=3); Deviant Coverage-IIR (cases=4)
Table 1: Types of cases in fsQCA of sufficiency
Type of case                     | Membership in T | Membership in Y | Relation | Goal of within-case analysis
(1) Typical                      | > .5            | > .5            | T ≤ Y    | identify mechanism M
(2) Deviant consistency (degree) | > .5            | > .5            | T > Y    | not recommended
(3) Deviant consistency (kind)   | > .5            | < .5            |          | identify missing INUS
Type of case                     | Membership in S | Membership in Y |          | Goal of within-case analysis
(4) Deviant coverage             | < .5            | > .5            |          | identify missing conjunction
(5) IIR                          | < .5            | < .5            |          | not useful
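Figure 2: Types of Cases, XY Plot. Left panel: Sufficient Term T (x-axis) against Outcome Y (y-axis), locating case types (1), (2), and (3). Right panel: Solution Formula S (x-axis) against Outcome Y (y-axis), locating case types (4) and (5). Both axes are split at 0.5.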
Typical cases
Process tracing in typical cases aims at empirically probing the causal mechanism(s) linking the sufficient term S to outcome Y. For conjunction S to be causal, each conjunct C of S must be causal, i.e. it must make a difference to outcome Y by making a difference to mechanism M. This requires as many within-case analyses of typical cases as there are conjuncts in the sufficient conjunction. For each analysis, one conjunct is the focal conjunct FC and the others are the complementary conjuncts CC. The focal conjunct FC is the conjunct for which we want to find out whether it makes a difference for the mechanism M, while the complementary conjuncts CC represent the other conjuncts of the sufficient term S (Schneider and Rohlfing, manuscript). For causal inference on the configuration, we proceed by taking each conjunct in turn as the focal conjunct FC. Additionally, we also apply the test severity principle. With fuzzy sets, the membership in mechanism M can only vary within the corridor established by the membership in FC (the lowest value M can take) and Y (the highest value M can take) in order to preserve the causal chain $FC \rightarrow M \rightarrow Y$ (Schneider and Rohlfing, manuscript). The smaller the corridor, the smaller the range of membership values M can take. Therefore, the most severe test for M is the one in which FC = S = Y, because the only consistent membership score in M then equals FC = S = Y.
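The corridor logic can be made concrete with a short base R sketch. The helper mechanism_corridor() is hypothetical and not part of SetMethods; it simply reports the lower bound (FC), the upper bound (Y), and the width of the range that M can take. The second call uses made-up values to illustrate the limiting case FC = Y.

```r
# Hypothetical helper: range of membership scores in M that is consistent
# with the chain FC -> M -> Y, i.e. FC <= M <= Y.
mechanism_corridor <- function(fc, y) {
  stopifnot(fc <= y)  # only defined for cases on or above the diagonal
  c(lower = fc, upper = y, width = y - fc)
}

# Switzerland_03 with focal conjunct emp (values from the mmr() output below).
mechanism_corridor(fc = 0.70, y = 0.99)

# Illustrative values: FC = Y leaves a single admissible value for M.
mechanism_corridor(fc = 0.8, y = 0.8)
```

The narrower the reported width, the more severe the test of the mechanism.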
The best-available typical case fulfills the following criteria: a) the focal conjunct is the one that defines the membership of the typical case in the term ($FC \le CC$); b) the corridor for mechanism M as defined by the sufficient term S (from a) we also have S = FC) and Y is small; c) membership in the sufficient term S is high; d) the case is uniquely covered by the sufficient term S.
Figure 3 visualizes the test severity principle in two different ways. The XY plot in the upper panel shows that for cases closer to the diagonal, test severity increases. The length of the vertical and horizontal arrows, respectively, visualizes the range of fuzzy set membership scores for M that would still be consistent. The larger this range, the less severe the test. The Euler diagram in the lower panel visualizes the same by contrasting S1, which is almost as big as Y, with S2, which is much smaller than Y. The former leaves little and the latter a lot of room for M.
The ideal typical case is located in the upper-right corner of the XY plot in Figure 3 with FC = S = Y = 1. In applied QCA, such cases usually do not exist in the data at hand. Function mmr() identifies the best available typical case in a given data set.
Function mmr() first sorts each typical case based on whether $FC \le CC$ (rank 1) or $FC > CC$ (rank 2). Cases in each rank are then further sorted according to Formula 1. Smaller values indicate more suitable cases.[4]
$$TYP = \underbrace{(Y - S)}_{\text{small corridor for mechanism}} + \underbrace{(1 - S)}_{\text{large membership in the sufficient term}} \qquad (1)$$
where Y = outcome, S = sufficient term
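The rank sorting and Formula 1 can be traced by hand in base R. The sketch below does not call SetMethods; it uses the membership values of two Rank 1 cases as printed in the mmr() output further below (Switzerland_03 for focal conjunct emp, Switzerland_99 for focal conjunct bargain). The data frame typ and its column names are chosen here for illustration only.

```r
# Rank 1 typical cases taken from the mmr() output below.
typ <- data.frame(
  case  = c("Switzerland_03", "Switzerland_99"),
  focal = c("emp", "bargain"),
  FC    = c(0.70, 0.54),   # membership in the focal conjunct
  CC    = c(0.71, 0.74),   # membership in the complementary conjuncts
  Y     = c(0.99, 0.98),   # membership in EXPORT
  stringsAsFactors = FALSE
)

# Term membership is the minimum across conjuncts (fuzzy-set intersection).
typ$S <- pmin(typ$FC, typ$CC)

# Rank 1 if the focal conjunct defines term membership, rank 2 otherwise.
typ$rank <- ifelse(typ$FC <= typ$CC, 1, 2)

# Formula 1: corridor for the mechanism plus distance from full membership.
typ$TYP <- round((typ$Y - typ$S) + (1 - typ$S), 2)

typ
```

For both cases the computed term membership, rank, and formula value agree with the Term Membership, Rank, and St columns of the printed output.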
Applied to our example, function mmr() works as follows. After minimizing the truth table TT_y and producing the parsimonious and intermediate solutions sol_yp and sol_yi using package QCA, we input these solutions[5] into the mmr() function while setting arguments match to FALSE and cases to 2. As argument term is set to 1, the output shows the typical cases for each focal conjunct in the first sufficient term, together with some additional information. The output comprises the membership values of the typical cases in the focal conjunct, the complementary conjuncts, the whole sufficient term, and the outcome (in this case EXPORT); the formula value St; whether the case is the most typical according to the formula; the rank of the case; and whether the case is uniquely covered by the sufficient term. Users should read this output in the following order: whether the case is uniquely covered, the rank of the case (the smaller, the better), and the formula value St (the smaller, the better). For example, for focal conjunct emp in sufficient term emp*bargain*OCCUP, Switzerland_03 appears to be the best available typical case, being uniquely covered, being in Rank 1, and having the smallest formula value (St=0.59).
Figure 3: Two visualizations of test severity. Upper panel: XY plot with Focal Conjunct S on the x-axis and Outcome Y on the y-axis (both from 0 to 1, split at 0.5), locating S1 and S2, with arrows marking increasing severity towards the diagonal. Lower panel: Euler diagrams of Y, M, and S1 (more severe testing) and of Y, M, and S2 (less severe testing).
[4] This holds for all mmr() formulas: the smaller the value, the more suitable the case (pair) is.
[5] For inputting solutions that have model ambiguity, argument sol can be used to specify which solution the user wants to work with. If a single number is used, this number indicates which model of the conservative or parsimonious solution, according to the order in the "qca" object, the user wants to work with. However, since QCA solutions (conservative, parsimonious, intermediate) are in a subset relationship with each other, they tend to have more complicated structures in which model ambiguity is tied from one solution to the other. For these cases, the argument sol allows users to specify the models they want to choose by using a character string of the form "c1p3i2", where c = conservative solution, p = parsimonious solution, and i = intermediate solution.
# We create the truth table:
TT_y <- truthTable(SCHF, outcome = "EXPORT",
                   conditions = c("EMP","BARGAIN","UNI",
                                  "OCCUP","STOCK", "MA"),
                   incl.cut = .9,
                   complete = TRUE,
                   PRI = TRUE,
                   sort.by = c("out", "incl", "n"))
# Get the parsimonious solution:
sol_yp <- minimize(TT_y, include = "?", details = TRUE,
show.cases = TRUE)
# Get the intermediate solution:
sol_yi <- minimize(TT_y, include = "?", details = TRUE,
show.cases = TRUE, dir.exp = c(0,0,0,0,0,0))
# Get typical cases for the first term of the second intermediate solution:
mmr (results = sol_yi, outcome = "EXPORT", neg.out = FALSE,
sol = "c1p1i2", match = FALSE, cases = 2, term = 1)
## Typical Cases - Focal Conjunct emp :
## ----------
## Focal Conjunct Comp. Conjunct Term Membership EXPORT St
## Switzerland_03 0.70 0.71 0.70 0.99 0.59
## Switzerland_99 0.75 0.54 0.54 0.98 0.69
## most_typical Rank uniquely_cov
## Switzerland_03 TRUE 1 TRUE
## Switzerland_99 FALSE 2 TRUE
##
## Typical Cases - Focal Conjunct bargain :
## ----------
## Focal Conjunct Comp. Conjunct Term Membership EXPORT St
## Switzerland_99 0.54 0.74 0.54 0.98 0.90
## Switzerland_03 0.76 0.70 0.70 0.99 0.53
## most_typical Rank uniquely_cov
## Switzerland_99 FALSE 1 TRUE
## Switzerland_03 TRUE 2 TRUE
##
## Typical Cases - Focal Conjunct OCCUP :
## ----------
## Focal Conjunct Comp. Conjunct Term Membership EXPORT St
## Switzerland_03 0.71 0.70 0.70 0.99 0.58
## Switzerland_99 0.74 0.54 0.54 0.98 0.70
## most_typical Rank uniquely_cov
## Switzerland_03 TRUE 2 TRUE
## Switzerland_99 FALSE 2 TRUE
Deviant cases consistency
Deviant cases consistency are puzzling because their membership in the sufficient term S exceeds that in the outcome Y, i.e. S > Y. This becomes even more puzzling if S > 0.5 and Y < 0.5, that is, if we have deviant cases consistency in kind rather than just in degree (see Table 1). The more S exceeds Y, the bigger the empirical puzzle, especially if membership in S is high. Within-case analysis of a deviant case consistency aims at identifying the reasons why mechanism M is either absent or prevented from producing Y. The reason must be an INUS condition omitted from S. Formula 2 identifies the best available deviant case consistency in a data set.
$$DCN = \underbrace{[1 - (S - Y)]}_{\text{far from the diagonal}} + \underbrace{(1 - S)}_{\text{large membership in the sufficient term}} \qquad (2)$$
where Y = outcome, S = sufficient term
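Formula 2 can be checked by hand against the data. The base R sketch below takes the first row of head(SCHF) (Australia_90), computes its membership in the term emp*OCCUP, and applies the formula. It assumes the usual QCA notation in which a lower-case condition name denotes negation (1 minus membership) and a conjunction is the minimum of its conjuncts. The function name dcn() is chosen here for illustration and is not part of SetMethods.

```r
# First row of head(SCHF): Australia_90.
EMP <- 0.07; OCCUP <- 0.68; EXPORT <- 0.19

# Membership in the term emp*OCCUP: lower case denotes negation (1 - x),
# the conjunction is the minimum of its conjuncts.
S <- min(1 - EMP, OCCUP)

# Formula 2.
dcn <- function(S, Y) (1 - (S - Y)) + (1 - S)

round(dcn(S, EXPORT), 2)            # Australia_90
round(dcn(S = 0.82, Y = 0.31), 2)   # Switzerland_90, values from the output below
```

The two results, 0.83 and 0.67, match the Sd values printed for these cases under term emp*OCCUP.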
Using the same data from Schneider et al. (2010) and focusing on the parsimonious solution, function mmr() identifies the deviant consistency cases for each sufficient term. To obtain them, we keep argument match set to FALSE, as we are identifying single cases, but set argument cases to 3, the identifier for deviant cases consistency (see Figure 1). The output shows the deviant consistency cases (first column) grouped by sufficient term (second column), together with term membership, outcome membership, formula value Sd, and whether the case is the most deviant for a particular term. In the output we see that, for example, for term emp*OCCUP the most deviant case consistency is Switzerland_90, with the smallest formula value (Sd=0.67). Figure 4 shows all the deviant cases consistency (cases in the lower right corner) for the first sufficient path emp*OCCUP of the parsimonious solution.
# Get deviant cases consistency for the parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = FALSE, cases = 3)
## Deviant Consistency Cases :
## ----------
## cases term term_membership EXPORT Sd
## 2 Switzerland_90 emp*OCCUP 0.82 0.31 0.67
## 1 Australia_90 emp*OCCUP 0.68 0.19 0.83
## 3 Australia_95 emp*OCCUP 0.68 0.31 0.95
## 4 Australia_99 emp*OCCUP 0.68 0.38 1.02
## 14 Australia_95 BARGAIN*UNI*STOCK 0.90 0.31 0.51
## 7 Australia_03 BARGAIN*UNI*STOCK 0.90 0.35 0.55
## 6 Spain_99 BARGAIN*UNI*STOCK 0.79 0.27 0.69
## 8 Norway_03 BARGAIN*UNI*STOCK 0.79 0.32 0.74
## 41 Australia_99 BARGAIN*UNI*STOCK 0.79 0.38 0.80
## 21 Denmark_95 BARGAIN*UNI*STOCK 0.76 0.40 0.88
## 5 Belgium_99 BARGAIN*UNI*STOCK 0.72 0.40 0.96
## 31 Finland_95 BARGAIN*UNI*STOCK 0.73 0.49 1.03
## 42 Spain_03 occup*STOCK*ma 0.74 0.30 0.82
## 32 Denmark_95 occup*STOCK*ma 0.73 0.40 0.94
## 15 Canada_90 occup*STOCK*ma 0.62 0.28 1.04
## 22 Canada_95 occup*STOCK*ma 0.60 0.30 1.10
## most_deviant
## 2 TRUE
## 1 FALSE
## 3 FALSE
## 4 FALSE
## 14 TRUE
## 7 FALSE
## 6 FALSE
## 8 FALSE
## 41 FALSE
## 21 FALSE
## 5 FALSE
## 31 FALSE
## 42 TRUE
## 32 FALSE
## 15 FALSE
## 22 FALSE
# Plot each sufficient path of the parsimonious solution:
pimplot(data = SCHF, results = sol_yp, outcome = "EXPORT", case_labels = FALSE)
Figure 4: Deviant Cases Consistency. XY plot of membership in emp*OCCUP (x-axis) against EXPORT (y-axis), both axes running from 0 to 1. Cons.Suf: 0.836; Cov.Suf: 0.353; PRI: 0.596; Cons.Suf(H): 0.776
Deviant cases coverage
Deviant cases coverage are puzzling because they are members of the outcome without, however, being members of any known sufficient term. Within-case analysis aims at identifying a sufficient term S+ omitted from the solution formula, which triggers a mechanism M and outcome Y.
Since deviant cases coverage are defined by what they are not - members of the solution formula (see Table 1) - this solution formula is not a good place to start selecting the best available deviant cases coverage. Instead, this type of case is selected based on its membership in its truth table row TT. For each TT with at least one deviant case coverage, a within-case analysis can be performed. If more than one deviant case coverage populates the same TT, Formula 3 identifies the best available case for within-case analysis.
$$DCV = \underbrace{|Y - TT|}_{\text{small corridor for mechanism}} + \underbrace{(1 - TT)}_{\text{large membership in the truth table row}} \qquad (3)$$
where Y = outcome, TT = membership in the truth table row
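As a quick check of Formula 3, the following base R sketch recomputes the formula value for three deviant cases coverage, using the truth table row membership and EXPORT scores printed in the mmr() output below. The function name dcv() and the data frame cov are illustrative and not part of SetMethods.

```r
# Formula 3.
dcv <- function(Y, TT) abs(Y - TT) + (1 - TT)

# Truth table row membership and EXPORT copied from the mmr() output below.
cov <- data.frame(
  case   = c("UK_03", "Germany_99", "UK_90"),
  TT     = c(0.88, 0.71, 0.64),
  EXPORT = c(0.98, 0.67, 0.79),
  stringsAsFactors = FALSE
)
cov$DCV <- round(dcv(cov$EXPORT, cov$TT), 2)
cov
```

The resulting values (0.22, 0.33, and 0.51) coincide with the Sd column of the output. The absolute value matters for Germany_99, whose outcome membership lies below its truth table row membership.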
Similar to Formula 1 for identifying the best available typical case, the goal is to minimize the difference between the membership scores in Y and TT and to prefer higher membership in TT. This is achieved by Formula 3. Since the primary goal in this within-case analysis is not to draw causal inference but to identify a missing conjunction, there is no need to decompose TT into its constituent sets.
Applied to our example, the following code displays the list of deviant cases coverage (notice that argument cases is set to 4), the membership they have in the entire solution formula, their values on Formula 3, the truth table row they belong to (columns starting with TT indicate the specific combination of conditions the case presents), the membership they have in that specific truth table row, and their membership in the outcome. The cases are sorted by truth table row and ranked according to their appropriateness using formula values Sd. For example, we can notice that truth table row emp*bargain*UNI*occup*STOCK*MA (rows 9, 10, 4, and 5 of the output) is populated by 4 deviant coverage cases, out of which UK_90 is the best available for within-case analysis, having the smallest formula value (Sd=0.51).
# Get deviant cases coverage for the parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = FALSE, cases = 4)
## Deviant Coverage Cases :
## ----------
## case solution_membership Sd TT_EMP TT_BARGAIN TT_UNI
## 8 UK_03 0.12 0.22 0 0 1
## 3 Germany_99 0.17 0.33 1 1 0
## 11 UK_99 0.18 0.34 0 0 1
## 12 USA_99 0.20 0.39 0 0 1
## 7 Sweden_95 0.36 0.47 1 1 0
## 1 France_95 0.41 0.50 1 1 0
## 9 UK_90 0.36 0.51 0 0 1
## 10 UK_95 0.36 0.59 0 0 1
## 4 Ireland_03 0.32 0.64 0 0 1
## 5 Ireland_99 0.32 0.64 0 0 1
## 2 Germany_03 0.40 0.67 1 1 0
## 6 Netherlands_95 0.45 0.68 1 1 0
## TT_OCCUP TT_STOCK TT_MA TT_row_membership EXPORT
## 8 0 1 1 0.88 0.98
## 3 1 1 1 0.71 0.67
## 11 0 1 1 0.82 0.98
## 12 0 1 1 0.80 0.99
## 7 1 1 1 0.62 0.71
## 1 1 0 0 0.56 0.62
## 9 0 1 1 0.64 0.79
## 10 0 1 1 0.64 0.87
## 4 0 1 1 0.68 1.00
## 5 0 1 1 0.68 1.00
## 2 1 0 0 0.51 0.69
## 6 1 1 1 0.51 0.70
Individually irrelevant cases
Individually irrelevant (IIR) cases owe their name to the fact that single within-case analysis of this type of case is not useful. IIR cases do, however, play a crucial role in two forms of comparative within-case analysis (see Section 2.3.2). Even if not useful for single case studies, identifying IIR cases is informative, as their list - together with the deviant cases coverage - indicates the diversity among cases without the outcome. The more different truth table rows are populated by IIR cases (and deviant cases coverage), the more heterogeneous this group of cases is.
Function mmr() lists all individually irrelevant cases with respect to the entire solution formula (also called globally uncovered IIR cases) and sorts each of them into the truth table row to which they belong best. Since these cases are not informative for single case studies and are used just to indicate diversity among the cases without the outcome, the function does not involve a formula ranking of IIR cases.
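How a case is assigned to the truth table row it belongs to best can be illustrated with base R alone. The sketch below takes two rows of head(SCHF) and relies on standard fuzzy-set rules: a case belongs to the row that requires presence of every condition in which its membership exceeds 0.5 and absence of all others, and its membership in that row is the minimum across conditions, with negated conditions scored as 1 minus membership. The object names are illustrative and not part of SetMethods.

```r
# Two rows of head(SCHF), conditions only.
cond <- rbind(
  Austria_90 = c(EMP = 0.70, BARGAIN = 0.98, UNI = 0.01,
                 OCCUP = 0.91, STOCK = 0.01, MA = 0.05),
  Belgium_90 = c(EMP = 0.94, BARGAIN = 0.95, UNI = 0.14,
                 OCCUP = 0.37, STOCK = 0.26, MA = 0.14)
)

# Crisp truth table row: 1 where membership exceeds 0.5, 0 otherwise.
tt_row <- (cond > 0.5) * 1

# Membership in that row: minimum over conditions, using 1 - x wherever
# the row requires the absence of the condition.
tt_membership <- apply(ifelse(cond > 0.5, cond, 1 - cond), 1, min)

tt_row
round(tt_membership, 2)
```

Both cases reproduce the TT_ columns and the TT_row_membership values (0.70 and 0.63) listed for them in the output below.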
# Get individually irrelevant cases for the parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = FALSE, cases = 5)
## Individually Irrelevant Cases :
## ----------
## case solution_membership TT_EMP TT_BARGAIN TT_UNI TT_OCCUP
## 19 New Zealand_90 0.17 0 0 0 0
## 18 New Zealand_03 0.25 0 0 1 0
## 21 New Zealand_99 0.25 0 0 1 0
## 6 Canada_03 0.38 0 0 1 0
## 7 Canada_99 0.20 0 0 1 0
## 20 New Zealand_95 0.20 0 0 1 0
## 4 Belgium_90 0.26 1 1 0 0
## 10 France_90 0.23 1 1 0 0
## 14 Italy_90 0.01 1 1 0 0
## 25 Spain_90 0.03 1 1 0 0
## 1 Austria_90 0.30 1 1 0 1
## 2 Austria_95 0.30 1 1 0 1
## 3 Austria_99 0.30 1 1 0 1
## 9 Finland_90 0.30 1 1 0 1
## 11 Germany_90 0.05 1 1 0 1
## 12 Germany_95 0.17 1 1 0 1
## 13 Italy_03 0.25 1 1 0 1
## 15 Italy_95 0.03 1 1 0 1
## 17 Netherlands_90 0.14 1 1 0 1
## 8 Denmark_90 0.45 1 1 0 1
## 16 Italy_99 0.38 1 1 0 1
## 5 Belgium_95 0.21 1 1 0 1
## 27 Sweden_90 0.07 1 1 0 1
## 26 Spain_95 0.06 1 1 1 0
## 22 Norway_90 0.12 1 1 1 1
## 23 Norway_95 0.49 1 1 1 1
## 24 Norway_99 0.45 1 1 1 1
## TT_STOCK TT_MA TT_row_membership EXPORT
## 19 0 1 0.58 0.06
## 18 0 1 0.75 0.13
## 21 0 1 0.75 0.08
## 6 1 1 0.62 0.36
## 7 1 1 0.80 0.40
## 20 1 1 0.76 0.09
## 4 0 0 0.63 0.14
## 10 0 0 0.54 0.42
## 14 0 0 0.70 0.20
## 25 0 0 0.69 0.17
## 1 0 0 0.70 0.25
## 2 0 0 0.70 0.25
## 3 0 0 0.70 0.47
## 9 0 0 0.70 0.17
## 11 0 0 0.92 0.27
## 12 0 0 0.74 0.32
## 13 0 0 0.67 0.31
## 15 0 0 0.71 0.18
## 17 0 0 0.57 0.39
## 8 1 0 0.53 0.34
## 16 1 0 0.62 0.29
## 5 1 1 0.53 0.20
## 27 1 1 0.74 0.36
## 26 0 0 0.84 0.19
## 22 0 0 0.65 0.14
## 23 0 0 0.51 0.14
## 24 0 1 0.55 0.32
Identifying best-matching pairs of cases for comparative process tracing
The literature identifies four feasible within-case comparisons after a QCA, each between two types of cases (Schneider and Rohlfing, manuscript). Each comparison pursues a different analytic goal. Figure 5 summarizes these goals. The two comparisons 'along the main diagonal' pursue a causal inference goal, whereas the two 'vertical' comparisons aim at improving the QCA model specification by identifying either an INUS condition missing from a known sufficient term or an entire new sufficient term missing from the solution formula.
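Figure 5: Forms of feasible comparisons. Left panel: Sufficient Term S (x-axis) against Outcome Y (y-axis), split at 0.5, with the comparisons labeled 'is mechanism regular', 'identify missing INUS', and 'is conjunction causal'. Right panel: Truth Table Row TT (x-axis) against Outcome Y (y-axis), split at 0.5, with the comparison labeled 'identify missing conjunction'.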
Matching typical and individually irrelevant (IIR) cases
The purpose of the within-case comparison between a typical case and an IIR case is to empirically investigate whether a sufficient term is a difference-maker, i.e. causal, not only for the outcome (Y) at the cross-case level, but also for the mechanism M at the within-case level. Similarly, the within-case comparison of two typical cases empirically probes whether the same mechanism M links the sufficient term S to outcome Y in typical cases that are as different from each other as possible. For both forms of comparison, it holds that if S is a conjunction, each of its conjuncts C must be a difference-maker. Hence, the comparisons between a typical case and an IIR case (or another typical case) must be performed for one conjunct C at a time. The following sections provide more details on each form of comparison and spell out the sorting mechanisms and mathematical formulas that underlie the respective functions in SetMethods.
Table 2: Possible membership constellations between focal (FC) and complementary conjuncts (CC) in comparison of typical and IIR case
Rank | Typical | IIR           | Difference FC | Determinate | Attribution typical | Attribution IIR
1    | FC ≤ CC | FC < 0.5 < CC | Yes           | Yes         | Yes                 | Yes
2    | FC > CC | FC < 0.5 < CC | Yes           | Yes         | No                  | Yes
3    | FC ≤ CC | FC ≤ CC < 0.5 | Yes           | No          | Yes                 | Yes
4    | FC ≤ CC | CC < FC < 0.5 | Yes           | No          | Yes                 | No
4    | FC > CC | FC ≤ CC < 0.5 | Yes           | No          | No                  | Yes
6    | FC > CC | CC < FC < 0.5 | Yes           | No          | No                  | No
7    | FC ≤ CC | CC < 0.5 < FC | No            | No          | Yes                 | No
8    | FC > CC | CC < 0.5 < FC | No            | No          | No                  | No
Taken from Schneider and Rohlfing (manuscript).
Function mmr() first sorts each pair of typical and IIR cases into ranks 1-8 as defined in Table 2. Pairs with smaller rank numbers are more adequate for the analytic goal of the comparative within-case analysis of these two case types. For case pairs in rank 1, for example, it holds that the difference-making quality can be attributed to the focal conjunct FC in both the typical and the IIR case, and that it is determinate.
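The rank assignment of Table 2 amounts to a two-way lookup, which the following base R sketch spells out. The function typ_iir_rank() is hypothetical and not the code used inside mmr(). It assumes that the non-strict relations in the table read FC ≤ CC, and the membership values in the two calls are made up for illustration.

```r
# Hypothetical helper implementing the rank lookup of Table 2.
# CC denotes the minimum membership across the complementary conjuncts.
typ_iir_rank <- function(fc_typ, cc_typ, fc_iir, cc_iir) {
  # Column "Typical": does the focal conjunct define term membership?
  typ_col <- if (fc_typ <= cc_typ) 1 else 2
  # Column "IIR": which of the four constellations applies?
  if (fc_iir < 0.5 && cc_iir > 0.5) {
    iir_row <- 1                      # FC < 0.5 < CC
  } else if (fc_iir <= cc_iir && cc_iir < 0.5) {
    iir_row <- 2                      # FC <= CC < 0.5
  } else if (cc_iir < fc_iir && fc_iir < 0.5) {
    iir_row <- 3                      # CC < FC < 0.5
  } else if (cc_iir < 0.5 && fc_iir > 0.5) {
    iir_row <- 4                      # CC < 0.5 < FC
  } else {
    return(NA_integer_)               # constellation not listed in Table 2
  }
  # Ranks as printed in Table 2 (rank 4 appears twice, rank 5 is unused).
  ranks <- matrix(c(1L, 3L, 4L, 7L,
                    2L, 4L, 6L, 8L), nrow = 4)
  ranks[iir_row, typ_col]
}

# Made-up membership scores, for illustration only.
typ_iir_rank(fc_typ = 0.8, cc_typ = 0.9, fc_iir = 0.2, cc_iir = 0.7)
typ_iir_rank(fc_typ = 0.9, cc_typ = 0.8, fc_iir = 0.3, cc_iir = 0.2)
```

The first pair falls into rank 1 (attribution possible in both cases and determinate), the second into rank 6.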
Within each rank, Formula 4 rewards the following criteria: between both cases, the difference in FC and in Y, respectively, should be large; the difference in their membership in CC should be small; and both cases should be close to the diagonal. Within each rank, case pairs with smaller formula values are more appropriate. Additionally, typical cases should be uniquely covered by the sufficient term under investigation, while IIR cases should be globally uncovered (not covered by any of the sufficient terms).
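$$
\begin{aligned}
TYP\text{-}IIR ={} & \underbrace{[1 - (FC_{TYP} - FC_{IIR})]}_{\text{large difference in focal condition}} \\
& + \underbrace{[1 - (Y_{TYP} - Y_{IIR})]}_{\text{large difference in outcome}} \\
& + \underbrace{|\min(CC_{TYP}) - \min(CC_{IIR})|}_{\text{small difference in complementary conditions}} \\
& + \underbrace{2\,|Y_{TYP} - \min(FC_{TYP}, CC_{TYP})|}_{\text{typical case close to diagonal}} \\
& + \underbrace{2\,|Y_{IIR} - \min(FC_{IIR}, CC_{IIR})|}_{\text{IIR case close to diagonal}}
\end{aligned}
\qquad (4)
$$
where Y = outcome, FC = focal condition, and CC = complementary condition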
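To see how the five summands of Formula 4 add up, the formula can be transcribed into a small base R function. This is a sketch rather than the code used inside mmr(): the function name typ_iir_distance() and the membership values in the call are hypothetical, and CC is passed in as the minimum across the complementary conjuncts.

```r
# Formula 4, transcribed term by term.
typ_iir_distance <- function(fc_typ, cc_typ, y_typ, fc_iir, cc_iir, y_iir) {
  (1 - (fc_typ - fc_iir)) +                    # large difference in FC
    (1 - (y_typ - y_iir)) +                    # large difference in Y
    abs(cc_typ - cc_iir) +                     # small difference in CC
    2 * abs(y_typ - min(fc_typ, cc_typ)) +     # typical case near diagonal
    2 * abs(y_iir - min(fc_iir, cc_iir))       # IIR case near diagonal
}

# Made-up membership scores, for illustration only.
typ_iir_distance(fc_typ = 0.8, cc_typ = 0.9, y_typ = 0.9,
                 fc_iir = 0.2, cc_iir = 0.7, y_iir = 0.1)
```

For these values the summands are 0.4, 0.2, 0.2, 0.2, and 0.2, giving a distance of 1.2. A pair whose memberships in FC and Y differ more strongly, or whose cases lie closer to the diagonal, obtains a smaller value.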
To switch to comparative MMR and identify pairs of cases, argument match must be set to TRUE. Additionally, to get the best available pairs of typical and IIR cases, we set cases to 2. In the output for the first sufficient term (notice that argument term is set to 1), we get the best available pairs for each focal conjunct in turn as separate tables. The output lists the names of the typical and IIR case, their value on Formula 4 (Distance), the rank of the pair, whether the typical case is uniquely covered, and whether the IIR case is globally uncovered. Researchers should strive to pick pairs in which the typical case is uniquely covered and the IIR case is globally uncovered, and which have the smallest rank possible and the smallest Distance value. For example, for focal conjunct OCCUP, typical case Denmark_99 and IIR case New Zealand_90 are the best pair available, as they are in Rank 1, they have the smallest formula value (Distance=1.24), and the typical case is uniquely covered by the term, while the IIR case is globally uncovered by the solution.
# Get matching pairs of typical and IIR cases for the first term
# of the parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = TRUE, cases = 2, term = 1)
## Focal Conjunct emp :
## ----------
## Typical IIR Distance PairRank UniqCovTyp GlobUncovIIR
## 44 Switzerland_03 Norway_90 1.30 1 TRUE TRUE
## 43 Denmark_03 Norway_90 1.38 1 TRUE TRUE
## 88 Switzerland_03 Norway_95 1.38 1 TRUE TRUE
## 80 Switzerland_03 Italy_95 1.40 1 TRUE TRUE
## 60 Switzerland_03 Belgium_95 1.55 1 TRUE TRUE
##
## Focal Conjunct OCCUP :
## ----------
## Typical IIR Distance PairRank UniqCovTyp
## 37 Denmark_99 New Zealand_90 1.24 1 TRUE
## 81 Denmark_99 New Zealand_95 1.30 1 TRUE
## 133 Denmark_99 New Zealand_03 1.31 1 TRUE
## 38 Switzerland_99 New Zealand_90 1.39 1 TRUE
## 82 Switzerland_99 New Zealand_95 1.45 1 TRUE
## GlobUncovIIR
## 37 TRUE
## 81 TRUE
## 133 TRUE
## 38 TRUE
## 82 TRUE
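The selection rule described above (coverage status first, then rank, then Distance) can be expressed as a sort. The base R sketch below retypes the five pairs printed for focal conjunct OCCUP, deliberately in shuffled order, and sorts them accordingly. The data frame pairs is built by hand here and is not an object returned by mmr().

```r
# Pairs for focal conjunct OCCUP, copied from the output above (shuffled).
pairs <- data.frame(
  Typical = c("Switzerland_99", "Denmark_99", "Denmark_99",
              "Switzerland_99", "Denmark_99"),
  IIR = c("New Zealand_95", "New Zealand_03", "New Zealand_90",
          "New Zealand_90", "New Zealand_95"),
  Distance = c(1.45, 1.31, 1.24, 1.39, 1.30),
  PairRank = c(1, 1, 1, 1, 1),
  UniqCovTyp = TRUE,
  GlobUncovIIR = TRUE,
  stringsAsFactors = FALSE
)

# Sort: fully (un)covered pairs first, then lower rank, then smaller Distance.
ord <- order(!(pairs$UniqCovTyp & pairs$GlobUncovIIR),
             pairs$PairRank, pairs$Distance)
best <- pairs[ord, ][1, ]
best[, c("Typical", "IIR", "Distance")]
```

The first row after sorting is the pair Denmark_99 and New Zealand_90, the pair singled out in the text.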
Matching two typical cases
The matching of two typical cases follows a logic similar to the one between a typical and an IIR case.
The goal is to probe the difference-making properties of each conjunct (FC) in sufficient term S to
mechanism M. Table 3 defines the four ranks that can occur based on two typical cases' membership
in FC and the complementary conditions CC. After sorting each possible pair of typical cases into one
of these ranks, Formula 5 further ranks those pairs such that their difference in FC and the outcome,
respectively, is minimized; that their membership in CC is maximized; and that both are close to the
diagonal (test severity principle). Additionally, the two typical cases should be uniquely covered by
the sufficient term.
Table 3: Possible membership constellations between focal (FC) and complementary conjuncts (CC)
in comparison of two typical cases

Rank  Typical 1  Typical 2  Attribution typical 1  Attribution typical 2
1     FC ≤ CC    FC ≤ CC    Yes                    Yes
2     FC ≤ CC    FC > CC    Yes                    No
2     FC > CC    FC ≤ CC    No                     Yes
4     FC > CC    CC > FC    No                     No

taken from Schneider and Rohlfing (manuscript)
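\begin{equation}
\begin{aligned}
\text{TYP1-TYP2} ={} & [0.5 - (FC_{TYP1} - FC_{TYP2})] && \text{large difference in focal condition} \\
& + [0.5 - (Y_{TYP1} - Y_{TYP2})] && \text{large difference in outcome} \\
& + |\min(CC_{TYP1}) - \min(CC_{TYP2})| && \text{small difference in complementary conditions} \\
& + 2\,|(Y_{TYP1} - \min(FC_{TYP1}, CC_{TYP1}))| && \text{typical case close to diagonal} \\
& + 2\,|(Y_{TYP2} - \min(FC_{TYP2}, CC_{TYP2}))| && \text{typical case close to diagonal}
\end{aligned}
\tag{5}
\end{equation}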
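To make the arithmetic of Formula 5 concrete, the following base R sketch computes the formula value
for one pair of typical cases. The helper typ_typ_distance() and all membership scores are
hypothetical and serve illustration only; this is not the code used inside mmr().

```r
# Hypothetical helper (not part of SetMethods): Formula 5 for one pair of typical cases.
# fc = membership in the focal conjunct, cc = vector of memberships in the
# complementary conjuncts, y = membership in the outcome.
typ_typ_distance <- function(fc1, cc1, y1, fc2, cc2, y2) {
  (0.5 - (fc1 - fc2)) +                 # difference in focal conjunct
    (0.5 - (y1 - y2)) +                 # difference in outcome
    abs(min(cc1) - min(cc2)) +          # difference in complementary conjuncts
    2 * abs(y1 - min(fc1, cc1)) +       # typical case 1 close to diagonal
    2 * abs(y2 - min(fc2, cc2))         # typical case 2 close to diagonal
}

# Made-up fuzzy-set scores for two typical cases
d <- typ_typ_distance(fc1 = 0.9, cc1 = c(0.8, 0.7), y1 = 0.9,
                      fc2 = 0.6, cc2 = c(0.7, 0.9), y2 = 0.7)
print(d)
```

Smaller values indicate a better pair within a given rank.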
For getting the best available pairs of two typical cases, argument cases in function mmr() must be
set to 1. The output is similar to the one for typical-IIR pairs of cases: the best available pair of
typical cases for each focal conjunct is situated in as low a rank as possible, has the smallest
formula value (Distance) within that rank, and consists of two uniquely covered cases. Looking at the
first term of the parsimonious solution, we can see that pair Switzerland_03-Denmark_03 is the best
available for focal conjunct emp, while pair Switzerland_99-Denmark_99 is the best available for focal
conjunct OCCUP. Both pairs contain uniquely covered typical cases, are in Rank 1, and have the
smallest Distance value within that rank for their respective focal conjunct.
# Get matching pairs of typical and typical cases for the first term
# of the parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = TRUE, cases = 1, term = 1)
## Focal Conjunct emp :
## ----------
## Typical1 Typical2 Distance PairRank UniqCov1 UniqCov2
## 12 Switzerland_03 Denmark_03 1.72 1 TRUE TRUE
## 4 Switzerland_03 Denmark_99 1.44 2 TRUE TRUE
## 3 Denmark_03 Denmark_99 1.62 2 TRUE TRUE
## 8 Switzerland_03 Switzerland_99 2.13 2 TRUE TRUE
## 7 Denmark_03 Switzerland_99 2.31 2 TRUE TRUE
##
## Focal Conjunct OCCUP :
## ----------
## Typical1 Typical2 Distance PairRank UniqCov1 UniqCov2
## 2 Switzerland_99 Denmark_99 1.27 1 TRUE TRUE
## 4 Switzerland_03 Denmark_99 1.44 3 TRUE TRUE
## 3 Denmark_03 Denmark_99 1.62 3 TRUE TRUE
## 8 Switzerland_03 Switzerland_99 2.13 3 TRUE TRUE
## 7 Denmark_03 Switzerland_99 2.31 3 TRUE TRUE
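The selection logic described above (lowest rank first, smallest formula value second) can be
sketched in base R. The data frame below copies the rows printed for focal conjunct emp. The sorting
rule is our reading of the text and is an assumption, not a description of how mmr() orders its
output internally.

```r
# Pairs for focal conjunct emp, copied from the output above
pairs <- data.frame(
  Typical1 = c("Switzerland_03", "Switzerland_03", "Denmark_03",
               "Switzerland_03", "Denmark_03"),
  Typical2 = c("Denmark_03", "Denmark_99", "Denmark_99",
               "Switzerland_99", "Switzerland_99"),
  Distance = c(1.72, 1.44, 1.62, 2.13, 2.31),
  PairRank = c(1, 2, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Sort by rank first and by formula value second, then take the top pair
ranked <- pairs[order(pairs$PairRank, pairs$Distance), ]
best <- ranked[1, ]
print(best)
```

The Rank 1 pair is preferred even though a Rank 2 pair has a smaller Distance value, because the
formula value only discriminates among pairs within the same rank.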
Matching typical and deviant cases consistency
The comparative within-case analysis of a typical and a deviant case consistency aims at identifying
the INUS condition missing from the sufficient term S in question. The best available pair of cases
meets the following criteria: their membership in S should be as high and as similar as possible,
and their membership in Y should be as different as possible. Formula 6 translates these matching
criteria into practice.
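\begin{equation}
\begin{aligned}
\text{TYP-DCON} ={} & [(1 - S_{TYP}) + (1 - S_{DCON})] && \text{large membership in term} \\
& + [1 - (Y_{TYP} - Y_{DCON})] && \text{large difference in outcome} \\
& + |S_{TYP} - S_{DCON}| && \text{similar membership in term}
\end{aligned}
\tag{6}
\end{equation}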
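As a worked illustration of Formula 6, the base R sketch below evaluates the formula for two
candidate pairs. The helper typ_dcon_distance(), the pair labels, and all membership scores are
hypothetical; the sketch only mirrors the three components of the formula.

```r
# Hypothetical helper (not part of SetMethods): Formula 6, vectorized over pairs.
typ_dcon_distance <- function(s_typ, y_typ, s_dcon, y_dcon) {
  ((1 - s_typ) + (1 - s_dcon)) +   # large membership of both cases in the term
    (1 - (y_typ - y_dcon)) +       # large difference in the outcome
    abs(s_typ - s_dcon)            # similar membership in the term
}

# Made-up scores for two candidate pairs (typical case - deviant consistency case)
cand <- data.frame(
  pair   = c("A-X", "B-X"),
  s_typ  = c(0.9, 0.7), y_typ  = c(0.9, 0.8),
  s_dcon = c(0.8, 0.8), y_dcon = c(0.2, 0.4),
  stringsAsFactors = FALSE
)
cand$distance <- with(cand, typ_dcon_distance(s_typ, y_typ, s_dcon, y_dcon))

# The pair with the smallest formula value is the best available one
best_pair <- cand$pair[which.min(cand$distance)]
print(cand)
```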
Setting cases to 3, we get the best available pair of typical and deviant consistency cases for each
sufficient term in the parsimonious solution sol_yp. For identifying a missing INUS condition in
sufficient term emp*OCCUP, the best available pair of cases that we could choose for process-tracing
would be the one between typical case Switzerland_03 and deviant consistency case Australia_90, as
they have the smallest formula value (Distance=0.84).
# Get matching pairs of typical and deviant consistency cases for the
# parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = TRUE, cases = 3)
## Term emp*OCCUP :
## ----------
## typical deviant_consistency distance term best_matching_pair
## 1 Switzerland_03 Australia_90 0.84 emp*OCCUP TRUE
## 2 Switzerland_99 Australia_90 0.85 emp*OCCUP FALSE
## 3 Switzerland_99 Switzerland_90 0.85 emp*OCCUP FALSE
## 4 Switzerland_03 Switzerland_90 0.92 emp*OCCUP FALSE
## 5 Switzerland_03 Australia_95 0.96 emp*OCCUP FALSE
##
## Term BARGAIN*UNI*STOCK :
## ----------
## typical deviant_consistency distance term
## 1 Netherlands_03 Australia_95 0.55 BARGAIN*UNI*STOCK
## 2 Netherlands_03 Australia_03 0.59 BARGAIN*UNI*STOCK
## 3 Netherlands_99 Australia_95 0.65 BARGAIN*UNI*STOCK
## 4 Netherlands_99 Australia_03 0.69 BARGAIN*UNI*STOCK
## 5 Netherlands_99 Spain_99 0.73 BARGAIN*UNI*STOCK
## best_matching_pair
## 1 TRUE
## 2 FALSE
## 3 FALSE
## 4 FALSE
## 5 FALSE
##
## Term occup*STOCK*ma :
## ----------
## typical deviant_consistency distance term best_matching_pair
## 1 USA_03 Spain_03 0.84 occup*STOCK*ma TRUE
## 2 Japan_99 Spain_03 0.86 occup*STOCK*ma FALSE
## 3 Japan_03 Spain_03 0.88 occup*STOCK*ma FALSE
## 4 USA_90 Spain_03 0.89 occup*STOCK*ma FALSE
## 5 USA_95 Spain_03 0.91 occup*STOCK*ma FALSE
Matching deviant cases coverage and IIR cases
The comparative within-case analysis of a deviant case coverage and an IIR case aims at identifying
the sufficient conjunction S+ missing from the sufficient solution formula generated with QCA. The
point of reference for matching cases is their membership in the truth table row TT to which they
belong. Analogous to the within-case comparison of a typical and a deviant case consistency, the goal
is to maximize both cases' membership and their similarity in TT and their difference in Y.
Formula 7 achieves this.
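\begin{equation}
\begin{aligned}
\text{DCOV-IIR} ={} & [(1 - TT_{DCOV}) + (1 - TT_{IIR})] && \text{large membership in TT row} \\
& + [1 - (Y_{DCOV} - Y_{IIR})] && \text{large difference in outcome} \\
& + |TT_{DCOV} - TT_{IIR}| && \text{similar membership in TT row}
\end{aligned}
\tag{7}
\end{equation}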
Cases for this fourth type of comparison can be identified by setting cases to 4. Since for deviant
coverage and IIR cases we are interested in identifying an entire missing sufficient term, the output
for these pairs is focused on matching pairs in truth table rows, rather than in sufficient terms.
Therefore, the output is sorted by truth table rows (the columns starting with TT showing the
combination of conditions) and for each truth table row we can identify a best matching pair of
cases according to formula values in column distance. For example, if we focus on truth table row
EMP*BARGAIN*uni*OCCUP*stock*ma (rows 6, 7, 8, 9, 10 in the output), the deviant case coverage
France_95 and the IIR case Finland_90 constitute the best matching pair, having the smallest
formula value for this specific truth table row (distance=1.43).
# Get matching pairs of deviant coverage and IIR cases for the
# parsimonious solution:
mmr (results = sol_yp, outcome = "EXPORT", neg.out = FALSE,
sol = 1, match = TRUE, cases = 4)
## Matching Deviant Coverage-IIR Cases :
## ----------
## deviant_coverage individually_irrelevant distance best_matching_pair
## 1 USA_99 New Zealand_95 0.58 TRUE
## 2 UK_03 New Zealand_95 0.59 FALSE
## 3 UK_99 New Zealand_95 0.59 FALSE
## 4 Ireland_03 New Zealand_95 0.73 FALSE
## 5 Ireland_99 New Zealand_95 0.73 FALSE
## 6 France_95 Finland_90 1.43 TRUE
## 7 France_95 Italy_95 1.44 FALSE
## 8 Germany_03 Finland_90 1.46 FALSE
## 9 Germany_03 Italy_95 1.47 FALSE
## 10 France_95 Austria_90 1.51 FALSE
## 11 Sweden_95 Sweden_90 1.41 TRUE
## 12 Sweden_95 Belgium_95 1.43 FALSE
## 13 Netherlands_95 Belgium_95 1.48 FALSE
## 14 Netherlands_95 Sweden_90 1.64 FALSE
## TT_EMP TT_BARGAIN TT_UNI TT_OCCUP TT_STOCK TT_MA
## 1 0 0 1 0 1 1
## 2 0 0 1 0 1 1
## 3 0 0 1 0 1 1
## 4 0 0 1 0 1 1
## 5 0 0 1 0 1 1
## 6 1 1 0 1 0 0
## 7 1 1 0 1 0 0
## 8 1 1 0 1 0 0
## 9 1 1 0 1 0 0
## 10 1 1 0 1 0 0
## 11 1 1 0 1 1 1
## 12 1 1 0 1 1 1
## 13 1 1 0 1 1 1
## 14 1 1 0 1 1 1
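The idea of one best matching pair per truth table row can be reproduced in base R. The sketch below
copies a few of the pairs and distances printed above; the column tt_row is a hypothetical label
obtained by pasting together the TT_ columns, and the flagging rule is an illustration rather than
the internal code of mmr().

```r
# A subset of the pairs printed above, with distances copied from the output
pairs <- data.frame(
  deviant_coverage        = c("France_95", "France_95", "Germany_03",
                              "Sweden_95", "Sweden_95", "Netherlands_95"),
  individually_irrelevant = c("Finland_90", "Italy_95", "Finland_90",
                              "Sweden_90", "Belgium_95", "Belgium_95"),
  distance = c(1.43, 1.44, 1.46, 1.41, 1.43, 1.48),
  # Hypothetical row label: TT_EMP, TT_BARGAIN, TT_UNI, TT_OCCUP, TT_STOCK, TT_MA pasted together
  tt_row   = c("110100", "110100", "110100", "110111", "110111", "110111"),
  stringsAsFactors = FALSE
)

# Within each truth table row, flag the pair with the smallest formula value
pairs$best_matching_pair <-
  pairs$distance == ave(pairs$distance, pairs$tt_row, FUN = min)
print(pairs)
```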
Enhanced standard analysis (ESA)
Limited empirical diversity is an omnipresent feature in social science data. The treatment of logical
remainder rows has been a major theme since Ragin's path-breaking book (Ragin, 1987, esp.
chapter 7). In Ragin (2008, chapters 8 and 9), three approaches towards remainders are proposed
under the label of the Standard Analysis (SA). Researchers can decide not to include them into the
logical minimization (yielding the conservative or complex solution CS), to include all remainders
that are simplifying (yielding the most parsimonious solution PS), or to include only those
simplifying assumptions that are easy based on so-called directional expectations (yielding the
intermediate solution IS).
Schneider and Wagemann (2012, chapter 8) propose the Enhanced Standard Analysis (ESA), which
argues that simplifying assumptions on specific remainders can be untenable. There are three sources
of untenability: two kinds of incoherent counterfactuals, which are either logical remainders
contradicting claims of necessity [6] or assumptions made for the negated outcome [7], and
implausible counterfactuals, which consist of claims about impossible remainders [8]. ESA simply
stipulates that no QCA solution formula can be based on untenable assumptions.
Figure 6 provides a graphical representation of the different types of assumptions as defined
by SA and ESA. Both approaches only allow for simplifying assumptions [9] (i.e. those in the inner
circle) and both distinguish between difficult and easy counterfactuals (i.e. the vertical line inside
the circle) [10]. ESA, but not SA, blocks any untenable assumption (i.e. the gray area on the lower
part). A risk of making untenable assumptions exists whenever a researcher claims the presence of a
necessary condition, when statements of sufficiency for both the outcome and its negation are made,
and/or when two or more conditions with mutually exclusive categories are used in a truth table.
ESA requires that researchers identify those logical remainder rows whose inclusion into the logical
minimization would amount to an untenable claim. As a result, one obtains the enhanced PS and the
enhanced IS. [11]
Function esa() provides a straightforward tool for avoiding untenable assumptions and thus
putting ESA into practice. First, function esa() can exclude remainders that contradict single
necessary conditions, unions of necessary conditions, or more complicated expressions of necessity.
For example, assuming that the disjunction STOCK + MA is necessary for the outcome EXPORT, we ban
all remainder rows that contradict this necessity claim by means of the nec_cond argument. All the
logical remainder rows that are subsets of ¬STOCK*¬MA are subsequently set to OUT = 0 in the truth
table object newtt and thus excluded from further logical minimization.
# Ban logical remainders contradicting necessity:
# Let's assume that "STOCK + MA" is necessary for "EXPORT":
newtt <- esa(oldtt = TT_y, nec_cond = "STOCK + MA")
## EMP BARGAIN UNI OCCUP STOCK MA OUT n incl PRI cases
## 1 0 0 0 0 0 0 0 0 - -
## 3 0 0 0 0 1 0 ? 0 - -
## 4 0 0 0 0 1 1 ? 0 - -
## 5 0 0 0 1 0 0 0 0 - -
## 6 0 0 0 1 0 1 ? 0 - -
## 7 0 0 0 1 1 0 ? 0 - -
## 9 0 0 1 0 0 0 0 0 - -
## 13 0 0 1 1 0 0 0 0 - -
## 14 0 0 1 1 0 1 ? 0 - -
## 15 0 0 1 1 1 0 ? 0 - -
## 17 0 1 0 0 0 0 0 0 - -
## 18 0 1 0 0 0 1 ? 0 - -
## 20 0 1 0 0 1 1 ? 0 - -
## 21 0 1 0 1 0 0 0 0 - -
## 22 0 1 0 1 0 1 ? 0 - -
## 23 0 1 0 1 1 0 ? 0 - -
## 24 0 1 0 1 1 1 ? 0 - -
## 25 0 1 1 0 0 0 0 0 - -
## 26 0 1 1 0 0 1 ? 0 - -
## 30 0 1 1 1 0 1 ? 0 - -
## 31 0 1 1 1 1 0 ? 0 - -
## 33 1 0 0 0 0 0 0 0 - -
## 34 1 0 0 0 0 1 ? 0 - -
## 35 1 0 0 0 1 0 ? 0 - -
## 36 1 0 0 0 1 1 ? 0 - -
## 37 1 0 0 1 0 0 0 0 - -
## 38 1 0 0 1 0 1 ? 0 - -
## 39 1 0 0 1 1 0 ? 0 - -
## 40 1 0 0 1 1 1 ? 0 - -
## 41 1 0 1 0 0 0 0 0 - -
## 42 1 0 1 0 0 1 ? 0 - -
## 44 1 0 1 0 1 1 ? 0 - -
## 45 1 0 1 1 0 0 0 0 - -
## 46 1 0 1 1 0 1 ? 0 - -
## 47 1 0 1 1 1 0 ? 0 - -
## 48 1 0 1 1 1 1 ? 0 - -
## 50 1 1 0 0 0 1 ? 0 - -
## 51 1 1 0 0 1 0 ? 0 - -
## 52 1 1 0 0 1 1 ? 0 - -
## 54 1 1 0 1 0 1 ? 0 - -
## 58 1 1 1 0 0 1 ? 0 - -
## 59 1 1 1 0 1 0 ? 0 - -
[6] X ← Y (implies ¬X → ¬Y) & ¬X → Y
[7] X → Y & X → ¬Y
[8] For instance, X = 'rich-poor country' etc. & X → Y
[9] Schneider and Wagemann (2012) propose Theory-Guided ESA (TESA) as an approach in which
parsimony is not the primary decision rule for choosing remainders.
[10] It can be noted that all assumptions that contribute to parsimony are either easy or difficult;
they cannot be neither. This is because all assumptions not constrained by directional expectations
(if there are any or just some) are by default easy.
[11] For an application of ESA, see, for instance, Thomann (2015).
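The logic behind banning remainders that contradict a necessity claim can be sketched in base R on a
toy truth table. The data frame and its rows are hypothetical, and the code only illustrates the
idea for the claim that STOCK + MA is necessary; it is not the implementation of esa().

```r
# Toy truth table (hypothetical rows); OUT = "?" marks logical remainders
tt <- data.frame(
  STOCK = c(0, 0, 1, 1, 0),
  MA    = c(0, 1, 0, 1, 0),
  OUT   = c("?", "?", "?", "1", "0"),
  stringsAsFactors = FALSE
)

# If STOCK + MA is necessary for the outcome, a remainder in which both
# STOCK and MA are absent cannot be assumed to produce the outcome
contradicts <- tt$OUT == "?" & tt$STOCK == 0 & tt$MA == 0

# Such remainders are set to OUT = 0 and thereby kept out of the minimization
tt$OUT[contradicts] <- "0"
print(tt)
```

Only the first row changes: it is a remainder with STOCK = 0 and MA = 0. The last row shares this
combination but is empirically observed, so it is left untouched.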
Secondly, the esa() function can also ban implausible counterfactuals to produce truth tables
in which specific logical remainders identified through conjunctions are excluded. For example,
we can ban all remainder rows that are subsets of BARGAIN*~OCCUP by using this Boolean expression
in argument untenable_LR. Finally, the function can exclude contradictory simplifying assumptions
(which are another form of untenable assumptions) and empirically observed rows that are part of
simultaneous subset relations [12] by just using the unique truth table row identifier in the
argument contrad_rows. While argument untenable_LR accepts Boolean expressions for excluding only
logical remainders, argument contrad_rows can exclude both empirically observed rows and remainder
rows through their unique identifier (row number). [13]
# Ban impossible logical remainders:
newtt <- esa(oldtt = TT_y, untenable_LR = "BARGAIN*~OCCUP")
# Ban contradictory rows:
newtt <- esa(oldtt = TT_y, contrad_rows = c("19", "14", "46", "51"))
Set-analytic theory evaluation
Ragin (1987, chapter 7) spells out the notion of theory evaluation. In essence, it consists of
identifying the overlap between a researcher's theory (T) as formulated prior to the empirical
analysis and the empirical results (S) obtained via QCA. With both T and S being represented in the
form of Boolean expressions, all four logically possible combinations between T and S can be
expressed in Boolean terms as well, and each case's membership in each of these four expressions can
be calculated. Theory evaluation reveals which aspects of T are empirically corroborated by S and
which ones are not. It also reveals how strong this empirical support is. Last but not least, theory
evaluation can serve as a case selection device by identifying cases that display membership scores
in the empirical solution and the outcome that are expected or utterly unexpected based on T.
Schneider and Wagemann (2012, chapter 11) refine Theory Evaluation by taking into account
each case's membership score not only in T and S, but also in outcome Y. This is necessary because,
with the development of parameters of fit, it has become widespread practice to allow for solution
formulas with less than perfect consistency and coverage scores. This means that in applied QCA,
there are cases of S & ¬Y and of ¬S & Y. This gives rise to eight different areas. Each area can be
defined as a Boolean expression, provides different analytic information, and defines different
types of cases. Each case has partial membership in all areas but a membership higher than 0.5 in
only one of them. Figure 7 provides a visualization of the areas and the kinds of cases in each
area. [14]
Figure 6: Graphical Representation of Different Types of Assumptions on Remainders (areas shown:
simplifying assumptions, divided into easy and difficult counterfactuals; tenable assumptions, not
simplifying; untenable assumptions, not simplifying; untenable easy counterfactuals; untenable
difficult counterfactuals)
[12] Simultaneous subset relations happen when an empirically observed truth table row is a
consistent enough subset of both the outcome and its negation.
[13] Due to space restrictions, output for these functions is not shown here.
Figure 7: Graphical Representation of Theory Evaluation (overlapping sets Theory T and Solution S
with areas T¬S, TS, ¬TS, and ¬T¬S, each labeled with the kinds of cases for Y and for ¬Y)
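The calculation of each case's membership in the four intersections between theory and solution can
be sketched in base R with fuzzy-set intersection (minimum) and negation (1 minus membership). All
scores below are hypothetical, and the labels T*E, t*E, T*e, and t*e follow the notation of the
theory.evaluation() output shown further below; the sketch is not the function's implementation.

```r
# Hypothetical fuzzy membership scores of three cases
theory   <- c(0.8, 0.3, 0.9)   # membership in theory T
empirics <- c(0.7, 0.2, 0.4)   # membership in solution S (printed as E/e)
outcome  <- c(0.9, 0.1, 0.6)   # membership in outcome Y

# Membership in the four intersections of theory and solution
areas <- data.frame(
  "T*E" = pmin(theory, empirics),
  "t*E" = pmin(1 - theory, empirics),
  "T*e" = pmin(theory, 1 - empirics),
  "t*e" = pmin(1 - theory, 1 - empirics),
  check.names = FALSE
)

# Each case has its highest membership in exactly one intersection
area_of_case <- names(areas)[max.col(as.matrix(areas), ties.method = "first")]

# Crossing the intersection with the outcome yields the eight areas
label <- paste(area_of_case, ifelse(outcome > 0.5, "and Y > 0.5", "and Y < 0.5"))
print(label)
```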
Function theory.evaluation() performs the theory evaluation procedure between a theory specified
in Boolean terms and results obtained using the QCA package. Assuming that the theory can be
summarized as EMP*~MA + STOCK, the example below shows how theory evaluation works using the second
intermediate solution for outcome EXPORT. The first part of the output shows the names and
proportion of cases in each of the intersections between theory and the empirical solution. The
second part of the output shows parameters of fit for the solution, the theory, and their
intersections, which indicate how much each of these areas is in line with a statement of
sufficiency for EXPORT. Additionally, the function also stores the membership of each case in each
intersection between theory and empirics, which can be accessed by setting argument print.data to
TRUE.
[14] For applied examples of theory evaluation, see, for instance, Sager and Thomann (2017);
Schneider and Maerz (2017).
# Assuming the theory can be summarized as "EMP*~MA + STOCK",
# perform theory evaluation using the second intermediate solution:
theory.evaluation(theory = "EMP*~MA + STOCK", empirics = sol_yi,
outcome = "EXPORT", sol = 2, print.data=FALSE)
##
## Cases:
## ----------
##
## Covered Most Likely (T*E and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 23 / 76 = 30.26 %
##
## Case Names:
## [1] "Ireland_90" "Japan_90" "USA_90" "Ireland_95"
## [5] "Japan_95" "Switzerland_95" "USA_95" "Denmark_99"
## [9] "Finland_99" "France_99" "Japan_99" "Netherlands_99"
## [13] "Sweden_99" "Switzerland_99" "Belgium_03" "Denmark_03"
## [17] "Finland_03" "France_03" "Japan_03" "Netherlands_03"
## [21] "Sweden_03" "Switzerland_03" "USA_03"
##
## Covered Least Likely (t*E and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 0 / 76 = 0 %
##
## Case Names:
## [1] "No cases in this intersection"
##
## Uncovered Most Likely (T*e and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 12 / 76 = 15.79 %
##
## Case Names:
## [1] "UK_90" "France_95" "Netherlands_95" "Sweden_95"
## [5] "UK_95" "Germany_99" "Ireland_99" "UK_99"
## [9] "USA_99" "Germany_03" "Ireland_03" "UK_03"
##
## Uncovered Least Likely (t*e and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 0 / 76 = 0 %
##
## Case Names:
## [1] "No cases in this intersection"
##
## Inconsistent Most Likely (T*E and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 12 / 76 = 15.79 %
##
## Case Names:
## [1] "Canada_90" "Switzerland_90" "Australia_95" "Canada_95"
## [5] "Denmark_95" "Finland_95" "Australia_99" "Belgium_99"
## [9] "Spain_99" "Australia_03" "Norway_03" "Spain_03"
##
## Inconsistent Least Likely (t*E and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 1 / 76 = 1.32 %
##
## Case Names:
## [1] "Australia_90"
##
## Consistent Most Likely (T*e and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 23 / 76 = 30.26 %
##
## Case Names:
## [1] "Austria_90" "Belgium_90" "Denmark_90" "Finland_90"
## [5] "France_90" "Germany_90" "Italy_90" "Netherlands_90"
## [9] "Norway_90" "Spain_90" "Sweden_90" "Austria_95"
## [13] "Belgium_95" "Germany_95" "Italy_95" "New Zealand_95"
## [17] "Norway_95" "Spain_95" "Austria_99" "Canada_99"
## [21] "Italy_99" "Canada_03" "Italy_03"
##
## Consistent Least Likely (t*e and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 4 / 76 = 5.26 %
##
## Case Names:
## [1] "New Zealand_90" "New Zealand_99" "Norway_99" "New Zealand_03"
##
##
## Fit:
## ----------
##
## Cons.Suf Cov.Suf PRI Cons.Suf(H)
## emp*bargain*OCCUP 0.909 0.194 0.721 0.865
## BARGAIN*UNI*STOCK 0.796 0.497 0.665 0.704
## emp*UNI*OCCUP*ma 0.919 0.171 0.611 0.894
## emp*occup*STOCK*ma 0.904 0.298 0.802 0.859
## UNI*occup*STOCK*ma 0.894 0.341 0.795 0.853
## Sol.Formula 0.799 0.705 0.691 0.716
## Theory 0.639 0.973 0.515 0.550
## T*E 0.811 0.705 0.707 0.726
## t*E 0.825 0.165 0.423 0.764
## T*e 0.651 0.547 0.419 0.592
## t*e 0.697 0.203 0.232 0.640
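For readers who want to see what the first three columns of the Fit table measure, the base R sketch
below computes consistency, coverage, and PRI for a sufficiency relation using the standard
fuzzy-set formulas. The membership scores are hypothetical. The column Cons.Suf(H) is not covered
here.

```r
# Hypothetical fuzzy scores: x = membership in a term, y = membership in the outcome
x <- c(0.9, 0.8, 0.6, 0.2, 0.1)
y <- c(1.0, 0.7, 0.7, 0.4, 0.3)

# Consistency of sufficiency: share of x that lies within y
cons_suf <- sum(pmin(x, y)) / sum(x)

# Coverage of sufficiency: share of y that is covered by x
cov_suf <- sum(pmin(x, y)) / sum(y)

# PRI: consistency after discounting the part of x shared with both y and not-y
both <- sum(pmin(x, y, 1 - y))
pri  <- (sum(pmin(x, y)) - both) / (sum(x) - both)

print(round(c(Cons.Suf = cons_suf, Cov.Suf = cov_suf, PRI = pri), 3))
```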
Diagnostic tools for clustered data structures
Most of the data analyzed in the social sciences and neighboring disciplines contain structures, or
layers, that might be analytically relevant but are not captured by the models used to analyze those
data. García-Castro and Arino (2016) discuss clustering along a temporal dimension. This can be days,
years, decades, or substantively important periods (before and after a crisis). Clusters can also be
of different origin. For instance, cases can be clustered along geographic units, such as world
regions or subnational units. There can also be clusterings along substantive lines, such as
economic sectors, parties, or political regime types.
Whenever a researcher is not capturing these differences via a condition in her QCA model, she
de facto assumes that the analytic difference does not matter. There are often good reasons not to
include additional conditions in an analysis, with keeping limited diversity at bay being one of
them. It should, however, be put to an empirical test whether it is acceptable to pool cases across
different time periods, geographic units, and/or substantive areas.
Function cluster() provides the tools for performing such a test. It analyzes whether the QCA
solution formula obtained from the pooled data also can be found in each of the sub-populations in
the data. If it can, then pooling the data is fine. If it cannot, then pooling the data is not fine
because it produces a solution that does not hold for all sub-populations. Rather, it is an artifact
of having pooled cases that follow different causal logics. In this case, researchers might decide
to drop from their analysis those sub-populations that do not follow the general pattern or to
include a condition into their model that captures the difference.
For using the cluster() function, researchers need data in the long format, with a column
identifying the unit of analysis and a column identifying the clustering element. In our example,
using the Schneider et al. (2010) data, the column identifying the units is the COUNTRY column, while
the cluster element is stored in the YEAR column. After having the data in the long format, we can
get the diagnostic of how our solution holds throughout the different units and clusters by just
inputting the solution (in this case sol_yi) in the cluster() function, while also specifying the
data, the outcome, the unit identifier, and the cluster identifier. The first part of the output
shows the consistency of sufficiency for the overall, pooled sample for each sufficient term and the
entire solution to be diagnosed. This first row of the output should be equivalent to the consistency
measures obtained when producing the solution. The rows below show consistency values of the same
terms and solution, but for each cluster and each unit in turn. These are the values we would obtain
if the analysis were to be run for each cluster subsample and each unit subsample separately. For
example, the pooled consistency of sufficient term emp*UNI*OCCUP*ma is 0.919 but only 0.733 for the
year 1990. This might be an indication for the researcher that her sufficiency statement is not as
consistent and might not work in the same way for the cluster of cases from 1990. In general, if
consistency values between clusters differ greatly from pooled consistency for a term, we might want
to re-examine the setup of the analysis to account for this. The Distances: section of the output
reports on how much the parameters of fit differ from the clusters to the pooled data. Finally, the
last part of the output displays a similar table for coverage, with pooled, between clusters, and
within units measures.
# Perform cluster diagnostic:
# First we need to load the Schneider et al. (2010) data in the long format:
data(SCHLF)
# This data has a column identifying the unit (country)
# and the clustering element (year):
head(SCHLF)
## EMP BARGAIN UNI OCCUP STOCK MA EXPORT COUNTRY YEAR
## Australia_90 0.07 0.90 1.00 0.68 0.45 0.33 0.19 Australia 1990
## Austria_90 0.70 0.98 0.01 0.91 0.01 0.05 0.25 Austria 1990
## Belgium_90 0.94 0.95 0.14 0.37 0.26 0.14 0.14 Belgium 1990
## Canada_90 0.04 0.21 0.99 0.11 0.62 0.31 0.28 Canada 1990
## Denmark_90 0.59 0.78 0.10 0.55 0.53 0.10 0.34 Denmark 1990
## Finland_90 0.70 0.97 0.20 0.95 0.02 0.13 0.17 Finland 1990
# Get the intermediate solution:
sol_yi <- minimize(SCHLF, outcome = "EXPORT",
                   conditions = c("EMP", "BARGAIN", "UNI",
                                  "OCCUP", "STOCK", "MA"),
                   incl.cut1 = .9,
                   include = "?",
                   details = TRUE, show.cases = TRUE,
                   dir.exp = c(0, 0, 0, 0, 0, 0))
# Get pooled, within, and between consistencies for the intermediate solution:
cluster(data = SCHLF, results = sol_yi, outcome = "EXPORT", unit_id = "COUNTRY",
cluster_id = "YEAR")
## Consistencies:
## ---------------
## emp*bargain*OCCUP BARGAIN*UNI*STOCK emp*UNI*OCCUP*ma
## Pooled 0.909 0.796 0.919
## Between 1990 0.839 0.873 0.733
## Between 1995 0.903 0.727 0.953
## Between 1999 0.928 0.802 1.000
## Between 2003 0.951 0.818 1.000
## Within Australia 1.000 0.405 0.634
## Within Austria 1.000 1.000 1.000
## Within Belgium 1.000 0.803 1.000
## Within Canada 1.000 1.000 1.000
## Within Denmark 1.000 0.757 1.000
## Within Finland 1.000 0.835 0.957
## Within France 1.000 0.916 1.000
## Within Germany 1.000 1.000 1.000
## Within Ireland 1.000 1.000 1.000
## Within Italy 1.000 0.800 1.000
## Within Japan 1.000 1.000 1.000
## Within Netherlands 1.000 1.000 1.000
## Within NewZealand 0.414 0.875 0.727
## Within Norway 0.965 0.486 0.930
## Within Spain 1.000 0.524 1.000
## Within Sweden 1.000 0.926 1.000
## Within Switzerland 0.880 1.000 1.000
## Within UK 1.000 1.000 1.000
## Within USA 1.000 1.000 1.000
## emp*occup*STOCK*ma bargain*occup*STOCK*ma
## Pooled 0.904 0.913
## Between 1990 0.858 0.903
## Between 1995 0.847 0.884
## Between 1999 1.000 1.000
## Between 2003 0.995 0.916
## Within Australia 0.865 1.000
## Within Austria 1.000 1.000
## Within Belgium 1.000 1.000
## Within Canada 0.587 0.587
## Within Denmark 0.732 1.000
## Within Finland 1.000 1.000
## Within France 1.000 1.000
## Within Germany 1.000 1.000
## Within Ireland 1.000 1.000
## Within Italy 1.000 1.000
## Within Japan 1.000 0.997
## Within Netherlands 1.000 1.000
## Within NewZealand 0.710 0.710
## Within Norway 0.930 0.961
## Within Spain 1.000 0.628
## Within Sweden 1.000 1.000
## Within Switzerland 1.000 1.000
## Within UK 1.000 1.000
## Within USA 1.000 1.000
##
##
## Distances:
## ----------
## emp*bargain*OCCUP BARGAIN*UNI*STOCK
## From Between to Pooled 0.023 0.032
## From Within to Pooled 0.031 0.050
## emp*UNI*OCCUP*ma emp*occup*STOCK*ma
## From Between to Pooled 0.060 0.039
## From Within to Pooled 0.024 0.030
## bargain*occup*STOCK*ma
## From Between to Pooled 0.024
## From Within to Pooled 0.032
##
##
## Coverages:
## ----------
## emp*bargain*OCCUP BARGAIN*UNI*STOCK emp*UNI*OCCUP*ma
## Pooled 0.194 0.497 0.171
## Between 1990 0.231 0.246 0.193
## Between 1995 0.206 0.466 0.271
## Between 1999 0.174 0.589 0.042
## Between 2003 0.184 0.570 0.214
## Within Australia 0.415 1.000 0.675
## Within Austria 0.075 0.041 0.279
## Within Belgium 0.138 0.959 0.283
## Within Canada 0.328 0.545 0.246
## Within Denmark 0.273 0.894 0.317
## Within Finland 0.059 0.937 0.282
## Within France 0.070 0.805 0.118
## Within Germany 0.236 0.374 0.205
## Within Ireland 0.113 0.352 0.098
## Within Italy 0.173 0.367 0.112
## Within Japan 0.161 0.064 0.161
## Within Netherlands 0.150 0.748 0.183
## Within NewZealand 1.000 0.778 0.667
## Within Norway 0.598 0.978 0.435
## Within Spain 0.204 0.710 0.204
## Within Sweden 0.061 0.761 0.054
## Within Switzerland 0.738 0.244 0.032
## Within UK 0.075 0.282 0.052
## Within USA 0.037 0.045 0.037
## emp*occup*STOCK*ma bargain*occup*STOCK*ma
## Pooled 0.298 0.278
## Between 1990 0.452 0.450
## Between 1995 0.459 0.345
## Between 1999 0.069 0.117
## Between 2003 0.321 0.290
## Within Australia 0.675 0.317
## Within Austria 0.041 0.034
## Within Belgium 0.228 0.103
## Within Canada 0.701 0.701
## Within Denmark 0.480 0.238
## Within Finland 0.218 0.042
## Within France 0.132 0.048
## Within Germany 0.205 0.169
## Within Ireland 0.291 0.118
## Within Italy 0.265 0.184
## Within Japan 0.456 0.947
## Within Netherlands 0.196 0.070
## Within NewZealand 0.611 0.611
## Within Norway 0.435 0.533
## Within Spain 0.204 0.581
## Within Sweden 0.054 0.057
## Within Switzerland 0.093 0.093
## Within UK 0.072 0.072
## Within USA 0.717 0.717
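The difference between pooled, between, and within consistencies can be illustrated with a few lines
of base R. The long-format data below are hypothetical (two units observed in two years), and the
helper cons_suf() is the standard consistency of sufficiency formula; the sketch does not reproduce
the Distances section of the cluster() output.

```r
# Hypothetical long-format data: two units observed in two years
d <- data.frame(
  unit = c("A", "A", "B", "B"),
  year = c(1990, 1995, 1990, 1995),
  X    = c(0.8, 0.9, 0.6, 0.7),   # membership in a sufficient term
  Y    = c(0.9, 0.9, 0.3, 0.8)    # membership in the outcome
)

# Consistency of sufficiency
cons_suf <- function(x, y) sum(pmin(x, y)) / sum(x)

# Pooled: all cases together
pooled <- cons_suf(d$X, d$Y)

# Between: one value per cluster (here: per year)
between <- sapply(split(d, d$year), function(g) cons_suf(g$X, g$Y))

# Within: one value per unit
within_units <- sapply(split(d, d$unit), function(g) cons_suf(g$X, g$Y))

print(pooled)
print(between)
print(within_units)
```

In this toy example the term is perfectly consistent in 1995 and for unit A, while the 1990 cluster
and unit B pull the pooled consistency down.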
Function cluster() can be applied in a similar fashion for necessary relationships by just setting
argument necessity to TRUE and inputting the necessary condition to be diagnosed in the argument
results. Additionally, we can also diagnose Boolean expressions by just entering them into the
results argument.
# Get pooled, within, and between consistencies for ~EMP
# as necessary for EXPORT:
cluster(data = SCHLF, results = "~EMP", outcome = "EXPORT",
unit_id = "COUNTRY", cluster_id = "YEAR", necessity=TRUE)
# Get pooled, within, and between consistencies for EMP*~MA*STOCK
# as sufficient for EXPORT:
cluster(data = SCHLF, results = "EMP*~MA*STOCK", outcome = "EXPORT",
unit_id = "COUNTRY", cluster_id = "YEAR")
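For a necessity claim, the parameters of fit switch denominators. The base R sketch below computes
consistency and coverage of necessity for ~EMP with respect to EXPORT, using only the EMP and EXPORT
scores of the first four cases printed by head(SCHLF) above. The helper names are hypothetical, and
four cases are of course far too few for a substantive conclusion.

```r
# EMP and EXPORT scores of Australia_90, Austria_90, Belgium_90, Canada_90
emp    <- c(0.07, 0.70, 0.94, 0.04)
export <- c(0.19, 0.25, 0.14, 0.28)

# Fuzzy negation of EMP
not_emp <- 1 - emp

# Consistency of necessity: share of the outcome contained in the condition
cons_nec <- function(x, y) sum(pmin(x, y)) / sum(y)
# Coverage of necessity: share of the condition overlapping with the outcome
cov_nec  <- function(x, y) sum(pmin(x, y)) / sum(x)

cons_value <- cons_nec(not_emp, export)
cov_value  <- cov_nec(not_emp, export)
print(round(c(Cons.Nec = cons_value, Cov.Nec = cov_value), 3))
```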
Additional functions
Plotting sufficient terms and solutions, truth table rows, and necessity relations
Package SetMethods also includes a function pimplot() for plotting each sufficient term and the
solution formula (obtained by using the minimize() function in package QCA). The function can also
plot truth table rows against the outcome by using arguments incl.tt or ttrows as in the examples
below. Additionally, the function can plot results obtained from necessity analyses using an object
of class "sS" (obtained by using the superSubset() function in package QCA) by setting argument
necessity to TRUE. [15]
# Plot the prime implicants of the parsimonious solution:
pimplot(data = SCHF, results = sol_yp, outcome = "EXPORT")
# Plot all truth table rows with a consistency higher than 0.9:
pimplot(data=SCHF, results = sol_yi, incl.tt=0.9, outcome = "EXPORT", sol = 1)
# Plot truth table rows "60" and "61":
pimplot(data=SCHF, results = sol_yi, ttrows =c("60","61"),
outcome = "EXPORT", sol = 1)
# For plotting results of necessity analyses using superSubset,
# the first step is to obtain an "sS" object:
SUPSUB <- superSubset(SCHF, outcome="EXPORT",
conditions = c("EMP","BARGAIN","UNI","OCCUP","STOCK", "MA"),
relation = "necessity", incl.cut = 0.8)
SUPSUB
# This object can be passed to results, with necessity set to TRUE:
pimplot(data = SCHF, results = SUPSUB, outcome = "EXPORT",
necessity = TRUE)
QCAradar
Another function included in the package is the QCAradar() function, which allows visualization of QCA results or simple Boolean expressions in the form of a radar chart.16 In the argument results, the function accepts sufficient solutions obtained through the function minimize() in package QCA, or Boolean expressions involving more than three conditions, as in the second example below.
# Display radar chart for the second intermediate solution:
QCAradar(results = sol_yi, outcome = "EXPORT", fit=TRUE, sol = 2)
Figure 8a shows a radar chart for the second intermediate solution formula. The different sufficient terms overlap on the radar in different shades. For example, in the first term, emp*bargain*OCCUP, condition EMP is absent and is therefore set to 0 at its corner, condition BARGAIN is likewise absent and set to 0, and condition OCCUP is present and set to 1. Since the remaining conditions are not specified in this term, they are all left at -.
15 The plots resulting from these functions are not included in the paper for reasons of length.
16 See Maerz (2017) for an applied example of radar charts.
Figure 8: Radar Charts. (a) Radar Chart Intermediate Solution Formula 2: axes EMP, BARGAIN, UNI, OCCUP, STOCK, MA on a 0 to 1 scale; solution formula fit: Cons.Suf: 0.799, Cov.Suf: 0.705, PRI: 0.691, Cons.Suf(H): 0.716. (b) Expression A*~B*C*~D: axes A, B, C, D on a 0 to 1 scale.
# Show a radar chart for the following boolean expression "A*~B*C*~D"
QCAradar(results = "A*~B*C*~D")
Figure 8b shows a radar chart for the Boolean expression "A*~B*C*~D". Conditions A and C, which are present, are set to 1 at their respective corners, while conditions B and D, which are absent, are set to 0. No conditions are left at - in this figure, as all conditions are specified.
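The mapping from a term to the values shown on the radar can be sketched in a few lines of base R. The helper term_to_radar() is hypothetical and not part of SetMethods; it assumes a single conjunction written with * for logical AND and ~ for negation, whereas the text above also uses lower-case letters to mark absent conditions.

```r
# Hypothetical helper (not part of SetMethods): map one conjunction such as
# "A*~B*C*~D" to the value each condition takes on the radar:
# "1" = present, "0" = absent, "-" = not specified in the term
term_to_radar <- function(term, conditions) {
  literals <- trimws(strsplit(term, "*", fixed = TRUE)[[1]])
  negated  <- startsWith(literals, "~")
  cond     <- sub("^~", "", literals)
  stopifnot(all(cond %in% conditions))
  values <- setNames(rep("-", length(conditions)), conditions)
  values[cond] <- ifelse(negated, "0", "1")
  values
}

# All four conditions are specified, so no corner is left at "-"
r1 <- term_to_radar("A*~B*C*~D", c("A", "B", "C", "D"))
r1

# First term of the solution in Figure 8a, written with ~ for negation
r2 <- term_to_radar("~EMP*~BARGAIN*OCCUP",
                    c("EMP", "BARGAIN", "UNI", "OCCUP", "STOCK", "MA"))
r2
```

Returning character values keeps the unspecified state "-" distinct from a membership of 0, which is the distinction the radar chart draws.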
Indirect calibration
SetMethods also includes a function for performing the indirect calibration procedure described by Ragin (2008).17 This procedure assumes that the cases included in the analysis have interval-scale raw scores which can be initially sorted broadly into different levels of fuzzy set membership. Subsequently, the raw scores are transformed into calibrated scores using a binomial or a beta regression. Assuming that vector x contains the initial raw scores, while vector x_cal contains the rough grouping of those values into set membership scores, function indirectCalibration() can produce a vector of fuzzy-set scores a by fitting x to x_cal using a binomial regression if binom is set to TRUE.
# Generate fake data
set.seed(4)
x <- runif(20, 0, 1)
# Find quantiles
quant <- quantile(x, c(.2, .4, .5, .6, .8))
# Theoretical calibration
x_cal <- rep(NA_real_, length(x))
x_cal[x <= quant[1]] <- 0
x_cal[x > quant[1] & x <= quant[2]] <- .2
x_cal[x > quant[2] & x <= quant[3]] <- .4
x_cal[x > quant[3] & x <= quant[4]] <- .6
x_cal[x > quant[4] & x <= quant[5]] <- .8
x_cal[x > quant[5]] <- 1
x_cal
# Indirect calibration (binomial)
a <- indirectCalibration(x, x_cal, binom = TRUE)
a
17 The QCA package can also perform indirect calibration through its calibrate() function.
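The binomial variant of this procedure can be approximated in base R with a fractional logit, that is, a binomial-type generalized linear model whose response lies between 0 and 1. The sketch below is an illustration under stated assumptions: the use of findInterval() for the rough grouping, the quasibinomial family, and the rounding to three decimals are choices made here, and the result a_sketch is not guaranteed to match the output of indirectCalibration().

```r
# Same fake data and quantiles as in the example above
set.seed(4)
x <- runif(20, 0, 1)
quant <- quantile(x, c(.2, .4, .5, .6, .8))

# Rough grouping into six membership levels; left.open = TRUE gives
# intervals of the form (lower, upper], matching the comparisons above
levels_cal <- c(0, .2, .4, .6, .8, 1)
x_cal <- levels_cal[findInterval(x, quant, left.open = TRUE) + 1]

# Fractional logit: regress the rough scores on the raw scores.
# quasibinomial accepts non-integer responses in [0, 1] without warnings.
fit <- glm(x_cal ~ x, family = quasibinomial(link = "logit"))

# Fitted values serve as the calibrated fuzzy-set scores (assumed rounding)
a_sketch <- round(fitted(fit), 3)
a_sketch
```

Because the fitted values come from a logistic curve, the calibrated scores vary smoothly with the raw scores instead of jumping between the six rough levels.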
Conclusions
In this article, we have presented the main functionalities of the R package SetMethods. It is true that starting to perform QCA in R is more onerous than starting with point-and-click software. Yet, the flexibility offered by R is also its strength, especially for a young method like QCA. As set methods continue to develop, software implementations need to be updated and improved at a fast rate. Package SetMethods is designed to do precisely this: providing a tool for implementing new ideas that enhance set-theoretic analyses for applied researchers.
Acknowledgements
We thank Juraj Medzihorsky and Mario Quaranta for their input into previous versions of the SetMethods package. We also thank the participants of various ECPR Summer and Winter Schools in Methods and Techniques whose questions and testing are continuously improving the package.
Bibliography
L. Cronqvist. Tosmana: Tool for Small-n Analysis, Version 1.3.2.0 [Computer Program]. Record ID:
8880, 2011. [p1]
A. Dusa. User Manual for the QCA(GUI) Package in R. Journal of Business Research, 60(5):576–586, 2007.
[p1]
R. García-Castro and M. A. Arino. A General Approach to Panel Data Set-Theoretic Research. Journal
of Advances in Management Sciences & Information Systems, 2:63–76, 2016. [p20]
G. Goertz and J. Mahoney. A Tale of Two Cultures: Contrasting Qualitative and Quantitative Paradigms.
Princeton University Press, Princeton, N.J, 2012. [p1]
S. Maerz. The Many Faces of Authoritarian Persistence. PhD thesis, Central European University, 2017. [p24]
J. Medzihorsky, I.-E. Oana, M. Quaranta, and C. Q. Schneider. SetMethods: Functions for Set-Theoretic Multi-Method Research and Advanced QCA, R Package, Version 2.1. https://cran.r-project.org/web/packages/SetMethods/index.html, 2016. [p1]
K. S. Mikkelsen. Fuzzy-Set Case Studies. Sociological Methods & Research, 46(3):422–455, 2017. ISSN
0049-1241. URL https://doi.org/10.1177/0049124115578032. [p2]
C. C. Ragin. The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. University
of California Press, Berkeley, 1987. ISBN 0520058348. [p15,17]
C. C. Ragin. Redesigning Social Inquiry: Fuzzy Sets and Beyond. University of Chicago Press, Chicago,
2008. [p2,15]
C. C. Ragin, K. A. Drass, and S. Davey. Fuzzy-Set/Qualitative Comparative Analysis 2.0. http://www.u.arizona.edu/~cragin/fsQCA/, 2006. [p1]
B. Rihoux and B. Lobe. The Case for Qualitative Comparative Analysis (QCA): Adding Leverage for
Thick Cross-Case Comparison. In D. Byrne and C. C. Ragin, editors, Sage Handbook Of Case-Based
Methods, pages 222–242. Sage, 2009. [p2]
B. Rihoux, P. Alamos, D. Bol, A. Marx, and I. Rezsohazy. From Niche to Mainstream Method? A
Comprehensive Mapping of QCA Applications in Journal Articles from 1984 to 2011. Political
Research Quarterly, 66(1):175–184, 2013. [p1]
I. Rohlfing and C. Q. Schneider. Improving Research On Necessary Conditions: Formalized Case
Selection for Process Tracing after QCA. Political Research Quarterly, 66(1):220–235, 2013. [p2]
I. Rohlfing and C. Q. Schneider. A Unifying Framework for Causal Analysis in Set-Theoretic Multi-Method Research. Sociological Methods & Research, 47(1):37–63, 2018. URL https://doi.org/10.1177/0049124115626170. [p2]
F. Sager and E. Thomann. Multiple Streams in Member State Implementation: Politics, Problem
Construction and Policy Paths in Swiss Asylum Policy. Journal of Public Policy, 37(3):287–314, 2017.
ISSN 0143-814X. URL https://doi.org/10.1017/s0143814x1600009x. [p18]
C. Q. Schneider and S. Maerz. Legitimation, Cooptation, and Repression and the Survival of Electoral Autocracies. Zeitschrift für Vergleichende Politikwissenschaft, 11:213–235, 2017. ISSN 1865-2646. URL https://doi.org/10.1007/s12286-017-0332-2. [p18]
C. Q. Schneider and I. Rohlfing. Combining QCA and Process Tracing in Set-Theoretic Multi-Method
Research. Sociological Methods and Research, 42(4):559–597, 2013. [p2]
C. Q. Schneider and I. Rohlfing. Case Studies Nested in Fuzzy-Set QCA on Sufficiency: Formalizing
Case Selection and Causal Inference. Sociological Methods & Research, 45(3):526–568, 2016. ISSN
0049-1241. URL https://doi.org/10.1177/0049124114532446. [p2]
C. Q. Schneider and C. Wagemann. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative
Comparative Analysis. Cambridge University Press, Cambridge, 2012. [p16,17]
M. R. Schneider, C. Schulze-Bentrop, and M. Paunescu. Mapping the Institutional Capital of High-Tech Firms: A Fuzzy-Set Analysis of Capitalist Variety and Export Performance. Journal of International Business Studies, 41(2):246–266, 2010. ISSN 0047-2506. URL https://doi.org/10.1057/jibs.2009.36. [p1,7,21]
E. Thomann. Customizing Europe: Transposition as Bottom-up Implementation. Journal of European Public Policy, 22(10):1368–1387, 2015. URL https://doi.org/10.1080/13501763.2015.1008554. [p16]
Ioana-Elena Oana
Central European University (CEU)
Nador utca 9, 1051 Budapest
Hungary
Oana_Ioana-Elena@phd.ceu.edu
Carsten Q. Schneider
Central European University (CEU)
Nador utca 9, 1051 Budapest
Hungary
schneiderc@ceu.edu