All content in this area was uploaded by Fan Li on Jul 17, 2015

Do debit cards decrease cash demand? Evidence from a causal analysis using Principal Stratification

Andrea Mercatanti 1, Fan Li 2

ABSTRACT

It has been argued that innovation in transaction technology may modify the cash holding behaviour of agents, as debit card holders may either withdraw cash from ATMs or purchase items using POS devices at retailers. In this paper, within the Rubin Causal Model, we investigate the causal effects of the use of debit cards on the cash inventories held by households using data from the Italian Survey on Household Income and Wealth (SHIW). We adopt the principal stratification approach to account for the share of debit card holders who do not use this payment instrument. We use a regression model with the propensity score as the single predictor to adjust for the imbalance in observed covariates. We further develop a sensitivity analysis approach to assess the sensitivity of the proposed model to violations of the key unconfoundedness assumption. Our empirical results suggest statistically significant negative effects of debit cards on the household cash level in Italy.

KEY WORDS: causal inference, potential outcomes, principal stratification, propensity score, sensitivity, unconfoundedness.

1 Economic and Financial Statistics Department, Bank of Italy, Rome, Italy. Email: mercatan@libero.it

2 Department of Statistical Science, Duke University, Durham, NC, USA. Email: fli@stat.duke.edu

Mercatanti’s research was partially supported by the U.S. National Science Foundation (NSF) under Grant DMS-1127914 to the Statistical and Applied Mathematical Sciences Institute (SAMSI). Li’s research was partially funded by NSF-SES grants 1155697 and 1424688. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Bank of Italy, SAMSI or NSF.

1 Introduction

Since the early 1970s, the diffusion of payment cards, such as debit, credit, and pre-paid cards, has raised the question of whether our societies are turning into cashless societies. Cash has not disappeared in the meantime; however, interest in the question remains, particularly with regard to the wide diffusion of debit cards. Debit cards are defined as cards enabling the holder to have purchases directly charged to funds on his account at a deposit-banking institution (C.P.S.S., 2001). The fact that debit cards allow withdrawals at ATMs reduces the need for cash holdings; at the same time, the possibility of paying directly at POS devices at retailers makes this payment instrument a close substitute for cash. Moreover, both operations, withdrawal and payment, involve very low, and often zero, costs for debit cards compared with other noncash instruments.

The possible effect of the use of debit cards on cash holdings also has important implications for central banks as the sole issuers of cash. First, substitution of cards for cash could lead to a decrease in seigniorage income. Because banknotes represent non-interest-bearing central bank liabilities, early concerns emerged among policymakers that the shrinkage of cash would lead to a decline in central bank asset holdings and, consequently, in seigniorage revenue (Stix, 2004). Second, in low interest rate regimes, the cash-card substitution can be very sensitive to the interest rate, and consequently it may be difficult to use interest rate adjustments as a monetary policy tool to influence bank lending (Markose and Loke, 2003; Yilmazkuday and Yazgan, 2009). The existing literature has focused on the impact that payment cards may have on the functional relationship between cash demand and other variables that central banks use to implement monetary policy (Duca and Whitesell, 1995; Attanasio et al., 2002; Lippi and Secchi, 2009; Alvarez and Lippi, 2009), while relatively little attention has been directed to quantifying the effects of debit cards on the level of cash inventories held by individuals or households.

This paper aims to evaluate the causal effect of debit cards on cash inventories using data from the Survey on Household Income and Wealth (SHIW), a biennial national survey run by the Bank of Italy on several aspects of Italian households' economic and financial behaviour. We adopt the Rubin Causal Model (RCM) (Rubin, 1974, 1978) to conduct the causal analysis. Under the RCM, for each post-treatment variable, each unit has a potential outcome corresponding to each treatment level, and a causal effect on that post-treatment variable is defined as a comparison between the corresponding potential outcomes on a common set of units. A critical requirement of the RCM is that a “cause” (used here interchangeably with “treatment”) must be, at least conceptually, manipulable: the principle of “no causation without manipulation” (Holland, 1986). This raises a conceptual challenge in our application because using debit cards is largely a voluntary activity, and it is not clear that we could expose a person to the use of debit cards in any verifiable sense. Instead, it is more natural to conceive of the possession of cards, a status controlled by the banks that issue cards, as the treatment variable. However, the primary research interest obviously lies in the effect of using cards rather than possessing cards. In fact, a significant portion of Italian households in the SHIW who have debit cards do not use them. Moreover, even among card users, there are various degrees of usage: some use debit cards only to withdraw cash from ATMs occasionally, while others use them frequently for both cash withdrawals and payments at retailers. The issue is not limited to Italy; for example, data from a survey conducted in Austria in 2003 show substantial shares of non-users and different frequencies of use among debit card holders (Stix, 2004). Naturally, there may be heterogeneous effects of debit cards on cash holding among different groups of card users.

We propose to address these complications via principal stratification (Frangakis and Rubin, 2002), a unified framework for causal inference in the presence of post-treatment variables. The key is to treat the possession of debit cards as the treatment and the use of cards as a post-treatment intermediate variable between the treatment and the outcome. Principal stratification is a cross-classification of the population based on the joint potential values of an intermediate variable under each treatment level; the resulting classes are called principal strata, and the interest lies in estimating causal effects local to certain principal strata. For example, the focus in our application is the causal effect in the stratum of units who possess and use debit cards, the “compliers”. This is similar to the instrumental variable approach to noncompliance in randomized experiments (Angrist et al., 1996; Imbens and Rubin, 1997; Hirano et al., 2000), a special case of principal stratification. Recently, a rapidly growing literature has extended principal stratification to a wide range of settings in both experimental and observational studies, including “censoring by death” (Rubin, 2006; Mattei and Mealli, 2007; Zhang et al., 2009), missing data (Mattei et al., 2014), surrogate endpoints (Gilbert et al., 2003; Li et al., 2009, 2011), mediation analysis (Gallop et al., 2009; Elliott et al., 2010), and designs (Mattei and Mealli, 2011). More complex settings such as ordinal or continuous intermediate variables have also been explored (Frangakis et al., 2004; Jin and Rubin, 2008; Griffin et al., 2008; Schwartz et al., 2011).

Because only one of the two potential outcomes is observed for each unit, principal strata are latent, and thus identification of principal causal effects relies on assumptions such as unconfoundedness and exclusion restrictions. Unconfoundedness is particularly crucial in our study given the observational nature of the SHIW. We tackle this issue in two steps. First, we propose a model-based regression approach where, to reduce the risk of mis-specification due to the imbalance in the multiple covariates between treatment groups, we use the estimated propensity score (Rosenbaum and Rubin, 1983b) as the sole predictor. Then, we design and conduct a comprehensive sensitivity analysis on unconfoundedness: we encode the degree of violation of unconfoundedness as sensitivity parameters in the assumed regression model and examine how the estimates change over a plausible range of the sensitivity parameters. This vein of sensitivity analysis in causal inference originates from Rosenbaum and Rubin (1983a). In particular, our method builds upon that of Schwartz et al. (2012), who elaborated the various pathways of confounding and examined the sensitivity to both unconfoundedness and exclusion restrictions in the context of principal stratification. Alternative sensitivity analysis approaches have also been developed in the literature (e.g. Grilli and Mealli, 2008; Sjölander et al., 2009; Jo and Vinokur, 2011; Stuart and Jo, 2013; Gilbert et al., 2003).

The rest of the article is organized as follows. In Section 2, we introduce the basic setup,

estimand and assumptions under principal stratiﬁcation. In Section 3, we propose a model-

based approach for estimation and a sensitivity analysis under the model. Section 4 presents

the data and the empirical results. Section 5 concludes.

2 Basic setup under principal stratiﬁcation

2.1 Notions and estimands

Because debit cards are typically issued to individuals, the natural statistical units in our analysis would be individuals possessing debit cards. However, the SHIW collects information only at the household level. To mitigate the problem, we set the household as the unit, but limit the sample of treated units to the households possessing one and only one debit card during the study period. This ensures that a possible effect on household cash holding is due only to the individual possessing the card, who is usually the head of the household.

Consider a study sample consisting of $N$ units. For unit $i$, let $Z_i$ be the binary treatment, equal to 1 if the household possesses one and only one debit card and 0 otherwise; $D_i$ be the binary post-treatment variable, equal to 1 if the household uses a debit card and 0 otherwise (see footnote 3); $Y_i$ be the outcome, the average amount of cash held by the household; and $X_i$ be the set of pre-treatment covariates. Because utilization of debit cards is a post-treatment event, we can define its corresponding potential outcomes: let $D_i(1)$ and $D_i(0)$ be the potential debit card usage status if unit $i$ does and does not have a card, respectively, equal to 1 if the unit uses the card and 0 otherwise. Similarly, let $Y_i(1)$ and $Y_i(0)$ be the potential outcomes if unit $i$ does and does not possess a debit card, respectively. This notation implies the acceptance of the Stable Unit Treatment Value Assumption (SUTVA; Rubin, 1980), that is, no interference between the units and no different versions of a treatment. SUTVA is deemed reasonable in this study, because the holding of debit cards in one household does not seem to affect the potential debit card utilization or cash inventory of other households. For each unit $i$, we only observe the potential outcomes corresponding to the observed treatment: $D_i = D_i(Z_i)$, $Y_i = Y_i(Z_i)$.

3 We will define two different characterizations of the use of debit cards in Section 4.1.

A principal stratification with respect to a post-treatment intermediate variable $D$ is a partition of units based on the joint potential values of $D$, i.e., principal strata: $S_i = (D_i(0), D_i(1))$. When both $Z$ and $D$ are binary, there are four principal strata in theory: $S_i \in \{(0,0), (0,1), (1,0), (1,1)\}$. In this application, obviously one cannot use debit cards without possessing one, and there are also units who possess cards but do not use them. As such, our study sample automatically satisfies a strong ‘monotonicity’ condition:

Assumption 1 (Monotonicity). (1) $D_i(0) = 0$; (2) $0 < \Pr(D_i = 0 \mid X_i, Z_i = 1) < 1$, for all $i$.

Under monotonicity, there are only two principal strata: $S_i = (0,0) = n$, units who would not use debit cards irrespective of whether they possess one, and $S_i = (0,1) = c$, units who would use debit cards if they possessed one but would not otherwise. We refer to these two strata as never-users and compliers, respectively, following the nomenclature of noncompliance in Angrist et al. (1996).
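The mapping from observed data to principal strata under Assumption 1 can be sketched in a few lines of Python. This is only an illustration: the function name `observed_stratum` and the toy records are hypothetical and not part of the paper or of the SHIW data.

```python
from collections import Counter

def observed_stratum(z: int, d: int) -> str:
    """Return what the observed pair (Z, D) reveals about the principal stratum."""
    if z == 1:
        # Card holders reveal their stratum through the observed usage D = D(1).
        return "complier" if d == 1 else "never-user"
    if d == 1:
        # D(0) = 0 for every unit under monotonicity, so this cell must be empty.
        raise ValueError("Z=0 with D=1 violates monotonicity")
    # Non-holders: D(1) is never observed, so the stratum stays latent.
    return "latent (complier or never-user)"

# Hypothetical toy records (z, d), for illustration only.
toy = [(1, 1), (1, 0), (0, 0), (0, 0), (1, 1)]
counts = Counter(observed_stratum(z, d) for z, d in toy)
print(counts)
```

The sketch makes the source of the inferential problem explicit: strata are observed in the treated arm but remain a two-component mixture in the untreated arm.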

By definition, the principal stratum membership $S_i$ is not affected by the treatment assignment. Therefore, comparisons of $Y(1)$ and $Y(0)$ within a principal stratum are well-defined causal effects because they compare quantities defined on a common set of units. Here our interest lies in estimating the causal effects for the treated compliers, that is, units possessing and using debit cards. Thus we define the targeted estimand to be the average causal effect for the treated compliers (CATT):

$$\mathrm{CATT} \equiv E[Y_i(1) - Y_i(0) \mid S_i = c, Z_i = 1] = E_{x \mid Z=1}\{E[Y_i(1) - Y_i(0) \mid S_i = c, Z_i = 1, X_i = x]\}. \qquad (1)$$

Analogously, we can define the complier average treatment effect (CATE), also known as the local average treatment effect (LATE; Imbens and Angrist, 1994):

$$\mathrm{CATE} \equiv E[Y_i(1) - Y_i(0) \mid S_i = c] = E_{x}\{E[Y_i(1) - Y_i(0) \mid S_i = c, X_i = x]\}. \qquad (2)$$

Both CATE and CATT are intention-to-treat (ITT) effects, representing effects of possessing

debit cards, rather than effects of using cards. To attribute these effects to the use of cards,

we make the following exclusion restriction assumption for the compliers, following the es-

tablished literature in the IV approach to noncompliance (e.g. Angrist et al., 1996; Imbens and

Rubin, 1997):

Assumption 2 (Exclusion Restriction for compliers). For all units with $S_i = (0,1)$, the effect of card possession is only through using the card.

A formalized version of this assumption, which requires double-indexed notation, is given in Imbens and Rubin (2015, Assumption 23.4). This type of exclusion restriction is in fact routinely made, often implicitly, in randomized experiments with full compliance.

2.2 Identiﬁcation of the causal effects

For the same unit, only one of the two potential outcomes $(D_i(0), D_i(1))$ is observed, and thus the principal stratum $S_i$ is at most partially observed. The following assumptions are necessary for establishing the nonparametric identifiability of CATT and CATE (Imbens and Angrist, 1994).

Assumption 3 (Overlap). $0 < \Pr(Z_i = 1 \mid X_i) < 1$, for all $i$.

Assumption 4 (Unconfoundedness). $\{Y_i(1), Y_i(0), D_i(1), D_i(0)\} \perp Z_i \mid X_i$.

Assumption 5 (Exclusion Restriction for never-takers). $E[Y_i(1) \mid X_i, S_i = n] = E[Y_i(0) \mid X_i, S_i = n]$.

Assumption 3 is the standard overlap condition, stating that each unit has a probability strictly between 0 and 1 of possessing a card. Assumption 4 states that the treatment assignment is independent of all the potential outcomes conditional on the observed pre-treatment variables; it is also known as the assumption of “no unmeasured confounders”. Assumption 5 states that possessing debit cards does not affect the outcome for never-takers (the never-users of Section 2.1). Though both are exclusion restrictions, Assumption 5 and Assumption 2 are of a very different nature: the former is a necessary condition for identifying the causal effects, whereas the latter is made solely for interpreting the ITT effects as effects of the actual treatment received. More discussion of the difference can be found in Mealli and Pacini (2013) and Imbens and Rubin (2015, Chapter 23).
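To see how these assumptions deliver identification, the following sketch works through the standard argument conditional on $X_i = x$. It is a reconstruction of the usual instrumental-variable reasoning under Assumptions 1, 3, 4 and 5, not a derivation taken from this paper; $\pi_c(x)$ and $\pi_n(x)$ are shorthand introduced here for $\Pr(S_i = c \mid X_i = x)$ and $\Pr(S_i = n \mid X_i = x)$.

```latex
\begin{align*}
E[Y_i \mid Z_i = 1, X_i = x] - E[Y_i \mid Z_i = 0, X_i = x]
 &= E[Y_i(1) - Y_i(0) \mid X_i = x]
    && \text{(Assumptions 3, 4)} \\
 &= \pi_c(x)\, E[Y_i(1) - Y_i(0) \mid S_i = c, X_i = x] \\
 &\quad + \pi_n(x)\, E[Y_i(1) - Y_i(0) \mid S_i = n, X_i = x]
    && \text{(two strata, Assumption 1)} \\
 &= \pi_c(x)\, E[Y_i(1) - Y_i(0) \mid S_i = c, X_i = x]
    && \text{(Assumption 5)},
\end{align*}
\[
\pi_c(x) = \Pr(D_i(1) = 1 \mid X_i = x) = \Pr(D_i = 1 \mid Z_i = 1, X_i = x)
\quad \text{(Assumptions 1, 4)}.
\]
```

Dividing the first display by $\pi_c(x)$, which is positive by part (2) of Assumption 1, expresses the complier effect at each $x$ in terms of quantities that can be estimated from the observed data.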

In randomized experiments, Assumptions 3-5 automatically hold without conditioning on

covariates, and CATT equals CATE. In observational studies, unconfoundedness generally

relies on conditioning on a number of observed covariates, and CATT and CATE are usually

different. Consequently analysts often adopt regression models to adjust for the covariates to

estimate the causal effects. However, two complications often arise, as in this application: ﬁrst,

distributions of some covariates can be signiﬁcantly imbalanced between treatment groups,

leading a regression analysis to rely heavily on model speciﬁcation (Imbens, 2004); second,

unconfoundedness may still be questionable even conditioning on a large number of observed

covariates. To address the ﬁrst issue, we propose to combine the propensity score method and

regression adjustment. To address the second issue, we conduct a comprehensive sensitivity

analysis around the regression models.

3 Model-based estimation and sensitivity analysis

3.1 Models

In the context of principal stratification, six quantities are associated with each unit: $Y_i(0)$, $Y_i(1)$, $D_i(0)$, $D_i(1)$, $Z_i$, $X_i$. Under unconfoundedness, the joint distribution of these quantities can be written as:

$$\begin{aligned}
&\Pr(Y_i(0), Y_i(1), D_i(0), D_i(1), Z_i, X_i) \\
&\quad = \Pr(Y_i(0), Y_i(1) \mid S_i, X_i)\, \Pr(S_i \mid X_i)\, \Pr(Z_i \mid X_i)\, \Pr(X_i) \\
&\quad = \Pr(Y_i(0), Y_i(1) \mid S_i, e(X_i))\, \Pr(S_i \mid e(X_i))\, e(X_i)\, \Pr(X_i),
\end{aligned} \qquad (3)$$

where $e(X_i) = \Pr(Z_i = 1 \mid X_i)$ is the propensity score. In the analysis, we will condition on the observed distribution of covariates, so that $\Pr(X_i)$ does not need to be modeled. Therefore, three models are required for inference: one for the propensity score, one for the principal strata given the propensity score, and one for the potential outcomes given the principal stratum and the propensity score.

Compared to the nonparametric approach, the model-based approach is more flexible, can reduce bias and improve precision, and also offers conceptually straightforward ways to incorporate complexities like multilevel structure, multiple outcomes, and latent variables. However, parametric models on a large number of pre-treatment variables are also more sensitive to mis-specification (Rubin, 1979). In particular, imbalance in the pre-treatment variables between treatment groups or between different strata makes causal effect estimation rely heavily on extrapolation and, consequently, on the functional specification. Because the propensity score reduces the dimension of the covariate space to one, and balance of the propensity score leads to balance of each observed covariate, an attractive alternative is to use the estimated propensity score as the sole covariate (e.g. Heckman et al., 1998). This method is not as efficient as the regression estimator based on adjustment for all covariates when the model is correctly specified (Hahn, 1998), and one has to estimate the propensity score, which is also subject to mis-specification. However, simulation studies in Mercatanti and Li (2014) show that, in the presence of covariate imbalance between treatment groups, mis-specification of the regression model leads to much larger biases and mean squared errors (MSE) than mis-specification of the propensity score. Moreover, in a recent simulation-based study, Hade and Lu (2014) suggest that if the distributions of the estimated propensity score in the treated and untreated groups have different shapes but roughly the same support, as in our application, then regression on the estimated propensity score performs well compared to the conventional regression model on the entire set of covariates and to other propensity score based methods (e.g. matching, stratification and weighting). Therefore, given the relatively large number of covariates in our application, we choose the regression on propensity score approach.
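As a rough illustration of the first of the three models, the sketch below estimates a propensity score with a main-effects logistic regression fitted by plain gradient ascent. Everything in it is an assumption made for illustration: the function names, the simulated covariates, the step size and the number of iterations are hypothetical, and the paper does not state which fitting routine was used.

```python
import math
import random

def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))

def linear_index(beta, xi):
    # beta[0] is the intercept, beta[1:] are the covariate coefficients.
    return beta[0] + sum(b * x for b, x in zip(beta[1:], xi))

def fit_logistic(X, z, steps=300, lr=1.0):
    """Fit Pr(Z=1 | X) = sigmoid(b0 + X.b) by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            resid = zi - sigmoid(linear_index(beta, xi))
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    return [sigmoid(linear_index(beta, xi)) for xi in X]

# Simulated toy covariates and treatment, not SHIW data.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]
z = [1 if random.random() < sigmoid(-1.0 + 0.8 * x1 - 0.5 * x2) else 0 for x1, x2 in X]
beta_hat = fit_logistic(X, z)
e_hat = propensity_scores(X, beta_hat)
```

The resulting `e_hat` would then play the role of $e(x)$ as the single regressor in the strata and outcome models.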

Since there are only two strata in our application, we use a logistic regression model for the principal stratum membership $S$:

$$\mathrm{logit}(\Pr(S_i = n \mid X_i = x)) = \alpha_0 + e(x) \cdot \alpha. \qquad (4)$$

We assume a linear regression model for continuous potential outcomes, with different intercepts and slopes for different strata:

$$\Pr(Y_i(z) \mid S_i, X_i = x) = \mathbf{1}_{S_i = c} \cdot (\beta_{c0} + z \cdot \theta_c + e(x) \cdot \beta_{c1}) + \mathbf{1}_{S_i = n} \cdot (\beta_{n0} + e(x) \cdot \beta_{n1}) + \epsilon_i, \qquad (5)$$

where $\epsilon_i \sim N(0, \sigma^2)$ and $\mathbf{1}_{S_i = s}$ is an indicator function that equals one if $S_i = s$ and zero otherwise. It is easy to show that $\theta_c$ is the CATE, and the CATT can be subsequently estimated by averaging the differences between the observed outcomes for treated compliers and their estimated counterfactuals:

$$\widehat{\mathrm{CATT}} = \frac{\sum_i D_i \cdot Z_i \cdot [Y_i - (\hat{\beta}_{c0} + e(X_i) \cdot \hat{\beta}_{c1})]}{\sum_i D_i \cdot Z_i}.$$
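The CATT estimator above amounts to a simple average over the treated compliers. The following minimal sketch shows the computation; the function name `catt_hat` and all numbers are hypothetical, and the coefficient values are taken as given rather than estimated.

```python
def catt_hat(y, z, d, e, beta_c0, beta_c1):
    """Mean of Y_i minus the fitted control-arm complier mean, over units with Z_i = D_i = 1."""
    diffs = [
        yi - (beta_c0 + ei * beta_c1)
        for yi, zi, di, ei in zip(y, z, d, e)
        if zi == 1 and di == 1
    ]
    if not diffs:
        raise ValueError("no treated compliers in the sample")
    return sum(diffs) / len(diffs)

# Hypothetical toy inputs, for illustration only.
y = [5.0, 7.0, 9.0, 4.0]   # observed outcomes
z = [1, 1, 0, 1]           # card possession
d = [1, 0, 0, 1]           # card usage
e = [0.2, 0.5, 0.3, 0.4]   # estimated propensity scores
estimate = catt_hat(y, z, d, e, beta_c0=6.0, beta_c1=2.0)
print(estimate)
```

Only the first and last toy units enter the average, since they are the only ones that both possess and use a card.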

The maximum likelihood (ML) estimates of the parameters are obtained using an EM (expectation-

maximization) algorithm. In the E-step the unobserved principal strata are replaced by their

expectations given the data and the current estimates of the parameters; then in the M-step,

the likelihood conditional on the expected principal strata is maximized. Standard errors are

obtained by the outer product of gradients evaluated at the ML estimate for the parameters in

(4) and (5), and by the bootstrap for the CATT.
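The E- and M-steps just described can be illustrated on a deliberately simplified version of the model. The sketch below is not the authors' implementation: it drops the propensity score regressors (so the strata proportion and the stratum means are constants), assumes a common residual variance, and imposes Assumption 5 by giving never-users a single mean in both arms. All names and the simulated data are hypothetical.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def em_two_strata(y, z, d, iters=50):
    """EM for a covariate-free two-strata model; returns (pi_n, mu_n, mu_c0, mu_c1, sigma)."""
    n = len(y)
    treated_c = [yi for yi, zi, di in zip(y, z, d) if zi == 1 and di == 1]
    treated_n = [yi for yi, zi, di in zip(y, z, d) if zi == 1 and di == 0]
    # Treated compliers are observed, so their mean is estimated directly.
    mu_c1 = sum(treated_c) / len(treated_c)
    # Starting values taken from the treated arm, where strata are observed.
    mu_n = sum(treated_n) / len(treated_n)
    mu_c0 = mu_c1
    pi_n = len(treated_n) / (len(treated_n) + len(treated_c))
    mean_y = sum(y) / n
    sigma = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    for _ in range(iters):
        # E-step: w[i] = Pr(S_i = n | data, current parameters).
        w = []
        for yi, zi, di in zip(y, z, d):
            if zi == 1:
                w.append(1.0 - di)  # stratum is known for card holders
            else:
                a = pi_n * normal_pdf(yi, mu_n, sigma)
                b = (1.0 - pi_n) * normal_pdf(yi, mu_c0, sigma)
                w.append(a / (a + b))
        # M-step: weighted maximum likelihood updates.
        pi_n = sum(w) / n
        mu_n = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        wc0 = [(1.0 - wi) if zi == 0 else 0.0 for wi, zi in zip(w, z)]
        mu_c0 = sum(wi * yi for wi, yi in zip(wc0, y)) / sum(wc0)
        ss = 0.0
        for yi, zi, di, wi in zip(y, z, d, w):
            if zi == 1 and di == 1:
                ss += (yi - mu_c1) ** 2
            elif zi == 1:
                ss += (yi - mu_n) ** 2
            else:
                ss += wi * (yi - mu_n) ** 2 + (1.0 - wi) * (yi - mu_c0) ** 2
        sigma = math.sqrt(ss / n)
    return pi_n, mu_n, mu_c0, mu_c1, sigma

# Simulated toy data with a known complier effect of -2 (not SHIW data).
random.seed(1)
y, z, d = [], [], []
for _ in range(4000):
    never = random.random() < 0.4
    zi = 1 if random.random() < 0.5 else 0
    di = 1 if (zi == 1 and not never) else 0
    mean = 10.0 if never else 4.0 - 2.0 * zi
    y.append(random.gauss(mean, 1.0))
    z.append(zi)
    d.append(di)

pi_n, mu_n, mu_c0, mu_c1, sigma = em_two_strata(y, z, d)
theta_c = mu_c1 - mu_c0  # complier effect in this simplified model
```

In the full model each of these constants would be replaced by a function of $e(x)$, as in (4) and (5), and the M-step would become a weighted logistic and a weighted linear regression.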

3.2 Sensitivity analysis

Our sensitivity analysis is in the same spirit as Rosenbaum and Rubin (1983a), where the assignment to treatment is assumed to be unconfounded given the observed covariates $X$ and an unobserved covariate $U$, but is confounded given only $X$. Rosenbaum and Rubin suggest specifying a set of parameters characterizing the distribution of $U$ and the association of $U$ with $Z$ and $Y(z)$ given the observed covariates. Assuming a parametric model, the full likelihood for $Z, Y(0), Y(1), U$ given $X$ is derived and maximized, fixing the sensitivity parameters at a range of known values, and the results are compared. In order to incorporate the additional complexity of the intermediate variable in our setting, we adopt a simpler setup similar to that in Schwartz et al. (2012). In particular, we do not directly model the distributions involving $U$; instead, we model the consequences of an unmeasured confounder. Specifically, in the presence of $U$, Equation (3) no longer holds; instead we have

$$\Pr(Y_i(0), Y_i(1), S_i \mid Z_i, X_i, U_i) = \Pr(Y_i(0), Y_i(1) \mid Z_i, S_i, X_i, U_i)\, \Pr(S_i \mid Z_i, X_i, U_i).$$

Therefore, even conditional on the observed covariates, the proportion of principal strata and the distribution of the potential outcomes within a principal stratum can differ across treatment groups. These two channels of confounding are referred to as S-confounding and Y-confounding in Schwartz et al. (2012), and are encoded in the following two models modified from (4) and (5), respectively.

The principal strata model (4) is expanded to account for the S-confounding:

$$\mathrm{logit}(\Pr(S_i = n \mid Z_i = z, X_i = x)) = \alpha_0 + e(x) \cdot \alpha + \xi \cdot z, \qquad (6)$$

where $\xi$ is the log odds ratio of being a never-taker among card-holders versus among non card-holders conditional on covariates:

$$\exp(\xi) = \frac{\Pr(S_i = n \mid Z_i = 1, X_i) / \Pr(S_i = c \mid Z_i = 1, X_i)}{\Pr(S_i = n \mid Z_i = 0, X_i) / \Pr(S_i = c \mid Z_i = 0, X_i)}.$$

The parameter $\xi$ is estimable from the observed data, but nevertheless can also be viewed as a sensitivity parameter: when the unconfoundedness assumption holds, the odds ratio should be 1 and $\xi = 0$; consequently, a large absolute value of the estimated $\xi$ suggests severe imbalance in the proportions of principal strata between treatment groups and thus large S-confounding. Here $\xi$ is constrained to be the same across different values of $X$. We have also fitted the model with an interaction term between $X$ and $Z$ but observed little difference in our application.
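The interpretation of $\xi$ as a log odds ratio can be checked numerically. The sketch below evaluates the logistic form of model (6) at hypothetical parameter values (chosen only for illustration, not estimates from the paper) and confirms that the never-user odds ratio between holders and non-holders equals $\exp(\xi)$.

```python
import math

def prob_never_user(e_x, z, alpha0, alpha, xi):
    """Pr(S_i = n | Z_i = z, e(x)) under the logistic form of model (6)."""
    return 1.0 / (1.0 + math.exp(-(alpha0 + alpha * e_x + xi * z)))

def odds(p):
    return p / (1.0 - p)

# Hypothetical parameter values, for illustration only.
alpha0, alpha, xi = 0.5, -1.0, -0.7
p1 = prob_never_user(0.3, 1, alpha0, alpha, xi)  # card holders
p0 = prob_never_user(0.3, 0, alpha0, alpha, xi)  # non-holders
odds_ratio = odds(p1) / odds(p0)
print(odds_ratio, math.exp(xi))
```

Because $\xi$ enters additively on the logit scale, the same ratio is obtained at every value of $e(x)$; a negative $\xi$ gives `p1 < p0`.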

For the potential outcomes models, in the presence of unmeasured confounding, it is important to differentiate the assignment $z$ in the definition of potential outcomes from the observed assignment $Z$, and we make the distinction using different subscripts $z_1$ and $z_2$. Specifically, we expand model (5) by adding two sensitivity parameters as follows:

$$\Pr(Y_i(z_1) \mid S_i, Z_i = z_2, X_i = x) = \mathbf{1}_{S_i = c} \cdot (\beta_{c0} + z_1 \theta_c + z_2 \eta_c + e(x) \cdot \beta_{c1}) + \mathbf{1}_{S_i = n} \cdot (\beta_{n0} + z_2 \eta_n + e(x) \cdot \beta_{n1}) + \epsilon_i. \qquad (7)$$

It is straightforward to show that $\eta_c$ and $\eta_n$ account for the Y-confounding for compliers and never-takers, respectively:

$$\eta_s = \Pr(Y_i(z_1) \mid Z_i = 1, S_i = s, X_i) - \Pr(Y_i(z_1) \mid Z_i = 0, S_i = s, X_i),$$

for $s = n, c$. For parsimony, the model assumes that $\eta_c$ and $\eta_n$ are constant across $x$ and $z$. We have also conducted the analysis with interaction terms between $X$ and $Z$, and the results are similar. In our application, $\eta_n$ (or $\eta_c$) is the difference in cash holding between a never-user (or complier) who has a debit card and a never-user (or complier) who does not have a card. When the unconfoundedness assumption holds, $\eta_c = \eta_n = 0$, and model (7) reduces to model (5).

The sensitivity analysis is carried out by comparing the ML estimates of the CATT while fixing the sensitivity parameters at a range of plausible values. Among the three parameters $(\xi, \eta_c, \eta_n)$, $\xi$ is estimable from the data and we fix it at the estimated value. Since for each unit only the potential outcome corresponding to the observed treatment is observed, $z_1 = z_2$ for all observed units in Model (7), and thus only the sum $\theta_c + \eta_c$, but not each individual parameter, is identifiable. For example, one cannot differentiate between the two sets of parameters $(\theta_c, \eta_c)$ and $(\theta_c + v, \eta_c - v)$ for any $v$. Therefore, in the sensitivity analysis, we will vary the values of $\eta_n$ while fixing $\eta_c$ at 0 for convenience, and explore the possible range of $v$ in the interpretation.
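The non-identifiability of $\theta_c$ and $\eta_c$ separately can be verified directly from the complier part of model (7). In the sketch below, all parameter values are hypothetical and serve only to show that shifting $v$ from $\eta_c$ to $\theta_c$ leaves every observable complier mean unchanged.

```python
def observed_complier_mean(z, e_x, beta_c0, beta_c1, theta_c, eta_c):
    """Complier mean in model (7) for an observed unit, where z1 = z2 = z."""
    return beta_c0 + z * theta_c + z * eta_c + e_x * beta_c1

# Hypothetical values, for illustration only.
base = dict(beta_c0=800.0, beta_c1=-150.0, theta_c=-200.0, eta_c=0.0)
v = 75.0
shifted = dict(base, theta_c=base["theta_c"] + v, eta_c=base["eta_c"] - v)

grid = [(z, e_x) for z in (0, 1) for e_x in (0.1, 0.5, 0.9)]
same = all(
    abs(observed_complier_mean(z, e_x, **base)
        - observed_complier_mean(z, e_x, **shifted)) < 1e-9
    for z, e_x in grid
)
print(same)
```

Since the two parameter sets imply identical distributions for the observed data, the likelihood cannot separate them, which is why $\eta_c$ is fixed and $v$ is handled at the interpretation stage.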

4 Application to the Italian SHIW data

4.1 The Data

The SHIW has been run every two years since 1965, with the only exception being that the 1997 survey was delayed to 1998. We denote by $t$ the generic survey year, and by $(t+1)$ the subsequent survey year. We define the target population as the set of households having at least one bank current account but neither debit cards nor credit cards at $t$. The treatment $Z$ is set equal to 1 if, at $t+1$, the household (all members combined) possesses one and only one debit card and no credit cards, and equal to 0 if, at $t+1$, the household possesses neither debit cards nor credit cards. Households with more than one debit card, as well as those possessing at least one credit card, are excluded from the sample. The reason for excluding the households holding credit cards is that these households usually already possess at least one debit card (Mercatanti, 2008), and thus their inclusion would lead to imbalance in credit card holdings between treated and untreated households. Given that the credit card is a payment instrument potentially affecting cash demand, this would overestimate the effect of debit cards. Therefore, a household for which $Z = 1$ is characterized by having acquired its first (and only) debit card during the span $t \to (t+1)$. The two binary post-treatment variables $D$ identify the way in which the debit card is actually used: $D$ is set equal to 1 if the debit card is used to make ATM withdrawals at least once per month on average (withdrawers), or if the debit card is used to make payments at POS devices at least once per month on average (POS users). For the post-treatment variable withdrawers, the relevant survey question asks the households how many withdrawals were made on average per month at ATM points during the survey year. For POS users, the relevant survey question asks the number of times, on average per month, the debit card was used directly at supermarkets or shops to make payments by means of POS terminals. We conduct a separate analysis with each of the two $D$ variables. The outcome $Y$ is the average cash inventory held by the household and is observed at $t+1$. The relevant survey question asks for the sum of cash the household usually has in the house to meet normal household needs.

The covariates $X$ include the lagged outcome and some background demographic and social variables referring either to the household or to the head of the household. The subset of covariates referring to the household includes the overall household income, wealth, the monthly average spending of the household on all consumer goods, and the following categorical variables: the number of earners, the average age of the household, family size, the Italian geographical macro-area where the household lives, and the number of inhabitants of the town where the household lives. Those related to the head of the household include age and education, both of which are categorical. As shown in Mercatanti and Li (2014), the probability of a household having one debit card increases with income, town size, and education of the head of the household, and from the South to the North of Italy, while it decreases with the average age of the household.

Table 1 shows, for the years 1995, 1998 and 2000, a non-negligible share of debit card holders who rarely use the card to withdraw cash from ATMs or to pay for purchases at POS devices at retailers. The share of non-users is smaller for withdrawals at ATMs than for POS payments.

Table 1: Per cent of households with bank account, possessing one debit card and no credit cards, by debit card usage.

Year   Sample size   Less than one ATM withdrawal   Less than one POS payment
                     per month on average (%)       per month on average (%)
1995   1727          23.2                           87.2
1998   1645          16.8                           68.7
2000   1857          19.3                           59.2

A simple descriptive cross-sectional analysis on the subsample of households observed in a single sweep of the survey shows that the difference in average cash inventory between households possessing one debit card and households without a debit card is -121.6, -118.1, and -169.1 thousand Italian lire in 1995, 1998, and 2000, respectively. Though not sufficient to establish causal effects of debit cards on cash holding, this shows that consumers in Italy who possess debit cards hold less cash on average than those who do not.

The SHIW is a repeated cross-section with a panel component, that is, only a part of the sample comprises households that were interviewed in previous surveys. Our analysis will focus on the households observed in two consecutive surveys. Table 2 reports the sample sizes for each span $t \to (t+1)$, where $t = 1993, 1995, 1998$, separately for treated and untreated units. Both the relative frequency of untreated units and the total sample size drop considerably after 2000. Accordingly, the analysis will be limited to the spans up to 1998-00, the latest span with a considerable share of untreated units and a considerable total sample size.

Table 2: Sample sizes and relative frequencies of treated (Z_i = 1) and untreated (Z_i = 0) units for each span.

t → (t+1)   Z_i = 1              Z_i = 0              Total
            size   rel. freq.    size   rel. freq.
1993-95     164    .177          764    .823          928
1995-98     143    .274          379    .726          522
1998-00     114    .182          513    .818          627

4.2 Assessment of covariate overlap and balance

We first assess covariate overlap and balance in the studied sample. Figure 1 presents the histograms of the estimated propensity scores for the treated and untreated groups for each span, which were estimated from a logistic model with main effects of each covariate. The histograms show a satisfactory overlap in the support of the propensity score for all three spans, so that no trimming is needed; further, this provides a basis for our propensity score regression method based on the suggestions of Hade and Lu (2014).

Figure 2 reports, for each span, the boxplots of the absolute standardized differences (ASD) in covariates between the treated and untreated groups:

$$\mathrm{ASD} = \left| \frac{\sum_{i=1}^{N} X_i Z_i}{\sum_{i=1}^{N} Z_i} - \frac{\sum_{i=1}^{N} X_i (1 - Z_i)}{\sum_{i=1}^{N} (1 - Z_i)} \right| \Bigg/ \sqrt{s_1^2 / N_1 + s_0^2 / N_0}, \qquad (8)$$

where $N_z$ is the number of units and $s_z^2$ is the sample variance of the covariate in group $Z = z$ for $z = 0, 1$. The boxplots reveal significant imbalance in a large number of covariates between groups. Therefore, adjustment for covariate imbalance is crucial in this application.
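Equation (8) is straightforward to compute for a single covariate. The sketch below is an illustration under stated assumptions: the function name `abs_std_diff` and the toy data are hypothetical, and the sample variance uses the usual $n - 1$ denominator, which the text does not specify.

```python
import math
from statistics import mean, variance

def abs_std_diff(x, z):
    """Absolute standardized difference of one covariate, following Equation (8)."""
    x1 = [xi for xi, zi in zip(x, z) if zi == 1]
    x0 = [xi for xi, zi in zip(x, z) if zi == 0]
    # statistics.variance is the sample variance (n - 1 denominator).
    se = math.sqrt(variance(x1) / len(x1) + variance(x0) / len(x0))
    return abs(mean(x1) - mean(x0)) / se

# Hypothetical toy covariate: three treated and four untreated units.
x = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0]
z = [1, 1, 1, 0, 0, 0, 0]
asd = abs_std_diff(x, z)
print(asd)
```

Applying the function to each covariate in turn would give the values summarized by the boxplots in Figure 2.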

Figure 1: Histograms of the estimated propensity score for the treated (blue) and the untreated (red). The first is for the span 1993 to 1995, the second is for the span 1995 to 1998, and the third is for the span 1998 to 2000. [Figure not reproduced: three histogram panels with vertical axis “Frequency”.]

Figure 2: Boxplots of the absolute standardized difference of all covariates. [Figure not reproduced: panels “Span 1993−1995”, “Span 1995−1998”, “Span 1998−2000”; axis “Abs Standardized Difference”.]

4.3 Results

We fit the models in Section 3 via the EM algorithm. Table 3 compares the proportion of never-users among the treated units estimated from the model with and without sensitivity parameters, calculated as follows:

$$
\Pr(S = n \mid Z = 1) = \int_{e(x)} \Pr(S = n \mid Z = 1, e(x)) \cdot \Pr(e(x) \mid Z = 1) \, dx,
$$

where $\Pr(S = n \mid Z = 1, e(x))$ is the model for $S$ and $\Pr(e(x) \mid Z = 1)$ is approximated by the observed distribution of the estimated $e(x)$ among the treated units. As a reference, we also present the moment estimate of this quantity, which is the proportion of non-users in the group of households with one debit card. Table 3 shows that the moment estimates closely resemble the estimates from the model with the sensitivity parameter $\xi$ (Model (6)) in all spans, but differ considerably from those from the model without the parameter (Model (4)), suggesting that the latter is subject to large bias due to unmeasured confounding (elaborated later).
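
Approximating $\Pr(e(x) \mid Z = 1)$ by the observed distribution among the treated turns the integral into an average of model probabilities over the treated units. The sketch below illustrates this step only. The logistic form of the model for $S$, with intercept `alpha0`, slope `alpha` on the propensity score and shift `xi` for treated units, is an assumption made here for illustration; the actual specification is given in Section 3. All names and numeric values are hypothetical.

```python
import math


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def prob_never_user(e, z, alpha0, alpha, xi):
    """ASSUMED form: logit Pr(S = n | Z = z, e) = alpha0 + alpha * e + xi * z."""
    return expit(alpha0 + alpha * e + xi * z)


def share_never_users_among_treated(e_treated, alpha0, alpha, xi):
    """Average Pr(S = n | Z = 1, e_i) over the treated units' estimated scores.

    This replaces the integral against Pr(e(x) | Z = 1) by its empirical
    counterpart among the treated.
    """
    probs = [prob_never_user(e, 1, alpha0, alpha, xi) for e in e_treated]
    return sum(probs) / len(probs)


# Hypothetical propensity scores and parameter values, for illustration only.
e_treated = [0.10, 0.25, 0.40, 0.55]
with_xi = share_never_users_among_treated(e_treated, 2.0, -1.5, -1.0)
without_xi = share_never_users_among_treated(e_treated, 2.0, -1.5, 0.0)
print(with_xi, without_xi)
```

Under this assumed form, a negative `xi` lowers the estimated share of never-users among the treated relative to the same model with `xi` set to zero.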

Table 4 reports the results obtained from the sensitivity model (7) with $\eta_c = 0$ and varying values of $\eta_n$. The parameter $\eta_n$ represents the level of Y-confounding for never-users, expressed as the difference in average cash inventories between never-users with one debit card and never-users without a debit card. Absolute values of $\eta_n$ greater than 400 thousand Italian Lira (LIT hereafter) are considered too high given the reported values of average cash inventories, so we limit the examined values of $\eta_n$ to the range from -400 to 400 thousand LIT. The average cash inventory for never-users with one debit card is observable; it ranges from 630 to 788 thousand LIT for POS users and from 703 to 990 thousand LIT for withdrawers. Estimates of $\xi$ are negative under each scenario. This suggests that, conditional on the propensity score, the probability of being a non-user is lower among treated units (card-holding households) than among untreated units (non-card-holding households), revealing a violation of the unconfoundedness assumption due to S-confounding. Indeed, we found that the untreated compliers mostly consist of a small group of households with high cash inventories. The span 1995-1998 has the highest estimated value of $\alpha_0$ (the intercept of the model for $S$), implying a high proportion of non-users in the group of untreated households. Estimates of the CATEs and CATTs are consistently negative with small standard errors. The span 1995-1998 shows larger values (in absolute terms) of the two estimated causal effects, likely due to the higher S-confounding detected in that span. The estimated values of $\alpha_0$, $\alpha$, $\xi$ and, in particular, of our targeted estimand, the CATT, remain stable across the range of $\eta_c \neq 0$, while the estimated CATEs equal those obtained under $\eta_c = 0$ minus the alternative value of $\eta_c$.

Table 4 shows that, for each of the six scenarios considered, the estimated CATT is insensitive to the degree of unmeasured confounding within a reasonable range. The estimated CATTs for withdrawers are comparatively more stable, in particular for the span 1998-2000, where the values lie in the narrow range from -1547 to -1572 thousand LIT. The estimated CATTs for POS users are more sensitive to the proposed values of $\eta_n$, presumably because of the larger shares of non-users in comparison to withdrawers (Table 1). However, these values remain within plausible ranges, with the only exception of the CATT for the span 1993-1995 and $\eta_n = -400$, which is appreciably different from the CATTs obtained for the alternative values of $\eta_n$. The span 1998-2000 again shows more stable results, from -1573 to -1687 thousand LIT, presumably due to the smaller share of POS non-users in comparison to the other two spans.
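
The stability of the CATT across the $\eta_n$ grid can be summarized numerically. The sketch below is illustrative only: the CATT values are copied from Table 4, while the summary measures (range width, and width relative to the estimate at $\eta_n = 0$) and the function names are choices made here, not part of the paper's analysis.

```python
# eta_n grid and CATT point estimates (thousand LIT) copied from Table 4.
ETA_N = [-400, -200, 0, 200, 400]
CATT = {
    ("1993-95", "POS users"): [-2429.6, -1730.6, -1698.1, -1647.9, -1609.0],
    ("1993-95", "withdrawers"): [-1562.7, -1551.3, -1542.8, -1536.4, -1531.0],
    ("1995-98", "POS users"): [-2902.8, -2805.6, -2763.2, -2691.6, -2658.3],
    ("1995-98", "withdrawers"): [-2739.6, -2710.0, -2674.9, -2619.3, -2571.0],
    ("1998-00", "POS users"): [-1687.3, -1655.1, -1632.3, -1608.3, -1573.0],
    ("1998-00", "withdrawers"): [-1572.0, -1563.1, -1555.6, -1550.9, -1547.8],
}


def catt_spread(values):
    """Return (lowest, highest, width) of the CATT estimates over the grid."""
    lowest, highest = min(values), max(values)
    return lowest, highest, highest - lowest


def relative_spread(values, eta_grid=ETA_N):
    """Range width divided by the absolute CATT estimate at eta_n = 0."""
    reference = values[eta_grid.index(0)]
    _, _, width = catt_spread(values)
    return width / abs(reference)


for (span, group), values in CATT.items():
    lowest, highest, width = catt_spread(values)
    print(span, group, lowest, highest, round(width, 1),
          round(relative_spread(values), 3))
```

A small relative spread indicates that the point estimate changes little over the examined values of $\eta_n$; the standard errors in Table 4 are not taken into account by this summary.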

To better understand the scale of these results, Table 5 reports the percentage reduction in the cash inventories due to the use of the debit card, $\mathrm{CATT}/(\mathrm{AOTC} - \mathrm{CATT})$, where AOTC is the Average Outcome for the Treated Compliers. For 1993-95 and 1998-00, the ratios are greater for POS users than for withdrawers. This is reasonable because individuals who used debit cards to pay at POS usually also used them to withdraw cash at ATMs. For 1995-98 the difference between POS users and withdrawers decreases: people usually start using debit cards to withdraw cash and only subsequently to pay at POS, so the longer the span, the more likely it is that the groups of POS users and withdrawers coincide. Overall, the estimated reduction in household cash inventory due to the use of the debit card is remarkable, ranging between 78% and 81% for the span 1995-1998 and, for the other two spans, between 75% and 78% for the POS users and between 67% and 73% for the withdrawers.
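
The ratio reported in Table 5 can be read as follows: if the CATT is the average of $Y(1) - Y(0)$ among treated compliers, then $\mathrm{AOTC} - \mathrm{CATT}$ is the average cash inventory those households would have held without the card, and the ratio expresses the effect relative to that baseline. The sketch below is a minimal illustration of this arithmetic; the function names and the AOTC value in the first example are hypothetical, and the second example only inverts the rounded figures reported in Tables 4 and 5.

```python
def relative_reduction(catt, aotc):
    """Return CATT / (AOTC - CATT), the effect relative to the no-card baseline."""
    baseline = aotc - catt  # average outcome of treated compliers without the card
    if baseline <= 0:
        raise ValueError("implied baseline cash inventory must be positive")
    return catt / baseline


def implied_baseline(catt, ratio):
    """Back out AOTC - CATT from a CATT estimate and a reported ratio."""
    return catt / ratio


# Hypothetical values (thousand LIT): observed average 500, effect -1500.
print(relative_reduction(-1500.0, 500.0))

# Rounded figures for 1993-95 POS users at eta_n = 0 (Tables 4 and 5).
print(implied_baseline(-1698.1, -0.776))
```

Because the reported figures are rounded, the implied baseline obtained this way is only approximate.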

Table 3: Estimated proportion of never-users in the group of treated, $\Pr(S = n \mid Z = 1)$, from the model with sensitivity parameters in comparison to the model without sensitivity parameters and moment estimates.

| Group | Estimate | 1993-1995 | 1995-1998 | 1998-2000 |
|---|---|---|---|---|
| POS users | Model with sensitivity | .933 | .780 | .690 |
| POS users | Model w/o sensitivity | .886 | .308 | .612 |
| POS users | Moment estimate | .933 | .783 | .690 |
| Withdrawers | Model with sensitivity | .301 | .297 | .140 |
| Withdrawers | Model w/o sensitivity | .700 | .635 | .407 |
| Withdrawers | Moment estimate | .299 | .300 | .140 |

5 Discussion

In this paper we quantify the causal effect of the use of debit cards on households' cash inventories in Italy. A principal stratification model integrated with sensitivity parameters allows us to account simultaneously for several issues: the non-negligible share of households who hold one debit card but do not use it, the questionable definition of the use of cards as a treatment under the potential outcome approach, and possible violation of the unconfoundedness assumption.

Our results suggest considerable causal effects: the reduction in cash inventories for households who use the debit card is consistently between 70 and 80 per cent during 1993-2000. We have evaluated short-term effects here, with a study period of only one to one and a half years on average. In fact, the SHIW data do not record when a household acquired its debit card; we only know that it happened during the two or three years of the considered span. Therefore, the high estimated effects on cash holding also signal that the use of cards quickly reduces the amount of cash held at home.

Via the sensitivity analysis we have identified a high level of unmeasured confounding that would otherwise have biased the results. The source of the confounding lies primarily in the subgroup of compliers who hold high levels of cash even without possessing debit cards. Indeed, Mercatanti and Li (2014) also show that households holding debit cards generally have higher income, wealth and education of the householder than households without debit cards. It is therefore plausible that compliers have high social and economic status. The causal effects for a larger sub-population of compliers, including households from a broad range of social classes, could be quantified by extending the observed period, which would reduce the proportion of never-users. In fact, it is plausible that less reactive households would start to use the cards over time. However, this analysis is not feasible with SHIW data for two reasons: first, an extended period of observation would greatly reduce the sample sizes; second, given the increasing diffusion of debit cards, the size of the control group would collapse over time. Nonetheless, one could still apply the same causal model to suitable datasets that allow the effects of debit cards to be studied over a longer period while maintaining an adequate sample size for the untreated group.

Our sensitivity model for the outcome is not identifiable from a frequentist perspective: only the sum of the two sensitivity parameters, not each parameter individually, is identifiable. From a Bayesian perspective, the model is weakly identifiable given proper prior distributions for the parameters. When available, a secondary outcome variable modeled jointly with the primary outcome would sharpen the analysis (Mealli and Pacini, 2013; Mercatanti et al., 2014) regardless of the mode of inference. Searching for a suitable auxiliary outcome is a direction of our future research.

References

Alvarez, F. and Lippi, F. (2009). Financial innovation and the transactions demand for cash.

Econometrica,77(2), 363–402.

Angrist, J., Imbens, G., and Rubin, D. (1996). Identiﬁcation of causal effects using instrumen-

tal variables. Journal of the American Statistical Association,91(434), 444–455.

Attanasio, O., Guiso, L., and Jappelli, T. (2002). The demand for money, ﬁnancial innovation

and the welfare cost of inﬂation: an analysis with household data. Journal of Political

Economy,110, 318–351.

C.P.S.S. (2001). A glossary of terms used in payments and settlement systems. Committee on

Payment and Settlement System, Bank for International Settlements, Basel, Switzerland.

Duca, J. and Whitesell, W. (1995). Credit cards and money demand: a cross-sectional study.

Journal of Money, Credit and Banking,27, 604–623.

Elliott, M., Raghunathan, T., and Li, Y. (2010). Bayesian inference for causal mediation effects

using principal stratiﬁcation with dichotomous mediators and outcomes. Biostatistics,11(1),

353–372.

Frangakis, C. and Rubin, D. (2002). Principal stratiﬁcation in causal inference. Biometrics,

58(1), 21–29.

Frangakis, C., Brookmeyer, R., Varadhan, R., Mahboobeh, S., Vlahov, D., and Strathdee, S.

(2004). Methodology for evaluating a partially controlled longitudinal treatment using prin-

cipal stratiﬁcation, with application to a needle exchange program. Journal of the American

Statistical Association,99, 239–249.

Gallop, R., Small, D., Lin, J., Elliot, M., Joffe, M., and Ten Have, T. (2009). Mediation

analysis with principal stratiﬁcation. Statistics in Medicine,28(7), 1108–1130.

Gilbert, P., Bosch, J., and Hudgens, M. (2003). Sensitivity analysis for the assessment of

causal vaccine effects on viral load in HIV vaccine trials. Biometrics,59(1), 531–541.

Griffin, B., McCaffrey, D., and Morral, A. (2008). An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs. The Annals of Applied Statistics, 2(3), 1034–1055.

Grilli, L. and Mealli, F. (2008). Nonparametric bounds on the causal effect of university studies on job opportunities using principal stratification. Journal of Educational and Behavioral Statistics, 33(1), 111–130.

Hade, E. and Lu, B. (2014). Bias associated with using the estimated propensity score as a

regression covariate. Statistics in Medicine,33, 74–87.

Hahn, J. (1998). On the role of the propensity score in efﬁcient semiparametric estimation of

average treatment effects. Econometrica,66(2), 315–331.

Heckman, J., Ichimura, H., and Todd, P. (1998). Matching as an econometric evaluation

estimator. The Review of Economic Studies,65(2), 261–294.

Hirano, K., Imbens, G., Rubin, D., and Zhou, X.-H. (2000). Assessing the effect of an in-

ﬂuenza vaccine in an encouragement design. Biostatistics,1(1), 69–88.

Holland, P. (1986). Statistics and causal inference (with discussion). Journal of the American

Statistical Association,81, 945–970.

Imbens, G. (2004). Nonparametric estimation of average treatment effects under exogeneity:

A review. The Review of Economics and Statistics,86(1), 4–29.

Imbens, G. and Angrist, J. (1994). Identiﬁcation and estimation of local average treatment

effects. Econometrica,62, 467–476.

Imbens, G. and Rubin, D. (1997). Bayesian inference for causal effects in randomized experi-

ments with noncompliance. The Annals of Statistics,25(1), 305–327.

Imbens, G. and Rubin, D. (2015). Causal Inference for Statistics, Social, and Biomedical

Sciences: An Introduction. Cambridge University Press, New York.

Jin, H. and Rubin, D. (2008). Principal stratiﬁcation for causal inference with extended partial

compliance. Journal of the American Statistical Association,103, 101–111.

Jo, B. and Vinokur, A. (2011). Sensitivity analysis and bounding of causal effects with al-

ternative identifying assumption. Journal of Educational and Behavioral Statistics,36(4),

415–440.

Li, Y., Taylor, J., and Elliott, M. (2009). A Bayesian approach to surrogacy assessment using

principal stratiﬁcation in clinical trials. Biometrics,66(2), 523–531.

Li, Y., Taylor, J., and Elliott, M. (2011). Causal assessment of surrogacy in a metanalysis of

colorectal cancer trials. Biostatistics,12, 478–492.

Lippi, F. and Secchi, A. (2009). Technological change and the households' demand for currency. Journal of Monetary Economics, 56(2), 222–230.

Markose, S. and Loke, Y. (2003). Network effects on cash-card substitution in transactions

and low interest rate regimes. The Economic Journal,113, 456–476.

Mattei, A. and Mealli, F. (2007). Application of the principal stratification approach to the Faenza randomized experiment on breast self-examination. Biometrics, 63, 437–446.

Mattei, A. and Mealli, F. (2011). Augmented designs to assess principal strata causal effects.

Journal of the Royal Statistical Society, Series B,73(5), 729–752.

Mattei, A., Mealli, F., and Pacini, B. (2014). Identiﬁcation of causal effects in the presence of

nonignorable missing outcome values. Biometrics,70(2), 278–288.

Mealli, F. and Pacini, B. (2013). Using secondary outcomes to sharpen inference in random-

ized experiments with noncompliance. Journal of the American Statistical Association,108,

1120–1131.

Mercatanti, A. (2008). Assessing the effect of debit cards on households' spending under the unconfoundedness assumption. Report n. 304, Dipartimento di Statistica e Matematica Applicata all'Economia, Università di Pisa.

Mercatanti, A. and Li, F. (2014). Do debit cards increase household spending? Evidence

from a semiparametric causal analysis of a survey. The Annals of Applied Statistics,8(4),

2485–2508.

Mercatanti, A., Li, F., and Mealli, F. (2014). Evaluating the effects of university grants using

regression discontinuity designs. Statistical Analysis and Data Mining,8(1), 34–48.

Rosenbaum, P. and Rubin, D. (1983a). Assessing sensitivity to an unobserved binary covariate

in an observational study with binary outcome. Journal of the Royal Statistical Society.

Series B (Methodological),45(2), 212–218.

Rosenbaum, P. and Rubin, D. (1983b). The central role of the propensity score in observational

studies for causal effects. Biometrika,70(1), 41–55.

Rubin, D. (1974). Estimating causal effects of treatments in randomized and nonrandomized

studies. Journal of Educational Psychology,66(1), 688–701.

Rubin, D. (1978). Bayesian inference for causal effects: The role of randomization. The

Annals of Statistics,6(1), 34–58.

Rubin, D. (1979). Using multivariate matched sampling and regression adjustment to control bias in observational studies. Journal of the American Statistical Association, 74, 318–324.

Rubin, D. (2006). Causal inference through potential outcomes and principal stratiﬁcation:

Application to studies with “censoring” due to death. Statistical Science,21(3), 299–309.

Schwartz, S., Li, F., and Mealli, F. (2011). A Bayesian semiparametric approach to intermediate variables in causal inference. Journal of the American Statistical Association, 106(496), 1331–1344.

Schwartz, S., Li, F., and Reiter, J. (2012). Sensitivity analysis for unmeasured confounding in principal stratification. Statistics in Medicine, 31(10), 949–962.

Sjölander, A., Humphreys, K., Vansteelandt, S., Bellocco, R., and Palmgren, J. (2009). Sensitivity analysis for principal stratum direct effects, with an application to a study of physical activity and coronary heart disease. Biometrics, 65(2), 514–520.

Stix, H. (2004). How do debit cards affect cash demand? Survey data evidence. Empirica,31,

93–115.

Stuart, E. and Jo, B. (2013). Assessing the sensitivity of methods for estimating principal

causal effects. Statistical Methods in Medical Research.

Yilmazkuday, H. and Yazgan, M. (2009). Effects of credit and debit cards on the currency

demand. Applied Economics,41, 2115–2123.

Zhang, J., Rubin, D., and Mealli, F. (2009). Likelihood-based analysis of the causal effects

of job-training programs using principal stratiﬁcation. Journal of the American Statistical

Association,104, 166–176.

Table 4: MLE of $\alpha_0$, $\alpha$, $\xi$, $\theta$ (CATE) and CATT when $\eta_c = 0$ and for fixed values of $\eta_n$. CATE and CATT are denominated in thousands of Italian Lira. Standard errors are in parentheses.

| Span, group | Parameter | $\eta_n = -400$ | $\eta_n = -200$ | $\eta_n = 0$ | $\eta_n = 200$ | $\eta_n = 400$ |
|---|---|---|---|---|---|---|
| 93-95, POS users | $\hat{\alpha}_0$ | 3.38 (.43) | 2.32 (.27) | 2.34 (.26) | 2.42 (.25) | 2.48 (.25) |
| | $\hat{\alpha}$ | .35 (2.46) | -1.52 (1.40) | -2.00 (1.28) | -2.86 (1.21) | -3.52 (1.17) |
| | $\hat{\xi}$ | -.84 (.48) | -.67 (.36) | -.77 (.35) | -.91 (.35) | -1.03 (.35) |
| | CATE | -2401.9 (291.4) | -1708.6 (168.4) | -1673.5 (158.2) | -1617.5 (156.2) | -1574.9 (161.7) |
| | CATT | -2429.6 (911.8) | -1730.6 (460.5) | -1698.1 (455.9) | -1647.9 (450.1) | -1609.0 (471.8) |
| 93-95, withdrawers | $\hat{\alpha}_0$ | 2.39 (.24) | 2.38 (.23) | 2.37 (.23) | 2.37 (.23) | 2.38 (.23) |
| | $\hat{\alpha}$ | -1.84 (1.11) | -1.97 (1.08) | -2.06 (1.07) | -2.14 (1.06) | -2.22 (1.07) |
| | $\hat{\xi}$ | -2.80 (.24) | -2.76 (.23) | -2.74 (.22) | -2.72 (.22) | -2.71 (.22) |
| | CATE | -1506.4 (58.7) | -1493.6 (55.2) | -1483.2 (53.4) | -1484.3 (52.9) | -1469.2 (53.4) |
| | CATT | -1562.7 (229.5) | -1551.3 (229.1) | -1542.8 (216.9) | -1536.4 (220.4) | -1531.0 (217.9) |
| 95-98, POS users | $\hat{\alpha}_0$ | 3.26 (.48) | 3.19 (.46) | 3.16 (.46) | 3.12 (.45) | 3.11 (.45) |
| | $\hat{\alpha}$ | 2.23 (1.13) | 1.79 (1.07) | 1.64 (1.05) | 1.31 (1.02) | 1.11 (.99) |
| | $\hat{\xi}$ | -2.94 (.48) | -2.68 (.43) | -2.59 (.41) | -2.42 (.39) | -2.33 (.39) |
| | CATE | -2881.9 (207.2) | -2767.1 (182.6) | -2720.6 (174.6) | -2637.0 (170.6) | -2596.3 (174.3) |
| | CATT | -2902.8 (479.5) | -2805.6 (512.6) | -2763.2 (477.2) | -2691.6 (432.8) | -2658.3 (493.8) |
| 95-98, withdrawers | $\hat{\alpha}_0$ | 3.67 (.45) | 3.65 (.44) | 3.62 (.44) | 3.59 (.43) | 3.57 (.43) |
| | $\hat{\alpha}$ | -.26 (.96) | -.31 (.94) | -.37 (.93) | -.49 (.92) | -.62 (.91) |
| | $\hat{\xi}$ | -4.41 (.43) | -4.37 (.42) | -4.32 (.41) | -4.23 (.40) | -4.15 (.39) |
| | CATE | -2775.4 (169.4) | -2744.1 (159.5) | -2706.7 (152.8) | -2646.3 (147.3) | -2592.1 (144.4) |
| | CATT | -2739.6 (305.0) | -2710.0 (423.9) | -2674.9 (426.8) | -2619.3 (433.0) | -2571.0 (403.4) |
| 98-00, POS users | $\hat{\alpha}_0$ | 2.35 (.23) | 2.25 (.22) | 2.19 (.22) | 2.15 (.22) | 2.11 (.22) |
| | $\hat{\alpha}$ | -1.89 (.93) | -1.75 (.91) | -1.69 (.90) | -1.70 (.89) | -1.87 (.88) |
| | $\hat{\xi}$ | -1.01 (.30) | -.94 (.29) | -.90 (.28) | -.85 (.29) | -.77 (.29) |
| | CATE | -1785.4 (137.5) | -1761.4 (126.0) | -1739.1 (121.2) | -1709.9 (121.0) | -1652.8 (63.6) |
| | CATT | -1687.3 (391.0) | -1655.1 (379.4) | -1632.3 (375.5) | -1608.3 (373.4) | -1573.0 (366.0) |
| 98-00, withdrawers | $\hat{\alpha}_0$ | 2.00 (.24) | 1.98 (.24) | 1.97 (.23) | 1.96 (.24) | 1.95 (.23) |
| | $\hat{\alpha}$ | -.45 (1.02) | -.42 (1.02) | -.41 (1.02) | -.41 (1.02) | -.43 (1.03) |
| | $\hat{\xi}$ | -3.69 (.34) | -3.68 (.33) | -3.67 (.32) | -3.66 (.32) | -3.65 (.33) |
| | CATE | -1643.9 (72.6) | -1635.6 (70.9) | -1628.8 (69.7) | -1623.4 (69.5) | -1619.4 (69.9) |
| | CATT | -1572.0 (244.6) | -1563.1 (248.1) | -1555.6 (239.6) | -1550.9 (246.3) | -1547.8 (244.2) |

Table 5: Percentage reduction in the cash inventories due to the use of the debit card, $\mathrm{CATT}/(\mathrm{AOTC} - \mathrm{CATT})$, across a range of $\eta_n$ values.

| Span: group | $\eta_n = -400$ | $\eta_n = -200$ | $\eta_n = 0$ | $\eta_n = 200$ | $\eta_n = 400$ |
|---|---|---|---|---|---|
| 93-95: POS users | -.832 | -.779 | -.776 | -.770 | -.766 |
| 93-95: withdrawers | -.678 | -.676 | -.675 | -.674 | -.673 |
| 95-98: POS users | -.813 | -.808 | -.806 | -.802 | -.797 |
| 95-98: withdrawers | -.793 | -.791 | -.789 | -.785 | -.782 |
| 98-00: POS users | -.758 | -.755 | -.752 | -.749 | -.745 |
| 98-00: withdrawers | -.730 | -.729 | -.728 | -.728 | -.727 |
