
ARTICLE

Clinical Trials 2006; 3: 73–98

© Society for Clinical Trials 2006. DOI: 10.1191/1740774506cn139oa

Do antidepressants cause suicidality in children? A Bayesian meta-analysis

Eloise E Kaizar^a, Joel B Greenhouse^a,b, Howard Seltman^a and Kelly Kelleher^c

Background

To quantify the risk of suicidal behavior/ideation (suicidality) for children who use antidepressants, the FDA collected randomized placebo-controlled trials of antidepressant efficacy in children. Although none of the 4487 children completed suicide, 1.7% exhibited suicidality. The FDA meta-analyzed these studies and found sufficient evidence of an increased risk to require a black-box warning on antidepressants for children.

Purpose

The FDA considered different drug formulations and psychiatric diagnoses to be equivalent in their effect on suicidality. If this assumption does not hold, the FDA analysis may have underestimated the variance of the risk estimate. We investigate the consequences of relaxing these assumptions.

Methods

We extend the FDA analysis using a Bayesian hierarchical model that allows for a study-level component of variability and facilitates extensive sensitivity analyses.

Results

We found an association between antidepressant use and an increased risk of suicidality in studies where the diagnosis was major depressive disorder (odds ratio 2.3 [1.3, 3.8]), and where the antidepressant was an SSRI (odds ratio 2.2 [1.3, 3.6]). We did not find evidence for such an association in the complement sets of trials. Although the results based on the hierarchical model are insensitive to model perturbations, the robustness of the FDA’s meta-analysis to model assumptions is less clear. These data have limited generalizability due to exclusion of patients with baseline risk of suicide and the use of relatively short duration trials.

Conclusions

Because of model specification and interpretation issues raised in this paper, we conclude that the evidence supporting a causal link between antidepressant use and suicidality in children is weak. The use of Bayesian hierarchical models for meta-analysis has facilitated the incorporation of potentially important sources of variability and the use of sensitivity analysis to assess the consequences of model specifications and their impact on important regulatory decisions.

Clinical Trials 2006; 3: 73–98. www.SCTjournal.com

^a Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA; ^b Department of Psychiatry, University of Pittsburgh Medical Center, Pittsburgh, PA, USA; ^c Department of Pediatrics, Ohio State University and Children’s Research Institute, Columbus, OH, USA

Author for Correspondence: Joel Greenhouse, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15217, USA. E-mail: joel@stat.cmu.edu

Introduction

In October 2004, the United States Food and Drug Administration (FDA) issued a letter to the manufacturers of newer generation antidepressants informing them of the need to “caution practitioners, patients, family members or caregivers about an increased risk of suicidal thinking and behavior (suicidality) in children and adolescents ... who are taking antidepressant medications”. The FDA ordered manufacturers to include a “black box” warning on the labels of their antidepressants, the most severe regulatory action short of a total ban. Since this warning went into effect, there has been, as expected, a decline in prescriptions for antidepressants for children and adolescents [1].


Before examining the evidence upon which the FDA’s decision was based, it will be useful to place this decision into a wider context. An excellent review of the deliberations of the FDA’s scientific advisory committees and of the history of the controversy surrounding the use of antidepressants and suicidality in children and adolescents can be found in Leslie et al. [2]. Here, we briefly highlight some important scientific and policy issues that help provide a context for our analyses. Mental health disorders in pediatric populations are surprisingly prevalent. Recent estimates suggest that at least one in 10 children and adolescents has mental illness severe enough to cause some level of impairment [3]. For example, the estimated point prevalence for major depressive disorder is approximately 1–2% in school-aged children (eg, ages 6–12) and 2–5% in adolescents, with 14–25% of youth experiencing at least one major depression before adulthood. In addition to causing serious morbidity, many of the mood disorders, such as major depressive disorder, dysthymic disorder and bipolar disorder, are also characterized by an increased risk of suicidal ideation, suicide attempts, and suicide completion (1995–1998 US age-adjusted suicide death rates, all races, both sexes, ages 0–19: 2.8 per 100 000).

The use of antidepressants in pediatric patients, particularly the selective serotonin reuptake inhibitors (SSRIs) and the atypical antidepressants, has rapidly increased in the past decade, in part because of their favorable side-effect profile compared to earlier families of antidepressants such as the tricyclics. It is important to note, however, that much of the use of antidepressants in pediatric populations with affective disorders has occurred “off-label” without adequate testing regarding their safety and efficacy. Even though in adults antidepressants have been shown to treat depression better than placebo, only one, fluoxetine, has FDA approval for the treatment of depression in children and adolescents. Finally, we note that even in adults there has been a concern dating back to the early 1990s about a relationship between the use of SSRIs and suicidality [see, eg, 4,5].

The primary source of evidence that informed the FDA’s decision to require the black box warning was a meta-analysis of 24 randomized placebo-controlled trials designed to assess the efficacy of antidepressants for different psychiatric disorders in children and adolescents [2]. The FDA examined studies of both SSRIs and atypical antidepressants, but did not consider older classes of antidepressants such as tricyclics and monoamine oxidase inhibitors (MAOIs). Among 4487 patients studied, there were no completed suicides. The overall rate of suicidal ideation and/or behavior, however, was 1.7%. Since there were no completed suicides, the FDA used suicidal behavior and/or ideation (referred to as suicidality) as their primary outcome.

To quantify the risk of suicidality due to antidepressant use, the FDA combined the data from the randomized trials using standard fixed and random-effects models. Because the FDA considered the different drug formulations and psychiatric diagnoses to be equivalent in their effect on suicidality, we wondered whether the FDA analysis might have underestimated the variance associated with the risk ratio for suicidality, thus making the case for a true “signal” stronger than it might actually be. In this paper, we compare the FDA’s analysis with a fully Bayesian analysis of the same data. We consider patients with different illnesses or patients being treated with different antidepressants (formulations) to be heterogeneous in their risk of suicidality, thus allowing for more sources of variance. Specifically, we consider a four-stage hierarchical model that includes a level for study-level variables such as drug formulation and illness type. In this paper, we are particularly interested in how issues in the practice of meta-analysis such as model specification and sensitivity analysis inform the regulatory decision process.

Methods

At the request of the FDA, manufacturers identified all placebo-controlled trials of antidepressants conducted in children and adolescents, regardless of the indication studied, and provided information from these trials to the FDA. The database consists of 24 randomized placebo-controlled efficacy trials of antidepressants in children and adolescents. All studies except one were sponsored by pharmaceutical companies. The exception was an NIMH-funded multisite trial of Prozac for the treatment of depression in adolescents [6]. The FDA used these trials for the meta-analysis that they presented to an expert advisory panel in September 2004 [7,8]. Table 1 presents characteristics of the studies; for more detailed information, please refer to the table presented at the FDA advisory committees meeting [9]. Among the 24 placebo-controlled trials, 16 studied efficacy for major depressive disorder (MDD), four studied efficacy for obsessive compulsive disorder (OCD), three studied efficacy for social or general anxiety disorders, and one studied efficacy for ADHD. The drug formulations investigated included five SSRIs (citalopram [Celexa], fluvoxamine [Luvox], paroxetine [Paxil], fluoxetine [Prozac] and sertraline [Zoloft]) and four atypical antidepressants (venlafaxine [Effexor], mirtazapine [Remeron], nefazodone [Serzone] and bupropion [Wellbutrin]). We use the FDA’s definition of suicidality that was based on the assessment and classification of adverse events by an independent adjudication committee [2,7]. The outcome of interest is the number of subjects per treatment group (within each study) for whom suicidal behavior and/or ideation was reported. In this paper, our primary measure of association between suicidal behavior and/or ideation and treatment is the odds ratio (OR), which we subsequently also refer to as the “drug effect”.

The FDA used the Mantel–Haenszel method [10: 64] and the method of DerSimonian and Laird [11] to estimate the overall association of antidepressant use with suicidality. In their written report [7], the FDA used similar fixed and random-effects analyses on various subsets of the data. Within both their overall analysis and sub-group analyses, the FDA considered the different drug formulations and psychiatric diagnoses to be equivalent in their effect on suicidality.
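As a rough illustration of the random-effects approach named above, the following sketch applies the standard DerSimonian and Laird moment estimator to study-level log odds ratios. It is not the FDA's code: the function names, the use of Woolf's variance formula, and the requirement that every cell be non-zero are assumptions made for this example.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log OR, approximate variance) for one 2x2 table.

    Assumes every cell is non-zero; zero cells need a separate correction.
    """
    a, b = events_t, n_t - events_t   # treated arm: events, non-events
    c, d = events_c, n_c - events_c   # control arm: events, non-events
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d   # Woolf's variance estimate
    return log_or, variance

def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird moment estimator.

    Returns (pooled effect, variance of the pooled effect, tau^2).
    """
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    scale = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / scale)
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, 1 / sum(re_weights), tau2
```

When the study effects agree exactly, the estimated between-study variance is zero and the result coincides with the inverse-variance fixed-effect estimate.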

We specify a four-level hierarchical model for the drug effect. Level 1 models the number of patients who have suicidal behaviors and/or ideations in each treatment group as Binomial random variables; Level 2 models the log of the study drug effects as Normal random variables [12]; Level 3 models the log of the effects of study-level variables (such as drug formulation or psychiatric diagnosis) as Normal random variables; and Level 4 models the log of the overall drug effect as a Normal random variable. The hierarchical model also allows us to include treatment arms with zero events, whereas the four studies in which there were no events in either arm were excluded from the FDA meta-analysis; for studies in which only one arm had no events, the FDA used a correction factor of 0.5 to adjust all arms of that study.
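The handling of zero-event arms described above can be sketched as follows. This is a minimal illustration, assuming that the 0.5 correction is added to every cell of a study with exactly one zero-event arm and that the pooled estimate is the usual Mantel–Haenszel odds ratio. The function name and the counts are hypothetical and are not taken from the FDA database.

```python
def mantel_haenszel_or(studies):
    """Pooled Mantel-Haenszel odds ratio.

    studies: iterable of (events_treated, n_treated, events_control, n_control).
    Studies with no events in either arm are skipped. Studies with a single
    zero-event arm get 0.5 added to every cell (an assumed reading of the
    correction described in the text).
    Returns (pooled OR, number of studies excluded).
    """
    numerator = denominator = 0.0
    excluded = 0
    for events_t, n_t, events_c, n_c in studies:
        if events_t == 0 and events_c == 0:
            excluded += 1
            continue
        a, b = events_t, n_t - events_t
        c, d = events_c, n_c - events_c
        if events_t == 0 or events_c == 0:
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        total = a + b + c + d
        numerator += a * d / total
        denominator += b * c / total
    return numerator / denominator, excluded

# Hypothetical counts used only to exercise the function.
toy_studies = [(4, 100, 2, 100), (3, 50, 0, 50), (0, 30, 0, 30)]
pooled_or, n_excluded = mantel_haenszel_or(toy_studies)
```

The third toy study contributes nothing to the pooled estimate, which mirrors how the four double-zero trials drop out of the frequentist analysis while remaining usable in the hierarchical model.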

As an example, we specify the model for the case where the study-level variable is drug formulation. Figure 1 is a diagram of this model. We let $r_{ij}^C$ denote the observed number of subjects with suicidal behaviors/ideations in the control group of study $i = 1, \ldots, I$ of drug $j = 1, \ldots, J$ and $n_{ij}^C$ denote the number of subjects in the control group of study $i$ of drug $j$. We let $\pi_{ij}^C$ denote the probability of a subject in the control group of study $i$ of drug $j$ having suicidal behavior/ideation. Corresponding values for the treatment groups are $r_{ij}^T$, $n_{ij}^T$, and $\pi_{ij}^T$. We let $\delta_{ij}$ denote the true log OR for study $i$ of drug $j$, $\gamma_j$ denote the true log OR for drug $j$ and $\theta$ denote the true overall log OR for all drug formulations (including both SSRI and atypical antidepressants) together. The complete specification for our four-level hierarchical model in which the drug effect is

Table 1. Properties of studies

| Drug class | Drug formulation | Study ID | Psychiatric diagnosis | Number of patients in drug group | Number of patients in placebo group | Length of trial (weeks) |
|---|---|---|---|---|---|---|
| SSRI | Celexa | 18 | MDD | 93 | 85 | 8 |
| SSRI | Celexa | 94404 | MDD | 124 | 120 | 12 |
| SSRI | Luvox | 01 | OCD | 57 | 63 | 10 |
| SSRI | Paxil | 701 | MDD | 104 | 102 | 8 |
| SSRI | Paxil | 377 | MDD | 180 | 95 | 12 |
| SSRI | Paxil | 329 | MDD | 93 | 88 | 8 |
| SSRI | Paxil | 704 | OCD | 99 | 107 | 10 |
| SSRI | Paxil | 676 | Anxiety | 165 | 156 | 16 |
| SSRI | Prozac | TADS | MDD | 109 | 112 | 16 |
| SSRI | Prozac | X065 | MDD | 48 | 48 | 8 |
| SSRI | Prozac | HCJE | MDD | 109 | 110 | 9 |
| SSRI | Prozac | HCCJ | MDD | 21 | 19 | 6 |
| SSRI | Prozac | HCJW | OCD | 71 | 32 | 13 |
| SSRI | Zoloft | 501001 | MDD | 97 | 91 | 10 |
| SSRI | Zoloft | 501017 | MDD | 92 | 93 | 10 |
| SSRI | Zoloft | 0498 | OCD | 92 | 95 | 12 |
| Atypical | Effexor | 382 | MDD | 80 | 85 | 8 |
| Atypical | Effexor | 394 | MDD | 102 | 94 | 8 |
| Atypical | Effexor | 397 | Anxiety | 77 | 79 | 8 |
| Atypical | Effexor | 396 | Anxiety | 80 | 84 | 8 |
| Atypical | Remeron | 045 | MDD | 170 | 89 | 8 |
| Atypical | Serzone | 187 | MDD | 184 | 94 | 8 |
| Atypical | Serzone | 141 | MDD | 95 | 95 | 8 |
| Atypical | Wellbutrin | 75 | ADHD | 72 | 37 | 4 |

SSRI denotes Selective Serotonin Reuptake Inhibitor, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder.


measured by the log OR is given by Equations (1) through (6):

$$
\begin{aligned}
r_{ij}^{C} &\sim \mathrm{Bin}(n_{ij}^{C}, \pi_{ij}^{C}), && (1)\\
r_{ij}^{T} &\sim \mathrm{Bin}(n_{ij}^{T}, \pi_{ij}^{T}), && (2)\\
\mathrm{logit}(\pi_{ij}^{T}) &= \mathrm{logit}(\pi_{ij}^{C}) + \delta_{ij}, && (3)\\
\delta_{ij} &\sim N(\gamma_j, \sigma_j^2), && (4)\\
\gamma_j &\sim N(\theta, \tau^2), && (5)\\
\theta &\sim N(\nu, \phi^2). && (6)
\end{aligned}
$$

The variance component for the second level, $\sigma_j^2$ (Equation 4), denotes the within-drug formulation variance, a measure of the variability of individual study log ORs within drug formulation $j$. For example, $\sigma^2_{\mathrm{Celexa}}$ is a measure of the variability of the study-level log ORs within the collection of Celexa studies. The variance component for the third level, $\tau^2$ (Equation 5), denotes the between-drug formulation variance, a measure of the variability of the formulation-level log ORs among the different drug formulations. Finally, $\phi^2$ denotes the prior variance of the overall log OR, which quantifies our a priori uncertainty of the value of the overall log OR. To specify the joint distribution of the data and parameters, we assume that each study is independent from all the other studies in the same group, and that each group mean log OR is independent from the other group mean log ORs. We fit this hierarchical model using Bayesian methods, which not only provide more clinically meaningful summary measures and make the calculations associated with this complex model feasible, but also allow us to explicitly explore the sensitivity of our inferences to various model inputs.
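One way to read Equations (1) through (6) is as a recipe for generating data from the top of the hierarchy down. The sketch below draws one synthetic data set in that order. It is illustrative only: the function name, the arm sizes, and the fixed values chosen for the two standard deviations and for the beta parameters of the control-arm risk are assumptions for this example, whereas the analysis in the paper places prior distributions on those quantities.

```python
import math
import random

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def logit(p):
    return math.log(p / (1 - p))

def simulate_trials(arm_sizes, nu=0.0, phi=3.0, tau=0.3, sigma=0.3,
                    alpha=2.0, beta=50.0, seed=0):
    """Draw one synthetic data set from the four-level model.

    arm_sizes maps a drug name to a list of (n_control, n_treated) pairs,
    one pair per study. tau, sigma, alpha and beta are fixed here purely
    for illustration.
    """
    rng = random.Random(seed)
    overall = rng.gauss(nu, phi)                    # Level 4, Equation (6)
    data = []
    for drug, studies in arm_sizes.items():
        drug_effect = rng.gauss(overall, tau)       # Level 3, Equation (5)
        for n_c, n_t in studies:
            study_effect = rng.gauss(drug_effect, sigma)    # Level 2, Equation (4)
            pi_c = rng.betavariate(alpha, beta)     # control-arm risk
            pi_t = inv_logit(logit(pi_c) + study_effect)    # Equation (3)
            # Level 1, Equations (1)-(2): binomial counts as sums of Bernoulli draws.
            r_c = sum(rng.random() < pi_c for _ in range(n_c))
            r_t = sum(rng.random() < pi_t for _ in range(n_t))
            data.append({"drug": drug, "r_c": r_c, "n_c": n_c,
                         "r_t": r_t, "n_t": n_t})
    return data

# Hypothetical arm sizes chosen only to exercise the function.
sizes = {"drug_A": [(100, 100), (80, 90)], "drug_B": [(60, 60)]}
example = simulate_trials(sizes)
```

Simulating from the model in this way is also a convenient check on a fitted sampler, since parameters recovered from synthetic data can be compared with the values used to generate it.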

Although we only present the explicit model specification for the case where the study-level variable is drug formulation, we use the same formalism for our other three groupings: drug class, psychiatric diagnosis and study length. In each of these cases, respectively, $\gamma_j$ represents the log OR of suicidality associated with antidepressant use for 1) those using an antidepressant in the $j$th drug class, 2) those with the $j$th psychiatric diagnosis, and 3) those participating in a study with the $j$th length.

The goal of our analysis is 1) to accurately synthesize the evidence concerning the risk of suicidal behaviors/ideations in young people who use antidepressants and 2) to bring differing opinions concerning the risk of the use of antidepressants in young people to consensus. We present two summaries of the evidence for a relationship between antidepressant use and suicidality: 1) the mean of the posterior distribution of the odds ratio along with its 95% credible interval, and 2) the posterior probability that the OR is greater than 1, ie, Pr(OR > 1).
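Both summaries can be computed directly from posterior draws of a log odds ratio. The sketch below assumes an equal-tailed 95% interval and a simple linear-interpolation quantile rule, neither of which is stated in the text. The function names are hypothetical and the four draws stand in for real MCMC output.

```python
import math

def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight

def summarize_or(log_or_draws):
    """Posterior mean OR, equal-tailed 95% interval, and Pr(OR > 1)."""
    odds_ratios = sorted(math.exp(x) for x in log_or_draws)
    mean_or = sum(odds_ratios) / len(odds_ratios)
    interval = (quantile(odds_ratios, 0.025), quantile(odds_ratios, 0.975))
    # OR > 1 is the same event as log OR > 0.
    prob_gt_one = sum(x > 0 for x in log_or_draws) / len(log_or_draws)
    return mean_or, interval, prob_gt_one

# Hypothetical draws standing in for MCMC output.
draws = [-0.5, 0.0, 0.5, 1.0]
mean_or, interval, prob_gt_one = summarize_or(draws)
```

Note that the mean is taken on the odds ratio scale, so it is generally larger than the exponential of the mean log odds ratio when the posterior is skewed.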

Figure 1. Bayesian hierarchical model

We use the deviance information criterion (DIC) to evaluate the fit of our models. The DIC is a measure of “fit”, “adequacy” or predictive ability that is defined as the average deviance (−2 times the log likelihood) penalized by the effective number of parameters [13,14]. In non-hierarchical settings with noninformative prior distributions, it is approximately equivalent to Akaike’s information criterion (AIC) [13]. Unlike AIC, however, the DIC is also suitable for assessing the fit of hierarchical models. DIC calculations require the specification of a single likelihood and single prior distribution, collapsing the problem to a two-level hierarchy. Thus, we must specify the level of the hierarchy that is of interest (ie, the “focus”). For example, for comparisons with fixed or random effects models, we chose to focus on Level 2 ($\pi^C$ and $\delta$) and for all other comparisons to focus on Level 3 ($\pi^C$ and $\gamma$). Guidelines for the interpretation of DIC are similar to those for AIC [13, also see discussion for critiques of DIC]: if two models have DIC values within two points of each other, there is little evidence that either fits better than the other; if two models have DIC values that differ by three points or more, the model with the smaller DIC value is considered to fit the data better.
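The verbal definition of DIC can be made concrete with the standard decomposition: DIC equals the posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance evaluated at the posterior mean of the parameters. The sketch below applies this to a single binomial arm. The data, the three posterior draws, and the function names are hypothetical, and this toy setting ignores the question of hierarchical focus discussed above.

```python
import math

def binomial_deviance(r, n, p):
    """Deviance (-2 times the binomial log likelihood) for r events in n subjects."""
    log_lik = (math.log(math.comb(n, r))
               + r * math.log(p) + (n - r) * math.log(1 - p))
    return -2 * log_lik

def dic_from_draws(r, n, p_draws):
    """Return (mean deviance, effective parameters p_D, DIC) for one binomial arm."""
    mean_deviance = sum(binomial_deviance(r, n, p) for p in p_draws) / len(p_draws)
    p_mean = sum(p_draws) / len(p_draws)
    deviance_at_mean = binomial_deviance(r, n, p_mean)
    p_d = mean_deviance - deviance_at_mean     # effective number of parameters
    return mean_deviance, p_d, mean_deviance + p_d

# Hypothetical data and posterior draws for a single arm.
mean_dev, p_d, dic = dic_from_draws(r=3, n=50, p_draws=[0.04, 0.06, 0.08])
```

Because the penalty grows with posterior spread in the deviance, a model that adds variance components pays for them only to the extent that the data leave those components uncertain.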

Initially, we specify a prior distribution for the overall drug effect using a diffuse ($\phi = 3$) Normal distribution centered at no effect ($\nu = 0$). By centering this diffuse prior at “no effect” we are initially taking a skeptical view towards there being a relationship between antidepressant use and suicidal behavior or ideation and are requiring the data to provide strong evidence for such a relationship. Such a “skeptical” prior might well reflect the scientific and clinical communities’ predominant belief prior to the FDA’s issuance of the black box warning. As part of our sensitivity analyses, we also consider the influence of an a priori strong drug effect ($\nu = 2$). The choice of a prior distribution on $\theta$ with standard deviation $\phi = 3$ assigns probability of ~1/2 to the odds ratio ($e^\theta$) of a drug effect in the range $\{1/8 < e^\theta < 8\}$. We also consider in our sensitivity analysis tighter distributions on the odds ratio corresponding to $\phi = 1.6$, which assigns probability 1/2 to $\{1/3 < e^\theta < 3\}$, and $\phi = 0.7$, which assigns probability 1/2 to $\{5/8 < e^\theta < 1.6\}$. For prior distributions on the variance parameters $\sigma_j^2$ and $\tau^2$, we specify conjugate Inverse Gamma distributions with shape and rate 0.001 (Gamma parameterization: mean = shape/rate). These distributions approximate a flat prior on the variance parameters. Finally, we must also specify a weak proper prior distribution for the $\pi_{ij}^C$. We assume the $\pi_{ij}^C$ are drawn from a single beta distribution with hyperparameters $\alpha$ and $\beta$ and we assume hyperprior distributions for both $\alpha$ and $\beta$ that are uniform(1, 200) [15].

In our analysis we consider four subsets of the data, going from the most specific set of study characteristics to the most general:

• Subset a: Those studies that examined the effects of SSRIs on MDD, thus estimating $\theta_{\mathrm{MDD} \cap \mathrm{SSRI}}$. These studies are of particular interest since the antidepressant drugs were originally approved for this indication in adults and we expect the patients in these studies to be at higher risk of suicide or suicidal behavior/ideation.

• Subset b: Those studies that examined the effects of antidepressants (both SSRIs and atypicals) on MDD, thus estimating $\theta_{\mathrm{MDD}}$.

• Subset c: The complement of subset b: those studies that examined the effects of antidepressants (both SSRIs and atypicals) on OCD, anxiety or ADHD.

• Subset d: All studies that examined the effects of antidepressants (both SSRIs and atypicals) on any diagnosis (MDD, OCD, anxiety or ADHD), thus estimating $\theta_{\mathrm{overall}}$.

Our analytic plan is as follows. For each of these data subsets we first replicate the FDA’s fixed- and random-effects analyses using Mantel–Haenszel and DerSimonian and Laird methods, respectively, with the odds ratio rather than the risk ratio as the summary measure of risk. Second, we consider the Bayesian version of the fixed- and random-effects models by setting to zero different variance components in the Bayesian hierarchical model described in Equations (1) through (6). For the random-effects model, this means setting $\gamma_j = \theta$, $\tau = 0$ and $\sigma_j = \sigma$ for all $j$. For the fixed-effects model, we further collapse the model by setting $\delta_{ij} = \theta$ for all $i, j$ and $\sigma = 0$ [16]. By embedding the random- and fixed-effects model into the broader framework of the multilevel hierarchical model, we are able to investigate the relative fit of the fixed-effects, random-effects and full hierarchical models using DIC since the only distinctions among the models are the restrictions on the variance components. In addition, this framework will facilitate model sensitivity analyses.

Next, we explicitly incorporate different sources of study-level variation using the four-level model. We take drug formulation, drug class, study length, and psychiatric diagnosis, in turn, as our Level 3 study factor, as illustrated for drug formulation in Figure 1. We take all four subsets of the data when applying our model with drug formulation as the Level 3 variable, but limit ourselves to only the entire data set (subset d) when examining the other study-level variables, since the other subsets of the data in part reflect the drug class and diagnosis. In each of these models, we consider studies of the same study-level group (eg, drug formulation) to be identically distributed. We use DIC focused on Level 2 to compare each of these models to their fixed- and random-effects counterparts, and DIC focused on Level 3 to compare the four-level models to each other.

Finally, we explore the sensitivity of our results to the specification of prior distributions for $\theta$ and for the variance parameters $\sigma^2$ and $\tau^2$, as well as to the influence of individual studies and drug formulations. To investigate the sensitivity of our results to the specification of the original skeptical prior distribution for $\theta$ we reanalyze the data by setting the prior mean for $\theta$ at $\nu = 2$ (odds ratio $= e^2 \approx 7$), indicating an a priori belief in an increase in risk of suicidal behavior/ideation due to the use of antidepressants, and by setting the prior standard deviation for $\theta$ at $\phi = 0.7$, 1.6 and 3.0. To assess sensitivity to prior variance specification, we fix the between Level 3 variance ($\tau^2$) at different plausible values and examine how our results change as a result of changing the variance. We repeat this analysis for the within-Level 3 variance ($\sigma^2$). Lastly, to assess sensitivity to the influence of a single study, we leave each study out of our model one at a time, allowing us to examine how our results change as a result of the influence of a single study.
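The prior scales used in these analyses can be checked numerically. For a Normal prior on the log odds ratio centered at zero, the probability that the odds ratio lies between 1/b and b follows from the Normal distribution function. The short sketch below, with a hypothetical helper name, evaluates that probability for the three standard deviations quoted in the text and also evaluates the odds ratio implied by a prior mean of 2.

```python
import math

def prior_prob_or_within(bound, phi, nu=0.0):
    """Pr(1/bound < exp(theta) < bound) when theta ~ Normal(nu, phi^2)."""
    def normal_cdf(x):
        return 0.5 * (1 + math.erf((x - nu) / (phi * math.sqrt(2))))
    return normal_cdf(math.log(bound)) - normal_cdf(-math.log(bound))

# Keys are (odds ratio bound, prior standard deviation) pairs from the text.
checks = {
    (8.0, 3.0): prior_prob_or_within(8.0, 3.0),
    (3.0, 1.6): prior_prob_or_within(3.0, 1.6),
    (1.6, 0.7): prior_prob_or_within(1.6, 0.7),
}
odds_ratio_at_prior_mean_2 = math.exp(2)   # about 7.39
```

Each of the three probabilities comes out close to one half, in line with the stated interpretation of the prior standard deviations.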

We implemented the multi-level model using BUGS [17]. BUGS (Bayesian inference using Gibbs sampling) is flexible software that constructs a Markov chain Monte Carlo (MCMC) sampling algorithm for a user-specified hierarchical model. BUGS also implements the algorithm to produce an MCMC sample from the posterior distribution of the model parameters and an estimate of the model’s DIC focused at the parameters on the hierarchical level closest to the data (in our model focused on Level 2).

Results

FDA analysis

As a summary of the individual study level effects and the results of the FDA meta-analysis, Figure 2 displays a forest plot of the crude odds ratio (OR) of suicidality for each study, as well as the Mantel–Haenszel (estimated OR = 2.00, CI [1.29, 3.08]) and DerSimonian and Laird (estimated OR = 1.80, CI [1.13, 2.88]) estimates of the overall OR and 95% confidence intervals for the OR. We note again that the four studies that had no events in either arm were excluded from the FDA analysis. Additionally, Table 2 displays the Mantel–Haenszel and DerSimonian and Laird estimates of the OR along with 95% confidence intervals for the four pre-specified subsets of the data.

Bayesian fixed-effects models

We next consider a Bayesian version of the fixed-effects analysis used by the FDA by fixing in the hierarchical model $\sigma_j = 0$ for all $j$ and setting $\tau = 0$. In words, all the studies are treated as if they were homogeneous and pooled together. Table 2 displays the posterior estimate of the OR, 95% credible interval and the probability that the OR is greater than one for each of the subsets of the data. Inspection of Table 2 indicates that the results of the Bayesian fixed-effects analysis are similar to the results of the FDA’s fixed-effects analysis.

For all hierarchical fixed-effects models that include data for MDD patients (subsets a, b and d), the probability that the OR of suicidality is greater than one exceeds 0.95. These large probabilities indicate strong evidence for an association between antidepressant use and suicidality. However, the subset of the data that does not include patients with an MDD diagnosis (subset c) does not similarly support a link between antidepressant use and suicidality. The 95% credible interval for the overall OR ($e^\theta$) includes one (0.524, 4.267), and the probability that the overall OR is greater than one is 0.80.

Bayesian random-effects models

A Bayesian version of the FDA’s random effects analysis is obtained by fixing $\gamma_j = \theta$ for all $j$. This model allows for a between-study component of variance but treats the different study effects as homogeneous, regardless of their study-level characteristics (eg, drug formulation). Table 2 displays the results from this model for each of the data subsets. Inspection of the table indicates that the results of the Bayesian random-effects analysis are similar to the results of the FDA’s random-effects analysis.

In the subsets of studies that include MDD patients (subsets a, b and d), the probability the overall OR ($e^\theta$) is greater than one (no effect) is large and indicates strong evidence for an association between antidepressant use and suicidality. In the subset of studies that do not include patients with an MDD diagnosis (subset c), however, we do not find a significant association between antidepressant use and suicidality (95% credible interval for $e^\theta$: [0.363, 4.332]). Using Level 2 focused DIC to compare the fit of the Bayesian fixed- to the Bayesian random-effects models, we find that the DIC values for these models are within two points of each other (100.5 versus 100.2, 120.6 versus 120.5, 38.9 versus 39.5 and 157.4 versus 159.3) for each of the respective subsets of the data, suggesting that the two models fit the data equally well and that the between-study component of variance is surprisingly small.
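The comparison rule can be applied mechanically to the four pairs of DIC values quoted above. The sketch below is illustrative: the function name and verdict labels are hypothetical, the pairs are assumed to be listed as fixed-effects versus random-effects for subsets a through d, and gaps between two and three points are flagged as borderline because the stated guideline does not cover them.

```python
def compare_dic(dic_a, dic_b):
    """Apply the two-point and three-point rule of thumb to two DIC values."""
    gap = abs(dic_a - dic_b)
    if gap <= 2:
        return "no clear difference"
    if gap >= 3:
        return "model A fits better" if dic_a < dic_b else "model B fits better"
    return "borderline"   # gaps between 2 and 3 are not covered by the rule

# (fixed-effects DIC, random-effects DIC) by subset, as quoted in the text.
reported = {"a": (100.5, 100.2), "b": (120.6, 120.5),
            "c": (38.9, 39.5), "d": (157.4, 159.3)}
verdicts = {subset: compare_dic(*pair) for subset, pair in reported.items()}
```

All four pairs fall inside the two-point band, with subset d (a gap of 1.9) coming closest to the boundary.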

Bayesian four-level hierarchical model

We next investigate different sources of study-level variation (including drug formulation, drug class, study length and diagnosis) using the four-level hierarchical model described in Equations (1)–(6). We first applied a model in which the Level 3 categorization is drug formulation. Table 3 displays the results for the overall antidepressant effect. We see that the 95% credible intervals for the overall OR, $e^\theta$, based on the MDD studies (subset b) and based on all studies (subset d) do not include one ([1.107, 3.651] and [1.169, 3.600], respectively), indicating an increased risk of suicidality for children who use antidepressants. For the SSRI and non-MDD study subsets (a and c) the 95% credible intervals for the drug effect do contain one ([0.795, 3.245] and [0.244, 6.534], respectively). For subset a, the “SSRI for MDD” subset of the data, the probability that the overall OR $e^\theta$ is greater than one is greater than 93% with a mean odds ratio equal to $e^\theta = 1.78$.

Figure 2. Forest plot of studies included in the meta-analysis; studies in italics had no events in either the control or treatment arms. Note that the horizontal axis is presented on a log scale. CI denotes Confidence Interval, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder.

We present the Level 2 group sizes for each subset in Table 4. Table 5 presents the results for each of the subgroups in the four-level model where drug formulation is the Level 3 variable. The probability that the group OR $e^{\gamma_j}$ is greater than one is at least 91% for each of the individual drug formulations among the SSRI, MDD and all studies subsets (a, b and d). However, many of the 95% credible intervals did include one, and so there was not strong support for a positive association between each drug and an increased risk of suicide. Specifically, all of the 95% credible intervals for the

Table 2. Posterior summary measures for fixed effects models ($\sigma^2 = 0$, $\tau^2 = 0$) and random effects models ($\tau^2 = 0$)

| | Subset a: Studies of SSRIs for MDD | Subset b: Studies for MDD | Subset c: Studies for OCD, anxiety or ADHD | Subset d: All studies |
|---|---|---|---|---|
| Studies used in Bayesian (frequentist) estimation | 11 (11) | 16 (14) | 8 (6) | 24 (20) |
| Fixed effects: overall drug effect | | | | |
| Mantel–Haenszel odds ratio = $e^\theta$ (95% CI) | 1.690 **(1.026, 2.785)** | 1.962 **(1.225, 3.145)** | 2.185 (0.718, 6.648) | 1.995 **(1.293, 3.080)** |
| Bayesian odds ratio = $e^\theta$ (95% credible interval) | 1.725 (0.991, 2.770) | 2.073 **(1.266, 3.281)** | 1.724 (0.524, 4.267) | 2.189 **(1.374, 3.387)** |
| Pr(odds ratio > 1) = Pr($e^\theta$ > 1) | 0.974 | 0.999 | 0.804 | 0.999 |
| Random effects: overall drug effect | | | | |
| DerSimonian and Laird odds ratio = $e^\theta$ (95% CI) | 1.596 (0.941, 2.707) | 1.768 **(1.067, 2.932)** | 2.010 (0.575, 7.026) | 1.800 **(1.127, 2.877)** |
| Bayesian odds ratio = $e^\theta$ (95% credible interval) | 1.680 (0.904, 2.776) | 2.041 **(1.154, 3.323)** | 1.655 (0.363, 4.332) | 2.134 **(1.288, 3.350)** |
| Pr(odds ratio > 1) = Pr($e^\theta$ > 1) | 0.952 | 0.993 | 0.709 | 0.998 |

The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distribution for the between-study variance in the random effects model was an inverse gamma with parameters 0.001 and 0.001. SSRI denotes Selective Serotonin Reuptake Inhibitor, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder. Bold text denotes intervals that do not include one.

Table 3. Posterior summary measures for $\theta$ and $\tau^2$. Level 3: Drug formulation

Columns: Subset a (studies of SSRIs for MDD); Subset b (studies for MDD); Subset c (studies for OCD, anxiety or ADHD); Subset d (all studies)

Studies used in estimation: 11; 16; 8; 24

Subjects used in estimation: 2033; 3121; 1364; 4485

Odds ratio $e^\theta$ (95% credible interval)

2.123

(0.795, 3.245)

Pr(odds ratio > 1) = Pr($e^\theta$ > 1)

0.930

Mean of variance distribution, log odds ratio scale (95% credible interval)

0.167 0.120

(0.001, 1.300) (0.001, 0.933)

Overall drug effect, $\theta$

1.779 2.039

(0.244, 6.534)

2.176

(1.169, 3.600) (1.107, 3.651)

Overall drug effect, $\theta$

0.987 0.715 0.988

Between-formulation

Variance, $\tau^2$

0.948

(0.001, 8.490)

0.081

(0.001, 0.658)

The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distributions for the within- and between-formulation variances were inverse gamma with parameters 0.001 and 0.001. SSRI denotes Selective Serotonin Reuptake Inhibitor, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder. Bold text denotes intervals that do not include one.


SSRI subset (subset a) included one; the intervals for Zoloft, Remeron and Serzone included one for the MDD subset (subset b); and the intervals for Zoloft, Remeron, Serzone and Wellbutrin included one when all the studies were analyzed together (subset d). For subsets a, b and d, respectively, the posterior distributions for the Zoloft, Remeron and Wellbutrin effects had the least mass above one (no effect), but all were still above 90%. For the subset of OCD, anxiety and ADHD studies (subset c), we found little support for effects on suicidality of individual drug formulations; all of the credible intervals included one, including those for all of the SSRIs. We present complete Level 2 results in Table 5.

The posterior distribution of τ², the Level 3 variance or the variance of the group-level log OR (γ), shows that this variance may be quite small, with each of the 95% credible intervals having a lower bound of just 0.001 (see Table 3). Thus, this model is potentially not very different from the random and fixed effects models, where this component of variance is set to τ² = 0. However, the DIC values indicate that the extra component of variance does not overfit the data relative to the fixed and random effects models. In fact, the DIC values for the drug formulation model are consistently lower than those for the corresponding fixed and random effects models for all four subsets of the data. The DIC values for the other categorization models are not lower than the fixed effects model DIC, but are all within or nearly within the two-point equivalency range (see Table 6). Overall, the DIC values suggest that the four-level models do not diminish, and often improve, the fit of the model to the data as compared to the fixed and random effects models.

We next considered the model with the other Level 3 categorizations, ie, study characteristics: drug class, study length and psychiatric diagnosis; we present these results in Table 7. In each of these models, we found only moderate support for an association between antidepressant use and an increased risk of suicidality. All of the 95% credible intervals for the overall effect (e^μ) included one, yet there is about a 90% probability that the overall effect is above one for all three study-level classification models.
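A 95% credible interval that includes one and a posterior probability near 90% that the effect exceeds one are compatible summaries of the same posterior distribution. The minimal Python sketch below shows how both can be computed from posterior draws of a log odds ratio. The draws are hypothetical, simulated from a normal distribution chosen only for illustration, and are not the MCMC output of the models reported here. The function name `summarize_or` is likewise an assumption.

```python
import math
import random

def summarize_or(log_or_draws):
    """Summarize posterior draws of a log odds ratio on the odds ratio scale."""
    draws = sorted(log_or_draws)
    n = len(draws)
    lower = math.exp(draws[int(0.025 * n)])        # approximate 2.5% quantile
    upper = math.exp(draws[int(0.975 * n) - 1])    # approximate 97.5% quantile
    mean_or = sum(math.exp(d) for d in draws) / n  # posterior mean of the OR
    pr_gt_one = sum(d > 0 for d in draws) / n      # Pr(OR > 1) = Pr(log OR > 0)
    return mean_or, (lower, upper), pr_gt_one

# Hypothetical draws (NOT the paper's posterior): normal on the log odds ratio scale.
rng = random.Random(0)
draws = [rng.gauss(0.6, 0.47) for _ in range(20000)]
mean_or, (lower, upper), pr_gt_one = summarize_or(draws)
print(f"mean OR = {mean_or:.2f}, 95% interval = ({lower:.2f}, {upper:.2f}), "
      f"Pr(OR > 1) = {pr_gt_one:.2f}")
```

With these illustrative settings the interval straddles one while roughly nine draws in ten lie above it, the same pattern as the overall effects in the three study-level models.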

For the drug class variable, we see support for a link between SSRI class antidepressants and suicidality (95% credible interval for e^{γ_SSRI}: [1.305, 3.355]), whereas we do not see similar support for a link involving the atypical antidepressants (95% credible interval for e^{γ_Atypical}: [0.353, 3.651]). Similarly, we see support for an antidepressant-suicidality link among longer studies, but not shorter ones (95% credible intervals for e^{γ_≤8 weeks} and e^{γ_≥9 weeks} are [0.503, 3.377] and [1.315, 3.796], respectively). Further, the results of this analysis support an antidepressant-suicidality association among studies of MDD, but not among studies of other diagnoses (95% credible intervals for e^{γ_MDD} and e^{γ_OtherDiagnosis} are [1.290, 3.640] and [0.560, 3.380], respectively).

In Table 6, we compare the fit of these study-level variance components models with each other using Level 3 focused DIC. We did not find any one model fitting the data better than any other, since the DICs all fall within a two-point range from 153 to 155. We further compared each of these models to the fixed effects model via Level 2 focused DIC. Again, all but one of the models fell within the two-point range, indicating similar fit to the data. The one model that fell outside this range was the drug class model, which had an estimated DIC of 159.8, as compared to the fixed effects DIC of 157.4.
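The two-point comparison can be spelled out with a few lines of arithmetic. The Python sketch below is illustrative only. It uses the Level 2 focused DIC values for subset d as we read them from Table 6, treats a difference of at most two points from the fixed effects model as similar fit, and uses a hypothetical helper name, `compare_to_reference`.

```python
# Level 2 focused DIC values for subset d (all studies), as read from Table 6.
dic_level2 = {
    "Fixed effects": 157.4,
    "Random effects": 159.3,
    "Drug formulation": 156.8,
    "Drug class": 159.8,
    "Trial length": 159.4,
    "Diagnosis (MDD, OCD, anxiety, ADHD)": 158.3,
    "Diagnosis (MDD, Other)": 158.3,
}

def compare_to_reference(dic, reference="Fixed effects", tolerance=2.0):
    """Return {model: (DIC difference from the reference, within tolerance?)}."""
    ref = dic[reference]
    result = {}
    for model, value in dic.items():
        if model == reference:
            continue
        diff = round(value - ref, 1)  # DIC values are reported to one decimal
        result[model] = (diff, abs(diff) <= tolerance)
    return result

comparison = compare_to_reference(dic_level2)
for model, (diff, similar) in comparison.items():
    label = "similar fit" if similar else "outside two-point range"
    print(f"{model:38s} {diff:+.1f}  {label}")
```

Under this reading only the drug class model, 2.4 points above the fixed effects model, falls outside the range; the trial length model sits exactly on the two-point boundary.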

Sensitivity analysis

To investigate the sensitivity of our results to the specification of prior distributions, we first examined the influence of the Level 3 (τ², eg, between drug formulation) and Level 2 (σ², eg, within drug formulation) variances. Our approach is to fix the Level 2 and 3 variances, respectively, at different points within a plausible range of values and see how the conditional posterior distribution of the OR changes. If we see little change in the conditional posterior mean, we would conclude that our results are robust in the sense that they are not very sensitive to the specification of the prior variances.
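The idea of conditioning on a fixed variance can be sketched in a simplified setting. The Python example below is not the four-level model used here. It is a normal-normal approximation in which each group has a log odds ratio estimate `y[k]` with known sampling variance `s2[k]`, the group effects are drawn from N(μ, τ²), and μ has the normal prior with mean zero and standard deviation 3. All inputs are hypothetical, as is the function name `conditional_means`.

```python
import math

def conditional_means(y, s2, tau2, prior_mean=0.0, prior_sd=3.0):
    """Conditional posterior means given a fixed between-group variance tau2.

    Simplified model: y[k] ~ N(gamma_k, s2[k]), gamma_k ~ N(mu, tau2),
    mu ~ N(prior_mean, prior_sd**2).
    Returns (E[mu | y, tau2], [E[gamma_k | y, tau2] for each group]).
    """
    prior_prec = 1.0 / prior_sd ** 2
    weights = [1.0 / (v + tau2) for v in s2]  # marginal precision of each y[k]
    mu_hat = (prior_prec * prior_mean
              + sum(w * yk for w, yk in zip(weights, y))) / (prior_prec + sum(weights))
    gammas = []
    for yk, v in zip(y, s2):
        shrink = v / (v + tau2)               # weight placed on the overall mean
        gammas.append(shrink * mu_hat + (1.0 - shrink) * yk)
    return mu_hat, gammas

# Hypothetical group-level log OR estimates and sampling variances.
y = [0.80, 0.45, 0.30, 0.55]
s2 = [0.07, 0.30, 0.40, 0.35]
for tau2 in (0.0, 0.05, 0.2, 1.0):
    mu_hat, gammas = conditional_means(y, s2, tau2)
    group_ors = ", ".join(f"{math.exp(g):.2f}" for g in gammas)
    print(f"tau2 = {tau2:4.2f}: overall OR = {math.exp(mu_hat):.2f}, group ORs = {group_ors}")
```

In this sketch, setting τ² to zero forces every group effect onto the overall mean. Larger values let the group means spread apart and move the overall mean towards an equally weighted average of the groups.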


Table 4 Number of studies used in the model where the Level 3 variable is drug formulation

                              Subset a        Subset b       Subset c            Subset d
                              Studies of      Studies        Studies for OCD,    All studies
                              SSRIs for MDD   for MDD        anxiety or ADHD

Studies used in estimation
SSRI
  Celexa                      2               2              –                   2
  Luvox                       –               –              1                   1
  Paxil                       3               3              2                   5
  Prozac                      4               4              1                   5
  Zoloft                      2               2              1                   3
Atypical
  Effexor                     –               2              2                   4
  Remeron                     –               1              –                   1
  Serzone                     –               2              –                   2
  Wellbutrin                  –               –              1                   1
Total                         11              16             8                   24

Subjects used in estimation
Total                         2033            3121           1364                4485

SSRI denotes Selective Serotonin Reuptake Inhibitor, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder.


Table 5 Posterior summary measures for γs. Level 3: drug formulation

Odds ratio = e^γ (95% credible interval)

              Subset a                Subset b                Subset c                 Subset d
              Studies of SSRIs        Studies for MDD         Studies for OCD,         All studies
              for MDD                                         anxiety or ADHD
SSRI
  Celexa      1.787 (0.776, 3.364)    2.106 (1.002, 3.796)    –                        2.204 (1.088, 3.777)
  Luvox       –                       –                       3.658 (0.229, 15.302)    2.306 (1.059, 4.332)
  Paxil       1.796 (0.850, 3.360)    2.139 (1.074, 3.834)    2.450 (0.369, 8.697)     2.234 (1.127, 3.781)
  Prozac      1.879 (0.869, 3.456)    2.187 (1.157, 3.755)    5.573 (0.218, 9.806)     2.260 (1.192, 3.892)
  Zoloft      1.780 (0.736, 3.397)    2.127 (0.975, 3.877)    11.570 (0.073, 8.746)    2.119 (0.998, 3.728)
Atypical
  Effexor     –                       2.387 (1.141, 4.938)    1.919 (0.177, 6.801)     2.275 (1.120, 4.187)
  Remeron     –                       2.101 (0.699, 4.003)    –                        2.189 (0.884, 4.092)
  Serzone     –                       2.161 (0.707, 4.468)    –                        2.193 (0.884, 4.112)
  Wellbutrin  –                       –                       2.435 (0.071, 9.318)     2.187 (0.870, 4.108)

Pr(odds ratio > 1) = Pr(e^γ > 1)

              Subset a    Subset b    Subset c    Subset d
SSRI
  Celexa      0.922       0.975       –           0.980
  Luvox       –           –           0.763       0.980
  Paxil       0.934       0.984       0.782       0.994
  Prozac      0.952       0.989       0.715       0.992
  Zoloft      0.909       0.972       0.661       0.974
Atypical
  Effexor     –           0.945       0.658       0.984
  Remeron     –           0.944       –           0.965
  Serzone     –           –           –           0.962
  Wellbutrin  –           –           0.664       0.961

The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distributions for the within- and between-formulation variances were inverse gamma with parameters 0.001 and 0.001. SSRI denotes Selective Serotonin Reuptake Inhibitor, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder. Bold text denotes intervals that do not include one.


Table 6 DIC for various models

                                   Subset a        Subset b       Subset c            Subset d
                                   Studies of      Studies        Studies for OCD,    All studies
                                   SSRIs for MDD   for MDD        anxiety or ADHD
                                   Level 2         Level 2        Level 2             Level 2        Level 3
Model                              focused DIC     focused DIC    focused DIC         focused DIC    focused DIC
Fixed effects                      100.2           120.6          38.9                157.4          –
Random effects                     100.5           120.5          39.5                159.3          –
Level 3 = Drug formulation         96.9            117.4          38.2                156.8          155.0
Level 3 = Drug class               –               –              –                   159.8          153.4
Level 3 = Trial length             –               –              –                   159.4          153.7
Level 3 = Diagnosis                –               –              –                   158.3          154.1
  (MDD, OCD, anxiety, ADHD)
Level 3 = Diagnosis (MDD, Other)   –               –              –                   158.3          154.1

SSRI denotes Selective Serotonin Reuptake Inhibitor, MDD denotes Major Depressive Disorder, OCD denotes Obsessive-Compulsive Disorder, and ADHD denotes Attention Deficit Hyperactivity Disorder.

Table 7 Posterior summary measures for study-level models applied to all studies (subset d)

                        Mean between-group
                        variance, τ²                                        Mean OR
Level 3 variable        (95% credible interval)   Overall or subgroup       (95% credible interval)   Pr(OR > 1)
Drug class              5.0 (0, 20.4)             Overall, μ                2.568 (0.277, 6.482)      0.901
                                                  SSRI, γ_SSRI              2.210 (1.305, 3.550)      0.999
                                                  Atypical, γ_Atypical      1.793 (0.353, 3.651)      0.850
Study length            3.1 (0, 11.3)             Overall, μ                2.723 (0.503, 5.217)      0.931
                                                  ≤8 weeks, γ_≤8 weeks      1.901 (0.503, 3.377)      0.940
                                                  ≥9 weeks, γ_≥9 weeks      2.335 (1.315, 3.796)      0.999
Psychiatric diagnosis   4.5 (0, 16.2)             Overall, μ                2.978 (0.455, 5.775)      0.897
                                                  MDD, γ_MDD                2.264 (1.290, 3.640)      0.996
                                                  Other, γ_OtherDiagnosis   1.797 (0.560, 3.380)      0.874

The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distributions for the within- and between-group variances were inverse gamma with parameters 0.001 and 0.001. SSRI denotes Selective Serotonin Reuptake Inhibitor and MDD denotes Major Depressive Disorder. Bold text denotes intervals that do not include one.

We display the results of our sensitivity analysis using a plot suggested by DuMouchel [18]. Figure 3a shows the histogram of the marginal posterior distribution of the Level 3 variance (τ²) under the drug formulation model described in Equations (1)–(6). Figure 4a shows a similar histogram for the sample obtained from the model where the Level 3 variable was psychiatric diagnosis. Superimposed on these histograms are the means of the conditional posterior distribution of the OR (ie, e^μ) and each of the study-level category effects (ie, e^γs) conditional on τ², the between-category variance, which we have set to values between zero and one. We chose this range because it encompasses about 90% of the mass of the posterior distribution of the Level 3 variance (τ²). Note that τ² is the variance of the effects on the log scale.

As expected, as the between-Level 3 variance (τ²) increases, the conditional posterior mean of each category moves further from the overall mean. For example, in Figure 4a, when the between-diagnosis variance is set at 0.05, the conditional mean OR for the MDD group is 2.17 (log OR = γ_MDD = 0.777), as compared to 1.91 (log OR = γ_OCD = 0.645) for the OCD group, thus indicating that the MDD group is 1.14 times more at risk than the OCD group (on the odds ratio scale). However, when the between-diagnosis variance is increased to 0.2, the conditional mean OR for the MDD group is 2.20 (log OR = 0.790), as compared to 1.60 (log OR = 0.473) for the OCD group, thus increasing the risk differential to 1.37 times higher for the MDD group. This increase in risk differential is visually apparent from the diverging paths of the group means. Further, the overall mean effect (e^μ) falls between the individual group means (e^γs), where each group is weighted roughly equally. For example, in Figure 4a, even though the mean effect remains greater than two for the MDD group, the three other groups pull the overall mean effect down to near 1.5 when the Level 3 variance is set at 1.0; the overall mean effect is closer to the group of three than to the MDD studies.
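The risk differentials quoted for the MDD and OCD groups follow directly from the conditional mean log odds ratios: each OR is the exponential of its log OR, and the ratio of two ORs is the exponential of the difference. A short Python check of that arithmetic follows; the function name `odds_ratio_contrast` is illustrative.

```python
import math

def odds_ratio_contrast(log_or_a, log_or_b):
    """Return (OR_a, OR_b, OR_a / OR_b) from two log odds ratios."""
    or_a, or_b = math.exp(log_or_a), math.exp(log_or_b)
    return or_a, or_b, math.exp(log_or_a - log_or_b)

# (between-diagnosis variance, MDD log OR, OCD log OR) as quoted in the text.
for tau2, log_mdd, log_ocd in [(0.05, 0.777, 0.645), (0.2, 0.790, 0.473)]:
    or_mdd, or_ocd, ratio = odds_ratio_contrast(log_mdd, log_ocd)
    print(f"tau2 = {tau2}: MDD OR = {or_mdd:.2f}, OCD OR = {or_ocd:.2f}, ratio = {ratio:.2f}")
```

The two ratios come out at 1.14 and 1.37, matching the values given above.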

In Figures 3b and 4b, we examine how the conditional probability that an OR is greater than one changes with τ². Here the same divergent trend holds; as the between-Level 3 variance increases, the probability of an effect greater than one for each category moves further from the corresponding probability for the overall mean.

The effects of changes in σ², the within-Level 3 variance, are displayed in Figures 5 and 6. As the within-Level 3 variance increases, the shrinkage towards the prior distribution for the overall effect μ (centered at no effect, OR = 1) increases. The result is that as the within-Level 3 variance increases, the overall OR decreases toward one, ie, towards no effect (Figures 5a and 6a). The behavior of the probability that the OR is greater than one is similar under changes to the within-Level 3 variance as under changes in the between-Level 3 variance; the same forces are at work (Figures 5b and 6b).

We conclude from the variance sensitivity analysis for the drug formulation model that while the mean effects for Zoloft, Remeron, Serzone and Wellbutrin become less pronounced when a larger variance is imposed and the mean effect for Celexa remains close to the overall effect, the mean ORs for the other drugs remain greater than two as τ² increases. Further, while increasing the within-Level 3 variance (σ²) decreases the size of the overall effect, the variance must reach rather large values (such as 0.6) to pull the estimated probability that the overall OR is greater than one below 95%.

We reach similar conclusions of robustness to the prior for the psychiatric diagnosis model. The most pronounced feature in this model is that regardless of changes to the between- and within-Level 3 variances (τ² and σ²), the MDD group remained at an increased risk of suicidality (OR = e^{γ_MDD} ≈ 2.2, and the probability of an effect greater than one is greater than 95%), while the other groups quickly fell from being at significant risk as the between-group variance increased (falling below 95% probability of the effect being greater than one at τ² = 0.025 and remaining below 95% for all values of σ). Note that we found similar results for the sensitivity analysis of changes in the between- and within-Level 3 variances when the study-level variable was either study length or drug class (displays not shown).

Figure 3 Posterior distribution of between drug formulation variance (left axis) and changes in two measures (top and bottom panels) of the conditional posterior distribution of the odds ratio of suicidality for drug formulation subgroups, γs, and the overall effect, μ, given the prior distribution for the variance of the formulation log odds ratio of suicidality, or between formulation variance, τ², is set to a point mass at some value between zero and one (right axis). Panel A (top) displays changes in the mean of the conditional posterior distribution of the log odds ratio of suicidality displayed on an odds ratio scale and Panel B (bottom) displays changes in the probability the odds ratio of suicidality is greater than one. The models used for estimation defined Level 3 as drug formulation. The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distribution for the within-formulation variance was an inverse gamma with parameters 0.001 and 0.001.

We also found our model robust to changes in the prior mean and variance of the overall effect μ. Moving the prior mean from an OR of 1 (no effect) to 7.4 (strong effect) and moving the prior standard deviation from 3 to 0.7 changed the posterior probabilities of a drug effect only slightly. The conclusions of our analysis only change when we use a very strong prior (variance = 0.49) at a large effect (OR = 7.4, log OR = 2). For example, in Figure 7, we display 95% credible intervals for the overall effect e^μ and group effects e^{γ_MDD} and e^{γ_Other Diagnosis} under six different prior means and variances for μ. The credible interval for the overall OR, e^μ, only excludes one (no effect) when the prior has mean OR = 7.4 and variance 0.49. The MDD credible interval does not include one and the Other Diagnoses credible interval does include one for all of the prior distributions we examined.
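The way a prior on the overall log odds ratio interacts with the data can be illustrated with a conjugate normal update. The Python sketch below is a simplified stand-in for the hierarchical model, not a reproduction of it. The prior settings are the extremes named above (log OR of 0 or 2, standard deviation 3 or 0.7). The data summary, a log OR estimate of 0.60 with standard error 0.45, is hypothetical and chosen only for illustration, and the function names are assumptions.

```python
import math

def normal_update(prior_mean, prior_sd, estimate, se):
    """Conjugate normal update for a log odds ratio with known sampling variance."""
    prior_prec, data_prec = 1.0 / prior_sd ** 2, 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    return post_mean, math.sqrt(post_var)

def prob_or_above_one(mean, sd):
    """Pr(log OR > 0) under a normal posterior with the given mean and sd."""
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))

# Hypothetical data summary (NOT from the paper).
estimate, se = 0.60, 0.45
priors = [
    ("diffuse prior at OR = 1", 0.0, 3.0),
    ("diffuse prior at OR = 7.4", 2.0, 3.0),
    ("strong prior at OR = 1", 0.0, 0.7),
    ("strong prior at OR = 7.4", 2.0, 0.7),
]
results = {}
for label, prior_mean, prior_sd in priors:
    mean, sd = normal_update(prior_mean, prior_sd, estimate, se)
    lower, upper = math.exp(mean - 1.96 * sd), math.exp(mean + 1.96 * sd)
    results[label] = (lower, upper, prob_or_above_one(mean, sd))
    print(f"{label:26s} 95% interval = ({lower:.2f}, {upper:.2f}), "
          f"Pr(OR > 1) = {results[label][2]:.3f}")
```

With these illustrative inputs, only the strong prior centered at the large effect yields an interval that excludes one. That is the same qualitative pattern as reported for the overall effect, although the numbers themselves carry no evidential weight.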

Finally, we note that our exploration of the sensitivity of our results to single study influence via leave-one-out analyses showed that the exclusion of any single study did not change the conclusions of our analysis.
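A leave-one-out analysis repeats the synthesis once per study, each time with that study removed. The Python sketch below shows the mechanics with a simple inverse-variance (fixed effects) pooled estimate, swapped in for the Bayesian hierarchical model to keep the example short. The study-level log odds ratios and variances are hypothetical, as are the function names.

```python
import math

def pooled_log_or(estimates, variances):
    """Inverse-variance (fixed effects) pooled log OR and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

def leave_one_out(estimates, variances):
    """Re-pool after dropping each study in turn.

    Returns a list of (dropped index, pooled OR, lower 95% limit, upper 95% limit).
    """
    rows = []
    for i in range(len(estimates)):
        est = estimates[:i] + estimates[i + 1:]
        var = variances[:i] + variances[i + 1:]
        pooled, se = pooled_log_or(est, var)
        rows.append((i, math.exp(pooled),
                     math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)))
    return rows

# Hypothetical study-level log ORs and sampling variances (NOT the trial data).
log_ors = [0.9, 0.4, 0.7, 1.1, 0.5]
variances = [0.30, 0.25, 0.40, 0.50, 0.20]
rows = leave_one_out(log_ors, variances)
for i, or_hat, lower, upper in rows:
    print(f"without study {i}: OR = {or_hat:.2f} ({lower:.2f}, {upper:.2f})")
```

If no single omission moves the interval across one or changes the estimate materially, the pooled conclusion is not driven by any one study.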

In summary, although the general conclusions from our sensitivity analyses suggest that the results from the four-level hierarchical model are fairly robust, the robustness of the FDA’s meta-analysis is less clear. First, regarding the Level 3 between study-level variance component τ², recall that the FDA’s random-effects analysis is roughly equivalent to setting τ² to 0, ie, explicitly specifying no between study-level component of variation. As illustrated in Figure 4, we see that even a small change in τ² from 0 results in a relatively large change in the conditional posterior mean OR for the overall drug effect as well as for the specific category effect for diagnoses other than MDD. Similar results were observed when we considered the other Level 3 variables, study length and drug class. When we investigate the sensitivity of the overall drug effect in the context of the full hierarchical model for extreme ranges of the prior odds ratio and prior variance, the posterior 95% credible intervals for e^μ mostly include one, indicating no overall drug effect (see Figure 7). In particular, for the subset of studies where the diagnosis was not MDD, the 95% credible intervals all included one. The conclusion of an association between suicidality and antidepressant use in the MDD subset of studies, however, does appear to be robust to the prior and model specifications.

Figure 4 Posterior distribution of between psychiatric diagnosis variance (left axis) and changes in two measures (top and bottom panels) of the conditional posterior distribution of the odds ratio of suicidality for psychiatric diagnosis subgroups, γs, and the overall effect, μ, given the prior distribution for the variance of the diagnosis log odds ratio of suicidality, or between diagnosis variance, τ², is set to a point mass at some value between zero and one (right axis). Panel A (top) displays changes in the mean of the conditional posterior distribution of the log odds ratio of suicidality displayed on an odds ratio scale and Panel B (bottom) displays changes in the probability the odds ratio of suicidality is greater than one. The models used for estimation defined Level 3 as psychiatric diagnosis. The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distribution for the within-diagnosis variance was an inverse gamma with parameters 0.001 and 0.001.

Discussion

Using a Bayesian multilevel modeling approach for the synthesis of the clinical trial data collected by the FDA, we found an association between antidepressant use and an increased risk of suicidal behavior and/or ideation 1) among the group of studies where the diagnosis was major depressive disorder (MDD), and 2) among the group of studies where the antidepressant was an SSRI. However, we did not find evidence for such an association among studies where the diagnosis was not MDD or where the drug was an atypical antidepressant. These latter results have not been reported previously.

Figure 5 Posterior distribution of between drug formulation variance (left axis) and changes in two measures (top and bottom panels) of the conditional posterior distribution of the odds ratio of suicidality for drug formulation subgroups, γs, and the overall effect, μ, given the prior distribution for the variance of the study log odds ratio of suicidality within each formulation, or within formulation variance, σ², is set to a point mass at some value between zero and one (right axis). Panel A (top) displays changes in the mean of the conditional posterior distribution of the log odds ratio of suicidality displayed on an odds ratio scale and Panel B (bottom) displays changes in the probability the odds ratio of suicidality is greater than one. The models used for estimation defined Level 3 as drug formulation. The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distribution for the between-formulation variance was an inverse gamma with parameters 0.001 and 0.001.

The use of Bayesian hierarchical models for combining evidence across multiple studies allows greater flexibility in modeling for several reasons. First, the model allows us to assess multiple sources of variability in one unified probabilistic framework. Whereas in their meta-analysis the FDA made the strong assumption that all the primary studies had homogeneous effects regardless of study-level characteristics (eg, drug formulation or psychiatric diagnosis), in our reanalysis we did not make such an assumption. Rather, we explicitly allowed for multiple sources of variability because we do not want to make the strong assumption that the risk of suicidality is the same for children using antidepressants who have, for example, different psychiatric diagnoses, or who are exposed to different classes of antidepressants. We summarize the evidence about the outcome of interest using clinically meaningful measures based on the posterior distribution, such as the posterior probability of a drug effect.

Second, the Bayesian approach allows us to incorporate existing expert opinion and/or objective evidence based on historical data into the current analysis [see, eg, 19,20]. For example, it is our understanding that before the FDA began to examine possible associations between antidepressant use and suicidality, there was no consensus about such a relationship, particularly between antidepressant use and the risk of suicidal ideation and/or behavior. Initially, we incorporate this lack of consensus into our analysis by the use of a diffuse prior distribution with mean for the overall


Figure 6 Posterior distribution of between psychiatric diagnosis variance (left axis) and changes in two measures (top and bottom panels) of the conditional posterior distribution of the odds ratio of suicidality for psychiatric diagnosis subgroups, γs, and the overall effect, μ, given the prior distribution for the variance of the study log odds ratio of suicidality within each diagnosis, or within diagnosis variance, σ², is set to a point mass at some value between zero and one (right axis). Panel A (top) displays changes in the mean of the conditional posterior distribution of the log odds ratio of suicidality displayed on an odds ratio scale and Panel B (bottom) displays changes in the probability the odds ratio of suicidality is greater than one. The models used for estimation defined Level 3 as psychiatric diagnosis. The prior distribution for the overall drug effect had mean zero and standard deviation 3; the prior distribution for the between-formulation variance was an inverse gamma with parameters 0.001 and 0.001.