All content in this area was uploaded by Hannes Kröger on Apr 13, 2017
Logistic Confusion - An extended treatment on cross-group comparability of findings obtained from logistic regression

Hannes Kröger, German Institute for Economic Research (DIW), Berlin
hkroeger@diw.de
Jan Skopek, Trinity College Dublin
jan.skopek@tcd.ie
Working Paper; version April 13, 2017

Number of words: 12625
Number of figures: 4
Number of tables: 7

Table of contents

1 Introduction
1.1 Types of comparability
1.2 Latent variable versus natural categorical approaches
2 Comparability under the latent variable and natural categorical framework
2.1 Latent variable approach
2.1.1 Logit coefficients and odds-ratios
2.1.2 Average marginal effect
2.1.3 Standardized coefficients
2.1.4 A Monte-Carlo simulation study
2.2 Natural categorical dependent variables
2.2.1 Average marginal effects
2.2.2 Odds-ratios – bivariate models
2.2.3 Odds-ratios – the multivariate case
2.2.4 Risk ratio
2.3 Conditional and marginal interpretations of OR in multivariate models within the natural categorical framework
2.3.1 Conditional odds-ratios
2.3.2 Synthetic marginal odds-ratios using inverse probability weighting
2.3.3 Example of synthetic marginal comparison
3 Average marginal effects, risk ratios and odds ratios – united we understand
3.1 The complementary nature of AME, RR and OR
3.2 Example – educational attainment and intergenerational mobility
4 Conclusion
5 References
6 Appendix
6.1 S1 - A formal treatment of comparison
6.2 S2 – Additional tables and graphs

Abstract

Our paper discusses cross-group comparability of findings obtained from logistic regression in a systematic way. Recent methodological literature in sociology has pointed to serious pitfalls of logistic regression when it comes to the comparability of estimates between groups and samples. Whereas this critique is mainly driven by statistical concerns, we argue that comparability of findings depends essentially on the conceptual treatment of the outcome as either natural categorical or based on a latent variable approach. We demonstrate that the prevailing methodological skepticism about cross-group comparability of logistic regression is dominated by a latent variable perspective. In addition, we show that under the latent variable framework the use of average marginal effects for comparison across groups or samples is as unreliable as the comparison of (log) odds-ratios that has been the focus of criticism.

When we treat outcome variables as natural categorical, though, cross-group comparisons work differently, and the generalized claim that odds-ratios cannot be compared across groups does not hold. Furthermore, we argue that the crucial point is the preference in many sociological applications for marginal instead of conditional interpretations of effect estimates. Our paper proposes a procedure that allows estimating marginal odds-ratios that are adjusted for control variables and are comparable between groups. We also show that, in addition to odds ratios (OR) and average marginal effects (AME), the relative risk (RR) is another useful but largely underused metric for making comparisons. As they reflect different perspectives that are not simply exchangeable, we conclude that researchers should use AME, OR and RR jointly to evaluate findings obtained from logistic regression.

1 Introduction
The systematic comparison of observed regularities in populations is an essential part of research in the social sciences. The question behind most of this comparative work is whether associations between variables differ systematically across groups, time and space. In most circumstances, descriptive analysis, i.e. assessing whether patterns of association differ or not, precedes the identification of the causal processes generating these patterns.
As comparability of concepts is a central part of sociological inquiry, comparability of the statistical quantities reflecting relationships between these constructs is highly desirable. In sociological research, the comparability of regression coefficients from logistic regression has become the subject of intense debate. In our reading, the currently dominant view in sociology holds that, in contrast to linear regression, coefficients in logistic regression models (and in other non-linear models) cannot be directly compared across different samples, groups and models (Allison, 1999; Holm, Ejrnæs, & Karlson, 2014; Mood, 2010; Winship & Mare, 1984). In a seminal paper, Mood elaborates that the issue of comparability arises when one wants to interpret these coefficients as estimates of ‘substantial’ effects. Precisely, Mood (2010) asserts that it is problematic (1) to interpret odds ratios as substantive effects since they also reflect unobserved heterogeneity, (2) to compare odds ratios across nested models because unobserved heterogeneity is certain to vary across such models, and (3) to compare odds ratios from the same model across samples, groups, or over time because unobserved heterogeneity can vary across samples, groups, or time. We agree with point (2) and do not discuss the use of nested models, as we think they reflect a different kind of research logic (mediation vs. moderation) and are not covered by the argument we make here. The treatment of these models has been discussed thoroughly (e.g. Karlson, Holm, & Breen, 2010; Tchetgen Tchetgen, 2013).
However, in our paper we first argue that research practice has been generalizing issues (1) and (3) to settings in which the arguments elaborated by Mood (2010) are no longer valid or are highly contingent on the researcher’s agenda. A major problem underlying the discussion about comparability of logit coefficients is that no substantive criterion for comparability across groups is explicitly defined, although this is a logical prerequisite for asserting incomparability. To establish a conceptual scheme for comparison, we propose to use an old distinction for the classification of research problems, which treats a categorical dependent variable either (a) as a proxy measurement for an unmeasured latent construct or (b) as a natural categorical outcome per se¹ (Winship & Mare, 1983). Note that this distinction is of a genuinely theoretical nature and should not be guided by statistical reasoning alone. It is part of the epistemological orientation and theoretical considerations underpinning a particular empirical study. Important in this respect is that the comparability of the same statistical quantities depends on the theoretical framework, as we will show in this paper.
Using average marginal effects on the probability outcome (AME) has been emerging as a popular technical way of circumventing the issue of comparability of logit coefficients, resulting in a practice of abandoning regression coefficients entirely (see the recommendations by Mood (2010)). Yet, while this may be good advice in some circumstances, it may be problematic in others. Hence, secondly, our paper will systematically elaborate where the use of AME is appropriate and where it is not. Without preempting much of the later discussion, we argue that for research questions falling in category (a), latent variables, a reliance on AME is not a remedy for problems of comparability across groups. The recent methodological literature has been unclear on this issue.
Third, we argue that for research questions falling in category (b), natural categorical, several metrics such as AME, odds-ratios (OR) or log-odds ratios (logit coefficients), and also the rather seldom used risk ratio or relative risk (RR), are comparable across groups. Yet, whereas in bivariate models the odds-ratio has a straightforward marginal interpretation, in multivariate models the interpretation of conditional odds ratios is often more difficult, especially when comparing their size across groups. In our paper we propose a procedure that involves the estimation of synthetic marginal odds-ratios, which enables meaningful cross-group comparisons of odds ratios from multiple logistic regression. Our approach preserves both the odds-ratio and the marginal interpretation while allowing adjustments for covariates at the same time.

¹ The argument made in this paper can be generalized to the analysis of latent categorical outcomes measured by both categorical (latent classes) and continuous (latent profiles) indicators.

Fourth, we argue that under a natural categorical framework, a joint use of AME, OR and RR might often be the best practice for both reporting and interpretation of results. The quantities complement each other and yield insights into cross-group comparisons that are not obtainable if we focus on only one of them.

The rest of the paper is organized as follows. We first define two types of comparability, which we will consistently apply to the various metrics related to logistic regression discussed in the paper: comparability of size and comparability of sign. We then introduce the conceptual distinction between research questions focusing on natural categorical (NC) dependent variables and those with a latent variable (LV) framework in mind.

In the second section, we assess the comparability under the NC and LV frameworks of the (log) OR, RR, AME, and y*-standardized coefficients. For the LV framework we present results from a simulation study demonstrating that none of the commonly estimated quantities is comparable in size across groups without making very strong and usually non-testable assumptions. Furthermore, we provide examples of group comparisons within both the LV and NC frameworks. In the NC framework we introduce the concept of a synthetic marginal odds ratio (SMOR) as a quantity that might be easier to interpret and compare across groups for certain types of research questions.

In the third section we recommend that researchers do not limit themselves to reporting and interpreting just one of AME, RR or OR, even if theory seems to suggest that one of them fits the research purpose better. We give an example of interpretation and suggestions for reporting and presentation.

Our discussion is not aimed at criticizing previous methodological studies per se. The core arguments of the methodological debate are inherently valid and have been laid down impressively. Aiming to reduce confusion in applied social research dealing with categorical outcomes, this paper systematizes the debate on the comparability of logistic regression models across groups, points out weaknesses and misconceptions in the literature, and gives some practical suggestions on how to link different research questions to the various quantities estimated from logistic regression. Therefore, we conclude in the fourth section with an appeal for a closer link between theory and methodological application as well as more openness to different types of research questions and their methodological implementations.

1.1 Types of comparability
The first step towards a more systematic discussion of comparability is to reflect on and define what comparability means. While this might seem obvious at first glance, we will show that it is not obvious what comparability in logistic regression models means and that there are different legitimate answers to this question. It is also noteworthy that almost all studies referring to the argument in Mood (2010), or to a similar earlier version of the argument, do not define comparability, and many methodological articles do not give an explicit definition either (exceptions are usually studies explicitly referring to a latent variable model, such as Holm et al., 2014; Karlson et al., 2010). This includes work criticizing Mood or its reception (Buis 2016; Skopek 2016).
At the most general level, one can distinguish comparability of quantities in terms of sign and in terms of size. The first relates to our ability to assess and compare the direction of statistical effects (as based on a particular quantity), which can be positive, negative or zero. The second relates to our ability to assess and compare the size of statistical effects based on particular statistical quantities; it thus involves a quantification of differences across groups. Note that these types of comparability are so general in nature that they apply not only to logistic regression but also to other types of statistical estimation techniques. Although these distinctions are useful for the purpose of our paper, we do not claim that they are exhaustive or the only way in which comparability could be classified. Whereas comparability of sign usually does not represent any conceptual problem in the context of logistic regression, comparability of size does. Appendix S1 provides a more detailed and formalized elaboration on these two types of comparability.

1.2 Latent variable versus natural categorical approaches
When dealing with categorical dependent variables, empirical research should define whether a variable is treated as a natural categorical (NC) variable or as a variable representing manifestations of an underlying continuous latent variable (LV) that cannot be or is not directly observed. In the first case, one may be exclusively interested in which category a unit or individual is sorted into. From that vantage point, differences between the categories are manifest and/or have a substantive meaning or consequence. For instance, an A-level equivalent degree is needed in most countries to attend university and, consequently, having this degree or not, independent of the true abilities an individual possesses, bears important consequences for individuals' eligibility for admission to higher education. In many cases, we would also think that these categories exist in the real world beyond our research, although this does not need to be the case. Our theory and possible (social/causal) mechanisms would refer to the categories of the observed variable and how membership in these categories is determined, as well as what consequences membership in these categories might have for the subjects under study. Research questions under an NC framework would rely on the odds and probability scales, as the occurrence or non-occurrence of events and their respective probabilities or odds are of interest per se. Historically, this approach can be traced back to George Udny Yule (Yule, 1900, 1903), who believed that some variables (but not all) can be seen as inherently discrete classes, or natural categorical in our terms:
“[…], any one object must be held either to possess the attribute or not.” (Yule, 1911)
From this stance, the investigation of categorical dependent variables is rendered as a problem of classification.
In contrast to Yule, Karl Pearson (Pearson & Heron, 1913) advocated the view that associations of categorical variables are only reflections of underlying continuous distributions (Agresti, 2013). Based on Pearson’s idea, we can conceptualize categorical variables as manifestations of a latent variable operating in the background. It is not the membership in an observed category that we are ultimately interested in but rather what this membership implies for the score on the underlying latent variable. Individuals’ membership in one or another category is determined by their score on a latent (usually unobserved) variable, which we conceive as the substantive process under study that generates (observable) classificatory outcomes. For instance, individuals with good driving skills will be more likely to pass a driving test. Thus, in the absence of a measurement of driving skills we may use the categorical variable ‘having passed the driving test’ as a proxy for those skills. In an LV framework the conclusions we want to draw refer to the underlying concept that is relevant to our research question, not to the observed categories. Our theory would refer to the latent process underlying the observed variable and how it is (causally) shaped by other variables, or how it (causally) shapes other variables. Such a paradigm leads to research questions that are interested in quantities measured on the scale of the latent variable, or standardized forms of this scale, and not in the probability of the occurrence of any discrete event per se.
Note that while it is useful to make this distinction between a latent variable and a categorical approach, any single dependent variable can often be linked in a plausible way to both approaches (Winship & Mare, 1983, p. 56), depending on the theoretical framework, the mode of data collection and the research question that is chosen (see Table 1). For example, we might consider completing an A-level school degree as a naturally categorical variable, because we are motivated by the consequences implied by crossing the threshold from non-attainment to attainment (e.g. for future career and life chances). Alternatively, we could be interested in the educational performance that corresponds to the attainment of A-levels. In this case, it is not of major importance whether an individual is correctly classified by a model prediction as having or not having an A-level, as we want to draw conclusions about the latent variable, general educational performance or ability.

Table 1: Examples of dependent variables and how they might be categorized into the NC and LV frameworks

                       Categorized by the researcher    Categorized during the process under study
Conceptualized as…     Indicator    Concept             Indicator         Concept
Natural categorical    Obesity      Obesity             College degree    Educational attainment
Latent variable        Obesity      Weight              College degree    Academic performance

2 Comparability under the latent variable and natural categorical framework
2.1 Latent variable approach
Based on our definitions of comparability and the distinction between NC and LV research questions, we can assess which of the commonly used quantities in logistic regression reflect meaningful comparisons of concepts in the context of research questions related to NC or LV thinking. We begin by considering research questions adopting an LV approach. Similar to both early and recent studies (Allison, 1999; Breen, Holm, & Karlson, 2014; Mood, 2010; Winship & Mare, 1983) we assume that there is an underlying variable $y^*$ that is determined by a set of predictors and linked to the observed dichotomous $y$ in the following way:
$$y^* = \alpha + x_a \beta_a + \varepsilon \tag{1}$$
$$y = 1 \text{ if } y^* > 0$$
$$y = 0 \text{ if } y^* \le 0$$
An inherent problem of this approach is that the scale of $y^*$ is unknown and consequently the variance of $\varepsilon$ cannot be estimated but is set to a fixed value, 3.29 in the case of logistic regression. Therefore, we are not able to estimate the structural coefficient $\beta_a$, i.e. the coefficient on the scale of the latent variable, but only $\beta_a / \sigma_a = b_a$, the logit coefficient, a rescaled variant with $\sigma = \sqrt{\mathrm{var}(\varepsilon)/3.29}$ the scaling factor.
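The rescaling can be made explicit in a short derivation. It is a sketch under the added assumption that $\varepsilon = \sigma u$, where $u$ follows a standard logistic distribution with distribution function $\Lambda$ and variance $\pi^2/3 \approx 3.29$; group subscripts are dropped for readability.

```latex
\begin{aligned}
\Pr(y = 1 \mid x) &= \Pr(y^* > 0 \mid x) = \Pr\bigl(\sigma u > -(\alpha + x\beta)\bigr) \\
&= \Pr\!\left(u > -\frac{\alpha + x\beta}{\sigma}\right)
 = \Lambda\!\left(\frac{\alpha}{\sigma} + x\,\frac{\beta}{\sigma}\right)
\end{aligned}
```

The last step uses the symmetry of the logistic distribution. Under these assumptions the data identify only the ratios $\alpha/\sigma$ and $\beta/\sigma$: multiplying $\alpha$, $\beta$ and $\sigma$ by the same constant leaves every outcome probability unchanged.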
2.1.1 Logit coefficients and odds-ratios
Based on the latent variable model, logit coefficients are not generally comparable across groups as estimates of the effect on the latent scale without additional assumptions. If we take the expectation of the difference in the estimated logit coefficients, we do not obtain the true difference between the LV coefficients:
$$E(b_a - b_b) \neq \beta_a - \beta_b \tag{2}$$

If we assume $\sigma_a = \sigma_b$, logit coefficients are in fact comparable in size; otherwise they only have comparability of sign. Assuming equal unobserved heterogeneity, which implies $\mathrm{Var}(\varepsilon)_a = \mathrm{Var}(\varepsilon)_b$, is likely to be very unrealistic and hard to defend in many empirical settings (Mood, 2010).
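The point can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes one standard normal predictor, a logistic latent error, and illustrative values (a structural coefficient of 1 in both groups, scale factors 1 and 2). The helper names `simulate_group` and `fit_logit` are hypothetical, and the logit is fitted with a hand-written Newton-Raphson routine so that only the standard library is needed.

```python
import math
import random


def simulate_group(n, beta, sigma, alpha=0.0, rng=None):
    """Draw (x, y) with y = 1 if alpha + beta*x + sigma*u > 0, u standard logistic."""
    rng = rng or random.Random(0)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = rng.random()
        while p == 0.0:                 # avoid log(0) in the inverse CDF
            p = rng.random()
        u = math.log(p / (1.0 - p))     # standard logistic draw, variance pi^2/3
        xs.append(x)
        ys.append(1 if alpha + beta * x + sigma * u > 0 else 0)
    return xs, ys


def fit_logit(xs, ys, iterations=15):
    """Newton-Raphson for a logit with intercept a and slope b; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p                 # score for the intercept
            g1 += (y - p) * x           # score for the slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


rng = random.Random(42)
# Same structural coefficient (beta = 1), different unobserved heterogeneity.
_, b_a = fit_logit(*simulate_group(20000, beta=1.0, sigma=1.0, rng=rng))
_, b_b = fit_logit(*simulate_group(20000, beta=1.0, sigma=2.0, rng=rng))
print(round(b_a, 2), round(b_b, 2))
```

In this setup the estimated logit coefficients should come out close to 1 and 0.5, that is, close to the structural coefficient divided by the scale factor, even though the structural coefficients are identical in both groups.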
!
2.1.2 Average marginal effect
In Mood's paper itself, and in the literature that builds on it, it is claimed that marginal effects are not affected (or less affected) by the problem of unobserved heterogeneity. Therefore, the use of AME is recommended as one possible solution to the comparability problem. However, based on the previous distinction of approaches, we need to reconsider whether AME are indeed comparable under an LV scenario.
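$$E\left(\frac{1}{N_a}\sum_{i=1}^{N_a}\frac{\beta_a}{\sigma_a}\,f\!\left(\alpha_a + x_{ai}\frac{\beta_a}{\sigma_a}\right) - \frac{1}{N_b}\sum_{i=1}^{N_b}\frac{\beta_b}{\sigma_b}\,f\!\left(\alpha_b + x_{bi}\frac{\beta_b}{\sigma_b}\right)\right) \neq \beta_a - \beta_b \tag{3}$$
Here $f(\cdot)$ denotes the density of the standard logistic distribution.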
!
The term above only equals the difference in structural coefficients of the latent variable equation if 1) the truncation of the latent variable ($\alpha$), 2) the degree of unobserved heterogeneity ($\sigma$), and 3) the distribution of the independent variable and all covariates are equal (different moments of the distribution: $E(x)$, $E(x^2)$, $E(x^3)$). Usually, we cannot assert this in applied social science research (Holm et al., 2014).
We can also get an intuitive understanding of why a comparison of AMEs cannot be immune to unobserved heterogeneity if they are to represent structural coefficients in the latent variable model. If there is very little unobserved heterogeneity (UH) and a given structural effect on the LV, we would expect a rather strong AME in the observed model. If in another group many other factors influence the outcome, we might have a large amount of UH. As a result, most variation in the observed dichotomous outcome is not due to the predictor. In other words, discrimination among categorical outcomes based on the predictor becomes increasingly weak. As UH approaches infinity, assignment to the outcome based on the predictor would effectively be random and there would be no probability difference (AME) between different levels of the predictor variable. Figure 1 illustrates the sensitivity of the AME to UH for a less extreme case. The structural latent variable coefficient is the same in each of the groups, but the UH increases for each group from left to right. This leads to a higher dispersion of the latent variable score, and the overlap of the observed dichotomous indicator $y$ becomes larger, so that both OR and AME become smaller in groups with higher UH. We can see that the $y^*$-standardized coefficient also becomes smaller. This reflects the reduction in explanatory power of predictor x that we see from left to right and that can be seen in the LV model using the $R^2$.
The conclusion that AME is not comparable across groups if the latent variable coefficient is of interest corroborates the finding of Holm et al. (2014), who show that the coefficient from a linear probability model is not comparable across groups if the latent variable coefficient is the point of reference for the comparison.
AME does retain comparability of sign but is, in essence, no more helpful than the logit coefficient for cross-group comparisons within the LV framework. On the contrary, the assumptions underpinning comparability are much more complex for the AME than for the logit coefficient. Summing up, we can identify differences in the sign of an association with any of these quantities, but not more.
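The dependence of all three quantities on unobserved heterogeneity can also be computed directly rather than simulated. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: the predictor is standard normal, the intercept is zero, the latent error is logistic with scale factor `s`, and the function name `implied_quantities` is hypothetical. It returns the population odds ratio, the AME (by numerical integration over the distribution of x) and the y*-standardized coefficient.

```python
import math

LOGISTIC_VAR = math.pi ** 2 / 3   # variance of the standard logistic distribution


def logistic_pdf(z):
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def implied_quantities(beta, s, grid=4001, lim=8.0):
    """Population OR, AME and y*-standardized coefficient for x ~ N(0, 1)."""
    b = beta / s                               # logit coefficient implied by beta and s
    step = 2.0 * lim / (grid - 1)
    # AME = E_x[b * f(b*x)], approximated by a Riemann sum over the normal density
    ame = sum(
        b * logistic_pdf(b * x) * normal_pdf(x) * step
        for x in (-lim + i * step for i in range(grid))
    )
    y_std = b / math.sqrt(b * b + LOGISTIC_VAR)   # Var(x) = 1
    return {"OR": math.exp(b), "AME": ame, "y_std": y_std}


low_uh = implied_quantities(beta=3.0, s=1.0)
high_uh = implied_quantities(beta=3.0, s=4.0)
for label, q in (("s=1", low_uh), ("s=4", high_uh)):
    print(label, {k: round(v, 2) for k, v in q.items()})
```

With the structural coefficient held at 3, raising the scale factor from 1 to 4 lowers the odds ratio, the AME (from roughly 0.34 to roughly 0.17) and the y*-standardized coefficient alike, which is the pattern the argument above describes.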
Why did the misconception arise that AME might be immune to unobserved heterogeneity? At a superficial glance, such a claim can also be found in Wooldridge (2002, pp. 471–472). However, he argues that within one model, the degree of unobserved heterogeneity does not affect the estimation of the marginal effect. This refers to the inclusion or exclusion of (unrelated) variables, which implies a comparison of nested models that are all special cases of a more general model. There is no claim that the AME is a useful approximation for comparing latent variable coefficients across groups. On the contrary, he notes:
“The bottom line is that, except in cases where the magnitudes of the $\beta_j$ in equation (15.34) have some meaning, omitted heterogeneity in probit models is not a problem.” [emphasis added on “have some meaning”] (Wooldridge, 2002, p. 471)
Within an LV framework, the meaningfulness of the coefficients on the LV scale is exactly the main assumption and the distinction from the NC framework.
In Mood’s (2010) analysis, the simulations demonstrating the robustness of AME against unobserved heterogeneity are done in the context of nested models, not in the context of comparisons across groups. Yet these results were eventually generalized in the conclusion to hold true also for comparisons across groups (Mood, 2010, p. 80). Under the latent variable framework that Mood obviously adopts in the first place, this claim is not accurate. In our view, this important detail got lost in the reception of the argument, and the undifferentiated claim that AME are less affected or unaffected by unobserved heterogeneity resonated in subsequent research.
In contrast, our proposed distinction between the natural categorical and the latent variable framework can contribute to the discussion, as it makes it easier to distinguish in which cases AME is comparable and in which it is not. It is readily comparable as a quantity that estimates (conditional) absolute probability differences, but it is only comparable in size for latent variable coefficients to the degree that very strong and usually untestable assumptions hold.

Figure 1 Simulated data example: Latent variable and binary outcomes for four groups.

[Figure: one set of panels plots the latent variable y* against x, a second set plots the observed binary outcome Y against x, each for Groups 00, 10, 01 and 11.]
Group 00: R2 = .73, b* = 3; OR = 19.31, AME = 0.34, y*std = .85
Group 10: R2 = .17, b* = 3; OR = 2.22, AME = 0.17, y*std = .4
Group 01: R2 = .04, b* = 3; OR = 1.35, AME = 0.07, y*std = .16
Group 11: R2 = .02, b* = 3; OR = 1.14, AME = 0.03, y*std = .07

Note: Underlying model is $y^* = 3x + \varepsilon$ (Model 1) for both groups with $\sigma_\varepsilon^2 = s^2\pi^2/3$, and scaling factor s=1 for Group A and s=4 for Group B. Variable x is normally distributed with mean 0 and variance 1, identically for both groups. Groups share the same effect of x ($\beta = 3$) on the latent variable $y^*$ but differ in the residual heterogeneity as expressed by the $R^2$ (Model 1). The larger heterogeneity of Group B translates into a smaller logit coefficient and, thus, a smaller odds ratio in the logistic regression model (Model 2).

2.1.3 Standardized coefficients
If we use y-standardized coefficients, we have a metric on the scale of the SD of the y* variable. Given that the assumption about the distribution of the error term (logistic distribution) is fulfilled, it can be shown that (Breen et al., 2014):
!
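$$E\left(\frac{b_a}{\sqrt{b_a^2\,\mathrm{var}(x_a) + \pi^2/3}} - \frac{b_b}{\sqrt{b_b^2\,\mathrm{var}(x_b) + \pi^2/3}}\right) = \frac{\beta_a \cdot \mathrm{sd}(x_a)}{\mathrm{sd}(y_a^*)} - \frac{\beta_b \cdot \mathrm{sd}(x_b)}{\mathrm{sd}(y_b^*)} \tag{4}$$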
!
However, for the absolute difference on the latent variable scale, the standardized coefficients are also not exactly comparable (Duncan, 1975):
!
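$$E\left(\frac{b_a}{\sqrt{b_a^2\,\mathrm{var}(x_a) + \pi^2/3}} - \frac{b_b}{\sqrt{b_b^2\,\mathrm{var}(x_b) + \pi^2/3}}\right) \neq \beta_a - \beta_b \tag{5}$$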
!
The inequality results from the fact that the model is under-identified if the scale is not arbitrarily fixed. Put differently, there is a lack of information that can only be resolved by changing assumptions (as in heterogeneous choice models) or by gathering additional information such as repeated measurements.
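A short worked example may make this concrete. The values are illustrative and not taken from the paper: assume equal structural coefficients $\beta_a = \beta_b = 1$, $\mathrm{var}(x) = 1$ in both groups, and scale factors $\sigma_a = 1.5$ and $\sigma_b = 1$, so that $\mathrm{var}(y^*) = \beta^2\,\mathrm{var}(x) + \sigma^2\pi^2/3$.

```latex
\begin{aligned}
\text{Group A:}\quad \frac{\beta_a}{\mathrm{sd}(y_a^*)}
  &= \frac{1}{\sqrt{1 + 1.5^2 \cdot \pi^2/3}} = \frac{1}{\sqrt{8.40}} \approx 0.345 \\
\text{Group B:}\quad \frac{\beta_b}{\mathrm{sd}(y_b^*)}
  &= \frac{1}{\sqrt{1 + \pi^2/3}} = \frac{1}{\sqrt{4.29}} \approx 0.483
\end{aligned}
```

Under these assumptions the standardized coefficients differ by about 0.14 although the structural coefficients are identical, because the additional unobserved heterogeneity in group A inflates the standard deviation of the latent variable.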
!
2.1.4 A Monte-Carlo simulation study
To demonstrate that our claims for the LV framework hold despite different perceptions in applied research, we conducted a Monte-Carlo simulation comparing the performance of logit coefficients, relative risk from a log-binomial model, AME from logistic regression, standardized coefficients and estimates from a linear probability model (LPM) in estimating the difference in the structural coefficients across two groups. The simulation study is similar in design to (2014).
In the simulation study the structural coefficients in the latent variable model are the same in both groups ($\beta_a$, $\beta_b$), set to be 1. The error term in the latent variable is constructed to follow a logistic distribution, in line with the assumption of logistic regression. Group B is the reference group, for which the threshold ($\alpha_b$) is set to 0 and the scale parameter to 1 ($\sigma_b$, with $\mathrm{var}(\varepsilon_b) = \sigma_b^2\,\pi^2/3$). We vary the threshold for group A ($\alpha_a$) as well as the scale parameter ($\sigma_a$). The threshold represents differences in response behavior or in the conversion of the underlying score into the observed variable. It reflects how high the latent score has to be in a group for individuals to get a positive result on the observed indicator. For example, it has been argued that in certain countries (e.g. Germany) true health has to be markedly higher than in other countries (e.g. Sweden) to achieve a subjective response of good or very good on a five-point subjective health scale (Jürges, 2007). If the intercept is higher it takes more of the underlying score to get a positive observed outcome. The scale parameter represents the degree of unobserved heterogeneity, the key issue discussed in the literature so far. A larger scale factor implies a higher degree of unobserved heterogeneity, meaning more, or more influential, unmodeled factors that determine the latent variable. We conduct 10,000 replications and estimate the average degree of bias in the estimation of the difference between the two coefficients.
The true models are:
Group A: $y_a^* = \alpha_a + \beta_a \cdot x_a + \varepsilon_a$
Group B: $y_b^* = \alpha_b + \beta_b \cdot x_b + \varepsilon_b$
The bias ($B$) is calculated as the difference between the average estimates across $J$ simulations, in proportion to the average estimate of the reference group B. This proportion is used because the scale of the coefficients is arbitrary and does not reflect the scale of the underlying latent variable, whereas their relative size can be captured:
$$B = \frac{\sum_{j=1}^{J} b_{aj} - \sum_{j=1}^{J} b_{bj}}{\sum_{j=1}^{J} b_{bj}}$$
If the quantities under study are comparable in the sense that they represent the difference in the coefficients in the underlying latent variable model, the bias should be close to zero. A bias of 0.5 would indicate that the average difference between the two quantities is 50% of the size of the reference group's estimate when it should be zero.
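The bias measure described above can be sketched in a few lines of Python. This is only an illustration: the function name `relative_bias` and the example estimates are hypothetical and are not taken from the simulation study itself.

```python
def relative_bias(estimates_a, estimates_b):
    """Relative difference between the summed estimates of group A and of
    reference group B. Using sums or averages gives the same result,
    because both groups have the same number of replications."""
    if len(estimates_a) != len(estimates_b):
        raise ValueError("both groups need the same number of replications")
    total_a = sum(estimates_a)
    total_b = sum(estimates_b)
    return (total_a - total_b) / total_b


# Hypothetical estimates from three replications (illustrative numbers only).
example_bias = relative_bias([1.4, 1.5, 1.6], [0.9, 1.0, 1.1])
print(round(example_bias, 2))
```

With these made-up numbers the estimates in group A are on average 50% larger than in group B, which corresponds to a bias of 0.5 in the sense used in the text.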
!
Table 2 and Table 3 show the results from the simulation study. We can see that differences in truncation (i.e. in the threshold) lead to strong bias in log-binomial models and a small degree of bias using LPM and AME based on logistic regression. Interestingly, logit coefficients seem to be largely unaffected by variation in truncation.
However, looking at bias due to unobserved heterogeneity, we can see that the bias increases with the difference in unobserved heterogeneity. A scale factor of 1.5 translates into a 2.25 times higher variance in group A than in group B. This is a substantial increase in unobserved factors influencing the outcome, but might still reflect certain situations in applied research, for example in comparing labor market outcomes between men and women. The amount of bias is very similar for all quantities; the bias of the (log) OR is somewhat larger than the rest.
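The factor of 2.25 follows from the variance of the logistic distribution. As a short worked step, assuming the error term in group $g$ is logistic with scale $\sigma_g$ (the subscript notation is ours for illustration):

```latex
\mathrm{Var}(\varepsilon_g) = \frac{\pi^2}{3}\,\sigma_g^2
\quad\Longrightarrow\quad
\frac{\mathrm{Var}(\varepsilon_A)}{\mathrm{Var}(\varepsilon_B)}
  = \frac{\sigma_A^2}{\sigma_B^2}
  = \frac{1.5^2}{1^2}
  = 2.25
```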
Based on this restricted set of scenarios we can conclude the following. First, contrary to common belief, AME and LPM are not immune to differences in unobserved heterogeneity between groups and do not even necessarily perform better than logit coefficients. Regarding differences in threshold, logit coefficients actually performed best. Second, RR perform poorly for both differences in threshold and unobserved heterogeneity.
!

Table 2: Degree of bias in estimation of difference in LV coefficient dependent on truncation

| Threshold of group A | Quantity | 100 observations | 1000 observations | 5000 observations |
|---|---|---|---|---|
| $\alpha_A = 0$ | log(RR) | 0.01 | -0.00 | 0.00 |
| | log(OR) | 0.00 | -0.00 | 0.00 |
| | AME | 0.00 | -0.00 | 0.00 |
| | LPM | 0.00 | -0.00 | 0.00 |
| $\alpha_A = 0.25$ | log(RR) | -0.06 | -0.07 | -0.07 |
| | log(OR) | 0.00 | 0.00 | -0.00 |
| | AME | 0.01 | 0.00 | 0.00 |
| | LPM | 0.01 | 0.00 | 0.00 |
| $\alpha_A = 1$ | log(RR) | -0.22 | -0.22 | -0.22 |
| | log(OR) | 0.01 | 0.00 | 0.00 |
| | AME | 0.09 | 0.08 | 0.08 |
| | LPM | 0.09 | 0.08 | 0.08 |

Table 3: Degree of bias in estimation of difference in LV coefficient dependent on the degree of unobserved heterogeneity

| Scale parameter of group A | Quantity | 100 observations | 1000 observations | 5000 observations |
|---|---|---|---|---|
| $\sigma_A = 1$ | log(RR) | 0.01 | -0.00 | 0.00 |
| | log(OR) | 0.00 | -0.00 | 0.00 |
| | AME | 0.00 | -0.00 | 0.00 |
| | LPM | 0.00 | -0.00 | 0.00 |
| $\sigma_A = 1.1$ | log(RR) | 0.08 | 0.09 | 0.09 |
| | log(OR) | 0.10 | 0.10 | 0.10 |
| | AME | 0.08 | 0.09 | 0.09 |
| | LPM | 0.08 | 0.09 | 0.09 |
| $\sigma_A = 1.5$ | log(RR) | 0.43 | 0.43 | 0.43 |
| | log(OR) | 0.49 | 0.50 | 0.50 |
| | AME | 0.43 | 0.43 | 0.43 |
| | LPM | 0.43 | 0.43 | 0.43 |
!
Table 4: Comparability of quantities within the latent variable framework

| Quantity | Comparability: bivariate | Comparability: multivariate | Interpretation | Note |
|---|---|---|---|---|
| Odds-ratio | Sign | Sign | Rescaled LV coefficients | Log-odds more useful than OR, limited interpretation |
| AME/LPM | Sign | Sign | Complexly rescaled LV coefficients | Not immune to heterogeneity, limited interpretation |
| RR | Sign | Sign | Complexly rescaled LV coefficients | Especially sensitive to differences in truncation, limited interpretation |
| y*-std | Sign (standardized size) | Sign (standardized size) | Differences std. by distribution of LV, rank or relative inequalities | Useful for many purposes; see comparisons of intra-class correlations in ML modelling, or standardized coefficients in SEM literature; does not identify absolute differences in structural LV coefficients |
!
2.2 Natural categorical dependent variables
This section considers comparability if interest is in the categories of the dependent variable as such, instead of treating them as manifest values of an unobserved latent variable. Three different scales are commonly applied in such settings, although others are surely imaginable. We will consider the additive (or absolute) probability scale, the multiplicative (or relative) probability scale, and the odds scale, which is always multiplicative.
In general, instead of assuming the observed outcome to be an imperfect measurement of an underlying latent construct, we assume that outcomes are determined in such a way that the probability of event occurrence is a function of the predictors (Winship & Mare, 1983, p. 61):
$$P(y = 1) = G(\alpha + x\beta) \qquad (6)$$
$G$ is the cumulative distribution function of a probability distribution, in our case the logistic function. $\alpha$ and $\beta$ are model parameters to be estimated.
2.2.1 Average marginal effect
Using the additive probability scale we are interested in absolute probability differences and whether these are smaller or larger between the comparison groups. A possible research question that would require this scale could be: Are the absolute inequalities in tertiary education (as measured by the difference in absolute rates of tertiary education attainment by social origin) larger in countries with a higher proportion of tertiary graduates (Triventi, 2013)? Based on this research question we would like to compare the difference in the probability of educational attainment ($y$) by levels of the predictor (parental education) $x$. The AME does exactly that (see Wooldridge, 2002, p. 471).
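
$$\frac{1}{n_A}\sum_{i=1}^{n_A}\beta_A\, g(\alpha + x_i\beta_A) - \frac{1}{n_B}\sum_{i=1}^{n_B}\beta_B\, g(\alpha + x_i\beta_B) = \frac{\partial P(y)_A}{\partial x_A} - \frac{\partial P(y)_B}{\partial x_B} \qquad (7)$$
Here $g$ denotes the first derivative (density) of $G$.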
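As a rough illustration of the averaging involved in the AME, the following Python sketch evaluates the average of $\beta\, g(\alpha + x\beta)$ over observed predictor values for a logistic $G$. The coefficient values and predictor values are hypothetical, and the function names are ours; in practice the coefficients would come from a fitted logistic regression in each group.

```python
import math


def logistic_cdf(z):
    """G: cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """g: first derivative of G, g(z) = G(z) * (1 - G(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def average_marginal_effect(alpha, beta, xs):
    """Average of beta * g(alpha + x * beta) over the observed x values."""
    return sum(beta * logistic_pdf(alpha + x * beta) for x in xs) / len(xs)


# Hypothetical coefficients and predictor values for two groups.
xs = [0.0, 0.5, 1.0, 1.5]
ame_a = average_marginal_effect(alpha=-0.5, beta=1.0, xs=xs)
ame_b = average_marginal_effect(alpha=-0.5, beta=0.6, xs=xs)
ame_difference = ame_a - ame_b  # group difference in AME
print(ame_a, ame_b, ame_difference)
```

The difference `ame_difference` is expressed on the absolute probability scale, which is what the research question on absolute inequalities asks for.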
2.2.2 Odds-ratios – bivariate models
We could move on and modify the research question by asking: Are relative inequalities in educational attainment (the difference in relative rates of tertiary education enrolment by social origin) larger in countries with a higher proportion of tertiary graduates? This research question requires the odds scale, and it holds that OR are comparable in size across groups. Note that the OR in a bivariate logistic regression model is often called the marginal odds-ratio, as it represents the OR that would be calculated from a marginal table (Loux, Drake, & Smith-Gagen, 2014) if the predictor was categorical as well.
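$$\frac{OR_A}{OR_B} = \frac{\partial\left[\dfrac{P(y_A = 1)}{P(y_A = 0)}\right] \Big/ \partial x_A}{\partial\left[\dfrac{P(y_B = 1)}{P(y_B = 0)}\right] \Big/ \partial x_B} \qquad (8)$$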
The statement that marginal OR are comparable in size across groups might seem at odds with many statements found in the literature. However, we think that most scholars would agree that within the frame of the research question the statement holds true. For the bivariate case, logic dictates that if predicted probabilities and their differences (AME) are comparable, so are odds ratios (OR) derived from those probabilities. The multivariate case is more complex and will be discussed next.
2.2.3 Odds-ratios – the multivariate case
When stating that ORs cannot be compared across groups, one needs to be more specific and state that: A comparison of the magnitude of OR across groups in a multivariate model does not necessarily reflect the difference in the marginal chance of event occurrence. This means that a higher conditional effect on odds in group A than in group B does not necessarily mean that the effect in the population (the marginal effect) will also be larger in group A than in group B.
However, the conditional effect can be compared in size. It is only because in practice we are most often interested in the marginal change (in odds, or probability) that we come to say that we cannot compare OR in size across groups. The non-equivalence of conditional and marginal odds-ratios in multivariate models is therefore translated into a generalized statement of non-comparability (which is sometimes correct, but not always).
2.2.4 Risk ratio
A third research question that might arise has a slightly different focus: Are the relative inequalities in tertiary education larger in countries with a higher proportion of tertiary graduates? In contrast to the first research question this addresses relative inequalities, but on the probability scale instead of the odds scale. One way to achieve the relative interpretation is to take the quotient of probabilities between two groups or levels of a predictor. If we then take the log of this quotient, we can see that a multiplicative prediction on the probability scale is like predicting additively on the log-probability scale. The quotient of the probabilities is known as the relative risk or risk ratio, commonly used in epidemiology. We can get a direct estimate of the risk ratio from a log-binomial or a Poisson model (Cummings, 2009; Gail, Wieand, & Piantadosi, 1984) and get comparability in size for this type of research question [2]. Using a logistic regression model, only indirect methods are available for estimating the risk ratio. First, with low baseline levels of the outcome (prevalence) the odds-ratio approximates the risk ratio (usually for cases with a baseline probability of less than 0.1). Second, we can either estimate two predicted probabilities and take their ratio for dichotomous predictors, or estimate the marginal effect and divide it by the average probability of success.
$$\frac{RR_A}{RR_B} = \frac{P(y_A = 1 \mid x_A = 1)\,/\,P(y_A = 1 \mid x_A = 0)}{P(y_B = 1 \mid x_B = 1)\,/\,P(y_B = 1 \mid x_B = 0)} \qquad (9)$$
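The claim that the odds-ratio approximates the risk ratio only at low baseline probabilities can be checked with simple arithmetic. The sketch below uses hypothetical probabilities chosen so that the risk ratio is 2 in both scenarios; the function names are ours.

```python
def risk_ratio(p1, p0):
    """Ratio of two probabilities."""
    return p1 / p0


def odds_ratio(p1, p0):
    """Ratio of the odds implied by two probabilities."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Hypothetical scenarios: the probability doubles in both cases (RR = 2).
low_baseline = (risk_ratio(0.10, 0.05), odds_ratio(0.10, 0.05))
high_baseline = (risk_ratio(0.80, 0.40), odds_ratio(0.80, 0.40))
print(low_baseline)   # OR stays close to the RR of 2
print(high_baseline)  # OR is far larger than the RR of 2
```

With a baseline probability of 0.05 the odds-ratio is about 2.1, close to the risk ratio of 2, whereas with a baseline of 0.40 the same risk ratio corresponds to an odds-ratio of about 6.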

[2] The analytical proof of collapsibility of RR (Gail et al., 1984, p. 437; Neuhaus & Jewell, 1993, p. 812) corroborates the simulation results of Norton (2012), who claims RR is unaffected by (uncorrelated) unobserved heterogeneity.

2.3 Conditional and marginal interpretations of OR in multivariate models within the natural categorical framework
In many situations in applied research we might want to compare effects conditional on a set of covariates. In logistic regression this complicates the interpretation of OR, but not the interpretation of the average marginal effect or RR. The reason is that AME and RR retain a marginal interpretation when covariates are introduced into the equation, while OR does not. However, one should resist the temptation of choosing a research question based on the convenience of interpreting certain quantities. If we want to know the development of educational inequalities over cohorts after accounting for achievement as measured by test scores, it is unsound to change our whole research interest to absolute differences, reflecting absolute inequalities, only because it is easier for us to interpret and compare ME in multivariate models than it is to interpret OR. For OR, we take two different perspectives that have been seen as a problem of logistic regression, because we want to apply linear logic in non-linear (multiplicative) models. We suggest a practical approach that might be useful for research questions that aim at relative chances, but want to adjust for other covariates and compare these estimates with a marginal interpretation across models.
The important distinction is between a conditional estimate and a marginal estimate, or whether a quantity is collapsible. A conditional effect estimate shows the association at a certain level of the covariates. A marginal effect estimate shows the estimate marginalized (summed up) over the set of covariates, meaning over their actual values in the data set or population. If the weighted average of the conditional estimates of a quantity equals the marginal estimate for covariates that are not confounders, we say that this quantity is collapsible (Whittemore, 1978). We could collapse two conditional cross-tables to get the marginal cross-table. However, while AME and RR are collapsible, the OR is not. The discussion of collapsibility and its consequences is very advanced in the epidemiologic literature (Greenland, Robins, & Pearl, 1999; Pang, Kaufman, & Platt, 2013) and can be seen as the counterpart to the discussion of the consequences of unobserved heterogeneity in the social sciences within a NC framework.
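A small numerical sketch may help to see what non-collapsibility means. The probabilities below are hypothetical: two equally sized strata of a covariate that is unrelated to the predictor (so there is no confounding) but strongly related to the outcome. The weighting of the stratum-specific RR by baseline risk is one standard way to express its collapsibility and is our choice for the illustration.

```python
def odds(p):
    """Odds implied by a probability."""
    return p / (1.0 - p)


# Hypothetical P(y = 1 | x, stratum) in two equally sized strata.
strata = [
    {"p_x0": 0.2, "p_x1": 0.5},
    {"p_x0": 0.5, "p_x1": 0.8},
]

# Conditional (stratum-specific) quantities.
conditional_rd = [s["p_x1"] - s["p_x0"] for s in strata]
conditional_or = [odds(s["p_x1"]) / odds(s["p_x0"]) for s in strata]

# Marginal probabilities: average over the equally weighted strata.
marginal_p0 = sum(s["p_x0"] for s in strata) / len(strata)
marginal_p1 = sum(s["p_x1"] for s in strata) / len(strata)

marginal_rd = marginal_p1 - marginal_p0
marginal_rr = marginal_p1 / marginal_p0
marginal_or = odds(marginal_p1) / odds(marginal_p0)

# Stratum-specific RRs averaged with weights proportional to baseline risk.
weighted_rr = (
    sum(s["p_x0"] * (s["p_x1"] / s["p_x0"]) for s in strata)
    / sum(s["p_x0"] for s in strata)
)

print(conditional_rd, marginal_rd)   # risk differences agree
print(weighted_rr, marginal_rr)      # weighted RR equals marginal RR
print(conditional_or, marginal_or)   # marginal OR is smaller than both
```

In this constructed case the probability difference is 0.3 in both strata and marginally, and the weighted RR reproduces the marginal RR, while the conditional OR is 4 in each stratum but the marginal OR is only about 3.45.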
When estimating the AME we marginalize the conditional probability differences (PD) over the set of covariates that we are adjusting for. Therefore, we speak of an average marginal effect. We want the same property, which allows adjusting for covariates while retaining a marginal interpretation, in our statistical quantity without changing to an absolute probability scale. We want a form of marginal odds-ratio for a multivariate model, as we get it from a bivariate model, and we know that standard regression with covariate adjustment does not do the trick.
2.3.1 Conditional odds-ratios
At first, let us consider what we interpret and compare if we estimate conditional odds-ratios. For example, we might want to estimate the association of college attendance with (high versus low) parental education conditional on regional features and gender. We can interpret this as the OR that we get if we compare individuals from the same regions and of the same gender with each other. Conditional odds-ratios always describe differences “at the same level of covariates” (conditional interpretation). They do not carry a marginal interpretation like differences “in the population”. The conditional OR are indeed comparable in size across groups if and only if this interpretation of the OR is used. Hence, if we get an OR estimate of 1.7 for high versus low parental education when controlling for gender and regional dummies, we could not say that the odds of attending college in the total population would be 1.7 times higher for those from high parental background if there was no confounding with region and gender. Rather, it is the odds-ratio of high versus low parental education if we compare individuals who are from the same region and of the same sex. Further, it is important to remember that this conditional estimate of OR will always be larger (further from one) than the unconditional (marginal) OR, even if region, gender and parental education are not related at all, as long as region and gender predict the outcome (Neuhaus & Jewell, 1993, p. 812).
For pursuing this type of comparison, we propose an approach that combines the advantages of a marginal interpretation with covariate adjustment for OR. We call this the synthetic marginal odds-ratio (SMOR).

2.3.2 Synthetic marginal odds-ratios using inverse probability weighting
We define the SMOR as the ratio in chances of success between different levels of the predictor in the population if the predictor of interest were unrelated to a specified set of covariates. While this marginal OR can be interpreted as a causal effect when certain additional assumptions are fulfilled, it will be useful in many descriptive applications as well.
For comparability of OR, the distinction between studying an association or a causal effect is not decisive. However, drawing a distinction between conditional and marginal OR is important. A marginal OR represents the aggregate difference in event occurrence between groups, an attractive feature that makes a marginal OR ready for comparisons across groups. In contrast, a conditional OR, for instance estimated by a multiple logistic regression model, is defined only with respect to a set of covariates (Zhang, 2008), a fact that poses problems for between-group comparisons.
In the following, we propose an approach that aims at combining useful features of marginal and conditional OR while preserving comparability across groups. Our strategy involves applying inverse probability weighting (IPW) in the context of logistic regression, as it was previously applied to survival curves (Cole & Hernán, 2004). IPW is most commonly referred to in the context of causal analysis, where it is used to calculate inverse probabilities of treatment to account for selection into treatment (Morgan & Winship, 2007). However, IPW can also be applied in regression analysis when researchers aim for descriptive rather than causal inference.
In general, IPW works in three steps. First, a (logistic) regression model is estimated, taking the (dichotomous) predictor of interest ($X$) as the dependent variable and all other control variables ($C$) that are to be considered as independent variables.
$$\ln\frac{P(X = 1)}{1 - P(X = 1)} = \alpha + \sum_{k=1}^{K}\beta_k C_k \qquad (10)$$
!
Second, based on this model, we predict the probability of (a) having the trait for those who in fact have the trait ($P(X = 1 \mid C = c)$) and of (b) not having the trait for those who in fact do not have the trait ($1 - P(X = 1 \mid C = c)$). Then we take the inverse of these probabilities as weights ($w = \{w_0, w_1\}$) and standardize them by placing the overall probability of having (respectively not having) the trait in the numerator, which reduces the variance of the weights.

$$w_1 = \frac{P(X = 1)}{P(X = 1 \mid C = c)}, \qquad w_0 = \frac{1 - P(X = 1)}{1 - P(X = 1 \mid C = c)} \qquad (11)$$
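The weighting step can be sketched as follows. This is a minimal illustration, assuming the predicted probabilities $P(X = 1 \mid C = c)$ are already available from the first-step model; the function name `stabilized_ipw_weights` and the four example observations are hypothetical.

```python
def stabilized_ipw_weights(x, propensity):
    """Stabilized inverse probability weights for a dichotomous trait.

    x          -- list of 0/1 indicators for having the trait
    propensity -- predicted P(X = 1 | C = c) for each observation,
                  taken from the first-step model
    """
    p_trait = sum(x) / len(x)  # overall P(X = 1) in the sample
    weights = []
    for xi, pi in zip(x, propensity):
        if xi == 1:
            weights.append(p_trait / pi)                   # w_1
        else:
            weights.append((1.0 - p_trait) / (1.0 - pi))   # w_0
    return weights


# Hypothetical observations: trait indicator and predicted probability.
x = [1, 1, 0, 0]
propensity = [0.8, 0.4, 0.5, 0.2]
weights = stabilized_ipw_weights(x, propensity)
print(weights)
```

Observations whose trait status was unlikely given their covariates (here the second observation) receive weights above one, while observations with a highly predictable trait status are weighted down.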
!
Afterwards, in a third step, we run the substantive regression model including only the predictor of interest but using a weighted estimator. The likelihood function for estimating the coefficient in the logistic regression model is then modified to include the weights (as, for example, implemented in Stata 14; see StataCorp, 2015, p. 1291):
$$\ln L = \sum_{j \in S} w_j \ln G(\alpha + x_j\beta_w) + \sum_{j \notin S} w_j \ln\left(1 - G(\alpha + x_j\beta_w)\right) \qquad (12)$$

$L$ is the likelihood, $G$ the logistic function, $\beta_w$ the coefficient of the predictor of interest in the weighted model, and $S$ includes all observations $j$ with $y_j = 1$.
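To make the role of the weights in the likelihood concrete, the following sketch evaluates a weighted logistic log-likelihood for given parameter values. It is not the Stata implementation; the function name and the small data set are hypothetical, and a real analysis would maximize this function with a standard weighted logistic regression routine.

```python
import math


def weighted_log_likelihood(alpha, beta, x, y, w):
    """Weighted logistic log-likelihood: each observation's contribution
    is multiplied by its weight before summing."""
    total = 0.0
    for xj, yj, wj in zip(x, y, w):
        p = 1.0 / (1.0 + math.exp(-(alpha + xj * beta)))  # G(alpha + x*beta)
        total += wj * math.log(p if yj == 1 else 1.0 - p)
    return total


# Hypothetical data: predictor, outcome and inverse probability weights.
x = [0, 0, 1, 1]
y = [0, 1, 1, 1]
w = [1.0, 0.625, 1.25, 0.625]

ll_null = weighted_log_likelihood(0.0, 0.0, x, y, w)
ll_alt = weighted_log_likelihood(0.0, 1.0, x, y, w)
print(ll_null, ll_alt)
```

With a single dichotomous predictor, the exponentiated coefficient that maximizes this function equals the odds-ratio of the weighted two-by-two table, which is why the regression and the cross-tabulation of the weighted data lead to the same quantity.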
If the predictor of interest has more than two categories, multinomial logistic regression can be used in an analogous way (Imbens, 2000). If the variable of interest is continuous, several ways of estimating inverse density weights based on normal distributions or quantile binning are available (Naimi, Moodie, Auger, & Kaufman, 2014). The weights are formally defined independent of the distribution of $X$ as:
$$w = \frac{f_X(X;\ \mu_1, \sigma_1^2)}{f_{X \mid C}(X \mid C = c;\ \mu_2, \sigma_2^2)} \qquad (13)$$
$f_X$ is the functional form in which $X$ is related to the other covariates (e.g. (multinomial) logistic, linear, log-binomial), $\mu$ is the threshold and $\sigma^2$ the variance estimate (fixed in logistic regression).
From the logistic regression model estimated in the third step, we obtain a synthetic marginal odds-ratio (SMOR), which can be interpreted as follows (in equation 12, this is $\exp(\beta_w)$). Taking the example from above, the SMOR measures the difference in odds of attending college between individuals with high and low educated parents that cannot be attributed to gender and region. It is the factor difference in the odds if parental education were unrelated to gender and region in the population under study. Note that this is not the same as the conditional OR we obtain from an ordinary logistic regression model that just controls for gender and region, i.e. the average factor difference in odds at different levels of the controls.
For the case of categorical predictors, a very useful feature of the IPW approach is that we could actually resort to cross-tabulation based on the weighted data. That way we can construct a synthetic marginal table (for a similar way of presenting tables for causal analysis, see Yamaguchi, 2012), which in the form of a cross-tabulation provides information in an easily accessible way. From the synthetic marginal table we can recover OR, RR, or ME, which are numerically (approximately) the same as those estimated by logistic regression. However, a practical advantage of using the latter might be that most common statistical software packages enrich regression outputs with information on statistical inference (like standard errors or interval estimates), which might be more tedious to calculate from a table. Nonetheless, the important thing to remember is that we are looking at a counterfactual [3] or synthetic cross-table that does not have a real-life equivalent as conditional cross-tables do.

2.3.3 Example of synthetic marginal comparison
Using the IPW approach to construct a synthetic marginal table is best illustrated by an example from research on educational mobility. We would like to compare the direct association of parental education with college attendance (net of academic performance) across two subsequent cohorts to evaluate whether the ‘secondary effect’ of social background has been changing over time. For illustration, we use simulated data. The data set contains a variable that indicates whether an individual attended college or not, a dichotomous indicator for high vs. low parental education, and two possible control variables. Gender is a strong predictor of college attendance in this example, but unrelated to parental education. Academic performance in high school is a continuous variable (measured via test scores) and is correlated with both college attendance and parental education.

[3] While this table and the OR calculated from it can be labelled counterfactual (“What if the control variables were equally distributed”), we refrain from using this term, as counterfactual is strongly associated with causal research designs. The synthetic marginal OR or table might be used for causal research designs, but often this might not be the goal.

Being interested in the ‘secondary effect’ of parental background on absolute or relative probability, we could simply estimate a logistic regression model of college attendance including parental education and academic performance as independent variables. Our research question is: What is the factor difference in odds between the student populations from high versus low educational background that cannot be attributed to academic performance? And how did this difference change over cohorts? Comparing the conditional odds-ratios of parental education based on an ordinary logistic regression model is problematic, because the estimate from a multivariate model has the interpretation of “at the same level of performance”. However, our research question addresses the difference between the groups in the total population, adjusting for the correlation between parental education and performance.
We can apply the suggested IPW method to estimate the desired quantity. Based on this procedure we calculated a weighted and an unweighted OR for both cohorts [4]. Inspecting results for cohort A first (Tables 5 and 6), we see that the difference in the probability of attending college between students with high and low educated parents is more than 19 percentage points. The relative risk is 1.34, which means the probability of attending college is 34 percent higher for those from high educational background. The (marginal) OR is 2.41, meaning the odds of attending college are about 141 percent larger for students with higher educated parents as compared to those from lower educated backgrounds. If we weight these data by the inverse probability of (not) having high educated parents based on prediction only by performance, we get a synthetic marginal table (the synthetic marginal panel of Tables 5 and 6). The synthetic situation, the weighted data, is constructed in a way that the marginal distribution of parental education remains the same while performance is equally distributed across groups of parental education. Weighting the data leads to a change in absolute frequencies of college attendance (Table 5) and probabilities conditional on parents’ education (Table 6).

[4] Note that conceptually this approach is very similar to the decomposition of effects into primary (indirect) and secondary (direct) effects in logit models as proposed in previous studies (Buis, 2010; Erikson, Goldthorpe, Jackson, Yaish, & Cox, 2005). The difference is that we do not propose to integrate over predicted probabilities, although this would lead to very similar results. The second difference is that the method has previously been used for the calculation of indirect effects within one model, not the comparison across two groups.

The absolute probability difference in the synthetic table is about 9 percentage points, the relative risk is 1.15 and the synthetic marginal odds-ratio amounts to 1.51. As expected, part of the association of parental education and college attendance can be attributed to differential performance in school. Yet, a direct association between parents’ education and college attendance remains, which would be the association found if performance was equally distributed between the groups. Quantified by the odds ratio, the odds of attending college would be higher by 51 percent for those with high educated parents compared with those having lower educated parents.
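The quantities reported for cohort A can be recomputed directly from the cell frequencies in Table 5. The sketch below does this for the marginal and the synthetic marginal table; the function name `table_measures` is ours. Because the weighted frequencies in the synthetic table are rounded to whole numbers, the recomputed values can deviate slightly in the second decimal from those printed in the tables.

```python
def table_measures(low_no, low_yes, high_no, high_yes):
    """ME (percentage points), RR and OR from a 2x2 table of parental
    education (low/high) by college attendance (no/yes)."""
    p_low = low_yes / (low_no + low_yes)
    p_high = high_yes / (high_no + high_yes)
    return {
        "ME": 100.0 * (p_high - p_low),
        "RR": p_high / p_low,
        "OR": (high_yes / high_no) / (low_yes / low_no),
    }


# Cell frequencies for cohort A as reported in Table 5.
marginal = table_measures(low_no=1303, low_yes=1718, high_no=474, high_yes=1505)
synthetic = table_measures(low_no=1181, low_yes=1840, high_no=590, high_yes=1389)

for name, result in (("marginal", marginal), ("synthetic", synthetic)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

The same function can be applied to the cohort B frequencies to compare the synthetic marginal odds-ratios across cohorts.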
A multivariate logistic regression adjusting for performance yields the same absolute difference in probability, the same relative risk, but a conditional odds-ratio of 1.67. Holding performance constant, we see that there is a difference of 67 percent in the odds of attending college, 16 percentage points higher as compared to the synthetic marginal OR. Note that there is a performance difference between the parental education groups in the marginal, but not in the synthetic marginal table, whereas the performance difference between college and no college is roughly equal for both weighted and unweighted data. Hence, although the distribution of performance conditional on parental education is altered, the distribution conditional on college is not. One could think of constructing other scenarios, such as what the group difference would be if the low education group had the same performance as the high education group or vice versa.
For the second cohort, B, the overall level of inequality in college attendance is much higher. The marginal effect is over 40 percentage points, the risk ratio about 2 and the factor difference in odds more than 6. Accounting for performance differences via IPW, differences are much reduced. The absolute inequality is a little lower than in cohort A (7.78 percentage points), the relative inequality about the same (RR 1.14) and the OR is a little smaller at 1.38. Based on this we can conclude that overall inequalities are stronger in cohort B than in cohort A, both relatively and absolutely speaking. The secondary effect, that is, group differences in college attendance that cannot be explained by group differences in performance, is about the same, which implies that the indirect effect is larger in cohort B than in cohort A.
Our conclusion is based on comparing the synthetic marginal OR across groups. If we now compared the conditional OR, we would see an OR of 1.67 in cohort A and a conditional OR of 2.68 in cohort B. So, if we took the conditional OR as a measure, we would conclude that the secondary effect is substantially larger in cohort B than in cohort A. The reason is that performance is much more predictive of college attendance in cohort B, and conditioning on it increases the predictive power of the model in cohort B. Therefore, knowing performance, the differences in odds of those at the same level are larger between high and low educated in cohort B than in cohort A (conditional interpretation). However, if we compare the high versus low education groups under the assumption that performance were equally distributed, the odds ratio for parental education would be roughly the same, even slightly higher in cohort A than in cohort B (marginal interpretation). Depending on the interpretation of odds ratios, conditional or marginal, we would draw different conclusions about the relative importance of the secondary effect of parental education within the two cohorts. Both the calculation and the interpretation of AME and RR are unaffected by the weighting approach. The IPW approach yields approximately the same results for AME and RR as the unweighted regression.
What happens in this example if we control for a predictor that is unrelated to parental education, but a strong predictor of college attendance? In our example that could be gender. The adjusted conditional OR estimate from multivariate regression increases to 2.11 in cohort A. If we do not want to compare odds-ratios for individuals of the same gender between cohorts, but for the whole population, while still adjusting for gender, we need to marginalize over gender. We can do this by including gender as an additional variable in the first step of the IPW approach, the prediction of parental education. Even though gender might be related to college attendance, we expect that accounting for gender should not alter the synthetic marginal odds-ratio, because there is no reason to believe that an individual’s gender is related to their parents’ education. In fact, accounting for gender does not change the inverse probabilities significantly and, thus, we get almost exactly the same synthetic marginal table (see Table 10 and Table 11 in the appendix). This demonstrates another useful feature of the IPW approach: the robustness of the marginal interpretation of SMOR when accounting for other variables that are predictive of the outcome under study, but not of the predictor of interest.

Table 5: Marginal and synthetic marginal table linking parental education and children's education – cell frequencies for Cohort A

Marginal table

| Parental education | No college | College | Total | Performance |
|---|---|---|---|---|
| Low | 1,303 | 1,718 | 3,021 | -0.008 |
| High | 474 | 1,505 | 1,979 | 1.017 |
| Total | 1,777 | 3,223 | 5,000 | 0.398 |
| Performance | -0.827 | 1.073 | 0.398 | ME: 19.18; RR: 1.34; OR: 2.41 |

Synthetic marginal table

| Parental education | No college | College | Total | Performance |
|---|---|---|---|---|
| Low | 1,181 | 1,840 | 3,021 | 0.412 |
| High | 590 | 1,389 | 1,979 | 0.410 |
| Total | 1,771 | 3,229 | 5,000 | 0.411 |
| Performance | -0.805 | 1.050 | 0.411 | ME: 9.26; RR: 1.15; OR: 1.51 |

Note: Simulated data. ME in percentage points. The synthetic table was created using weights that balance performance level in high school to create a synthetic data set in which performance and parental education are unrelated. Performance level is equally distributed by parental background.

Table!6:!Marginal!and!synthetic!marginal!table!linking!parental!education!and!children's!education!–!row!percentages!for!Cohort!A!
!
Marginal!table!
!
Synthetic!marginal!table!
Parental!education!
No!!
college!
College!
Total!
Performance!
No!!
college!
College!
Total!
Performance!
Low!
43.13!
56.87!
100.00!
-0.008!
39.08!
60.92!
100.00!
0.412!
High!
23.95!
76.05!
100.00!
1.017!
29.82!
70.18!
100.00!
0.410!
Total!
35.54!
64.46!
100.00!
0.398!
35.41!
64.59!
100.00!
0.411!
Performance!
-0.827!
1.073!
0.398!
ME:19.18((
RR:1.34((
OR:(2.41(
-0.805!
1.050!
0.411!
ME:9.26((
RR:1.15((
OR:(1.51!
Note:!Simulated!data.!The!synthetic!table!was!created!using!weights!that!balances!performance!level!in!high!school!to!create!a!synthetic!data!set!in!which!performance!and!parental!education!are!unrelated.!
Performance!level!is!equally!distributed!by!parental!background.

Table 7: Marginal and synthetic marginal table linking parental education and children's education – cell frequencies for Cohort B

Marginal table

| Parental education | No college | College | Total | Performance |
|---|---|---|---|---|
| Low | 1,822 | 1,199 | 3,021 | -0.004 |
| High | 370 | 1,609 | 1,979 | 1.009 |
| Total | 2,192 | 2,808 | 5,000 | 0.397 |
| Performance | -0.552 | 1.137 | 0.397 | ME: 41.61; RR: 2.04; OR: 6.61 |

Synthetic marginal table

| Parental education | No college | College | Total | Performance |
|---|---|---|---|---|
| Low | 1,388 | 1,633 | 3,021 | 0.435 |
| High | 755 | 1,224 | 1,979 | 0.423 |
| Total | 2,143 | 2,857 | 5,000 | 0.429 |
| Performance | -0.549 | 1.140 | 0.429 | ME: 7.78; RR: 1.14; OR: 1.38 |

Note: Simulated data. The synthetic table was created using weights that balance performance level in high school to create a synthetic data set in which performance and parental education are unrelated. Performance level is equally distributed by parental background.

Table 8: Marginal and synthetic marginal table linking parental education and children's education – row percentages for Cohort B

Marginal table

| Parental education | No college | College | Total | Performance |
|---|---|---|---|---|
| Low | 60.31 | 39.69 | 100.00 | -0.004 |
| High | 18.70 | 81.30 | 100.00 | 1.009 |
| Total | 43.84 | 56.16 | 100.00 | 0.397 |
| Performance | -0.552 | 1.137 | 0.397 | ME: 41.61; RR: 2.04; OR: 6.61 |

Synthetic marginal table

| Parental education | No college | College | Total | Performance |
|---|---|---|---|---|
| Low | 45.94 | 54.06 | 100.00 | 0.435 |
| High | 38.16 | 61.84 | 100.00 | 0.423 |
| Total | 42.86 | 57.14 | 100.00 | 0.429 |
| Performance | -0.549 | 1.140 | 0.429 | ME: 7.78; RR: 1.14; OR: 1.38 |

Note: Simulated data. The synthetic table was created using weights that balance performance level in high school to create a synthetic data set in which performance and parental education are unrelated. Performance level is equally distributed by parental background.

Table 9: Comparability of quantities under natural categorical framework

| Quantity | Comparability: bivariate | Comparability: multivariate | Interpretation | Note |
|---|---|---|---|---|
| Odds-ratio | Size | Sign (size) | Classificatory power; degree of stratification; change in odds | Given known covariates; multivariate only if interpreted as conditional (at same level of covariates) |
| SMOR | Size | Size | Degree of stratification; change of odds in population | Given correct IPW model |
| AME/LPM | Size | Size | Absolute probability difference | ME varies between individuals |
| RR | Size | Size | Relative probability difference | Marginal interpretation; not symmetric like OR, coding of event important |
| y*-std | Sign | Sign | Underlying propensity | Unclear meaning, counterintuitive |
| Standardized ratio | Size | Size | Relative probability difference | Has RR interpretation, but is based on probability predictions, captures relative aspect; in univariate case identical to RR; in multivariate case not identical due to Jensen's inequality |

3 Average Marginal Effects, Risk Ratios and Odds Ratios – United we understand

The discussion of comparability under a natural categorical framework is summarized in Table 9.

3.1 The complementary nature of AME, RR and OR

As we discussed, under the LV framework OR, RR and AME are comparable only in sign and are equally problematic for comparisons across groups. Within the natural categorical research framework, however, all three can be useful, so that we can either choose the one that best fits the quantity to be measured or, alternatively, use all three in a complementary way.

There are several arguments why an exclusive reliance on AME, which has become more common in recent years, limits interesting aspects of comparisons. First, while log-odds ratios are parameters of a statistical model, average marginal effects are not. An AME depends not only on the parameters of the probability function but also on the joint distribution of covariates in a sample. Hence, while an AME is illustrative for a specific set of data, it is impossible to recover the original model parameters from it, which severely limits the ability to replicate the results of studies. If study results cannot be reproduced, it is unclear whether this is due to a difference in the estimated model parameters or due to the subsequent calculation of the AME in the sample used.
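
The dependence of the AME on the covariate distribution can be made concrete with a small sketch. It assumes a logit model with one binary predictor of interest and one binary covariate; the coefficient values and the two covariate distributions are hypothetical. The same model parameters are applied to two samples that differ only in the distribution of the covariate.

```python
import math

def logistic(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))

def ame_binary(b0, bx, bz, z_values):
    """AME of a binary predictor x: the sample mean of
    P(y=1 | x=1, z) - P(y=1 | x=0, z) over the observed z values."""
    diffs = [logistic(b0 + bx + bz * z) - logistic(b0 + bz * z) for z in z_values]
    return sum(diffs) / len(diffs)

# Hypothetical logit coefficients, identical in both samples.
B0, BX, BZ = -1.0, 1.0, 2.0

sample_a = [0] * 50 + [1] * 50   # covariate z evenly split
sample_b = [0] * 10 + [1] * 90   # covariate z mostly equal to 1

ame_a = ame_binary(B0, BX, BZ, sample_a)
ame_b = ame_binary(B0, BX, BZ, sample_b)
print(round(ame_a, 3), round(ame_b, 3))
```

Although the log-odds ratio of the predictor is 1.0 in both samples, the AME is roughly 0.19 in the first and 0.16 in the second, so the AME alone does not identify the underlying coefficient.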

Second, and substantively more important, in sociology in general and in the field of stratification research in particular, the distinction between absolute and relative inequality among groups should be kept in mind. An exclusive reliance on AME for comparative purposes would mean that we eliminate all research questions that address relative differences, e.g. relative differences in educational attainment. Absolute probability differences (AME) and relative measures (OR/RR) represent different concepts. A comparison might lead to the same conclusion from an absolute and from a relative perspective. However, conclusions that differ depending on whether they are based on the absolute probability scale, the relative odds scale or the relative probability scale are not only a theoretically valid result, but can occur in real world data and might be of particular substantive interest. This argument is not novel; it builds on the conclusion drawn by Mood (2010), who also advocates a careful choice of quantities to report and advises against treating any single estimate as a panacea.

We will now illustrate how OR, AME and RR can be used jointly to compare groups and how this can, for example, enhance our understanding of changes over cohorts.

3.2 Example – Educational attainment and intergenerational mobility

We present a fictional example of the development of secondary school attainment (Figure 2). The log of the OR is visually represented by the size of the diamonds, with the first cohort as the reference category. The comparison of AME shows that absolute inequalities in secondary school attainment have become smaller over cohorts (from 19 to 6 percentage points). The reduction in relative inequalities is similar (RR reduced from 1.46 to 1.07). Therefore, we could conclude that inequalities in secondary school attainment are strongly reduced over cohorts, which is correct. However, we argue that the statement, based on these results, that secondary school attainment is no longer socially stratified is only half the truth and hides an important fact. We can see that the OR has not declined across cohorts; it has increased (from 2.22 to 6.25 in Figure 2). The reason for the increasing OR is that social stratification is no longer relevant for the question of who gets a secondary school degree (almost everybody does), but is still relevant for the question of who does not get one. While 'winners' are no longer socially stratified, 'losers' are. The high baseline probability in the later cohorts also indicates that sociological analyses should focus on describing and explaining patterns of drop-out rather than completion.
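
Part of this pattern is mechanical and can be sketched numerically. The example below holds the OR fixed and lets the attainment rate of the disadvantaged group rise; the baseline rates are hypothetical and the fixed OR of 2.22 is only borrowed from the first cohort label in Figure 2 for illustration. As attainment approaches saturation, ME and RR shrink towards 0 and 1 even though the odds ratio does not change.

```python
def high_group_probability(p_low, odds_ratio):
    """Probability in the advantaged group implied by p_low and a given OR."""
    odds_high = odds_ratio * p_low / (1.0 - p_low)
    return odds_high / (1.0 + odds_high)

ODDS_RATIO = 2.22                      # held fixed across hypothetical cohorts
baselines = [0.50, 0.70, 0.90, 0.97]   # hypothetical attainment rates, low-SEP group

results = []
for p_low in baselines:
    p_high = high_group_probability(p_low, ODDS_RATIO)
    me = p_high - p_low                # absolute difference
    rr = p_high / p_low                # relative risk of attainment
    results.append((p_low, p_high, me, rr))

for p_low, p_high, me, rr in results:
    print(f"baseline {p_low:.2f}: ME {me:.3f}, RR {rr:.3f}, OR {ODDS_RATIO}")
```

An OR that rises over cohorts, as in the example above, therefore signals even stronger stratification of non-attainment than the shrinking ME and RR of attainment would suggest.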

In Figure 3 we present an example based on real data on cohort change in inequality in secondary school attainment (ISCED level 3 or above) by parental education.[5] The pattern is slightly different from the simulated example. We again see a marked decrease in absolute inequality (AME from 43 to 26 percentage points) and in relative inequality (RR from 1.91 to 1.36), but substantial inequality remains in the younger cohorts. At the same time, there is no decrease in the OR. Quite to the contrary, the youngest cohort even displays a substantially higher OR. While inequality in secondary school attainment and non-attainment clearly existed in earlier cohorts, the decrease in inequality in attainment is mirrored by the observation that non-attainment in the youngest cohorts is almost exclusively an issue of individuals with lower educated parents. This is reflected in the OR, but neither in the RR nor in the AME. The trend becomes more apparent if we flip Figure 3 on its head and plot non-attainment of secondary education, as done in Figure 4.

[5] Analyses based on data from the National Educational Panel Study (NEPS).

Note that the scaling of the diamonds is exactly the same as in Figure 3, as is the absolute difference in probability. However, the relative risk of non-attainment is not simply the inverse of the relative risk calculated for attainment. While ME and OR are symmetric with respect to the coding of the dependent variable, the relative risk is not. Figure 4 shows more prominently that in the youngest cohorts the risk of not finishing secondary education is more than 10 times higher for those from a low educational background than for those from a high educational background. This is despite the fact that the absolute difference has decreased by 17 to 19 percentage points.
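
The asymmetry of the relative risk can be checked with a short sketch. The two attainment probabilities below are hypothetical values chosen to resemble a high-attainment cohort; they are not estimates from the NEPS data. The same three quantities are computed once for attainment (high vs. low parental education) and once for non-attainment (low vs. high parental education).

```python
def measures(p_group, p_reference):
    """Absolute difference, relative risk and odds ratio of group vs. reference."""
    me = p_group - p_reference
    rr = p_group / p_reference
    odds_ratio = (p_group / (1 - p_group)) / (p_reference / (1 - p_reference))
    return me, rr, odds_ratio

# Hypothetical attainment probabilities by parental education (assumption).
P_HIGH, P_LOW = 0.98, 0.72

# Attainment: high vs. low parental education.
me_att, rr_att, or_att = measures(P_HIGH, P_LOW)
# Non-attainment: low vs. high parental education.
me_non, rr_non, or_non = measures(1 - P_LOW, 1 - P_HIGH)

print(f"attainment:     ME {me_att:.2f}, RR {rr_att:.2f}, OR {or_att:.2f}")
print(f"non-attainment: ME {me_non:.2f}, RR {rr_non:.2f}, OR {or_non:.2f}")
```

ME and OR are identical under both codings, whereas the relative risk moves from about 1.36 for attainment to 14 for non-attainment, which is not the reciprocal of the former.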

In sum, using the OR in combination with ME and RR, we can conclude that while relative and absolute inequalities in secondary school attainment have diminished substantially, social stratification of drop-out and non-completion persists and could be the focus of future research. While educational expansion has changed the relevant degrees of absolute and relative inequality in secondary school attainment, there still seem to be mechanisms that link social background and educational attainment, in the sense that they determine failure instead of success, and this degree of determination has not diminished over cohorts.

This example tells us that it can be very helpful to report and interpret ME and RR alongside the OR. Our graph is just one of many ways to combine this information.

Figure 2: Simulated development of secondary school attainment

[Figure: vertical axis from 0 to 1; horizontal axis: Birth Cohort; series: High parental SEP, Low parental SEP, 95%-CI. Diamond labels in cohort order: 1947/1952: ME 0.19, RR 1.46, OR 2.22; 1953/1957: ME 0.19, RR 1.37, OR 2.88; 1958/1962: ME 0.11, RR 1.16, OR 2.30; 1963/1967: ME 0.09, RR 1.12, OR 4.71; 1968/1972: ME 0.06, RR 1.07, OR 6.25.]

Note: Simulated data. Size of the diamonds is scaled to reflect differences in the OR, with the oldest cohort set as the point of reference.

Figure 3: Development of secondary school attainment in Germany

[Figure: panel title "Inequality in secondary school attainment over cohorts - All"; vertical axis from 0 to 1; horizontal axis: Birth Cohort; series: Low parental education, High parental education, 95%-CI. Diamond labels in cohort order: 1947/1952: ME 0.43, RR 1.91, OR 11.45; 1953/1957: ME 0.32, RR 1.50, OR 11.39; 1958/1962: ME 0.27, RR 1.39, OR 15.31; 1963/1967: ME 0.24, RR 1.32, OR 13.46; 1968/1972: ME 0.26, RR 1.36, OR 22.75.]

Note: Illustrative data from the National Educational Panel Study (NEPS) in Germany. Size of the diamonds is scaled to reflect differences in the OR, with the oldest cohort set as the point of reference.

Figure 4: Development of secondary school non-attainment in Germany

[Figure: panel title "Inequality in secondary school NON-attainment over cohorts - All"; vertical axis from 0 to 1; horizontal axis: Birth Cohort; series: Low parental education, High parental education, 95%-CI. Diamond labels in cohort order: 1947/1952: ME 0.43, RR 6.01, OR 11.45; 1953/1957: ME 0.32, RR 7.58, OR 11.39; 1958/1962: ME 0.27, RR 11.02, OR 15.31; 1963/1967: ME 0.24, RR 10.17, OR 13.46; 1968/1972: ME 0.26, RR 16.68, OR 22.75.]

Note: Illustrative data from the National Educational Panel Study (NEPS) in Germany. Size of the diamonds is scaled to reflect differences in the OR, with the oldest cohort set as the point of reference.

4 Conclusion

Our paper aimed to shed light on a confusing debate on the comparability of logit coefficients that has emerged in recent years. We started by arguing that the issues raised by Mood (2010) and others do not apply to all research agendas. Importantly, logistic regression can serve different ends. It can be used to analyze natural categorical dependent variables. It may also be used as a model to estimate effects on a latent variable (propensity), which is unobserved but assumed to generate the binary observations. These are very different theoretical approaches that cannot be distinguished empirically, but that have far-reaching consequences for the interpretation of the model results. We argued in detail that the comparability of model results depends on whether we have a natural categorical (NC) or a latent variable (LV) approach in mind.

Second, we pointed out that, contrary to common belief, AME are not immune to unobserved heterogeneity under the LV framework (for a similar argument, see Holm et al., 2014). In fact, under the LV framework none of the quantities that can be estimated from logistic regression (with the partial exception of the standardized coefficients (Breen et al., 2014)) is helpful for comparisons of size across groups or samples.

Third, we showed that under the NC framework AME, OR and RR are all comparable in size across groups in bivariate models, and AME and RR also in multivariate models. Contrary to common belief, OR are comparable in size even in multivariate models if the conditional interpretation is used. If a marginal interpretation is desired while controlling for other covariates, we proposed an inverse probability weighting technique that combines these two properties and makes OR comparable in size for marginal interpretations in multivariate models.

Fourth, we showed that for research questions in the natural categorical framework AME, OR and RR complement each other in interpretation, and we illustrated their joint use for cohort comparisons, which yielded insights that would have been lost if only one of the quantities had been reported.

We have four main suggestions for future research. First, when cross-group comparisons of effects are made, researchers should be clear about which effect on what they are referring to: probability, relative probability, odds, or the (standardized) latent variable? Further, researchers could think about hypotheses that combine comparisons on these different scales, provided that theory is detailed enough. In any case, it is advisable for any comparison to report effects on different scales.

Second, AME should not be used in comparisons if the interest is directed towards coefficients in the LV model, unless convincing arguments are presented that the underlying assumptions are likely to hold. Furthermore, research should be clearer and more consistent in stating whether the dependent variable is treated within the LV or the NC framework.

Third, in many cases comparisons of AME, RR, and OR together give a more coherent picture of differences between groups if we conceive of our dependent variable as being naturally categorical. Further, we should take substantial differences in baseline probabilities between groups into account and discuss whether the meaning of the variable remains the same or whether the absence of a condition might be more interesting than the condition itself.

Fourth, we suggest the use of inverse probability weighting to estimate synthetic marginal odds-ratios (SMOR) for comparisons across groups or samples. In many research contexts this might be preferred over a comparison of conditional OR, which are more difficult to interpret. However, we want to stress that both kinds of comparison are possible within an NC framework, depending on the precise research question and the interpretation of the results.

In sum, we believe that a stronger reliance on theory-grounded decisions is needed when deciding which quantities to report and interpret when using logistic regression for comparisons across groups and samples. There are few rules that hold for all perspectives and research questions, and generalizations have been shown to be faulty under certain circumstances (an example of a close link between the theoretical discussion of inequality and its methodological implications can be found in Bulle, 2016). Further, forcing ourselves to think again about which quantities to interpret also allows us to think more carefully about our theories: whether they can guide analysis in absolute, relative or odds terms, and whether they might actually make predictions on different levels. For example, the idea of persistent inequality (Shavit & Blossfeld, 1993) proposes that absolute levels of inequality in education (as could be tested using AME) have declined over certain periods while relative inequalities have remained constant (as could be tested using RR or OR), while opposing claims could equally draw on different kinds of quantities to test their claims about relative or absolute inequalities (e.g. Breen, Luijkx, Müller, & Pollak, 2009). In this way a methodological discussion would not only facilitate the statistical implementation of certain models, but also contribute to improving theory and its predictions.

5 References

Agresti, Alan. 2013. Categorical Data Analysis. 3rd ed. Wiley Series in Probability and Statistics 792. Hoboken, NJ: Wiley.

Allison, Paul D. 1999. “Comparing Logit and Probit Coefficients Across Groups.” Sociological Methods & Research 28: 186–208. doi:10.1177/0049124199028002003.

Bailis, Daniel S., Alexander Segall, and Judith G. Chipperfield. 2003. “Two Views of Self-Rated General Health Status.” Social Science & Medicine 56 (2): 203–17. doi:10.1016/S0277-9536(02)00020-5.

Blane, D., G. Netuveli, and J. Stone. 2007. “The Development of Life Course Epidemiology.” Revue d’Épidémiologie et de Santé Publique 55 (1): 31–38. doi:10.1016/j.respe.2006.12.004.

Breen, Richard, Anders Holm, and Kristian Bernt Karlson. 2014. “Correlations and Nonlinear Probability Models.” Sociological Methods & Research 43 (4): 571–605. doi:10.1177/0049124114544224.

Buis, Maarten L. 2010. “Direct and Indirect Effects in a Logit Model.” The Stata Journal 10 (1): 11.

Cole, Stephen R., and Miguel A. Hernán. 2004. “Adjusted Survival Curves with Inverse Probability Weights.” Computer Methods and Programs in Biomedicine 75 (1): 45–49. doi:10.1016/j.cmpb.2003.10.004.

Cummings, Peter. 2009. “Methods for Estimating Adjusted Risk Ratios.” Stata Journal 9 (2): 175.

Dowd, Jennifer B., Amanda M. Simanek, and Allison E. Aiello. 2009. “Socio-Economic Status, Cortisol and Allostatic Load: A Review of the Literature.” International Journal of Epidemiology, August, dyp277. doi:10.1093/ije/dyp277.

Dowd, Jennifer Beam, and Anna Zajacova. 2010. “Does Self-Rated Health Mean the Same Thing Across Socioeconomic Groups? Evidence From Biomarker Data.” Annals of Epidemiology 20 (10): 743–49. doi:10.1016/j.annepidem.2010.06.007.

Duncan, Otis Dudley. 1975. Introduction to Structural Equation Models. Studies in Population. New York: Academic Press.

Erikson, Robert, John H. Goldthorpe, Michelle Jackson, Meir Yaish, and D. R. Cox. 2005. “On Class Differentials in Educational Attainment.” Proceedings of the National Academy of Sciences of the United States of America 102 (27): 9730–33. doi:10.1073/pnas.0502433102.

Gail, M. H., S. Wieand, and S. Piantadosi. 1984. “Biased Estimates of Treatment Effect in Randomized Experiments with Nonlinear Regressions and Omitted Covariates.” Biometrika 71 (3): 431–44. doi:10.1093/biomet/71.3.431.

Greenland, Sander, James M. Robins, and Judea Pearl. 1999. “Confounding and Collapsibility in Causal Inference.” Statistical Science, 29–46.

Holm, Anders, Mette Ejrnæs, and Kristian Karlson. 2014. “Comparing Linear Probability Model Coefficients across Groups.” Quality & Quantity, 1–12. doi:10.1007/s11135-014-0057-0.

Imbens, G. W. 2000. “The Role of the Propensity Score in Estimating Dose-Response Functions.” Biometrika 87 (3): 706–10. doi:10.1093/biomet/87.3.706.

Jylhä, Marja. 2009. “What Is Self-Rated Health and Why Does It Predict Mortality? Towards a Unified Conceptual Model.” Social Science & Medicine 69 (3): 307–16. doi:10.1016/j.socscimed.2009.05.013.

Jylhä, Marja, Jack M. Guralnik, Luigi Ferrucci, Jukka Jokela, and Eino Heikkinen. 1998. “Is Self-Rated Health Comparable across Cultures and Genders?” The Journals of Gerontology Series B: Psychological Sciences and Social Sciences 53B (3): S144–52. doi:10.1093/geronb/53B.3.S144.

Karlson, Kristian Bernt, Anders Holm, and Richard Breen. 2012. “Comparing Regression Coefficients Between Same-Sample Nested Models Using Logit and Probit: A New Method.” Sociological Methodology 42 (1): 286–313. doi:10.1177/0081175012444861.

Leopold, Liliya. 2016. “Cumulative Advantage in an Egalitarian Country? Socioeconomic Health Disparities over the Life Course in Sweden.” Journal of Health and Social Behavior 57 (2): 257–73. doi:10.1177/0022146516645926.

Mood, Carina. 2010. “Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It.” European Sociological Review 26 (1): 67–82. doi:10.1093/esr/jcp006.

Mood, Carina. 2013. “Life-Style and Self-Rated Global Health in Sweden: A Prospective Analysis Spanning Three Decades.” Preventive Medicine 57 (6): 802–806.

Morgan, Stephen L., and Christopher Winship. 2007. Counterfactuals and Causal Inference: Methods and Principles for Social Research. Analytical Methods for Social Research. New York: Cambridge University Press.

Naimi, Ashley I., Erica E. M. Moodie, Nathalie Auger, and Jay S. Kaufman. 2014. “Constructing Inverse Probability Weights for Continuous Exposures: A Comparison of Methods.” Epidemiology (Cambridge, Mass.) 25 (2): 292–99. doi:10.1097/EDE.0000000000000053.

Norton, Edward C. 2012. “Log Odds and Ends.” Working Paper 18252. National Bureau of Economic Research. http://www.nber.org/papers/w18252.

Pang, Menglan, Jay S. Kaufman, and Robert W. Platt. 2013. “Studying Noncollapsibility of the Odds Ratio with Marginal Structural and Logistic Regression Models.” Statistical Methods in Medical Research, October, 0962280213505804. doi:10.1177/0962280213505804.

Pearson, K., and D. Heron. 1913. “On Theories of Association.” Biometrika 9 (1–2): 159–315. doi:10.1093/biomet/9.1-2.159.

Tchetgen Tchetgen, Eric J. 2013. “Inverse Odds Ratio-Weighted Estimation for Causal Mediation Analysis.” Statistics in Medicine 32 (26): 4567–4580.

Triventi, Moris. 2013. “Stratification in Higher Education and Its Relationship with Social Inequality: A Comparative Study of 11 European Countries.” European Sociological Review 29 (3): 489–502.

Whittemore, Alice S. 1978. “Collapsibility of Multidimensional Contingency Tables.” Journal of the Royal Statistical Society. Series B (Methodological), 328–340.

Willson, Andrea E., Kim M. Shuey, and Glen H. Elder Jr. 2007. “Cumulative Advantage Processes as Mechanisms of Inequality in Life Course Health.” American Journal of Sociology 112 (6): 1886–1924. doi:10.1086/509520.

Winship, Christopher, and Robert D. Mare. 1984. “Regression Models with Ordinal Variables.” American Sociological Review 49: 512. doi:10.2307/2095465.

Yule, G. Udny. 1900. “On the Association of Attributes in Statistics: With Illustrations from the Material of the Childhood Society.” Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 194 (252–261): 257–319. doi:10.1098/rsta.1900.0019.

Yule, G. Udny. 1903. “Notes on the Theory of Association of Attributes in Statistics.” Biometrika 2 (2): 121. doi:10.2307/2331677.

Yule, George Udny. 1911. An Introduction to the Theory of Statistics. C. Griffin, limited.

Zhang, Zhiwei. 2008. “Estimating a Marginal Causal Odds Ratio Subject to Confounding.” Communications in Statistics - Theory and Methods 38 (3): 309–21. doi:10.1080/03610920802200076.

6 Appendix

6.1 S1 - A formal treatment of comparison

In our definition, we use $D$ as a placeholder for the construct we want to compare from a theoretical perspective and $\hat{D}$ for the quantity we actually estimate from our model that is to represent $D$. In the following we give a formal definition of when comparisons of $\hat{D}$ across groups represent a comparison of $D$ across groups. We use groups A and B as stand-ins for any kind of group comparison, e.g. between countries, men and women, cohorts, ethnic groups or periods.

We define comparability of size on additive scales as follows:

$$E(\hat{D}_A - \hat{D}_B) = D_A - D_B \qquad \text{(1a)}$$

This means that the expectation of the difference of our estimated quantities equals the difference between group A and group B in our construct, a straightforward definition. For multiplicative scales the analogous definition refers to the ratio instead of the difference:

$$E\left(\frac{\hat{D}_A}{\hat{D}_B}\right) = \frac{D_A}{D_B} \qquad \text{(1b)}$$

Our definition implies that the difference (ratio) of the quantities we estimate needs to be an unbiased estimator of the true difference (ratio).

In contrast, the comparability of sign lets us only answer the simple question whether $D$ has the same sign in both groups. Comparability of sign is given if the following conditions hold (3):

$$E(\hat{D}_A) > 0 \iff D_A > 0$$
$$E(\hat{D}_A) < 0 \iff D_A < 0$$
$$E(\hat{D}_B) > 0 \iff D_B > 0$$
$$E(\hat{D}_B) < 0 \iff D_B < 0$$

The acceptance that comparability of estimates depends on the definition of $D$ is crucial to our argument. Whether we define $D$ as an absolute distance on an additive metric or as a ratio between two quantities should ideally be rooted in theory. Thus, we deliberately omitted a definition of the scale of $D$. $D$ could be measured on different scales depending on the research context. For our purpose the probability scale ($\Pr(Y)$), either additive or multiplicative, the odds scale ($\Pr(Y) / (1 - \Pr(Y))$), and the (standardized) scale of the latent variable ($y^*$) will be the central scales under which most research questions can be subsumed. Which of these scales is relevant for determining comparability depends mainly on the conceptual framework that we apply to our dependent variable.

6.2 S2 – Additional tables and graphs

Table 10: Marginal and synthetic marginal table linking parental education and children's education (weighting additionally for gender) – frequencies for Cohort A

Marginal table

| Parental education | No college | College | Total | Mean level of skill |
|---|---|---|---|---|
| Low | 1,303 | 1,718 | 3,021 | -0.008 |
| High | 474 | 1,505 | 1,979 | 1.017 |
| Total | 1,777 | 3,223 | 5,000 | 0.398 |
| Mean level of skill | -0.827 | 1.073 | 0.398 | ME: 19.18; RR: 1.34; OR: 2.41 |

Synthetic marginal table

| Parental education | No college | College | Total | Mean level of skill |
|---|---|---|---|---|
| Low | 1,175 | 1,846 | 3,021 | 0.412 |
| High | 595 | 1,384 | 1,979 | 0.410 |
| Total | 1,770 | 3,230 | 5,000 | 0.411 |
| Mean level of skill | -0.807 | 1.053 | 0.411 | ME: 8.8; RR: 1.14; OR: 1.48 |

Note: Simulated data. The synthetic table was created using weights that balance skill level in high school, so as to create a synthetic data set in which skill level and parental education are unrelated. Skill level is equally distributed across individuals from high and low parental background.

Table 11: Marginal and synthetic marginal table linking parental education and children's education (weighting additionally for gender) – row percentages for Cohort A

Marginal table

| Parental education | No college | College | Total | Mean level of skill |
|---|---|---|---|---|
| Low | 43.13 | 56.87 | 100.00 | -0.008 |
| High | 23.95 | 76.05 | 100.00 | 1.017 |
| Total | 35.54 | 64.46 | 100.00 | 0.398 |
| Mean level of skill | -0.827 | 1.073 | 0.398 | ME: 19.18; RR: 1.34; OR: 2.41 |

Synthetic marginal table

| Parental education | No college | College | Total | Mean level of skill |
|---|---|---|---|---|
| Low | 38.90 | 61.10 | 100.00 | 0.412 |
| High | 30.08 | 69.92 | 100.00 | 0.410 |
| Total | 35.41 | 64.59 | 100.00 | 0.411 |
| Mean level of skill | -0.805 | 1.050 | 0.411 | ME: 8.8; RR: 1.14; OR: 1.48 |

Note: Simulated data. The synthetic table was created using weights that balance skill level in high school, so as to create a synthetic data set in which skill level and parental education are unrelated. Skill level is equally distributed across individuals from high and low parental background.