ARTHRITIS & RHEUMATOLOGY
Vol. 68, No. 4, April 2016, pp 932–943
DOI 10.1002/art.39504
© 2016, American College of Rheumatology
Genome-Wide Association Study in an Amerindian Ancestry
Population Reveals Novel Systemic Lupus Erythematosus
Risk Loci and the Role of European Admixture
Marta E. Alarcón-Riquelme,¹ Julie T. Ziegler,² Julio Molineros,¹ Timothy D. Howard,² Andrés Moreno-Estrada,³ Elena Sánchez-Rodríguez,⁴ Hannah C. Ainsworth,² Patricia Ortiz-Tello,³ Mary E. Comeau,² Astrid Rasmussen,¹ Jennifer A. Kelly,¹ Adam Adler,¹ Eduardo M. Acevedo-Vázquez,⁵ Jorge Mariano Cucho-Venegas,⁵ Ignacio García-De la Torre,⁶ Mario H. Cardiel,⁷ Pedro Miranda,⁸ Luis J. Catoggio,⁹ Marco Maradiaga-Ceceña,¹⁰ Patrick M. Gaffney,¹ Timothy J. Vyse,¹¹ Lindsey A. Criswell,¹² Betty P. Tsao,¹³ Kathy L. Sivils,¹ Sang-Cheol Bae,¹⁴ Judith A. James,¹⁵ Robert P. Kimberly,¹⁶ Kenneth M. Kaufman,¹⁷ John B. Harley,¹⁷ Jorge A. Esquivel-Valerio,¹⁸ José F. Moctezuma,¹⁹ Mercedes A. García,²⁰ Guillermo A. Berbotto,²¹ Alejandra M. Babini,²² Hugo Scherbarth,²³ Sergio Toloza,²⁴ Vicente Baca,²⁵ Swapan K. Nath,¹ Carlos Aguilar Salinas,²⁶ Lorena Orozco,²⁷ Teresa Tusié-Luna,²⁸ Raphael Zidovetzki,²⁹ Bernardo A. Pons-Estel,³⁰ Carl D. Langefeld,² and Chaim O. Jacob³¹
Objective. Systemic lupus erythematosus (SLE) is
a chronic autoimmune disease with a strong genetic com-
ponent. We undertook the present work to perform the first
genome-wide association study on individuals from the
Americas who are enriched for Native American heritage.
Methods. We analyzed 3,710 individuals from the
US and 4 countries of Latin America who were diag-
nosed as having SLE, and healthy controls. Samples
were genotyped with HumanOmni1 BeadChip. Data on
out-of-study controls genotyped with HumanOmni2.5
were also included. Statistical analyses were performed
using SNPtest and SNPGWA. Data were adjusted for
genomic control and false discovery rate. Imputation
was performed using Impute2 and, for classic HLA
alleles, HiBag. Odds ratios (ORs) and 95% confidence
intervals (95% CIs) were calculated.
Results. The IRF5–TNPO3 region showed the strongest association and largest OR for SLE (rs10488631: genomic control–adjusted P [$P_{\text{gcadj}}$] $= 2.61 \times 10^{-29}$, OR 2.12 [95% CI 1.88–2.39]), followed by HLA class II on the DQA2–DQB1 loci (rs9275572: $P_{\text{gcadj}} = 1.11 \times 10^{-16}$, OR 1.62 [95% CI 1.46–1.80] and rs9271366: $P_{\text{gcadj}} = 6.46 \times 10^{-12}$, OR 2.06 [95% CI 1.71–2.50]). Other known SLE loci found to be associated in this population were ITGAM, STAT4, TNIP1, NCF2, and IRAK1. We identified a novel locus on 10q24.33 (rs4917385: $P_{\text{gcadj}} = 1.39 \times 10^{-8}$) with an expression quantitative trait locus (eQTL) effect ($P_{\text{eqtl}} = 8.0 \times 10^{-37}$ at USMG5/miR1307), and several new suggestive loci. SLE risk loci previously identified in Europeans and Asians were corroborated. Local ancestry estimation showed that the HLA allele risk contribution is of European ancestral origin. Imputation of HLA alleles suggested that autochthonous Native American haplotypes provide protection against development of SLE.
Conclusion. Our results demonstrate that studying admixed populations provides new insights in the delineation of the genetic architecture that underlies autoimmune and complex diseases.
Supported by grants from the NIH (R01-CA-141700 and RC1-AR-058621 to Dr. Alarcón-Riquelme; R01-AR-043814 and R21-AR-065626 to Dr. Tsao; U19-A1082714, U01-A101934, U54-GM-104938, P30-AR-053483, and P30-GM-103510 to Dr. James; P01-AR-049084 to Dr. Kimberly; R01-AR-060366 and R21-AI-107176 to Dr. Nath; and R01-AR-057172 to Dr. Jacob), the International Consortium on the Genetics of Systemic Lupus Erythematosus (SLEGEN) (P01-AI-083194 to Drs. Alarcón-Riquelme, Harley, and Jacob), and the Alliance for Lupus Research (to Drs. Tsao and Jacob).
¹Marta E. Alarcón-Riquelme, MD, PhD, Julio Molineros, PhD, Astrid Rasmussen, MD, PhD, Jennifer A. Kelly, MPH, Adam Adler, BS, Patrick M. Gaffney, MD, Kathy L. Sivils, PhD, Swapan K. Nath, PhD: Oklahoma Medical Research Foundation, Oklahoma City; ²Julie T. Ziegler, MA, Timothy D. Howard, PhD, Hannah C. Ainsworth, BS, Mary E. Comeau, MA, Carl D. Langefeld, PhD: Wake Forest School of Medicine, Winston-Salem, North Carolina; ³Andrés Moreno-Estrada, MD, PhD, Patricia Ortiz-Tello, BS: Stanford University School of Medicine, Stanford, California, and Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Mexico; ⁴Elena Sánchez-Rodríguez, PhD: Icahn School of Medicine at Mount Sinai, New York, New York; ⁵Eduardo M. Acevedo-Vázquez, MD, Jorge Mariano Cucho-Venegas, MD: Hospital Nacional Guillermo Almenara Irigoyen, Lima, Peru; ⁶Ignacio García-De la Torre, MD: Hospital General de Occidente, Zapopan, Mexico; ⁷Mario H. Cardiel, MD: Centro de Investigación Clínica de Morelia, Morelia, Mexico; ⁸Pedro Miranda, MD: Centro de Estudios Reumatológicos, Santiago, Chile; ⁹Luis J. Catoggio, MD, PhD: Hospital Italiano de Buenos Aires, Buenos Aires, Argentina; ¹⁰Marco Maradiaga-Ceceña, MD: Hospital General de Culiacán, Culiacán, Mexico; ¹¹Timothy J. Vyse, FRCP, PhD, FMedSci: King’s College London, London, UK;
“Hispanics” or “Mestizos” in the Americas are a
heterogeneous group of populations with a complex his-
tory of Native American, European, African, or Asian
admixture, resulting from varying geographic origins
and individual lineages (1,2). The influence of genetic
factors is important for better understanding these pop-
ulations and providing an opportunity to leverage the
ancestral contributions to identify genetic factors that
predispose to complex genetic traits.
Systemic lupus erythematosus (SLE) (OMIM
no. 152700) is the prototypical systemic autoimmune
disease, with a complex genetic influence. Genome-wide
association studies (GWAS) in European and European
American populations have identified >40 genetic loci
(3,4). Hispanics have an increased likelihood of devel-
oping severe SLE, showing an earlier age at onset and
severe renal disease (5,6). In fact, the global genetic pro-
portion of Native American ancestry is associated with
increased, and that of European ancestry is associated
with decreased, risk of developing SLE with severe renal
disease (6,7). Consistent with this observation, we have
shown that the number of risk alleles for SLE increases
with increased global genetic Native American ancestry
(8). These patterns suggest that ancestry contributes to
genetic susceptibility through an increased genetic load
of risk alleles, and/or that genetic background of Native
American ancestry interacts with European risk loci in
disease development (8).
With the aim of identifying previously unknown
genetic risk loci for SLE contributed by the Native heri-
tage of the Americas, we conducted a large-scale
GWAS of individuals from the US and 4 countries
of Latin America who were diagnosed as having SLE.
We compared their allele frequencies across the
HumanOmni1-Quad version 1.0 BeadArray with those
of healthy controls from the same geographic areas. To
improve statistical power, we also used a genotyped set
of out-of-study controls from Mexico. We imputed classic
HLA 4-digit alleles and analyzed their impact on risk
using univariate and stepwise regression. Based on local
ancestry estimates, we attempted to identify the ethnic
origins of the risk loci. Finally, we examined the risk loci
to determine whether any of these are expression quanti-
tative trait loci (eQTL) from publicly available databases.
Together, the known and novel loci identified in this
GWAS provide a unique picture of the genetic architec-
ture of SLE contributed by Native American ancestry in
its admixture with Europeans and define the HLA as a
risk factor contributed by the European admixture.
PATIENTS AND METHODS
Samples. All SLE patients were recruited from specialist rheumatology clinics and fulfilled the American College of Rheumatology 1982 criteria (9). We used 2 strategies to enrich samples for Native American ancestry. First, we included Hispanic, Latin American, and US Native American individuals who had previously been genotyped, with quality control, for a set of 253 ancestry informative markers as part of the Lupus Large Association 2 Study (6,8,10). Overall, only 35% of these samples had primarily European and Native American admixture, while the great majority also had Asian, African, and <20% Native American ancestry, showing that the Hispanic US and Mestizo Latin American populations are highly admixed. We selected samples with <10% global African or Asian ancestries and at least 20% Native American, as estimated by Structure and using as reference HapMap data and an in-house reference set of 40 individuals of Nahua origin from Mexico (8,11). Second, we recruited patients from countries that are historically and demographically known to have a lower frequency of African or Asian ancestries (12).
¹²Lindsey A. Criswell, MD, MPH, DSc: University of California, San Francisco; ¹³Betty P. Tsao, PhD: University of California, Los Angeles; ¹⁴Sang-Cheol Bae, MD, PhD, MPH: Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea; ¹⁵Judith A. James, MD, PhD: Oklahoma Medical Research Foundation and University of Oklahoma Health Sciences Center, Oklahoma City; ¹⁶Robert P. Kimberly, MD: University of Alabama at Birmingham; ¹⁷Kenneth M. Kaufman, PhD, John B. Harley, MD, PhD: Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio; ¹⁸Jorge A. Esquivel-Valerio, MD: Hospital Universitario Dr. José Eleuterio González, Universidad Autónoma de Nuevo León, Monterrey, Mexico; ¹⁹José F. Moctezuma, MD: Hospital General de México, Mexico City, Mexico; ²⁰Mercedes A. García, MD: Hospital Interzonal General de Agudos General San Martín, La Plata, Argentina; ²¹Guillermo A. Berbotto, MD: Hospital Escuela Eva Perón, Granadero Baigorria, Argentina; ²²Alejandra M. Babini, MD: Hospital Italiano de Córdoba, Córdoba, Argentina; ²³Hugo Scherbarth, MD: Hospital Interzonal General de Agudos Oscar E. Alende, Mar del Plata, Argentina; ²⁴Sergio M. A. Toloza, MD: Hospital Interzonal San Juan Bautista, San Fernando del Valle de Catamarca, Argentina; ²⁵Vicente Baca, MD: Hospital de Pediatría, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico; ²⁶Carlos Aguilar Salinas, MD, PhD: Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico; ²⁷Lorena Orozco, PhD: Instituto Nacional de Medicina Genómica, Mexico City, Mexico; ²⁸Teresa Tusié-Luna, MD, PhD: Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán and Instituto de Investigaciones Biomédicas de la Universidad Nacional Autónoma de México, Mexico City, Mexico; ²⁹Raphael Zidovetzki, PhD: University of California, Riverside; ³⁰Bernardo A. Pons-Estel, MD: Sanatorio Parque, Rosario, Argentina; ³¹Chaim O. Jacob, MD, PhD: University of Southern California School of Medicine, Los Angeles.
Address correspondence to Marta E. Alarcón-Riquelme, MD, PhD, Arthritis and Clinical Immunology Program, Oklahoma Medical Research Foundation, 825 NE 13th Street, Oklahoma City, OK 73104. E-mail: alarconm@omrf.org.
Submitted for publication April 2, 2015; accepted in revised form November 3, 2015.
Genotyping and laboratory quality control. All samples were genotyped on a HumanOmni1-Quad version 1.0 BeadChip, using Manifest H, at the Oklahoma Medical Research Foundation genotyping core facility. We used the Illumina clustering algorithm with GenomeStudio version 2011.1 and an Illumina-provided scatter profile. We required that samples and single-nucleotide polymorphisms (SNPs) pass 2-step quality control measurements. Samples with a SNP call rate of <90% were removed from the study. SNPs with a call rate of <90% or a Hardy-Weinberg equilibrium P value of <0.0001 were reclustered using GenomeStudio automatic clustering. The cluster profiles of SNPs generating a case–control chi-square P value of <0.0001 were manually inspected and adjusted to increase accuracy of genotype calls. Genotyping module version 1.9.4 was used to complete the genotype clustering and calling. Cluster plots were examined and samples were excluded if they had a SNP call rate of <0.90. We also integrated a set of 1,432 out-of-study controls from the Slim Initiative in Genomic Medicine for the Americas (SIGMA) type 2 diabetes study in Mexicans (13) who were genotyped on Illumina HumanOmni2.5-Quad. Analyses were completed with data aligned to the positive strand. In addition, 125 samples were genotyped for classic HLA–A, B, C, DRB1, DQA1, and DQB1 alleles using a Luminex system, and data were used to evaluate the performance of the imputation of classic HLA alleles.
Statistical quality control. We considered SNPs to be of high quality if they had call rates of >95%, no evidence of differential missingness (P < 0.05) between cases and controls, and no evidence of departure from Hardy-Weinberg equilibrium proportions (controls P < 0.01, cases P < 0.000001). Primary inferences were made based on SNPs with minor allele frequencies of >1%. Based on SNPs passing the quality control thresholds, we removed samples if there was inconsistency between recorded and genetically inferred sex or excess autosomal heterozygosity. Duplicates and samples from first- or second-degree relatives were determined according to identity-by-descent statistics computed with the program King (http://people.virginia.edu/~wc9c/king/) (14) and were removed. Principal components analysis (PCA) was performed using Eigensoft version 3.0 (http://www.hsph.harvard.edu/faculty/alkes-price/software/), after merging with data from HapMap phase III individuals (Utah residents of northern and western European ancestry [CEU], Yoruba from Ibadan, Nigeria [YRI], Han Chinese individuals from Beijing, China, and Mexican ancestry in Los Angeles, California [MEX]) as reference. The PCA was performed on a subset of SNPs with minor allele frequencies of ≥0.05 outside of the known SLE loci and flanking linked regions, after linkage disequilibrium (LD) pruning ($r^2 \ge 0.2$). Admixture estimates were computed using Admixture (http://www.genetics.ucla.edu/software/admixture/) (15), and genetic outliers were removed based on the admixture estimates and first 5 PCs.
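To illustrate the Hardy-Weinberg filter described above, the following minimal Python sketch computes a 1-degree-of-freedom chi-square test of Hardy-Weinberg proportions from genotype counts and applies the control and case thresholds quoted in the text. The function names and the genotype counts are hypothetical, and the asymptotic chi-square test is an assumption: the text does not say whether an exact or an asymptotic test was used.

```python
import math

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Return (chi-square, P value) for a 1-df test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic SNP: no departure can be detected
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return chisq, math.erfc(math.sqrt(chisq / 2.0))

def passes_hwe_filter(control_counts, case_counts,
                      control_threshold=0.01, case_threshold=0.000001):
    """Keep a SNP only if neither group departs from HWE at its threshold."""
    _, p_controls = hwe_chisq_pvalue(*control_counts)
    _, p_cases = hwe_chisq_pvalue(*case_counts)
    return p_controls >= control_threshold and p_cases >= case_threshold
```

The looser threshold in cases reflects the thresholds stated in the text; a true disease association can itself distort genotype proportions among cases.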
Statistical analysis. To test for association between a SNP and case/control status, logistic regression analysis was performed with the first 2 PCs included as covariates; the 2 PCs reduced the inflation factor sufficiently such that adding additional PCs as covariates did not further reduce it. The primary analysis reported herein is the joint analysis of North American and South American samples. We confirmed that the results were consistent with results from the meta-analysis of North American and South American samples computed separately, with weighting by sample size and using the program Metal (16). Similarly, we found comparable results using Gemma (17). No heterogeneity was observed between data from North American subjects and data from South American subjects (see below), supporting the use of the joint analysis. The primary inference was based on the additive genetic model unless there was significant lack-of-fit to the additive model (P < 0.05). If there was evidence of departure from an additive model, then inference was based on the most significant result among the dominant, additive, and recessive genetic models. The additive and recessive models were computed only if there were at least 10 individuals and 30 individuals, respectively, who were homozygous for the minor allele. For analysis of the X chromosome, the data were first stratified by sex and then meta-analyzed across both sexes. The GWAS results reported are adjusted for the genomic control inflation factor. These statistical analyses were performed using SNPtest (https://mathgen.stats.ox.ac.uk/genetics_software/snptest/snptest.html) and SNPGWA version 4.0 (www.phs.wfubmc.edu) as described below. To confirm the genotyping quality, we visually inspected the cluster plots for the most closely associated SNPs in the regions.
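The genomic control adjustment mentioned above can be sketched as follows. This is a minimal illustration, not the SNPtest or SNPGWA implementation: it assumes 1-degree-of-freedom chi-square association statistics, estimates the inflation factor as the median statistic divided by the median of the null distribution, and does not deflate when the estimate falls below 1 (a common convention that the text does not confirm). All names and example values are hypothetical.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def genomic_control(chisq_stats):
    """Return (inflation factor, genomic control-adjusted P values)."""
    lam = median(chisq_stats) / CHI2_1DF_MEDIAN
    lam_used = max(lam, 1.0)  # assumption: never inflate the statistics
    adjusted = [chi2_sf_1df(x / lam_used) for x in chisq_stats]
    return lam, adjusted
```

Dividing every test statistic by the same factor leaves the ranking of SNPs unchanged and only makes each P value more conservative.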
To determine how many independent associations were within a genomic region, a manual stepwise conditional logistic regression procedure (i.e., forward selection with backward elimination, entry and exit criteria of P < 0.0001) was performed. Specifically, for each region that reached genome-wide significance, the top SNP was included as a covariate and the association statistics were recalculated. SNPs were entered into and exited from models in this stepwise manner until no additional SNPs met a significance threshold of P < 0.0001. To adjust for the number of tests actually computed, we calculated false discovery rate (FDR)–adjusted P values ($P_{\text{FDR}}$) using the Benjamini-Hochberg procedure on the genomic control–adjusted P values ($P_{\text{gcadj}}$) (18).
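The Benjamini-Hochberg adjustment named above can be sketched in a few lines of Python. The function name and the example P values are hypothetical; the code implements the standard step-up procedure, in which each sorted P value is multiplied by the number of tests divided by its rank and monotonicity is then enforced from the largest P value downward.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg FDR-adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each adjusted
    # value is never larger than the adjusted value of the next-ranked test.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In the setting described in the text, the input would be the genomic control–adjusted P values, and SNPs with an adjusted value below 0.05 would be retained.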
The cumulative variance explained by common SNPs was estimated using a variance component model and restricted maximum-likelihood estimation as implemented in the program GCTA with adjustment for the PCs as covariates using Yang’s correction factor (c = 0 from formula 9) for imperfect LD with causal variants. Estimates are based on SNPs that had <1% missing genotypes and a stringent relatedness threshold of 0.025. We constructed regional plots of association for regions of interest using the program LocusZoom (http://csg.sph.umich.edu/locuszoom/).
Imputation. SNP genotype imputation. SNP genotype
imputation across the genome was computed using the 1000
Genomes reference panel to localize the top statistical associa-
tions. Specifically, we used the program ShapeIt (http://www.
shapeit.fr/) to separately pre-phase the Illumina HumanOmni1-
Quad genotype data and HumanOmni2.5M genotype data for
SNPs that passed quality control. We used Impute2 (https://
mathgen.stats.ox.ac.uk/impute/impute_v2.html) (19,20) with the
1000 Genomes phase 1 integrated reference panel to impute the
SNP genotypes for the HumanOmni2.5M data. Impute2 was
then used to impute the SNP genotypes for the HumanOmni1-
Quad data with the 1000 Genomes phase 1 integrated reference
panel and the previously imputed HumanOmni2.5M chip. We
performed tests of association and computed odds ratios (ORs),
95% confidence intervals (95% CIs), and other summary statistics on those SNPs that had information scores >0.5 and confidence scores >0.9. We tested for association between imputed
SNPs and SLE status using the Impute2 companion program
SNPtest, which explicitly accounts for estimated imputation
uncertainty (the posterior probability) in the test of association.
Classic HLA allele imputation. The classic HLA alleles
at HLA–A, B, C, DPB1, DQA1, DQB1, and DRB1 were imput-
ed using the program HiBag (21) and its corresponding Hispan-
ic reference data set. HiBag uses an ensemble classifier and
bagging techniques to arrive at an average posterior probability
for the 4-digit classic HLA alleles. Because 37.8% of the refer-
ence SNPs used by HiBag had missing genotype data, we
repeated the HLA imputation after filling in missing genotype
data with the “best guess” imputed SNP data from the 1000
Genomes imputation described above. By using the “best guess” genotype data that had a posterior probability of >0.90, the percent of missing SNPs in the reference set was reduced to 0.6% (for details see Supplementary Table 1, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.39504/abstract). However (and as would be expected), we observed that the posterior probabilities for the HLA alleles were comparable with the 2 methods, and HLA imputation data reported herein are based on the imputed data incorporating the imputation posterior probability.
Estimation of local ancestry. Local ancestry assignment was performed using PCAdmix (22) at K = 3 ancestral groups. This approach relies on phased data from reference panels and the admixed individuals. To construct our reference panel we combined publicly available data from 26 West African (YRI) and 26 European-descent (CEU) HapMap3 trios (23,24) and 40 of the Nahua individuals. After merging with all Latino individuals and pruning for missing values at 5% and highly linked markers with r values of >0.8, the resulting data set comprised 417,563 overlapping SNPs. Individual genotypes from admixed and continental reference samples were phased using Beagle version 3.0 (25). An assessment of the accuracy of this approach is provided in ref. 26. For each local ancestry window we compared cases and controls by Welch’s 2-sample t-test to assess whether there was significant contribution of any given ancestral group to each window’s disease association.
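The per-window comparison described above can be illustrated with a short Python sketch of Welch's 2-sample t statistic and its Welch-Satterthwaite degrees of freedom. The encoding of local ancestry as a per-individual dosage of 0, 0.5, or 1 for one ancestral group in one window is an assumption made for illustration, as are the function name and the example values; the sketch also assumes each group has at least 2 individuals and nonzero variance in at least one group.

```python
import math
from statistics import mean, variance

def welch_t(cases, controls):
    """Return (t statistic, degrees of freedom) for Welch's 2-sample t-test."""
    n1, n2 = len(cases), len(controls)
    v1, v2 = variance(cases), variance(controls)  # sample variances (n - 1)
    se2 = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (mean(cases) - mean(controls)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical ancestry dosages for one window (fraction of the 2 haplotypes
# assigned to a given ancestral group) in 4 cases and 4 controls.
t_stat, dof = welch_t([0.0, 0.5, 1.0, 0.5], [0.0, 0.0, 0.5, 0.5])
```

Unlike Student's t-test, this form does not assume equal variances in cases and controls, which matters when the two groups differ in size or in ancestry heterogeneity.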
Bioinformatic analysis of eQTL locus. The GENEVAR database (http://www.sanger.ac.uk/resources/software/genevar/) was used to analyze gene expression variation of the novel locus rs7911488 on USMG5. Data from HapMap lymphoblasts from individuals of several ancestries were incorporated in the analysis. In addition, ENCODE version 3 data (http://www.genome.gov/encode/) were used to investigate transcription factor binding information for the locus. Transfac and the Match interface (http://www.gene-regulation.com/cgi-bin/pub/programs/match/bin/match.cgi) were used to determine if SNP alleles modified binding of the transcription factors to the target DNA sequence. Only high-quality vertebrate matrices were used, with cutoffs set to minimize the sum of the false-positive and false-negative results. Prediction of the RNA folding for each of the alleles was performed using RNAfold (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi). Finally, BioGPS (http://biogps.org) was used to obtain information about the cellular expression of USMG5.
RESULTS
Origin and quality control of genotyped SLE
and control samples. The present GWAS included
3,710 genotyped individuals (1,393 SLE cases, 886 con-
trols, and genotype data on 1,431 out-of-study controls
from the SIGMA study) (27). The 1,393 SLE cases con-
sisted of 862 individuals of enriched Native American
ancestry from Mexico and Hispanics and Native Ameri-
cans from the US (collectively denoted North Ameri-
can), and 531 individuals of enriched Native American
ancestry from Argentina, Chile, and Peru (collectively
denoted South American). Parallel numbers for within-
study controls were 542 and 344, respectively; all out-of-
study controls were of North American origin. Omni1
and Omni2.5 had 996,672 and 1,469,575 SNPs that
passed quality control filters, respectively. A total of
580,483 SNPs on both chips are the foundation of the
analyses reported here. After completion of quality con-
trol procedures, 88.7% of the North American SLE
cases and 78.8% of the North American controls were
female; inclusion of the North American out-of-study
controls reduced this percentage to 65.9% female. Simi-
larly, 91.7% and 89.5% of the South American SLE
cases and controls, respectively, were female.
Principal components analysis. With PCA, we
identified 2 PCs that significantly reduced the inflation
of our test statistic to 1.18 and were supported by the
PC scree plot. The addition of more PCs did not mean-
ingfully impact the inflation. The PC plot shown in Sup-
plementary Figure 1 (on the Arthritis & Rheumatology
web site at http://onlinelibrary.wiley.com/doi/10.1002/
art.39504/abstract) demonstrates that 2 PCs differentiat-
ed the populations, diverging with a vertex at the Euro-
pean ancestry cluster and North American samples at
the top, and South American samples below. These cor-
responded to a South American population group
(n = 875), and a North American group (n = 2,835). The
association analysis probability-probability plot is shown in
Supplementary Figure 2 (http://onlinelibrary.wiley.com/
doi/10.1002/art.39504/abstract). Thus, the reported
analyses entail adjustment for 2 PCs and then a geno-
mic control adjustment.
The joint analysis of North American and South
American samples versus the separate meta-analysis of
North American samples and South American samples
was carefully examined. Clearly, and because the out-of-
study controls were of North American origin, the statis-
tical power was higher for the North American analysis.
As expected, the results of the separate meta-analysis
were comparable with those of the joint analysis, allow-
ing for a more robust sample size for use in consider-
ation of additive and recessive genetic models
(Supplementary Figure 3 and Supplementary Table 1,
http://onlinelibrary.wiley.com/doi/10.1002/art.39504/
abstract). There were no SNPs with evidence of differ-
ential effects between North American and South
American samples that met genome-wide significance
or FDR thresholds of significance.
Non-HLA associations. Results for non-HLA associations are partitioned into 3 tiers: tier 1 ($P < 5 \times 10^{-8}$), tier 2 ($5 \times 10^{-8} < P < 1 \times 10^{-6}$), and tier 3 ($P > 1 \times 10^{-6}$ and $P_{\text{FDR}} < 0.05$). All associations in these tiers meet significance after adjustment for the number of tests computed via the Benjamini-Hochberg FDR method. All associations reported have LD support for the top associations (Table 1). All loci were subjected to stepwise analysis (conditional logistic regression analysis) following imputation. In total, 7 regions met tier 1 criteria, 4 regions met tier 2 criteria, and 31 additional regions met tier 3 criteria (Table 1 and Supplementary Table 2 and Figure 4, on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.39504/abstract).
Surprisingly, the most significant association across the entire genome, including the HLA region, was that implicating IRF5–TNPO3 (Figure 1 and Table 1). Specifically, rs12539741 (genomic control–adjusted P [$P_{\text{gcadj}}$] $= 6.24 \times 10^{-31}$, $P_{\text{FDR}} = 1.32 \times 10^{-25}$, OR 2.11 [95% CI 1.88–2.37]) together with rs10488631 ($P_{\text{gcadj}} = 2.61 \times 10^{-29}$, $P_{\text{FDR}} = 6.04 \times 10^{-24}$, OR 2.12 [95% CI 1.88–2.39]) were the top associations.
The associations of multiple previously reported non-HLA SLE-predisposing loci in North American–enriched Hispanic samples were supported in the samples analyzed in the present study (Table 1). Among the most prominent were ITGAM (rs34572943: $P_{\text{gcadj}} = 6.52 \times 10^{-19}$, OR 2.25 [95% CI 1.91–2.66]), STAT4 (rs11889341: $P_{\text{gcadj}} = 7.02 \times 10^{-13}$, OR 1.5 [95% CI 1.36–1.66]), TNIP1 (rs7708392: $P_{\text{gcadj}} = 2.23 \times 10^{-11}$, OR 1.76 [95% CI 1.50–2.07]), NCF2 (rs13306575: $P_{\text{gcadj}} = 3.21 \times 10^{-8}$, OR 1.62 [95% CI 1.39–1.91]), and IRAK1 in the X chromosome (rs1059702: $P_{\text{gcadj}} = 8.42 \times 10^{-14}$, OR 0.58 [95% CI 0.49–0.68]) (Table 1). A locus in proximity to TNFSF4 was detected at the tier 1 level (rs17346550, LOC100506023: $P_{\text{gcadj}} = 5.59 \times 10^{-10}$, OR 0.66 [95% CI 0.58–0.74]) (Figure 1 and Table 1). A region previously associated in Europeans that was found to be associated with enriched Native American ancestry at the tier 3 level in our study was JAZF1 (Table 1 and Supplementary Table 2 and Figure 4). A region previously associated in Asians and associated with good support in the present study was WDFY4 at tier level 3, as well as RABGAP1L, previously associated as a deletion variant in Koreans (28).

Table 1. Overview of the SLE-associated loci with FDR values of <0.05 in the genome-wide association study*
Tier 1: best $P_{\text{gcadj}} < 5.0 \times 10^{-8}$; tier 2: best $P_{\text{gcadj}} < 1.0 \times 10^{-6}$; tier 3: $P_{\text{FDR}} < 0.05$ (selected results).

| Tier | SNP† | Chr. | Position | Region‡ | Gene | RA | RAF, cases | RAF, controls | Best $P_{\text{gcadj}}$ | Best $P_{\text{FDR}}$ | OR (95% CI) | Stepwise P value, imputation§ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | rs12736195$_{i}$ | 1q25 | 173307901 | 5 | LOC100506023 | C | 0.210 | 0.279 | $7.31 \times 10^{-11}$ | $1.08 \times 10^{-6}$ | 0.66 (0.59–0.74) | $1.36 \times 10^{-11}$ |
| 1 | rs17346550 | 1q25 | 173315625 | 5 | LOC100506023 | C | 0.202 | 0.269 | $5.59 \times 10^{-10}$ | $5.40 \times 10^{-6}$ | 0.66 (0.58–0.74) | – |
| 1 | rs13306575¶ | 1q25 | 183532437 | 6 | NCF2 | A | 0.117 | 0.105 | $3.21 \times 10^{-8}$ | $1.62 \times 10^{-4}$ | 1.62 (1.39–1.91) | – |
| 1 | rs11889341 | 2q32 | 191943742 | 3 | STAT4 | T | 0.489 | 0.407 | $7.02 \times 10^{-13}$ | $1.05 \times 10^{-8}$ | 1.5 (1.36–1.66) | – |
| 1 | rs7708392$_{r,i}$ | 5q33 | 150457485 | 4 | TNIP1 | C | 0.536 | 0.499 | $2.23 \times 10^{-11}$ | $3.42 \times 10^{-7}$ | 1.76 (1.50–2.07) | $1.34 \times 10^{-11}$ |
| 1 | rs10036748$_{r}$ | 5q33 | 150458146 | 4 | TNIP1 | T | 0.534 | 0.498 | $9.56 \times 10^{-11}$ | $1.01 \times 10^{-6}$ | 1.72 (1.46–2.02) | – |
| 1 | rs4728142 | 7q32 | 128573967 | 1 | IRF5 | A | 0.498 | 0.387 | $1.06 \times 10^{-23}$ | $1.22 \times 10^{-18}$ | 1.76 (1.59–1.95) | – |
| 1 | rs10488631 | 7q32 | 128594183 | 1 | TNPO3 | C | 0.295 | 0.223 | $2.61 \times 10^{-29}$ | $6.04 \times 10^{-24}$ | 2.12 (1.88–2.39) | – |
| 1 | rs12539741$_{i}$ | 7q32 | 128596805 | 1 | TNPO3 | T | 0.295 | 0.222 | $6.24 \times 10^{-31}$ | $1.32 \times 10^{-25}$ | 2.11 (1.88–2.37) | $1.26 \times 10^{-34}$ |
| 1 | rs4917385$_{i}$ | 10q24 | 105003721 | 7 | BC040734 | T | 0.294 | 0.352 | $1.39 \times 10^{-8}$ | $1.04 \times 10^{-4}$ | 0.72 (0.65–0.80) | $9.98 \times 10^{-10}$ |
| 1 | rs34572943 | 16p11 | 31272353 | 2 | ITGAM | A | 0.150 | 0.068 | $6.52 \times 10^{-19}$ | $3.78 \times 10^{-14}$ | 2.25 (1.91–2.66) | – |
| 1 | rs1143679$_{i}$ | 16p11 | 31276811 | 2 | ITGAM | A | 0.158 | 0.071 | $3.26 \times 10^{-21}$ | $2.92 \times 10^{-16}$ | 2.30 (1.97–2.70) | $8.82 \times 10^{-24}$ |
| 1 | rs1059702 | Xq28 | 153284192 | X1 | IRAK1 | A | 0.416 | 0.500 | $8.42 \times 10^{-14}$ | $1.45 \times 10^{-9}$ | 0.58 (0.49–0.68) | – |
| 2 | rs643955$_{r,i}$ | 8p23 | 9891254 | 39 | – | T | 0.759 | 0.748 | $3.92 \times 10^{-7}$ | 0.0016 | 1.49 (1.28–1.73) | $3.06 \times 10^{-7}$ |
| 2 | rs7911488 | 10q24 | 105154089 | 7 | USMG5 | G | 0.296 | 0.350 | $2.48 \times 10^{-7}$ | $9.26 \times 10^{-4}$ | 0.74 (0.66–0.82) | – |
| 2 | rs11231824$_{i}$ | 11q13 | 64354795 | 8 | SLC22A12 | C | 0.146 | 0.111 | $7.02 \times 10^{-7}$ | 0.0026 | 1.51 (1.3–1.76) | $8.99 \times 10^{-7}$ |
| 2 | rs1878186$_{d,i}$ | 15q21 | 48508400 | 30 | SLC12A1 | C | 0.072 | 0.047 | $1.49 \times 10^{-7}$ | $7.24 \times 10^{-4}$ | 1.82 (1.45–2.28) | $2.49 \times 10^{-7}$ |
| 3 | rs17301013$_{d}$ | 1q25 | 174312813 | 11 | RABGAP1L | T | 0.377 | 0.417 | $8.41 \times 10^{-6}$ | 0.0155 | 0.70 (0.61–0.81) | – |
| 3 | rs10254284$_{i}$ | 7p15 | 28167391 | 19 | JAZF1 | A | 0.613 | 0.583 | $8.21 \times 10^{-6}$ | 0.0156 | 1.29 (1.16–1.42) | $4.06 \times 10^{-7}$ |
| 3 | rs4641121$_{d,i}$ | 9p24 | 7071706 | 10 | KDM4C | A | 0.144 | 0.149 | $8.39 \times 10^{-6}$ | 0.0159 | 0.68 (0.58–0.80) | $4.78 \times 10^{-6}$ |
| 3 | rs2928402$_{i}$ | 10q11 | 50018165 | 25 | WDFY4 | C | 0.235 | 0.182 | $1.20 \times 10^{-6}$ | 0.0039 | 1.39 (1.23–1.57) | $1.58 \times 10^{-7}$ |
| 3 | rs931127 | 11q13 | 65405300 | 9 | PCNXL3 | G | 0.721 | 0.703 | $6.71 \times 10^{-6}$ | 0.0132 | 1.33 (1.18–1.48) | – |
| 3 | rs7892586 | Xp22 | 12833100 | X2 | PRPS2 | A | 0.212 | 0.288 | $3.18 \times 10^{-6}$ | 0.0076 | 0.73 (0.60–0.88) | – |

* SLE = systemic lupus erythematosus; FDR = false discovery rate; Chr. = chromosome; RA = risk allele; RAF = risk allele frequency; OR = odds ratio; 95% CI = 95% confidence interval.
† Subscript d denotes that P values for the single-nucleotide polymorphism (SNP) are based on a dominant model, and subscript r denotes that P values are based on a recessive model; all other P values are based on an additive genetic model. Subscript i denotes that P values are based on imputed SNP genotypes; all other P values are based on direct genotyping.
‡ Listed in order of statistical significance of the hit, beginning with the best P value across the genome but excluding the major histocompatibility complex.
§ P values shown in this column surpassed the genomic control–adjusted P value ($P_{\text{gcadj}}$).
¶ P < 0.01, within-study controls versus out-of-study controls.
Our GWAS identified a number of novel loci associated with SLE. The region on 10q24.33 (rs4917385: $P_{\text{gcadj}} = 1.39 \times 10^{-8}$, OR 0.72 [95% CI 0.65–0.80] and rs7911488: $P_{\text{gcadj}} = 2.48 \times 10^{-7}$, OR 0.74 [95% CI 0.66–0.82]) had 27 tier 1 and tier 2 SNPs (i.e., P values $< 10^{-6}$). The association in this region covers several genes (INA, CALHM1, USMG5, PDCD11, PCGF6, NT5C2, COL17A1, CNNM2, and miR1307) (Figure 2). Using stepwise analysis following imputation, the strongest hit explaining the entire association was found to be rs4917385 ($P = 9.98 \times 10^{-10}$) located on RPEL1. Strong LD was observed across the region, and further fine mapping will be needed.
Considering the ancestral proximity between Native American and Asian populations, we used data from an ongoing SLE GWAS in Korean subjects (29). The start and stop positions on the meta-analysis were based on SNPs that were ±500 kb from rs7911488, with the actual coordinates (b37): chromosome 10, 104,661,484–105,643,134 bp. Importantly, exactly the same genotyped SNPs showed association (albeit weak) in this Korean population (rs11191642: P value from dominant model [$P_{\mathrm{dom}}$] $= 1.02 \times 10^{-2}$ and rs7911488: $P_{\mathrm{dom}} = 1.33 \times 10^{-2}$) as in the Amerindian-enriched individuals (rs11191642: $P_{\mathrm{dom}} = 3.97 \times 10^{-6}$ and rs7911488: $P_{\mathrm{dom}} = 2.80 \times 10^{-6}$) (Figure 2), increasing the significance slightly for this locus (rs11191642: $P_{\mathrm{dom}} = 8.66 \times 10^{-7}$ and rs7911488: $P_{\mathrm{dom}} = 1.01 \times 10^{-6}$) (Figure 2 and Supplementary Table 3, on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.39504/abstract).
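To make the direction of this change concrete, the sketch below combines the two per-cohort P values for rs7911488 using Fisher's method. This is an assumed stand-in for illustration only: the meta-analysis method used in the study is not described in this section, Fisher's method ignores sample sizes and directions of effect, and the result is not expected to reproduce the reported combined value. The function name `fisher_combine_two` is hypothetical.

```python
import math

def fisher_combine_two(p1, p2):
    """Combine two independent P values with Fisher's method.

    Under the null, -2 * (ln p1 + ln p2) follows a chi-square distribution
    with 4 degrees of freedom, whose upper tail probability has the closed
    form exp(-x / 2) * (1 + x / 2).
    """
    x = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

p_korean = 1.33e-2      # rs7911488, dominant model, Korean cohort
p_amerindian = 2.80e-6  # rs7911488, dominant model, Amerindian-enriched cohort
p_combined = fisher_combine_two(p_korean, p_amerindian)
print(f"{p_combined:.2e}")
```

Under these assumptions the combined P value is smaller than either input, which is the qualitative behavior described in the text when a weak but concordant signal is added to a stronger one.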
Two novel regions with significance at the tier 2 level were SLC22A12 and SLC12A1 (Table 1). Three novel regions reached the tier 3 level of significance: KDM4C, PCNXL3, and PRPS2 in the X chromosome. Using the FDR threshold of significance of 0.05, we identified 17 other novel and suggestive loci in the present study. Included among these were TIPRL, DSTYK, UNC80, ABTB2, MS4A1, MACROD1, NRXN2, RAD51B, CRIP1, CTXN2, CABLES1, CDCD2, and PGPEP1 (Supplementary Table 2, http://onlinelibrary.wiley.com/doi/10.1002/art.39504/abstract).

Figure 1. Annotated Manhattan plot of the genome-wide association study. The plot represents the joint additive association analysis of North American and South American sets of cases and controls.
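The use of an FDR threshold of 0.05 to flag suggestive loci can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg step-up procedure, which may differ from the FDR method actually applied in this study; the function name `benjamini_hochberg` and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values for five loci, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
example_adjusted = benjamini_hochberg(example_p)
suggestive = [p for p, q in zip(example_p, example_adjusted) if q < 0.05]
print(example_adjusted, suggestive)
```

A locus is retained when its adjusted value falls below the chosen threshold, so the number of loci declared suggestive depends on the full set of tests and not only on each P value in isolation.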
Expression QTL analysis of the major novel
locus. The chromosome 10q24.33 region houses
USMG5 and miR1307, where the associated SNP
rs7911488 is located in the loop of the hairpin structure
of the pre-microRNA (pre-miR) of miR-1307 (30). SNP
rs7911488 shows a cis eQTL effect (Supplementary
Figure 5, http://onlinelibrary.wiley.com/doi/10.1002/art.
39504/abstract) for all populations available (30,31).
Particularly for the populations of interest, rs7911488 association values were, for CEU and MEX, $P_{\mathrm{eqtl}} = 8.0 \times 10^{-37}$ and $P_{\mathrm{eqtl}} = 5.1 \times 10^{-15}$, respectively, determined using HapMap lymphoblastoid cell lines (31).
The data indicated significantly lower USMG5 levels in
individuals with the AA genotype and higher levels in
those with the GG genotype, with levels in heterozy-
gotes intermediate between the two. Examination of
transcription factor binding data from ENCODE ver-
sion 3 revealed that rs7911488 lies within the binding
region of several transcription factors. Using high-
quality Transfac matrices we selected the 2 best tran-
scription factor candidates to bind to the sequence of
USMG5. The A variant of rs7911488 exhibited binding sites for 2 transcription factors: cartilage oligomeric matrix protein 1 (COMP-1) (matrix sequence cgagtcGATTGgcaacacagacga [with matrix core in capital letters]), and regulatory factor X subunit 1 (RFX1) (gagtcgattgGCAACac). The G variant retained only the binding site for COMP-1. Thus, only RFX1 binding was lost when the risk variant (G) of rs7911488 was present.
The second intriguing mechanism of action for
rs7911488 involves the microRNA-1307 gene miR1307,
which overlaps exon 2 of USMG5. While this microRNA
is listed as “provisional” in RefSeq, there is evidence
that it is expressed (30); rs7911488 lies in the middle of
this gene, and while it does not alter the sequence of the
mature miR1307 products, it appears to considerably
change the structure of the precursor molecule in a way
that would likely render it inactive. Similar to USMG5,
miR1307 is encoded from the negative DNA strand.
Therefore, the A/G SNP designation, based on the ref-
erence sequence, would be the complement, T/C in
miR1307. Based on the predicted folding of RNA from
the 2 alleles, the T allele forms the double-stranded
structure of a precursor microRNA molecule (Supple-
mentary Figure 6, http://onlinelibrary.wiley.com/doi/
10.1002/art.39504/abstract). This would result in the
molecule binding to the Dicer complex and eventual
release of the single-stranded microRNAs. The pres-
ence of the C allele is predicted to completely alter the
secondary structure of the molecule (Supplementary
Figure 6), making it unlikely to bind to the Dicer com-
plex required for trimming. These data suggest that only
the T allele would result in an active miR1307, which
would affect all of the downstream targets of this
molecule.
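The strand conversion described above (A/G on the reference strand corresponding to T/C in a transcript encoded on the negative strand) can be written as a small Python sketch. The helper name `to_opposite_strand` is hypothetical and only handles single-nucleotide alleles written in the "X/Y" form used in the text.

```python
# Watson-Crick complements for single DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def to_opposite_strand(alleles):
    """Convert alleles such as "A/G" to their opposite-strand equivalents."""
    return "/".join(COMPLEMENT[base] for base in alleles.split("/"))

# rs7911488 alleles as given on the reference (positive) strand.
minus_strand_alleles = to_opposite_strand("A/G")
print(minus_strand_alleles)
```

Each allele is complemented separately; for a multi-base sequence the order would also need to be reversed to obtain the reverse complement.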
MHC region association. The HLA region showed several SNPs primarily confined to the class II region, with 1 peak on the DQA2–DQB1 loci (rs9275572: $P_{\mathrm{gcadj}} = 1.11 \times 10^{-16}$, OR 1.62 [95% CI 1.46–1.80]) and a second peak on the DRB1–DQA1 loci (rs9271366: $P_{\mathrm{gcadj}} = 6.46 \times 10^{-12}$, OR 2.06 [95% CI 1.71–2.50]).
The latter SNP is the tag for the DR2 haplotype of the
HLA allele DRB1*15:01 in Europeans (32). We observed
no association with the tag for DR3 in Europeans,
rs2187668 (DRB1*03:01), present in the Omni1 and
2.5 BeadArrays and passing quality control, suggesting
that the haplotype structure is different from that of
Europeans.
Imputation of the HLA alleles identified both
known and novel 4-digit allele associations with SLE.
Figure 2. LocusZoom plot of the newly identified systemic lupus ery-
thematosus–associated locus in chromosome 10 (Chr. 10). The data were
obtained from the joint additive association analysis using only genotype
data covering positions 105–105.3 Mb. Single-nucleotide polymorphism
rs7911488 lies at USMG5 and miR1307. GWAS = genome-wide association study; GC = genomic control–adjusted.
Table 2 displays the HLA allele associations with imputed P values ($P_{\mathrm{imp}}$) of $< 0.0001$ or strongest association by locus.
The strongest associations were with DQA1*01:02, DQB1*06:02, and DRB1*15:01, known to form the common European extended haplotype for DR2, and DQA1*05:01, DQB1*02:01, and DRB1*03:01, known to form, in turn, the European extended haplotype for DR3 (33). Several alleles frequently found in Native American populations were also observed. For instance, alleles frequent in the Mexican population, but not exclusive to it, such as DQB1*03:01 and DQB1*03:02 (Table 2 and Supplementary Table 4, http://onlinelibrary.wiley.com/doi/10.1002/art.39504/abstract), appeared to confer protection against development of SLE. In particular, DQB1*03:01 was strongly associated individually ($P_{\mathrm{imp}} = 8.46 \times 10^{-12}$, OR 0.60 [95% CI 0.52–0.69]). This DQB1 allele is found in known Native American haplotypes, associated with the DRB1*14:06, DRB1*16:02, and DRB1*14:02 alleles (Supplementary Table 4). A similar effect was observed for DQB1*03:02, which showed a weaker protective association ($P_{\mathrm{imp}} = 2.2 \times 10^{-3}$, OR 0.82 [95% CI 0.72–0.93]). DQB1*03:02 is found in Native American haplotypes when associated with the DRB1*04:07 or DRB1*04:11 alleles. These DRB1 alleles also individually showed weaker associations, all protective (Supplementary Table 3), suggesting that Native American haplotypes, in aggregate, contribute primarily to protection against SLE. Two associations with DRB1 alleles that confer protection and that are found in haplotypes of European ancestry, DRB1*11:04 and DRB1*11:01, were observed.
The HLA–A and B genes also showed evidence of association and supported the presence of European-derived haplotypes (Table 2). Specifically, A*01:01 ($P_{\mathrm{imp}} = 5.52 \times 10^{-6}$, OR 1.58) and B*08:01 ($P_{\mathrm{imp}} = 1.48 \times 10^{-8}$, OR 2.06) conferred risk in these Amerindian ancestry–enriched samples. Both are associated in the European haplotype that includes DRB1*03:01. C*07:01 is also part of this haplotype; however, its association was much weaker ($P_{\mathrm{imp}} = 5.22 \times 10^{-3}$, OR 1.29) than was observed for other class I and class II alleles.
Table 2. Association results for imputed single HLA alleles by locus*

              Dosage frequency     Best-guess count
HLA allele    Cases     Controls   Cases   Controls   OR (95% CI)        P†
A*01:01       0.0936    0.0542     263     241        1.58 (1.30–1.93)   $5.52 \times 10^{-6}$
B*08:01       0.0658    0.0299     189     142        2.06 (1.61–2.65)   $1.48 \times 10^{-8}$
C*07:01       0.1087    0.0746     308     348        1.29 (1.08–1.55)   $5.22 \times 10^{-3}$
DPB1*04:02    0.2779    0.3522     819     1,808      0.84 (0.74–0.95)   $5.19 \times 10^{-3}$
DQA1*01:02    0.1433    0.0811     426     382        1.94 (1.63–2.31)   $1.64 \times 10^{-13}$
DQA1*05:01    0.1021    0.0502     284     230        1.91 (1.58–2.33)   $6.39 \times 10^{-11}$
DQA1*05:05    0.0890    0.1284     251     583        0.59 (0.49–0.70)   $1.00 \times 10^{-8}$
DQB1*06:02    0.0928    0.0469     278     227        2.16 (1.75–2.67)   $1.10 \times 10^{-12}$
DQB1*03:01    0.1413    0.2090     409     1,006      0.60 (0.52–0.69)   $8.46 \times 10^{-12}$
DQB1*02:01    0.1019    0.0502     283     231        1.92 (1.58–2.33)   $6.66 \times 10^{-11}$
DRB1*15:01    0.0899    0.0450     269     220        2.10 (1.70–2.59)   $6.82 \times 10^{-12}$
DRB1*03:01    0.1018    0.0496     284     230        1.94 (1.59–2.35)   $3.68 \times 10^{-11}$

* Associations were adjusted for principal components 1 and 2. OR = odds ratio; 95% CI = 95% confidence interval.
† P values of $< 1.0 \times 10^{-4}$ or smallest P value within an HLA locus.
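As a rough consistency check on entries such as those in Table 2, an approximate Wald statistic and two-sided P value can be recovered from a reported odds ratio and its 95% confidence interval. The Python sketch below assumes a symmetric interval on the log-odds scale; the function name `wald_from_ci` is hypothetical, and because the tabulated OR and CI are rounded to 2 decimal places, the recovered P value only approximates the tabulated one.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_from_ci(odds_ratio, ci_low, ci_high):
    """Approximate Wald z statistic and two-sided P value from an OR and 95% CI."""
    # Standard error of ln(OR), recovered from the width of the interval.
    standard_error = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    z = math.log(odds_ratio) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# DQB1*03:01 row of Table 2: OR 0.60 (95% CI 0.52-0.69).
z, p = wald_from_ci(0.60, 0.52, 0.69)
print(f"z = {z:.2f}, P = {p:.2e}")
```

A negative z statistic corresponds to an odds ratio below 1, that is, a protective association in the sense used in the text.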
Table 3. Results of stepwise analysis using HiBag-imputed HLA alleles*

              Dosage frequency     Best-guess count   Univariate results                           Stepwise model
HLA allele†   Cases     Controls   Cases   Controls   OR (95% CI)         P                        OR (95% CI)         P
DQA1*01:02    0.1433    0.0811     426     382        1.94 (1.63–2.31)    $1.64 \times 10^{-13}$   2.41 (1.96–2.95)    $3.96 \times 10^{-17}$
DRB1*03:01    0.1018    0.0496     284     230        1.94 (1.59–2.35)    $3.68 \times 10^{-11}$   1.98 (1.61–2.45)    $1.75 \times 10^{-10}$
DQB1*03:01    0.1413    0.2090     409     1,006      0.60 (0.52–0.69)    $8.46 \times 10^{-12}$   0.74 (0.64–0.87)    $2.71 \times 10^{-4}$
DQB1*06:02    0.0928    0.0469     278     227        2.16 (1.75–2.67)    $1.10 \times 10^{-12}$   0.42 (0.26–0.69)    $5.21 \times 10^{-4}$
A*01:01       0.0936    0.0542     263     241        1.58 (1.30–1.93)    $5.52 \times 10^{-6}$    1.41 (1.15–1.74)    $1.16 \times 10^{-3}$
DRB1*08:02    0.1192    0.1270     402     702        1.22 (1.02–1.45)    $2.65 \times 10^{-2}$    1.36 (1.13–1.64)    $1.21 \times 10^{-3}$
B*42:01       0.0054    0.0025     20      13         4.54 (1.54–13.40)   $6.16 \times 10^{-3}$    6.05 (2.01–18.27)   $1.40 \times 10^{-3}$
A*11:01       0.0511    0.0335     141     153        1.40 (1.09–1.79)    $7.62 \times 10^{-3}$    1.49 (1.15–1.92)    $2.22 \times 10^{-3}$

* A single-nucleotide polymorphism (SNP) was allowed to enter the model if the P value of association for the SNP was $< 0.01$, and SNPs within a model were kept in the model for as long as the P value of association remained at $< 0.01$. Associations were adjusted for principal components 1 and 2. OR = odds ratio; 95% CI = 95% confidence interval.
† Listed in descending order according to stepwise P value.
The European HLA class I gene alleles found in the
extended DRB1*15:01 haplotype, namely A*03:01,
B*07:02, and C*07:02, were not associated, despite the
strong association of the corresponding class II alleles.
Joint modeling of classic HLA alleles identified
8 alleles spanning the 5 genes. Some changes from the
univariate results were observed: the association of
DQA1*01:02 became much stronger (Table 3), while
associations of A*01:01, DRB1*03:01, and DQB1*03:01
became weaker and DQB1*06:02, a risk allele in the
univariate analysis, was a non-risk allele according to
joint modeling. The remaining 3 alleles exhibited only
minor modifications from the findings obtained in the
univariate analysis.
Local ancestry analysis of the HLA locus. To
ascertain whether any genetic association could involve
particular continental ancestries, we performed local
ancestry estimations using PCAdmix (22). The posterior
probabilities for each phased chromosome are depicted
in Supplementary Figure 7 (http://onlinelibrary.wiley.
com/doi/10.1002/art.39504/abstract). We analyzed each
associated locus for SNP windows showing association
between cases and controls in any given ancestry. Con-
firming our imputation data, the HLA region in chro-
mosome 6 (29.3–33.3 Mb) had 4 continuous windows
with 863 SNPs with significant differences between cases
and controls of European ancestry (Figure 3). The
remainder of the loci had no specific significant differences
in any particular ancestry, suggesting that most
loci composed of common variants are found across all
continental ancestry borders.
Figure 3. Distribution of European ancestry within the split HLA genomic region (regions 29–32). Local ancestry was estimated in each of 4 windows. The regions depicted cover HLA class I, III, and II, in order from 29 to 33 Mb. The mean distribution of local European HLA ancestry was significantly different ($P = 2.2 \times 10^{-6}$) between cases (designated as 2) and controls (designated as 1) across all windows, and ancestry was highest in the window reflecting region 32 (32.3–33.3 Mb) containing HLA–DRB1 to DQA2, the region of the major histocompatibility complex with the main genetic association. European ancestry within the HLA region conferred an increased risk for systemic lupus erythematosus. Data are shown as box plots. Each box represents the interquartile range indicating the first (25th percentile) and third (75th percentile) quartiles. Notches in the boxes represent the median. Thin vertical whiskers represent 2 SD from the median. Thick vertical lines above whiskers represent all values more than 2 SD from the median.

DISCUSSION
This is the first published GWAS in SLE patients enriched for Native American ancestry. The opportunity to perform a complete GWAS provides an interesting
view of the genetic structure of the population. This
study yielded several findings. We clearly demonstrated
a different pattern of genetic association in which the
genetic association within the IRF5–TNPO3 locus sur-
passes that of the major histocompatibility complex
(MHC), the main and major association for auto-
immune disease in Europeans and in most populations.
Use of 2 approaches, local ancestry estimation and
imputation of classic HLA alleles, showed that risk for
lupus within the HLA region is conferred by the Euro-
pean ancestral alleles involving DRB1*03:01 and
DRB1*15:01, while all autochthonous Amerindian
ancestral alleles identified were protective (33). It is fea-
sible that the degree of diversity of the Native American
populations is such that there are several potential, but
low-frequency, risk alleles or novel functional alleles that
remain to be discovered and require studies with much
more power to be consistently associated with SLE.
Our present results and our previous observations
(8) that individuals with higher Native American global
ancestry had increased numbers of risk alleles seem to be
contradictory. The most plausible explanation is that a
higher number of alleles is needed to attain a level of risk
for SLE in the Native American population.
The imputation of HLA alleles using HiBag in a
primarily Native American population may be contro-
versial due to the scant information available and the
complexity of the admixture. To ascertain the fidelity of
our data, we performed a comparison of imputed and
genotyped alleles in 87 individuals. Agreement between
2- and 4-digit HLA genotyping with that of HiBag impu-
tation was high (see Supplementary Figure 8, available
on the Arthritis & Rheumatology web site at http://online-
library.wiley.com/doi/10.1002/art.39504/abstract), rein-
forcing our results and allowing for several conclusions.
Interestingly, the classic haplotype for DRB1*
15:01 class I loci in Europeans (A*03:01,B*07:02, and
C*07:02) was not associated with SLE in our study pop-
ulation. This finding suggests that the class II associa-
tions are independent and would support the notion of
a major role of class II, and not class I, alleles in SLE.
This is consistent with the results of other studies using
trans-ancestral mapping of the HLA region in SLE (34).
We identified 23 completely new suggestive loci.
As recent admixture can produce extended blocks of
LD, several of these regions need to be analyzed
individually. For example, PCNXL3 is located close to
RNASEH2C, a gene implicated in Aicardi-Goutières
interferonopathy (34), a monogenic disease resembling
SLE in infants, and PRPS2 is located close to TLR7 in
the X chromosome. In one study, the genetic association
with TLR7 was found exclusively in male subjects (35),
but a second study demonstrated the association in
males and females (36). A male tendency was not dem-
onstrated in our study (Supplementary Table 5, http://
onlinelibrary.wiley.com/doi/10.1002/art.39504/abstract);
we observed the association in females, while, as expected,
the frequency in males was reduced.
One interesting finding in the present study was
the strong association of the IRF5–TNPO3 locus. This
association surpassed even the MHC, suggesting major
alternative mechanisms in disease development. We had
previously reported this association in Mexicans (37).
This locus shows a stronger effect than that for published
European lineages (38); the reason for this is unclear.
IRF5 is a transcription factor that is clearly
involved in type I interferon (IFN) and proinflammatory
signaling, and risk alleles for this locus are associated
with increased expression of IRF5. Most importantly,
studies by Niewold et al have demonstrated specific
associations of IRF5–TNPO3 haplotypes with anti-RBP
antibodies (the collective name for anti-Sm, anti-Ro,
and anti–nuclear RNP antibodies). Niewold and colleagues
showed that both IRF5 haplotypes and anti-RBP
serology were strongly associated with increased
levels of IFNα in serum (39), a key feature of SLE. The
role of TNPO3, if any, is not understood, but a recent
study by Kottyan et al (40), in which both genes were
carefully analyzed in subjects of several different ethnic-
ities, showed an independent haplotype that included
both genes, tagged by SNP rs12534421, that originated in
European or European-admixed populations but is
not found in Africans from HapMap or 1000 Genomes.
Another haplotype, tagged by SNP rs4728142 and locat-
ed in the promoter of IRF5, was found in subjects of all
ethnicities. Local admixture estimations showed in-
creased European admixture at the IRF5–TNPO3 locus
in African American cases (40). Our local ancestry analysis
in the admixed Amerindian–European population
sample does not specifically provide evidence of increased
European ancestry within the IRF5–TNPO3
locus, although this possibility appears quite likely and
would further support the idea that European ancestry
has a role in the inheritance of IRF5 SNPs in disease
susceptibility, as we previously proposed (37).
Local ancestry analysis provided information on
the effect of other ancestries in our study loci, and inter-
estingly, none with the exception of the HLA showed a
clear ancestral “bias.” Demonstrating this could require
a larger sample size. It is important to understand the
fine structural consequences of admixture and their
impact on the development of disease in rapidly expand-
ing and admixed populations, in order to eventually
define differences that might have an impact on the
mechanisms of disease development and the design of
new individual therapies.
Our data suggesting the involvement of the
USMG5 gene and miR1307 warrant further investigation.
Whether the expression of this microRNA may impact
numerous targets of importance in lupus needs to be
experimentally tested. USMG5 is widely expressed in the
central nervous system, but also in CD105+ endothelial
cells, CD34+ hematopoietic stem cells, and dendritic,
myeloid, and natural killer cells. RFX1, the transcription
factor whose binding was abolished according to Trans-
fac analysis, is known to regulate expression of the HLA
class II molecules. In a recent study, RFX1 was found to
regulate the expression of CD11a and CD70 in T cells
from patients with SLE (41).
In conclusion, we suggest the existence, in this North
and South American study population of Amerindian-
enriched ancestry, of 23 novel loci for lupus that require rep-
lication, and 1 completely novel locus in chromosome 10
that has been replicated in Asians. Several of the major
genes for lupus that have been previously confirmed in
Europeans and Asians are also associated in the present
population, showing that the major genes for this disease
are clearly established and associated across ethnicities.
ACKNOWLEDGMENTS
The authors would like to acknowledge the Wake
Forest Center for Public Health Genomics for personnel
support and computing support, the late Laura Riba (who
passed away in 2014) for her excellent help in preparation
and organization of Mexican samples, and Ms Farideh
Movafagh and Ms Rosario Rodríguez-Guillén for technical
assistance.
AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it
critically for important intellectual content, and all authors approved
the final version to be published. Dr. Alarcón-Riquelme had full
access to all of the data in the study and takes responsibility for the
integrity of the data and the accuracy of the data analysis.
Study conception and design. Alarcón-Riquelme, Moreno-Estrada, Sánchez-Rodríguez, Vyse, Criswell, Tsao, Sivils, Kimberly, Kaufman, Harley, Pons-Estel, Langefeld, Jacob.
Acquisition of data. Alarcón-Riquelme, Rasmussen, Kelly, Adler, Acevedo-Vázquez, Cucho-Venegas, García-De la Torre, Cardiel, Miranda, Catoggio, Maradiaga-Ceceña, Gaffney, Tsao, Sivils, Bae, James, Kimberly, Esquivel-Valerio, Moctezuma, García, Berbotto, Babini, Scherbarth, Toloza, Baca, Aguilar Salinas, Orozco, Tusié-Luna, Pons-Estel.
Analysis and interpretation of data. Alarcón-Riquelme, Ziegler, Molineros, Howard, Moreno-Estrada, Sánchez-Rodríguez, Ainsworth, Ortiz-Tello, Comeau, Nath, Tusié-Luna, Zidovetzki, Langefeld, Jacob.
REFERENCES
1. Pons-Estel BA, Catoggio LJ, Cardiel MH, Soriano ER, Gentiletti S,
Villa AR, et al. The GLADEL multinational Latin American pro-
spective inception cohort of 1,214 patients with systemic lupus ery-
thematosus: ethnic and disease heterogeneity among “Hispanics.”
Medicine (Baltimore) 2004;83:1–17.
2. Auton A, Bryc K, Boyko AR, Lohmueller KE, Novembre J,
Reynolds A, et al. Global distribution of genomic diversity
underscores rich complex history of continental human popula-
tions. Genome Res 2009;19:795–803.
3. Harley JB, Alarcon-Riquelme ME, Criswell LA, Jacob CO, Kim-
berly RP, Moser KL, et al. Genome-wide association scan in
women with systemic lupus erythematosus identifies susceptibili-
ty variants in ITGAM, PXK, KIAA1542 and other loci. Nat
Genet 2008;40:204–10.
4. Hom G, Graham RR, Modrek B, Taylor KE, Ortmann W, Garnier
S, et al. Association of systemic lupus erythematosus with C8orf13-
BLK and ITGAM-ITGAX. N Engl J Med 2008;358:900–9.
5. Calvo-Alen J, Reveille JD, Rodriguez-Valverde V, McGwin G
Jr, Baethge BA, Friedman AW, et al. Clinical, immunogenetic
and outcome features of Hispanic systemic lupus erythematosus
patients of different ethnic ancestry. Lupus 2003;12:377–85.
6. Sanchez E, Rasmussen A, Riba L, Acevedo-Vasquez E, Kelly
JA, Langefeld CD, et al. Impact of genetic ancestry and sociode-
mographic status on the clinical expression of systemic lupus ery-
thematosus in American Indian–European populations. Arthritis
Rheum 2012;64:3687–94.
7. Richman IB, Taylor KE, Chung SA, Trupin L, Petri M, Yelin E,
et al. European genetic ancestry is associated with a decreased
risk of lupus nephritis. Arthritis Rheum 2012;64:3374–82.
8. Sanchez E, Webb RD, Rasmussen A, Kelly JA, Riba L, Kaufman
KM, et al. Genetically determined Amerindian ancestry correlates
with increased frequency of risk alleles for systemic lupus erythe-
matosus. Arthritis Rheum 2010;62:3722–9.
9. Tan EM, Cohen AS, Fries JF, Masi AT, McShane DJ, Rothfield
NF, et al. The 1982 revised criteria for the classification of sys-
temic lupus erythematosus. Arthritis Rheum 1982;25:1271–7.
10. Sanchez E, Nadig A, Richardson BC, Freedman BI, Kaufman KM,
Kelly JA, et al. Phenotypic associations of genetic susceptibility loci
in systemic lupus erythematosus. Ann Rheum Dis 2011;70:1752–7.
11. Gomez M, Clark RM, Nath SK, Bhatti S, Sharma R, Alonso E,
et al. Genetic admixture of European FRDA genes is the cause
of Friedreich ataxia in the Mexican population. Genomics 2004;
84:779–84.
12. Wang S, Ray N, Rojas W, Parra MV, Bedoya G, Gallo C, et al.
Geographic patterns of genome admixture in Latin American
Mestizos. PLoS Genet 2008;4:e1000037.
13. Williams AL, Jacobs SB, Moreno-Macias H, Huerta-Chagoya A,
Churchhouse C, Marquez-Luna C, et al. Sequence variants in
SLC16A11 are a common risk factor for type 2 diabetes in Mexi-
co. Nature 2014;506:97–101.
14. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen
WM. Robust relationship inference in genome-wide association
studies. Bioinformatics 2010;26:2867–73.
15. Alexander DH, Novembre J, Lange K. Fast model-based estima-
tion of ancestry in unrelated individuals. Genome Res 2009;19:
1655–64.
16. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-
analysis of genomewide association scans. Bioinformatics 2010;
26:2190–1.
17. Zhou X, Stephens M. Genome-wide efficient mixed-model ana-
lysis for association studies. Nat Genet 2012;44:821–4.
18. Wright SP. Adjusted P-values for simultaneous inference. Bio-
metrics 1992;48:1005–13.
19. Hancock DB, Levy JL, Gaddis NC, Bierut LJ, Saccone NL,
Page GP, et al. Assessment of genotype imputation performance
using 1000 Genomes in African American studies. PLoS One
2012;7:e50610.
20. Howie B, Fuchsberger C, Stephens M, Marchini J, Abecasis GR.
Fast and accurate genotype imputation in genome-wide associa-
tion studies through pre-phasing. Nat Genet 2012;44:955–9.
21. Zheng X, Shen J, Cox C, Wakefield JC, Ehm MG, Nelson MR,
et al. HIBAG-HLA genotype imputation with attribute bagging.
Pharmacogenomics J 2014;14:192–200.
22. Brisbin A, Bryc K, Byrnes J, Zakharia F, Omberg L, Degenhardt J,
et al. PCAdmix: principal components-based assignment of ancestry
along each chromosome in individuals with admixed ancestry from
two or more populations. Hum Biology 2012;84:343–64.
23. International HapMap Consortium. The International HapMap
Project. Nature 2003;426:789–96.
24. Altshuler DM, Gibbs RA, Peltonen L, Dermitzakis E, Schaffner
SF, Yu F, et al. Integrating common and rare genetic variation
in diverse human populations. Nature 2010;467:52–8.
25. Browning SR, Browning BL. Rapid and accurate haplotype
phasing and missing-data inference for whole-genome associa-
tion studies by use of localized haplotype clustering. Am J Hum
Genet 2007;81:1084–97.
26. Kidd JM, Gravel S, Byrnes J, Moreno-Estrada A, Musharoff S,
Bryc K, et al. Population genetic inference from personal
genome data: impact of ancestry and admixture on human geno-
mic variation. Am J Hum Genet 2012;91:660–71.
27. SIGMA Type 2 Diabetes Consortium. Sequence variants in
SLC16A11 are a common risk factor for type 2 diabetes in Mexi-
co. Nature 2014;506:97–101.
28. Kim JH, Jung SH, Bae JS, Lee HS, Yim SH, Park SY, et al.
Deletion variants of RABGAP1L, 10q21.3, and C4 are associat-
ed with the risk of systemic lupus erythematosus in Korean
women. Arthritis Rheum 2013;65:1055–63.
29. Lessard CJ, Sajuthi S, Zhao J, Kim K, Ice JA, Li H, et al. Identification
of a systemic lupus erythematosus risk locus spanning
ATG16L2, FCHSD2, and P2RY2 in Koreans. Arthritis Rheumatol
2015. E-pub ahead of print.
30. Zhu JY, Pfuhl T, Motsch N, Barth S, Nicholls J, Grasser F,
et al. Identification of novel Epstein-Barr virus microRNA genes
from nasopharyngeal carcinomas. J Virol 2009;83:3333–41.
31. Stranger BE, Montgomery SB, Dimas AS, Parts L, Stegle O,
Ingle CE, et al. Patterns of cis regulatory variation in diverse
human populations. PLoS Genet 2012;8:e1002639.
32. Fernando MM, Stevens CR, Sabeti PC, Walsh EC, McWhinnie
AJ, Shah A, et al. Identification of two independent risk factors
for lupus within the MHC in United Kingdom families. PLoS
Genet 2007;3:e192.
33. Zuniga J, Yu N, Barquera R, Alosco S, Ohashi M, Lebedeva T,
et al. HLA class I and class II conserved extended haplotypes and
their fragments or blocks in Mexicans: implications for the study of
genetic diversity in admixed populations. PLoS One 2013;8:e74442.
34. Crow YJ. Type I interferonopathies: a novel set of inborn errors
of immunity. Ann N Y Acad Sci 2011;1238:91–8.
35. Shen N, Fu Q, Deng Y, Qian X, Zhao J, Kaufman KM, et al.
Sex-specific association of X-linked Toll-like receptor 7 (TLR7)
with male systemic lupus erythematosus. Proc Natl Acad Sci
U S A 2010;107:15838–43.
36. Deng Y, Zhao J, Sakurai D, Kaufman KM, Edberg JC, Kimberly
RP, et al. MicroRNA-3148 modulates allelic expression of Toll-
like receptor 7 variant associated with systemic lupus erythema-
tosus. PLoS Genet 2013;9:e1003336.
37. Reddy MV, Velazquez-Cruz R, Baca V, Lima G, Granados J,
Orozco L, et al. Genetic association of IRF5 with SLE in Mexi-
cans: higher frequency of the risk haplotype and its homozygoz-
ity than Europeans. Hum Genet 2007;121:721–7.
38. Graham RR, Kozyrev SV, Baechler EC, Reddy MV, Plenge
RM, Bauer JW, et al. A common haplotype of interferon regula-
tory factor 5 (IRF5) regulates splicing and expression and is
associated with increased risk of systemic lupus erythematosus.
Nat Genet 2006;38:550–5.
39. Niewold TB, Kelly JA, Kariuki SN, Franek BS, Kumar AA,
Kaufman KM, et al. IRF5 haplotypes demonstrate diverse serological
associations which predict serum interferon α activity and
explain the majority of the genetic association with systemic
lupus erythematosus. Ann Rheum Dis 2012;71:463–8.
40. Kottyan LC, Zoller EE, Bene J, Lu X, Kelly JA, Rupert AM,
et al. The IRF5-TNPO3 association with systemic lupus erythe-
matosus has two components that other autoimmune disorders
variably share. Hum Mol Genet 2015;24:582–96.
41. Zhao M, Wu X, Zhang Q, Luo S, Liang G, Su Y, et al. RFX1
regulates CD70 and CD11a expression in lupus T cells by
recruiting the histone methyltransferase SUV39H1. Arthritis Res
Ther 2010;12:R227.