Raymond K. Walters’s research while affiliated with Massachusetts General Hospital and other places


Publications (90)


Schematic of overall analytic plan
Displays the outline of analyses performed in the study as well as number of phenotypes and participants at each step.
Representation of item types across factors
Horizontal bars represent proportion variance explained in a given factor score by each of 6 major data types in UKB, estimated using hierarchical partitioning. To the left, factors are numbered in order of variance extraction in the exploratory factor analysis.
Comparison of EFA to PCA
a) The expected absolute correlations across the 36 EFA factors and principal components. b) For each of the 36 EFA factors, the proportion of variance explained by all 36 PCs. c and d) Per-item scatterplots of scoring coefficients for factors vs. PCs across thematically similar pairs, demonstrating sparser loadings amongst the factor scoring coefficients vs. the PC scoring coefficients.
Phecode associations by factor
Box-and-whisker plots are shown for associations with 403 derived medical phecodes grouped by category. These associations are defined as the test statistics (that is, z scores from estimated regression coefficients and Huber-White robust standard errors) for the factor score in a logistic regression model including our standard covariates (that is, first 20 genetic PCs, age, chromosomal sex, age², age-x-chromosomal sex, age²-x-chromosomal sex, and dummy variables representing the assessment centers of origin). Boxes represent the middle quartiles of a factor’s test statistics across phecodes within a category, with whiskers extending to 1.5x the interquartile range. Median values per category are indicated by individual black lines inside the boxes. The dotted grey lines represent the critical test statistics for significance at two-sided p < 0.05 after correcting for multiple comparisons across all 403 phecodes.
Biomarker associations by factor
Phenotypic associations between factors and 28 biomarkers assayed in UKB. Colors represent the magnitude and direction of correlation, and asterisks (*) indicate which associations remain significant in ordinary least squares regression with Huber-White robust standard errors after correction for multiple testing (that is, two-sided p < 0.05 / (28 biomarkers x 35 factors)).


Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation
Article · Full-text available · July 2024 · 79 Reads · 1 Citation

Nature Human Behaviour

Caitlin E. Carey · Rebecca Shafee · Robbee Wedow · [...]

Data within biobanks capture broad yet detailed indices of human variation, but biobank-wide insights can be difficult to extract due to complexity and scale. Here, using large-scale factor analysis, we distill hundreds of variables (diagnoses, assessments and survey items) into 35 latent constructs, using data from unrelated individuals with predominantly estimated European genetic ancestry in UK Biobank. These factors recapitulate known disease classifications, disentangle elements of socioeconomic status, highlight the relevance of psychiatric constructs to health and improve measurement of pro-health behaviours. We go on to demonstrate the power of this approach to clarify genetic signal, enhance discovery and identify associations between underlying phenotypic structure and health outcomes. In building a deeper understanding of ways in which constructs such as socioeconomic status, trauma, or physical activity are structured in the dataset, we emphasize the importance of considering the interwoven nature of the human phenome when evaluating public health patterns.


Fig. 4 | GISMO and GISMO-mis capture common variant heritability and rare variant association signals. (A) Mean enrichment of per-SNV partitioned heritability of XX traits explained by common variants within 100-kb of genes for GISMO and GISMO-mis deciles. (B) Mean neurodevelopmental disorder association across GISMO and GISMO-mis deciles. The association reflects P-values from NDD rare variant associations that were converted to absolute Z-scores. (C) The percentage of genes present in each decile of GISMO and GISMO-mis for 1,183 recessive disorder genes aggregated from OMIM.
The landscape of gene loss and missense variation across the mammalian tree informs on gene essentiality

May 2024 · 51 Reads · 1 Citation

Background: The degree of gene and sequence preservation across species provides valuable insights into the relative necessity of genes from the perspective of natural selection. Here, we developed novel interspecies metrics across 462 mammalian species, GISMO (Gene identity score of mammalian orthologs) and GISMO-mis (GISMO-missense), to quantify gene loss traversing millions of years of evolution. GISMO is a measure of gene loss across mammals weighted by evolutionary distance relative to humans, whereas GISMO-mis quantifies the ratio of missense to synonymous variants across mammalian species for a given gene. Rationale: Despite large sample sizes, current human constraint metrics are still not well calibrated for short genes. Traversing over 100 million years of evolution across hundreds of mammals can identify the most essential genes and improve gene-disease association. Beyond human genetics, these metrics provide measures of gene constraint to further enable mammalian genetics research. Results: Our analyses showed that both metrics are strongly correlated with measures of human gene constraint for loss-of-function, missense, and copy number dosage derived from upwards of a million human samples, which highlights the power of interspecies constraint. Importantly, neither GISMO nor GISMO-mis is strongly correlated with coding sequence length. Therefore, both metrics can identify novel constrained genes that were too small for existing human constraint metrics to capture. We also found that GISMO scores capture rare variant association signals across a range of phenotypes associated with decreased fecundity, such as schizophrenia, autism, and neurodevelopmental disorders. Moreover, common variant heritability of disease traits is highly enriched in the most constrained deciles of both metrics, further underscoring the biological relevance of these metrics in identifying functionally important genes. We further showed that both scores have the lowest duplication and deletion rate in the most constrained deciles for copy number variants in the UK Biobank, suggesting that they may be important metrics for dosage sensitivity. We additionally demonstrate that GISMO can improve prioritization of recessive disorder genes and captures homozygous selection. Conclusions: Overall, we demonstrate that the most constrained genes for gene loss and missense variation capture the largest fraction of heritability, and that GISMO can help prioritize recessive disorder genes and identify the most conserved genes across the mammalian tree.


Pan-UK Biobank GWAS improves discovery, analysis of genetic architecture, and resolution into ancestry-enriched effects

March 2024 · 182 Reads · 35 Citations

Large biobanks, such as the UK Biobank (UKB), enable massive phenome by genome-wide association studies that elucidate genetic etiology of complex traits. However, individuals from diverse genetic ancestry groups are often excluded from association analyses due to concerns about population structure introducing false positive associations. Here, we generate mixed model associations and meta-analyses across genetic ancestry groups, inclusive of a larger fraction of the UKB than previous efforts, to produce freely available summary statistics for 7,271 traits. We build a quality control and analysis framework informed by genetic architecture. Overall, we identify 14,676 significant loci in the meta-analysis that were not found in the European genetic ancestry group alone, including novel associations, for example, between CAMK2D and triglycerides. We also highlight associations from ancestry-enriched variation, including a known pleiotropic missense variant in G6PD associated with several biomarker traits. We release these results publicly alongside FAQs that describe caveats for interpretation of results, enhancing available resources for interpretation of risk variants across diverse populations.


Genome-wide association results for GDM
a, Manhattan plot of GWAS of GDM in 12,332 cases and 131,109 parous female controls of Finnish ancestry with REGENIE 2.2.4. The x axis reflects chromosomal positions, and the y axis reflects −log10(P) values for the two-tailed association test for each variant, presented on a log scale. Red dotted line indicates the significance threshold (P = 5 × 10⁻⁸). Colored SNPs represent the credible set members for the 13 genome-wide significant loci, with blue indicating loci previously associated with GDM and orange indicating new associations. Labels indicate the gene nearest to the fine-mapped lead SNP. b, The genetic correlation (SNP-rg) between GDM and 53 other diseases, traits or biomarkers was computed using LD score regression. We plot the SNP-rg with confidence intervals for all traits that were significant after Bonferroni correction for two-sided tests of 53 traits (P < 9.4 × 10⁻⁴). Results for all tested traits are reported in Supplementary Tables 19 and 20. Colors indicate phenotype category.
Classification of the genetic effects of SNPs in GDM and T2D
Comparison of log odds ratios in GWAS of GDM (x axis) and T2D in males (y axis) for top-associated SNPs from GDM (13 SNPs) and T2D (15 SNPs). The following two distinct classes of SNP effects were identified by a Bayesian classifier in shared variants analysis: class T (blue) containing SNPs with T2D-predominant genetic effects and class G (red) with GDM-predominant effects (Supplementary Table 27). Gray SNPs were not confidently assigned to either class (that is, neither class reached a posterior probability >95%). Dotted ellipses indicate the 95% probability regions of the fitted bivariate effect size distributions for each class.
Cell-type specificity analysis of GDM and T2D highlights different cell associations
Cell-type specificity analysis was performed for GDM (a) and for prior meta-analysis of T2D (b) from ref. ¹⁸ using high-quality mouse single-cell RNA-seq datasets with FUMA v1.3.4 (Supplementary Tables 32–35). Unadjusted P values are reported for the two-sided association test between relative gene expression in the given cell type and multimarker analysis of genomic annotation (MAGMA) gene-level associations in the GWAS. Results are shown for cell types that both are significantly associated with at least one GWAS after correction for multiple testing of all cell types in all datasets (a) and have putatively independent association conditional on other cell types in the same RNA-seq dataset (b). Colors indicate the RNA-seq dataset source and significance.
Distinct and shared genetic architectures of gestational diabetes mellitus and type 2 diabetes

January 2024 · 51 Reads · 24 Citations

Nature Genetics

Gestational diabetes mellitus (GDM) is a common metabolic disorder affecting more than 16 million pregnancies annually worldwide¹,². GDM is related to an increased lifetime risk of type 2 diabetes (T2D)¹⁻³, with over a third of women developing T2D within 15 years of their GDM diagnosis. The diseases are hypothesized to share a genetic predisposition¹⁻⁷, but few studies have sought to uncover the genetic underpinnings of GDM. Most studies have evaluated the impact of T2D loci only⁸⁻¹⁰, and the three prior genome-wide association studies of GDM¹¹⁻¹³ have identified only five loci, limiting the power to assess to what extent variants or biological pathways are specific to GDM. We conducted the largest genome-wide association study of GDM to date in 12,332 cases and 131,109 parous female controls in the FinnGen study and identified 13 GDM-associated loci, including nine new loci. Genetic features distinct from T2D were identified both at the locus and genomic scale. Our results suggest that the genetics of GDM risk falls into the following two distinct categories: one part conventional T2D polygenic risk and one part predominantly influencing mechanisms disrupted in pregnancy. Loci with GDM-predominant effects map to genes related to islet cells, central glucose homeostasis, steroidogenesis and placental expression.



Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci

June 2023 · 120 Reads · 14 Citations

Nature Human Behaviour

Response to survey questionnaires is vital for social and behavioural research, and most analyses assume full and accurate response by participants. However, nonresponse is common and impedes proper interpretation and generalizability of results. We examined item nonresponse behaviour across 109 questionnaire items in the UK Biobank (N = 360,628). Phenotypic factor scores for two participant-selected nonresponse answers, ‘Prefer not to answer’ (PNA) and ‘I don’t know’ (IDK), each predicted participant nonresponse in follow-up surveys (incremental pseudo-R² = 0.056), even when controlling for education and self-reported health (incremental pseudo-R² = 0.046). After performing genome-wide association studies of our factors, PNA and IDK were highly genetically correlated with one another (rg = 0.73 (s.e. = 0.03)) and with education (rg,PNA = −0.51 (s.e. = 0.03); rg,IDK = −0.38 (s.e. = 0.02)), health (rg,PNA = 0.51 (s.e. = 0.03); rg,IDK = 0.49 (s.e. = 0.02)) and income (rg,PNA = –0.57 (s.e. = 0.04); rg,IDK = −0.46 (s.e. = 0.02)), with additional unique genetic associations observed for both PNA and IDK (P < 5 × 10⁻⁸). We discuss how these associations may bias studies of traits correlated with item nonresponse and demonstrate how this bias may substantially affect genome-wide association studies. While the UK Biobank data are deidentified, we further protected participant privacy by avoiding exploring non-response behaviour to single questions, assuring that no information can be used to associate results with any particular respondents.


Fig. 4 | Correlations among exemplary DEPICT gene sets. (a,b) There were 68 clusters available for SmkInit (a) and 10 for DrnkWk (b) (CigDay, AgeSmk, and SmkCes did not have > 1 exemplary set). Purple shading represents negative correlations, and red shading represents positive correlations, with increasing color intensity reflecting increasing correlation strength. Cluster names are truncated for space, with a full list of all names in Supplementary Table 18. The number after each name is the number of gene sets in each cluster. The matrix naturally falls into three red superclusters along the diagonal. The largest supercluster contains primarily gene sets related to neurotransmitter receptors, ion channels (sodium, potassium, calcium), learning/memory, and other aspects of central nervous system function. The middle supercluster includes gene sets defined by regulation of transcription and translation, including RNA binding and transcription factor activity. The final supercluster is composed primarily of gene sets related to development of the nervous system.
Non-synonymous sentinel variants
Multivariate genome-wide association meta-analysis of over 1 million subjects identifies loci underlying multiple substance use disorders

March 2023 · 314 Reads · 111 Citations

Nature Mental Health

Genetic liability to substance use disorders can be parsed into loci that confer general or substance-specific addiction risk. We report a multivariate genome-wide association meta-analysis that disaggregates general and substance-specific loci for published summary statistics of problematic alcohol use, problematic tobacco use, cannabis use disorder, and opioid use disorder in a sample of 1,025,550 individuals of European descent and 92,630 individuals of African descent. Nineteen independent SNPs were genome-wide significant (P < 5e-8) for the general addiction risk factor (addiction-rf), which showed high polygenicity. Across ancestries, PDE4B was significant (among other genes), suggesting dopamine regulation as a cross-substance vulnerability. An addiction-rf polygenic risk score was associated with substance use disorders, psychopathologies, somatic conditions, and environments associated with the onset of addictions. Substance-specific loci (9 for alcohol, 32 for tobacco, 5 for cannabis, 1 for opioids) included metabolic and receptor genes. These findings provide insight into genetic risk loci for substance use disorders that could be leveraged as treatment targets.


Figure 1: Genome-wide association results for GDM. (A) Manhattan plot of GWAS of GDM in 12,332 cases and 131,109 parous female controls of Finnish ancestry. The x-axis reflects chromosomal positions and the y-axis reflects −log10(P) values for the two-tailed association test for each variant, presented on a log scale. Red dotted line indicates the
Figures A.
Distinct and shared genetic architectures of Gestational diabetes mellitus and Type 2 Diabetes Mellitus

February 2023 · 40 Reads · 3 Citations

Gestational diabetes mellitus (GDM) affects more than 16 million pregnancies annually worldwide and is related to an increased lifetime risk of Type 2 diabetes (T2D). The diseases are hypothesized to share a genetic predisposition, but there are few GWAS of GDM, and none is sufficiently powered to assess whether any variants or biological pathways are specific to GDM. We conducted the largest genome-wide association study of GDM to date in 12,332 cases and 131,109 parous female controls in the FinnGen Study and identified 13 GDM-associated loci including 8 novel loci. Genetic features distinct from T2D were identified both at the locus and genomic scale. Our results suggest that the genetics of GDM risk falls into two distinct categories: one part conventional T2D polygenic risk and one part predominantly influencing mechanisms disrupted in pregnancy. Loci with GDM-predominant effects map to genes related to islet cells, central glucose homeostasis, steroidogenesis, and placental expression. These results pave the way for an improved biological understanding of GDM pathophysiology and its role in the development and course of T2D.


Figure 3. Phenotypic and genetic associations of E-factors with psychiatric disorders. (a) Phenotypic associations of the E-factors with the six psychiatric disorders tested using logistic regression. Standardised effect sizes (beta coefficient) along with standard errors are shown; (b) Genetic correlations (rg) of the E-factors with the six psychiatric disorders estimated using bivariate LD score regression; (c) Associations of polygenic scores for the six psychiatric disorders with the E-factors analyzed only in the controls (N = 12,487). Beta coefficients and standard errors are shown. Star symbols indicate statistical significance after multiple testing corrections (P < 0.002). SCZ schizophrenia, BD bipolar disorder, MDD major depressive disorder, ADHD attention deficit hyperactivity disorder, ASD autism spectrum disorder, AN anorexia nervosa.
Figure 4. Phenotypic and genetic associations of math and language grades with psychiatric disorders. (a) Phenotypic associations of math and language grades with the six disorders tested using multiple logistic regression; standardized effect sizes (beta) and standard errors are shown; (b) Associations of polygenic scores for the six psychiatric disorders with math and language grades analyzed only in the controls (N = 12,487); standardized effect sizes (beta) and standard errors are shown. Star symbols indicate statistical significance after multiple testing corrections (P < 0.004). SCZ schizophrenia, BD bipolar disorder, MDD major depressive disorder, ADHD attention deficit hyperactivity disorder, ASD autism spectrum disorder, AN anorexia nervosa.
Figure 5. Replication analysis in TEDS. (a) Pearson correlations of E-factors with subject-specific school grades; (b) Genetic correlations of TEDS E-factors with iPSYCH E-factors estimated using bivariate LD score regression; (c) Associations of polygenic scores for E1, E2, E3 and E4 (based on iPSYCH SNP weights) with E1 and E2 in TEDS tested using linear regression; standardized effect sizes (beta) and standard errors are shown; (d) Associations of polygenic scores for the six psychiatric disorders with E1 and E2 in TEDS; standardized effect sizes (beta) and standard errors are shown. SCZ schizophrenia, BD bipolar disorder, MDD major depressive disorder, ADHD attention deficit hyperactivity disorder, ASD autism spectrum disorder, AN anorexia nervosa. Star symbols indicate statistical significance after multiple testing corrections (P < 0.006 for panel c and P < 0.004 for d).
Figure 6. Genetic associations of E-factors with creativity. (a) Associations of polygenic scores for E-factors with creative professions ('arts, design, entertainment, sports and media' vs rest) in MVP tested using logistic regression; odds ratio and 95% confidence intervals are shown; (b) Associations of polygenic scores for E-factors with all 24 occupation categories in the MVP tested using logistic regression; negative log10 P values are shown; the dotted line indicates the statistical significance threshold after multiple testing correction (P = 0.002).
Sample characteristics. Values are frequency (proportion) for categorical variables or mean (SD) for continuous variables. (a) Exam age is the age of the individuals when they sat for the exit exam. (b) The age at first diagnosis was calculated based on the first time the diagnosis was recorded in the register. (c) The date of first diagnosis and the date of exit exam were compared to identify whether the individual had received the diagnosis before they sat for the exam. (d) The mean value was calculated across all the grades: Danish written, oral and grammar; English oral; and mathematics written and oral (if the exam was sat on or before 2006) or problem solving (if the exam was sat after 2006). The lowest and highest mean values observed were −1.7 and 12.
Genome-wide association study of school grades identifies genetic overlap between language ability, psychopathology and creativity

January 2023 · 284 Reads · 20 Citations

Cognitive functions of individuals with psychiatric disorders differ from those of the general population. Such cognitive differences often manifest early in life as differential school performance and have a strong genetic basis. Here we measured genetic predictors of school performance in 30,982 individuals in English, Danish and mathematics via a genome-wide association study (GWAS) and studied their relationship with risk for six major psychiatric disorders. When decomposing the school performance into math and language-specific performances, we observed phenotypically and genetically a strong negative correlation between math performance and risk for most psychiatric disorders. But language performance correlated positively with risk for certain disorders, especially schizophrenia, which we replicate in an independent sample (n = 4547). We also found that the genetic variants relating to increased risk for schizophrenia and better language performance are overrepresented in individuals involved in creative professions (n = 2953) compared to the general population (n = 164,622). The findings together suggest that language ability, creativity and psychopathology might stem from overlapping genetic roots.


A second update on mapping the human genetic architecture of COVID-19

December 2022 · 1,399 Reads · 4 Citations

Investigating the role of host genetic factors in COVID-19 severity and susceptibility can inform our understanding of the underlying biological mechanisms that influence adverse outcomes and drug development. Here we present a second updated genome-wide association study (GWAS) on COVID-19 severity and infection susceptibility to SARS-CoV-2 from the COVID-19 Host Genetic Initiative (data release 7). We performed a meta-analysis of up to 219,692 cases and over 3 million controls, identifying 51 distinct genome-wide significant loci—adding 28 loci from the previous data release. The increased number of candidate genes at the identified loci helped to map three major biological pathways involved in susceptibility and severity: viral entry, airway defense in mucus, and type I interferon.


Citations (66)


... We uncovered associations between change in gene expression and pain intensity in several genes previously associated with chronic pain in GWAS. BDNF-AS [18], GFPT1 [46] and WWP2 [47] correlated positively, while PLCG2 [48], BTN2A2 [49] and NUMB [15,19] correlated negatively. The correlations were strictly sex-specific, and all but one (WWP2) were only observed in men. ...

Reference:

Gene Expression Correlates with Disability and Pain Intensity in Patients with Chronic Low Back Pain and Modic Changes in a Sex-Specific Manner
Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation

Nature Human Behaviour

... We additionally benchmarked the CNVs in saliva and blood against two independent constrained gene sets (top 1000) based on LOEUF 12 from gnomAD v2 and GISMO-mis. 13 Imputation To impute the low-coverage whole genome sequencing data, we used the Genotype Likelihoods IMputation and Phasing (GLIMPSE2) method. 14 We used the HGDP+1kGP reference panel 16 after filtering out singleton variants and indels, resulting in about over 67 million variants available to be imputed. ...

The landscape of gene loss and missense variation across the mammalian tree informs on gene essentiality

... Complete data was retained for 63,058 (DPW), 63,018 (height), 62,955 (BMI), 62,030 (BF%), and 20,578 (MDDsx) subjects. Four independent groups of non-European ancestry participants, as defined by panUKB [24], were used as holdout sets for replication. Four data filtering stages were used in this work. ...

Pan-UK Biobank GWAS improves discovery, analysis of genetic architecture, and resolution into ancestry-enriched effects

... To evaluate the influence of sex on mvPuberty-associated loci, we compared the effects of loci in the Tanner stage 7 , a measure of sexual maturation, across sex via SCOUTJOY, addressing heterogeneity detection while allowing both sample overlap and estimation error in comparison GWASs 64 . We tested each of the lead SNPs from mvPuberty for differences in Tanner stage effect sizes between males and females via the values of effect sizes (beta) and standard errors (se) in the Tanner stage GWAS, including data for 3769 boys and 6147 girls, and we tested whether the overall trend in effect sizes for the lead SNPs differed when Tanner in males versus females was compared via SCOUTJOY. ...

Distinct and shared genetic architectures of gestational diabetes mellitus and type 2 diabetes

Nature Genetics

... Many studies have shown that this "healthy volunteer bias" distorts the associations among phenotypes [104][105][106], and with genetic variants [107] that are associated with self-selection. Notably, several genetic variants that are associated with self-selection are also associated with psychiatric disorders [108][109][110][111]. Unless adequately mitigated through statistical approaches [104][112][113][114][115] or validated through experimental means [112], genetic findings from volunteer samples may compound biases [104]. ...

Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci

Nature Human Behaviour

... Of specific interest in this article is joint modeling of AUD and CUD, two widely prevalent SUDs. They have some shared etiology and risk factors (Crane et al., 2021; Hatoum et al., 2023). Moreover, from a clinical perspective, it is important to identify substance users at high risk of either disorder. In addition, the risk factors for the two SUDs and their effects are likely to differ across three groups of users: those who use only alcohol (A), both alcohol and cannabis (B), and only cannabis (C). ...

Multivariate genome-wide association meta-analysis of over 1 million subjects identifies loci underlying multiple substance use disorders

Nature Mental Health

... Several studies have indicated that the genetics of GDM risk can be classified into two groups: one is T2DM risk, and the other is the main factor specific to GDM. The GDM-specific mechanism can be associated with islet cells, central glucose homeostasis, steroidogenesis, and placental expression [54]. It suggests that GDM and T2DM seek differences in gene mechanisms. ...

Distinct and shared genetic architectures of Gestational diabetes mellitus and Type 2 Diabetes Mellitus

... Our work further expands findings on the overlap in genetic correlations between psychiatric disorders, school performance and creativity 56 ; replicates and further extends the link between creativity and neuropsychiatric diagnoses 9,14,15,21,22 ; and identifies new associations between neuropsychiatric traits and broad employment categories. Importantly, at the individual level, our work shows that utilizing PGS of neuropsychiatric traits towards informing or predicting membership in a professional category does not hold much potential since PGS explain a very small part of the variance. ...

Genome-wide association study of school grades identifies genetic overlap between language ability, psychopathology and creativity

... We also re-estimated and leveraged the genetic correlation between the traits to amplify statistical power to discover IPF associations by performing multi-trait meta-analysis (MTAG) 24 . For these analyses we used the IPF meta-analysis statistics produced in this study and the latest meta-analysis summaries (Freeze 7) from the COVID-19 Host Genetics Initiative (HGI), for the Hospitalised covid vs. population phenotype (B2_ALL_leave_23andme), as this has produced the best balance between a carefully curated phenotype for severity and statistical power 15,25 (Supplementary Table 15). ...

A second update on mapping the human genetic architecture of COVID-19

... Additionally, we performed multi-ancestry meta-analysis (MAMA) 30 , a GWAS meta-analysis method, which models differences in effect sizes, allele frequencies and LD patterns between populations and provides population-specific meta-analysis results. Using MAMA, we identified 94 independent genome-wide significant SNPs with EAS-specific meta-analysis (Fig. 1c and Supplementary Table 13), 2 of which were previously unreported for EduYears (rs2881903 and rs16930687); they were located beyond ±500 kb of the lead SNPs reported in previous EduYears GWAS (Supplementary Figs. 10 and 11) 10,14 . ...

Multi-Ancestry Meta-Analysis yields novel genetic discoveries and ancestry-specific associations