SimHap GUI: An intuitive graphical user interface for genetic association analysis

January 2009
BMC Bioinformatics 9(1):557

DOI:10.1186/1471-2105-9-557

Source
PubMed

License
CC BY 2.0

Authors:

Kim Warwick Carter

University of Western Australia

Pamela McCaskie

Curtin University

Lyle John Palmer

University of Adelaide

BioMed Central

BMC Bioinformatics

Open Access

Software

SimHap GUI: An intuitive graphical user interface for genetic association analysis

Kim W Carter*1,4, Pamela A McCaskie2,3 and Lyle J Palmer3

Address: 1Western Australian Institute for Medical Research and UWA Centre for Medical Research, University of Western Australia, Perth, Australia, 2School of Mathematics and Statistics, University of Western Australia, Perth, Australia, 3Centre for Genetic Epidemiology and Biostatistics, University of Western Australia, Perth, Australia and 4Telethon Institute for Child Health Research, UWA Centre for Child Health Research, University of Western Australia, 100 Roberts Rd, Subiaco, Western Australia 6008, Australia

Email: Kim W Carter* - kcarter@ichr.uwa.edu.au; Pamela A McCaskie - pmccask@cyllene.uwa.edu.au; Lyle J Palmer - lyle@cyllene.uwa.edu.au

* Corresponding author

Abstract

Background: Researchers wishing to conduct genetic association analysis involving single nucleotide polymorphisms (SNPs) or haplotypes are often confronted with a lack of user-friendly graphical analysis tools, and so need sophisticated statistical and informatics expertise to perform relatively straightforward tasks. Tools such as the SimHap package for the R statistics language provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a graphical user interface that allows anyone but a professional statistician to effectively utilise the tool.

Results: We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI features a novel workflow interface that guides the user through each logical step of the analysis process, making it accessible to both novice and advanced users. This tool provides a seamless interface to the SimHap R package, while providing enhanced functionality such as sophisticated data checking, automated data conversion, and real-time estimations of haplotype simulation progress.

Conclusion: SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a range of genetic and non-genetic association analyses. This provides a free alternative to commercial statistics packages that is specifically designed for genetic association analysis.

Background

While the growth in the volume of genetic data available has led to many new discoveries, it is becoming increasingly important to find ways to easily analyse large volumes of data. This is certainly the case with genetic association studies, where high-throughput genotyping technologies have brought about the potential for hundreds of thousands of data points per individual subject [1].

A graphical user interface (GUI) is still a rare feature amongst currently available genetic analysis packages, particularly those used to analyse single nucleotide polymorphisms (SNPs) or haplotypes. A well-designed user interface would allow users without a comprehensive knowledge of statistical modelling or command line operation to perform complex analyses.

Published: 25 December 2008

BMC Bioinformatics 2008, 9:557 doi:10.1186/1471-2105-9-557

Received: 29 September 2008

Accepted: 25 December 2008

This article is available from: http://www.biomedcentral.com/1471-2105/9/557

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Commercially available statistics software packages, such as SPSS (SPSS Inc., 2008) and Stata (StataCorp. 2008), may be useful, but are not specifically designed to analyse genetic data and require sophisticated prior knowledge from the end-user. Another major annoyance is the lack of integration between statistical and analytical packages [2], often with one program required for epidemiological analysis, a separate program for SNP analysis, and a third used for haplotype analysis.

SimHap [3] is a statistical analysis package for genetic association testing, available in R [4], which, amongst other features, infers haplotypes for unrelated individuals with unknown phase. Although various programs currently exist for haplotype analysis, SimHap is unique in a number of ways. It uses a multiple-imputation (MI) based approach to test for association, which incorporates information about uncertainty around inferred haplotypes. This approach uses current expectation maximisation (EM) methods for the estimation of haplotype frequencies from unphased genotype data [5]. To utilise the posterior distribution of diplotype (a haplotype pair) probabilities, the MI approach of Rubin [6] was implemented, where a series of "complete" data sets is generated containing all data from the original set as well as additional dummy variables for each haplotype, the values of which indicate the number of copies of that haplotype observed in an individual's diplotype (0, 1 or 2). For individuals with known phase (only one diplotype), the values for these haplotype variables remain constant for each of the generated data sets. For individuals with ambiguous phase, their haplotype values will be sampled from their predictive distribution, containing only those diplotypes consistent with their genotypes. This is a novel approach that provides an empirical distribution of the haplotypic effects and their significance levels.
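
The generation of "complete" data sets described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SimHap implementation: the function name `impute_haplotype_counts`, the input layout and the example probabilities are hypothetical, and the posterior diplotype probabilities are assumed to have been estimated beforehand (for example by EM).

```python
import random
from collections import Counter

def impute_haplotype_counts(diplotype_posteriors, haplotypes, n_imputations, seed=0):
    """Generate n_imputations data sets of haplotype copy counts (0, 1 or 2).

    diplotype_posteriors maps an individual id to a list of
    ((hap_a, hap_b), probability) pairs, one per diplotype that is
    consistent with that individual's genotypes (hypothetical layout).
    """
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_imputations):
        dataset = {}
        for ind, candidates in diplotype_posteriors.items():
            diplotypes = [d for d, _ in candidates]
            weights = [p for _, p in candidates]
            # With known phase there is a single candidate, so the
            # same diplotype is drawn in every imputed data set.
            chosen = rng.choices(diplotypes, weights=weights, k=1)[0]
            copies = Counter(chosen)
            # One dummy variable per haplotype: copies carried (0, 1 or 2).
            dataset[ind] = {h: copies.get(h, 0) for h in haplotypes}
        datasets.append(dataset)
    return datasets

# Made-up example: id1 has known phase, id2 is a double heterozygote.
posteriors = {
    "id1": [(("AC", "AC"), 1.0)],
    "id2": [(("AC", "GT"), 0.7), (("AT", "GC"), 0.3)],
}
haps = ["AC", "AT", "GC", "GT"]
imputed = impute_haplotype_counts(posteriors, haps, n_imputations=5)
```

In this sketch each imputed data set would then be analysed with the same association model, so that the spread of the fitted haplotype effects across data sets reflects the phase uncertainty.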

We have developed SimHap GUI as an intuitive graphical tool for conducting genetic association analysis. At its core, SimHap GUI utilises the SimHap R package and the R statistical language. SimHap GUI is a novel custom-designed integrated tool for conducting epidemiological, single SNP and haplotype-based association analyses within a single application, and provides a free alternative to commercially available statistics packages.

Results and discussion

Implementation

SimHap GUI is written in Java (requires Java 1.5+) and will operate on platforms where Java is available. This tool has been successfully tested on Windows, Linux and MacOS operating systems. SimHap GUI requires an installation of the R statistics language (2.4.0+) and an installation of the SimHap R package. This tool runs optimally on a computer with a monitor resolution of 1024 × 768, at least 128 MB of RAM and a Pentium 4+ CPU. SimHap GUI has been successfully operated on datasets with thousands of individuals, hundreds of phenotype variables, and thousands of SNPs. SimHap GUI is generally only limited by the amount of system memory available to Java.

The SimHap GUI interface is written in Java Swing, and uses the Synthetica look-and-feel suite [7] to enhance the usability and functionality of the interface (compared with standard Swing interfaces). We have also utilised the Swing Worker [8] library, which provides a mechanism for updating the user interface while running long analytical tasks, such as performing thousands of haplotype simulations. Both Synthetica and Swing Worker are provided with the SimHap GUI installation.
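
The pattern of reporting progress from a long-running task without blocking the interface can be illustrated as follows. This is a Python analogue using a worker thread and a queue, not the Java Swing Worker API and not SimHap GUI's code; `run_with_progress`, `simulate_once` and `on_progress` are hypothetical names chosen for the sketch.

```python
import queue
import threading

def run_simulations(n_simulations, simulate_once, progress):
    """Worker: run each simulation and post the number completed so far."""
    for done in range(1, n_simulations + 1):
        simulate_once()
        progress.put(done)
    progress.put(None)  # sentinel: no more updates will follow

def run_with_progress(n_simulations, simulate_once, on_progress):
    """Run the simulations in a background thread and pass each
    progress update to on_progress(done, total) in the calling thread."""
    progress = queue.Queue()
    worker = threading.Thread(
        target=run_simulations,
        args=(n_simulations, simulate_once, progress),
    )
    worker.start()
    while True:
        done = progress.get()
        if done is None:
            break
        on_progress(done, n_simulations)
    worker.join()

# Stand-in workload; a real task would run one haplotype simulation.
updates = []
run_with_progress(
    5,
    lambda: sum(range(1000)),
    lambda done, total: updates.append(f"{done}/{total}"),
)
```

The calling thread plays the role of the user-interface thread here: it only consumes short progress messages, while the heavy computation stays in the worker.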

SimHap GUI is provided as a single cross-platform installer, using the IzPack [9] packaging system, which provides a simple standardised graphical installer tool that both technical and non-technical users will be comfortable with.

Graphical User Interface (GUI)

SimHap GUI allows the user to conduct association analysis of binary, quantitative, longitudinal and survival (right-censored) outcomes using phenotypic data, SNP genotype data and haplotype data in unrelated individuals.

One key feature of SimHap GUI is the workflow interface, which guides the user through each logical step of the analytical process. This workflow concept is central to providing an intuitive user interface accessible to both novice and advanced users.

The user initially selects a standard comma separated value (CSV) file containing phenotypic information for a set of individuals (one row of data per person), as can be obtained from most spreadsheet and statistics software. The user also selects a CSV file containing genotypes for a series of SNP markers for the same individuals (not required for non-genetic modelling), and selects the character(s) signifying missing data in the input files. SimHap GUI examines the input files to ensure correct formatting and completeness, and to confirm that individual identifiers correspond between the phenotype and genotype files. Genotype files are examined to ensure that only biallelic SNPs are input, and the user is given the option to remove multi-allelic markers. Once data checking is complete, the user can choose to perform epidemiological modelling (without genetic markers), single SNP association analysis, or haplotype association analysis. Users are guided through each of these analytical tasks in a straightforward series of steps, with a standardised model building screen central to each of the analysis types.
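
The kind of checks described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not SimHap GUI's own checking code: the function `check_inputs`, the identifier column name `ID`, the missing-data codes and the `A/G` genotype notation are all hypothetical choices made for the example.

```python
import csv
import io

def check_inputs(pheno_csv, geno_csv, id_col="ID", missing=("", "NA")):
    """Compare identifiers between two CSV texts and flag markers
    that show more than two alleles (i.e. are not biallelic)."""
    pheno = list(csv.DictReader(io.StringIO(pheno_csv)))
    geno = list(csv.DictReader(io.StringIO(geno_csv)))
    pheno_ids = {row[id_col] for row in pheno}
    geno_ids = {row[id_col] for row in geno}
    report = {
        "ids_only_in_phenotypes": sorted(pheno_ids - geno_ids),
        "ids_only_in_genotypes": sorted(geno_ids - pheno_ids),
        "multi_allelic_markers": [],
    }
    markers = [c for c in geno[0] if c != id_col] if geno else []
    for marker in markers:
        alleles = set()
        for row in geno:
            value = row[marker]
            if value in missing:
                continue  # skip missing genotypes
            alleles.update(value.split("/"))
        if len(alleles) > 2:
            report["multi_allelic_markers"].append(marker)
    return report

# Made-up input files for illustration.
pheno_text = "ID,sex,hdl\n1,M,1.2\n2,F,1.5\n3,F,NA\n"
geno_text = "ID,SNP_1,SNP_2\n1,A/G,C/C\n2,A/A,C/T\n4,G/G,C/G\n"
report = check_inputs(pheno_text, geno_text)
```

A report of this shape lets the caller decide what to do with each problem, for instance offering to drop the flagged markers before analysis.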

Figure 1 is an example of the model building screen for a single SNP analysis with a quantitative outcome using SimHap GUI. At the top of the screen, hdl (cholesterol) has been selected as the outcome of interest, with the outcome treated as normally distributed (Untransformed). Log base 10 and natural log transformations of the outcome are available for non-Normally distributed outcomes. In the MAIN EFFECTS section are the available and selected covariates for this model, namely sex, age, bmi and smoke. Covariates can also be added as squared or cubic terms, logged (base 10 or natural log), and as factors (for categorical terms). In the GENOTYPES section are the available and selected SNPs to be analysed in the model. SNP covariates are denoted with the S_ prefix, while the _add, _dom and _rec terms refer to analysing the SNP under an additive, dominant or recessive genetic model. SNPs can also be analysed under a codominant model by adding the SNP as a factor. In the INTERACTIONS section are the available and selected covariate terms to be analysed for statistical interactions; in this case, an interaction between sex and SNP_1 under a codominant model. Additional files 1 to 8 provide a graphical representation of each of the phases of analysis for an example single SNP analysis. The SimHap GUI software manual also provides a detailed description of the analysis process.
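
The additive, dominant and recessive terms mentioned above correspond to conventional numeric codings of a genotype, which can be sketched as follows. How SimHap codes these terms internally is not described here, so the function `code_genotype`, the `A/G` genotype notation and the choice to count copies of the minor allele are assumptions made for illustration.

```python
def code_genotype(genotype, minor_allele, model):
    """Return the numeric coding of one genotype under a genetic model.

    genotype: string such as "A/G" (hypothetical input format).
    model: "add" (additive), "dom" (dominant) or "rec" (recessive).
    """
    copies = genotype.split("/").count(minor_allele)
    if model == "add":  # additive: number of minor alleles (0, 1 or 2)
        return copies
    if model == "dom":  # dominant: at least one minor allele
        return int(copies >= 1)
    if model == "rec":  # recessive: two minor alleles
        return int(copies == 2)
    raise ValueError(f"unknown model: {model}")

genotypes = ["A/A", "A/G", "G/G"]
coded = {
    model: [code_genotype(g, "G", model) for g in genotypes]
    for model in ("add", "dom", "rec")
}
```

A codominant analysis needs no such coding: the three genotypes are kept as separate categories, which is why the SNP is entered as a factor.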

Case Studies

SimHap GUI and its earlier Beta 1 and Beta 2.1 releases have been used extensively in a range of recently published genetics projects.

In the area of cancer research, SimHap GUI has been used in studies such as Sak et al [10], to examine the association between polymorphisms in the XPC gene and bladder cancer susceptibility. Choudhury et al [11] also examined haplotypes of DNA repair proteins to find genetic variants that may modulate predisposition to bladder cancer.

Figure 1. Example SimHap GUI model building interface.

SimHap GUI has been used extensively in the field of cardiovascular disease genetics. Several studies have used this tool to examine SNP and haplotype effects of genes related to abdominal aortic aneurysm [12-14]. Studies by both Horne et al [15] and McCaskie et al [16] have used SimHap GUI to investigate the association between genetic variation in the cholesteryl ester transfer protein gene and cardiovascular disease. SimHap GUI has also been used to investigate SNP and haplotype associations with metabolic syndrome [17-20] and atherosclerosis [21-24] related outcomes.

In the area of genetic epidemiology related to the Mendelian Randomization (MR) technique, a number of groups have utilised SimHap GUI. Brunner and colleagues [25] used SimHap GUI to generate haplotypes for three tagging polymorphisms from the C-reactive protein (CRP) gene in a study of 5,274 men and women. Studies by Lawlor et al [26] and Kivimaki et al [27] similarly used this software for analysis of CRP mutations using MR.

Other diverse studies include the use of SimHap GUI to investigate genetic influences of the melanocortin 1 receptor with sensitivity to photochemotherapy [28], polymorphisms within the macrophage migration inhibitory factor with relation to acute lung injury in patients with sepsis [29], associations between cytokine polymorphisms and outcomes after renal transplantation [30], and genetic predictors for the development of microalbuminuria in children with type 1 diabetes [31].

The wide range of example publications described here highlights the value of SimHap GUI as an easy-to-use yet powerful interface for genetic association analysis by both novice and advanced users.

GUI versus R package

One of the critical distinctions to make with the SimHap GUI software is the difference between the SimHap R package, and the Java based interface described in this manuscript. The backend SimHap R package simply provides the statistical operations to conduct particular analytical tasks, with the onus on the user to have technical knowledge of the statistical methods being employed and expertise with the command line interface of the R language. Users who are not professional statisticians may be discouraged by the difficulty of operating under a command-line interface.

The SimHap GUI interface provides the functionality, accessibility and the guided analytical approach that cannot be found in the command line package. The user interface is designed around the premise of a workflow analysis model, which mimics the logical analytical processes required to conduct a particular statistical test. This user-friendly, intuitive interface has been designed to satisfy the needs of both the technical and non-technical statistical user, and does not require sophisticated informatics knowledge to operate. Using the novel model building interface, users can perform tasks ranging from simple univariate linear modelling, through to more sophisticated tasks such as multivariate modelling of longitudinal outcomes with gene:gene and gene:environment interactions. A standardised interface is provided for users to conduct epidemiological (no genetic factors), single SNP and haplotype association analyses.

Features of SimHap GUI that are not provided in the SimHap R package include: an intuitive GUI for model building and guiding the overall analysis process; sophisticated data checking of phenotype and genotype data; automatic conversion of data for single SNP and haplotype association analysis; automatic calculation of allele frequencies and genotype distribution; quantile-quantile plotting for Normality of quantitative traits; and real-time estimation of the haplotype imputation simulation progress. SimHap GUI implements all of the functions from the SimHap R package.
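
As an illustration of one of the listed features, allele frequencies and the genotype distribution for a single SNP can be computed from a column of genotypes as sketched below. The function `summarise_snp`, the `A/G` genotype notation and the missing-data codes are hypothetical; this is not SimHap GUI's own routine.

```python
from collections import Counter

def summarise_snp(genotypes, missing=("", "NA")):
    """Return (genotype counts, allele frequencies) for one SNP."""
    genotype_counts = Counter()
    allele_counts = Counter()
    for g in genotypes:
        if g in missing:
            continue  # missing genotypes do not contribute
        # Sort alleles so that "G/A" and "A/G" count as one genotype.
        alleles = sorted(g.split("/"))
        genotype_counts["/".join(alleles)] += 1
        allele_counts.update(alleles)
    total = sum(allele_counts.values())
    allele_freqs = {a: n / total for a, n in allele_counts.items()} if total else {}
    return dict(genotype_counts), allele_freqs

# Made-up genotype column for five individuals, one of them missing.
counts, freqs = summarise_snp(["A/A", "A/A", "G/A", "G/G", "NA"])
```

For this example the four observed genotypes carry eight alleles, five of them A, so the A allele frequency is 5/8.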

Conclusion

In summary, SimHap GUI provides a cross-platform, intuitive and integrated interface for conducting a range of genetic and non-genetic association analyses.

Availability and requirements

- Project name: SimHap GUI

- Project home page: http://www.genepi.org.au/simhap

- Operating system(s): Platform independent (tested on Windows, Linux and MacOS)

- Programming language: Java

- Other requirements: Java 1.5+; R 2.4.0+ (available from http://www.r-project.org/); SimHap R package from CRAN (available from http://cran.r-project.org/web/packages/SimHap/index.html)

- Licence: Free for non-commercial use

Authors' contributions

KWC designed and developed the Java GUI interface. PAM assisted with integration of statistical methods and aided with design of the GUI. LJP supervised the design and coordinated the development of the software.

Additional material

Acknowledgements

KWC was supported by the Australian Research Council Discovery Project DP0663247. This work was supported by the National Health and Medical Research Council of Australia Project Grant 404009.

References

1. Hirschhorn JN, Daly MJ: Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005, 6(2):95-108.

2. Excoffier L, Heckel G: Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet 2006, 7(10):745-758.

3. CRAN – SimHap package [http://cran.r-project.org/web/packages/SimHap/index.html]

4. Ihaka R, Gentleman R: R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics 1996, 5(3):299-314.

5. Chiano MN, Clayton DG: Fine genetic mapping using haplotype analysis and the missing data problem. Ann Hum Genet 1998, 62(Pt 1):55-60.

6. Rubin DB: Multiple Imputation After 18+ Years. Journal of the American Statistical Association 1996, 91(434):473-489.

7. Synthetica – Java Look and Feel [http://www.javasoft.de/jsf/public/products/synthetica]

8. SwingWorker [https://swingworker.dev.java.net/]

9. IzPack – Package once, Deploy everywhere [http://www.izpack.org/]

10. Sak SC, Barrett JH, Paul AB, Bishop DT, Kiltie AE: Comprehensive analysis of 22 XPC polymorphisms and bladder cancer risk. Cancer Epidemiol Biomarkers Prev 2006, 15(12):2537-2541.

11. Choudhury A, Elliott F, Iles MM, Churchman M, Bristow RG, Bishop DT, Kiltie AE: Analysis of variants in DNA damage signalling genes in bladder cancer. BMC Med Genet 2008, 9:69.

12. Golledge J, Muller J, Shephard N, Clancy P, Smallwood L, Moran C, Dear AE, Palmer LJ, Norman PE: Association between osteopontin and human abdominal aortic aneurysm. Arterioscler Thromb Vasc Biol 2007, 27(3):655-660.

13. Smallwood L, Allcock R, van Bockxmeer F, Warrington N, Palmer LJ, Iacopetta B, Golledge J, Norman PE: Polymorphisms of the matrix metalloproteinase 9 gene and abdominal aortic aneurysm. Br J Surg 2008, 95(10):1239-1244.

14. Smallwood L, Allcock R, van Bockxmeer F, Warrington N, Palmer LJ, Iacopetta B, Norman PE: Polymorphisms of the interleukin-6 gene promoter and abdominal aortic aneurysm. Eur J Vasc Endovasc Surg 2008, 35(1):31-36.

15. Horne BD, Camp NJ, Anderson JL, Mower CP, Clarke JL, Kolek MJ, Carlquist JF: Multiple less common genetic variants explain the association of the cholesteryl ester transfer protein gene with coronary artery disease. J Am Coll Cardiol 2007, 49(20):2053-2060.

16. McCaskie PA, Beilby JP, Chapman CM, Hung J, McQuillan BM, Thompson PL, Palmer LJ: Cholesteryl ester transfer protein gene haplotypes, plasma high-density lipoprotein levels and the risk of coronary heart disease. Hum Genet 2007, 121(3–4):401-411.

17. Carter KW, Hung J, Powell BL, Wiltshire S, Foo BT, Leow YC, McQuillan BM, Jennens M, McCaskie PA, Thompson PL, Beilby JP, Palmer LJ: Association of Interleukin-1 gene polymorphisms with central obesity and metabolic syndrome in a coronary heart disease population. Hum Genet 2008, 124(3):199-206.

Additional file 1

SimHap GUI file selection screen. This screenshot shows the selection of phenotype and genotype CSV files for analysis in SimHap GUI. [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S1.png]

Additional file 2

SimHap GUI input parameter selection screen. Following selection of input files, this screenshot shows the user specifying input parameters, and a summary of the input data file characteristics. [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S2.png]

Additional file 3

SimHap GUI major allele selection screen. After the user has selected to perform a 'single SNP' analysis, the user can specify the major allele for each polymorphism in the input genotype file (as illustrated in this screenshot). [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S3.png]

Additional file 4

SimHap GUI normality plots. This screenshot shows the user checking whether quantitative variables to be analysed are normally distributed. This screen option is available when the user is ready to select a particular type of outcome (binary, quantitative, longitudinal or right-censored) for analysis. [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S4.png]

Additional file 5

SimHap GUI model building screen for single SNP analysis. This screenshot shows the model building screen in SimHap GUI, where the user has selected to analyse a quantitative outcome (HDL), and has selected various covariates (SEX, AGE, BMI, SMOKE) and a polymorphism of interest (SNP1). [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S5.png]

Additional file 6

SimHap GUI model parameters. This screenshot shows the display presented after the model building screen, where the user can specify additional subset parameters, and other statistical parameters. [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S6.png]

Additional file 7

SimHap GUI results summary. After the user has built their desired statistical model, SimHap GUI runs the analysis, and the summary results are presented as illustrated in this screenshot. Statistically significant results are highlighted in red for easy identification. [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S7.png]

Additional file 8

SimHap GUI detailed results summary. The screenshot shows the detailed statistical information provided, in addition to the summary statistics described in the previous figure. For example, marginal means by genotype group are provided in this detailed summary. [http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S8.png]

18. Powell BL, Wiltshire S, Arscott G, McCaskie PA, Hung J, McQuillan BM, Thompson PL, Carter KW, Palmer LJ, Beilby JP: Association of PARL rs3732581 genetic variant with insulin levels, metabolic syndrome and coronary artery disease. Hum Genet 2008, 124(3):263-270.

19. Thompson SR, McCaskie PA, Beilby JP, Hung J, Jennens M, Chapman C, Thompson P, Humphries SE: IL18 haplotypes are associated with serum IL-18 concentrations in a population-based study and a cohort of individuals with premature coronary heart disease. Clin Chem 2007, 53(12):2078-2085.

20. Thompson SR, Sanders J, Stephens JW, Miller GJ, Humphries SE: A common interleukin 18 haplotype is associated with higher body mass index in subjects with diabetes and coronary heart disease. Metabolism 2007, 56(5):662-669.

21. McCaskie PA, Beilby JP, Hung J, Chapman CM, McQuillan BM, Powell BL, Thompson PL, Palmer LJ: 15-Lipoxygenase gene variants are associated with carotid plaque but not carotid intima-media thickness. Hum Genet 2008, 123(5):445-453.

22. McCaskie PA, Cadby G, Hung J, McQuillan BM, Chapman CM, Carter KW, Thompson PL, Palmer LJ, Beilby JP: The C-480T hepatic lipase polymorphism is associated with HDL-C but not with risk of coronary heart disease. Clin Genet 2006, 70(2):114-121.

23. Wiltshire S, Powell BL, Jennens M, McCaskie PA, Carter KW, Palmer LJ, Thompson PL, McQuillan BM, Hung J, Beilby JP: Investigating the association between K198N coding polymorphism in EDN1 and hypertension, lipoprotein levels, the metabolic syndrome and cardiovascular disease. Hum Genet 2008, 123(3):307-313.

24. Xiao J, Zhang F, Wiltshire S, Hung J, Jennens M, Beilby JP, Thompson PL, McQuillan BM, McCaskie PA, Carter KW, Palmer LJ, Powell BL: The apolipoprotein AII rs5082 variant is associated with reduced risk of coronary artery disease in an Australian male population. Atherosclerosis 2008, 199(2):333-339.

25. Brunner EJ, Kivimaki M, Witte DR, Lawlor DA, Davey Smith G, Cooper JA, Miller M, Lowe GD, Rumley A, Casas JP, Shah T, Humphries SE, Hingorani AD, Marmot MG, Timpson NJ, Kumari M: Inflammation, insulin resistance, and diabetes – Mendelian randomization using CRP haplotypes points upstream. PLoS Med 2008, 5(8):e155.

26. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G: Mendelian randomization: Using genes as instruments for making causal inferences in epidemiology. Statistics in Medicine 2008, 27(8):1133-1163.

27. Kivimaki M, Lawlor DA, Smith GD, Kumari M, Donald A, Britton A, Casas JP, Shah T, Brunner E, Timpson NJ, Halcox JP, Miller MA, Humphries SE, Deanfield J, Marmot MG, Hingorani AD: Does high C-reactive protein concentration increase atherosclerosis? The Whitehall II Study. PLoS ONE 2008, 3(8):e3013.

28. Smith G, Wilkie MJ, Deeni YY, Farr PM, Ferguson J, Wolf CR, Ibbotson SH: Melanocortin 1 receptor (MC1R) genotype influences erythemal sensitivity to psoralen-ultraviolet A photochemotherapy. Br J Dermatol 2007, 157(6):1230-1234.

29. Gao L, Flores C, Fan-Ma S, Miller EJ, Moitra J, Moreno L, Wadgaonkar R, Simon B, Brower R, Sevransky J, Tuder RM, Maloney JP, Moss M, Shanholtz C, Yates CR, Meduri GU, Ye SQ, Barnes KC, Garcia JG: Macrophage migration inhibitory factor in acute lung injury: expression, biomarker, and associations. Transl Res 2007, 150(1):18-29.

30. Thakkinstian A, Dmitrienko S, Gerbase-Delima M, McDaniel DO, Inigo P, Chow KM, McEvoy M, Ingsathit A, Trevillian P, Barber WH, Attia J: Association between cytokine gene polymorphisms and outcomes in renal transplantation: a meta-analysis of individual patient data. Nephrol Dial Transplant 2008, 23(9):3017-3023.

31. Gallego PH, Shephard N, Bulsara MK, van Bockxmeer FM, Powell BL, Beilby JP, Arscott G, Le Page M, Palmer LJ, Davis EA, Jones TW, Choong CS: Angiotensinogen gene T235 variant: a marker for the development of persistent microalbuminuria in children and adolescents with type 1 diabetes mellitus. J Diabetes Complications 2008, 22(3):191-198.

Additional file 6

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 5

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 1

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 8

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 3

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 2

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 7

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Additional file 4

Data

December 2008

Kim Warwick Carter · Pamela McCaskie · Lyle John Palmer

Download

Vitamin D receptor and calcium-sensing receptor polymorphisms and colorectal cancer survival in the Newfoundland population

Article

Full-text available

Aug 2017

Background: Increased serum levels of vitamin D and calcium have been associated with lower risks of colorectal cancer (CRC) incidence and mortality. These inverse associations may be mediated by the vitamin D receptor (VDR) and the calcium-sensing receptor (CASR). We investigated genetic variants in VDR and CASR for their relevance to CRC prognosis. Methods: A population-based cohort of 531 CRC patients diagnosed from 1999 to 2003 in Newfoundland and Labrador, Canada, was followed for mortality and cancer recurrence until April 2010. Germline DNA samples were genotyped with the Illumina Omni-Quad 1 Million chip. Multivariate Cox models assessed 41 tag single-nucleotide polymorphisms and relative haplotypes on VDR and CASR in relation to all-cause mortality (overall survival, OS) and disease-free survival (DFS). Results: Gene-level associations were observed between VDR and the DFS of rectal cancer patients (P=0.037) as well as between CASR and the OS of colon cancer patients (P=0.014). Haplotype analysis within linkage blocks of CASR revealed the G-G-G-G-G-A-C haplotype (rs10222633-rs10934578-rs3804592-rs17250717-A986S-R990G-rs1802757) to be associated with a decreased OS of colon cancer (HR, 3.15; 95% CI, 1.66-5.96). Potential interactions were seen among prediagnostic dietary calcium intake with the CASR R990G (Pint=0.040) and the CASR G-T-G-G-G-G-C haplotype for rs10222633-rs10934578-rs3804592-rs17250717-A986S-R990G-rs1802757 (Pint=0.017), with decreased OS time associated with these variants limited to patients consuming dietary calcium below the median, although the stratified results were not statistically significant after correction for multiple testing. Conclusions: Polymorphic variations in VDR and CASR may be associated with survival after a diagnosis of CRC.British Journal of Cancer advance online publication, 1 August 2017; doi:10.1038/bjc.2017.242 www.bjcancer.com.

Effect of angiotensin-converting enzyme tag single nucleotide polymorphisms on the outcome of patients with traumatic brain injury

Article

Jul 2015

Genetic variants appear to influence, at least to some degree, the extent of brain injury and the clinical outcome of patients who have sustained a traumatic brain injury (TBI). Angiotensin-converting enzyme (ACE) is a zinc metallopeptidase that is implicated in the regulation of blood pressure and cerebral circulation. ACE gene polymorphisms were found to regulate serum ACE enzyme activity. The present study aimed to investigate the possible influence of ACE gene region variants on patients' outcome after TBI. In total, 363 TBI patients prospectively enrolled in the study were genotyped for five tag single nucleotide polymorphisms (SNPs) across the ACE gene. Using logistic regression analyses, tag SNPs and their constructed haplotypes were tested for associations with 6-month Glasgow Outcome Scale scores, after adjustment for age, sex, Glasgow Coma Scale scores at admission, and the presence of a hemorrhagic event in the initial computed tomography scan. Significant effects on TBI outcome were found for three neighboring tag SNPs in the codominant (genotypic) model of inheritance [rs4461142: odds ratio (OR) 0.26, 95% confidence interval (CI) 0.12-0.57, P=0.0001; rs7221780: OR 2.67, 95% CI 1.25-5.72, P=0.0003; and rs8066276: OR 3.82, 95% CI 1.80-8.13, P=0.0002; for the heterozygous variants compared with the common alleles]. None of the constructed common tag SNP haplotypes was associated with TBI outcome. The present study provides evidence of the possible influence of genetic variations in a specific region of the ACE gene on the outcome of TBI patients. This association may have pharmacogenetic implications in identifying those TBI patients who may benefit from ACE inhibition.

Dannlowski et al Int J neuropsychopharmacol 2013

Data

Jun 2014

Association of the Common Genetic Polymorphisms and Haplotypes of the Chymase Gene with Left Ventricular Mass in Male Patients with Symptomatic Aortic Stenosis

Article

May 2014
PLOS ONE

We investigated the association between polymorphisms and haplotypes of the chymase 1 gene (CMA1) and the left ventricular mass index (LVM/BSA) in a large cohort of patients with aortic stenosis (AS). Additionally, the gender differences in cardiac remodeling and hypertrophy were analyzed. The genetic background may affect the myocardial response to pressure overload. In human cardiac tissue, CMA1 is involved in angiotensin II production and TGF-β activation, which are two major players in the pathogenesis of hypertrophy and fibrosis. Preoperative echocardiographic data from 648 patients with significant symptomatic AS were used. The LVM/BSA was significantly lower (p<0.0001), but relative wall thickness (RWT) was significantly higher (p = 0.0009) in the women compared with the men. The haplotypes were reconstructed using six genotyped polymorphisms: rs5248, rs4519248, rs1956932, rs17184822, rs1956923, and rs1800875. The haplotype h1.ACAGGA was associated with higher LVM/BSA (p = 9.84×10⁻⁵), and the haplotype h2.ATAGAG was associated with lower LVM/BSA (p = 0.0061) in men, and no significant differences were found in women. Two polymorphisms within the promoter region of the CMA1 gene, namely rs1800875 (p = 0.0067) and rs1956923 (p = 0.0015), influenced the value of the LVM/BSA in males. The polymorphisms and haplotypes of the CMA1 locus are associated with cardiac hypertrophy in male patients with symptomatic AS. Appropriate methods for the indexation of heart dimensions revealed substantial sex-related differences in the myocardial response to pressure overload.

A comprehensive investigation of variants in genes encoding adiponectin (ADIPOQ) and its receptors (ADIPOR1/R2), and their association with serum adiponectin, type 2 diabetes, insulin resistance and the metabolic syndrome

Article

Jan 2013
BMC MED GENET

Background: Low levels of serum adiponectin have been linked to central obesity, insulin resistance, metabolic syndrome, and type 2 diabetes. Variants in ADIPOQ, the gene encoding adiponectin, have been shown to influence serum adiponectin concentration, and along with variants in the adiponectin receptors (ADIPOR1 and ADIPOR2) have been implicated in metabolic syndrome and type 2 diabetes. This study aimed to comprehensively investigate the association of common variants in ADIPOQ, ADIPOR1 and ADIPOR2 with serum adiponectin and insulin resistance syndromes in a large cohort of European-Australian individuals. Methods: Sixty-four tagging single nucleotide polymorphisms in ADIPOQ, ADIPOR1 and ADIPOR2 were genotyped in two general population cohorts consisting of 2,355 subjects, and one cohort of 967 subjects with type 2 diabetes. The association of tagSNPs with outcomes was evaluated using linear or logistic modelling. Meta-analysis of the three cohorts was performed by random-effects modelling. Results: Meta-analysis revealed nine genotyped tagSNPs in ADIPOQ significantly associated with serum adiponectin across all cohorts after adjustment for age, gender and BMI, including rs10937273, rs12637534, rs1648707, rs16861209, rs822395, rs17366568, rs3774261, rs6444175 and rs17373414. The results of haplotype-based analyses were also consistent. Overall, the variants in the ADIPOQ gene explained <5% of the variance in serum adiponectin concentration. None of the ADIPOR1/R2 tagSNPs were associated with serum adiponectin. There was no association between any of the genetic variants and insulin resistance or metabolic syndrome. A multi-SNP genotypic risk score for ADIPOQ alleles revealed an association with 3 independent SNPs, rs12637534, rs16861209, rs17366568 and type 2 diabetes after adjusting for adiponectin levels (OR=0.86, 95% CI=(0.75, 0.99), P=0.0134). Conclusions: Genetic variation in ADIPOQ, but not its receptors, was associated with altered serum adiponectin. However, genetic variation in ADIPOQ and its receptors does not appear to contribute to the risk of insulin resistance or metabolic syndrome but did for type 2 diabetes in a European-Australian population.

Vitamin D, Vitamin D Binding Protein, and Cardiovascular Disease

Chapter

Aug 2013

Common genetic polymorphisms and haplotypes of chymase gene affect left ventricular hypertrophy in male patients with symptomatic aortic stenosis

Article

Aug 2013

Purpose: The factors that are responsible for the extent of cardiac hypertrophy in patients with aortic stenosis (AS) are not well defined and a polygenic background is suggested. Angiotensin II (AngII) and transforming growth factor-β (TGF-β) are considered to be crucial factors in the pathogenesis of fibrotic and hypertrophic remodeling in the pressure-overloaded heart. Chymase is not only the major Ang II–generating enzyme in the human heart but also is able to activate TGF-β in the cardiac tissue. The goal of this study was to examine the association between genetic variation of the CMA1 locus and left ventricular hypertrophy in a large cohort of patients with aortic stenosis. Methods: A total of 648 patients with significant, symptomatic AS were studied. Preoperative echocardiographic data were used to calculate left ventricular mass index (LVMI). A comprehensive tagging SNP and haplotype approach was implemented. Polymorphisms were genotyped using SNPlex or TaqMan methodology. Generalized linear models were applied to analyze data using a crude model or a full model adjusted for sex, age, ejection fraction, weight, and maximal aortic gradient. P-values were designated as significant for p < 0.0125 on the basis of Bonferroni correction for independent tests. Results: Haplotypes were reconstructed using six genotyped polymorphisms: rs5248, rs4519248, rs1956932, rs17184822, rs1956923, and rs1800875. The haplotype h1.ACAGGA (51.46%) was associated with higher LVMI (p=0.0006), whereas h2.ATAGAG (23.09%) had a protective effect (p=0.0113) in the additive genetic model in all patients. The associations were stronger among male patients: h1.ACAGGA (p=9.84×10⁻⁵ and p=0.0012, crude and full model respectively) and h2.ATAGAG (p=0.0061 and p=0.0161); no significant differences were found in women. Two SNPs located in the promoter region of the CMA1 gene, rs1800875 (p=0.0067) and rs1956923 (p=0.0015), showed an association with the LVMI in the additive genetic model in males. Conclusions: Genetic variants of the CMA1 locus are associated with cardiac hypertrophy in male patients with symptomatic aortic stenosis. This gender effect could be explained by profound differences in TGF-β pathway activation between women and men in response to pressure overload.

coreSNP: Parallel Processing of Microarray Data

Article

Dec 2014

The availability of high-throughput technologies, such as next generation sequencing and microarrays, and the diffusion of genomics studies to large populations are producing an increasing amount of experimental data. In particular, pharmacogenomics studies the impact of genetic variation on drug response in patients and correlates gene expression or single nucleotide polymorphisms (SNPs) with the toxicity or efficacy of a drug, with the aim of improving drug therapy with respect to the patients' genotype, ensuring maximum efficacy with minimal adverse effects. However, the storage, preprocessing, and analysis of experimental data are becoming a main bottleneck in the pharmacogenomics analysis pipeline, due to the increasing number of genes and patients investigated. This paper presents a new parallel software tool named coreSNP for the parallel preprocessing and statistical analysis of DMET (Drug Metabolism Enzymes and Transporters) SNP microarray data produced by Affymetrix for pharmacogenomics studies. The scalable multi-threaded implementation of coreSNP makes it possible to handle the huge volumes of experimental pharmacogenomics data in a very efficient way, while its easy-to-use graphical user interface and its ability to annotate significant SNPs allow biologists to interpret the results easily. Performance evaluation conducted using real datasets shows good speed-up and scalability and effective response times.

AQP4 Tag Single Nucleotide Polymorphisms in Patients with Traumatic Brain Injury

Article

Jul 2014

Accumulating evidence suggests that the extent of brain injury and the clinical outcome after Traumatic Brain Injury (TBI) are modulated, to some degree, by genetic variants. Aquaporin-4 (AQP4) is the predominant water channel in the central nervous system and plays a critical role in controlling the water content of brain cells and the development of brain edema after TBI. We sought to investigate the influence of the AQP4 gene region on the patients' outcome after TBI by genotyping tag single nucleotide polymorphisms (SNPs) along the AQP4 gene. A total of 363 TBI patients (19.6% female) were prospectively evaluated. Data including the Glasgow Coma Scale (GCS) scores at admission, the presence of intracranial haemorrhage and the 6-month Glasgow Outcome Scale (GOS) scores were collected. Seven tag SNPs across the AQP4 gene were identified based on the HapMap data. Using logistic regression analyses, SNPs and haplotypes were tested for associations with 6-month GOS after adjusting for age, GCS and gender. Significant associations with TBI outcome were detected for rs3763043 [OR (95% CI): 5.15 (1.60-16.5), p=0.006, for the recessive model], rs3875089 [OR (95% CI): 0.18 (0.07-0.50), p=0.0009, for the allele difference model] and a common haplotype of AQP4 tag SNPs [OR (95% CI): 2.94 (1.34-6.36), p=0.0065]. AQP4 tag SNPs were not found to influence the initial severity of TBI or the presence of intracranial haemorrhages. In conclusion, the present study provides evidence for possible involvement of genetic variations in the AQP4 gene in the functional outcome of TBI patients.

Polymorphisms in genes within the IGF-axis influence antenatal and postnatal growth

Article

Apr 2013

Two pregnancy cohorts were used to investigate the association between single-nucleotide polymorphisms (SNPs) in genes within the insulin-like growth factor (IGF)-axis and antenatal and postnatal growth from birth to adolescence. Longitudinal analyses were conducted in the Raine pregnancy cohort (n = 1162) using repeated measures of fetal head circumference (HC), abdominal circumference (AC) and femur length (FL) from 18 to 38 weeks gestation and eight measures of postnatal height and weight (1–17 years). Replications of significant associations up to birth were undertaken in the Generation R Study (n = 2642). Of the SNPs within the IGF-axis genes, 40% (n = 58) were associated with measures of antenatal growth (P ≤ 0.05). The majority of these SNPs were in receptors; IGF-1R (23%; n = 34) and IGF-2R (13%; n = 9). Fifteen SNPs were associated with antenatal growth (either AC or HC or FL) in Raine (P ≤ 0.005): five of which remained significant after adjusting for multiple testing. Four of these replicated in Generation R. Associations were identified between 38% (n = 55) of the IGF-axis SNPs and postnatal height and weight; 21% in IGF-1R (n = 31) and 9% in IGF-2R (n = 13). Twenty-six SNPs were significantly associated with both antenatal and postnatal growth; 17 with discordant effects and nine with concordant effects. Genetic variants in the IGF-axis appear to play a significant role in antenatal and postnatal growth. Further replication and new analytic methods are required in order to better understand this key metabolic pathway integrating biologic knowledge about the interaction between IGF-axis components.

Analysis of variants in DNA damage signalling genes in bladder cancer

Article

Jul 2008
BMC MED GENET

Chemicals from occupational exposure and components of cigarette smoke can cause DNA damage in bladder urothelium. Failure to repair DNA damage by DNA repair proteins may result in mutations leading to genetic instability and the development of bladder cancer. Immunohistochemistry studies have shown DNA damage signal activation in precancerous bladder lesions which is lost on progression, suggesting that the damage signalling mechanism acts as a brake to further tumorigenesis. Single nucleotide polymorphisms (SNPs) in DSB signalling genes may alter protein function. We hypothesized that SNPs in DSB signalling genes may modulate predisposition to bladder cancer and influence the effects of environmental exposures. We recruited 771 cases and 800 controls (573 hospital-based and 227 population-based from a previous case-control study) and interviewed them regarding their smoking habits and occupational history. DNA was extracted from a peripheral blood sample and genotyping of 24 SNPs in MRE11, NBS1, RAD50, H2AX and ATM was undertaken using an allelic discrimination method (Taqman). Smoking and occupational dye exposure were strongly associated with bladder cancer risk. Using logistic regression adjusting for age, sex, smoking and occupational dye exposure, there was a marginal increase in risk of bladder cancer for an MRE11 3'UTR SNP (rs2155209; adjusted odds ratio 1.54, 95% CI 1.13-2.08, p = 0.01, for individuals homozygous for the rare allele compared to those carrying the common homozygous or heterozygous genotype). However, in the hospital-based controls, the genotype distribution for this SNP deviated from Hardy-Weinberg equilibrium. None of the other SNPs showed an association with bladder cancer and we did not find any significant interaction between any of these polymorphisms and exposure to smoking or dye exposure. Apart from a possible effect for one MRE11 3'UTR SNP, our study does not support the hypothesis that SNPs in DSB signaling genes modulate predisposition to bladder cancer.

Association of Interleukin-1 gene polymorphisms with central obesity and metabolic syndrome in a coronary heart disease population

Article

Sep 2008
HUM GENET

The objective of this study was to determine whether single nucleotide polymorphisms (SNPs) in the Interleukin-1 (IL-1) gene family are associated with central obesity and metabolic syndrome in a coronary heart disease population. The IL-1α C-889T (rs1800587) and IL-1β +3954 (rs1143634) SNPs were studied in a Western Australian coronary heart disease (CHD) population (N = 556). Subjects who were TT homozygous at either SNP had larger waist circumference (IL-1α: 1.8 cm greater, P = 0.04; IL-1β: 4 cm greater, P = 0.0004) compared with major allele homozygotes. Individuals with two copies of the IL-1α:IL-1β T:T haplotype had greater waist circumference (4.7 cm greater, P = 0.0001) compared to other haplotypes. There was a significant interaction between the IL-1β SNP and BMI level on waist circumference (P = 0.01). When the cohort was stratified by median BMI, TT carriers for IL-1β with above median BMI had greater waist circumference (6.1 cm greater, P = 0.007) compared to baseline carriers, whilst no significant association was seen in the below median group. Similarly, when the cohort was stratified by median fibrinogen level (IL-1α interaction P = 0.01; IL-1β interaction P = 0.04), TT carriers for both SNPs in the above median fibrinogen group had greater waist circumference (IL-1α 2.7 cm greater, P = 0.007; IL-1β 3.3 cm greater, P = 0.003) compared with major allele homozygotes. This association was not seen in the below median group. Also, we found a trend of increased metabolic syndrome for IL-1β TT homozygotes (P = 0.07). In conclusion, our findings suggest that in a CHD population IL-1 gene polymorphisms may be involved in increased central obesity, and the genetic influences are more evident among patients who have a higher level of obesity or inflammatory markers.

Does High C-reactive Protein Concentration Increase Atherosclerosis? The Whitehall II Study

Article

Feb 2008
PLOS ONE

C-reactive protein (CRP), a marker of systemic inflammation, is associated with risk of coronary events and sub-clinical measures of atherosclerosis. Evidence in support of this link being causal would include an association robust to adjustments for confounders (multivariable standard regression analysis) and the association of CRP gene polymorphisms with atherosclerosis (Mendelian randomization analysis). We genotyped 3 tag single nucleotide polymorphisms (SNPs) [+1444T>C (rs1130864); +2303G>A (rs1205) and +4899T>G (rs3093077)] in the CRP gene and assessed CRP and carotid intima-media thickness (CIMT), a structural marker of atherosclerosis, in 4941 men and women aged 50-74 (mean 61) years (the Whitehall II Study). The 4 major haplotypes from the SNPs were consistently associated with CRP level, but not with other risk factors that might confound the association between CRP and CIMT. CRP, assessed both at mean age 49 and at mean age 61, was associated both with CIMT in age and sex adjusted standard regression analyses and with potential confounding factors. However, the association of CRP with CIMT attenuated to the null with adjustment for confounding factors in both prospective and cross-sectional analyses. When examined using genetic variants as the instrument for serum CRP, there was no inferred association between CRP and CIMT. Both multivariable standard regression analysis and Mendelian randomization analysis suggest that the association of CRP with carotid atheroma indexed by CIMT may not be causal.

Inflammation, Insulin Resistance, and Diabetes—Mendelian Randomization Using CRP Haplotypes Points Upstream

Article

Sep 2008
PLOS MED

Raised C-reactive protein (CRP) is a risk factor for type 2 diabetes. According to the Mendelian randomization method, the association is likely to be causal if genetic variants that affect CRP level are associated with markers of diabetes development and diabetes. Our objective was to examine the nature of the association between CRP phenotype and diabetes development using CRP haplotypes as instrumental variables. We genotyped three tagging SNPs (CRP + 2302G > A; CRP + 1444T > C; CRP + 4899T > G) in the CRP gene and measured serum CRP in 5,274 men and women at mean ages 49 and 61 y (Whitehall II Study). Homeostasis model assessment-insulin resistance (HOMA-IR) and hemoglobin A1c (HbA1c) were measured at age 61 y. Diabetes was ascertained by glucose tolerance test and self-report. Common major haplotypes were strongly associated with serum CRP levels, but unrelated to obesity, blood pressure, and socioeconomic position, which may confound the association between CRP and diabetes risk. Serum CRP was associated with these potential confounding factors. After adjustment for age and sex, baseline serum CRP was associated with incident diabetes (hazard ratio = 1.39 [95% confidence interval 1.29-1.51]), HOMA-IR, and HbA1c, but the associations were considerably attenuated on adjustment for potential confounding factors. In contrast, CRP haplotypes were not associated with HOMA-IR or HbA1c (p = 0.52-0.92). The associations of CRP with HOMA-IR and HbA1c were all null when examined using instrumental variables analysis, with genetic variants as the instrument for serum CRP. Instrumental variables estimates differed from the directly observed associations (p = 0.007-0.11). Pooled analysis of CRP haplotypes and diabetes in Whitehall II and Northwick Park Heart Study II produced null findings (p = 0.25-0.88). Analyses based on the Wellcome Trust Case Control Consortium (1,923 diabetes cases, 2,932 controls) using three SNPs in tight linkage disequilibrium with our tagging SNPs also demonstrated null associations. Observed associations between serum CRP and insulin resistance, glycemia, and diabetes are likely to be noncausal. Inflammation may play a causal role via upstream effectors rather than the downstream marker CRP.

Multiple Imputation After 18+ Years

Article

Jun 1996

Donald B. Rubin

Multiple imputation was designed to handle the problem of missing data in public-use data bases where the data-base constructor and the ultimate user are distinct entities. The objective is valid frequency inference for ultimate users who in general have access only to complete-data software and possess limited knowledge of specific reasons and models for nonresponse. For this situation and objective, I believe that multiple imputation by the data-base constructor is the method of choice. This article first provides a description of the assumed context and objectives, and second, reviews the multiple imputation framework and its standard results. These preliminary discussions are especially important because some recent commentaries on multiple imputation have reflected either misunderstandings of the practical objectives of multiple imputation or misunderstandings of fundamental theoretical results. Then, criticisms of multiple imputation are considered, and, finally, comparisons are made to alternative strategies.

Fine genetic mapping using haplotype analysis and the missing data problem

Article

Jan 1998
ANN HUM GENET

D. G. CLAYTON

The genetic basis of many human diseases, especially those with substantial genetic determinants, has been identified. Notable amongst others are cystic fibrosis, Huntington's disease and some forms of cancer. However, the detection of genetic factors with more modest effects such as in bipolar disorders and a majority of the cancers, has been more complicated. Standard linkage analysis procedures may not only have little power to detect such genes but they do, at best, only narrow the location of the disease susceptibility gene to a rather large region. Association studies are therefore necessary to further unveil the aetiological relevance of these factors to disease. However, the number of tests required if such procedures were used in extended genome-wide screens, is prohibitive and as such association studies have seen limited application, except in the investigation of candidate genes. In this paper, we discuss a logistic regression approach as a generalization of this procedure so that it can accommodate clusters of linked markers or candidate genes. Furthermore, we introduce an expectation maximization (E–M) algorithm with which to estimate haplotype frequencies for multiple locus systems with incomplete information on phase.

R: A Language for Data Analysis and Graphics

Article

Sep 1996

In this article we discuss our experience designing and implementing a statistical computing language. In developing this new language, we sought to combine what we felt were useful features from two existing computer languages. We feel that the new language provides advantages in the areas of portability, computational efficiency, memory management, and scoping.

Polymorphisms of the matrix metalloproteinase 9 gene and abdominal aortic aneurysm

Article

Oct 2008
BRIT J SURG

Increased matrix metalloproteinase (MMP) 9 activity has been implicated in the formation of abdominal aortic aneurysm (AAA). The aim was to explore the association between potentially functional variants of the MMP-9 gene and AAA. The -1562C > T and -1811A > T variants of the MMP-9 gene were genotyped in 678 men with an AAA (at least 30 mm in diameter) and 659 control subjects (aortic diameter 19-22 mm) recruited from a population-based trial of screening for AAA. Levels of MMP-9 were measured in a random subset of 300 cases and 84 controls. The association between genetic variants (including haplotypes) and AAA was assessed by multivariable logistic regression. There was no association between the MMP-9-1562C > T (odds ratio (OR) 0.70 (95 per cent confidence interval (c.i.) 0.27 to 1.82)) or -1811A > T (OR 0.71 (95 per cent c.i. 0.28 to 1.85)) genotypes, or the most common haplotype (OR 0.81 (95 per cent c.i. 0.62 to 1.05)) and AAA. The serum MMP-9 concentration was higher in cases than controls, and in minor allele carriers in cases and controls, although the differences were not statistically significant. In this study, the genetic tendency to higher levels of circulating MMP-9 was not associated with AAA.

Association of PARL rs3732581 genetic variant with insulin levels, metabolic syndrome and coronary artery disease

Article

Sep 2008
HUM GENET

PARL (presenilin-associated rhomboid-like) is a mitochondrial protein involved in mitochondrial membrane remodelling, and maps to a quantitative trait locus (3q27) associated with metabolic traits. Recently the rs3732581 (Leu262Val) variant was found to be associated with increased levels of plasma insulin, a finding not replicated in a larger cohort. The aim of the current study was to investigate the associations between rs3732581 and levels of plasma insulin, metabolic syndrome (MetS) and its components, and cardiovascular disease. The CUPID population consisted of 556 subjects with angiographically proven CAD and the CUDAS cohort consisted of 1,109 randomly selected individuals from Perth, Western Australia. Samples were genotyped using mutation-specific PCR. No significant associations were observed between rs3732581 and levels of plasma insulin, glucose, BMI or MetS in either population. However, carriers of the minor allele had significantly lower mean intima-media thickness (IMT) [0.69 mm, 95% CI (0.69, 0.70 mm); P = 0.004], compared with major allele homozygotes [mean IMT = 0.71 mm, 95% CI (0.70, 0.72 mm)] in the CUDAS population. Further analysis using a recessive model showed homozygous carriers of the minor allele were predisposed to CAD [OR 1.55, 95% CI (1.11, 2.16); P = 0.01]. Despite the functional evidence for a role of PARL in regulating insulin levels, no association with rs3732581 was found in the current study. Additionally, there were no associations with glucose levels, BMI or MetS. There were significant effects of the variant on mean IMT and risk of CAD. A role for PARL in metabolic conditions cannot be excluded and more comprehensive genetic studies are warranted.
