SimHap GUI: An intuitive graphical user interface for genetic association analysis

BioMed Central
BMC Bioinformatics
Open Access
Software
Kim W Carter*1,4, Pamela A McCaskie2,3 and Lyle J Palmer3
Address: 1Western Australian Institute for Medical Research and UWA Centre for Medical Research, University of Western Australia, Perth,
Australia, 2School of Mathematics and Statistics, University of Western Australia, Perth, Australia, 3Centre for Genetic Epidemiology and
Biostatistics, University of Western Australia, Perth, Australia and 4Telethon Institute for Child Health Research, UWA Centre for Child Health
Research, University of Western Australia, 100 Roberts Rd, Subiaco, Western Australia 6008, Australia
Email: Kim W Carter* - kcarter@ichr.uwa.edu.au; Pamela A McCaskie - pmccask@cyllene.uwa.edu.au; Lyle J Palmer - lyle@cyllene.uwa.edu.au
* Corresponding author
Abstract
Background: Researchers wishing to conduct genetic association analysis involving single
nucleotide polymorphisms (SNPs) or haplotypes are often confronted with the lack of user-friendly
graphical analysis tools, requiring sophisticated statistical and informatics expertise to perform
relatively straightforward tasks. Tools such as the SimHap package for the R statistics language
provide the necessary statistical operations to conduct sophisticated genetic analysis, but lack a
graphical user interface that allows anyone but a professional statistician to utilise them
effectively.
Results: We have developed SimHap GUI, a cross-platform integrated graphical analysis tool for
conducting epidemiological, single SNP and haplotype-based association analysis. SimHap GUI
features a novel workflow interface that guides the user through each logical step of the analysis
process, making it accessible to both novice and advanced users. This tool provides a seamless
interface to the SimHap R package, while providing enhanced functionality such as sophisticated
data checking, automated data conversion, and real-time estimations of haplotype simulation
progress.
Conclusion: SimHap GUI provides a novel, easy-to-use, cross-platform solution for conducting a
range of genetic and non-genetic association analyses. This provides a free alternative to
commercial statistics packages that is specifically designed for genetic association analysis.
Background
While the growth in the volume of genetic data available has led to many new discoveries, it is becoming increasingly important to find ways to easily analyse large volumes of data. This is certainly the case with genetic association studies, where high-throughput genotyping technologies have brought about the potential for hundreds of thousands of data points per individual subject [1].
A graphical user interface (GUI) is still a rare feature amongst currently available genetic analysis packages, particularly those used to analyse single nucleotide polymorphisms (SNPs) or haplotypes. A well designed user interface would allow users without a comprehensive knowledge of statistical modelling or command line operation to perform complex analyses.
Published: 25 December 2008
BMC Bioinformatics 2008, 9:557 doi:10.1186/1471-2105-9-557
Received: 29 September 2008
Accepted: 25 December 2008
This article is available from: http://www.biomedcentral.com/1471-2105/9/557
© 2008 Carter et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Commercially available statistics software packages, such as SPSS (SPSS Inc., 2008) and Stata (StataCorp., 2008), may be useful, but are not specifically designed to analyse genetic data and require sophisticated prior knowledge from the end-user. Another major annoyance is the lack of integration between statistical and analytical packages [2], often with one program required for epidemiological analysis, a separate program for SNP analysis, and a third for haplotype analysis.
SimHap [3] is a statistical analysis package for genetic association testing, available in R [4], which amongst other features infers haplotypes for unrelated individuals with unknown phase. Although various programs currently exist for haplotype analysis, SimHap is unique in a number of ways. It uses a multiple-imputation (MI) based approach to test for association, which incorporates information about uncertainty around inferred haplotypes. This approach uses current expectation maximisation (EM) methods for the estimation of haplotype frequencies from unphased genotype data [5]. To utilise the posterior distribution of diplotype (haplotype pair) probabilities, the MI approach of Rubin [6] was implemented, where a series of "complete" data sets is generated, each containing all data from the original set as well as additional dummy variables for each haplotype, the values of which indicate the number of copies of that haplotype observed in an individual's diplotype (0, 1 or 2). For individuals with known phase (only one diplotype), the values of these haplotype variables remain constant across the generated data sets. For individuals with ambiguous phase, the haplotype values are sampled from their predictive distribution, which contains only those diplotypes consistent with their genotypes. This is a novel approach that provides an empirical distribution of the haplotypic effects and their significance levels.
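The sampling step of the multiple-imputation approach can be illustrated with a minimal Python sketch. It is not the SimHap implementation: the function names, the input layout and the posterior probabilities are hypothetical, and the probabilities are assumed to have been produced beforehand by an EM step. Each imputed data set holds, for every individual, the 0/1/2 copy count of each haplotype in one sampled diplotype.

```python
import random

def haplotype_counts(diplotype, haplotypes):
    """Return the 0/1/2 copy count of each haplotype in one diplotype."""
    return {h: diplotype.count(h) for h in haplotypes}

def impute_datasets(posteriors, haplotypes, n_imputations, seed=0):
    """Generate completed data sets by sampling one diplotype per individual.

    posteriors maps an individual ID to a list of (diplotype, probability)
    pairs, restricted to diplotypes consistent with that person's genotypes.
    """
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_imputations):
        completed = {}
        for person, candidates in posteriors.items():
            diplotypes = [d for d, _ in candidates]
            weights = [p for _, p in candidates]
            sampled = rng.choices(diplotypes, weights=weights, k=1)[0]
            completed[person] = haplotype_counts(sampled, haplotypes)
        datasets.append(completed)
    return datasets

# Hypothetical two-SNP example: id1 has known phase, id2 is ambiguous.
haplotypes = ["AC", "AT", "GC", "GT"]
posteriors = {
    "id1": [(("AC", "AC"), 1.0)],
    "id2": [(("AC", "GT"), 0.7), (("AT", "GC"), 0.3)],
}
datasets = impute_datasets(posteriors, haplotypes, n_imputations=5)
```

In this sketch the individual with known phase receives identical haplotype counts in every data set, while the ambiguous individual varies between the two diplotypes consistent with the genotypes. Fitting the association model to each data set in turn would then give the empirical distribution of effects described above.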
We have developed SimHap GUI as an intuitive graphical
tool for conducting genetic association analysis. At its
core, SimHap GUI utilises the SimHap R package and the
R statistical language. SimHap GUI is a novel custom-
designed integrated tool for conducting epidemiological,
single SNP and haplotype-based association analyses
within a single application, and provides a free alternative
to commercially available statistics packages.
Results and discussion
Implementation
SimHap GUI is written in Java (requires Java 1.5+) and will operate on platforms where Java is available. This tool has been successfully tested on Windows, Linux and MacOS operating systems. SimHap GUI requires an installation of the R statistics language (2.4.0+) and an installation of the SimHap R package. This tool runs optimally on a computer with a monitor resolution of 1024 × 768, at least 128 MB of RAM and a Pentium 4+ CPU. SimHap GUI has been successfully operated on datasets with thousands of individuals, hundreds of phenotype variables, and thousands of SNPs. In general, SimHap GUI is limited only by the amount of system memory available to Java.
The SimHap GUI interface is written in Java Swing, and uses the Synthetica look-and-feel suite [7] to enhance the usability and functionality of the interface (compared with standard Swing interfaces). We have also utilised the SwingWorker [8] library, which provides a mechanism for updating the user interface while long analytical tasks, such as thousands of haplotype simulations, are running. Both Synthetica and SwingWorker are provided with the SimHap GUI installation. SimHap GUI is provided as a single cross-platform installer, using the IzPack [9] packaging system, which provides a simple standardised graphical installer that both technical and non-technical users will be comfortable with.
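The idea of reporting progress from a long-running task to an interface can be sketched in Python with a worker thread and a queue. This is only an analogue of the pattern, not the Java SwingWorker code used by SimHap GUI; the function names are hypothetical, the sleep call stands in for one haplotype simulation, and the remaining-time formula (average time per completed simulation multiplied by the number left) is an assumption.

```python
import queue
import threading
import time

def run_simulations(n_simulations, progress_queue):
    """Run placeholder simulations, reporting progress after each one."""
    for done in range(1, n_simulations + 1):
        time.sleep(0.001)  # stand-in for one haplotype simulation
        progress_queue.put(done)
    progress_queue.put(None)  # sentinel: all work finished

def estimate_remaining(elapsed, done, total):
    """Estimate seconds remaining from the average time per completed task."""
    return elapsed / done * (total - done)

def monitor(n_simulations):
    """Start the worker and collect (done, estimated seconds left) updates."""
    progress_queue = queue.Queue()
    worker = threading.Thread(
        target=run_simulations, args=(n_simulations, progress_queue)
    )
    start = time.monotonic()
    worker.start()
    updates = []
    while True:
        done = progress_queue.get()  # blocks until the worker reports
        if done is None:
            break
        elapsed = time.monotonic() - start
        updates.append((done, estimate_remaining(elapsed, done, n_simulations)))
    worker.join()
    return updates

updates = monitor(20)
```

In a graphical application the loop in `monitor` would be replaced by the toolkit's own mechanism for delivering updates to the interface thread, which is the role SwingWorker plays in Swing.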
Graphical User Interface (GUI)
SimHap GUI allows the user to conduct association analysis of binary, quantitative, longitudinal and survival (right-censored) outcomes using phenotypic, SNP and haplotype data in unrelated individuals.
One key feature of SimHap GUI is the workflow interface, which guides the user through each logical step of the analytical process. This workflow concept is central to providing an intuitive user interface accessible to both novice and advanced users.
The user initially selects a standard comma separated value (CSV) file containing phenotypic information for a set of individuals (one row of data per person), as can be obtained from most spreadsheet and statistics software. The user also selects a CSV file containing genotypes for a series of SNP markers for the same individuals (not required for non-genetic modelling), and selects the character(s) signifying missing data in the input files. SimHap GUI examines the input files to ensure correct formatting, completeness, and matching individual identifiers between the phenotype and genotype files. Genotype files are examined to ensure that only biallelic SNPs are input, and the user is given the option to remove multi-allelic markers. Once data checking is complete, the user can choose to perform epidemiological modelling (without genetic markers), single SNP association analysis, or haplotype association analysis. Users are guided through each of these analytical tasks in a straightforward series of steps, with a standardised model building screen central to each of the analysis types.
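The two data checks described above, matching identifiers and biallelic markers, can be sketched in Python as follows. The file layout is an assumption made for illustration: an `ID` column in both files, genotypes written as two alleles separated by `/`, and `NA` or an empty cell for missing data. The function names are hypothetical and this is not the checking code used by SimHap GUI.

```python
import csv
import io

def read_rows(text, id_column):
    """Parse CSV text into a dict keyed by individual identifier."""
    return {row[id_column]: row for row in csv.DictReader(io.StringIO(text))}

def unmatched_ids(phenotypes, genotypes):
    """Return identifiers present in only one of the two files."""
    return sorted(set(phenotypes) ^ set(genotypes))

def multi_allelic_snps(genotypes, id_column, missing):
    """Return SNP columns in which more than two distinct alleles occur."""
    alleles = {}
    for row in genotypes.values():
        for snp, call in row.items():
            if snp == id_column or call in missing:
                continue
            alleles.setdefault(snp, set()).update(call.split("/"))
    return sorted(snp for snp, seen in alleles.items() if len(seen) > 2)

# Hypothetical input files, held as strings so the example is self-contained.
pheno_csv = "ID,sex,hdl\n1,M,1.2\n2,F,1.6\n3,F,NA\n"
geno_csv = "ID,SNP_1,SNP_2\n1,A/G,C/T\n2,G/G,C/C\n4,A/A,C/G\n"
phenotypes = read_rows(pheno_csv, "ID")
genotypes = read_rows(geno_csv, "ID")
missing = {"NA", ""}

print(unmatched_ids(phenotypes, genotypes))
print(multi_allelic_snps(genotypes, "ID", missing))
```

Here individuals 3 and 4 each appear in only one file, and `SNP_2` carries three alleles (C, T and G), so it would be offered to the user for removal.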
Figure 1 is an example of the model building screen for a single SNP analysis with a quantitative outcome using SimHap GUI. At the top of the screen, hdl (cholesterol) has been selected as the outcome of interest, with the outcome treated as normally distributed (Untransformed). Log base 10 and natural log transformations are available for non-Normally distributed outcomes. The MAIN EFFECTS section lists the available and selected covariates for this model, namely sex, age, bmi and smoke. Covariates can also be added as squared or cubic terms, logged (base 10 or natural log), and as factors (for categorical terms). The GENOTYPES section lists the available and selected SNPs to be analysed in the model. SNP covariates are denoted with the S_ prefix, while the _add, _dom and _rec suffixes refer to analysing the SNP under an additive, dominant or recessive genetic model. SNPs can also be analysed under a codominant model by adding the SNP as a factor. The INTERACTIONS section lists the available and selected covariate terms to be analysed for statistical interactions; in this case, an interaction between sex and SNP_1 under a codominant model. Additional files 1 to 8 provide a graphical representation of each phase of an example single SNP analysis. The SimHap GUI software manual also provides a detailed description of the analysis process.
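The additive, dominant and recessive codings can be made concrete with a short Python sketch. It assumes that genotypes are written as two alleles separated by `/` and that each model is defined with respect to the minor allele. The column names follow the S_ prefix and _add/_dom/_rec suffixes described above, but their exact construction here is hypothetical rather than taken from SimHap GUI.

```python
def code_genotype(genotype, minor_allele):
    """Code one genotype under additive, dominant and recessive models."""
    copies = genotype.split("/").count(minor_allele)
    return {
        "add": copies,            # 0, 1 or 2 copies of the minor allele
        "dom": int(copies >= 1),  # at least one copy of the minor allele
        "rec": int(copies == 2),  # homozygous for the minor allele
    }

def snp_columns(name, genotypes, minor_allele):
    """Build S_<name>_add, _dom and _rec columns for a list of genotypes."""
    columns = {f"S_{name}_{model}": [] for model in ("add", "dom", "rec")}
    for genotype in genotypes:
        for model, value in code_genotype(genotype, minor_allele).items():
            columns[f"S_{name}_{model}"].append(value)
    return columns

columns = snp_columns("SNP_1", ["A/A", "A/G", "G/G"], minor_allele="G")
print(columns)
```

For the three genotypes A/A, A/G and G/G with G as the minor allele, the additive column is 0, 1, 2, the dominant column is 0, 1, 1 and the recessive column is 0, 0, 1. A codominant model would instead treat the three genotypes as levels of a categorical factor.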
Case Studies
SimHap GUI, and its earlier Beta 1 and Beta 2.1 releases, have been extensively utilised in a range of recently published genetics projects.
In the area of cancer research, SimHap GUI has been used in studies such as Sak et al [10], to examine the association between polymorphisms in the XPC gene and bladder cancer susceptibility. Choudhury et al [11] also examined haplotypes of DNA repair proteins to find genetic variants that may modulate predisposition to bladder cancer.
Figure 1. Example SimHap GUI model building interface.
SimHap GUI has been used extensively in the field of cardiovascular disease genetics. Several studies have used this tool to examine SNP and haplotype effects of genes related to abdominal aortic aneurysm [12-14]. Studies by both Horne et al [15] and McCaskie et al [16] have used SimHap GUI to investigate the association between genetic variation in the cholesteryl ester transfer protein gene and cardiovascular disease. SimHap GUI has also been used to investigate SNP and haplotype associations with metabolic syndrome [17-20] and atherosclerosis [21-24] related outcomes.
In the area of genetic epidemiology related to the Mendelian Randomization (MR) technique, a number of groups have utilised SimHap GUI. Brunner and colleagues [25] used SimHap GUI to generate haplotypes for three tagging polymorphisms from the C-reactive protein (CRP) gene in a study of 5,274 men and women. Studies by Lawlor et al [26] and Kivimaki et al [27] similarly used this software for analysis of CRP mutations using MR.
Other diverse studies include the use of SimHap GUI to investigate genetic influences of the melanocortin 1 receptor on sensitivity to photochemotherapy [28], polymorphisms within the macrophage migration inhibitory factor in relation to acute lung injury in patients with sepsis [29], associations between cytokine polymorphisms and outcomes after renal transplantation [30], and genetic predictors for the development of microalbuminuria in children with type 1 diabetes [31].
The wide range of example publications described here highlights the significance of SimHap GUI in providing an easy-to-use, powerful interface for genetic association analysis by both novice and advanced users.
GUI versus R package
One of the critical distinctions to make with the SimHap GUI software is the difference between the SimHap R package and the Java based interface described in this manuscript. The backend SimHap R package simply provides the statistical operations to conduct particular analytical tasks, with the onus on the user to have technical knowledge of the statistical methods being employed and expertise with the command line interface of the R language. Users who are not professional statisticians may be discouraged by the difficulty of operating under a command-line interface.
The SimHap GUI interface provides the functionality, accessibility and guided analytical approach that cannot be found in the command line package. The user interface is designed around the premise of a workflow analysis model, which mimics the logical analytical processes required to conduct a particular statistical test. This user-friendly, intuitive interface has been designed to satisfy the needs of both the technical and non-technical statistical user, and does not require sophisticated informatics knowledge to operate. Using the novel model building interface, users can perform tasks ranging from simple univariate linear modelling through to more sophisticated tasks such as multivariate modelling of longitudinal outcomes with gene:gene and gene:environment interactions. A standardised interface is provided for users to conduct epidemiological (no genetic factors), single SNP and haplotype association analyses.
Features of SimHap GUI that are not provided in the SimHap R package include: an intuitive GUI for model building and guiding the overall analysis process; sophisticated data checking of phenotype and genotype data; automatic conversion of data for single SNP and haplotype association analysis; automatic calculation of allele frequencies and genotype distribution; quantile-quantile plotting to assess Normality of quantitative traits; and real-time estimation of the haplotype imputation simulation progress. SimHap GUI implements all of the functions from the SimHap R package.
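As an illustration of what the automatic calculation of allele frequencies and genotype distribution involves, the following Python sketch tallies both for a single SNP. The genotype notation (two alleles separated by `/`), the missing-data codes and the function names are assumptions made for the example, not details of SimHap GUI.

```python
from collections import Counter

def genotype_distribution(genotypes, missing=("NA", "")):
    """Count each observed genotype, treating A/G and G/A as the same."""
    counts = Counter()
    for call in genotypes:
        if call in missing:
            continue
        counts["/".join(sorted(call.split("/")))] += 1
    return dict(counts)

def allele_frequencies(genotypes, missing=("NA", "")):
    """Return each allele's share of all non-missing allele copies."""
    counts = Counter()
    for call in genotypes:
        if call not in missing:
            counts.update(call.split("/"))
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical genotype calls for one SNP, including one missing value.
calls = ["A/A", "A/G", "G/A", "G/G", "NA", "A/A"]
distribution = genotype_distribution(calls)
frequencies = allele_frequencies(calls)
print(distribution, frequencies)
```

With these five non-missing calls there are two A/A, two A/G and one G/G genotypes, giving six copies of A and four of G out of ten allele copies.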
Conclusion
In summary, SimHap GUI provides a cross-platform, intuitive and integrated interface for conducting a range of genetic and non-genetic association analyses.
Availability and requirements
- Project name: SimHap GUI
- Project home page: http://www.genepi.org.au/simhap
- Operating system(s): Platform independent (tested on
Windows, Linux and MacOS)
- Programming language: Java
- Other requirements: Java 1.5+; R 2.4.0+ (available from
http://www.r-project.org/); SimHap R package from
CRAN (available from http://cran.r-project.org/web/packages/SimHap/index.html)
- Licence: Free for non-commercial use
Authors' contributions
KWC designed and developed the Java GUI interface.
PAM assisted with integration of statistical methods and
aided with design of the GUI. LJP supervised the design
and coordinated the development of the software.
Additional material
Acknowledgements
KWC was supported by the Australian Research Council Discovery
Project DP0663247. This work was supported by the National Health and
Medical Research Council of Australia Project Grant 404009.
References
1. Hirschhorn JN, Daly MJ: Genome-wide association studies for
common diseases and complex traits. Nat Rev Genet 2005,
6(2):95-108.
2. Excoffier L, Heckel G: Computer programs for population
genetics data analysis: a survival guide. Nat Rev Genet 2006,
7(10):745-758.
3. CRAN – SimHap package [http://cran.r-project.org/web/packages/SimHap/index.html]
4. Ihaka R, Gentleman R: R: A Language for Data Analysis and
Graphics. Journal of Computational and Graphical Statistics 1996,
5(3):299-314.
5. Chiano MN, Clayton DG: Fine genetic mapping using haplotype
analysis and the missing data problem. Ann Hum Genet 1998,
62(Pt 1):55-60.
6. Rubin DB: Multiple Imputation After 18+ Years. Journal of the
American Statistical Association 1996, 91(434):473-489.
7. Synthetica – Java Look and Feel [http://www.javasoft.de/jsf/public/products/synthetica]
8. SwingWorker [https://swingworker.dev.java.net/]
9. IzPack – Package once, Deploy everywhere [http://www.izpack.org/]
10. Sak SC, Barrett JH, Paul AB, Bishop DT, Kiltie AE: Comprehensive
analysis of 22 XPC polymorphisms and bladder cancer risk.
Cancer Epidemiol Biomarkers Prev 2006, 15(12):2537-2541.
11. Choudhury A, Elliott F, Iles MM, Churchman M, Bristow RG, Bishop
DT, Kiltie AE: Analysis of variants in DNA damage signalling
genes in bladder cancer. BMC Med Genet 2008, 9:69.
12. Golledge J, Muller J, Shephard N, Clancy P, Smallwood L, Moran C,
Dear AE, Palmer LJ, Norman PE: Association between osteopontin
and human abdominal aortic aneurysm. Arterioscler Thromb
Vasc Biol 2007, 27(3):655-660.
13. Smallwood L, Allcock R, van Bockxmeer F, Warrington N, Palmer LJ,
Iacopetta B, Golledge J, Norman PE: Polymorphisms of the
matrix metalloproteinase 9 gene and abdominal aortic aneurysm.
Br J Surg 2008, 95(10):1239-1244.
14. Smallwood L, Allcock R, van Bockxmeer F, Warrington N, Palmer LJ,
Iacopetta B, Norman PE: Polymorphisms of the interleukin-6
gene promoter and abdominal aortic aneurysm. Eur J Vasc
Endovasc Surg 2008, 35(1):31-36.
15. Horne BD, Camp NJ, Anderson JL, Mower CP, Clarke JL, Kolek MJ,
Carlquist JF: Multiple less common genetic variants explain
the association of the cholesteryl ester transfer protein gene
with coronary artery disease. J Am Coll Cardiol 2007,
49(20):2053-2060.
16. McCaskie PA, Beilby JP, Chapman CM, Hung J, McQuillan BM,
Thompson PL, Palmer LJ: Cholesteryl ester transfer protein
gene haplotypes, plasma high-density lipoprotein levels and
the risk of coronary heart disease. Hum Genet 2007, 121(3–4):401-411.
17. Carter KW, Hung J, Powell BL, Wiltshire S, Foo BT, Leow YC,
McQuillan BM, Jennens M, McCaskie PA, Thompson PL, Beilby JP,
Palmer LJ: Association of Interleukin-1 gene polymorphisms
with central obesity and metabolic syndrome in a coronary
heart disease population. Hum Genet 2008, 124(3):199-206.
Additional file 1
SimHap GUI file selection screen. This screenshot shows the selection of phenotype and genotype CSV files for analysis in SimHap GUI.
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S1.png]
Additional file 2
SimHap GUI input parameter selection screen. Following selection of input files, this screenshot shows the user specifying input parameters, and a summary of the input data file characteristics.
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S2.png]
Additional file 3
SimHap GUI major allele selection screen. After the user has selected to perform a 'single SNP' analysis, the user can specify the major allele for each polymorphism in the input genotype file (as illustrated in this screenshot).
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S3.png]
Additional file 4
SimHap GUI normality plots. This screenshot shows the user checking whether quantitative variables to be analysed are normally distributed. This screen option is available when the user is ready to select a particular type of outcome (binary, quantitative, longitudinal or right-censored) for analysis.
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S4.png]
Additional file 5
SimHap GUI model building screen for single SNP analysis. This screenshot shows the model building screen in SimHap GUI, where the user has selected to analyse a quantitative outcome (HDL), and has selected various covariates (SEX, AGE, BMI, SMOKE) and a polymorphism of interest (SNP1).
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S5.png]
Additional file 6
SimHap GUI model parameters. This screenshot shows the display presented after the model building screen, where the user can specify additional subset parameters, and other statistical parameters.
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S6.png]
Additional file 7
SimHap GUI results summary. After the user has built their desired statistical model, SimHap GUI runs the analysis, and the summary results are presented as illustrated in this screenshot. Statistically significant results are highlighted in red for easy identification.
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S7.png]
Additional file 8
SimHap GUI detailed results summary. The screenshot shows the detailed statistical information provided, in addition to the summary statistics described in the previous figure. For example, marginal means by genotype group are provided in this detailed summary.
[http://www.biomedcentral.com/content/supplementary/1471-2105-9-557-S8.png]
18. Powell BL, Wiltshire S, Arscott G, McCaskie PA, Hung J, McQuillan
BM, Thompson PL, Carter KW, Palmer LJ, Beilby JP: Association of
PARL rs3732581 genetic variant with insulin levels, metabolic
syndrome and coronary artery disease. Hum Genet 2008,
124(3):263-270.
19. Thompson SR, McCaskie PA, Beilby JP, Hung J, Jennens M, Chapman
C, Thompson P, Humphries SE: IL18 haplotypes are associated
with serum IL-18 concentrations in a population-based study
and a cohort of individuals with premature coronary heart
disease. Clin Chem 2007, 53(12):2078-2085.
20. Thompson SR, Sanders J, Stephens JW, Miller GJ, Humphries SE: A
common interleukin 18 haplotype is associated with higher
body mass index in subjects with diabetes and coronary
heart disease. Metabolism 2007, 56(5):662-669.
21. McCaskie PA, Beilby JP, Hung J, Chapman CM, McQuillan BM, Powell
BL, Thompson PL, Palmer LJ: 15-Lipoxygenase gene variants are
associated with carotid plaque but not carotid intima-media
thickness. Hum Genet 2008, 123(5):445-453.
22. McCaskie PA, Cadby G, Hung J, McQuillan BM, Chapman CM, Carter
KW, Thompson PL, Palmer LJ, Beilby JP: The C-480T hepatic
lipase polymorphism is associated with HDL-C but not with
risk of coronary heart disease. Clin Genet 2006, 70(2):114-121.
23. Wiltshire S, Powell BL, Jennens M, McCaskie PA, Carter KW, Palmer
LJ, Thompson PL, McQuillan BM, Hung J, Beilby JP: Investigating the
association between K198N coding polymorphism in EDN1
and hypertension, lipoprotein levels, the metabolic syndrome
and cardiovascular disease. Hum Genet 2008,
123(3):307-313.
24. Xiao J, Zhang F, Wiltshire S, Hung J, Jennens M, Beilby JP, Thompson
PL, McQuillan BM, McCaskie PA, Carter KW, Palmer LJ, Powell BL:
The apolipoprotein AII rs5082 variant is associated with
reduced risk of coronary artery disease in an Australian male
population. Atherosclerosis 2008, 199(2):333-339.
25. Brunner EJ, Kivimaki M, Witte DR, Lawlor DA, Davey Smith G,
Cooper JA, Miller M, Lowe GD, Rumley A, Casas JP, Shah T, Humphries SE,
Hingorani AD, Marmot MG, Timpson NJ, Kumari M:
Inflammation, insulin resistance, and diabetes – Mendelian
randomization using CRP haplotypes points upstream. PLoS
Med 2008, 5(8):e155.
26. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G:
Mendelian randomization: Using genes as instruments for
making causal inferences in epidemiology. Statistics in Medicine
2008, 27(8):1133-1163.
27. Kivimaki M, Lawlor DA, Smith GD, Kumari M, Donald A, Britton A,
Casas JP, Shah T, Brunner E, Timpson NJ, Halcox JP, Miller MA, Humphries SE,
Deanfield J, Marmot MG, Hingorani AD: Does high C-reactive
protein concentration increase atherosclerosis?
The Whitehall II Study. PLoS ONE 2008, 3(8):e3013.
28. Smith G, Wilkie MJ, Deeni YY, Farr PM, Ferguson J, Wolf CR, Ibbotson SH:
Melanocortin 1 receptor (MC1R) genotype influences
erythemal sensitivity to psoralen-ultraviolet A photochemotherapy.
Br J Dermatol 2007, 157(6):1230-1234.
29. Gao L, Flores C, Fan-Ma S, Miller EJ, Moitra J, Moreno L, Wadgaonkar
R, Simon B, Brower R, Sevransky J, Tuder RM, Maloney JP, Moss M,
Shanholtz C, Yates CR, Meduri GU, Ye SQ, Barnes KC, Garcia JG:
Macrophage migration inhibitory factor in acute lung injury:
expression, biomarker, and associations. Transl Res 2007,
150(1):18-29.
30. Thakkinstian A, Dmitrienko S, Gerbase-Delima M, McDaniel DO,
Inigo P, Chow KM, McEvoy M, Ingsathit A, Trevillian P, Barber WH,
Attia J: Association between cytokine gene polymorphisms
and outcomes in renal transplantation: a meta-analysis of
individual patient data. Nephrol Dial Transplant 2008,
23(9):3017-3023.
31. Gallego PH, Shephard N, Bulsara MK, van Bockxmeer FM, Powell BL,
Beilby JP, Arscott G, Le Page M, Palmer LJ, Davis EA, Jones TW,
Choong CS: Angiotensinogen gene T235 variant: a marker for
the development of persistent microalbuminuria in children
and adolescents with type 1 diabetes mellitus. J Diabetes Complications
2008, 22(3):191-198.
... Haplotype analyses were performed using SimHap GUI version 1.0.2 (Carter et al, 2008). Gene-environment (G Â E) interactions were tested through stratified analysis and verified with the Wald method by introducing a multiplicative interaction term into the model and assessing its significance. ...
Article
Full-text available
Background: Increased serum levels of vitamin D and calcium have been associated with lower risks of colorectal cancer (CRC) incidence and mortality. These inverse associations may be mediated by the vitamin D receptor (VDR) and the calcium-sensing receptor (CASR). We investigated genetic variants in VDR and CASR for their relevance to CRC prognosis. Methods: A population-based cohort of 531 CRC patients diagnosed from 1999 to 2003 in Newfoundland and Labrador, Canada, was followed for mortality and cancer recurrence until April 2010. Germline DNA samples were genotyped with the Illumina Omni-Quad 1 Million chip. Multivariate Cox models assessed 41 tag single-nucleotide polymorphisms and relative haplotypes on VDR and CASR in relation to all-cause mortality (overall survival, OS) and disease-free survival (DFS). Results: Gene-level associations were observed between VDR and the DFS of rectal cancer patients (P=0.037) as well as between CASR and the OS of colon cancer patients (P=0.014). Haplotype analysis within linkage blocks of CASR revealed the G-G-G-G-G-A-C haplotype (rs10222633-rs10934578-rs3804592-rs17250717-A986S-R990G-rs1802757) to be associated with a decreased OS of colon cancer (HR, 3.15; 95% CI, 1.66-5.96). Potential interactions were seen among prediagnostic dietary calcium intake with the CASR R990G (Pint=0.040) and the CASR G-T-G-G-G-G-C haplotype for rs10222633-rs10934578-rs3804592-rs17250717-A986S-R990G-rs1802757 (Pint=0.017), with decreased OS time associated with these variants limited to patients consuming dietary calcium below the median, although the stratified results were not statistically significant after correction for multiple testing. Conclusions: Polymorphic variations in VDR and CASR may be associated with survival after a diagnosis of CRC.British Journal of Cancer advance online publication, 1 August 2017; doi:10.1038/bjc.2017.242 www.bjcancer.com.
... Power Calculator for Genetic Studies (Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA). On the basis of the expectation maximization algorithm, haplotype frequencies were estimated within each LD block using the SimHap program (Centre for Genetic Epidemiology and Biostatistics, Western Australian Institute for Medical Research, Crawley, Australia) [15] and the SHEsis platform [16] (http://analysis.bio-x.cn). Only common (frequency > 0.05) haplotypes were used in the analyses. ...
Article
Genetic variants appear to influence, at least to some degree, the extent of brain injury and the clinical outcome of patients who have sustained a traumatic brain injury (TBI). Angiotensin-converting enzyme (ACE) is a zinc metallopeptidase that is implicated in the regulation of blood pressure and cerebral circulation. ACE gene polymorphisms were found to regulate serum ACE enzyme activity. The present study aimed to investigate possible influence of ACE gene region variants on patients' outcome after TBI. In total, 363 TBI patients prospectively enrolled in the study were genotyped for five tag single nucleotide polymorphisms (SNPs) across the ACE gene. Using logistic regression analyses, tag SNPs and their constructed haplotypes were tested for associations with 6-month Glasgow Outcome Scale scores, after adjustment for age, sex, Glasgow Coma Scale scores at admission, and the presence of a hemorrhagic event in the initial computed tomography scan. Significant effects on TBI outcome were found for three neighboring tag SNPs in the codominant (genotypic) model of inheritance [rs4461142: odds ratio (OR) 0.26, 95% confidence interval (CI) 0.12-0.57, P=0.0001; rs7221780: OR 2.67, 95% CI 1.25-5.72, P=0.0003; and rs8066276: OR 3.82, 95% CI 1.80-8.13, P=0.0002; for the heterozygous variants compared with the common alleles]. None of the constructed common tag SNPs haplotypes was associated with TBI outcome. The present study provides evidence of the possible influence of genetic variations in a specific region of the ACE gene on the outcome of TBI patients. This association may have pharmacogenetic implications in identifying those TBI patients who may benefit from ACE inhibition.
... All computations were performed using SimHap v. 1.0.2 software (Carter et al. 2008). All statistical analyses were performed using Stata 11.0 (StataCorp, USA). ...
... Generalized linear models were used to analyze the effects of multiple covariates on a continuous outcome. Haplotypes were inferred for individuals with ambiguous phase, and the haplotype frequencies were estimated and analyzed using the SimHap software [23]. P-values <0.0125 were designated as significant based on the Bonferroni correction for multiple comparisons of the four htSNPs, which were evaluated as independent statistical tests using the simpleM algorithm [24]. ...
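The 0.0125 threshold quoted in this excerpt is what a Bonferroni correction gives for four independent tests, assuming a family-wise significance level of 0.05 (the excerpt does not state the level, so 0.05 is an assumption here). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction.

    alpha   : family-wise significance level (0.05 is assumed below)
    n_tests : number of independent tests (four htSNPs in the excerpt)
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Four independent tests at an assumed family-wise level of 0.05.
threshold = bonferroni_threshold(0.05, 4)
print(f"per-test threshold: {threshold:.4f}")
```

The sketch covers only the final division; estimating how many tests count as independent (the role of simpleM in the excerpt) is not reproduced here.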
Article
We investigated the association between polymorphisms and haplotypes of the chymase 1 gene (CMA1) and the left ventricular mass index (LVM/BSA) in a large cohort of patients with aortic stenosis (AS). Additionally, the gender differences in cardiac remodeling and hypertrophy were analyzed. The genetic background may affect the myocardial response to pressure overload. In human cardiac tissue, CMA1 is involved in angiotensin II production and TGF-β activation, which are two major players in the pathogenesis of hypertrophy and fibrosis. Preoperative echocardiographic data from 648 patients with significant symptomatic AS were used. The LVM/BSA was significantly lower (p<0.0001), but relative wall thickness (RWT) was significantly higher (p = 0.0009) in the women compared with the men. The haplotypes were reconstructed using six genotyped polymorphisms: rs5248, rs4519248, rs1956932, rs17184822, rs1956923, and rs1800875. The haplotype h1.ACAGGA was associated with higher LVM/BSA (p = 9.84×10⁻⁵), and the haplotype h2.ATAGAG was associated with lower LVM/BSA (p = 0.0061) in men, and no significant differences were found in women. Two polymorphisms within the promoter region of the CMA1 gene, namely rs1800875 (p = 0.0067) and rs1956923 (p = 0.0015), influenced the value of the LVM/BSA in males. The polymorphisms and haplotypes of the CMA1 locus are associated with cardiac hypertrophy in male patients with symptomatic AS. Appropriate methods for the indexation of heart dimensions revealed substantial sex-related differences in the myocardial response to pressure overload.
... Adiponectin levels were normalised using a natural logarithm transformation prior to analysis. Associations between transformed values and genotypes at each tSNP were examined using a generalised linear model approach implemented in SimHap [25]. Each polymorphism was modelled as a genotypic (codominant) genetic effect, accommodating the effects of age, gender, and BMI as significant covariates. ...
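As an illustration of the modelling set-up described in this excerpt, the sketch below shows how a genotypic (codominant) effect is typically encoded as two indicator variables alongside covariates, with the outcome log-transformed. The function names, the 0/1/2 genotype coding, and the covariate order are assumptions for illustration and are not taken from SimHap.

```python
import math


def codominant_dummies(minor_allele_count):
    """Return (heterozygote, minor_homozygote) indicators for a 0/1/2 genotype.

    The major-allele homozygote (0) is the reference category, so it maps to (0, 0).
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("genotype must be coded as 0, 1 or 2 minor alleles")
    return (int(minor_allele_count == 1), int(minor_allele_count == 2))


def design_row(genotype, age, sex, bmi):
    """Build one row of a design matrix: intercept, genotype indicators, covariates."""
    het, hom = codominant_dummies(genotype)
    return [1.0, het, hom, age, sex, bmi]


def log_transform(value):
    """Natural-log transform of a strictly positive measurement."""
    if value <= 0:
        raise ValueError("value must be positive")
    return math.log(value)


# Hypothetical subject: heterozygous, 54 years old, sex coded 1, BMI 27.3.
row = design_row(1, 54, 1, 27.3)
outcome = log_transform(9.8)  # hypothetical adiponectin measurement
```

Fitting the generalised linear model itself is left to a statistics package; the sketch only shows the coding that distinguishes a codominant model from an additive one, where the genotype would enter as a single 0/1/2 term.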
Article
Background Low levels of serum adiponectin have been linked to central obesity, insulin resistance, metabolic syndrome, and type 2 diabetes. Variants in ADIPOQ, the gene encoding adiponectin, have been shown to influence serum adiponectin concentration, and along with variants in the adiponectin receptors (ADIPOR1 and ADIPOR2) have been implicated in metabolic syndrome and type 2 diabetes. This study aimed to comprehensively investigate the association of common variants in ADIPOQ, ADIPOR1 and ADIPOR2 with serum adiponectin and insulin resistance syndromes in a large cohort of European-Australian individuals. Methods Sixty-four tagging single nucleotide polymorphisms in ADIPOQ, ADIPOR1 and ADIPOR2 were genotyped in two general population cohorts consisting of 2,355 subjects, and one cohort of 967 subjects with type 2 diabetes. The association of tagSNPs with outcomes was evaluated using linear or logistic modelling. Meta-analysis of the three cohorts was performed by random-effects modelling. Results Meta-analysis revealed nine genotyped tagSNPs in ADIPOQ significantly associated with serum adiponectin across all cohorts after adjustment for age, gender and BMI, including rs10937273, rs12637534, rs1648707, rs16861209, rs822395, rs17366568, rs3774261, rs6444175 and rs17373414. The results of haplotype-based analyses were also consistent. Overall, the variants in the ADIPOQ gene explained <5% of the variance in serum adiponectin concentration. None of the ADIPOR1/R2 tagSNPs were associated with serum adiponectin. There was no association between any of the genetic variants and insulin resistance or metabolic syndrome. A multi-SNP genotypic risk score for ADIPOQ alleles revealed an association between 3 independent SNPs (rs12637534, rs16861209, rs17366568) and type 2 diabetes after adjusting for adiponectin levels (OR=0.86, 95% CI=(0.75, 0.99), P=0.0134). Conclusions Genetic variation in ADIPOQ, but not its receptors, was associated with altered serum adiponectin. However, genetic variation in ADIPOQ and its receptors does not appear to contribute to the risk of insulin resistance or metabolic syndrome, although it did contribute to the risk of type 2 diabetes in a European-Australian population.
Article
Purpose: The factors that are responsible for the extent of cardiac hypertrophy in patients with aortic stenosis (AS) are not well defined and a polygenic background is suggested. Angiotensin II (Ang II) and transforming growth factor-β (TGF-β) are considered to be crucial factors in the pathogenesis of fibrotic and hypertrophic remodeling in the pressure-overloaded heart. Chymase is not only the major Ang II-generating enzyme in the human heart but is also able to activate TGF-β in the cardiac tissue. The goal of this study was to examine the association between genetic variation of the CMA1 locus and left ventricular hypertrophy in a large cohort of patients with aortic stenosis. Methods: A total of 648 patients with significant, symptomatic AS were studied. Preoperative echocardiographic data were used to calculate left ventricular mass index (LVMI). A comprehensive tagging SNP and haplotype approach was implemented. Polymorphisms were genotyped using SNPlex or TaqMan methodology. Generalized linear models were applied to analyze data using a crude model or a full model adjusted for sex, age, ejection fraction, weight, and maximal aortic gradient. P-values were designated as significant for p < 0.0125 on the basis of Bonferroni correction for independent tests. Results: Haplotypes were reconstructed using six genotyped polymorphisms: rs5248, rs4519248, rs1956932, rs17184822, rs1956923, rs1800875. The haplotype h1.ACAGGA (51.46%) was associated with higher LVMI (p=0.0006); in contrast, h2.ATAGAG (23.09%) had a protective effect (p=0.0113) in the additive genetic model in all patients. The associations were stronger among male patients: h1.ACAGGA (p=9.84×10⁻⁵ and p=0.0012, crude and full model respectively) and h2.ATAGAG (p=0.0061 and p=0.0161); no significant differences were found in women. Two SNPs located in the promoter region of the CMA1 gene, rs1800875 (p=0.0067) and rs1956923 (p=0.0015), showed an association with the LVMI in the additive genetic model in males. Conclusions: Genetic variants of the CMA1 locus are associated with cardiac hypertrophy in male patients with symptomatic aortic stenosis. This gender effect could be explained by profound differences in TGF-β pathway activation between women and men in response to pressure overload.
Article
The availability of high-throughput technologies, such as next generation sequencing and microarray, and the diffusion of genomics studies to large populations are producing an increasing amount of experimental data. In particular, pharmacogenomics studies the impact of genetic variation on drug response in patients and correlates gene expression or single nucleotide polymorphisms (SNPs) with the toxicity or efficacy of a drug, with the aim of improving drug therapy with respect to the patients' genotype, ensuring maximum efficacy with minimal adverse effects. However, the storage, preprocessing, and analysis of experimental data are becoming a main bottleneck in the pharmacogenomics analysis pipeline, due to the increasing number of genes and patients investigated. This paper presents a new parallel software tool named coreSNP for the parallel preprocessing and statistical analysis of DMET (Drug Metabolism Enzymes and Transporters) SNP microarray data produced by Affymetrix for pharmacogenomics studies. The scalable multi-threaded implementation of coreSNP makes it possible to handle the huge volumes of experimental pharmacogenomics data in a very efficient way, while its easy-to-use graphical user interface and its ability to annotate significant SNPs allow biologists to interpret the results easily. Performance evaluation conducted using real datasets shows good speed-up and scalability and effective response times.
Article
Accumulating evidence suggests that the extent of brain injury and the clinical outcome after Traumatic Brain Injury (TBI) are modulated, to some degree, by genetic variants. Aquaporin-4 (AQP4) is the predominant water channel in the central nervous system and plays a critical role in controlling the water content of brain cells and the development of brain edema after TBI. We sought to investigate the influence of the AQP4 gene region on the patients' outcome after TBI by genotyping tag single nucleotide polymorphisms (SNPs) along the AQP4 gene. A total of 363 TBI patients (19.6% female) were prospectively evaluated. Data including the Glasgow Coma Scale (GCS) scores at admission, the presence of intracranial haemorrhage and the 6-month Glasgow Outcome Scale (GOS) scores were collected. Seven tag SNPs across the AQP4 gene were identified based on the HapMap data. Using logistic regression analyses, SNPs and haplotypes were tested for associations with 6-month GOS after adjusting for age, GCS and gender. Significant associations with TBI outcome were detected for rs3763043 [OR (95% CI): 5.15 (1.60-16.5), p=0.006, for the recessive model], rs3875089 [OR (95% CI): 0.18 (0.07-0.50), p=0.0009, for the allele difference model] and a common haplotype of AQP4 tag SNPs [OR (95% CI): 2.94 (1.34-6.36), p=0.0065]. AQP4 tag SNPs were not found to influence the initial severity of TBI or the presence of intracranial haemorrhages. In conclusion, the present study provides evidence for possible involvement of genetic variations in the AQP4 gene in the functional outcome of TBI patients.
Article
Two pregnancy cohorts were used to investigate the association between single-nucleotide polymorphisms (SNPs) in genes within the insulin-like growth factor (IGF)-axis and antenatal and postnatal growth from birth to adolescence. Longitudinal analyses were conducted in the Raine pregnancy cohort (n = 1162) using repeated measures of fetal head circumference (HC), abdominal circumference (AC) and femur length (FL) from 18 to 38 weeks gestation and eight measures of postnatal height and weight (1–17 years). Replications of significant associations up to birth were undertaken in the Generation R Study (n = 2642). Of the SNPs within the IGF-axis genes, 40% (n = 58) were associated with measures of antenatal growth (P ≤ 0.05). The majority of these SNPs were in receptors; IGF-1R (23%; n = 34) and IGF-2R (13%; n = 9). Fifteen SNPs were associated with antenatal growth (either AC or HC or FL) in Raine (P ≤ 0.005): five of which remained significant after adjusting for multiple testing. Four of these replicated in Generation R. Associations were identified between 38% (n = 55) of the IGF-axis SNPs and postnatal height and weight; 21% in IGF-1R (n = 31) and 9% in IGF-2R (n = 13). Twenty-six SNPs were significantly associated with both antenatal and postnatal growth; 17 with discordant effects and nine with concordant effects. Genetic variants in the IGF-axis appear to play a significant role in antenatal and postnatal growth. Further replication and new analytic methods are required in order to better understand this key metabolic pathway integrating biologic knowledge about the interaction between IGF-axis components.
Article
Chemicals from occupational exposure and components of cigarette smoke can cause DNA damage in bladder urothelium. Failure to repair DNA damage by DNA repair proteins may result in mutations leading to genetic instability and the development of bladder cancer. Immunohistochemistry studies have shown DNA damage signal activation in precancerous bladder lesions which is lost on progression, suggesting that the damage signalling mechanism acts as a brake to further tumorigenesis. Single nucleotide polymorphisms (SNPs) in DSB signalling genes may alter protein function. We hypothesized that SNPs in DSB signalling genes may modulate predisposition to bladder cancer and influence the effects of environmental exposures. We recruited 771 cases and 800 controls (573 hospital-based and 227 population-based from a previous case-control study) and interviewed them regarding their smoking habits and occupational history. DNA was extracted from a peripheral blood sample and genotyping of 24 SNPs in MRE11, NBS1, RAD50, H2AX and ATM was undertaken using an allelic discrimination method (Taqman). Smoking and occupational dye exposure were strongly associated with bladder cancer risk. Using logistic regression adjusting for age, sex, smoking and occupational dye exposure, there was a marginal increase in risk of bladder cancer for an MRE11 3'UTR SNP (rs2155209; adjusted odds ratio 1.54, 95% CI 1.13-2.08, p = 0.01, for individuals homozygous for the rare allele compared to those carrying the common homozygous or heterozygous genotype). However, in the hospital-based controls, the genotype distribution for this SNP deviated from Hardy-Weinberg equilibrium. None of the other SNPs showed an association with bladder cancer and we did not find any significant interaction between any of these polymorphisms and exposure to smoking or dye exposure. 
Apart from a possible effect for one MRE11 3'UTR SNP, our study does not support the hypothesis that SNPs in DSB signaling genes modulate predisposition to bladder cancer.
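The abstract above reports a deviation from Hardy-Weinberg equilibrium without saying how it was tested. One common check is a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts; the sketch below illustrates that test under this assumption, with a hypothetical function name and made-up counts.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the three genotypes at a biallelic SNP.
    The statistic has 1 degree of freedom; values above about 3.84
    correspond to p < 0.05.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip categories with zero expected count (monomorphic SNP).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Made-up counts that match Hardy-Weinberg proportions exactly.
print(hwe_chi_square(25, 50, 25))
```

For small samples or rare alleles an exact test is usually preferred over the chi-square approximation.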
Article
The objective of this study was to determine whether single nucleotide polymorphisms (SNPs) in the Interleukin-1 (IL-1) gene family are associated with central obesity and metabolic syndrome in a coronary heart disease population. The IL-1α C-889T (rs1800587) and IL-1β +3954 (rs1143634) SNPs were studied in a Western Australian coronary heart disease (CHD) population (N = 556). Subjects who were TT homozygous at either SNP had larger waist circumference (IL-1α: 1.8 cm greater, P = 0.04; IL-1β: 4 cm greater, P = 0.0004) compared with major allele homozygotes. Individuals with two copies of the IL-1α:IL-1β T:T haplotype had greater waist circumference (4.7 cm greater, P = 0.0001) compared to other haplotypes. There was a significant interaction between the IL-1β SNP and BMI level on waist circumference (P = 0.01). When the cohort was stratified by median BMI, TT carriers for IL-1β with above median BMI had greater waist circumference (6.1 cm greater, P = 0.007) compared to baseline carriers, whilst no significant association was seen in the below median group. Similarly, when the cohort was stratified by median fibrinogen level (IL-1α interaction P = 0.01; IL-1β interaction P = 0.04), TT carriers for both SNPs in the above median fibrinogen group had greater waist circumference (IL-1α 2.7 cm greater, P = 0.007; IL-1β 3.3 cm greater, P = 0.003) compared with major allele homozygotes. This association was not seen in the below median group. Also, we found a trend of increased metabolic syndrome for IL-1β TT homozygotes (P = 0.07). In conclusion, our findings suggest that in a CHD population IL-1 gene polymorphisms may be involved in increased central obesity, and the genetic influences are more evident among patients who have a higher level of obesity or inflammatory markers.
Article
C-reactive protein (CRP), a marker of systemic inflammation, is associated with risk of coronary events and sub-clinical measures of atherosclerosis. Evidence in support of this link being causal would include an association robust to adjustments for confounders (multivariable standard regression analysis) and the association of CRP gene polymorphisms with atherosclerosis (Mendelian randomization analysis). We genotyped 3 tag single nucleotide polymorphisms (SNPs) [+1444T>C (rs1130864); +2303G>A (rs1205) and +4899T>G (rs3093077)] in the CRP gene and assessed CRP and carotid intima-media thickness (CIMT), a structural marker of atherosclerosis, in 4941 men and women aged 50-74 (mean 61) years (the Whitehall II Study). The 4 major haplotypes from the SNPs were consistently associated with CRP level, but not with other risk factors that might confound the association between CRP and CIMT. CRP, assessed both at mean age 49 and at mean age 61, was associated both with CIMT in age and sex adjusted standard regression analyses and with potential confounding factors. However, the association of CRP with CIMT attenuated to the null with adjustment for confounding factors in both prospective and cross-sectional analyses. When examined using genetic variants as the instrument for serum CRP, there was no inferred association between CRP and CIMT. Both multivariable standard regression analysis and Mendelian randomization analysis suggest that the association of CRP with carotid atheroma indexed by CIMT may not be causal.
Article
Raised C-reactive protein (CRP) is a risk factor for type 2 diabetes. According to the Mendelian randomization method, the association is likely to be causal if genetic variants that affect CRP level are associated with markers of diabetes development and diabetes. Our objective was to examine the nature of the association between CRP phenotype and diabetes development using CRP haplotypes as instrumental variables. We genotyped three tagging SNPs (CRP + 2302G > A; CRP + 1444T > C; CRP + 4899T > G) in the CRP gene and measured serum CRP in 5,274 men and women at mean ages 49 and 61 y (Whitehall II Study). Homeostasis model assessment-insulin resistance (HOMA-IR) and hemoglobin A1c (HbA1c) were measured at age 61 y. Diabetes was ascertained by glucose tolerance test and self-report. Common major haplotypes were strongly associated with serum CRP levels, but unrelated to obesity, blood pressure, and socioeconomic position, which may confound the association between CRP and diabetes risk. Serum CRP was associated with these potential confounding factors. After adjustment for age and sex, baseline serum CRP was associated with incident diabetes (hazard ratio = 1.39 [95% confidence interval 1.29-1.51]), HOMA-IR, and HbA1c, but the associations were considerably attenuated on adjustment for potential confounding factors. In contrast, CRP haplotypes were not associated with HOMA-IR or HbA1c (p = 0.52-0.92). The associations of CRP with HOMA-IR and HbA1c were all null when examined using instrumental variables analysis, with genetic variants as the instrument for serum CRP. Instrumental variables estimates differed from the directly observed associations (p = 0.007-0.11). Pooled analysis of CRP haplotypes and diabetes in Whitehall II and Northwick Park Heart Study II produced null findings (p = 0.25-0.88). 
Analyses based on the Wellcome Trust Case Control Consortium (1,923 diabetes cases, 2,932 controls) using three SNPs in tight linkage disequilibrium with our tagging SNPs also demonstrated null associations. Observed associations between serum CRP and insulin resistance, glycemia, and diabetes are likely to be noncausal. Inflammation may play a causal role via upstream effectors rather than the downstream marker CRP.
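The instrumental variables reasoning in this abstract can be illustrated with the Wald ratio, one of the simplest Mendelian randomization estimators. The abstract does not say which estimator the authors used, so this is an assumed, illustrative choice, and the function name and numbers are hypothetical.

```python
def wald_ratio(beta_gene_outcome, beta_gene_exposure):
    """Wald ratio estimate of the causal effect of an exposure on an outcome.

    beta_gene_outcome  : effect of the genetic instrument on the outcome
    beta_gene_exposure : effect of the same instrument on the exposure
    The instrument must actually shift the exposure, so the denominator
    cannot be zero.
    """
    if beta_gene_exposure == 0:
        raise ValueError("the instrument has no effect on the exposure")
    return beta_gene_outcome / beta_gene_exposure


# Hypothetical values: the haplotype shifts the exposure (e.g. log CRP) by 0.20
# units but has no detectable effect on the outcome (e.g. HOMA-IR).
estimate = wald_ratio(0.0, 0.20)
print(estimate)
```

A genetic effect on the outcome near zero, despite a clear genetic effect on the exposure, gives a causal estimate near zero, which is the pattern the abstract reports as evidence against causality.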
Article
Multiple imputation was designed to handle the problem of missing data in public-use data bases where the data-base constructor and the ultimate user are distinct entities. The objective is valid frequency inference for ultimate users who in general have access only to complete-data software and possess limited knowledge of specific reasons and models for nonresponse. For this situation and objective, I believe that multiple imputation by the data-base constructor is the method of choice. This article first provides a description of the assumed context and objectives, and second, reviews the multiple imputation framework and its standard results. These preliminary discussions are especially important because some recent commentaries on multiple imputation have reflected either misunderstandings of the practical objectives of multiple imputation or misunderstandings of fundamental theoretical results. Then, criticisms of multiple imputation are considered, and, finally, comparisons are made to alternative strategies.
Article
The genetic basis of many human diseases, especially those with substantial genetic determinants, has been identified. Notable amongst others are cystic fibrosis, Huntington's disease and some forms of cancer. However, the detection of genetic factors with more modest effects such as in bipolar disorders and a majority of the cancers, has been more complicated. Standard linkage analysis procedures may not only have little power to detect such genes but they do, at best, only narrow the location of the disease susceptibility gene to a rather large region. Association studies are therefore necessary to further unveil the aetiological relevance of these factors to disease. However, the number of tests required if such procedures were used in extended genome-wide screens, is prohibitive and as such association studies have seen limited application, except in the investigation of candidate genes. In this paper, we discuss a logistic regression approach as a generalization of this procedure so that it can accommodate clusters of linked markers or candidate genes. Furthermore, we introduce an expectation maximization (E–M) algorithm with which to estimate haplotype frequencies for multiple locus systems with incomplete information on phase.
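To make the E-M idea in this abstract concrete, here is a minimal sketch for the simplest case of two biallelic loci, where only double heterozygotes have ambiguous phase. It assumes Hardy-Weinberg equilibrium (random pairing of haplotypes) and genotypes coded as minor-allele counts; the function name and coding are illustrative, and the paper's own multi-locus algorithm is not reproduced here.

```python
from collections import Counter

# Haplotypes as (allele at locus 1, allele at locus 2); 0 = major, 1 = minor.
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Alleles carried on the two chromosomes for a 0/1/2 minor-allele count.
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}


def em_haplotype_frequencies(genotypes, max_iter=500, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2) tuples, each a minor-allele count (0, 1 or 2).
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    counts = Counter(genotypes)
    n_chromosomes = 2 * len(genotypes)
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uniform starting values

    for _ in range(max_iter):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for (g1, g2), n in counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: split double heterozygotes between the two phases
                # in proportion to their probability under current estimates.
                cis = freqs[(0, 0)] * freqs[(1, 1)]
                trans = freqs[(0, 1)] * freqs[(1, 0)]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                expected[(0, 0)] += n * w
                expected[(1, 1)] += n * w
                expected[(0, 1)] += n * (1 - w)
                expected[(1, 0)] += n * (1 - w)
            else:
                # Phase is known when at least one locus is homozygous.
                a1, a2 = ALLELES[g1], ALLELES[g2]
                expected[(a1[0], a2[0])] += n
                expected[(a1[1], a2[1])] += n
        # M-step: new frequencies are expected counts over total chromosomes.
        new_freqs = {h: expected[h] / n_chromosomes for h in HAPLOTYPES}
        converged = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


# Toy data: mostly unambiguous genotypes plus two double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_haplotype_frequencies(sample))
```

With more loci, the number of phase-ambiguous genotypes and candidate haplotype pairs grows quickly, so practical implementations enumerate compatible haplotype pairs per individual rather than hard-coding the two-locus case.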
Article
In this article we discuss our experience designing and implementing a statistical computing language. In developing this new language, we sought to combine what we felt were useful features from two existing computer languages. We feel that the new language provides advantages in the areas of portability, computational efficiency, memory management, and scoping.
Article
Increased matrix metalloproteinase (MMP) 9 activity has been implicated in the formation of abdominal aortic aneurysm (AAA). The aim was to explore the association between potentially functional variants of the MMP-9 gene and AAA. The -1562C > T and -1811A > T variants of the MMP-9 gene were genotyped in 678 men with an AAA (at least 30 mm in diameter) and 659 control subjects (aortic diameter 19-22 mm) recruited from a population-based trial of screening for AAA. Levels of MMP-9 were measured in a random subset of 300 cases and 84 controls. The association between genetic variants (including haplotypes) and AAA was assessed by multivariable logistic regression. There was no association between the MMP-9-1562C > T (odds ratio (OR) 0.70 (95 per cent confidence interval (c.i.) 0.27 to 1.82)) or -1811A > T (OR 0.71 (95 per cent c.i. 0.28 to 1.85)) genotypes, or the most common haplotype (OR 0.81 (95 per cent c.i. 0.62 to 1.05)) and AAA. The serum MMP-9 concentration was higher in cases than controls, and in minor allele carriers in cases and controls, although the differences were not statistically significant. In this study, the genetic tendency to higher levels of circulating MMP-9 was not associated with AAA.
Article
PARL (presenilin-associated rhomboid-like) is a mitochondrial protein involved in mitochondrial membrane remodelling, and maps to a quantitative trait locus (3q27) associated with metabolic traits. Recently the rs3732581 (Leu262Val) variant was found to be associated with increased levels of plasma insulin, a finding not replicated in a larger cohort. The aim of the current study was to investigate the associations between rs3732581 and levels of plasma insulin, metabolic syndrome (MetS) and its components, and cardiovascular disease. The CUPID population consisted of 556 subjects with angiographically proven CAD and the CUDAS cohort consisted of 1,109 randomly selected individuals from Perth, Western Australia. Samples were genotyped using mutation-specific PCR. No significant associations were observed between rs3732581 and levels of plasma insulin, glucose, BMI or MetS in either population. However, carriers of the minor allele had significantly lower mean intima-media thickness (IMT) [0.69 mm, 95% CI (0.69, 0.70 mm); P = 0.004], compared with major allele homozygotes [mean IMT = 0.71 mm, 95% CI (0.70, 0.72 mm)] in the CUDAS population. Further analysis using a recessive model showed homozygous carriers of the minor allele were predisposed to CAD [OR 1.55, 95% CI (1.11, 2.16); P = 0.01]. Despite the functional evidence for a role of PARL in regulating insulin levels, no association with rs3732581 was found in the current study. Additionally, there were no associations with glucose levels, BMI or MetS. There were significant effects of the variant on mean IMT and risk of CAD. A role for PARL in metabolic conditions cannot be excluded and more comprehensive genetic studies are warranted.