To read the full-text of this research, you can request a copy directly from the authors.
Freckles or ephelides are hyperpigmented spots observed on the skin surface mainly in European and Asian populations. Easy recognition and external visibility make the prediction of ephelides a potentially useful target in the field of forensic DNA phenotyping. Prediction of freckles would be a step forward in sketching the physical appearance of unknown perpetrators or decomposed cadavers for forensic DNA intelligence purposes. Freckles are especially common in people with pale skin and red hair and therefore it is expected that predisposition to freckles may partially share the genetic background with other pigmentation traits. The first proposed freckle prediction model was developed based on an investigation that involved variation of MC1R and 8 SNPs from 7 genes in a Spanish cohort. In this study we examined 113 DNA variants from 46 genes previously associated with human pigmentation traits and assessed their impact on the presence of freckles in a group of 960 individuals from Poland. Nineteen DNA variants revealed associations with the freckle phenotype and the study also revealed that females have ∼1.8 times higher odds of having freckles than males (p-value = 9.5 × 10⁻⁵). Two alternative prediction models were developed using regression methods. A simplified binomial 12-variable model predicts the presence of ephelides with cross-validated AUC = 0.752. A multinomial 14-variable model predicts one of three categories: non-freckled, medium freckled and heavily freckled. The two extreme categories, non-freckled and heavily freckled, were predicted with moderately high accuracy of cross-validated AUC = 0.754 and 0.792, respectively. Prediction accuracy of the intermediate category was lower, AUC = 0.657. The study presents novel DNA models for prediction of freckles that can be used in forensic investigations and emphasizes the significance of pigmentation genes and sex in predictive DNA analysis of freckles.
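The per-category AUC values quoted in the abstract can be read as one-vs-rest measures: for each freckle category, how often a randomly chosen individual from that category receives a higher predicted probability than a randomly chosen individual from the other categories. The following is a minimal Python sketch of that calculation, not the authors' implementation. The function names, the category labels and the toy probabilities are assumptions made for illustration, and the cross-validation step is not shown.

```python
from typing import Dict, List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative; ties between a positive and a negative count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(prob_rows: List[Dict[str, float]],
                    true_classes: List[str],
                    target: str) -> float:
    """AUC for one category of a multinomial model: the target category
    is treated as positive, all other categories as negative."""
    scores = [row[target] for row in prob_rows]
    labels = [1 if c == target else 0 for c in true_classes]
    return auc(scores, labels)


# Hypothetical predicted probabilities for four individuals (toy data only).
toy_probs = [
    {"non": 0.7, "medium": 0.2, "heavy": 0.1},
    {"non": 0.3, "medium": 0.4, "heavy": 0.3},
    {"non": 0.5, "medium": 0.3, "heavy": 0.2},
    {"non": 0.1, "medium": 0.3, "heavy": 0.6},
]
toy_truth = ["non", "non", "medium", "heavy"]

for category in ("non", "medium", "heavy"):
    print(category, one_vs_rest_auc(toy_probs, toy_truth, category))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is why the intermediate category (AUC = 0.657) is described as less accurately predicted than the two extremes.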
... Thus far, for eye, hair and skin color, various underlying genes and predictive DNA markers have been identified, DNA tests suitable for analyzing such genetic markers in forensic DNA samples and statistical prediction models have been developed, and some of these DNA test systems have been forensically validated [9,11,12]. For traits such as freckles and hair structure, some associated genetic markers and the first predictive models have already been published, respectively; however, no forensically validated tool has been established so far. Prediction models for some other EVCs are currently under investigation. ...
... Here, we assess the impact of incorporating prior knowledge on EVC trait prevalence in a Bayesian setting on improving the accuracy of DNA-based EVC prediction, but also potential pitfalls caused by misspecification of such prior probabilities. To this end, we consider EVCs such as eye, hair and skin color for which prior-free genetic prediction models have previously been established [9,11,12], but also traits such as hair structure and freckles for which the first prediction models were recently proposed without considering priors [13,15,16]. Given the sparsity or even lack of spatial or population-specific prevalence information available for each of these EVCs, we investigated the impact of prevalence-informed priors across a grid in the complete space of all possible values for each trait category, thereby emulating the (mis-) specification of the informative prior values. ...
... In particular, for eye color prediction, we used the 6 SNPs from the previous IrisPlex eye color model; for hair color prediction we used the 22 hair color informative SNPs from the previous HIrisPlex hair color model; for skin color prediction, we used the 36 skin color informative SNPs from the previous HIrisPlex-S skin color model; for hair shape prediction, we used the 38 SNPs from the previous EUROFORGEN study on hair shape prediction; and for freckles prediction, we used 13 of the 22 SNPs recently proposed for this purpose by Kukla-Bartoszek. Not using the remaining 9 previously proposed freckles DNA predictors is explained by data availability and quality control issues (see below). ... [Table 1: EVC-specific data sets used for prediction model training and testing with and without the use of prevalence-informed priors.]
The prediction of appearance traits by use of solely genetic information has become an established approach and a number of statistical prediction models have already been developed for this purpose. However, given limited knowledge on appearance genetics, currently available models are incomplete and do not include all causal genetic variants as predictors. Therefore such prediction models may benefit from the inclusion of additional information that acts as a proxy for this unknown genetic background. Use of priors, possibly informed by trait category prevalence values in biogeographic ancestry groups, in a Bayesian framework may thus improve the prediction accuracy of previously predicted externally visible characteristics, but has not been investigated as of yet. In this study, we assessed the impact of using trait prevalence-informed priors on the prediction performance in Bayesian models for eye, hair and skin color as well as hair structure and freckles in comparison to the respective prior-free models. Those prior-free models were either defined in the same way as the already established ones or were very close to them, using a reduced predictive marker set. However, these differences in the number of predictive markers should not significantly affect our main outcomes. We observed that such priors often had a strong effect on the prediction performance, but to varying degrees between different traits and also different trait categories, with some categories barely showing an effect. While we found potential for improving the prediction accuracy of many of the appearance trait categories tested by using priors, our analyses also showed that misspecification of those prior values often severely diminished the accuracy compared to the respective prior-free approach. This emphasizes the importance of accurate specification of prevalence-informed priors in Bayesian prediction modeling of appearance traits.
However, the existing literature knowledge on spatial prevalence is sparse for most appearance traits, including those investigated here. Due to the limitations in appearance trait prevalence knowledge, our results render the use of trait prevalence-informed priors in DNA-based appearance trait prediction currently infeasible.
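To make the effect of a prevalence-informed prior concrete, the sketch below reweights the category probabilities of a prior-free model by the ratio of an assumed population prevalence to the training-set prevalence and then renormalizes. This is one common way of applying a prior to classifier output and is offered only as an illustration under stated assumptions; the abstract does not specify the exact Bayesian formulation used in the study. The function name `apply_prior`, the category labels and all numbers are hypothetical.

```python
from typing import Dict


def apply_prior(model_probs: Dict[str, float],
                train_prevalence: Dict[str, float],
                prior: Dict[str, float]) -> Dict[str, float]:
    """Reweight prior-free category probabilities by prior/training
    prevalence and renormalize so the result sums to one."""
    if not (model_probs.keys() == train_prevalence.keys() == prior.keys()):
        raise ValueError("all inputs must use the same trait categories")
    weights = {}
    for category, p in model_probs.items():
        if train_prevalence[category] <= 0:
            raise ValueError("training prevalence must be positive")
        weights[category] = p * prior[category] / train_prevalence[category]
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("prior leaves no category with positive weight")
    return {category: w / total for category, w in weights.items()}


# Hypothetical prior-free output for one individual (toy numbers).
model_probs = {"freckled": 0.6, "non-freckled": 0.4}
train_prevalence = {"freckled": 0.5, "non-freckled": 0.5}

# A prior equal to the training prevalence leaves the prediction unchanged.
unchanged = apply_prior(model_probs, train_prevalence, train_prevalence)

# A prior stating that freckles are rare flips the most likely category.
rare_prior = {"freckled": 0.2, "non-freckled": 0.8}
shifted = apply_prior(model_probs, train_prevalence, rare_prior)

print(unchanged)
print(shifted)
```

In the toy case the call changes from "freckled" to "non-freckled" purely because of the prior, which illustrates why a misspecified prevalence value can degrade accuracy relative to the prior-free model.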
... In recent years, intensive research has been carried out on the prediction of various human appearance characteristics. The most significant progress was made in the prediction of pigmentation characteristics, and eye colour in particular. ...
... Because of technical problems, four SNPs were replaced by SNPs in LD (rs2004775 -> rs60247077 in RBFOX1, rs7762830 -> rs743589 in MYB, rs224223 -> rs224219 in MEFV and rs12052928 -> rs9636495 in ANKRD36). DNA libraries were prepared manually and sequenced as described previously. Missing SNP data were at the level of 0.2% and were imputed using the 'missForest' method in R v3.5.2 (with a total number of trees equal to 500). ...
Increasing understanding of human genome variability allows for better use of the predictive potential of DNA. An obvious direct application is the prediction of the physical phenotypes. Significant success has been achieved, especially in predicting pigmentation characteristics, but the inference of some phenotypes is still challenging. In search of further improvements in predicting human eye colour, we conducted whole-exome (enriched in regulome) sequencing of 150 Polish samples to discover new markers. For this, we adopted quantitative characterization of eye colour phenotypes using high-resolution photographic images of the iris in combination with DIAT software analysis. An independent set of 849 samples was used for subsequent predictive modelling. Newly identified candidates and 114 additional literature-based selected SNPs, previously associated with pigmentation, and advanced machine learning algorithms were used. Whole-exome sequencing analysis found 27 previously unreported candidate SNP markers for eye colour. The highest overall prediction accuracies were achieved with LASSO-regularized and BIC-based selected regression models. A new candidate variant, rs2253104, located in the ARFIP2 gene and identified with the HyperLasso method, revealed predictive potential and was included in the best-performing regression models. Advanced machine learning approaches showed a significant increase in sensitivity of intermediate eye colour prediction (up to 39%) compared to 0% obtained for the original IrisPlex model. We identified a new potential predictor of eye colour and evaluated several widely used advanced machine learning algorithms in predictive analysis of this trait. Our results provide useful hints for developing future predictive models for eye colour in forensic and anthropological studies.
... The tool was originally developed for the statistical interpretation of data in ancestry inference studies, but a number of new functionalities have subsequently been added to enable the prediction of pigmentation and even age. A more complete prediction of pigmentation will be provided by the developed algorithms for freckle prediction [111,112]. It is worth noting that the use of extended DNA variant sets for prediction has begun to be explored, which may lead to the development of next-generation prediction tools. ...
The idea of forensic DNA intelligence is to extract from genomic data any information that can help guide the investigation. The clues to the externally visible phenotype are of particular practical importance. The high heritability of the physical phenotype suggests that it should be readily predictable from genetic data, but this has so far been possible only for less polygenic traits. The forensic community has developed DNA-based predictive tools by employing a limited number of the most important markers analysed with targeted massively parallel sequencing. The complexity of the genetics of many other appearance phenotypes requires big data coupled with sophisticated machine learning methods to develop accurate genomic predictors. A significant challenge in developing universal genomic predictive methods will be the collection of sufficiently large data sets. These should be created using whole-genome sequencing technology to enable the identification of rare DNA variants implicated in phenotype determination. It is worth noting that the correctness of the forensic sketch generated from the DNA data depends on the inclusion of an age factor. This, however, can be predicted by analysing epigenetic data. An important limitation preventing whole-genome approaches from being commonly used in forensics is the slow progress in the development and implementation of high-throughput, low DNA input sequencing technologies. The example of palaeoanthropology suggests that such methods may possibly be developed in forensics.
... Based on genetic predictors previously correlated with human pigmentation, Kukla-Bartoszek et al. developed a predictive model for freckle presence, divided into three categories, non-, medium-, and heavily-freckled, and obtained a moderate accuracy (respectively, AUC = 0.75, 0.66 and 0.79). ...
Numerous major advances have been made in forensic genetics over the past decade. One recent field of research has been focused on the analysis of External Visible Characteristics (EVC) such as eye colour, hair colour (including hair greying), hair morphology, skin colour, freckles, facial morphology, high myopia, obesity, and adult height, with important repercussions in the forensic field. Its use could be especially useful in investigative cases where there are no potential suspects and no match between the evidence DNA sample under investigation and any genetic profiles entered into criminal databases. The present review represents the current state of knowledge of SNPs (Single Nucleotide Polymorphisms) regarding visible characteristics, including the latest research progress in identifying new genetic markers, their most promising applications in the forensic field and the implications for police investigations. The applicability of these techniques to concrete cases has stoked a heated debate in the literature on the ethical implications of using these predictive tools for visible traits.
... Aged bones are challenging forensic samples because a limited yield of DNA is usually recovered from them, their level of degradation is high, and DNA typing success is also constrained by the inhibitors that are present and possible contamination with contemporary DNA [1,18,52]. Improvements in DNA analysis using next-generation sequencing (NGS) technology paired with a multidisciplinary approach have offered a range of effective methods for disaster victim identification, or DVI [32,50], and have introduced robust means to determine certain characteristics of the deceased, such as pigmentation traits, including eye, skin, and hair color [8,23,49], freckles, baldness [16,25], height, and biogeographical ancestry. Questions regarding the victims of wars and various disasters, whether natural or crime-related, that seemed virtually impossible to answer in the past are therefore gaining new prospects for resolution. ...
Genetic identification of a Slovenian prewar elite couple killed in 1944 was performed by typing autosomal and Y-chromosomal STRs, and phenotypic HIrisPlex SNPs for hair and eye color prediction were analyzed for the female skeleton using next-generation sequencing (NGS) technology. The clandestine grave containing the couple’s skeletal remains was found in 2015 and only the partial remains were found. Living distant relatives could be found only for the male victim. Because of a lack of comparative reference samples, it was not possible to identify the female victim through autosomal and mitochondrial DNA typing. However, the possibility of comparison of eye and hair color with a painting exhibited in the City Museum of Ljubljana by the prominent Slovenian painter Ivana Kobilca existed. Nuclear DNA obtained from the samples was quantified using the PowerQuant System, and then STR typing was carried out with different autosomal and Y-STR kits. From 0.09 to 9.36 ng DNA/g of powder was obtained from teeth and bones analyzed. Complete autosomal and Y-STR profiles made it possible to identify the male skeleton via comparison with two nephews. For the female victim, predicted eye and hair color was compared to colors on the painting. Kobilca’s painting confirms the genetically predicted eye and hair color. After more than seventy years, the skeletal remains of the couple were handed over to their relatives, who buried the victims with dignity in a family grave.
... Panels containing SNPs associated with phenotype [22, and ancestry [21, have been introduced, and recently assays combining those have been published [23,24,52]. Forensic DNA phenotyping is now legislated in different European countries, and it is expected that police investigators will seek expert opinions on samples of interest. ...
Single-cell sequencing is a fast developing and very promising field; however, it is not commonly used in forensics. The main motivation behind introducing this technology into forensics is to improve mixture deconvolution, especially when a trace consists of the same cell type. Successful studies demonstrate the ability to analyze a mixture by separating single cells and obtaining CE-based STR profiles. This indicates a potential use of the method in other forensic investigations, like forensic DNA phenotyping, in which using mixed traces is not fully recommended. For this study, we collected single-source autopsy blood from which the white cells were first stained and later separated with the DEPArray™ N×T System. Groups of 20, 10, and 5 cells, as well as 20 single cells, were collected and submitted for DNA extraction. Libraries were prepared using the Ion AmpliSeq™ PhenoTrivium Panel, which includes both phenotype (HIrisPlex-S: eye, hair, and skin color) and ancestry-associated SNP-markers. Prior to sequencing, half of the single-cell-based libraries were additionally amplified and purified in order to improve the library concentrations. Ancestry and phenotype analysis resulted in nearly full consensus profiles resulting in correct predictions not only for the cell groups but also for the ten re-amplified single-cell libraries. Our results suggest that sequencing of single cells can be a promising tool used to deconvolute mixed traces submitted for forensic DNA phenotyping.
... Forensic genetics currently stands in front of a new era of DNA analysis as Massively Parallel Sequencing (MPS) is becoming a more commonly used tool for DNA analysis. The enhanced multiplexing capabilities of MPS technology coupled with the ability to analyze a variety of marker types has led to increased research and use of single nucleotide polymorphisms (SNPs) to predict externally visible characteristics (EVCs) and biogeographical ancestry (BGA) from a DNA sample. To implement the new capabilities in DNA testing, legal changes are obligatory for SNP analysis by MPS to be applied in new cases. ...
As the field of forensic DNA analysis has started to transition from genetics to genomics, new methods to aid in crime scene investigations have arisen. The development of informative single nucleotide polymorphism (SNP) markers has led the forensic community to question if DNA can be a reliable "eye-witness" and whether the data it provides can shed light on unknown perpetrators. We have developed an assay called the Ion AmpliSeq™ PhenoTrivium Panel, which combines three groups of markers: 41 phenotype- and 163 ancestry-informative autosomal SNPs together with 120 lineage-specific Y-SNPs. Here, we report the results of testing the assay's sensitivity and the predictions obtained for known reference samples. Moreover, we present the outcome of a blind study performed on real casework samples in order to understand the value and reliability of the information that would be provided to police investigators. Furthermore, we evaluated the accuracy of admixture prediction in Converge™ Software. The results show the panel to be a robust and sensitive assay which can be used to analyze casework samples. We conclude that the combination of the obtained predictions of phenotype, biogeographical ancestry, and male lineage can serve as a potential lead in challenging police investigations such as cold cases or cases with no suspect.
... Previously, predictive DNA markers were identified and statistical prediction models were developed for all pigmentation-related traits, namely eye colour, head hair colour, skin colour, eyebrow colour, and freckles [27,28]. The recently established HIrisPlex-S system currently represents the most complete DNA-based pigmentation prediction tool, allowing simultaneous prediction of eye, head hair, and skin colour from DNA, including low quality and low quantity forensic DNA, based on 41 carefully selected DNA markers and three separate prediction models. ...
Predicting appearance phenotypes from genotypes is relevant for various areas of human genetic research and applications such as genetic epidemiology, human history, anthropology, and particularly in forensics. Many appearance phenotypes, and thus their underlying genotypes, are highly correlated, with pigmentation traits serving as primary examples. However, all available genetic prediction models, including those for pigmentation traits currently used in forensic DNA phenotyping, ignore phenotype correlations. Here, we investigated the impact of appearance phenotype correlations on genetic appearance prediction in the exemplary case of three pigmentation traits. We used data for categorical eye, hair and skin colour as well as 41 DNA markers utilized in the recently established HIrisPlex-S system from 762 individuals with complete phenotype and genotype information. Based on these data, we performed genetic prediction modelling of eye, hair and skin colour via three different strategies, namely the established approach of predicting phenotypes solely based on genotypes while not considering phenotype correlations, and two novel approaches that considered phenotype correlations, either incorporating truly observed correlated phenotypes or DNA-predicted correlated phenotypes in addition to the DNA predictors. We found that using truly observed correlated pigmentation phenotypes as additional predictors increased the DNA-based prediction accuracies for almost all eye, hair and skin colour categories, with the largest increase for intermediate eye colour, brown hair colour, dark to black skin colour, and particularly for dark skin colour. Outcomes of dedicated computer simulations suggest that this prediction accuracy increase is due to the additional genetic information that is implicitly provided by the truly observed correlated pigmentation phenotypes used, yet not covered by the DNA predictors applied. 
In contrast, considering DNA-predicted correlated pigmentation phenotypes as additional predictors did not improve the performance of the genetic prediction of eye, hair and skin colour, which was in line with the results from our computer simulations. Hence, in practical applications of DNA-based appearance prediction where no phenotype knowledge is available, such as in forensic DNA phenotyping, it is not advised to use DNA-predicted correlated phenotypes as predictors in addition to the DNA predictors. In the very least, this is not recommended for the pigmentation traits and the established pigmentation DNA predictors tested here.
... IRF4 has been previously associated with various appearance traits including hair colour, freckles and hair loss. IRF4 encodes interferon regulatory factor that interacts with the MITF transcription factor. ...
Greying of the hair is an obvious sign of human aging. In addition to age, sex- and ancestry-specific patterns of hair greying are also observed and the progression of greying may be affected by environmental factors. However, little is known about the genetic control of this process. This study aimed to assess the potential of genetic data to predict hair greying in a population of nearly 1000 individuals from Poland.
The study involved whole-exome sequencing followed by targeted analysis of 378 exome-wide and literature-based selected SNPs. For the selection of predictors, the minimum redundancy maximum relevance (mRMRe) method was used, and then two prediction models were developed. The models included age, sex and 13 unique SNPs. Two SNPs of the highest mRMRe score included whole-exome identified KIF1A rs59733750 and previously linked with hair loss FGF5 rs7680591. The model for greying vs. no greying prediction achieved accuracy of cross-validated AUC = 0.873. In the 3-grade classification cross-validated AUC equalled 0.864 for no greying, 0.791 for mild greying and 0.875 for severe greying. Although these values present fairly accurate prediction, most of the prediction information was brought by age alone. Genetic variants explained < 10% of hair greying variation and the impact of particular SNPs on prediction accuracy was found to be small.
The rate of change in human progressive traits shows inter-individual variation, and such traits are therefore perceived as biomarkers of the biological age of the organism. Knowledge of the mechanisms underlying phenotypic aging can be of special interest to medicine, the cosmetics industry and forensics. Our study improves the knowledge on the genetics underlying hair greying processes, presents prototype models for prediction and shows that hair greying is genetically a very complex trait. Finally, we propose a four-step approach based on genetic and epigenetic data analysis allowing for i) sex determination; ii) genetic ancestry inference; iii) greying-associated SNPs assignment and iv) epigenetic age estimation, all needed for a final prediction of greying.
... FDP can provide investigative leads, when standard DNA identification is not possible, for example, due to the lack of a suspect's reference sample or a lack of matches in National DNA Databases [3,4]. Recent studies focused on fine-tuning the selection of DNA markers to predict physical appearance. The most successful studies developed models for eye, hair and skin color with promising prediction performance analyzing only a few dozen single nucleotide polymorphisms (SNPs). ...
The study of DNA to predict externally visible characteristics (EVCs) and the biogeographical ancestry (BGA) from unknown samples is gaining relevance in forensic genetics. Technical developments in Massively Parallel Sequencing (MPS) enable the simultaneous analysis of hundreds of DNA markers, which improves successful Forensic DNA Phenotyping (FDP). The EU-funded VISAGE (VISible Attributes through GEnomics) Consortium has developed various targeted MPS-based lab tools to apply FDP in routine forensic analyses. Here, we present an evaluation of the VISAGE Basic tool for appearance and ancestry prediction based on PowerSeq chemistry (Promega) on a MiSeq FGx System (Illumina). The panel consists of 153 single nucleotide polymorphisms (SNPs) that provide information about EVCs (41 SNPs for eye, hair and skin color from HIrisPlex-S) and continental BGA (115 SNPs; three overlap with the EVCs SNP set). The assay was evaluated for sensitivity, repeatability and genotyping concordance, as well as its performance with casework-type samples. This targeted MPS assay provided complete genotypes at all 153 SNPs down to 125 pg of input DNA and 99.67% correct genotypes at 50 pg. It was robust in terms of repeatability and concordance and provided useful results with casework-type samples. The results suggest that this MPS assay is a useful tool for basic appearance and ancestry prediction in forensic genetics for users interested in applying PowerSeq chemistry and MiSeq for this purpose.
To date, there has been little study of comparison between picosecond 532 nm laser and 755 nm Q-switched Alexandrite lasers in the treatment of freckles. To evaluate the efficacy and safety of picosecond 532 nm laser (PS 532) and 755 nm Q-switched Alexandrite laser (QSAL) for treatment of freckles in a split-face manner. Eighteen patients with freckles were enrolled in the study. The right and left sides of their faces were randomly assigned to either a QSAL-treated group or PS 532-treated group. The degree of pain, satisfaction with the results, and adverse events associated with the laser treatment were evaluated using a questionnaire. All of the patients were followed up at 4 and 12 weeks after one treatment session. Among the 18 patients, PS 532 was found to be associated with less pain (3.56 ± 2.431) than QSAL (3.94 ± 1.893), but the difference was not statistically significant. The curative effect and satisfaction associated with 755 nm Q-switched Alexandrite laser was greater than that of picosecond 532 nm laser (P < .001). Both picosecond 532 nm laser and QSAL are effective in the treatment of freckles, and QSAL has a greater rate of satisfaction and curative effect.
Genetic prediction of different hair phenotypes can help reconstruct the physical appearance of an individual whose biological sample is analyzed in criminal and identification cases. To date, forensic prediction models for hair colour, hair shape, hair loss and hair greying have been developed, but studies investigating predictability of hair thickness and density traits are missing. First data suggesting overlapping associations in various hair features have emerged in recent years, suggesting partially common genetic basis and molecular mechanisms, and this knowledge can be used for predictive purposes. Here we aim to broaden our understanding of the genetics underlying head, facial and body hair thickness and density traits and examine the association for a set of literature SNPs. We characterize the overlap in SNP association for various hair phenotypes, the extent of genetic interactions and the potential for genetic prediction. The study involved 999 samples from Poland, genotyped for 240 SNPs with targeted next-generation sequencing. Logistic regression methods were applied for association and prediction analyses while an entropy-based approach was used for interaction testing. As a result, we refined known associations for monobrow and hairiness (PAX3, 5q13.2, TBX) and identified two novel association signals in IGFBP5 and VDR. Both genes were among top significant loci, showed broad association with different hair-related traits and were implicated in multiple interaction effects. Overall, for 14.7% of SNPs previously associated with head hair loss and/or hair shape, a positive signal of association was revealed with at least one hair feature studied in the current research. Overlap in association with at least two hair-related traits was demonstrated for 24 distinct loci. We showed that the associated SNPs explain ∼5-30% of the variation observed in particular hair traits and allow moderate accuracy of prediction.
The highest accuracy was achieved for hairiness level prediction in females (AUC=0.69 for the “none”, 0.69 for the “low” and 0.76 for the “excessive” hairiness category) and monobrow (AUC=0.69 for the “none”, 0.62 for the “slight” and 0.70 for the “significant” monobrow category) with 33% of the variation in hairiness level in females explained by 7 SNPs and age, and 20% of the variation in monobrow captured by 7 SNPs and sex. Our study presents clear evidence of pleiotropy and epistasis in the genetics of hair traits. The acquired knowledge may have practical application in forensics, as well as in the cosmetic industry and anthropological research.
“Omic” technologies have opened a new revolution in law enforcement and are now solving decades‐old cold cases as well as current investigations. The Golden State Killer case is probably the most notable of the cold case investigations that has been solved by one of the new “omic” technologies, namely forensic genetic genealogy. The resolution of this case epitomizes the power of the “omic” technologies to solve crime while simultaneously unearthing serious legal and ethical concerns around the individuals' privacy in the use of their genetic information. The legislation that is currently used by the state, territory, and Commonwealth jurisdictions in Australia to regulate the use of DNA for criminal investigation is now two decades old and does not address the application of “omic” technologies to criminal investigation. The Australian government is reviewing current privacy laws, and this review could include the use of “omic” technologies. In the absence of specific legislation, law enforcement must continue to develop processes to consider privacy, law, and ethics around the use of “omic” technologies for criminal investigation and the identification of human remains.
Forensic genetics developed from protein-based techniques a quarter of a century ago and became famous as "DNA fingerprinting," this being based on restriction fragment length polymorphisms (RFLPs) of high-molecular-weight DNA. The amplification of much smaller short tandem repeat (STR) sequences using the polymerase chain reaction soon replaced RFLP analysis and advanced to become the gold standard in genetic identification. Meanwhile, STR multiplexes have been developed and made commercially available which simultaneously amplify up to 30 STR loci from as little as 15 cells or fewer. The enormous information content that comes with the large variety of observed STR genotypes allows for genetic individualisation (with the exception of identical twins). Carefully selected core STR loci form the basis of intelligence-led DNA databases that provide investigative leads by linking unsolved crime scenes and criminals through their matched STR profiles. Nevertheless, the success of modern DNA fingerprinting depends on the availability of reference material from suspects. In order to provide new investigative leads in cases where such reference samples are absent, forensic scientists started to explore the prediction of phenotypic traits from the DNA of the evidentiary sample. This paradigm change now uses DNA and epigenetic markers to forecast characteristics that are useful to triage further investigative work. So far, the best investigated externally visible characteristics are eye, hair and skin colour, as well as geographic ancestry and age. Information on the chronological age of a stain donor (or any sample donor) is elemental for forensic investigations in a number of aspects and has, therefore, been explored by researchers in some detail. 
Among different methodological approaches tested to date, the methylation-sensitive analysis of carefully selected DNA markers (CpG sites) has brought the most promising results by providing prediction accuracies of ±3-4 years, which can be comparable to, or even surpass those from, eyewitness reports. This mini-review puts recent developments in age estimation via (epi)genetic methods in the context of the requirements and goals of forensic genetics and highlights paths to follow in the future of forensic genomics.
Shape variation of human head hair shows striking variation within and between human populations, while its genetic basis is far from being understood. We performed a series of genome-wide association studies (GWASs) and replication studies in a total of 28,964 subjects from 9 cohorts from multiple geographic origins. A meta-analysis of three European GWASs identified 8 novel loci (1p36.23 ERRFI1/SLC45A1, 1p36.22 PEX14, 1p36.13 PADI3, 2p13.3 TGFA, 11p14.1 LGR4, 12q13.13 HOXC13, 17q21.2 KRTAP, and 20q13.33 PTK6), and confirmed 4 previously known ones (1q21.3 TCHH/TCHHL1/LCE3E, 2q35 WNT10A, 4q21.21 FRAS1, and 10p14 LINC00708/GATA3), all showing genome-wide significant association with hair shape (P < 5e-8). All except one (1p36.22 PEX14) were replicated with nominal significance in at least one of the 6 additional cohorts of European, Native American and East Asian origins. Three additional previously known genes (EDAR, OFCC1, and PRSS53) were confirmed at nominal significance level. A multivariable regression model revealed that 14 SNPs from different genes significantly and independently contribute to hair shape variation, reaching a cross-validated AUC value of 0.66 (95% CI: 0.62-0.70) and an AUC value of 0.64 in an independent validation cohort, providing an improved accuracy compared to a previous model. Prediction outcomes of 2,504 individuals from a multiethnic sample were largely consistent with general knowledge on the global distribution of hair shape variation. Our study thus delivers target genes and DNA variants for future functional studies to further evaluate the molecular basis of hair shape in humans.
Human skin colour is highly heritable and externally visible with relevance in medical, forensic, and anthropological genetics. Although eye and hair colour can already be predicted with high accuracies from small sets of carefully selected DNA markers, knowledge about the genetic predictability of skin colour is limited. Here, we investigate the skin colour predictive value of 77 single-nucleotide polymorphisms (SNPs) from 37 genetic loci previously associated with human pigmentation using 2025 individuals from 31 global populations. We identified a minimal set of 36 highly informative skin colour predictive SNPs and developed a statistical prediction model capable of skin colour prediction on a global scale. Average cross-validated prediction accuracies expressed as area under the receiver-operating characteristic curve (AUC) ± standard deviation were 0.97 ± 0.02 for Light, 0.83 ± 0.11 for Dark, and 0.96 ± 0.03 for Dark-Black. With a 5-category scale, the accuracies were 0.74 ± 0.05 for Very Pale, 0.72 ± 0.03 for Pale, 0.73 ± 0.03 for Intermediate, 0.87 ± 0.1 for Dark, and 0.97 ± 0.03 for Dark-Black. A comparative analysis in 194 independent samples from 17 populations demonstrated that our model outperformed a previously proposed 10-SNP-classifier approach with AUCs rising from 0.79 to 0.82 for White, comparable at the intermediate level of 0.63 and 0.62, respectively, and a large increase from 0.64 to 0.92 for Black. Overall, this study demonstrates that the chosen DNA markers and prediction model, particularly at the 5-category level, allow skin colour predictions within and between continental regions for the first time, which will serve as a valuable resource for future applications in forensic and anthropologic genetics.
The genetics of eye colour has been extensively studied over the past few years, and the identified polymorphisms have been applied with marked success in the field of Forensic DNA Phenotyping. A picture that arises from evaluation of the currently available eye colour prediction markers shows that only the analysis of the HERC2-OCA2 complex has similar effectiveness in different populations, while the predictive potential of other loci may vary significantly. Moreover, the role of gender in the explanation of human eye colour variation should not be neglected in some populations. In the present study, we re-investigated the data for 1020 Polish individuals and, using neural networks and logistic regression methods, explored the predictive capacity of IrisPlex SNPs and gender in this population sample. In general, neural networks provided higher prediction accuracy than logistic regression (AUC increase by 0.02-0.06). Four out of six IrisPlex SNPs were associated with eye colour in the studied population. HERC2 rs12913832, OCA2 rs1800407 and SLC24A4 rs12896399 were found to be the most important eye colour predictors (p < 0.007) while the effect of rs16891982 in SLC45A2 was less significant. Gender was found to be significantly associated with eye colour, with males having ~1.5 times higher odds of blue eye colour than females (p = 0.002), and was ranked as the third most important factor in blue/non-blue eye colour determination. However, the implementation of gender into the developed prediction models had marginal and ambiguous impact on the overall accuracy of prediction, confirming that the effect of gender on eye colour in this population is small. Our study indicated the advantage of neural networks in prediction modeling in forensics and provided additional evidence for population-specific differences in the predictive importance of the IrisPlex SNPs and gender.
Androgenetic alopecia, known in men as male pattern baldness (MPB), is a very conspicuous condition that is particularly frequent among European men and thus contributes markedly to variation in physical appearance traits amongst Europeans. Recent studies have revealed multiple genes and polymorphisms to be associated with susceptibility to MPB. In this study, 50 candidate SNPs for androgenetic alopecia were analyzed in order to verify their potential to predict MPB. Significant associations were confirmed for 29 SNPs from chromosomes X, 1, 5, 7, 18 and 20. A simple 5-SNP prediction model and an extended 20-SNP model were developed based on a discovery panel of 305 males from various European populations fitting one of two distinct phenotype categories. The first category consisted of men below 50 years of age with significant baldness and the second of men aged 50 years or older lacking baldness. The simple model comprised the five best predictors: rs5919324 near AR, rs1998076 in the 20p11 region, rs929626 in EBF1, rs12565727 in TARDBP and rs756853 in HDAC9. The extended prediction model added 15 SNPs from five genomic regions that improved overall prevalence-adjusted predictive accuracy measured by area under the receiver operating characteristic curve (AUC). Both models were evaluated for predictive accuracy using a test set of 300 males reflecting the general European population. Applying a 65% probability threshold, high prediction sensitivity of 87.1% but low specificity of 42.4% was obtained in men aged <50 years. In men aged ≥50, prediction sensitivity was slightly lower at 67.7% while specificity reached 90%. Overall, the AUC = 0.761 calculated for men at or above 50 years of age indicates these SNPs offer considerable potential for the application of genetic tests to predict MPB patterns, adding a highly informative predictive system to the emerging field of forensic analysis of externally visible characteristics.
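The sensitivity and specificity figures quoted at a 65% probability threshold can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `sensitivity_specificity` and the example probabilities and labels are hypothetical, chosen only to show how a probability threshold turns model output into the two rates.

```python
def sensitivity_specificity(probabilities, labels, threshold=0.65):
    """Return (sensitivity, specificity) for predicted probabilities.

    probabilities: predicted probability of the trait for each individual.
    labels: observed status, 1 = trait present, 0 = trait absent.
    threshold: an individual is called positive if probability >= threshold.
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probabilities, labels):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model output for six individuals (not data from the study).
probs = [0.90, 0.70, 0.40, 0.80, 0.30, 0.20]
labels = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(probs, labels)
```

Raising the threshold generally trades sensitivity for specificity, which is why both values are reported together with the threshold used.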
In the International Visible Trait Genetics (VisiGen) Consortium, we investigated the genetics of human skin color by combining a series of genome-wide association studies (GWAS) in a total of 17,262 Europeans with functional follow-up of discovered loci. Our GWAS provide the first genome-wide significant evidence for chromosome 20q11.22 harboring the ASIP gene being explicitly associated with skin color in Europeans. In addition, genomic loci at 5p13.2 (SLC45A2), 6p25.3 (IRF4), 15q13.1 (HERC2/OCA2), and 16q24.3 (MC1R) were confirmed to be involved in skin coloration in Europeans. In follow-up gene expression and regulation studies of 22 genes in 20q11.22, we highlighted two novel genes EIF2S2 and GSS, serving as competing functional candidates in this region and providing future research lines. A genetically inferred skin color score obtained from the 9 top-associated SNPs from 9 genes in 940 worldwide samples (HGDP-CEPH) showed a clear gradual pattern in Western Eurasians similar to the distribution of physical skin color, suggesting the used 9 SNPs as suitable markers for DNA prediction of skin color in Europeans and neighboring populations, relevant in future forensic and anthropological
Genomic prediction of the extreme forms of adult body height or stature is of practical relevance in several areas such as pediatric endocrinology and forensic investigations. Here, we examine 770 extremely tall cases and 9,591 normal height controls in a population-based Dutch European sample to evaluate the capability of known height-associated DNA variants in predicting tall stature. Among the 180 normal height-associated single nucleotide polymorphisms (SNPs) previously reported by the Genetic Investigation of ANthropometric Traits (GIANT) genome-wide association study on normal stature, in our data 166 (92.2%) showed directionally consistent effects and 75 (41.7%) showed nominally significant association with tall stature, indicating that the 180 GIANT SNPs are informative for tall stature in our Dutch sample. A prediction analysis based on the weighted allele sums method demonstrated a substantially improved potential for predicting tall stature (AUC = 0.75; 95% CI 0.72-0.79) compared to a previous attempt using 54 height-associated SNPs (AUC = 0.65). The achieved accuracy is approaching practical relevance such as in pediatrics and forensics. Furthermore, a reanalysis of all SNPs at the 180 GIANT loci in our data identified novel secondary association signals for extreme tall stature at TGFB2 (P = 1.8 × 10⁻¹³) and PCSK5 (P = 7.8 × 10⁻¹¹), suggesting the existence of allelic heterogeneity and underlining the importance of fine analysis of already discovered loci. Extrapolating from our results suggests that the genomic prediction of at least the extreme forms of common complex traits in humans, including common diseases, is likely to be informative if large numbers of trait-associated common DNA variants are available.
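As a rough illustration of a weighted allele sum, the following Python sketch adds up effect-allele counts weighted by per-allele effect sizes. The SNP identifiers, weights and genotypes are hypothetical placeholders rather than values from the study, and the function is a generic sketch, not the authors' implementation.

```python
def weighted_allele_score(genotypes, weights):
    """Sum of effect-allele counts multiplied by per-allele weights.

    genotypes: dict mapping SNP id -> number of effect alleles (0, 1 or 2).
    weights: dict mapping SNP id -> per-allele effect size (assumed known,
             for example from a previously published association study).
    """
    missing = set(weights) - set(genotypes)
    if missing:
        raise KeyError(f"missing genotypes for: {sorted(missing)}")
    return sum(weights[snp] * genotypes[snp] for snp in weights)


# Hypothetical weights and one individual's genotypes (placeholders only).
weights = {"snp_a": 0.04, "snp_b": 0.02, "snp_c": 0.03}
person = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
score = weighted_allele_score(person, weights)
```

Individuals can then be ranked by score, and a measure such as the AUC summarises how well that ranking separates cases from controls.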
Prediction of phenotypes from genetic data is considered to be the first practical application of data gained from association studies, with potential importance for medicine and the forensic sciences. Multiple genes and polymorphisms have been found to be associated with variation in human pigmentation. Their analysis enables prediction of blue and brown eye colour with a reasonably high accuracy. More accurate prediction, especially in the case of intermediate eye colours, may require better understanding of gene-gene interactions affecting this polygenic trait. Using multifactor dimensionality reduction and logistic regression methods, a study of gene-gene interactions was conducted based on variation in 11 known pigmentation genes examined in a cohort of 718 individuals of European descent. The study revealed significant interactions of a redundant character between the HERC2 and OCA2 genes affecting determination of hazel eye colour and between HERC2 and SLC24A4 affecting determination of blue eye colour. Our research indicates interactive effects of a synergistic character between HERC2 and OCA2, and also provides evidence for a novel strong synergistic interaction between HERC2 and TYRP1, both affecting determination of green eye colour.
Predicting complex human phenotypes from genotypes is the central concept of widely advocated personalized medicine, but so far has rarely led to high accuracies, limiting practical applications. One notable exception, although less relevant for medical but important for forensic purposes, is human eye color, for which it has been recently demonstrated that highly accurate prediction is feasible from a small number of DNA variants. Here, we demonstrate that human hair color is predictable from DNA variants with similarly high accuracies. In Polish Europeans with single-observer hair color grading, we analyzed 45 single nucleotide polymorphisms (SNPs) from 12 genes previously associated with human hair color variation. We found that a model based on a subset of 13 single or compound genetic markers from 11 genes predicted red hair color with over 0.9, black hair color with almost 0.9, as well as blond, and brown hair color with over 0.8 prevalence-adjusted accuracy expressed by the area under the receiver operating characteristic curves (AUC). The identified genetic predictors also differentiate reasonably well between similar hair colors, such as between red and blond-red, as well as between blond and dark-blond, highlighting the value of the identified DNA variants for accurate hair color prediction.
Despite the recent rapid growth in genome-wide data, much of human variation remains entirely unexplained. A significant challenge in the pursuit of the genetic basis for variation in common human traits is the efficient, coordinated collection of genotype and phenotype data. We have developed a novel research framework that facilitates the parallel study of a wide assortment of traits within a single cohort. The approach takes advantage of the interactivity of the Web both to gather data and to present genetic information to research participants, while taking care to correct for the population structure inherent to this study design. Here we report initial results from a participant-driven study of 22 traits. Replications of associations (in the genes OCA2, HERC2, SLC45A2, SLC24A4, IRF4, TYR, TYRP1, ASIP, and MC1R) for hair color, eye color, and freckling validate the Web-based, self-reporting paradigm. The identification of novel associations for hair morphology (rs17646946, near TCHH; rs7349332, near WNT10A; and rs1556547, near OFCC1), freckling (rs2153271, in BNC2), the ability to smell the methanethiol produced after eating asparagus (rs4481887, near OR2M7), and photic sneeze reflex (rs10427255, near ZEB2, and rs11856995, near NR2F2) illustrates the power of the approach.
Previous studies have successfully identified genetic variants in several genes associated with human iris (eye) color; however, they all used simplified categorical trait information. Here, we quantified continuous eye color variation into hue and saturation values using high-resolution digital full-eye photographs and conducted a genome-wide association study on 5,951 Dutch Europeans from the Rotterdam Study. Three new regions, 1q42.3, 17q25.3, and 21q22.13, were highlighted meeting the criterion for genome-wide statistically significant association. The latter two loci were replicated in 2,261 individuals from the UK and in 1,282 from Australia. The LYST gene at 1q42.3 and the DSCR9 gene at 21q22.13 serve as promising functional candidates. A model for predicting quantitative eye colors explained over 50% of trait variance in the Rotterdam Study. Overall, our data exemplify that fine phenotyping is a useful strategy for finding genes involved in human complex traits.
We report a genome-wide association study of melanoma conducted by the GenoMEL consortium based on 317K tagging SNPs for 1,650 selected cases and 4,336 controls, with replication in an additional two cohorts (1,149 selected cases and 964 controls from GenoMEL, and a population-based case-control study in Leeds of 1,163 cases and 903 controls). The genome-wide screen identified five loci with genotyped or imputed SNPs reaching P < 5 × 10⁻⁷. Three of these loci were replicated: 16q24 encompassing MC1R (combined P = 2.54 × 10⁻²⁷ for rs258322), 11q14-q21 encompassing TYR (P = 2.41 × 10⁻¹⁴ for rs1393350) and 9p21 adjacent to MTAP and flanking CDKN2A (P = 4.03 × 10⁻⁷ for rs7023329). MC1R and TYR are associated with pigmentation, freckling and cutaneous sun sensitivity, well-recognized melanoma risk factors. Common variants within the 9p21 locus have not previously been associated with melanoma. Despite wide variation in allele frequency, these genetic variants show notable homogeneity of effect across populations of European ancestry living at different latitudes and show independent association to disease risk.
We sought by use of an adult twin study to investigate the relative contribution of genetic and environmental effects on the expression of nevi and freckles, which are known risk factors for melanoma, and to determine if age and sun exposure influence the heritability of nevi.
Total nevus and freckle counts were conducted on 127 monozygotic twin pairs and 323 dizygotic twin pairs. Intraclass correlations were calculated by use of analysis of variance. Model-fitting analyses were performed to quantify the genetic and environmental components of the variance for nevus and freckle counts.
The intraclass correlation for total nevus counts was 0.83 in monozygotic pairs compared with 0.51 in dizygotic pairs. Quantitative genetic analyses showed that the contribution of genetic factors on nevi expression varied according to age. For twins less than 45 years old, the additive genetic variance on total nevus count was 36% (95% confidence interval [CI] = 0.8%-63%), with 38% (95% CI = 14%-61%) and 26% (95% CI = 16%-42%) of the remaining variance attributed to common environment and unique environmental effects, respectively. In twins aged 45 years or older, common environmental effects on total nevus count became negligible, with the additive genetic variance increasing to 84% (95% CI = 77%-88%). Body site was also found to affect the heritability estimates for nevus counts, with a statistically significant difference between sun-exposed and sun-protected sites. The polychoric correlation (i.e., the correlation in liability within twins for more than two categories) for total freckle counts was 0.91 in monozygotic twin pairs compared with 0.54 in dizygotic twin pairs. Additive genetic effects explained 91% (95% CI = 86%-94%) of the variance in freckle counts.
The contribution of genetic factors on the variance for total nevus counts increased with age, and sun exposure appears to influence the expression of nevi. The results of this study highlight the need to take into account the age and site of nevus counts for future genetic linkage or association studies in the search for new melanoma genes.
Variants of the melanocortin 1 receptor (MC1R) gene are common in individuals with red hair and fair skin, but the relative contribution to these pigmentary traits in heterozygotes, homozygotes and compound heterozygotes for variants at this locus from the multiple alleles present in Caucasian populations is unclear. We have investigated 174 individuals from 11 large kindreds with a preponderance of red hair and an additional 99 unrelated redheads, for MC1R variants and have confirmed that red hair is usually inherited as a recessive characteristic with the R151C, R160W, D294H, R142H, 86insA and 537insC alleles at this locus. The V60L variant, which is common in the population may act as a partially penetrant recessive allele. These individuals plus 167 randomly ascertained Caucasians demonstrate that heterozygotes for two alleles, R151C and 537insC, have a significantly elevated risk of red hair. The shade of red hair frequently differs in heterozygotes from that in homozygotes/compound heterozygotes and there is also evidence for a heterozygote effect on beard hair colour, skin type and freckling. The data provide evidence for a dosage effect of MC1R variants on hair as well as skin colour.
Ephelides and solar lentigines are different types of pigmented skin lesions. Ephelides appear early in childhood and are associated with fair skin type and red hair. Solar lentigines appear with increasing age and are a sign of photodamage. Both lesions are strong risk indicators for melanoma and non-melanoma skin cancer. Melanocortin-1-receptor (MC1R) gene variants are also associated with fair skin, red hair and melanoma and non-melanoma skin cancer. The purpose of this study was to investigate the relationship between MC1R gene variants, ephelides and solar lentigines. In a large case-control study, patients with melanoma and non-melanoma skin cancer and subjects without a history of skin cancer were studied. In all participants, the presence of ephelides in childhood and solar lentigines by physical examination was assessed according to strict definitions. The entire coding sequence of the MC1R gene was analyzed by single-strand conformation polymorphism analysis followed by sequence analyses. Carriers of one or two MC1R gene variants had a 3- and 11-fold increased risk of developing ephelides, respectively (both P < 0.0001), whereas the risk of developing severe solar lentigines was increased 1.5- and 2-fold (P = 0.035 and P < 0.0001), respectively. These associations were independent of skin type and hair color, and were comparable in patients with and without a history of skin cancer. The population attributable risk for ephelides to MC1R gene variants was 60%, i.e. 60% of the ephelides in the population was caused by MC1R gene variants. A dosage effect was found between the degree of ephelides and the number of MC1R gene variants. As nearly all individuals with ephelides were carriers of at least one MC1R gene variant, our data suggest that MC1R gene variants are necessary to develop ephelides. The results of the study also suggest that MC1R gene variants play a role, although less important, in the development of solar lentigines.
There is increasing awareness that epistasis or gene-gene interaction plays a role in susceptibility to common human diseases. In this paper, we formulate a working hypothesis that epistasis is a ubiquitous component of the genetic architecture of common human diseases and that complex interactions are more important than the independent main effects of any one susceptibility gene. This working hypothesis is based on several bodies of evidence. First, the idea that epistasis is important is not new. In fact, the recognition that deviations from Mendelian ratios are due to interactions between genes has been around for nearly 100 years. Second, the ubiquity of biomolecular interactions in gene regulation and biochemical and metabolic systems suggest that relationship between DNA sequence variations and clinical endpoints is likely to involve gene-gene interactions. Third, positive results from studies of single polymorphisms typically do not replicate across independent samples. This is true for both linkage and association studies. Fourth, gene-gene interactions are commonly found when properly investigated. We review each of these points and then review an analytical strategy called multifactor dimensionality reduction for detecting epistasis. We end with ideas of how hypotheses about biological epistasis can be generated from statistical evidence using biochemical systems models. If this working hypothesis is true, it suggests that we need a research strategy for identifying common disease susceptibility genes that embraces, rather than ignores, the complexity of the genotype to phenotype relationship.
The relationships between MC1R gene variants and red hair, skin reflectance, degree of freckling and nevus count were investigated in 2331 adolescent twins, their sibs and parents in 645 twin families. Penetrance of each MC1R variant allele was consistent with an allelic model where effects were multiplicative for red hair but additive for skin reflectance. Of nine MC1R variant alleles assayed, four common alleles were strongly associated with red hair and fair skin (Asp84Glu, Arg151Cys, Arg160Trp and Asp294His), with a further three alleles having low penetrance (Val60Leu, Val92Met and Arg163Gln). These variants were separately combined for the purposes of this analysis and designated as strong 'R' (OR = 63.3; 95% CI 31.9-139.6) and weak 'r' (OR = 5.1; 95% CI 2.5-11.3) red hair alleles. Red-haired individuals are predominantly seen in the R/R and R/r groups with 67.1 and 10.8%, respectively. To assess the interaction of the brown eye color gene OCA2 on the phenotypic effects of variant MC1R alleles we included eye color as a covariate, and also genotyped two OCA2 SNPs (Arg305Trp and Arg419Gln), which were confirmed as modifying eye color. MC1R genotype effects on constitutive skin color, freckling and mole count were modified by eye color, but not genotype for these two OCA2 SNPs. This is probably due to the association of these OCA2 SNPs with brown/green not blue eye color. Amongst individuals with a R/R genotype (but not R/r), those who also had brown eyes had a mole count twice that of those with blue eyes. This suggests that other OCA2 polymorphisms influence mole count and remain to be described.
Natural variation in the coding region of the melanocortin-1 receptor (MC1R) gene is associated with constitutive pigmentation phenotypes and development of melanoma and nonmelanoma skin cancers. We investigated the effect of MC1R variants on melanoma using a large, international population-based study design with complete determination of all MC1R coding region variants. Direct sequencing was completed for 2,202 subjects with a single primary melanoma (controls) and 1,099 subjects with second or higher-order primary melanomas (cases) from Australia, the United States, Canada, and Italy. We observed 85 different MC1R variants, 10 of which occurred at a frequency >1%. Compared with controls, cases were more likely to carry two previously identified red hair ("R") variants [D84E, R151C, R160W, and D294H; odds ratio (OR), 1.6; 95% confidence interval (95% CI), 1.1-2.2]. This effect was similar among individuals carrying one R variant and one r variant (defined as any non-R MC1R variant; OR, 1.6; 95% CI, 1.3-2.2) and among those carrying only one R variant (OR, 1.5; 95% CI, 1.1-1.9). There was no statistically significant association among those carrying only one or two r variants. Effects were similar across geographic regions and categories of pigmentation characteristics or number of moles. Our results confirm that MC1R is a low-penetrance susceptibility locus for melanoma, show that pigmentation characteristics may not modify the relationship of MC1R variants and melanoma risk, and suggest that associations may be smaller than previously reported in part due to the study design.
Human skin pigmentation shows a strong positive correlation with ultraviolet radiation intensity, suggesting that variation in skin color is, at least partially, due to adaptation via natural selection. We investigated the evolution of pigmentation variation by testing for the presence of positive directional selection in 6 pigmentation genes using an empirical F(ST) approach, through an examination of global diversity patterns of these genes in the Centre d'Etude du Polymorphisme Humain (CEPH)-Diversity Panel, and by exploring signatures of selection in data from the International HapMap project. Additionally, we demonstrated a role for MATP in determining normal skin pigmentation variation using admixture mapping methods. Taken together (with the results of previous admixture mapping studies), these results point to the importance of several genes in shaping the pigmentation phenotype and a complex evolutionary history involving strong selection. Polymorphisms in 2 genes, ASIP and OCA2, may play a shared role in shaping light and dark pigmentation across the globe, whereas SLC24A5, MATP, and TYR have a predominant role in the evolution of light skin in Europeans but not in East Asians. These findings support a case for the recent convergent evolution of a lighter pigmentation phenotype in Europeans and East Asians.
Hair, skin and eye colors are highly heritable and visible traits in humans. We carried out a genome-wide association scan for variants associated with hair and eye pigmentation, skin sensitivity to sun and freckling among 2,986 Icelanders. We then tested the most closely associated SNPs from six regions--four not previously implicated in the normal variation of human pigmentation--and replicated their association in a second sample of 2,718 Icelanders and a sample of 1,214 Dutch. The SNPs from all six regions met the criteria for genome-wide significance. A variant in SLC24A4 is associated with eye and hair color, a variant near KITLG is associated with hair color, two coding variants in TYR are associated with eye color and freckles, and a variant on 6p25.3 is associated with freckles. The fifth region provided refinements to a previously reported association in OCA2, and the sixth encompasses previously described variants in MC1R.
We present results from a genome-wide association study for variants associated with human pigmentation characteristics among 5,130 Icelanders, with follow-up analyses in 2,116 Icelanders and 1,214 Dutch individuals. Two coding variants in TPCN2 are associated with hair color, and a variant at the ASIP locus shows strong association with skin sensitivity to sun, freckling and red hair, phenotypic characteristics similar to those affected by well-known mutations in MC1R.
Fair color increases risk of cutaneous melanoma (CM) and basal cell carcinoma (BCC). Recent genome-wide association studies have identified variants affecting hair, eye and skin pigmentation in Europeans. Here, we assess the effect of these variants on risk of CM and BCC in European populations comprising 2,121 individuals with CM, 2,163 individuals with BCC and over 40,000 controls. A haplotype near ASIP, known to affect a similar spectrum of pigmentation traits as MC1R variants, conferred significant risk of CM (odds ratio (OR) = 1.45, P = 1.2 × 10⁻⁹) and BCC (OR = 1.33, P = 1.2 × 10⁻⁶). The variant in TYR encoding the R402Q amino acid substitution, previously shown to affect eye color and tanning response, conferred risk of CM (OR = 1.21, P = 2.8 × 10⁻⁷) and BCC (OR = 1.14, P = 6.1 × 10⁻⁴). An eye color variant in TYRP1 was associated with risk of CM (OR = 1.15, P = 4.6 × 10⁻⁴). The association of all three variants is robust with respect to adjustment for the effect of pigmentation.
Human head hair shape, commonly classified as straight, wavy, curly or frizzy, is an attractive target for Forensic DNA Phenotyping and other applications of human appearance prediction from DNA such as in paleogenetics. The genetic knowledge underlying head hair shape variation was recently improved by the outcome of a series of genome-wide association and replication studies in a total of 26,964 subjects, highlighting 12 loci of which 8 were novel and introducing a prediction model for Europeans based on 14 SNPs. In the present study, we evaluated the capacity of DNA-based head hair shape prediction by investigating an extended set of candidate SNP predictors and by using an independent set of samples for model validation. Prediction model building was carried out in 9674 subjects (6068 from Europe, 2899 from Asia and 707 of admixed European and Asian ancestries), used previously, by considering a novel list of 90 candidate SNPs. For model validation, genotype and phenotype data were newly collected in 2415 independent subjects (2138 Europeans and 277 non-Europeans) by applying two targeted massively parallel sequencing platforms, Ion Torrent PGM and MiSeq, or the MassARRAY platform. A binomial model was developed to predict straight vs. non-straight hair based on 32 SNPs from 26 genetic loci we identified as significantly contributing to the model. This model achieved prediction accuracies, expressed as AUC, of 0.664 in Europeans and 0.789 in non-Europeans; the statistically significant difference was explained mostly by the effect of one EDAR SNP in non-Europeans. Considering sex and age, in addition to the SNPs, slightly and insignificantly increased the prediction accuracies (AUC of 0.680 and 0.800, respectively). 
Based on the sample size and candidate DNA markers investigated, this study provides the most robust, validated, and accurate statistical prediction models and SNP predictor marker sets currently available for predicting head hair shape from DNA, providing the next step towards broadening Forensic DNA Phenotyping beyond pigmentation traits.
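Prediction accuracy is reported here as AUC. As a minimal sketch of what that number measures, the code below computes AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name and the example scores are hypothetical; this is not the evaluation code used in the study.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    scores: predicted probabilities or scores, higher means "more positive"
    labels: 1 for positive cases, 0 for negative cases
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical model scores for two positive and two negative samples.
    print(auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which values such as 0.664 and 0.789 should be read.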
Accurate prediction of complex traits requires using a large number of DNA variants. Advances in statistical and machine learning methodology enable the identification of complex patterns in high-dimensional settings. However, training these highly parameterized methods requires very large data sets. Until recently, such data sets were not available. But the situation is changing rapidly as very large biomedical data sets comprising individual genotype-phenotype data for hundreds of thousands of individuals become available in public and private domains. We argue that the convergence of advances in methodology and the advent of Big Genomic Data will enable unprecedented improvements in complex-trait prediction; we review theory and evidence supporting our claim and discuss challenges and opportunities that Big Data will bring to complex-trait prediction.
Forensic DNA Phenotyping (FDP), i.e. the prediction of human externally visible traits from DNA, has become a fast growing subfield within forensic genetics due to the intelligence information it can provide from DNA traces. FDP outcomes can help focus police investigations in search of unknown perpetrators, who are generally unidentifiable with standard DNA profiling. Therefore, we previously developed and forensically validated the IrisPlex DNA test system for eye colour prediction and the HIrisPlex system for combined eye and hair colour prediction from DNA traces. Here we introduce and forensically validate the HIrisPlex-S DNA test system (S for skin) for the simultaneous prediction of eye, hair, and skin colour from trace DNA. This FDP system consists of two SNaPshot-based multiplex assays targeting a total of 41 SNPs via a novel multiplex assay for 17 skin colour predictive SNPs and the previous HIrisPlex assay for 24 eye and hair colour predictive SNPs, 19 of which also contribute to skin colour prediction. The HIrisPlex-S system further comprises three statistical prediction models, the previously developed IrisPlex model for eye colour prediction based on 6 SNPs, the previous HIrisPlex model for hair colour prediction based on 22 SNPs, and the recently introduced HIrisPlex-S model for skin colour prediction based on 36 SNPs. In the forensic developmental validation testing, the novel 17-plex assay performed in full agreement with the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines, as previously shown for the 24-plex assay. Sensitivity testing of the 17-plex assay revealed complete SNP profiles from as little as 63 pg of input DNA, equalling the previously demonstrated sensitivity threshold of the 24-plex HIrisPlex assay. 
Testing of simulated forensic casework samples such as blood, semen, saliva stains, of inhibited DNA samples, of low quantity touch (trace) DNA samples, and of artificially degraded DNA samples as well as concordance testing, demonstrated the robustness, efficiency, and forensic suitability of the new 17-plex assay, as previously shown for the 24-plex assay. Finally, we provide an update of the publicly available HIrisPlex website https://hirisplex.erasmusmc.nl/, now allowing the estimation of individual probabilities for 3 eye, 4 hair, and 5 skin colour categories from HIrisPlex-S input genotypes. The HIrisPlex-S DNA test represents the first forensically validated tool for skin colour prediction, and reflects the first forensically validated tool for simultaneous eye, hair and skin colour prediction from DNA.
Prediction of human pigmentation traits, one of the most differentiable externally visible characteristics among individuals, from biological samples represents a useful tool in the field of forensic DNA phenotyping. In spite of freckling being a relatively common pigmentation characteristic in Europeans, little is known about the genetic basis of this largely genetically determined phenotype in southern European populations. In this work, we explored the predictive capacity of eight freckle and sunlight sensitivity-related genes in 458 individuals (266 non-freckled controls and 192 freckled cases) from Spain. Four loci were associated with freckling (MC1R, IRF4, ASIP and BNC2), and female sex was also found to be a predictive factor for having a freckling phenotype in our population. After identifying the most informative genetic variants responsible for human ephelides occurrence in our sample set, we developed a DNA-based freckle prediction model using a multivariate regression approach. Once developed, the capabilities of the prediction model were tested by a repeated 10-fold cross-validation approach. The proportion of correctly predicted individuals using the DNA-based freckle prediction model was 74.13%. The implementation of sex into the DNA-based freckle prediction model slightly improved the overall prediction accuracy by 2.19% (76.32%). Further evaluation of the newly-generated prediction model was performed by assessing the model's performance in a new cohort of 212 Spanish individuals, reaching a classification success rate of 74.61%. Validation of this prediction model may be carried out in larger populations, including samples from different European populations. Further research to validate and improve this newly-generated freckle prediction model will be needed before its forensic application. 
Together with DNA tests already validated for eye and hair colour prediction, this freckle prediction model may lead to a substantially more detailed physical description of unknown individuals from DNA found at the crime scene.
Humans are a colourful species of primate, with human skin, hair and eye coloration having been influenced by a great variety of evolutionary forces throughout prehistory. Functionally naked skin has been the physical interface between the physical environment and the human body for most of the history of the genus Homo, and hence skin coloration has been under intense natural selection. From an original condition of protective, dark, eumelanin-enriched coloration in early tropical-dwelling Homo and Homo sapiens, loss of melanin pigmentation occurred under natural selection as Homo sapiens dispersed into non-tropical latitudes of Africa and Eurasia. Genes responsible for skin, hair and eye coloration appear to have been affected significantly by population bottlenecks in the course of Homo sapiens dispersals. Because specific skin colour phenotypes can be created by different combinations of skin colour–associated genetic markers, loss of genetic variability due to genetic drift appears to have had negligible effects on the highly redundant genetic ‘palette’ for the skin colour. This does not appear to have been the case for hair and eye coloration, however, and these traits appear to have been more strongly influenced by genetic drift and, possibly, sexual selection.
This article is part of the themed issue ‘Animal coloration: production, perception, function and application’.
Cutaneous malignant melanoma (CMM) is a malicious human skin cancer that primarily affects individuals with light pigmentation and heavy sun exposure, but also has a known familial association. Multiple genes and polymorphisms have been reported as low-penetrance susceptibility loci for CMM. Here, we examined 33 candidate polymorphisms located in 11 pigmentation genes and the vitamin D receptor gene (VDR) in a population of 130 cutaneous melanoma patients and 707 healthy controls. The genotypes obtained were evaluated for main association effects and potential gene-gene interactions. MC1R, TYR, VDR and SLC45A2 genes were found to be associated with CMM in our population. The results obtained for major function MC1R mutations were the most significant [with odds ratio (OR)=1.787, confidence interval (CI)=1.320-2.419 and P=1.715], followed by TYR (rs1393350) (with OR=1.569, CI=1.162-2.118, P=0.003), VDR (GCCC haplotype in rs2238136-rs4516035-rs7139166-rs11568820 block) (with OR=5.653, CI=1.794-17.811, P=0.003) and SLC45A2 (rs16891982) (with OR=0.238, CI=0.057-0.987, P=0.048). The study also detected significant intermolecular epistatic effects between MC1R and TYR, SLC45A2 and VDR, HERC2 and VDR, OCA2 and TPCN2, as well as intramolecular interactions between variants within the genes MC1R and VDR. In the final multivariate logistic regression model for CMM development, only the gene-gene interactions discovered remained significant, showing that epistasis may be an important factor in the risk of melanoma.
Freckles, the lay term for ephelides and lentigines, are important pigmentation characteristics observed in humans. Both are affected by sunlight; ephelides are largely genetically determined but induced by sunlight whereas lentigines are induced by sun exposure and photodamage of the skin. However, despite being commonly observed, we know very little about them. Here we review the current status of knowledge about freckles and propose a model for their formation.
In recent years, several studies have greatly increased our understanding of the genetic basis underlying human eye colour variation. A large percentage of the eye colour diversity present in humans can already be genetically explained, so much so that different DNA-based eye colour prediction models, such as IrisPlex, have been recently developed for forensic purposes. Though these models are already highly accurate, they are by no means perfect, with many genotype-phenotype discrepancies still remaining unresolved. In this work we have genotyped six SNPs associated with eye colour (IrisPlex) in 535 individuals from Spain, a Mediterranean population. Aside from different SNP frequencies in Spain compared to Northern Europe, the results for eye colour prediction are quite similar to other studies. However, we have found an association between gender and eye colour prediction. When comparing similar eye colour genetic profiles, females tend, as a whole, to have darker eyes than males (and, conversely, males lighter than females). These results are also corroborated by the revision and meta-analysis of data from previously published eye colour genetic studies in several Caucasian populations, which significantly support the fact that males are more likely to have blue eyes than females, while females tend to show higher frequencies of green and brown eyes than males. This significant gender difference would suggest that there is an as yet unidentified gender-related factor contributing to human eye colour variation.
Genome-wide association studies and comparative genomics have established major loci and specific polymorphisms affecting human skin, hair and eye color. Environmental changes have had an impact on selected pigmentation genes as populations have expanded into different regions of the globe.
Background To date, few epidemiological data on the relationships between solar lentigines, freckles and behavioural and constitutional risk factors in Caucasian populations exist.
Objectives To investigate the potential impact of behavioural and phenotypic variables, as well as the MC1R genetic background, on the history of facial freckles and the severity of solar lentigines in Caucasian women.
Methods The severity of solar lentigines was graded from facial digital images of 523 French middle-aged women by a dermatologist and summarized by a score afterwards. The history of facial freckles was assessed and the sun-exposure behaviour was characterized using a six-category typology. Risk factors including MC1R polymorphism were evaluated using logistic regression models.
Results Two constitutive host factors were found to be independently associated with a history of facial freckles: frequent sunburns and the presence of diminished function variants of the MC1R gene. In addition to age, five factors were independently associated with solar lentigines: constitutive host factors (dark skin colour and tanning capacity), a history of freckles, sun-exposure behaviour and current intake of oral contraceptive or progestogen treatments.
Conclusion These results strengthen the hypothesis that solar lentigines are markers of photoaging, whereas freckles are mainly determined by genetic factors. The finding that hormonal treatment is associated with a higher risk for solar lentigines merits further investigations.
Prediction of physical appearance based on genetic analysis is a very attractive prospect for forensic investigations. Recent studies have proved that there is a significant association between some genetic variants of the melanocortin 1 receptor (MC1R) gene and red hair color. The present study focuses on the potential forensic applicability of variation within this pigment-related gene. Sequencing of the complete MC1R gene was performed on a group of red-haired individuals and controls with different pigmentation. A major role in determination of red hair color is played by two MC1R variants—C451T and C478T. The optimized minisequencing assay for genotyping of the above positions and three other important red hair-related MC1R polymorphisms, C252A, G425A, and G880C was successfully applied to analyze typical forensic specimens. Determination of a homozygous or heterozygous combination can be a good predictor of both red hair color and fair skin of a subject.
The risk of developing skin cancers is dependent on a combination of environmental factors and personal genetic predispositions. Basal cell carcinoma (BCC) has been associated with single nucleotide polymorphisms in several pigmentation genes; however, there is still controversy concerning the mechanism by which these variants may increase the risk of BCC. The pathway may lead to pigmentation alone, but evidence for their independent influence is growing. Using a single base extension protocol, candidate polymorphisms within 11 known pigment-related genes were studied for their association with BCC in a population sample consisting of 164 patients and 707 controls. The significance of variation within the MC1R gene was confirmed and, in addition, position rs12203592 within the IRF4 gene was shown to be associated with BCC. These associations remained significant after adjustment for skin color. Gene-gene interactions were found to influence susceptibility to BCC. Among interacting genes are the two above-mentioned loci with main effect on BCC risk and additionally KITLG, TYRP1, ASIP and TYR. The obtained results indicate that polymorphisms at MC1R and IRF4 constitute pigmentation-independent risk factors in the development of BCC. Moreover, susceptibility to BCC may be influenced by epistatic effects between pigmentation genes.
The widespread availability of high-throughput genotyping technology has opened the door to the era of personal genetics, which brings to consumers the promise of using genetic variations to predict individual susceptibility to common diseases. Despite easy access to commercial personal genetics services, our knowledge of the genetic architecture of common diseases is still very limited and has not yet fulfilled the promise of accurately predicting most people at risk. This is partly because of the complexity of the mapping relationship between genotype and phenotype that is a consequence of epistasis (gene-gene interaction) and other phenomena such as gene-environment interaction and locus heterogeneity. Unfortunately, these aspects of genetic architecture have not been addressed in most of the genetic association studies that provide the knowledge base for interpreting large-scale genetic association results. We provide here an introductory review of how epistasis can affect human health and disease and how it can be detected in population-based studies. We provide some thoughts on the implications of epistasis for personal genetics and some recommendations for improving personal genetics in light of this complexity.
Predicting complex human phenotypes from genotypes has recently gained tremendous interest in the emerging field of consumer genomics, particularly in light of attempting personalized medicine 1 and 2. So far, however, this approach has not been shown to be accurate, thus limiting its practical applications 3 and 4. Here, we used human eye (iris) color of Europeans as an empirical example to demonstrate that highly accurate genetic prediction of complex human phenotypes is feasible. Moreover, the six DNA markers we identified as major eye color predictors will be valuable in forensic studies.
Common causes of hyperpigmentation include postinflammatory hyperpigmentation, melasma, solar lentigines, ephelides (freckles), and café-au-lait macules. Although most hyperpigmented lesions are benign and the diagnosis is straightforward, it is important to exclude melanoma and its precursors and to identify skin manifestations of systemic disease. Treatment options for postinflammatory hyperpigmentation, melasma, solar lentigines, and ephelides include the use of topical agents, chemical peels, cryotherapy, or laser therapy. Café-au-lait macules are amenable to surgical excision or laser treatment. Disorders of hypopigmentation may also pose diagnostic challenges, although those associated with health risks are uncommon and are usually congenital (e.g., albinism, piebaldism, tuberous sclerosis, hypomelanosis of Ito). Acquired disorders may include vitiligo, pityriasis alba, tinea versicolor, and postinflammatory hypopigmentation. Treatment of patients with widespread or generalized vitiligo may include cosmetic coverage, psoralen ultraviolet A-range therapy (with or without psoralens), or narrow-band ultraviolet-B therapy; whereas those with stable, limited disease may be candidates for surgical grafting techniques. Patients with extensive disease may be candidates for depigmentation therapy. Other acquired disorders may improve or resolve with treatment of the underlying condition.
Ephelides and solar lentigines are benign pigmented spots, which are currently associated with an increased risk of skin cancer. These two pigmented spots are known to be discriminated by their clinical, histological, and electron microscopic characteristics, even though occasional misclassification can occur because of their similarity. It has also been questioned whether these spots are not one and the same. In this study, we have attempted to differentiate between these two pigmented spots with the use of a standardized protocol for clinical examinations on 272 healthy volunteers, paying particular consideration to their pigmentary and constitutional host factors. We found that solar lentigines 1) are more prevalent than ephelides, 2) increase in prevalence and number with higher age, and 3) are most prevalent on the trunk and occur more frequently in males than in females. A trend is also observed whereby ephelides 1) lose their prevalence with age, 2) become equally distributed on the face, arms, and trunk, and 3) occur more frequently in females. An intimate association of ephelides, but not solar lentigines, has been found with hair color and skin type. All of these findings are in agreement with most of those reported in the literature, supporting the view that ephelides and solar lentigines are different types of pigmented lesions.
We have examined melanocortin-1 receptor (MC1R) variant allele frequencies in the general population and in a collection of adolescent dizygotic and monozygotic twins to determine statistical associations of pigmentation phenotypes with increased skin cancer risk. This included hair and skin color, freckling, mole count and sun exposed skin reflectance. Nine variants were studied and designated as either strong R (OR = 63; 95% CI 32-140) or weak r (OR = 5; 95% CI 3-11) red hair alleles. Penetrance of each MC1R variant allele was consistent with an allelic model where effects were multiplicative for red hair but additive for skin reflectance. To assess the interaction of the brown eye color gene BEY2/OCA2 on the phenotypic effects of variant MC1R alleles we imputed OCA2 genotype in the twin collection. A modifying effect of OCA2 on MC1R variant alleles was seen on constitutive skin color, freckling and mole count. In order to study the individual effects of these variants on pigmentation phenotype we have established a series of human primary melanocyte strains genotyped for the MC1R receptor. These include strains which are MC1R wild-type consensus, variant heterozygotes, and homozygotes for strong R alleles Arg151Cys and Arg160Trp. Ultrastructural analysis demonstrated that only consensus strains contained stage III and IV melanosomes in their terminal dendrites whereas Arg151Cys and Arg160Trp homozygous strains contained only immature stage I and II melanosomes. Such genetic association studies combined with the functional analysis of MC1R variant alleles in melanocytic cells should provide a link in understanding the association between pigmentary phototypes and skin cancer risk.
Solar lentigines and ephelides are different types of pigmented skin lesions predominantly present on sun-exposed skin. Both lesions are risk indicators for melanoma and non-melanoma skin cancer. Solar lentigines are considered as a sign of photodamage although well-conducted epidemiological studies are lacking on this subject. Ephelides are associated with fair skin type and red hair. The aim of the present study was to investigate the relation of sun-exposure estimates with solar lentigines and ephelides. In the Leiden Skin cancer Study 577 patients with malignant melanoma and/or non-melanoma skin cancer and 385 individuals without a history of skin cancer were studied. The presence of solar lentigines and ephelides in the face and on the back was assessed. Data on skin type, hair color, sun-exposure variables and cutaneous signs of photodamage were collected, by questionnaire and physical examination. Data were analyzed by chi-square or Student t-tests and with multivariable regression. Exposure odds ratios with 95% confidence intervals (95% CI) were calculated to estimate the relative risk for the presence of solar lentigines and ephelides dependent on signs of photodamage. The association with age was strongly positive for solar lentigines whereas it was strongly negative for ephelides (P-values for trend <0.0001). After adjustment for age, sex and skin type, solar lentigines on the back were positively associated with cumulative (P = 0.01) and intermittent (P = 0.0002) sun exposure. After adjustment, solar lentigines on the back were also associated with a history of sunburns before the age of 20 yr (P = 0.0003) and the number of sunburns in childhood (P = 0.002). Solar lentigines in the face were significantly associated with cutaneous signs of photodamage, i.e. elastosis (odds ratio 2.4, 95% CI 1.7-3.3) and actinic keratosis (odds ratio 1.8, 95% CI 1.3-2.4) whereas ephelides were not. 
Ephelides in the face and on the back showed an inverse association with chronic sun exposure but after adjustment these associations disappeared. Sunburns before the age of 20 appeared to be positively associated with ephelides on the back (P = 0.04). In contrast to lentigines, ephelides were much more associated with constitutional host factors such as fair skin and/or red hair (both P < 0.0001). This study indicates that both chronic and acute sun exposure are important in the pathogenesis of solar lentigines.
Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
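To make the identity-by-state idea mentioned above concrete, here is a minimal sketch of a pairwise IBS proportion for two individuals. It assumes biallelic markers coded as 0, 1 or 2 copies of one allele, so that the number of alleles shared at a marker is 2 minus the absolute genotype difference. The function name and genotype vectors are hypothetical, and the code is a simplified illustration, not PLINK's implementation.

```python
def ibs_proportion(genotypes_a, genotypes_b):
    """Proportion of alleles shared identical-by-state across markers.

    genotypes_a, genotypes_b: equal-length sequences of 0/1/2 allele counts;
    None marks a missing call, and such markers are skipped.
    """
    if len(genotypes_a) != len(genotypes_b):
        raise ValueError("genotype vectors must have the same length")

    shared = 0
    markers = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # ignore markers with a missing genotype
        shared += 2 - abs(a - b)  # 0, 1 or 2 alleles shared at this marker
        markers += 1
    if markers == 0:
        raise ValueError("no markers with complete genotype data")
    return shared / (2 * markers)


if __name__ == "__main__":
    # Hypothetical genotypes at four markers for two individuals.
    print(ibs_proportion([0, 1, 2, 2], [0, 1, 1, 2]))
```

Values close to 1 across many markers suggest duplicates or close relatives, whereas systematic differences in average sharing between groups can point to population stratification.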
M.D. Ritchie, L.W. Hahn, N. Roodi, L.R. Bailey, W.D. Dupont, F.F. Parl, J.H. Moore, Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer, Am. J. Hum. Genet. 69 (2001)