Figure 3 - available via license: Creative Commons Attribution 4.0 International
Heat map of Model Fit Scores (MFS). Twenty values for each inclusion penalty (site and protein) were sampled on a logarithmic scale spanning 1% to 99% of the maximum non-trivial penalty. A higher MFS suggests a higher risk of overfitting. Models with the best (lowest) 5% of MFS values are included in the predictive ensembles (Figs. 4 and 5).
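The sampling scheme in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the heat map covers the full 20 x 20 grid of site and protein penalties, and the names `log_spaced`, `penalty_grid`, and `best_fraction`, as well as the maximum penalty values passed in, are hypothetical.

```python
import math

def log_spaced(lo, hi, n):
    """Return n values evenly spaced on a log scale from lo to hi, inclusive."""
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    step = (log_hi - log_lo) / (n - 1)
    return [10 ** (log_lo + i * step) for i in range(n)]

def penalty_grid(max_site, max_protein, n=20, lo_frac=0.01, hi_frac=0.99):
    """Pair every site penalty with every protein penalty (assumed full grid).

    max_site and max_protein stand in for the maximum non-trivial penalties;
    how those maxima are obtained is not described in the caption.
    """
    site = log_spaced(lo_frac * max_site, hi_frac * max_site, n)
    protein = log_spaced(lo_frac * max_protein, hi_frac * max_protein, n)
    return [(s, p) for s in site for p in protein]

def best_fraction(scored, fraction=0.05):
    """Keep the lowest-MFS fraction of (mfs, model_id) pairs."""
    k = max(1, round(len(scored) * fraction))
    return sorted(scored, key=lambda pair: pair[0])[:k]
```

Under the full-grid assumption, 20 values per penalty give 400 models, so the best 5% corresponds to 20 models in the ensemble.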
Source publication
Cases abound in which nearly identical traits have appeared in distant species facing similar environments. These unmistakable examples of adaptive evolution offer opportunities to gain insight into their genetic origins and mechanisms through comparative analyses. Here, we present a novel comparative genomics approach to build genetic models that...
Context in source publication
Context 1
... previous experimental and analytical knowledge 20,22,26,27. This model correctly assigned all six C4 and six C3 species used to train the model and correctly predicted 97% of the other C4 species in this dataset (36 of 37) and 100% of C3 species (15 of 15) for a balanced accuracy of 98.5%. An ensemble of genetic models with similar MFS values (Fig. 3) performed equally well (Fig. 4A). The best MFS model was found to be equally accurate in predicting C4 species that are siblings of those used in the training set, which suggests that multiple C4 species within a clade inherited the trait from a common ancestor. This is consistent with the parsimonious reconstruction of ...
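For reference, balanced accuracy is the mean of the per-class recall values. The sketch below applies that standard definition to the counts quoted above, treating C4 as the positive class; the function name is hypothetical. Note that the unrounded result is about 98.6%, whereas the quoted 98.5% matches the average of the rounded per-class figures (97% and 100%).

```python
def balanced_accuracy(true_pos, n_pos, true_neg, n_neg):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = true_pos / n_pos
    specificity = true_neg / n_neg
    return (sensitivity + specificity) / 2

# Held-out counts from the passage: 36 of 37 C4 species, 15 of 15 C3 species.
ba = balanced_accuracy(36, 37, 15, 15)
print(f"{ba:.4f}")  # 0.9865
```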