Available via license: CC BY-NC-ND 4.0
Content may be subject to copyright.
17865606-file00.docx 04.01.2021
Phenotyping in the era of genomics: MaTrics – a digital character matrix to document mammalian phenotypic traits coded numerically
Clara Stefen1, Franziska Wagner1, Marika Asztalos1, Peter Giere2, Peter Grobe3, Michael Hiller4,5,6,7,8,9, Rebecca Hofmann8, Maria Jähde1, Ulla Lächele2, Thomas Lehmann8, Sylvia Ortmann10, Benjamin Peters1, Irina Ruf8, Christian Schiffmann10, Nadja Thier1, Gabi Unterhitzenberger10, Lars Vogt11, Matthias Rudolf12, Peggy Wehner12, Heiko Stuckas1
1 Senckenberg Naturhistorische Sammlungen Dresden, Königsbrücker Landstraße 159, 01109 Dresden, Germany, clara.stefen@senckenberg.de; heiko.stuckas@senckenberg.de
2 Museum für Naturkunde, Berlin Leibniz Institute for Evolution and Biodiversity Science, Invalidenstr. 43, 10115 Berlin, Germany
3 Zoologisches Forschungsmuseum Alexander Koenig, Adenauerallee 160, 53113 Bonn, Germany
4 Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, 01307 Dresden, Germany
5 Max Planck Institute for the Physics of Complex Systems, Nöthnitzer Str. 38, 01187 Dresden, Germany
6 Center for Systems Biology Dresden, Pfotenhauerstr. 108, 01307 Dresden, Germany
7 LOEWE Center for Translational Biodiversity Genomics, Senckenberganlage 25, 60325 Frankfurt, Germany
8 Senckenberg Forschungsinstitut und Naturmuseum Frankfurt, Senckenberganlage 25, 60325 Frankfurt, Germany
9 Goethe-University, Faculty of Biosciences, Max-von-Laue-Str. 9, 60438 Frankfurt, Germany
10 Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, 10315 Berlin, Germany
11 TIB Leibniz Information Centre for Science and Technology, Welfengarten 1B, 30167 Hannover, Germany
12 TU Dresden, Institut für Allgemeine Psychologie, Biopsychologie und Methoden der Psychologie, Raum BZW A317, 01062 Dresden
bioRxiv preprint doi: https://doi.org/10.1101/2021.01.17.426960; this version posted January 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Corresponding authors: clara.stefen@senckenberg.de; heiko.stuckas@senckenberg.de
ORCID IDs
Michael Hiller: 0000-0003-3024-1449
Sylvia Ortmann: 0000-0003-2520-6251
Irina Ruf: 0000-0002-9728-1210
Clara Stefen: 0000-0001-79986-110X
Heiko Stuckas: 0000-0002-5690-0994
Lars Vogt: 0000-0002-8280-0487
Franziska Wagner: 0000-0001-6623-6700
Benjamin Peters: 0000-0002-2737-7006
Abstract
A new and uniquely structured matrix of mammalian phenotypes, MaTrics (Mammalian Traits for Comparative Genomics), is presented in digital form. By focussing on mammalian species for which genome assemblies are available, MaTrics provides an interface between mammalogy and comparative genomics.
MaTrics was developed as part of a project to link phenotypic differences between mammals to differences in their genomes using Forward Genomics. Apart from genomes, this approach requires information on homologous phenotypes that are numerically encoded (presence-absence; multistate character coding*) in a matrix. MaTrics provides these data, links them to at least one reference (e.g., literature, photographs, histological sections, CT-scans, or museum specimens) and makes them available in the machine-actionable NEXUS format. By making the data computer readable, MaTrics opens a new way for digitizing collections. Currently, MaTrics covers 147 mammalian species and includes 207 characters referring to structure, morphology, physiology, ecology and ethology. Researching these traits revealed substantial knowledge gaps, highlighting the need for substantial phenotyping efforts in the genomic era. Using the trait information documented in MaTrics, previous Forward Genomics screens identified changes in genes that are associated with various phenotypes, ranging from a fully aquatic lifestyle to dietary specializations. These results motivate the continuous expansion of phenotype information, both by filling research gaps and by adding additional taxa and traits. MaTrics is available online within the data repository Morph·D·Base (www.morphdbase.de).
Key words: hard tissue, visceral & life history traits, museum specimens, character states, numerical coding

Expressions indicated with an * are explained in the attached glossary.
Funding: This work is part of the interdisciplinary research project ‘Identifying genomic loci underlying mammalian phenotypic variability using Forward Genomics’, funded by the Leibniz Association, grant SAW-2016-SGN-2.
Conflicts of interest/Competing interests: None known for all authors.
Ethics approval: Not applicable, as no experiments with living animals have been performed.
Consent to participate: All authors agreed to the work.
Consent for publication: All authors agree to the publication.
Availability of data and material: Data are available within Morph·D·Base (www.morphdbase.de).
Code availability: Not applicable.
Authors' contributions: CSt, HS, MH had the idea for the manuscript; CSt, FW, HS drafted and finalized the manuscript; IR, PGi, MH, RH, UL, SO, TL, NT commented on and participated in writing the manuscript; MA, CSt did the study on cusps in teeth of Carnivora; BP, CSch, CSt, FW, GU, IR, MA, MJ, NT, RH, SO, TL, UL were involved in coding characters in MaTrics; PGr, LV provided Morph·D·Base, implemented tools therein and technically finalized MaTrics; FW compiled statistics for MaTrics; MR, PW did the statistical calculations on characters.
Introduction
Background
Knowing and understanding the organisms around us has always been important for mankind, and thus describing and comparing phenotypes has a long tradition that goes beyond the emergence of academic disciplines (e.g., Pruvost et al. 2011). The phenotype of an organism refers to its observable constituents, properties, and relations that can be considered to result from the interaction of the organism’s genotype with itself and its environment; it includes the anatomical organization of an organism, its physical properties, behaviour, ecological features, and lifestyle traits. These characterize an organism and contribute to biodiversity. Morphological* and anatomical* data usually make up the largest part of the phenotypic data available for a species. In mammalogy, specific skeletal, dental as well as body plan, visceral and physiological traits are traditionally used for differentiating species and for describing their inter- and intraspecific variability. Depending on preservation, this can also be applied to fossil species.
Advances in molecular biology and genetics over the last decades identified many genes and molecular mechanisms that are required for the development of many traits. This work relied primarily on studying model organisms such as the fruit fly (Drosophila melanogaster), the zebrafish (Danio rerio) or the mouse (Mus musculus). These models provided decisive insights into the genes behind basic developmental processes, including organ function and morphogenesis (Meunier 2012). Comparing developmental processes between model and non-model organisms opened the field of evolutionary developmental biology and explained the molecular basis of processes such as body plan evolution. However, there are limitations on what model organisms can tell us (Bolker 2012). For instance, insights from experiments on model organisms are restricted to the phenotypes present in that particular species. Rodents such as mice, for example, do not have canine teeth, making the mouse an inappropriate model to study the molecular mechanisms required to make canine teeth. Furthermore, even if model organism research revealed all genes that are associated with a given phenotype (e.g., the digestive system), it would remain unknown which of these genes were altered during evolution and contributed to phenotypic changes between related species (e.g., mammals that specialized on particular diets).
With the development of sequencing technologies, sequencing and assembly of whole genomes became possible; the first, that of the bacterium Haemophilus influenzae, was published in 1995 (Fleischmann et al. 1995), and the mouse genome followed only in 2002 (Waterston et al. 2002). Due to advancements in high-throughput DNA sequencing, there is an increasing number of species for which sequenced nuclear genomes are available (e.g., Genome 10K Community of Scientists 2009; Teeling et al. 2018; Feng et al. 2020; Zoonomia Consortium 2020). This wealth of genomes provides a basis for comparative genomics*, which is “defined as the comparison of biological information derived from whole-genome sequences” and, as a discipline and methodology, thus only started in 1995 (de Crécy-Lagard and Hanson 2018). While comparative genomics often aims at identifying genomic elements that are conserved across species and thus likely have an evolutionarily important function (Nobrega and Pennacchio 2004), it can also be used to detect differences in functional genomic elements and associate them with phenotypic differences between species. For example, targeted analyses of genes associated with the formation of dentin (DSPP) and enamel (AMTN, AMBN, ENAM, AMELX, MMP20) across Mammalia and Sauropsida (including Aves, Crocodylia, Testudines, Squamata) showed an association between the loss of these genes and the loss of teeth (Meredith et al. 2009, 2014). Another
example is the loss of chitinase genes (CHIAs), which encode enzymes that digest chitin; these losses occurred preferentially in mammalian species that have non-insectivorous diets (Emerling 2018).
Recent advances in comparative genomics follow the idea that convergent phenotypic evolution can be associated with convergent genomic changes, e.g., gene loss (Lamichhaney et al. 2019). This assumption is one conceptual foundation of the general Forward Genomics approach, which performs an unbiased screen for genomic changes that are associated with convergent phenotypic traits (Hiller et al. 2012; Prudent et al. 2016). This approach employs phenotype matrices and genome alignments to search for associations between convergent phenotypic traits and genomic signatures. These genomic signatures (e.g., candidate genes) need to be subjected to functional analyses to explain their association with the phenotype of interest. Forward Genomics identified many new links between genomic changes in genes as well as regulatory elements and various phenotypic changes, such as adaptations to fully aquatic lifestyles in cetaceans and manatees (Sharma et al. 2018a), echolocation in bats and toothed whales (Lee et al. 2018), reductions and losses of the mammalian vomeronasal system (Hecker et al. 2019a), the evolution of body armour in pangolins and armadillos (Sharma et al. 2018a), the absence of testicular descent (Sharma et al. 2018b), and the reduction of eyesight in subterranean mammals (Roscito et al. 2018; Langer and Hiller 2018).
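As a purely illustrative sketch of the kind of association such a screen looks for (not the published Forward Genomics implementation, which additionally relies on genome alignments and phylogenetic context), the following Python snippet compares a presence/absence trait vector with per-gene intact/lost calls and reports genes whose loss pattern matches the trait loss in every shared species. The function name, species labels and gene labels are hypothetical placeholders, and all values are invented.

```python
def matching_genes(trait, gene_status):
    """Return genes whose intact/lost pattern mirrors trait presence/absence.

    trait:       {species: 1 (trait present) or 0 (trait absent)}
    gene_status: {gene: {species: 1 (gene intact) or 0 (gene lost)}}
    """
    hits = []
    for gene, status in gene_status.items():
        shared = [sp for sp in trait if sp in status]
        # Require both trait states among the shared species; otherwise
        # a perfect match would be trivial.
        informative = {trait[sp] for sp in shared} == {0, 1}
        if informative and all(status[sp] == trait[sp] for sp in shared):
            hits.append(gene)
    return hits


# Invented toy data (hypothetical labels, not taken from MaTrics).
trait = {"sp1": 1, "sp2": 0, "sp3": 1, "sp4": 0}
gene_status = {
    "gene_A": {"sp1": 1, "sp2": 0, "sp3": 1, "sp4": 0},  # lost wherever the trait is absent
    "gene_B": {"sp1": 1, "sp2": 1, "sp3": 1, "sp4": 0},  # lost in only one such species
}
candidates = matching_genes(trait, gene_status)
```

Candidates found by such a simple comparison would still require the functional analyses described above before any biological interpretation.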
Development of MaTrics
A key requirement of Forward Genomics is comprehensive and digitally available phenotypic knowledge of the species considered in the genomic screen. However, in contrast to genomic data, phenotypic data are not readily available in a digitized form that can be used by computer programs, not even for well-characterized species such as mammals with sequenced genomes. Research in zoology and related fields has assembled a rich body of phenotypic knowledge. But the information assembled over centuries is usually documented in natural language, i.e. as text that is unstructured for computer programs, and so the information is not machine-actionable* (Vogt et al. 2010). Whereas researchers in zoology and related fields have used this form effectively, and still do, it limits research in other disciplines, where substantial time investments are required to search and extract relevant phenotypic data from published descriptions. As a result, this important cultural and scientific heritage is underutilized in genomics and some disciplines of biomedical research.
Here, we address the need for digitally available trait information by creating a phenotypic character matrix. Since the genome “encodes” all traits that have a genetic basis, genomes of many related species (such as mammals) enable Forward Genomics screens for many different traits with convergent changes. Thus, comprehensive information on many traits can be stored in matrix form, where rows represent species and columns represent traits.
Constructing a comprehensive phenotypic matrix poses several challenges. While “simple” phenotypes can be compiled relatively easily across several mammals, more complex phenotypes require experienced researchers in morphology, anatomy, physiology, veterinary science or related fields, since interpreting the collected information on phenotypes requires specialized knowledge of the terminology and taxon of interest. For example, the exact meaning of specialized terms might depend on the described taxon, the author, and the time of publication. Additionally, some terms might refer to spatio-structural properties, others to common function or presumed common evolutionary origin, or to a mixture of these. All this is readily understandable to experts, but difficult for non-experts and even more so for computer algorithms. Thus, integrating the information on phenotypes in machine-actionable form with other sources of data becomes exceedingly challenging and time-consuming (Lamichhaney et al. 2019; Vogt 2019). For integrative research, a way is sought to exploit that knowledge without involving experts in each project.
To enable simpler use (and exploitation) of expert knowledge, more and more information is being digitized, stored, and made accessible online, such as current journals or even older and classic books (Biodiversity Heritage Library). There are several databases for storing, editing and publishing information on phenotypes (mainly morphological ones) covering various taxa (Table 1). All of them have their own purpose and relevance, but none of them so far fulfils the requirements of Forward Genomics and other efforts to link phenotypic to genomic differences; most neither provide information on the same character across the selected species nor code the information numerically so that it is directly usable by other analysis programs. Tables of morphological traits compiled for phylogenetic analyses do fulfil these requirements, but do not focus on extant mammals with sequenced genomes. Furthermore, while inference of homologies in genomic data (nucleotide sequence alignment) is fully automated, homology analyses (character alignment) of phenotypic data cannot be executed by computer algorithms so far. This is irrespective of the type of basic
information available (digitized literature, 2D/3D scans of museum specimens). Matrices usable to link phenotypic differences between species to genomic loci first need to provide homology information.
To enable the full use of Forward Genomics, a trait matrix of mammalian phenotypes was developed that fulfils the above-mentioned criteria. Here, we introduce MaTrics (Mammalian Traits for Comparative Genomics), the first, newly established matrix providing information on phenotypic traits of mammals.
Design and coding* principles of MaTrics
MaTrics, version 1.0, release January 2021 (hereafter referred to simply as MaTrics), is implemented in the online data repository* Morph·D·Base (www.morphdbase.de, Grobe and Vogt 2009) and is publicly available.
Principles and data entry
MaTrics meets all requirements of Forward Genomics. We primarily focused on mammalian species for which genome sequences are available. Some basic principles of MaTrics are described here; a detailed user’s guide is available online (Wagner et al. 2020). Different types of phenotypic traits were considered (see below), and in each case homology was assured.
According to Sereno (2007), a (phenotypic) trait of an operational taxonomic unit (OTU; here the selected mammalian species) can be represented in a character statement that is composed of two parts, character and statement, and can be divided into four types of logical components (Sereno 2007: Table 4): one or more locators, a variable, and a variable
qualifier as parts of the character, and a character state as the statement. Not all of these components are needed in every case, but a locator and a character state are the minimum (representing character and statement). Thus, each character consists of at least one locator (L – the morphological structure, the structure bearing the trait) and the statement of the character state (v – mutually exclusive condition of a character) (Fig. 1). Specifying a locator and a character state is sufficient in the case of absent-present character statements* (numerically coded by 0/1; Fig. 1A, Table 2). Following Sereno’s (2007) coding scheme, each character in MaTrics is named with a label that starts with a single locator or a sequence of locators running from Ln to L1 (the trait-bearing structure), which provide all information necessary for unambiguously identifying and locating the trait within the OTU. The sequence of locators (Ln to L1 as illustrated in Fig. 1) in the character label is hierarchically organized. While Sereno (2007) developed his coding scheme primarily for structural traits, we extended it here and applied it also to ecological or behavioural traits.
If a phenotypic trait has several different expressions or patterns, it must be coded as a multistate character. According to Sereno (2007), the character part in a multistate character comprises not only the locator but additionally a variable (V – the aspect that varies) and a variable qualifier (q – the variable modifier). The character states of a multistate character in MaTrics are numerically coded by 2 to n (Fig. 1B, Table 2). For example, the height of the mandibular canine teeth in relation to the level of the occlusal height (averaged) of the cheek teeth is coded as short (2), occlusal height (3) or long (4) (Fig. 1B).
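The coding scheme can be sketched as a small data structure. The following Python example only illustrates the logic described above and is not the Morph·D·Base data model; the class and field names are assumptions, and the locator strings are simplified placeholders rather than the actual MaTrics character labels.

```python
from dataclasses import dataclass


@dataclass
class Character:
    locators: tuple      # Ln ... L1, ending with the trait-bearing structure
    states: dict         # numeric code -> character state (v)
    variable: str = ""   # V, the aspect that varies (multistate characters only)
    qualifier: str = ""  # q, the variable modifier (multistate characters only)

    def label(self):
        """Join locators, variable and qualifier into a character label."""
        extras = [part for part in (self.variable, self.qualifier) if part]
        return ", ".join(list(self.locators) + extras)

    def is_multistate(self):
        return bool(self.variable)

    def decode(self, code):
        """Translate a numeric code back into its character state."""
        return self.states[code]


# Absent-present character: locator plus states coded 0/1.
jugal = Character(locators=("os jugale",), states={0: "absent", 1: "present"})

# Multistate character: locators, variable and qualifier, states coded 2 to n.
# The locator wording is a placeholder for illustration.
canine = Character(
    locators=("mandible", "canine tooth"),
    variable="height",
    qualifier="relative to occlusal height of cheek teeth",
    states={2: "short", 3: "occlusal height", 4: "long"},
)
```

Keeping the numeric codes separate from the textual states mirrors the split between the character part and the statement part of a character statement.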
A key consideration when generating MaTrics was to clearly document the source(s) for each phenotypic entry. In MaTrics, the character part of each character statement therefore possesses a short textual definition that is taken from published sources (journals, textbooks, online references) and includes references to relevant ontology terms from various biomedical ontologies. The following online resources were used for identifying adequate terms: Ontology Lookup Service (OLS, https://www.ebi.ac.uk/ols/index, Jupp et al. 2015); Ontobee (https://www.ontobee.org, Xiang et al. 2011); and Bioportal (https://bioportal.bioontology.org, Musen et al. 2012). If no adequate definition was available, we provided our own definitions and clearly marked them as such.
The dimensions of MaTrics are defined by the number of rows (OTUs) and columns (characters), which result in a specific number of cells (rows × columns). These cells primarily contain the character states. Morph·D·Base enables the addition of further information such as references, photos, illustrations, or museum specimen IDs to each matrix cell. Every recorded character state, and thus every filled cell of MaTrics, is linked to at least one supporting reference. This refers either to citations from the literature (e.g., published journal articles, books, reliable scientific online resources) or to primary data sources. These data sources can be IDs of museum specimens or direct links to media (e.g., photographs; microscopic and electron microscopic (TEM and SEM) images; magnetic resonance imaging (MRI), computed tomography (CT, µCT), or even synchrotron data) that are directly uploaded to Morph·D·Base (MDB). As a result, researchers using MaTrics can trace the information to at least one original source.
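The rule that every filled cell carries at least one reference can be made explicit in code. The sketch below is a hypothetical in-memory representation, not the Morph·D·Base schema; all names, species labels and reference strings are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    state: int        # numeric character state code
    references: list  # citations, museum specimen IDs or media links

    def __post_init__(self):
        # Enforce the documentation rule: no coded state without a source.
        if not self.references:
            raise ValueError("a coded cell needs at least one supporting reference")


def completeness(cells, otus, characters):
    """Fraction of the rows x columns grid that holds a coded cell."""
    filled = sum((otu, ch) in cells for otu in otus for ch in characters)
    return filled / (len(otus) * len(characters))


otus = ["species_1", "species_2"]
characters = ["character_1", "character_2"]
cells = {
    ("species_1", "character_1"): Cell(1, ["placeholder citation"]),
    ("species_1", "character_2"): Cell(3, ["placeholder specimen ID"]),
    ("species_2", "character_1"): Cell(0, ["placeholder citation"]),
}
share_filled = completeness(cells, otus, characters)
```

Storing cells sparsely, keyed by (OTU, character), keeps uncoded cells distinguishable from cells coded as absent (0).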
Phenotypic traits included in MaTrics represent adult stages by default. Fetal structures, or traits that belong to perinatal or not yet fully grown stages, are explicitly indicated by placing “fetal” in front of the locator Ln in the character label. Traits referring to other ontogenetic stages could be considered in a similar way.
MaTrics as a whole or individual characters can be exported as a Nexus* file, which provides the data in a structured way and can be used as input for various software analysis tools.
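For orientation, a minimal NEXUS DATA block for numerically coded characters can be written with a few lines of Python. This is a sketch of the general file layout only; the exact content of a Morph·D·Base export may differ, the function name `to_nexus` is an assumption, and the taxa and states are invented.

```python
def to_nexus(matrix):
    """Render {taxon: [state or None, ...]} as a minimal NEXUS DATA block.

    None marks an uncoded cell and is written as the missing symbol '?'.
    """
    taxa = list(matrix)
    nchar = len(matrix[taxa[0]])
    if any(len(row) != nchar for row in matrix.values()):
        raise ValueError("all taxa need the same number of characters")
    symbols = sorted({s for row in matrix.values() for s in row if s is not None})
    if any(not 0 <= s <= 9 for s in symbols):
        raise ValueError("this sketch only handles single-digit state codes")
    symbol_text = "".join(str(s) for s in symbols)
    width = max(len(taxon) for taxon in taxa)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(taxa)} NCHAR={nchar};",
        f'  FORMAT DATATYPE=STANDARD SYMBOLS="{symbol_text}" MISSING=?;',
        "  MATRIX",
    ]
    for taxon in taxa:
        states = "".join("?" if s is None else str(s) for s in matrix[taxon])
        lines.append(f"    {taxon.ljust(width)}  {states}")
    lines += ["  ;", "END;"]
    return "\n".join(lines)


# Invented example: one absent-present and one multistate character.
example = {
    "Taxon_A": [0, 2],
    "Taxon_B": [1, None],
    "Taxon_C": [1, 4],
}
nexus_text = to_nexus(example)
```

Taxon names are written with underscores because whitespace separates tokens in NEXUS files.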
Specificities of MaTrics
The primary motivation for generating MaTrics was to create a research tool for linking phenotypic differences between species to differences in their genomes. This is the main reason why intraspecific variations of traits such as sexual dimorphism were not considered. Another specificity is that character states (presence/absence; multistate) do not encode character polarity. Researchers can decide for each project individually whether to determine and use polarity or not. The characters might be further analysed (e.g., polarity analyses using out-group comparison) if considered for phylogenetic studies or gene loss analyses. Finally, character dependencies were not specifically accounted for during the choice and
coding of traits. For each research question, specific characters of interest were added to MaTrics. Similarly, for different projects, characters can be selected individually and retrieved from MaTrics for other uses. Character dependencies can be avoided or reduced in this way, if needed.
Current status: MaTrics
To date, MaTrics contains 207 characters for 147 mammalian species, resulting in a total of 30,282 documented character states. 153 of the 207 characters (74%) are described as absent-present characters and the remaining 53 (26%) are multistate characters. The mammalian species considered in MaTrics include two representatives of Monotremata, five of Marsupialia and 140 of placental mammals (supplementary material Table S1). The number of species from each order represents neither the respective diversity nor the morphological disparity of mammalian orders, as the primary criterion for inclusion in MaTrics was the availability and suitable quality of whole genomes. The characters in MaTrics cover structural, ecological, ethological, and physiological phenotypic traits (Table 3). All but one character (organum vomeronasale) refer to the adult stage. For one character (os jugale), the recording is 100% complete, i.e. all cells contain coded and referenced character states. Some traits were specifically included for study in subsets of the listed mammals, and therefore their recording is purposely less complete (for coding status see Table S2).
Notes on application
The primary motivation for creating MaTrics was to provide fully referenced phenotypic information for applications in comparative genomics, especially the Forward Genomics approach. The creation and filling of MaTrics and studies applying Forward Genomics were developed in parallel within the mentioned project. Phenotypes coded in MaTrics were therefore already used successfully, in part as simpler and shorter tables, in earlier studies, e.g. by Sharma et al. (2018a), who identified various convergent gene losses associated with some specific convergent mammalian phenotypes. They showed convincingly that tooth and enamel loss are associated with the loss of ACP4 (a gene that is associated with the enamel disorder amelogenesis imperfecta), and that the presence of scales is associated with the loss of the gene DDB2 (which detects substances resulting from UV light and helps to induce DNA repair). The fully aquatic lifestyle is associated with the loss of MMP12, a gene associated
with breathing adaptation. The documented loss of these genes in some mammalian species is functionally explainable either as a consequence of trait loss (the genes ACP4 and DDB2 have no function after trait loss) or as a putative adaptive genomic alteration causing novel phenotypes (MMP12 loss is associated with novel lung functions in aquatic mammals) (Sharma et al. 2018a). Such results might help to better understand some related human diseases, as for example in the case of DDB2, whose mutations cause xeroderma pigmentosum, which manifests in hypersensitivity to sunlight (Rapic-Otric et al. 2003).
Another study investigated the gene losses associated with the reduction of the vomeronasal system (VNS) in several mammals. A comparison of 115 mammalian genomes confirmed that Trpc2 is an indicator for the functionality of the VNS (Hecker et al. 2019a). Moreover, it indicated a loss of functionality of the VNS in seals (Phocidae) and otters (Lutrinae). Morphological data are scarce for seals, and there are no data for otters (Hecker et al. 2019a; Zhang and Nikaido 2020). A study to test the accuracy of this prediction is under way. This study is an example of testing genotype-phenotype associations in non-model organisms and shows the potential of combining comparative morphological and genomic approaches.
However, the relevance of MaTrics is by no means restricted to the Forward Genomics approach. Characters were also included in MaTrics for use in a contemporaneous study exploring the evolutionary conditions associated with the loss of genes related to the convergent evolution of herbivorous and carnivorous diets in mammals (Hecker et al. 2019b). This study included 52 placental species and suggests that the lipase inhibitor gene PNLIPRP1 is preferentially lost in herbivores, whereas the xenobiotic receptor NR1I3 is preferentially lost in carnivores. Even though the authors put forward hypotheses, the lack of accessible data on mammalian diet preferences made it difficult to test whether gene losses are associated with dietary fat content and diet-related toxins. Investigating whether convergent gene loss is associated with similar dietary preferences may additionally hold information on whether gene losses might be adaptive (Albalat and Cañestro 2016). Consequently, an ongoing study records dietary categories in MaTrics that allow a semi-quantitative assessment of dietary fat content (associated with PNLIPRP1) and diet-related toxins (associated with NR1I3) (Wagner et al. ####). This study provided evidence that the convergent loss of both genes is associated with the convergent evolutionary change of dietary preferences, i.e. the consumption of a diet with reduced fat and toxin contents. The
hypotheses of Hecker et al. (2019b) could thus be refined, and the evolutionary setting could be
reconstructed.

Future analyses using MaTrics have the potential to test how gene losses and dietary
composition are related to the presence/absence of structures or organs associated with
digestive processes. Furthermore, MaTrics allows investigating whether evolutionary changes in
diet composition are associated not only with the loss/presence of single molecules (e.g., lipase
inhibitor, xenobiotic receptor), but also with changes in complex structures and their
associated genes. For instance, it is interesting to note that first statistical investigations
(methods given in document S3) have not yet shown a significant association between the
presence of a gall bladder and diet (p=0.74) or the lipase inhibitor gene PNLIPRP1
(p=0.49). This observation motivates the further development of MaTrics, i.e. by adding
further traits and species.
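As an illustration only (the methods actually used are those of document S3 and are not reproduced here), a test of association between two presence/absence characters could be sketched as follows. The sketch assumes a two-sided Fisher's exact test on a 2x2 contingency table; the species labels and codings are hypothetical, and such a simple test ignores the phylogenetic non-independence of species.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x observations in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as probable as the observed one
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


def contingency(trait, gene):
    """Cross-tabulate two 0/1-coded characters; species with missing data are skipped."""
    table = [[0, 0], [0, 0]]
    for species, trait_state in trait.items():
        gene_state = gene.get(species)
        if trait_state is None or gene_state is None:
            continue
        # Row 0 = trait present, row 1 = trait absent; same order for the gene columns
        table[1 - trait_state][1 - gene_state] += 1
    return table


# Hypothetical codings (1 = present/intact, 0 = absent/lost, None = not recorded)
gall_bladder = {"sp_A": 1, "sp_B": 1, "sp_C": 0, "sp_D": 0, "sp_E": 1, "sp_F": None}
pnliprp1 = {"sp_A": 1, "sp_B": 0, "sp_C": 0, "sp_D": 1, "sp_E": 1, "sp_F": 1}

table = contingency(gall_bladder, pnliprp1)
(a, b), (c, d) = table
p_value = fisher_exact_two_sided(a, b, c, d)
print(table, round(p_value, 3))
```

Species lacking a recorded state for either character are dropped before testing, so low recording status directly reduces the sample size available for such comparisons.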
407
These two studies show how genomic and morphological studies are entangled:
current knowledge of morphology serves as the basis for creating phenotypic trait matrices like
MaTrics, which themselves form the basis of genomic research, especially the
Forward Genomics approach. Hypotheses associated with findings of candidate loci may, in
turn, inspire further morphological research.

The most obvious applications are morphological studies. Although mammal dentitions
are well studied and a lot is known about tooth number, form, and shape, in particular in
relation to dietary specialization (see Thenius 1989; Hillson 2005; Ungar 2010), many gaps
in knowledge remain, e.g., concerning functional adaptations and evolutionary
transformations. Thus, Solé and Ladevèze (2017) aimed to put forward new ideas on how the
basic mammalian tribosphenic molar was transformed into sectorial teeth in hypercarnivorous
mammals. They included only carnivores, as defined by flesh-eating and the presence of
carnassial teeth: representatives of the living Carnivoramorpha
(including the extinct Nimravidae) and Dasyuromorphia, as well as of the extinct
Sparassodonta, Oxyaenodonta, and Hyaenodontida. Comparing the cusp
pattern/morphology of the upper and lower molars of these species, Solé and Ladevèze
(2017: fig. 4) derived a scheme for the morphological evolution of the sectorial teeth in
hypercarnivorous mammals. They also aimed at providing new arguments to discuss the
developmental aspects of the evolution of hypercarnivory by associating their morphological
observations with ontogenetic studies. The latter highlighted the importance of the expression
of ectodysplasin A (Eda): increased levels are able to modify the number, shape, and position
of cusps in mice during tooth development (Kangas et al. 2004). Further, Häärä et al.
(2012:3189) showed, again in mice, that “Fgf20 is a major downstream effector of Eda and
affects Eda-regulated characteristics of tooth morphogenesis, including the number, size, and
shape of teeth. Fgf20 function is compensated for by other Fgfs”. Inspired by the observations
and the model of Solé and Ladevèze (2017), we started a study with a subsample of Carnivora
(Table S3) collected in MaTrics with two aims: firstly, to test the suitability of MaTrics for
comparative morphological studies and, secondly, to lay the basis for genome-wide
searches for genomic causes correlated with the loss of cusps. This seems
promising given the development of new methods that include searches for regulatory elements
(see below).

For the selected Carnivora (Table S4), the absence or presence of individual tooth
cusps of the fourth upper premolar (P4) and all molar teeth was recorded in MaTrics. The
nomenclature of the cusps followed Thenius (1989; exemplified in Fig. 2). Detailed
descriptions of the cusp patterns for the species are given in the supplementary document S5, and
examples are illustrated in Fig. 3 and detailed in Table S6. Some of our results confirmed the
observations of Solé and Ladevèze (2017), who focused on carnivores as defined by the
presence of carnassials. We confirm that the parastyle and protocone of the P4 are generally
reduced in hypercarnivorous carnivorans. Interestingly, both structures are more reduced in
the Canidae and the polar bear (Ursus maritimus) than in the members of the Felidae and
Hyaenidae. Solé and Ladevèze (2017) reported that in the upper molars the protocone, paraconule,
and metaconule are reduced in hypercarnivorous mammals, which is also in line with our
findings. These structures are reduced in the Canidae and totally absent in the Felidae and
Hyaenidae. Solé and Ladevèze (2017) also found that the metaconid and talonid are generally
lost in hypercarnivorous mammals, especially felid-like and hyaenid-like hypercarnivores.
In our study, we found that the metaconid and talonid are completely reduced only in the
Felidae (except the cheetah, Acinonyx jubatus) and the spotted hyena (Crocuta crocuta). As
in the Canidae and the striped hyena (Hyaena hyaena), both structures are also present in
Ursus maritimus. The specialized hypercarnivorous diet of several Feliformia led to an
extreme reduction of the tribosphenic molar, whereas the Canidae and Ursus maritimus also
eat fruits and vegetables and therefore need crushing structures. The presence of protocone
and talonid seems to be necessary for an omnivorous diet (Solé and Ladevèze 2017), and
based on our study we can confirm that this is also true for herbivorous species (e.g., red
panda, Ailurus fulgens; giant panda, Ailuropoda melanoleuca).

Except for the Pacific walrus (Odobenus rosmarus), at least 10 specimens per species
were analysed (Table S3), and for several species, exceptions to the common pattern in the
presence of cusps were observed (Table 4). MaTrics was not designed to take intraspecific
variability into account; therefore, only the most common cusp pattern for each species was
recorded. Deviations from the cusp patterns are present in several cusps in the domestic dog
and the brown bear (Ursus arctos) and in one cusp in the red fox (Vulpes vulpes). Such exceptions
are important as they might indicate evolutionary trends. However, variation within a species
cannot be reflected in MaTrics, as at most one character state is given for each character,
representative of a species. Only in this way can the (common) absence or presence of a
trait be compared with the genome of, again, one representative of a species. Studies on
the intraspecific variability of certain characters would need additional matrices with different
aims.
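The reduction of specimen-level observations to one representative state per species can be illustrated with a short sketch. It assumes that the representative state is simply the most frequent one among the examined specimens; the function name, specimen identifiers, and scores are hypothetical and do not describe the workflow actually used for MaTrics.

```python
from collections import Counter


def consensus_state(observations):
    """Return the most common state and the specimens that deviate from it."""
    counts = Counter(observations.values())
    ranked = counts.most_common()
    modal_state, modal_count = ranked[0]
    # A tie between states cannot be resolved automatically
    if len(ranked) > 1 and ranked[1][1] == modal_count:
        raise ValueError("tie between character states; needs expert judgement")
    deviating = sorted(s for s, state in observations.items() if state != modal_state)
    return modal_state, deviating


# Hypothetical specimen-level scores for one cusp (1 = present, 0 = absent)
specimens = {f"spec_{i:02d}": 1 for i in range(1, 11)}
specimens["spec_07"] = 0

state, exceptions = consensus_state(specimens)
print(state, exceptions)
```

Keeping the list of deviating specimens alongside the consensus state preserves the information on exceptions, even though only the consensus state enters the matrix.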
478
479
Conclusion and Future Perspectives
Recent advances in molecular techniques have led to a rapid increase in the assembly and
publication of genomes from various organisms. However, knowledge of genome
sequences is only a first step towards understanding the relationships between genomic changes,
the phenotype of an organism, and phenotypic differences between organisms (Hardison
2003). The systematic description of phenotypic information in matrix form, as in MaTrics, is
necessary to understand genomic information and to address questions related to
evolutionary biology and biomedicine. This is not restricted to mammals, as the coding
principles of MaTrics, which comply with the requirements of molecular research, can serve
as a template for matrices comprising trait knowledge of other vertebrate and non-vertebrate
groups. The establishment of trait matrices for various taxa could lead to a broad
documentation of phenotypes for applications in comparative genomics and, hence, enable a
systematic exploration of genotype-phenotype associations.

However, trait collections such as MaTrics have also revealed a tremendous research gap in
phenotypic data. In fact, filling MaTrics with information on different phenotypic traits across
mammals showed that detailed information on structural, physiological, or life history traits
is often not available for many species, even after intensive literature research. For example,
reductions of the vomeronasal system (VNS) are clearly documented in several mammals, and
our previous genomic comparison of 115 mammalian genomes uncovered several genes
whose loss is associated with a reduced or non-functional VNS (Hecker et al. 2019a). This
genomic screen also revealed that seals (Phocidae) and otters (Lutrinae) have lost some of
these genes, indicating a reduced VNS. However, to the best of our knowledge, information
concerning the vomeronasal organ of Phocidae and Lutrinae is not available. Indeed, the
recording status in MaTrics for the character “vomeronasal organ” with the states
absent/present is only 37%. Another example of a character that would be assumed to be
well known is the absence/presence of the gall bladder (“Vesica bilaris”), with a recording
status of 70%. In other words, the recording status of the characters in MaTrics
demonstrates the lack of information on phenotypic traits in several species. These research
gaps can only be filled by specimen-based research (e.g. Thier and Stefen 2020). Although
individual studies are valuable scientific contributions, they may not suffice to close the
substantial research gaps in a short time. The authors see the need for more basic zoological
research complementing the systematic exploration of the genomic basis of biodiversity, i.e.
research activities on biodiversity genomics could be assisted by research initiatives on
biodiversity phenomics (= systematically phenotyping animals in matrices like MaTrics).
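To make the notion of recording status concrete, the following minimal sketch computes it for one character. It assumes that recording status is the percentage of species in the matrix for which a state has been recorded; this definition, the function name, and the example codings are assumptions made for illustration.

```python
def recording_status(states):
    """Percentage of species with a recorded (non-missing) state for one character."""
    if not states:
        raise ValueError("no species in matrix")
    recorded = sum(1 for state in states.values() if state is not None)
    return 100.0 * recorded / len(states)


# Hypothetical column of a trait matrix: 1 = present, 0 = absent, None = not recorded
character_column = {"sp_A": 1, "sp_B": None, "sp_C": 0, "sp_D": None, "sp_E": 1}
status = recording_status(character_column)
print(f"{status:.0f}%")
```

Computed per character, such a value points to the traits for which specimen-based research is most needed.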
515
Most of the genomic studies mentioned above identified protein-coding genes
associated with complex body plan changes (e.g., aquatic and aerial lifestyle of cetaceans and
bats, respectively). However, evolutionary theory predicts that changes in cis-regulatory
genetic elements are probably more important for morphological changes than changes in
protein-coding genes. For instance, Roscito et al. (2018) stated that the loss of morphological
traits is (often) associated with the decay of cis-regulatory elements. Consequently, the Forward
Genomics approach has been further developed to include methodologies that can
associate phenotypes with the loss or presence of regulatory elements (e.g., Langer et al.
2018; Langer and Hiller 2019). In awareness of these developments, the phenotype matrix
presented here already provides a large number of morphological characters that will be
subject to further exploration in the near future. Thus, the phenotypic information compiled in
MaTrics will be of increasing importance. This applies, for instance, to the characters referring
to tooth morphology and tooth cusps discussed above. In fact, tooth characters are known to be the
result of a complex signalling network involving temporally graded activation and deactivation of
genes controlled by regulatory elements (e.g., Jernvall and Thesleff 2000; Thesleff et al.
2001).

A last aspect to be mentioned refers to the way phenotypic information is
documented. So far, filling MaTrics with information is still mostly done by hand;
experienced scientists have to check the content and verify homology. However, some
recent developments may open the door to a partial automation of this work. One is the
implementation of ontologies and semantic phenotypes in the platform Morph·D·Base, for which
the development of a semantic description module has already been initiated (Vogt and Baum
2019; Vogt 2019). This is expected to allow the development of computer algorithms that mine
data on homologous structures to establish matrices more automatically (Vogt 2018).

MaTrics is a new and unique data collection of phenotypic traits of mammalian
species. By including homologous phenotypic traits across (an increasing number of) species,
MaTrics and similar matrices can serve as a basis for a variety of research fields, as illustrated
herein. The recorded phenotypic traits are well defined and fully referenced (characters as
well as the character state for each species). Not only literature data are accepted for the latter,
but also references to specimens in collections, which contributes in a specific way to the
digitization of collection material. MaTrics data are directly useful in genomic studies since
the character states are numerically coded and hence can be exported as a NEXUS file that is
machine-actionable. The scientific potential of digitized phenotype matrices is apparent and
motivates thinking about future development.
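What a numerically coded matrix looks like once serialized can be sketched with a minimal NEXUS DATA block. The taxon labels and character states below are hypothetical, the function name is an assumption, and the layout of the file actually exported from Morph·D·Base may differ; the sketch also assumes single-digit state codes.

```python
def to_nexus(matrix, missing="?"):
    """Serialize {taxon: [state, ...]} into a minimal NEXUS DATA block."""
    taxa = sorted(matrix)
    n_char = len(matrix[taxa[0]])
    if any(len(matrix[t]) != n_char for t in taxa):
        raise ValueError("all taxa must have the same number of characters")
    # Collect the state symbols that actually occur in the matrix
    symbols = "".join(
        sorted({str(s) for t in taxa for s in matrix[t] if s is not None})
    )
    width = max(len(t) for t in taxa)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(taxa)} NCHAR={n_char};",
        f'  FORMAT DATATYPE=STANDARD MISSING={missing} SYMBOLS="{symbols}";',
        "  MATRIX",
    ]
    for taxon in taxa:
        # Unrecorded states are written with the missing-data symbol
        row = "".join(missing if s is None else str(s) for s in matrix[taxon])
        lines.append(f"  {taxon.ljust(width)}  {row}")
    lines += ["  ;", "END;"]
    return "\n".join(lines)


# Hypothetical matrix: two characters coded 0 = absent, 1 = present, None = not recorded
example = {
    "Species_A": [1, 0],
    "Species_B": [0, 0],
    "Species_C": [None, 1],
}
nexus = to_nexus(example)
print(nexus)
```

Because every state is a single symbol in a fixed column, downstream software can read one character across all species without parsing free text.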
552
553
554
Acknowledgement
We want to thank members of the German Society of Mammalogy (DGS) for stimulating
discussions at the annual DGS meeting 2019, which were useful in shaping the manuscript.
Also, the helpful comments of the reviewers (….) are gratefully acknowledged.
Funding
This work was funded by the interdisciplinary research project ‘Identifying genomic loci
underlying mammalian phenotypic variability’ by the Leibniz Association, grant
SAW-2016-SGN-2. MH was supported by the Max Planck Society and the LOEWE-Centre for
Translational Biodiversity Genomics (TBG) funded by the Hessen State Ministry of Higher
Education, Research and the Arts (HMWK).

References
Albalat R, Cañestro C (2016) Evolution by gene loss. Nature Rev Genet 17:379–391
Asher RJ (2007) A web-database of mammalian morphology and a reanalysis of placental
phylogeny. BMC Evol Biol 7:108. https://doi.org/10.1186/1471-2148-7-108
Bolker J (2012) Model organisms: There's more to life than rats and flies. Nature
491(7422):31
De Crécy-Lagard V, Hanson AD (2018) Comparative Genomics. Reference Module in
Biomedical Sciences.
https://www.sciencedirect.com/topics/neuroscience/comparative-genomics
Edmunds RC, Su B, Balhoff JP, Dahdul WM, Lapp H, Lundberg JG, Vision TJ, Dunham RA,
Mabee PM, Westerfield M (2016) Phenoscape: Identifying candidate genes for
species-specific phenotypes. Molec Biol Evol 33:13–24. doi:10.1093/molbev/msv223
Emerling CA, Delsuc F, Nachman MW (2018) Chitinase genes (CHIAs) provide genomic
footprints of a post-Cretaceous dietary radiation in placental mammals. Science
Advances 4(5):eaar6478
Feng S, Stiller J, Deng Y, Armstrong J, et al, Zhang G (2020) Dense sampling of bird
diversity increases power of comparative genomics. Nature 587:252–257.
https://doi.org/10.1038/s41586-020-2873-9
Fleischmann R, Adams M, White O, Clayton R, Kirkness E, Kerlavage A, Bult C, Tomb J,
Dougherty B, Merrick J, et al. (1995) Whole-genome random sequencing and
assembly of Haemophilus influenzae Rd. Science 269(5223):496–512.
doi:10.1126/science.7542800
Genome 10K Community of Scientists (2009) Genome 10K: a proposal to obtain whole
genome sequence for 10 000 vertebrate species. J Hered 100(6):659–674
Grobe P, Vogt L (2009) Documenting Morphology: Morph·D·Base. In: Wägele JW,
Bartolomaeus T (eds) Deep Metazoan Phylogeny: The Backbone of the Tree of Life –
New Insights from Analyses of Molecules, Morphology, and Theory of Data Analysis.
De Gruyter, Berlin, pp 475–503. http://www.morphdbase.de
Häärä O, Harjunmaa E, Lindfors PH, Huh SH, Fliniaux I, Åberg T, Jernvall J, Ornitz DM,
Mikkola ML, Thesleff I (2012) Ectodysplasin regulates activator-inhibitor balance in
murine tooth development through Fgf20 signaling. Development 139(17):3189–3199
Hardison RC (2003) Comparative genomics. PLoS Biol 1(2):e58
Harrow JL, Steward CA, Frankish A, Gilbert JG, Gonzalez JM, Loveland JE, et al, Wilming
LG (2014) The vertebrate genome annotation browser 10 years on. Nucleic Acids Res
42(D1):D771–D779
Hecker N, Lächele U, Stuckas H, Giere P, Hiller M (2019a) Convergent vomeronasal system
reduction in mammals coincides with convergent losses of calcium signalling and
odorant degrading genes. Mol Ecol 28(16):3656–3668
Hecker N, Sharma V, Hiller M (2019b) Convergent gene losses illuminate metabolic and
physiological changes in herbivores and carnivores. Proc Natl Acad Sci USA
116(8):3036–3041
Hiller M, Schaar BT, Indjeian VB, Kingsley DM, Hagey LR, Bejerano G (2012) A “Forward
genomics” approach links genotype to phenotype using independent phenotypic losses
among related species. Cell Reports 2(4):817–823
Hillson S (2005) Teeth. Cambridge University Press, Cambridge
Huelsmann M, Hecker N, Springer MS, Gatesy J, Sharma V, Hiller M (2019) Genes lost
during the transition from land to water in cetaceans highlight genomic changes
associated with aquatic adaptations. Science Advances 5(9):eaaw6671
Jernvall J (2000) Linking development with generation of novelty in mammalian teeth. Proc
Natl Acad Sci USA 97(6):2641–2645
Jernvall J, Thesleff I (2000) Reiterative signalling and patterning during mammalian tooth
morphogenesis. Mech Dev 92(1):19–29
Jupp S, Burdett T, Leroy C, Parkinson HE (2015) A new Ontology Lookup Service at
EMBL-EBI. In: Malone J et al. (eds) Proceedings of SWAT4LS International Conference
2015, pp 118–119
Kangas AT, Evans AR, Thesleff I, Jernvall J (2004) Nonindependence of mammalian dental
characters. Nature 432(7014):211–214
Lamichhaney S, Card DC, Grayson P, Tonini JF, Bravo GA, Näpflin K, et al, Sackton TB
(2019) Integrating natural history collections and comparative genomics to study the
genetic architecture of convergent evolution. Phil Trans R Soc B
374(1777):20180248
Langer BE, Hiller M (2019) TFforge utilizes large-scale binding site divergence to identify
transcriptional regulators involved in phenotypic differences. Nucleic Acids Res
47(4):e19
Langer BE, Roscito JG, Hiller M (2018) REforge associates transcription factor binding site
divergence in regulatory elements with phenotypic differences between species. Mol
Biol Evol 35(12):3027–3040
Lecocq T, Benard A, Pasquet A, Nahon S, Ducret A, Dupont-Marin K, Lang I, Thomas M
(2019) TOFF, a database of traits of fish to promote advances in fish aquaculture.
Scientific Data 6(1):1–5
Lee JH, Lewis KM, Moural TW, Kirilenko B, Bogdanova B, Prange G, Koessl M,
Huggenberger S, Kang C, Hiller M (2018) Molecular parallelism in fast-twitch muscle
proteins in echolocating mammals. Science Advances 4(9):eaat9660
Meredith RW, Gatesy J, Murphy WJ, Ryder OA, Springer MS (2009) Molecular decay of the
tooth gene enamelin (ENAM) mirrors the loss of enamel in the fossil record of
placental mammals. PLoS Genet 5(9):e1000634
Meredith RW, Gatesy J, Springer MS (2013) Molecular decay of enamel matrix protein genes
in turtles and other edentulous amniotes. BMC Evol Biol 13(1):20
Meunier R (2012) Stages in the development of a model organism as a platform for
mechanistic models in developmental biology: Zebrafish, 1970–2000. Studies in
History and Philosophy of Science Part C: Studies in History and Philosophy of
Biological and Biomedical Sciences 43:522–531
Musen MA, Noy NF, Shah NH, Whetzel PL, Chute CG, Story MA, Smith B, NCBO team
(2012) The National Center for Biomedical Ontology. J Am Med Inform
Assoc 19:190–5. Epub 2011
Nobrega MA, Pennacchio LA (2004) Comparative genomic analysis as a tool for biological
discovery. J Physiol 554(1):31–39
O'Leary MA, Kaufman S (2011) MorphoBank: phylophenomics in the “cloud”. Cladistics
27:1–9
Pavey SA, Bernatchez L, Aubin-Horth N, Landry CR (2012) What is needed for
next-generation ecological and evolutionary genomics? Trends Ecol Evol 27(12):673–678
Porter IH (1973) From gene to phene. J Invest Dermatol 60(6):360–368
Pruvost M, Bellone R, Benecke N, Sandoval-Castellanos E, Cieslak M, Kuznetsova T,
Morales-Muñiz A, O'Connor T, Reissmann M, Hofreiter M, Ludwig A (2011)
Genotypes of predomestic horses match phenotypes painted in Paleolithic works of
cave art. Proc Natl Acad Sci USA 108(46):18626–18630.
doi:10.1073/pnas.1108982108
Prieto-Marquez A, Erickson GM, Seltmann K, Ronquist F, Riccardi GA, Maneva-Jakimoska
C, et al, Deans A (2007) Morphbank, an avenue to document and disseminate
anatomical data: phylogenetic and paleohistological test cases. J Morph
268:1120–1120
Prudent X, Parra G, Schwede P, Roscito JG, Hiller M (2016) Controlling for phylogenetic
relatedness and evolutionary rates improves the discovery of associations between
species’ phenotypic and genomic differences. Molec Biol Evol 33(8):2135–2150
Roscito JG, Sameith K, Parra G, Langer BE, Petzold A, Moebius C, Bickle M, Rodrigues
MT, Hiller M (2018) Phenotype loss is associated with widespread divergence of the
gene regulatory landscape in evolution. Nat Commun 9:737.
https://doi.org/10.1038/s41467-018-0712
Rosenthal N, Brown S (2007) The mouse ascending: perspectives for human-disease models.
Nature Cell Biol 9:993–999
Ruzicka L, Bradford YM, Frazer K, Howe DG, Paddock H, Ramachandran S, Singer A, Toro
S, Van Slyke CE, Eagle AE, Fashena D, Kalita P, Knight J, Mani P, Martin R, Moxon
SA, Pich C, Schaper K, Shao X, Westerfield M (2015) ZFIN, the Zebrafish Model
Organism Database: Updates and new directions. Genesis 53(8):498–509
Schulz S, Jansen L (2013) Formal ontologies in biomedical knowledge representation. IMIA
Yearb Med Inform 8(1):132–46
Schulz S, Stenzhorn H, Boekers M, Smith B (2007) Strengths and limitations of formal
ontologies in the biomedical domain. Electron J Commun Inf Innov Health 3(1):31–45
Sereno PC (2007) Logical basis for morphological characters in phylogenetics. Cladistics
23:565–587
Sharma V, Hecker N, Roscito JG, Foerster L, Langer BE, Hiller M (2018a) A genomics
approach reveals insights into the importance of gene losses for mammalian
adaptations. Nat Commun 9:1215. https://doi.org/10.1038/s41467-018-03667-1
Sharma V, Lehmann T, Stuckas H, Funke L, Hiller M (2018b) Loss of RXFP2 and INSL3
genes in Afrotheria shows that testicular descent is the ancestral condition in placental
mammals. PLoS Biol 16:e2005293
Smith B (2003) Ontology. In: Floridi L (ed) Blackwell Guide to the Philosophy of Computing
and Information. Blackwell Publishing, Oxford, pp 155–166
Solé F, Ladevèze S (2017) Evolution of the hypercarnivorous dentition in mammals
(Metatheria, Eutheria) and its bearing on the development of tribosphenic molars. Evol
Dev 19(2):56–68
Teeling EC, Vernes SC, Dávalos LM, Ray DA, Gilbert MTP, Myers E, Bat1K Consortium
(2018) Bat biology, genomes, and the Bat1K project: to generate chromosome-level
genomes for all living bat species. Annu Rev Anim Biosci 6:23–46
Thenius E (1989) Zähne und Gebiss der Säugetiere. Handbuch der Zoologie, volume 8,
Mammalia, part 56. Walter de Gruyter, Berlin
Thesleff I, Keranen S, Jernvall J (2001) Enamel knots as signaling centers linking tooth
morphogenesis and odontoblast differentiation. Advances Dent Res 5(1):14–18
Thier N, Stefen C (2020) Morphological and radiographic studies on the skull of the
straw-coloured fruit-bat Eidolon helvum (Chiroptera: Pteropodidae). Vertebrate Zoology
70(4). https://doi.org/10.26049/VZ70-4-2020-05
Vaughan TA, Ryan JM, Czaplewski NJ (2015) Chapter 4: Classification of Mammals.
In: Mammalogy (Sixth ed.)
Vogt L (2018) The logical basis for coding ontologically dependent characters. Cladistics
34(4):438–458
Vogt L (2019) Organizing phenotypic data: a semantic data model for anatomy. J Biomed
Semant 10:12. https://doi.org/10.1186/s13326-019-0204-6
Vogt L, Bartolomaeus T, Giribet G (2010) The linguistic problem of morphology: structure
versus homology and the standardization of morphological data. Cladistics
26:301–325
Vogt L, Baum R (2019) Using named graphs and knowledge graph template patterns for
efficiently organizing FAIR anatomy data and metadata. Biodiversity Information
Science and Standards 2019. doi:10.3897/biss.3.37205
Vogt L, Baum R, Bhatty P, Köhler C, Meid S, Quast B, et al. (2019) SOCCOMAS: a FAIR
web content management system that uses knowledge graphs and that is based on
semantic programming. Database 2019(baz067):1–22
Wagner F, Peters B, Giere P, Grobe P, Hofmann R, Jähde M, Lächele U, Lehmann T,
Ortmann S, Ruf I, Schiffmann C, Stefen C, Stuckas H, Thier N, Unterhitzenberger G,
Vogt L (2020) How to use Mammalian Traits for Comparative Genomics (MaTrics)
Design Principles of a project trait matrix in Morph·D·Base. URL will follow
Wagner F, Ruf I, Hofmann R, Lehmann T, Ortmann S, Schiffmann C, Hiller M, Stefen C,
Stuckas H (####) Convergent evolutionary changes in mammalian composition are
associated to convergent gene loss: a case study for the lipase inhibitor PNLIPRP1 and
the xenobiotic receptor NR1I3
Waterston RH, Lindblad-Toh K, Birney E, et al. (2002) Initial sequencing and comparative
analysis of the mouse genome. Nature 420(6915):520–562. doi:10.1038/nature01262.
PMID 12466850
Wilson DE, Reeder DM (2005) Mammal species of the world: a taxonomic and geographic
reference. 3rd Ed, Johns Hopkins University Press, Baltimore
Xiang Z, Mungall C, Ruttenberg A, He Y (2011) Ontobee: A Linked Data Server and
Browser for Ontology Terms. Proceedings of the 2nd International Conference on
Biomedical Ontologies (ICBO), July 28-30, 2011, Buffalo, NY, USA, pp 279–281.
http://ceur-ws.org/Vol-833/paper48.pdf
Zhang Z, Nikaido M (2020) Inactivation of ancV1R as a predictive signature for the loss of
vomeronasal system in mammals. Genome Biol Evol 12(6):766–778
Zoonomia Consortium: Genereux DP, Serres A, Armstrong J, Johnson J, Marinescu V, Murén
E, et al, Damas J (2020) A comparative genomics multitool for scientific discovery
and conservation. Nature 587(7833):240–245.
https://www.nature.com/articles/s41586-020-2876-6
747
748
Glossary
Anatomy – "The demonstrable facts of animal structure, or also, by transference to the object,
the structure or even the tissue of the animal itself." (Snodgrass 1951:173). In other words,
anatomy is the part of the phenotype of an organism that refers to its physical and structural
properties. At the same time, the term refers to the science of anatomy, with anatomical data being
facts about the anatomy of organisms.

Character Coding – The parameterized description of a quality or relation of an operational
taxonomic unit.

Character Statement – see Sereno 2007

Data repository – A large database infrastructure that collects, manages, and stores data sets
for data analysis, sharing, and reporting. A data repository is also known as a data library or
data archive. NCBI GenBank is an example of a data repository for sequence data.

Machine-actionable – Data and metadata that are structured in a formalized and consistent
way so that machines (i.e. computers) can read and use them with algorithms that were
programmed against this structure. Machine-actionability of data and metadata includes, for
instance, the use of persistent identifiers for data creators (e.g. ORCIDs), organizations, and
funding agencies, but also open accessibility of data for machines through a corresponding
application programming interface (API), and basic semantics that allow algorithms to
distinguish different categories of information and apply rules to them. Machine-actionability
in this sense goes beyond machine-readability, which only requires data and metadata to be
readable by a machine, i.e. data and metadata must be provided in a machine-readable format.
Machine-readability does not necessarily require data and metadata to provide basic semantics that allow algorithms to distinguish the different categories of information contained in them.

Morphology – “Our philosophy or science of animal form, a mental concept derived from evidence based on anatomy and embryogeny, usually incapable of proof, attempting to discover structural homologies and to explain how animal organization has come to be as it is.” (Snodgrass 1951:173). In other words, morphology refers to the interpretations of anatomical facts within theories and hypotheses such as homology.

NEXUS file – A file format widely used in bioinformatics. It stores information about taxa, phenotypic characters, trees, and other information relevant for phylogenetics. Several phylogenetic programs such as PAUP, MrBayes, and MacClade use this format.
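
To make the structure of such a file concrete, the following Python sketch writes a minimal NEXUS DATA block for a small absent/present matrix. The three species and their codings for the P4 cusps (Ps, Pa, Pr, Ms, Hy) are taken from the Fig. 3 matrix of this manuscript. The function name `to_nexus`, the underscore-separated taxon labels and the choice of `DATATYPE=STANDARD` are assumptions made for illustration only; the sketch does not describe how MaTrics itself exports data.

```python
def to_nexus(matrix, symbols="01", missing="?", gap="-"):
    """Return a minimal NEXUS DATA block for a dict {taxon: [state, ...]}.

    Hypothetical helper for illustration; not part of MaTrics.
    """
    lengths = {len(states) for states in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa must have the same number of characters")
    nchar = lengths.pop()
    width = max(len(name) for name in matrix)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={nchar};",
        f'  FORMAT DATATYPE=STANDARD MISSING={missing} GAP={gap} SYMBOLS="{symbols}";',
        "  MATRIX",
    ]
    for name, states in matrix.items():
        # One row per taxon: padded label followed by the concatenated states.
        row = "".join(str(state) for state in states)
        lines.append(f"    {name.ljust(width)}  {row}")
    lines += ["  ;", "END;"]
    return "\n".join(lines)


# P4 cusps in the order Ps, Pa, Pr, Ms, Hy (values from the Fig. 3 matrix).
p4_cusps = {
    "Ailurus_fulgens": [1, 1, 1, 1, 1],
    "Canis_lupus": [0, 1, 1, 1, 0],
    "Leptonychotes_weddellii": [0, 1, 0, 0, 0],
}

nexus_text = to_nexus(p4_cusps)
print(nexus_text)
```

In this sketch the symbols "?" and "-" are declared as the missing and gap symbols, which would correspond to the states "missing" and "inapplicable" used in MaTrics.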

Phenotypic trait – A particular part of the phenotype of an organism. The phenotype of an organism refers to its observable constituents, properties, and relations that can be considered to result from the interaction of the organism’s genotype with itself and its environment. Anatomy is the part of the phenotype that refers to the physical and structural properties of the organism.

Ontology – Ontologies are dictionaries that can be used for describing a certain reality. They consist of labeled classes and relations between classes, both with clear definitions that are ideally created by experts through consensus and that are formulated in a highly formalized canonical syntax and standardized format with the goal to yield a lexical or taxonomic framework for knowledge representation (Smith 2003). Each ontology class and relation (also called property) possesses its own Uniform Resource Identifier (URI*) through which it can
be identified and individually referenced. Ontologies contain expert-curated domain knowledge about specific kinds of entities together with their properties and relations in the form of classes defined through universal statements (Schulz et al. 2009, Schulz and Jansen 2013). Ontologies in this sense do not include statements about particular entities (i.e., empirical data). (Vogt et al. 2019)

URI – A Uniform Resource Identifier (URI) is a string of characters that follows a specific structure and unambiguously identifies a particular resource. The URI can also serve as a URL (web address), whose host name can be resolved to an IP address (see the example URI below).
http://purl.obolibrary.org/obo/CL_0000255 (for eukaryotic cell)
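
As a small illustration of the "specific structure" of such an identifier, the Python sketch below splits the example URI into its parts with the standard library. The function name `describe_obo_uri` is hypothetical, and the compact "prefix:number" form it derives is the shorthand commonly used for OBO ontology terms; it is shown here as an assumption about how the example might be abbreviated, not as something stated in this manuscript.

```python
from urllib.parse import urlparse


def describe_obo_uri(uri):
    """Split an OBO-style URI into scheme, host, local ID and compact form.

    Hypothetical helper for illustration only.
    """
    parts = urlparse(uri)
    # The last path segment is the local identifier, e.g. "CL_0000255".
    local_id = parts.path.rsplit("/", 1)[-1]
    # Ontology prefix and running number are separated by the last underscore.
    prefix, _, number = local_id.rpartition("_")
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "local_id": local_id,
        "compact": f"{prefix}:{number}",
    }


info = describe_obo_uri("http://purl.obolibrary.org/obo/CL_0000255")
print(info)
```

The host part is what a resolver would look up, while the local identifier is what distinguishes one ontology class from another.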

Tables

Table 1 Examples of data repositories in which phenotypic data of different vertebrate taxa are collected. The table lists the projects with their URL, their aim and/or the type of information that is stored and, if available, the references in which the project is introduced.

| Project | Link | Aim, type of information | Reference |
|---|---|---|---|
| Morphobank | http://www.morphobank.org | Homology of phenotypes over the web; building the Tree of Life with phenotypes, publicly accessible containing images and matrices | O'Leary and Kaufmann 2011 |
| Digimorph | http://www.digimorph.org | A National Science Foundation Digital Library at The University of Texas at Austin, a dynamic archive that holds high-resolution X-ray computed tomography of biological specimens | |
| Morphbank | http://www.morphbank.net | A continuously growing database of biological imaging that stores images which scientists use for international collaboration, research and education | |
| Morphological Image database | http://people.pwf.cam.ac.uk/rja58/database/morphsite_bmc07.html | | Asher 2007 |
| Phenoscape | http://kb.phenoscape.org | Data resource that is ontology-driven and contains information about mutant zebrafish (Danio rerio) phenotypes curated by the zebrafish model organism database, ZFIN at http://zfin.org | Ruzicka et al. 2015; Edmunds et al. 2015 |
| TOFF | http://toff-project.univ-lorraine.fr | An open source repository focusing on fish functional traits. It aims to combine behavioural, morphological, phenological, and physiological traits with environmental measurements | Lecocq et al. 2019 |

Table 2 The numerical coding options for (A) absent/present and (B) multistate characters in MaTrics. The numerals 0 and 1 refer to the character states ‘absent’ and ‘present’, thus, the coding for multistate character states starts at 2 and continues to

| State | State name | Description |
|---|---|---|
| (A) Absent/present characters | | |
| ? | missing | Information is missing |
| - | inapplicable | Refers to traits which are part of a structural complex which is absent in a species (e.g., a/p recording of roots in a toothless species) |
| 0 | absent | Absence of the trait |
| 1 | present | Presence of the trait |
| (B) Multistate characters | | |
| ? | missing | Information is missing |
| - | inapplicable | Refers to traits which are part of a structural complex which is absent in a species (e.g., trait “prehensile tail” in a tailless species) |
| 2 | state 2 | Lowest expression (or absence) of the character variable |
| 3, 4, 5, ..., n | state 3, 4, 5, … n | Each different state of increasing expression of the character variable, either nominal or scaled, is given with a number starting with 3 |
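
The coding rules of Table 2 can be sketched as a small validation routine. The following Python example is only an illustration under stated assumptions: the function name `decode_cell`, the representation of cell values as strings, and the labels "a/p" and "m" for the character type (borrowed from the character labels described in Fig. 1) are choices made here and do not describe the internal data model of MaTrics.

```python
def decode_cell(code, kind):
    """Translate a cell code into a state name following Table 2.

    code: cell content as a string, e.g. "?", "-", "0", "1", "2", "3".
    kind: "a/p" for absent/present characters, "m" for multistate characters.
    Hypothetical helper for illustration only.
    """
    # "?" and "-" have the same meaning for both character types.
    if code == "?":
        return "missing"
    if code == "-":
        return "inapplicable"
    if kind == "a/p":
        states = {"0": "absent", "1": "present"}
        if code in states:
            return states[code]
        raise ValueError(f"invalid a/p code: {code!r}")
    if kind == "m":
        # 0 and 1 are reserved for absent/present, so multistate codes start at 2.
        if code.isdecimal() and int(code) >= 2:
            return f"state {int(code)}"
        raise ValueError(f"invalid multistate code: {code!r}")
    raise ValueError(f"unknown character type: {kind!r}")


print(decode_cell("1", "a/p"))
print(decode_cell("-", "m"))
print(decode_cell("3", "m"))
```

Rejecting 0 and 1 for multistate characters mirrors the statement in the table caption that these two numerals are reserved for the states absent and present.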

Table 3 Gross categories of the 206 characters included in MaTrics and number of characters in these categories
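
| Gross category | Subcategory | n |
|---|---|---|
| Anatomy/Morphology | | 164 |
| | Body plan | 1 |
| | Cranial skeleton | 30 |
| | Dentition | 94 |
| | Gastrointestinal tract | 5 |
| | Head | 4 |
| | Integument | 3 |
| | Postcranial skeleton | 26 |
| | Sense organs | 1 |
| Ecology | | 30 |
| Ethology | | 5 |
| Physiology | | 6 |
| Embryonic | | 1 |
| Total | | 206 |
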
Table 4 Deviations in cusp patterns in the studied Carnivora. Abbreviations: M1–3, upper (indicated by number in superscript)/lower molar tooth (indicated by subscript); P4 – upper 4th premolar

| Species | Deviation from common cusp pattern for species |
|---|---|
| Canis familiaris | Metaconid and hypoconid at M3 |
| | Small cusp mesial of paracone at P4 |
| | Entoconulid (mesial of entoconid) at M1 |
| | Additional fourth lower molar |
| Vulpes vulpes | Small cusp mesial of paracone at P4 |
| Ursus arctos | Second cusp palatinal at P4 |
| | Third lingual cusp at M2 |
| | Three metaconid-cusps at M2 |
| | Third palatinal cusp at M1 |

Figure legends

Fig. 1 Schematic illustration showing how phenotypic traits are reflected in character statements and in the character labels in MaTrics. The basic nomenclature is based on Sereno (2007: table 4, Scheme 3), see A1 and B1. A) Illustrates the structure for characters which can be described with only two character states: absent and present. B) Illustrates the structure for characters which require more than two character states (multistate characters). A2 and B2 give the terminology for the examples from MaTrics named in A3/B3. Sereno’s (2007) terminology recognizes character statements (CS) consisting of characters (C) and statements (S). The character is represented by a (list of) locators (Ln, …L1; hierarchically organized and forming the structure tree) and optionally the variable (V) and the variable qualifier (q). The different expressions of the variable are given as character states (v0, … to vn) representing the statement. A4/B4 are examples of how this nomenclature is given in the character label in MaTrics. The character states are defined in the “states” field and assigned to each cell of MaTrics. Whether a character can be described by the two states absent/present or by several states is indicated in the character label by the addition [a/p] and [m], respectively.

Fig. 2 Some examples for the presence of cusps in the studied Carnivora: A) the spotted hyaena Crocuta crocuta MTD B4936, B) the red panda Ailurus fulgens MTD B17478, C) the giant panda Ailuropoda melanoleuca ZMB_Mam_17246 and D) the Weddell seal Leptonychotes weddellii MTD B5029. For each species the upper P4 and molars (1, 2) and lower molars (3, 4) are illustrated as present and the cusps labelled. The teeth are photographed in lateral (1, 3) and occlusal (2, 4) view. Abbreviations alphabetically: End – entoconid, Enld – entoconulid, Hy – hypocone, Hyd – hypoconid, Hyld – hypoconulid, Me – metacone, Mec – metaconule, Med – metaconid, Mes – mesostyle, Ms – metastyle, Pa – paracone, Pac – paraconule, Pad – paraconid, Pr – protocone, Prd – protoconid and Ps – parastyle

Fig. 3 The presence and absence of cusps in P4 and M1 exemplified in A) the spotted hyaena Crocuta crocuta, B) the red panda Ailurus fulgens, C) the giant panda Ailuropoda melanoleuca and D) the Weddell seal Leptonychotes weddellii (teeth illustrated in Fig. 2). Abbreviations as they appear in the table: Ps – parastyle, Pa – paracone, Pr – protocone, Ms – metastyle, Hy – hypocone, Mes – mesostyle, Me – metacone, Mec – metaconule, and Pac – paraconule

Supplementary Material

Table S1 List of the mammal species allocated to order, sorted alphabetically, included in MaTrics so far (as of December 2020)

Table S2 Recording status of MaTrics as of December 2020.
A) Recording progress of the 30,282 cells for the specific character traits (absent/present, a/p; or multistate, m) as well as missing and inapplicable. Missing is the default setting and can mean that a) the cell has not been treated, b) the information on the character state for the taxon is not known, or c) the information is known but currently not retrievable (for example, a specimen is held in a distant collection). The number of relevant cells as well as the percentage is given.
B) Recording progress of the 206 characters for the 147 mammalian species included in MaTrics. The table lists the number of characters (traits) which are recorded to 100% (i.e., for all species), and to at least 75% and 50% of the species, respectively.

A

| Character states | n | % |
|---|---|---|
| Recorded data (a/p and m) | 18,389 | 60.7 |
| missing | 8,982 | 29.7 |
| inapplicable | 2,911 | 9.6 |

B

| Recording progress (147 species) | n (traits) | % (of total number of traits) |
|---|---|---|
| 100% (all species) | 1 | 0.5 |
| ≥ 75% (≥ 111 species) | 71 | 34 |
| ≥ 50% (≥ 74 species) | 129 | 63 |
| < 50% (< 74 species) | 77 | 37 |
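
The figures in Table S2 follow from simple arithmetic, which the short Python sketch below reproduces from the numbers given in the table and its caption. The variable names are chosen here for illustration; the only assumption is that the species thresholds are the smallest whole numbers of species reaching 75% and 50% of 147.

```python
import math

n_characters = 206
n_species = 147

# Part A: every character/species combination is one cell of the matrix.
total_cells = n_characters * n_species
cell_counts = {"recorded": 18389, "missing": 8982, "inapplicable": 2911}
assert sum(cell_counts.values()) == total_cells
cell_percent = {
    state: round(100 * n / total_cells, 1) for state, n in cell_counts.items()
}

# Part B: smallest number of species that reaches the 75% and 50% thresholds.
min_species_75 = math.ceil(0.75 * n_species)
min_species_50 = math.ceil(0.50 * n_species)

# Characters recorded for at least 50% and for fewer than 50% of the species
# together make up all characters.
trait_counts = {"100%": 1, ">=75%": 71, ">=50%": 129, "<50%": 77}
assert trait_counts[">=50%"] + trait_counts["<50%"] == n_characters
trait_percent = {
    level: round(100 * n / n_characters, 1) for level, n in trait_counts.items()
}

print(total_cells, cell_percent)
print(min_species_75, min_species_50, trait_percent)
```

Part B of the table reports the trait percentages rounded to whole numbers, apart from the first row.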

Supplementary Material document S3 Brief description of statistical methods, samples and observed p-values mentioned in the text

Supplementary Material Table S4 Species and the assigned material studied in the different collections (SNSD – Senckenberg Naturhistorische Sammlungen Dresden, MfN – Museum für Naturkunde Berlin) for the test study on Carnivora.

Supplementary Material document S5 Description of the tooth cusp patterns in 20 selected Carnivora.

Supplementary Material Table S6 Absence (0) and presence (1) of the analyzed cusps in the studied teeth of the carnivoran species

[Fig. 1 image: panels A1–A4 and B1–B4]
[Fig. 2 image; extracted panel labels: C1–C4, D1–D4]
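[Fig. 3 data matrix: presence (1) and absence (0) of cusps in P4 and M1; X as given in the figure. The figure additionally carries the cusp group labels "Trigon" and "Talon" and the row group labels Canoidea and Feloidea.]

| Family | Genus | Species | Diet | P4 Ps | P4 Pa | P4 Pr | P4 Ms | P4 Hy | M1 Ps | M1 Mes | M1 Ms | M1 Pa | M1 Pr | M1 Me | M1 Pac | M1 Mec | M1 Hy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ailuridae | Ailurus | fulgens | herbivorous | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Canidae | Canis | lupus | carnivorous | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| Canidae | Canis | familiaris | carnivorous | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| Canidae | Vulpes | vulpes | carnivorous | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Mustelidae | Mustela | putorius | omnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| Phocidae | Leptonychotes | weddellii | piscivorous | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| Procyonidae | Bassariscus | astutus | omnivorous | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Procyonidae | Nasua | nasua | omnivorous | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Procyonidae | Procyon | lotor | omnivorous | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Odobenidae | Odobenus | rosmarus | | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Ursidae | Ailuropoda | melanoleuca | herbivorous | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 |
| Ursidae | Ursus | arctos | omnivorous | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 |
| Ursidae | Ursus | maritimus | hypercarnivorous | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 |
| Felidae | Acinonyx | jubatus | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Felidae | Felis | catus | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Felidae | Panthera | leo | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Felidae | Panthera | tigris | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Felidae | Puma | concolor | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| Hyaenidae | Crocuta | crocuta | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Hyaenidae | Hyaena | hyaena | hypercarnivorous | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |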