About
143 Publications
31,910 Reads
4,275 Citations
Publications (143)
Recent advances in protein structure prediction have generated accurate structures of previously uncharacterized human proteins. Identifying domains in these predicted structures and classifying them into an evolutionary hierarchy can reveal biological insights. Here, we describe the detection and classification of domains from the human proteome....
Transmembrane proteins (TMPs), with diverse cellular functions, are difficult targets for structural determination. Predictions of TMPs and the locations of transmembrane segments using computational methods could be unreliable due to the potential for false positives and false negatives and show inconsistencies across different programs. Recent ad...
Comparative analyses of genomic data reveal further insights into the phylogeny and taxonomic classification of butterflies presented here. As a result, 2 new subgenera and 2 new species of Hesperiidae are described: Borna Grishin, subgen. n. (type species Godmania borincona Watson, 1937) and Lilla Grishin, subgen. n. (type species Choranthus lilli...
Analyses of whole genomic shotgun datasets, COI barcodes, morphology, and historical literature suggest that the following 13 butterfly species from the family Hesperiidae (Lepidoptera: Papilionoidea) in Texas, USA are distinct from their closest named relatives and therefore are described as new (type localities are given in parenthesis): Spicauda...
The recent breakthroughs in structure prediction, where methods such as AlphaFold demonstrated near-atomic accuracy, herald a paradigm shift in structural biology. The 200 million high-accuracy models released in the AlphaFold Database are expected to guide protein science in the coming decades. Partitioning these AlphaFold models into domains and...
PARylation plays critical roles in regulating multiple cellular processes such as DNA damage response and repair, transcription, RNA processing, and stress response. More than 300 human proteins have been found to be modified by PARylation on acidic residues, i.e., Asp (D) and Glu (E). We used the deep-learning tool AlphaFold to predict protein-pro...
The discovery that a skipper butterfly Telegonus fulgerator (Walch, 1775), previously placed in the genus Astraptes Hübner, [1819], is a complex of many similar-looking species-level taxa with different COI barcodes, caterpillar foodplants and body patterns, and subtle differences in adult phenotypes raised a question about which species is the ori...
We obtained whole genome shotgun sequence reads for a number of Firetip skippers (subfamily Pyrrhopyginae), including all known species from the genera Yanguna Watson, 1893 and Gunayan Mielke, 2002 and representative species of Pyrrhopyge Hübner, [1819]. Phylogenetic analysis of their protein-coding regions unexpectedly revealed that Yanguna tetric...
Protein‐protein interactions (PPIs) are involved in almost all essential cellular processes. Perturbation of PPI networks plays critical roles in tumorigenesis, cancer progression and metastasis. While numerous high‐throughput experiments have produced a vast amount of data for PPIs, these datasets suffer from high false positive rates and exhibit...
New taxa in Hesperiidae (Lepidoptera: Papilionoidea) are traditionally proposed after inspection of male genitalia, which largely form the basis for Hesperiidae taxonomy. However, with genomic DNA sequencing, even a single female specimen can be placed in a phylogenetic context of existing classification and taxonomically assigned with confidence....
The mitochondrial DNA COI barcode segment sequenced from American Anthocharis specimens across their distribution ranges partitions them into four well-separated species groups and reveals different levels of differentiation within these groups. The lanceolata group experienced the deepest divergence. About 2.7% barcode difference separates the two...
The comparative genomics of butterflies yields additional insights into their phylogeny and classification that are compiled here. As a result, 3 genera, 5 subgenera, 5 species, and 3 subspecies are proposed as new, i.e., in Hesperiidae: Antina Grishin, gen. n. (type species Antigonus minor O. Mielke, 1980), Pompe Grishin and Lamas, gen. n. (type s...
The recent breakthroughs in structure prediction, where methods such as AlphaFold demonstrated near-atomic accuracy, herald a paradigm shift in structural biology. The 200 million high-accuracy models released in the AlphaFold Database are expected to guide protein science in the coming decades. Partitioning these AlphaFold models into domains and s...
During the last 10 years, the Erythrina stem borer moth, Terastia meticulosalis, emerged as a pest of cultivated coral trees (Erythrina spp.) in California. Erythrina trees are valued for their moderate drought resistance and beautiful flame‐like flowers. They are beloved enough to be considered Los Angeles's official “City Tree.” Thus, they are a...
Motivation
Recent development of deep-learning methods has led to a breakthrough in the prediction accuracy of 3-dimensional protein structures. Extending these methods to protein pairs is expected to allow large-scale detection of protein-protein interactions and modeling protein complexes at the proteome level.
Results
We applied RoseTTAFold and...
Bacterial conjugation is the fundamental process of unidirectional transfer of DNA, often plasmid DNA, from a donor cell to a recipient cell [1]. It is the primary means by which antibiotic resistance genes spread among bacterial populations [2,3]. In Gram-negative bacteria, conjugation is mediated by a large transport apparatus, the conjugative type IV...
Significance
Using the domain and operon organization of VtrA/VtrC, combined with fold predictions, we identify co-component signal transduction systems in enteric bacteria that likely regulate virulence. We observe that the heterodimeric VtrA/VtrC periplasmic bile acid receptor controlling the Vibrio parahaemolyticus type 3 secretion system 2 is a...
ATP‐binding cassette (ABC) systems, characterized by ABC‐type nucleotide‐binding domains (NBDs), play crucial roles in various aspects of human physiology. Human ABCG5 and ABCG8 form a heterodimeric transporter that functions in the efflux of sterols. We used sequence similarity search, multiple sequence alignment, phylogenetic analysis, and struct...
Bacterial signal transduction systems sense changes in the environment and transmit these signals to control cellular responses. The simplest one-component signal transduction systems include an input sensor domain and an output response domain encoded in a single protein chain. Alternately, two-component signal transduction systems transmit signal...
We propose a higher classification of the lycaenid hairstreak tribe Eumaeini – one of the youngest and most species‐rich butterfly tribes – based on autosome, Lepidopteran Z sex chromosome and mitochondrial protein‐coding genes. The subtribe Neolycaenina Korb is a synonym of Callophryidina Tutt and subtribe Tmolusina Bálint is a synonym of Strephon...
Our expanded efforts in genomic sequencing to cover additional skipper butterfly (Lepidoptera: Hesperiidae) species and populations, including primary type specimens, call for taxonomic changes to restore monophyly and correct misidentifications by moving taxa between genera and proposing new names. Reconciliation between phenotypic characters and...
We present an analysis of the names proposed by Carl Plötz in 1884 for the New World species in the genus Pyrgus Hübner, [1819] facilitated by the genomic sequencing of extant primary type specimens comparatively with a larger sample of more recently collected specimens of these species and their relatives. The changes to nomenclature suggested her...
Two new skipper butterfly (Hesperiidae) species are described from the United States: Staphylus floridus Grishin, sp. n. (type locality in Florida, Volusia Co.) and Staphylus ecos Grishin, sp. n. (type locality in Texas, Brewster Co.). They are cryptic and hence escaped recognition. They differ from their sister species by the relative size and mor...
Protein-protein interactions play critical roles in biology, but despite decades of effort, the structures of many eukaryotic protein complexes are unknown, and there are likely many interactions that have not yet been identified. Here, we take advantage of recent advances in proteome-wide amino acid coevolution analysis and deep-learning-based str...
Our previous comments in opposition to Case 3709 (Calhoun et al., 2020) cited our genomics studies, which were then available only in preprint form (Cong et al., 2019). We now announce the publication of our results in the journal Molecular Biology and Evolution and provide the associated citation.
Recently diverged butterfly populations in North America have been found to exhibit high levels of divergence on the Z chromosome relative to autosomes, as measured by the fixation index, F_ST. The pattern of divergence appears to result from accumulation of incompatible alleles, obstructing introgression on the Z chromosome in hybrids (i.e., the large‐Z e...
Deep learning takes on protein folding
In 1972, Anfinsen won a Nobel prize for demonstrating a connection between a protein’s amino acid sequence and its three-dimensional structure. Since 1994, scientists have competed in the biennial Critical Assessment of Structure Prediction (CASP) protein-folding challenge. Deep learning methods took center st...
DeepMind presented remarkably accurate protein structure predictions at the CASP14 conference. We explored network architectures incorporating related ideas and obtained the best performance with a 3-track network in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and in...
Emesis eleanorae Gallardo & Grishin n. sp. is described from western Honduras. It differs from other species of Emesis Fabricius, 1807 in having a row of prominent iron-gray crescent-shaped postdiscal spots on both wings above, outlined by paler areas basad and mirrored as merlot-colored spots below, with the largest by the forewing costa, and in i...
Significance
Caterpillars of Eumaeus butterflies eat toxic plants and are impacted by their toxins. Despite the ancient origins of cycads, the association of cycads and Eumaeus is recent. Following a switch to feeding on cycads, Eumaeus evolved cluster egg-laying and conspicuously colored, gregarious caterpillars. Eumaeus then split into two fast e...
Centuries of zoological studies have amassed billions of specimens in collections worldwide. Genomics of these specimens promises to reinvigorate biodiversity research. However, because DNA degrades with age in historical specimens, it is a challenge to obtain genomic data for them and analyze degraded genomes. We developed experimental and computa...
Continuing with comparative genomic exploration of worldwide butterfly fauna, we use all protein-coding genes as they are retrieved from the whole genome shotgun sequences for phylogeny construction. Analysis of these genome-scale phylogenies projected onto the taxonomic classification and the knowledge about butterfly phenotypes suggests further r...
Hibernating mammals exhibit medically relevant phenotypes, but the genetic basis of hibernation remains poorly understood. Using the meadow jumping mouse (Zapus hudsonius), we investigated the genetic underpinnings of hibernation by uniting experimental and comparative genomic approaches. We assembled a Z. hudsonius genome and identified widespre...
Hibernating mammals exhibit medically relevant phenotypes, but the genetic basis of hibernation remains poorly understood. Using the meadow jumping mouse (Zapus hudsonius), we investigated the genetic underpinnings of hibernation by uniting experimental and comparative genomic approaches. We assembled a Z. hudsonius genome and identified widespread...
Closely related species of butterfly sampled from southern suture zones in North America exhibit a continuous pattern of gene flow and population difference measures (index values) for autosomes, but not for the Z chromosome; when populations are compared through their Z chromosomes, index values obtained from samples of the same species are separa...
Our previous comments in opposition to Case 3709 (Calhoun et al., 2019) cited some of our ongoing genomic studies, which were made public on 4 September 2019 (Cong et al., 2019). As promised, we now provide additional details about this work. We are pleased that Scott et al. (2019) agreed with the principal conclusions of our research (based on pre...
Malaza fastuosus is a lavishly patterned skipper butterfly from a genus that has three described species, all endemic to the mainland of Madagascar. To our knowledge, M. fastuosus has not been collected for nearly 50 years. To evaluate the power of our techniques to recover DNA, we used a single foreleg of an at least 140-year-old holotype specimen...
We obtained whole genome shotgun sequences and phylogenetically analyzed protein-coding regions of representative skipper butterflies from the genus Carcharodus Hübner, [1819] and its close relatives. Type species of all available genus-group names were sequenced. We find that species attributed to four exclusively Old World genera (Spialia Swinhoe...
Delineating species boundaries in phylogenetic groups undergoing recent radiation is a daunting challenge akin to discretizing continuity. Here, we propose a general approach exemplified by American butterflies from the genus Junonia Hübner notorious for the variety of similar phenotypes, ease of hybridization, and the lack of consensus about their...
Further genomic sequencing of butterflies by our research group expanding the coverage of species and specimens from different localities, coupled with genome-scale phylogenetic analysis and complemented by phenotypic considerations, suggests a number of changes to the names of butterflies, mostly those recorded from the United States and Canada. H...
Studies of life rely on classifying organisms into species. However, since Darwin, there has been no agreement about how to separate species from varieties. Contrary to a frequent belief, quantitative standards for species delineation are lacking and overdue, and debates about species delimitation create obstacles for conservation biology, agriculture, l...
Never before have we had the luxury of choosing a continent, picking a large phylogenetic group of animals, and obtaining genomic data for its every species. Here, we sequence all 845 species of butterflies recorded from North America north of Mexico. Our comprehensive approach reveals the pattern of diversification and adaptation occurring in this...
Genomic sequencing and analysis of worldwide skipper butterfly (Lepidoptera: Hesperiidae) fauna points to imperfections in their current classification. Some tribes, subtribes and genera as they are circumscribed today are not monophyletic. Rationalizing genomic results from the perspective of phenotypic characters suggests two new tribes, two new...
This paper presents morphological and DNA sequence analyses of Brevianta saphonota in order to validate its placement in the current taxonomy.
We obtained and phylogenetically analyzed whole genome shotgun sequences of nearly all species from the tribe Emesidini Seraphim, Freitas & Kaminski, 2018 (Riodinidae) and representatives from other Riodinidae tribes. We see that the recently proposed genera Neoapodemia Trujano, 2018 and Plesioarida Trujano & García, 2018 are closely allied with Ap...
Centuries of zoological studies amassed billions of specimens in collections worldwide. Genomics of these specimens promises to rejuvenate biodiversity research. The obstacles stem from DNA degradation with specimen age. Overcoming this challenge, we set out to resolve a series of long-standing controversies involving a group of butterflies. We ded...
This paper reports the evaluation of predictions for the "CALM1" challenge in the 5th round of the Critical Assessment of Genome Interpretation held in 2018. In the challenge, the participants were asked to predict effects on yeast growth caused by missense variants of human calmodulin, a highly conserved protein in eukaryotic cells sensing calcium...
We obtained and analyzed whole genome data for more than 160 representatives of skipper butterflies (family Hesperiidae) from all known subfamilies, tribes and most distinctive genera. We found that two genera, Katreus Watson, 1893 and Ortholexis Karsch, 1895, which are sisters, are well-separated from all other major phylogenetic lineages and orig...
Biologists marvel at the powers of adaptive convergence, when distantly related animals look alike. While mimetic wing patterns of butterflies have fooled predators for millennia, entomologists inferred that mimics were distant relatives despite similar appearance. However, the obverse question has not been frequently asked. Who are the close relat...
We believe the actions requested by Case 3709 are premature and unnecessary, as it is now possible to determine the taxonomic identities of ancient specimens through revolutionary methods of genomic analysis. We are nearing the completion of a groundbreaking, multi-year research project to determine and analyze genomic sequences of a number of name...
For centuries, biologists have used phenotypes to infer evolution. For decades, a handful of gene markers have given us a glimpse of the genotype to combine with phenotypic traits. Today, we can sequence entire genomes from hundreds of species and gain yet closer scrutiny. To illustrate the power of genomics, we have chosen skipper butterflies (Hes...
Giant-Skippers (Megathymini) are unusual thick-bodied, moth-like butterflies whose caterpillars feed inside Yucca roots and Agave leaves. Giant-Skippers are attributed to the subfamily Hesperiinae and they are endemic to southern and mostly desert regions of the North American continent. To shed light on the genotypic determinants of their unusual...
Since its accidental introduction to Massachusetts in the late 1800s, the European gypsy moth (EGM; Lymantria dispar dispar) has become a major defoliator in North American forests. However, in part because females are flightless, the spread of the EGM across the United States and Canada has been relatively slow over the past 150 years. In contras...
We obtained and analyzed whole genome shotgun sequences of all 845 species of butterflies recorded from Canada and the United States. Genome-scale phylogenetic trees constructed from the data reveal several non-monophyletic genera and suggest improved classification of species included in these genera. Here, these changes are formalized and 2 subge...
The exponential growth of genomic variants uncovered by next-generation sequencing necessitates efficient and accurate computational analyses to predict their functional effects. A number of computational methods have been developed for the task, but few unbiased comparisons of their performance are available. To fill the gap, The Critical Assessme...
Supplementary material is available on the publisher’s web site along with the published article.
Significance
Thirteen years of mitochondrial DNA barcoding of 15,000+ species of Lepidoptera and their parasitoids living in Area de Conservación Guanacaste, northwestern Costa Rica, indicate several thousand cases where barcodes combined with ecology suggest unrecognized cryptic species, substantially increasing species counts. Here, we show that...
Sequencing complete genomes of all major phylogenetic groups of organisms opens unprecedented opportunities to study evolution and genetics. We report draft genomes of Calephelis nemesis and Calephelis virginiensis, representatives of the family Riodinidae. They complete the genomic coverage of butterflies at the family level. At 809 and 855 Mbp, r...
Background:
The Hoary Edge Skipper (Achalarus lyciades) is an eastern North America endemic butterfly from the Eudaminae subfamily of skippers named for an underside whitish patch near the hindwing edge. Its caterpillars feed on legumes, in contrast to Grass skippers (subfamily Hesperiinae) which feed exclusively on monocots.
Results:
To better...
Acute hepatopancreatic necrosis disease (AHPND) is a newly emerging shrimp disease that has severely damaged the global shrimp industry. AHPND is caused by toxic strains of Vibrio parahaemolyticus that have acquired a "selfish plasmid" encoding the deadly binary toxins PirAvp/PirBvp. To better understand the repertoire of virulence factors in AHPND...
Background: Giant-Skipper butterflies from the genus Megathymus are North American endemics. These large and thick-bodied Skippers resemble moths and are unique in their life cycles. Grub-like at the later stages of development, caterpillars of these species feed and live inside yucca roots. Adults do not feed and are mostly local, not straying far...
Two species of hairstreak butterflies from the genus Calycopis are known in the United States: C. cecrops and C. isobeon. Analysis of mitochondrial COI barcodes of Calycopis revealed cecrops-like specimens from the eastern US with atypical barcodes that were 2.6% different from either USA species, but similar to Central American Calycopis species....
Giant-Skipper butterflies from the genus Agathymus (family Hesperiidae) are unusual as their caterpillars feed inside Agave leaves. Relationships among Agathymus taxa and their names (i.e. if they are species, subspecies, or synonyms) are poorly understood due to phenotypic similarity. DNA sequences are promising to clarify the taxonomic questions,...
We assembled a complete mitochondrial genome of a unique Australian skipper butterfly Euschemon rafflesia (Hesperiidae) from next generation sequencing reads. The 15,447 bp mitogenome covers 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), 2 ribosomal RNA genes (rRNAs), and an A + T-rich region. Its gene order is typical for mitogenom...
We assembled a complete mitogenome of an Asian skipper butterfly Burara striata (Hesperiidae, Coeliadinae), the first representative of the genus Burara, from next generation sequencing reads. The 15,327 bp mitogenome covers 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), 2 ribosomal RNA genes (rRNAs), and an A + T-rich region. Its ge...
The Small Cabbage White (Pieris rapae) is originally a Eurasian butterfly. After being accidentally introduced into North America, Australia, and New Zealand a century or more ago, it spread throughout the continents and rapidly established itself as one of the most abundant butterfly species. Although it is a serious pest of cabbage and other mustard family p...