Jerome C. Regier's research while affiliated with University of Maryland, College Park and other places

Publications (182)

Article
Full-text available
Gracillariidae are one of the most diverse families of internally feeding insects, and many species are economically important. Study of this family has been hampered by lack of a robust and comprehensive phylogeny. In the present paper, we sequenced up to 22 genes in 96 gracillariid species, representing all previously recognized subfamilies and...
Article
Full-text available
Major progress has been made recently toward resolving the phylogeny of Noctuoidea, the largest superfamily of Lepidoptera. However, numerous questions and weakly supported nodes remain. In this paper we independently check and extend the main findings of multiple recent authors by performing maximum-likelihood analyses of 5–19 genes (6.7–18.6 kb)...
Article
The Gelechioidea (>18 000 species), one of the largest superfamilies of Lepidoptera, are a major element of terrestrial ecosystems and include important pests and biological model species. Despite much recent progress, our understanding of the classification, phylogeny and evolution of Gelechioidea remains limited. Building on recent molecular stud...
Article
Within the insect order Lepidoptera (moths and butterflies), the so-called nonditrysian superfamilies are mostly species-poor but highly divergent, offering numerous synapomorphies and strong morphological evidence for deep divergences. Uncertainties remain, however, and tests of the widely accepted morphological framework using other evidence are...
Article
The Tineoidea are the earliest-originating extant superfamily of the enormous clade Ditrysia, whose 152 000+ species make up 98% of the insect order Lepidoptera. Though more diverse than all non-ditrysian superfamilies put together (3719 vs 2604 species), the tineoids are not especially species-rich among ditrysian superfamilies. Their phylogenetic...
Data
List of taxon subsets used to generate (by deletion) new data sets with reduced numbers of taxa. (DOC)
Data
Full-text available
Synopsis of genes sequenced. (PDF)
Data
Full-text available
Bootstrap results based on analysis of taxon-depleted nt123_degen1 data sets. (PDF)
Data
Nexus-formatted tree file that encodes the topology (with branch lengths) of highest likelihood recovered in our analysis of the nt123 data set for 483 taxa and 19 genes with mask characters already excluded. The species codenames are identified by their complete genus-species names in Table S3. (TRE)
Data
Bootstrap results based on analysis of taxon-depleted nt123 data sets. (PDF)
Data
Nexus-formatted data set that includes nucleotide sequence data (nt123) for 483 taxa and 19 genes with the ambiguously aligned characters already excluded (14658 characters total). Sets of characters are defined and listed immediately after the data matrix. This data set can be degenerated using the degen1 script available at http://www.phylotools....
Data
Nexus-formatted tree file that encodes the topology (with branch lengths) of highest likelihood recovered in our analysis of the nt123_degen1 data set for 483 taxa and 19 genes with mask characters already excluded. The species codenames are identified by their complete genus-species names in Table S3. (TRE)
Data
Full-text available
Absolute number of unambiguous nucleotides (bp) per gene in each taxon, plus summary statistics. (PDF)
Data
Full-text available
Maximum likelihood tree in phylogram format, with bootstrap values, based on analysis of the nt123_degen1 data set for 483 taxa and 19 genes. A condensed cladogram version is shown in Figure 2. Terminal taxa are labeled by their generic names. Higher-level classification names are also included. The 63 tineoid test taxa are each identified by three...
Data
Full-text available
Maximum likelihood tree in phylogram format, with bootstrap values, based on analysis of the nt123 data set for 483 taxa and 19 genes. Terminal taxa are labeled by their generic name. Higher-level classification names are also included. (PDF)
Data
Nexus-formatted data set that includes nucleotide sequence data (nt123_degen1) for 483 taxa and 19 genes with the ambiguously aligned characters already excluded (14658 characters total). This data set was degenerated using a degen1 script and the nt123 data set. The most current degen1 script is available at http://www.phylotools.com. The species...
Data
List of specimens sampled, Leptree voucher identification numbers, and gene information, including GenBank numbers. (XLS)
Article
Full-text available
Background: Higher-level relationships within the Lepidoptera, and particularly within the species-rich subclade Ditrysia, are generally not well understood, although recent studies have yielded progress. We present the most comprehensive molecular analysis of lepidopteran phylogeny to date, focusing on relationships among superfamilies. Methodolog...
Data
A spreadsheet showing the included species with annotations of their classification, collecting locality, host plant families, identification check with DNA barcodes, sequence data completeness (fraction of total target sequence actually obtained) and GenBank accession numbers. The eight genes initially sampled are shown to the left of the 11–19 ad...
Data
Full-text available
The best ML tree for nt3 (only) analysis of the 8–27 gene, 139-taxon data set, rooted with Tischeria ekebladella. Bootstrap values, when >50%, are shown above branches. (PDF)
Data
The best ML tree found for nt12 (only) analysis of the 8–27 gene, 139-taxon data set, rooted with Tischeria ekebladella. Bootstrap values, when >50%, are shown above branches. (PDF)
Data
Full-text available
The best maximum likelihood tree found in nt123 analysis of the 4-gene, 139-taxon data set. The four genes are listed in Figure S1. The tree is rooted with Tischeria ekebladella. Bootstrap values, when >50%, are shown above branches. (PDF)
Article
Full-text available
Yponomeutoidea, one of the early-diverging lineages of ditrysian Lepidoptera, comprise about 1,800 species worldwide, including notable pests and insect-plant interaction models. Yponomeutoids were one of the earliest lepidopteran clades to evolve external feeding and to extensively colonize herbaceous angiosperms. Despite the group's economic impo...
Data
Full-text available
The best ML cladogram from Figure 2, with bootstrap values for the initial 8 genes (nt123 analysis). Values for 109fin, 205fin, 208fin, and 3007fin are shown above branches, in that order; values for ACC, CAD, DDC and enolase are shown below branches. ‘−’ = node not recovered in the ML tree for that analysis. ‘*’ = bootstrap value <50%. ‘NA’ = bootst...
Data
Full-text available
Proportions of the six distinct Ser codons for each of the 80 taxa in this study. Taxa are clustered by their higher-level classification to demonstrate that, in general, there is substantial variation in codon usage within higher-level groups, as well as across them. (PDF)
Data
Full-text available
Number of Ser-containing alignment sites in relation to the number of taxa that encode Ser at those sites. (PDF)
Data
Full-text available
Maximum Likelihood tree based on a nucleotide model analysis (GTR+G+I) of a degen1-encoded data set that lacks all serine-coding nucleotides/codons. The six nodes of particular interest (Xenocarida, Multicrustacea, Altocrustacea, Vericrustacea, Miracrustacea, Edafopoda) are recovered, indicating that the strong signal provided by serine codons is c...
Data
Full-text available
Strict consensus of four maximum parsimony trees for 21AA data set plus bootstrap values (above branches) from 20AA (left) and 21AA (right) analyses. (PDF)
Data
Full-text available
Arthropod relationships and classification scheme based on degen1 analysis of 75 ingroup plus five outgroup species [5]. (PDF)
Data
Full-text available
Compositional distance tree (Euclidean distances) based on the nucleotide composition of a degen1-encoded data set that is restricted to co-Ser residues. Bootstrap percentages >50% are displayed and indicate the strength of the compositional signal at particular nodes. The sum of all branch lengths reflects the total amount of compositional heterog...
Article
Full-text available
In a previous study of higher-level arthropod phylogeny, analyses of nucleotide sequences from 62 protein-coding nuclear genes for 80 panarthropod species yielded significantly higher bootstrap support for selected nodes than did amino acids. This study investigates the cause of that discrepancy. The hypothesis is tested that failure to distinguish...
Data
Full-text available
Compositional distance tree (Euclidean distances) based on the amino acid composition of a 21-amino-acid data set that is restricted to co-Ser (S/Z) residues. Bootstrap percentages >50% are displayed and indicate the strength of the compositional signal at particular nodes. The sum of all branch lengths reflects the total amount of compositional he...
Conference Paper
Full-text available
We present the first detailed molecular estimate of relationships across the subfamilies of Pyraloidea, and assess its concordance with previous morphology-based hypotheses. Maximum likelihood analyses yield trees that differ little among data sets and character treatments and are strongly supported at all levels of divergence. Subfamily relationsh...
Article
Pyraloidea, one of the largest superfamilies of Lepidoptera, comprise more than 15 684 described species worldwide, including important pests, biological control agents and experimental models. Understanding of pyraloid phylogeny, the basis for a predictive classification, is currently provisional. We present the most detailed molecular estimate of...
Article
Full-text available
Background: Tortricidae, one of the largest families of microlepidopterans, comprise about 10,000 described species worldwide, including important pests, biological control agents and experimental models. Understanding of tortricid phylogeny, the basis for a predictive classification, is currently provisional. We present the first detailed molecul...
Data
List of specimens sampled, including collection localities, LepTree voucher identification numbers and codes, and GenBank accession numbers. (XLS)
Article
Full-text available
For the kingdom Animalia, 1,552,319 species have been described in 40 phyla in a new evolutionary classification. Among these, the phylum Arthropoda alone represents 1,242,040 species, or about 80% of the total. The most successful group, the Insecta (1,020,007 species), accounts for about 66% of all animals. The most successful insect order, Coleo...
Chapter
Full-text available
“Order Lepidoptera Linnaeus, 1758. In: Zhang, Z.-Q. (Ed.) Animal biodiversity: An outline of higher-level classification and survey of taxonomic richness”.
Conference Paper
Full-text available
Yponomeutoidea represent one of the major radiations in the basal ditrysian Lepidoptera. Yponomeutoidea are especially important for tracing the early evolutionary history of Lepidoptera-plant interactions because they are one of the earliest groups to evolve external feeding (Powell et al., 1998) and to extensively colonize herbs as well as shrubs...
Conference Paper
Neotropical yponomeutoids are one of the most poorly studied lepidopteran faunas, with many groups still tentatively defined. One example is the genus Dasycarea Zeller, 1877, based on a single female specimen of the type species D. viridisquamata from which the abdomen is missing. Due to the lack of information about the genitalia, different author...
Article
Full-text available
This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deep...
Data
Bootstrap values based on analysis of data sets and subsets differing in their average rates of nonsynonymous change. The complete data set is split into two or three subsets based on average rates of nonsynonymous change of individual genes, and bootstrap analyses are performed to estimate the informativeness of the different rate category ranges....
Data
Single-gene bootstrap values (≥75% only, nt123_degen1) for taxonomic groups (nodes) present in Figure 1. This table lists bootstrap values for taxa identified in Figure 1 based on analysis of single genes. (XLS)
Data
GenBank accession numbers (also cited in [11]). (DOC)
Data
A Nexus-formatted data set that includes nucleotide sequence data (nt123) for 80 taxa and 62 genes, slightly realigned relative to that in Regier et al., 2010 [11] (see Materials and Methods, Supplementary Materials) and used principally for indel analysis. Sets of characters are defined and listed immediately after the data matrix, including those...
Data
Bootstrap values based on analysis of shuffled data matrices of varying sizes (100% to 15% of complete data matrix). This table lists bootstrap values after randomizing character order in the 100% data matrix and splitting it into portions of varying sizes for analysis (100% to 15% of complete data matrix) without replacement. A subset of the Table...
Article
Full-text available
This study aims to investigate the strength of various sources of phylogenetic information that led to recent seemingly robust conclusions about higher-level arthropod phylogeny and to assess the role of excluding or downweighting synonymous change for arriving at those conclusions. The current study analyzes DNA sequences from 68 gene segments of...
Data
Single-gene bootstrap values (≥75% only, nt123_degen1) for taxonomic groups NOT present in Figure 1. This table lists bootstrap values for taxa not recovered in the analysis shown in Figure 1 based on analysis of single genes. (XLS)
Data
Exemplar species included, their classification, and GenBank accession numbers. For Gracillariidae the number of taxa in each subfamily and genus is listed in parentheses (number of taxa sampled/number of taxa known). "x" denotes a sequence that could not be amplified.
Article
Full-text available
Researchers conducting molecular phylogenetic studies are frequently faced with the decision of what to do when weak branch support is obtained for key nodes of importance. As one solution, the researcher may choose to sequence additional orthologous genes of appropriate evolutionary rate for the taxa in the study. However, generating large, comple...
Data
Maximum likelihood trees based on inferred amino acids. Scale bar = 0.03 substitutions/site.
Data
Single gene bootstrap values for all nodes in the nt123 tree of data set B. Shaded boxes are those with > 80% bootstrap support. "ALL" refers to dataset B (all genes included). See Additional file 1 for taxon code names.
Data
Maximum likelihood trees based on a partitioned model. Scale bar = 0.2 substitutions/site.
Data
Maximum likelihood nt123 trees for data sets A through D. Scale bar = 0.07 substitutions/site.
Data
Comparison of Euclidean compositional distance (NJ), GTR ML distance (NJ), and ML trees for nt123 and nt3. Arrows indicate a long internal branch in the Euclidean compositional distance trees.
Data
Maximum likelihood trees based on a codon model. Scale bar = 0.03 substitutions/site.
Conference Paper
The higher phylogeny of Hexapoda has been a topic of intense debate for decades, in particular the interordinal relationships. Different morphological and molecular characters frequently yield different topologies, but statistical tests of conflict are seldom used with morphological characters, while the repeated analyses of incrementally expanded...
Article
The Afrotropical butterfly subfamily Pseudopontiinae (Pieridae) was traditionally thought to comprise one species, with two subspecies (Pseudopontia paradoxa paradoxa Felder & Felder and Pseudopontia paradoxa australis Dixey) differing in a single detail of a hindwing vein. The two subspecies also differ in their known geographic distributions (mai...
Article
This study has as its primary aim the robust resolution of higher-level relationships within the lepidopteran superfamily Bombycoidea. Our study builds on an earlier analysis of five genes (∼6.6 kbp) sequenced for 50 taxa from Bombycoidea and its sister group Lasiocampidae, plus representatives of other macrolepidopteran superfamilies. The earlier s...
Article
Full-text available
The remarkable antiquity, diversity and ecological significance of arthropods have inspired numerous attempts to resolve their deep phylogenetic history, but the results of two decades of intensive molecular phylogenetics have been mixed. The discovery that terrestrial insects (Hexapoda) are more closely related to aquatic Crustacea than to the ter...
Data
123-taxon ML tree & bootstrap for noLRall2 + nt2. Part A: noLRall2 + nt2 best ML tree found in 10,000 replicate GARLI searches, GTR + G + I model, phylogram format. Part B: noLRall2 + nt2 bootstrap majority rule consensus tree, generated in PAUP, from 1000 GARLI ML bootstrap replicates, GTR + G + I model.
Data
Single-gene bootstrap analyses. We present a table of bootstrap values obtained from a separate analysis of all nucleotides for each gene, for all nodes on the all-nt ML tree plus all other nodes supported by BP of 50% or greater by any gene. We summarize the evidence on bootstrap-supported groupings that conflict with those found for other individ...
Data
Effects of compositional heterogeneity on inferred relationships, compared between nt3 and noLRall2 + nt2. Part A. Analysis of variable nt3 characters. Part B. Analysis of noLRall2 + nt2 characters.
Data
123-taxon ML tree & bootstrap consensus tree for nt12. Part A: nt12 best ML tree found in 10,000 replicate GARLI searches, GTR + G + I model, phylogram format. Part B: nt12 bootstrap majority rule consensus tree, generated in PAUP, from 1000 GARLI ML bootstrap replicates, GTR + G + I model.
Data
123-taxon ML tree & bootstrap consensus tree for nt123. Part A: nt123, best ML tree found in 10,000 replicate GARLI searches, GTR + G + I model, phylogram format. Part B: nt123, majority rule consensus tree from 1000 GARLI ML bootstrap replicates, generated in PAUP.
Data
Data matrix. The final sequence alignment, excluding alignment-ambiguous regions, is presented in sequential Nexus format. The file includes charsets for different genes, nucleotides and LR/noLR. Taxon names are the code names given in Additional file 1.
Data
123-taxon ML tree & bootstrap consensus tree for nt3. Part A. nt3 best ML tree found in 10,000 replicate GARLI searches, GTR + G + I model, phylogram format. Part B. nt3, bootstrap 50% majority rule consensus tree, generated in PAUP, from 1000 GARLI ML bootstrap replicates, GTR + G + I model.
Data
Specimen information and GenBank numbers. For each specimen sequenced we list superfamily, family, genus and species name, code name, collection locality, GenBank accession numbers for all sequences, and missing and partial sequences. Numbers after the superfamily name indicate number of families sampled/number of families total. An L in parenthese...
Data
Nt123 Bayesian analysis, partitioned noLRall2 + nt2 vs. LR + nt3. Majority rule consensus of trees sampled from partitioned Bayesian analysis of nt123 with partitions noLRall2 + nt2 versus LRall2 + nt3. Two runs, 10106 trees sampled from each, with standard deviation of split frequencies < 0.01.
Article
Full-text available
In the mega-diverse insect order Lepidoptera (butterflies and moths; 165,000 described species), deeper relationships are little understood within the clade Ditrysia, to which 98% of the species belong. To begin addressing this problem, we tested the ability of five protein-coding nuclear genes (6.7 kb total), and character subsets therein, to reso...
Data
Strict consensus of the 12 MPCs (length = 42618 steps, CI = 0.15, RI = 0.53) resulting from five-gene simultaneous MP analysis. Nodes are labeled to the right of each internal branch. Bootstrap values below branches, Bremer supports above. (2.03 MB TIF)
Data
ML phylogram. lnL = −187418.656372. The scale bar indicates the estimated substitutions per site. (1.73 MB TIF)
Data
The sampled 131 ingroup and 10 outgroup taxa with specimen localities, LepTree voucher identification numbers, and GenBank accession numbers. (0.29 MB DOC)
Data
Data matrix. The aligned sequence data are presented in sequential Nexus format. (0.96 MB DOC)
Article
Full-text available
The 1400 species of hawkmoths (Lepidoptera: Sphingidae) comprise one of the most conspicuous and well-studied groups of insects, and provide model systems for diverse biological disciplines. However, a robust phylogenetic framework for the family is currently lacking. Morphology is unable to confidently determine relationships among most groups. As a m...
Article
Full-text available
Phylogenetic relationships among basal hexapod lineages were investigated using molecular sequence data derived from three nuclear genes: elongation factor-1α, RNA polymerase II, and elongation factor-2. Nucleotide and amino acid sequences from 12 hexapods and 22 crustacean outgroups were analyzed using maximum parsimony and maximum likelihood methods. The...
Article
Full-text available
This study attempts to resolve relationships among and within the four basal arthropod lineages (Pancrustacea, Myriapoda, Euchelicerata, Pycnogonida) and to assess the widespread expectation that remaining phylogenetic problems will yield to increasing amounts of sequence data. Sixty-eight regions of 62 protein-coding nuclear genes (approximately 4...
Chapter
Full-text available
Lepidoptera are among the most diverse and easily recognized organisms on the planet, with at least 150,000 described species (Kristensen and Skalski 1998). They are one of the four megadiverse orders of holometabolous insects, together with Diptera (flies), Coleoptera (beetles), and Hymenoptera (wasps, bees, and ants). Butterflies alone are more n...
Article
The Heliothinae are a cosmopolitan subfamily of about 365 species that include some of the world’s most injurious crop pests. This study re-assesses evolutionary relationships within heliothines, providing an improved phylogeny and classification to support ongoing intensive research on heliothine genomics, systematics, and biology. Our ph...
Article
Nucleotide and inferred amino acid sequences from two nuclear protein-encoding genes, elongation factor-1α and RNA polymerase II, were obtained from 34 myriapods and 14 other arthropods to determine phylogenetic relationships among and within the myriapod classes. Phylogenetic analyses using maximum parsimony and maximum likelihood methods recovered...