Giulio Formenti
Rockefeller University

Ph.D.

111 Publications
49,810 Reads
8,022 Citations

Publications (111)
Article
We present a genome assembly from an individual female Vipera ursinii rakosiensis (the Hungarian meadow viper; Chordata; Lepidosauria; Squamata; Viperidae). The genome sequence is 1,625.0 megabases in span. Most of the assembly is scaffolded into 19 chromosomal pseudomolecules, including the W and Z sex chromosomes. The mitochondrial genome has als...
Article
Full-text available
Habitat transitions have shaped the evolutionary trajectory of many clades. Sea catfishes (Ariidae) have repeatedly undergone ecological transitions, including colonizing freshwaters from marine environments, leading to an adaptive radiation in Australia and New Guinea alongside non-radiating freshwater lineages elsewhere. Here, we generate and ana...
Article
Reference genomes as a key biodiversity genomics tool. In the midst of the Earth’s sixth mass extinction, species worldwide are declining at an unprecedented rate [1] directly impacting ecosystem functioning and services [2], human health [3] and our resilience to climate disturbances [4]. Biodiversity and ecosystem decline [5,6], loss and degradation raise the pr...
Article
Full-text available
We present a reference genome assembly from an individual male Violet Carpenter Bee (Xylocopa violacea, Linnaeus 1758). The assembly is 1.02 gigabases in span. 48% of the assembly is scaffolded into 17 pseudo-chromosomal units. The mitochondrial genome has also been assembled and is 21.8 kilobases in length. The genome is highly repetitive, likely...
Article
Full-text available
A genomic database of all Earth’s eukaryotic species could contribute to many scientific discoveries; however, only a tiny fraction of species have genomic information available. In 2018, scientists across the world united under the Earth BioGenome Project (EBP), aiming to produce a database of high-quality reference genomes containing all ~1.5 mil...
Article
Full-text available
The Harpy Eagle (Harpia harpyja) is an iconic species that inhabits forested landscapes in Neotropical regions, with decreasing population trends mainly due to habitat loss, and currently classified as vulnerable. Here, we report on a chromosome-scale genome assembly for a female individual combining long reads, optical mapping, and chromatin confo...
Article
Full-text available
We present a genome assembly from an individual female Molossus alvarezi (Chordata; Mammalia; Chiroptera; Molossidae). The genome sequence is 2.490 Gb in span. The majority of the assembly is scaffolded into 24 chromosomal pseudomolecules, with the X sex chromosome assembled.
Preprint
Full-text available
We present a reference genome assembly from an individual male Violet Carpenter Bee (Xylocopa violacea, Linnaeus, 1758). The assembly is 1.02 gigabases in span. 48% of the assembly is scaffolded into 17 pseudo-chromosomal units. The mitochondrial genome has also been assembled and is 21.8 kilobases in length. The genome is highly repetitive, likely...
Preprint
Full-text available
We present haplotype-resolved reference genomes and comparative analyses of six ape species, namely: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan, and siamang. We achieve chromosome-level contiguity with unparalleled sequence accuracy (<1 error in 500,000 base pairs), completely sequencing 215 gapless chromosomes telomere-to-t...
Article
Full-text available
We present a reference genome assembly from an individual male Rhynchonycteris naso (Chordata; Mammalia; Chiroptera; Emballonuridae). The genome sequence is 2.46 Gb in span. The majority of the assembly is scaffolded into 22 chromosomal pseudomolecules, with the Y sex chromosome assembled.
Article
Full-text available
Sex-limited polymorphism has evolved in many species including our own. Yet, we lack a detailed understanding of the underlying genetic variation and evolutionary processes at work. The brood parasitic common cuckoo (Cuculus canorus) is a prime example of female-limited color polymorphism, where adult males are monochromatic gray and females exhi...
Article
Genomes are typically mosaics of regions with different evolutionary histories. When speciation events are closely spaced in time, recombination makes the regions sharing the same history small, and the evolutionary history changes rapidly as we move along the genome. When examining rapid radiations such as the early diversification of Neoaves 66 M...
Article
Animals living in caves are of broad relevance to evolutionary biologists interested in understanding the mechanisms underpinning convergent evolution. In the Eastern Andes of Colombia, populations from at least two distinct clades of Trichomycterus catfishes (Siluriformes) independently colonized cave environments and converged in phenotype by los...
Article
Full-text available
The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long...
Article
Full-text available
Suncus etruscus is one of the world’s smallest mammals, with an average body mass of about 2 grams. The Etruscan shrew’s small body is accompanied by a very high energy demand and numerous metabolic adaptations. Here we report a chromosome-level genome assembly using PacBio long read sequencing, 10X Genomics linked short reads, optical mapping, and...
Preprint
Full-text available
The intertidal gastropod Littorina saxatilis is a model system to study speciation and local adaptation. The repeated occurrence of distinct ecotypes showing different levels of genetic divergence makes L. saxatilis particularly suited to study different stages of the speciation continuum in the same lineage. A major finding is the presence of seve...
Article
Full-text available
Chub mackerels (Scomber japonicus) are a migratory marine fish widely distributed in the Indo-Pacific Ocean. They are globally consumed for their high Omega-3 content, but their population is declining due to global warming. Here, we generated the first chromosome-level genome assembly of chub mackerel (fScoJap1) using the Vertebrate Genomes Projec...
Article
Full-text available
Background: The red junglefowl, the wild outgroup of domestic chickens, has historically served as a reference for genomic studies of domestic chickens. These studies have provided insight into the etiology of traits of commercial importance. However, the use of a single reference genome does not capture diversity present among modern breeds, many o...
Preprint
Full-text available
Animals living in caves are of broad relevance to evolutionary biologists interested in understanding the mechanisms underpinning convergent evolution. In the Eastern Andes of Colombia, populations from at least two distinct clades of Trichomycterus catfishes (Siluriformes) independently colonized cave environments and converged in phenotype by los...
Preprint
Comparative analysis of recent human genome assemblies highlights profound sequence divergence that peaks within polymorphic loci such as centromeres. This raises the question about the adequacy of relying on human reference genomes to accurately analyze sequencing data derived from experimental cell lines. Here, we generated the complete diploid g...
Article
Full-text available
The European mink Mustela lutreola (Mustelidae) ranks among the most endangered mammalian species globally, experiencing a rapid and severe decline in population size, density, and distribution. Given the critical need for effective conservation strategies, understanding its genomic characteristics becomes paramount. To address this challenge, the...
Article
Full-text available
The Open Institute of the African BioGenome Project empowers African scientists and institutions with the skill sets, capacity and infrastructure to advance scientific knowledge and innovation and drive economic growth.
Article
Full-text available
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]....
Article
Full-text available
Amidst the current biodiversity crisis, the availability of genomic resources for declining species can provide important insights into the factors driving population decline. In the early 1990s, the black-legged kittiwake (Rissa tridactyla), a pelagic gull widely distributed across the arctic, subarctic and temperate zones, suffered a steep popula...
Poster
The availability of high-quality reference genomes has the potential to advance biological discoveries. Genome assembly, a vital step in generating reference genomes, is influenced by a variety of factors, with repeat content being a major determinant. Working with a diverse set of vertebrate species, the Vertebrate Genomes Project (VGP) aims to co...
Article
Full-text available
The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, a...
Article
Full-text available
Background: PacBio high fidelity (HiFi) sequencing reads are both long (15–20 kb) and highly accurate (>Q20). Because of these properties, they have revolutionised genome assembly leading to more accurate and contiguous genomes. In eukaryotes the mitochondrial genome is sequenced alongside the nuclear genome often at very high coverage. A dedicated...
Preprint
Full-text available
The diverse physiography of the Portuguese land and marine territory, spanning from continental Europe to the Atlantic archipelagos, has made it an important repository of biodiversity throughout the Pleistocene glacial cycles, leading to a remarkable diversity of species and ecosystems. This rich biodiversity is under threat from anthropogenic dri...
Preprint
Full-text available
Improvements in genome sequencing and assembly are enabling high-quality reference genomes for all species. However, the assembly process is still laborious, computationally and technically demanding, lacks standards for reproducibility, and is not readily scalable. Here we present the latest Vertebrate Genomes Project assembly pipeline and demonst...
Preprint
Full-text available
Background. The red junglefowl, the wild progenitor of domestic chickens, has historically served as a reference for genomic studies of domestic chickens. These studies have provided insight into the etiology of traits of commercial importance. However, the use of a single reference genome does not capture diversity present among modern breeds, man...
Preprint
Full-text available
Africa, a continent of 1.3 billion people, had 326 researchers per one million people in 2018 (Schneegans, 2021; UNESCO, 2022), despite the global average for the number of researchers per million people being 1368 (Schneegans, 2021; UNESCO, 2022). Nevertheless, a strong research community is a requirement to advance scientific knowledge and innova...
Article
Full-text available
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals¹. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair lev...
Article
Full-text available
The Rock Ptarmigan (Lagopus muta) is a cold-adapted, largely sedentary, game bird with a Holarctic distribution. The species represents an important example of an organism likely to be affected by ongoing climatic shifts across a disparate range. We provide here a high-quality reference genome and mitogenome for the Rock Ptarmigan assembled from Pa...
Article
The Aeolian wall lizard, Podarcis raffonei, is an endangered species endemic to the Aeolian archipelago, Italy, where it is present only in three tiny islets and a narrow promontory of a larger island. Because of the extremely limited area of occupancy, severe population fragmentation and observed decline, it has been classified as Critically Endan...
Article
Full-text available
Programmed DNA loss is a gene silencing mechanism that is employed by several vertebrate and nonvertebrate lineages, including all living jawless vertebrates and songbirds. Reconstructing the evolution of somatically eliminated (germline-specific) sequences in these species has proven challenging due to a high content of repeats and gene duplicatio...
Article
Full-text available
The availability of public genomic resources can greatly assist biodiversity assessment, conservation, and restoration efforts by providing evidence for scientifically informed management decisions. Here we survey the main approaches and applications in biodiversity and conservation genomics, considering practical factors, such as cost, time, prere...
Preprint
Full-text available
The taxonomic classification of a falcon population found in the Altai region in Asia has been heavily debated for two centuries and previous studies have been inconclusive, hindering a more informed conservation approach. Here, we generated a chromosome-level gyrfalcon reference genome using the Vertebrate Genomes Project (VGP) assembly pipeline....
Article
Full-text available
Sea turtles represent an ancient lineage of marine vertebrates that evolved from terrestrial ancestors over 100 Mya. The genomic basis of the unique physiological and ecological traits enabling these species to thrive in diverse marine habitats remains largely unknown. Additionally, many populations have drastically declined due to anthropogenic ac...
Preprint
Full-text available
The Rock Ptarmigan (Lagopus muta) is a cold-adapted, largely sedentary, game bird with a Holarctic distribution. The species represents an important example of an organism likely to be affected by ongoing climatic shifts across a disparate range. We provide here a high-quality reference genome and mitogenome for the Rock Ptarmigan assembled from Pa...
Article
Full-text available
Senescence, an age-related decline in survival and/or reproductive performance, occurs in species across the tree of life. Molecular mechanisms underlying this within-individual phenomenon are still largely unknown, but DNA methylation changes with age are among the candidates. Using a longitudinal approach, we investigated age-specific changes in...
Article
Full-text available
The chicken continues to hold its position as a leading model organism within many areas of research, as well as being a major source of protein for human consumption. The First Report on Chicken Genes and Chromosomes [Schmid et al., 2000], which was published in 2000, was the brainchild of the late, and sadly missed, Prof Michael Schmid of the Uni...
Article
Full-text available
Background: The Australian black swan (Cygnus atratus) is an iconic species with contrasting plumage to that of the closely related northern hemisphere white swans. The relative geographic isolation of the black swan may have resulted in a limited immune repertoire and increased susceptibility to infectious diseases, notably infectious diseases from...
Article
Full-text available
Insights into the evolution of non-model organisms are limited by the lack of reference genomes of high accuracy, completeness, and contiguity. Here, we present a chromosome-level, karyotype-validated reference genome and pangenome for the barn swallow (Hirundo rustica). We complement these resources with a reference-free multialignment of the refe...
Preprint
Full-text available
Background: PacBio high fidelity (HiFi) sequencing reads are both long (15-20 kb) and highly accurate (>Q20). Because of these properties, they have revolutionised genome assembly leading to more accurate and contiguous genomes. In eukaryotes the mitochondrial genome is sequenced alongside the nuclear genome often at very high coverage. A dedicated...
Preprint
Full-text available
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, th...
Article
Full-text available
Background: The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, a cone-rich retina, and a propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, hold...
Article
Full-text available
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was g...
Article
Full-text available
Background: False duplications in genome assemblies lead to false biological conclusions. We quantified false duplications in popularly used previous genome assemblies for platypus, zebra finch, and Anna’s Hummingbird, and their new counterparts of the same species generated by the Vertebrate Genomes Project, of which the Vertebrate Genomes Project...
Article
Full-text available
Background: Many short-read genome assemblies have been found to be incomplete and contain mis-assemblies. The Vertebrate Genomes Project has been producing new reference genome assemblies with an emphasis on being as complete and error-free as possible, which requires utilizing long reads, long-range scaffolding data, new assembly algorithms, and m...
Article
Full-text available
Background: Studies in vertebrate genomics require sampling from a broad range of tissue types, taxa, and localities. Recent advancements in long-read and long-range genome sequencing have made it possible to produce high-quality chromosome-level genome assemblies for almost any organism. However, adequate tissue preservation for the requisite ultra...
Preprint
Full-text available
Background: The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed. Body size is tightly coupled to cell metabolism and environmental adaptations. A high-quality genome assembly of this magnificent animal will aid our understanding of body size regulation and related biological processes. Results: We report a referen...
Article
Full-text available
Motivation: With the current pace at which reference genomes are being produced, the availability of tools that can reliably and efficiently generate genome assembly summary statistics has become critical. Additionally, with the emergence of new algorithms and data types, tools that can improve the quality of existing assemblies through automated an...
Article
Full-text available
Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly acc...
Article
Full-text available
Variant calling has been widely used for genotyping and for improving the consensus accuracy of long-read assemblies. Variant calls are commonly hard-filtered with user-defined cutoffs. However, it is impossible to define a single set of optimal cutoffs, as the calls heavily depend on the quality of the reads, the variant caller of choice and the q...