Pavel Dobrynin

Saint Petersburg State University | SPBU · Department of Genetics and Biotechnology

91
Publications
19,205
Reads
841
Citations

Publications (91)
Preprint
Inference of complex demographic histories typically requires parameterized models specified manually by the researcher. With an increased variety of methods and tools, each with its own interface, model specification becomes tedious and error-prone. Moreover, optimization algorithms used to find optimal parameters sometimes turn out to be ineffici...
Article
Full-text available
This study provides new data on the whole-exome sequencing of a cohort of children with autistic spectrum disorders (ASD) from an underexplored Russian population. Using both a cross-sectional approach involving a control cohort of the same ancestry and an annotation-based approach involving relevant public databases, we explored exonic single nucl...
Article
Full-text available
Captive breeding programmes represent the most intensive type of ex situ population management for threatened species. One example is the Cuvier’s gazelle programme, which started in 1975 with only four founding individuals; after more than four decades of management in captivity, a reintroduction effort was undertaken in Tunisia in 2016 to estab...
Article
Full-text available
Species of the mustelid subfamily Guloninae inhabit diverse habitats on multiple continents, and occupy a variety of ecological niches. They differ in feeding ecologies, reproductive strategies and morphological adaptations. To identify candidate loci associated with adaptations to their respective environments, we generated a de novo assembly of t...
Article
Full-text available
This study of a single family case demonstrates the successful use of exome sequencing of family trios (child proband and parents) as the first method of genomic diagnostics for multiple unexplained developmental disorders and delays, in order to screen for pathogenic single-nucleotide substitutions and structural genomic variants...
Article
Full-text available
This short report on a family case study provides evidence of the effectiveness of exome sequencing of family trios (proband-parents) as the first-tier test in genomic diagnostics of unexplained developmental delays and disorders, as a genomic screening for both pathogenic single-nucleotide variants and copy number variations (CNVs). In this study,...
Preprint
Full-text available
Species of the mustelid subfamily Guloninae inhabit diverse habitats on multiple continents, and occupy a variety of ecological niches. They differ in feeding ecologies, reproductive strategies and morphological adaptations. To identify candidate loci associated with adaptations to their respective environments, we generated a de novo assembly of t...
Article
Full-text available
The Puma lineage within the family Felidae consists of three species that last shared a common ancestor around 4.9 million years ago. Whole-genome sequences of two species from the lineage were previously reported: the cheetah (Acinonyx jubatus) and the mountain lion (Puma concolor). The present report describes a whole-genome assembly of the remai...
Article
Full-text available
Species is the fundamental taxonomic unit in biology and its delimitation has implications for conservation. In giraffe (Giraffa spp.), multiple taxonomic classifications have been proposed since the early 1900s [1]. However, one species with nine subspecies has been generally accepted [2], likely due to limited in-depth assessments, subspecies hybridizi...
Article
Full-text available
As we enter the sixth mass extinction, many species that are no longer self‐sustaining in their natural habitat will require ex situ management. Zoos have finite resources for ex situ management, and there is a need for holistic conservation programs between the public and private sector. Ex situ populations of sable antelope, Hippotragus niger, ha...
Article
Full-text available
Captive populations provide a valuable insurance against extinctions in the wild. However, they are also vulnerable to the negative impacts of inbreeding, selection and drift. Genetic information is therefore considered a critical aspect of conservation management. Recent developments in sequencing technologies have the potential to improve the out...
Article
Full-text available
Background The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum (AFS), the distribution of allele frequencies in populations. The joint AFS is commonly used to reconstruct th...
Preprint
Full-text available
Captive populations provide a valuable insurance against extinctions in the wild. However, they are also vulnerable to the negative impacts of inbreeding, selection and drift. Genetic information is therefore considered a critical aspect of conservation management planning. Recent developments in sequencing technologies have the potential to improv...
Article
Full-text available
Genome-wide assessment of genetic diversity has the potential to increase the ability to understand admixture, inbreeding, kinship and erosion of genetic diversity affecting both captive (ex situ) and wild (in situ) populations of threatened species. The sable antelope (Hippotragus niger), native to the savannah woodlands of sub-Saharan Africa, is...
Article
The Russian Federation is the largest and one of the most ethnically diverse countries in the world; however, no centralized reference database of genetic variation exists to date. Such data are crucial for medical genetics and essential for studying population history. The Genome Russia Project aims at filling this gap by performing whole genome se...
Article
Full-text available
Objective: Nicotiana glauca (tree tobacco) is a naturally transgenic plant, containing sequences acquired from Agrobacterium rhizogenes by horizontal gene transfer. In addition, N. glauca contains a wide range of alkaloids of medical interest. Data description: We report a high-depth sequencing and de novo assembly of the full N. glauca genome and ana...
Preprint
Full-text available
During the last few years, more and more genomic analyses of individuals of closely related species have appeared. The history of the evolution and development of populations, the so-called demographic history, is embedded in their genomes, and we can try to extract it. The allele frequency spectrum (AFS), the distribution of allele frequencies in populations, i...
Article
Full-text available
A comparative analysis of whole genome sequencing (WGS) and genotype calling was initiated for ten human genome samples sequenced by St. Petersburg State University Peterhof Sequencing Center and by three commercial sequencing centers outside of Russia. The sequence quality, efficiency of DNA variant and genotype calling were compared with each oth...
Data
Long indel counts. The number of identified long indels is given for each sequencing center to illustrate the effect of filtering (described in the first column). (DOCX)
Data
All identified LoF SNP list with annotation. (XLSX)
Data
Overlap of long indels across three sequencing centers. The Venn diagram shows the number of shared long indels in the three datasets. (PDF)
Data
All identified LoF short indel list with annotation. (XLSX)
Data
List of candidate AIH-related genes obtained from separate studies. (XLSX)
Data
Distribution of alternative allele counts in called genotypes. Three datasets of genotypes for 10 individuals (Illumina and Macrogen) and one dataset of genotypes for 6 individuals (Peterhof) were considered. For each variant, the number of alternative alleles was obtained; the variants were classified according to this number. Multiallelic variant...
Data
Segmental duplications identified in the trio in three datasets. The "Common" bar corresponds to segmental duplications present in all three datasets. (PDF)
Data
Statistics on called variants. Statistics on variant calling and genotyping were calculated on the 6 samples shared in the three datasets. The variants were classified as known or novel according to their presence or absence in the NCBI dbSNP database build 147. (XLSX)
Data
Distribution of copy numbers in non-duplicated (control) regions. The distributions are plotted for each sample from (A) Illumina, (B) Macrogen, (C) Peterhof. (PDF)
Data
Comparison of various QC parameters for raw reads. Raw read quality control parameters assessed for all sequenced samples for each sequencing center. (XLSX)
Data
Alignment statistics. Various parameters of alignment results are averaged over all samples in each dataset. (DOCX)
Data
Mendelian inheritance errors. Variants violating Mendelian inheritance were counted in the trio genotype data. (DOCX)
Data
Per-sample genotype comparison between datasets. (XLSX)
Data
HLA genotyping and concordance of WGS-based and molecular typing. (XLSX)
Data
Time estimates for 30X coverage from sequencing centers per person. (DOCX)
Article
Full-text available
Solenodons are insectivores living in Hispaniola and Cuba that form an isolated branch in the tree of placental mammals, highly divergent from other eulipotyphlan insectivores. The history, unique biology and adaptations of these enigmatic venomous species could be illuminated by the availability of genome data, but a whole genome assembly for soleno...
Article
Black and white rhinoceros (Diceros bicornis and Ceratotherium simum) are iconic African species that are classified by the International Union for the Conservation of Nature (IUCN) as Critically Endangered and Near Threatened (http://www.iucnredlist.org/), respectively [1]. At the end of the 19th century, Southern white rhinoceros (Ceratotherium s...
Article
Full-text available
Epigenetic regulation plays an important role in development, at the embryonic stages and later during the lifespan. Some epigenetic marks are highly conserved throughout the lifespan whereas others are closely associated with specific age periods and/or particular environmental factors. Little is known about the dynamics of epigenetic regulation d...
Preprint
Full-text available
Background Nicotiana glauca (tree tobacco) is a member of the Solanaceae family, which includes important crops (potato, tomato, eggplant, pepper) and many medicinal plants. This diploid plant is native to South America and is one of the first Nicotiana species with Agrobacterium cellular T-DNA (cT-DNA). Its cT-DNA is a partial, inverted repeat, ca...
Preprint
Full-text available
Solenodons are insectivores living on the Caribbean islands, with few surviving related taxa. The genus occupies one of the most ancient branches among the placental mammals. Understanding of the history, unique biology and adaptations of these enigmatic venomous species can be greatly advanced by the availability of genome data, but the whole genome assembly f...
Article
Full-text available
The dwindling wildlife species of our planet have become a cause célèbre for conservation groups, governments and concerned citizens throughout the world. The application of powerful new genetic technologies to surviving populations of threatened mammals has revolutionized our ability to recognize hidden perils that afflict them. We have learned ne...
Article
Full-text available
The first complete mitochondrial genome sequence of parthenogenetic Caucasian rock lizard Darevskia unisexualis (Lacertidae family) is determined by hybrid assembly with Illumina HiSeq and PacBio RS II platforms. The circular 21.4 kbp mitogenome contains 13 protein-coding genes, 12S and 16S rRNA genes, 20 tRNAs, two pseudogenized tRNAs, and one lon...
Data
D. unisexualis mitogenome annotation. The inner circle shows GC-richness, with red for AT-rich regions and blue for GC-rich regions (1). The next circle contains annotation data, where genes are marked in red, tRNA in blue, rRNA in green, 59 bp tandem repeats in orange, and the CR region in violet (2). The next circle shows three PacBio reads containing full rea...
Article
Full-text available
Pangolins (order Pholidota) are the only mammals covered by scales. We have recently sequenced and analyzed the genomes of two critically endangered Asian pangolin species, namely the Malayan pangolin (Manis javanica) and the Chinese pangolin (Manis pentadactyla). These complete genome sequences will serve as reference sequences for future research...
Article
Full-text available
Pangolins, unique mammals with scales over most of their body, no teeth, poor vision, and an acute olfactory system, comprise the only placental order (Pholidota) without a whole-genome map. To investigate pangolin biology and evolution, we developed genome assemblies of the Malayan (Manis javanica) and Chinese (M. pentadactyla) pangolins. Striking...
Article
Full-text available
Background As the number of sequenced genomes rapidly increases, chromosome assembly is becoming an even more crucial step of any genome study. Since de novo chromosome assemblies are confounded by repeat-mediated artifacts, reference-assisted assemblies that use comparative inference have become widely used, prompting the development of several re...
Article
Full-text available
The Peterhof genetic collection of Saccharomyces cerevisiae strains (PGC) is a large laboratory stock that has accumulated several thousand strains over more than half a century. It originated independently of other common laboratory stocks from a distillery lineage (race XII). Several PGC strains have been extensively used in certain fields of...
Data
Pedigree of the strains. Names of strains with sequenced genomes are shown on yellow background. The number of generations is counted as the number of meiotic events between two strains. MATa strains are depicted left-budded and MATα strains are right-budded. Diploids are unbudded. Curved arrow indicates self-fertilization. Dashed arrows indicate genetic m...
Data
Phylogenetic relation of 15V-P4 to other S. cerevisiae strains. (A) Population structure of 101 strains including 15V-P4 assessed with three sets of 1210 variable positions with roughly uniform minor allele frequency distribution. Populations or groups of similar populations are framed. 15V-P4 and S288C are highlighted in red. (B) Neighbor joining...
Data
Phenylalanine auxotrophy mutation pheA10 is allelic to PHA2. (A) Short read alignment. (B) Sanger resequencing. Red frame, TAA nonsense mutation appearing at codon 161. (C) 33G-D373 plated on selective media immediately after transformation with low copy number plasmids bearing indicated PHA2 alleles. Vector, pRS316. (D) Asp+ and Asp- variants of 6...
Data
Systematic names of genes used to infer the ORF-based phylogenetic tree. (XLS)
Data
Genome coverage across reference for euploid strains. (A) 1B, (B) 74. Dashed lines signify chromosome borders. (PDF)
Data
Neighbour joining (NJ) clustering of the PGC strains and S288C based on the number of pairwise SNVs. Shown on the right are the numbers of SNVs in comparison to S288C (highlighted in different shades of green with color intensity proportional to the number of SNVs) or to 15V-P4 (similarly highlighted in shades of purple). (PDF)
Data
The GAL10 C287T mutation in the 1B strain may be responsible for the Gal- phenotype. (A) SNVs in the GAL locus compared to S288C. Upper character, reference nucleotide; lower character, variant nucleotide. Nucleotides of the Watson strand are indicated. The C287T substitution in GAL10 of 1B is highlighted with a blue circle. (B) The complete GAL locus or its GA...
Data
Summary of BLAST analysis for introgressed regions. Shown are results of BLAST search (output format 6) in the 15V-P4 genome and in the YJM248 genome. (XLS)
Data
Coverage of Saccharomyces sensu stricto genomes with short reads for 15V-P4 does not reveal introgression from any of the closely related species. Short reads for the 15V-P4 genome were aligned to concatenated genomes of S. sensu stricto species with Bowtie2. S288C and YJM248 were used as negative and positive controls for introgression, respecti...
Data
Summary of variable positions in the SUCX genes. Positions are indicated according to S288C SUC2 sequence. Variants are called according to short read alignment for sequenced PGC strains and to ungapped multiple alignment for known SUC genes (NCBI accession numbers are given in parentheses). (XLS)
Data
Genomic regions annotated as amplified or deleted in each of the genomes. (XLS)
Data
List of genes with stop codons gained or lost in the strains analyzed. Light green, known genotype. (XLS)