Simone Maestri’s research while affiliated with University of Milan and other places


Publications (7)


Figure 1. Traditional methods for triplet repeats characterization. (A) Southern blotting requires genomic DNA digestion with restriction enzymes, followed by blotting and probing with a labeled DNA fragment that specifically hybridizes to the repeat-containing region. (B) Fluorescence-PCR uses at least one fluorescent primer and performs fragment analysis using a capillary electrophoresis system. (C) Small-pool PCR relies on serial dilutions and multiple independent PCRs across the repeat, followed by electrophoresis and blotting. (D) Sanger sequencing of PCR amplicons, after allelic separation by electrophoresis, detects fluorescence emitted by chain-terminating nucleotides.
Figure 2. High-throughput sequencing methods for triplet repeat characterization. (A) PCR-based methods begin with PCR amplification of the region of interest; the resulting amplicons then undergo platform-specific library preparation for high-throughput sequencing. (B) CRISPR/Cas9-based enrichment methods involve cutting DNA using the Cas9-CRISPR RNAs (crRNAs) complex, followed by ligation of sequencing adapters to the free DNA ends. (C) In-silico-based enrichment methods (adaptive sampling or 'Read Until') are used with Oxford Nanopore Technologies (ONT) devices to selectively sequence DNA molecules. Based on the first sequenced bases, the voltage across the nanopore can be reversed to eject the molecule if it does not match an on-target region.
Figure 3. Proposed experimental strategies to characterize CAG repeats in HD. Based on the biological questions users may wish to address, a workflow outlining the optimal experimental setup is proposed. An estimate of the costs for each approach is also provided: $ represents 10$; $$ represents 100$; $$$ represents 1000$.
High-throughput sequencing platforms for triplet repeats characterization
Studies using sequencing-based approaches to characterize CAG repeats in HD
Navigating triplet repeats sequencing: concepts, methodological challenges and perspective for Huntington's disease
  • Literature Review
  • Full-text available

December 2024

·

27 Reads

·

1 Citation

Nucleic Acids Research

Simone Maestri

·

Davide Scalzo

·

·

[...]

·

Elena Cattaneo

The accurate characterization of triplet repeats, especially the overrepresented CAG repeats, is increasingly relevant for several reasons. First, germline expansion of CAG repeats above a gene-specific threshold causes multiple neurodegenerative disorders; for instance, Huntington’s disease (HD) is triggered by >36 CAG repeats in the huntingtin (HTT) gene. Second, extreme expansions up to 800 CAG repeats have been found in specific cell types affected by the disease. Third, synonymous single nucleotide variants within the CAG repeat stretch influence the age of disease onset. Thus, new sequencing-based protocols that profile both the length and the exact nucleotide sequence of triplet repeats are crucial. Various strategies to enrich the target gene over the background, along with sequencing platforms and bioinformatic pipelines, are under development. This review discusses the concepts, challenges, and methodological opportunities for analyzing triplet repeats, using HD as a case study. Starting with traditional approaches, we will explore how sequencing-based methods have evolved to meet increasing scientific demands. We will also highlight experimental and bioinformatic challenges, aiming to provide a guide for accurate triplet repeat characterization for diagnostic and therapeutic purposes.
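To make "profiling both the length and the exact nucleotide sequence" concrete, here is a minimal Python sketch that sizes the longest glutamine-codon tract in a single read and counts CAA interruptions. The function name `size_cag_tract`, the toy read, and the choice to treat CAA as part of the tract are assumptions made for illustration; the default threshold of 36 is taken from the abstract above. This is not the pipeline used in the review.

```python
import re


def size_cag_tract(read: str, threshold: int = 36) -> dict:
    """Size the longest run of CAG/CAA codons in a read (illustrative only)."""
    read = read.upper()
    # Collect every maximal run of glutamine codons (CAG or CAA).
    runs = [m.group(0) for m in re.finditer(r"(?:CAG|CAA)+", read)]
    if not runs:
        return {"codons": 0, "pure_cag": 0, "caa": 0, "expanded": False}
    tract = max(runs, key=len)
    codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    # Count the uninterrupted CAG codons at the start of the tract.
    pure_cag = 0
    for codon in codons:
        if codon != "CAG":
            break
        pure_cag += 1
    return {
        "codons": len(codons),            # total CAG + CAA codons in the tract
        "pure_cag": pure_cag,             # leading uninterrupted CAG repeats
        "caa": codons.count("CAA"),       # synonymous interruptions
        "expanded": pure_cag > threshold  # above the HD threshold quoted above
    }


# Hypothetical read: 42 CAG repeats, a CAA-CAG pair, then two proline codons.
toy_read = "ATG" + "CAG" * 42 + "CAACAG" + "CCGCCA"
result = size_cag_tract(toy_read)
```

Real reads carry sequencing errors and indels, so a practical tool would need error-tolerant matching rather than an exact regular expression.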


Figure 1. SI causes cell-type specific vulnerability. (A) Recent studies have shown that somatic expansions are not only tissue-specific, but also cell-type specific; (B) Vulnerable cell types preferentially undergo somatic expansion over the course of the patient's lifetime, ultimately leading to transcriptional dysregulation and cell death.
Figure 2. HTT allele structures influence HD AOO. The upper part of the diagram represents a reference HD allele with 42 CAG repeats followed by the typically human 'CAA-CAG' tract, leading to a protein with 42Q + 2Q (both CAA and CAG translate to glutamine, Q). The CCG-CCA pair [representing the initial tract of the proline-rich domain (PRD)] following the CAGs is also shown. Middle and bottom sections: GWAS-identified variants in the HTT allele nucleotide sequence that alter AOO; specifically, (middle) the LOI disease haplotype, an A-to-G synonymous mutation in the polyQ tract, leads to the same protein as the reference HD allele (42Q + 2Q), but accelerates disease onset. Conversely (bottom), in the DUP disease haplotype, the inclusion of an additional 'CAA-CAG' tract delays disease onset despite adding two extra Qs to the protein (42Q + 4Q).
When repetita no-longer iuvant: somatic instability of the CAG triplet in Huntington's disease

December 2024

·

36 Reads

·

3 Citations

Nucleic Acids Research

Trinucleotide repeats in DNA exhibit a dual nature due to their inherent instability. While their rapid expansion can diversify gene expression during evolution, exceeding a certain threshold can lead to diseases such as Huntington’s disease (HD), a neurodegenerative condition triggered by >36 C–A–G repeats in exon 1 of the Huntingtin gene. Notably, somatic instability (SI) of the tract allows these mutations, inherited from an affected parent, to expand further throughout the patient’s lifetime, resulting in a mosaic brain in which specific neurons exhibit variable and often extreme CAG lengths that ultimately lead to their death. Genome-wide association studies have identified genetic variants (both cis and trans, including mismatch repair modifiers) that modulate SI, as shown in blood cells, and influence HD’s age of onset. This review will explore the evidence for SI in HD and its role in disease pathogenesis, as well as the therapeutic implications of these findings. We conclude by emphasizing the urgent need for reliable methods to quantify SI for diagnostic and prognostic purposes.
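The allele structures in the Figure 2 caption above can be checked with a short worked example. The Python sketch below builds the reference, LOI and DUP alleles as codon lists, translates them, and compares the uninterrupted CAG run with the polyglutamine length. The allele layouts are my reading of the caption (with the LOI A-to-G change taken to turn CAA into CAG), and all names are hypothetical.

```python
# Only the four codons named in the figure caption are needed here.
CODON_TO_AA = {"CAG": "Q", "CAA": "Q", "CCG": "P", "CCA": "P"}
PRD_START = ("CCG", "CCA")  # first codons of the proline-rich domain


def build_allele(n_cag: int, tail: tuple) -> list:
    """Return an allele as a list of codons: n_cag CAGs followed by a tail."""
    return ["CAG"] * n_cag + list(tail)


def summarize(codons: list) -> dict:
    """Report uninterrupted CAG count and polyQ length for one allele."""
    protein = "".join(CODON_TO_AA[c] for c in codons)
    pure_cag = 0
    for codon in codons:
        if codon != "CAG":
            break
        pure_cag += 1
    polyq = len(protein) - len(protein.lstrip("Q"))  # leading run of Q
    return {"pure_cag": pure_cag, "polyq": polyq, "dna": "".join(codons)}


alleles = {
    # 42 CAG + CAA-CAG  -> 42Q + 2Q
    "reference": build_allele(42, ("CAA", "CAG") + PRD_START),
    # CAA -> CAG (synonymous) -> same protein, longer pure CAG run
    "LOI": build_allele(42, ("CAG", "CAG") + PRD_START),
    # extra CAA-CAG pair -> 42Q + 4Q
    "DUP": build_allele(42, ("CAA", "CAG", "CAA", "CAG") + PRD_START),
}
summary = {name: summarize(codons) for name, codons in alleles.items()}

for name, info in summary.items():
    print(name, "pure CAG:", info["pure_cag"], "polyQ:", info["polyq"])
```

Under these assumptions the LOI allele encodes the same 44-glutamine stretch as the reference while carrying a longer uninterrupted CAG run, and the DUP allele encodes a longer polyglutamine stretch with no change in the uninterrupted run, which is why protein length alone cannot capture the difference.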


G007 Short tandem repeats sequencing: methodological challenges and perspective for Huntington’s disease

September 2024

·

6 Reads

Journal of Neurology, Neurosurgery, and Psychiatry

Background: Huntington’s disease (HD) is a neurodegenerative disorder caused by a CAG repeat expansion in the gene encoding the huntingtin protein. Recently, Genome Wide Association Studies (GWAS) have identified genomic variations, occurring both at the HTT locus and in genes mostly involved in mismatch repair pathways, to be associated with disease onset and progression. Such mutations correlate with modulations of somatic instability, which is now considered the main driver of pathogenesis. Accordingly, the set-up of reliable sequencing-based methods for assessing somatic instability is of paramount importance.
Aims: In this work, we aimed to compare workflows for assessing somatic instability, by identifying strengths and drawbacks of each enrichment, sequencing and analysis method.
Methods: Based on research studies focusing on Short Tandem Repeats (STR) characterization available in the literature, we compared multiple enrichment methods coupled to various sequencing platforms and data analysis tools. We then performed a preliminary comparison on internal data, and discussed their accuracy in CAG sizing and sensitivity for rare-allele detection.
Results: We reported much higher enrichment for PCR-based methods, compared to CRISPR/Cas9 enrichment and adaptive sampling. Moreover, we described a decrease in sequencing quality at cycles above 300 for Illumina MiSeq, which was not seen for long-read sequencing platforms. As a last point, we showed a preliminary comparison among sequencing-based methods on internal data, showing the impact of each step towards the most accurate STR characterization.
Conclusions: We anticipate that long-read sequencing of PCR amplicons incorporating UMIs may represent a valid alternative to PCR-free enrichment methods, providing highly accurate performance in terms of repeat length and rare-allele detection. Nonetheless, we recommend accurate optimization of PCR primers and conditions, as a suboptimal number of PCR cycles may result in off-target products and an inadequate number of PCR duplicates.


B001 CAGinSTEM, a human embryonic stem cell platform to identify genetic factors implicated in Huntington’s disease

September 2024

·

6 Reads

Journal of Neurology, Neurosurgery, and Psychiatry

Background: It is well known that HD patients with similar CAG length show a wide range of variability in motor onset that can span up to two decades. One possible explanation resides in the fact that the inherited CAG repeats may expand in somatic tissues, especially in post-mitotic neurons, giving rise to a HTT mosaicism that results in longer than inherited CAG tracts in affected tissues, such as the striatum and cortex. This expansion may continue during the lifetime of the individual and contribute to exacerbating neuronal toxicity and selective neuronal degeneration. More recently, trans- and cis- modifiers of age of onset (AOO) have been identified. However, if and how they cause the progressive accumulation of CAG instability is still unclear.
Aim: To identify new cis and trans modifiers of CAG instability, we aimed to establish an isogenic human stem cell platform that, combined with third generation long-read sequencing, allows monitoring of HTT CAG size over time, both during mitotic cell replication and in post-mitotic neurons.
Methods: Starting from the H9 human embryonic stem cell (hES) line, we inserted a monoallelic Recombinant Mediated Exchange Cassette within HTT exon 1, which can be subsequently exchanged with any exon 1 variant in an efficient way. We generated a wide variety of exon 1 modified cell lines, which we refer to as the CAGinSTEM platform.
Results: Our data show that the CAGinSTEM platform is technically robust, as for each genotype we have multiple cell lines which have been quality checked. By exploiting the properties of the CAGinSTEM platform, we are testing how CAG length and composition affect CAG instability in terminally differentiated medium spiny neurons and in actively proliferating hES cells.
Conclusions: The CAGinSTEM platform offers a distinctive biological model system designed to explore genotype-phenotype correlations and investigate the mechanisms underlying CAG instability accumulation in postmitotic human neurons and other cell types.
Funded by an ERC Advanced Grant from the European Commission.


Figure 2. Key results executing the tools with default settings. (A) Number of hits detected by NanOlympicsMod for each tool in the Oligos dataset. (B) As (A) for the yeast dataset. (C) As (A) for the mouse dataset. (D) As (A) for the human dataset. (E) Distribution of m6A hits for each tool along the synthetic oligos. (F) as (E) for the yeast metagene. (G) as (E) for the mouse metagene. (H) as (E) for the human metagene. (I) Heatmap reporting the overlap of m6A hits for each pair of tools executed with default settings on the oligos dataset. The value in a cell represents, for each pair of tools, the proportion of hits in common to the number of hits of the tool on the row (see the schema on the left of the panel). (J) As in (I) for the yeast dataset. (K) As in (I) for the mouse dataset. (L) As in (I) for the human dataset.
Figure 3. Agreement with reference sets of m6A hits. (A) Precision, recall and F1 score for each tool executed at default conditions on the oligos dataset. According to Supplementary Table 1, GM and TM identify tools working on the genome (G) or transcriptome (T) space and requiring multiple conditions, respectively. GS and TS identify tools working on the genome (G) or transcriptome (T) space and requiring a single condition, respectively. (B) Precision and recall curves at different cut-off values for the tools indicated in (A) on the oligos dataset; for each tool, the default cut-off is indicated by a square; the performance of a random classifier is included. (C) As in (A) for the yeast dataset. (D) as in (A) for the mouse dataset. (E) as in (A) for the human dataset. (F) as in (B) for the yeast dataset. (G) as in (B) for the mouse dataset. (H) as in (B) for the human dataset.
Figure 4. Agreement with reference sets of m6A hits on RRACH+, accessible, and high-coverage bins. (A) Precision, recall and F1 score for each tool executed at default conditions on the mouse dataset on RRACH+ bins. According to Supplementary Table 1, GM and TM identify tools working on the genome (G) or transcriptome (T) space and requiring multiple conditions, respectively. GS and TS identify tools working on the genome (G) or transcriptome (T) space and requiring a single condition, respectively. (B) Precision and recall curves at different cut-off values for the tools indicated in (A) on the mouse dataset; for each tool, the default cut-off is indicated by a square; the performance of a random classifier is included. (C) as in (A) for DRACH+ bins outside of splice-site exclusion zones. (D) as in (B) for DRACH+ bins outside of splice-site exclusion zones. (E) as in (A) for bins with high coverage. (F) as in (B) for bins with high coverage.
Figure 5. Sequence features associated with true positive, false positive and false negative hits. (A) m6A hits of each tool were stratified based on their association to specific RRACH motifs, and their number and accuracy on the mouse dataset was reported. (B) Distribution of accuracy stratified for common and uncommon RRACH motifs. (C) De novo motif enrichment analysis was performed on 50 nt regions centred at false positive hits for each tool on the mouse dataset, and the most significant motif was reported, together with statistical significance and consensus motif; tools marked with * are restricted to RRACH/DRACH motifs by implementation. (D) Distribution of the GC content for 50 nt regions centred at true positive (TP), false negative (FN) and false positive (FP) m6A hits. (E) as (D) for the free energy. (F) as (D) for the Shannon entropy.
Figure 6. m6A calling saturation analysis. (A) Saturation analysis for m6A calling by various tools on the human dataset; the number of hits (y-axis) identified on subsets of the whole dataset (x-axis) is reported as a proportion of the number of hits identified on the whole dataset. (B) As in (A) where the y-axis reports the corresponding F1 score. (C) as in (A) where the y-axis reports the AUPRC.
Benchmarking of computational methods for m6A profiling with Nanopore direct RNA sequencing

January 2024

·

91 Reads

·

24 Citations

Briefings in Bioinformatics

N6-methyladenosine (m6A) is the most abundant internal eukaryotic mRNA modification, and is involved in the regulation of various biological processes. Direct Nanopore sequencing of native RNA (dRNA-seq) emerged as a leading approach for its identification. Several software tools have been published for m6A detection, and there is a strong need for independent studies benchmarking their performance on data from different species, and against various reference datasets. Moreover, a computational workflow is needed to streamline the execution of tools whose installation and execution remain complicated. We developed NanOlympicsMod, a Nextflow pipeline exploiting containerized technology for comparing 14 tools for m6A detection on dRNA-seq data. NanOlympicsMod was tested on dRNA-seq data generated from in vitro (un)modified synthetic oligos. The m6A hits returned by each tool were compared to the m6A position known by design of the oligos. In addition, NanOlympicsMod was used on dRNA-seq datasets from wild-type and m6A-depleted yeast, mouse and human, and each tool’s hits were compared to reference m6A sets generated by leading orthogonal methods. The performance of the tools markedly differed across datasets, and methods adopting different approaches showed different preferences in terms of precision and recall. Changing the stringency cut-offs allowed for tuning the precision-recall trade-off towards user preferences. Finally, we determined that precision and recall of tools are markedly influenced by sequencing depth, and that additional sequencing would likely reveal additional m6A sites. Thanks to the possibility of including novel tools, NanOlympicsMod will streamline the benchmarking of m6A detection tools on dRNA-seq data, improving future RNA modification characterization.
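The metrics named in the figure captions above (precision, recall, F1 score, and the row-normalized overlap between pairs of tools) can be sketched in a few lines of Python. This is a minimal illustration, assuming each tool's hits and the reference set are plain sets of positions; the function names and toy values are hypothetical and this is not the NanOlympicsMod code.

```python
def precision_recall_f1(hits, reference):
    """Compare one tool's hits with a reference set of m6A positions."""
    hits, reference = set(hits), set(reference)
    true_positives = len(hits & reference)
    precision = true_positives / len(hits) if hits else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def row_overlap(row_hits, col_hits):
    """Shared hits as a proportion of the hits of the tool on the row."""
    row_hits = set(row_hits)
    if not row_hits:
        return 0.0
    return len(row_hits & set(col_hits)) / len(row_hits)


# Toy positions, invented for illustration.
tool_a = {10, 20, 30, 40}
tool_b = {20, 30}
reference = {20, 30, 50}

precision, recall, f1 = precision_recall_f1(tool_a, reference)
overlap_a_vs_b = row_overlap(tool_a, tool_b)  # normalized by tool_a
overlap_b_vs_a = row_overlap(tool_b, tool_a)  # normalized by tool_b
```

Because the overlap is normalized by the row tool, the resulting heatmap is not symmetric: here half of `tool_a`'s hits are shared with `tool_b`, while all of `tool_b`'s hits are shared with `tool_a`.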


The Oxford Nanopore MinION as a Versatile Technology for the Diagnosis and Characterization of Emerging Plant Viruses

December 2023

·

76 Reads

Methods in molecular biology (Clifton, N.J.)

The emergence of novel viral epidemics that could affect major crops represents a serious threat to global food security. The early and accurate identification of the causative viral agent is the most important step for a rapid and effective response to disease outbreaks. In recent years, the Oxford Nanopore Technologies (ONT) MinION sequencer has been proposed as an effective diagnostic tool for the early detection and identification of emerging viruses in plants, providing many advantages compared with other high-throughput sequencing (HTS) technologies. Here, we provide a step-by-step protocol that we optimized to obtain the virome of “Lamon bean” plants (Phaseolus vulgaris L.), an agricultural product with Protected Geographical Indication (PGI) in the north-east of Italy, which is frequently subjected to multiple infections caused by different RNA viruses. The conversion of viral RNA into ds-cDNA enabled the use of the Genomic DNA Ligation Sequencing Kit and Native Barcoding DNA Kit, which were originally developed for DNA sequencing. This allowed the simultaneous diagnosis of both DNA- and RNA-based pathogens, providing a more versatile alternative to the use of direct RNA and/or direct cDNA sequencing kits.


STArS (STrain-Amplicon-Seq), a targeted nanopore sequencing workflow for SARS-CoV-2 diagnostics and genotyping

August 2022

·

57 Reads

·

1 Citation

Biology Methods and Protocols

Diagnostic tests based on reverse transcription–quantitative polymerase chain reaction (RT–qPCR) are the gold standard approach to detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection from clinical specimens. However, unless specifically optimized, this method is usually unable to recognize the specific viral strain responsible for coronavirus disease 2019, crucial information that is proving increasingly important in relation to virus spread and treatment effectiveness. Even if some RT–qPCR commercial assays are currently being developed for the detection of viral strains, they focus only on single/few genetic variants that may not be sufficient to uniquely identify a specific strain. Therefore, genome sequencing approaches remain the most comprehensive solution for virus genotyping and for recognizing viral strains, but their application is much less widespread due to higher costs. Starting from the well-established ARTIC protocol coupled to nanopore sequencing, in this work, we developed STArS (STrain-Amplicon-Seq), a cost/time-effective sequencing-based workflow for both SARS-CoV-2 diagnostics and genotyping. A set of 10 amplicons was initially selected from the ARTIC tiling panel, to cover: (i) all the main biologically relevant genetic variants located on the Spike gene; (ii) a minimal set of variants to uniquely identify the currently circulating strains; (iii) genomic sites usually amplified by the RT–qPCR method to identify SARS-CoV-2 presence. PCR-amplified clinical samples (both positive and negative for SARS-CoV-2 presence) were pooled together with a serially diluted exogenous amplicon at known concentration and sequenced on a MinION device. Thanks to a scoring rule, STArS was able to accurately classify positive samples in agreement with RT–qPCR results, both at the qualitative and quantitative level. Moreover, the method allowed effective genotyping of strain-specific variants and thus also returned the phylogenetic classification of SARS-CoV-2-positive samples. Thanks to the reduced turnaround time and costs, the proposed approach represents a step towards simplifying the clinical application of sequencing for viral genotyping, hopefully aiding in combatting the global pandemic.

Citations (4)


... In these studies, the precise measurement of CAG size and composition in individual brain cells, along with the corresponding transcriptional profiles, has become increasingly important. These aspects are discussed in detail in the accompanying article by some of the authors (17). Finally, we will review strategies aimed at reducing SI with the goal of fighting the disease. ...

Reference:

When repetita no-longer iuvant: somatic instability of the CAG triplet in Huntington's disease
Navigating triplet repeats sequencing: concepts, methodological challenges and perspective for Huntington's disease

Nucleic Acids Research

... expansion disorders, especially in the brain, is a critical factor in disease biology [4]. Transcription-induced DNA slippage and instability may have profound biological consequences in repeat-associated neurodegenerative diseases, and account for expanded repeats in terminally differentiated cells like neurons [5]. ...

When repetita no-longer iuvant: somatic instability of the CAG triplet in Huntington's disease

Nucleic Acids Research

... This is because generating ground truth data sets that can mimic high complexity biological samples can be challenging itself. In fact, algorithms can have discrepancies in performance when used for synthetic or biological RNA data set analysis [65], pointing out that for further challenges the combination of synthetic (including IVT) and in vivo DRS RNA samples can be used for training and validation respectively. This approach can highlight new algorithms strategies for the analysis of biological samples. ...

Benchmarking of computational methods for m6A profiling with Nanopore direct RNA sequencing

Briefings in Bioinformatics

... Nanopore platform classified 10 out of 14 samples as Apicomplexa positive, according to a scoring rule we developed, which classifies a sample as Apicomplexa positive in case the number of reads assigned to Apicomplexa is at least 5-fold the average number of reads assigned to Apicomplexa for negative controls. This scoring rule was adapted from previous works describing the adoption of Nanopore sequencing for pathogen detection [29,30], while the Illumina platform classified all 14 samples as Apicomplexa positive. In particular, the four samples classified as Apicomplexa negative by the Nanopore platform had a very low percentage of reads assigned to Apicomplexa also in the Illumina analysis, namely 0.22%, 0.09%, 0.04%, and 0.02% for the G0159P, G0173CR, G0173L, and G0225CR1 samples, respectively. ...

STArS (STrain-Amplicon-Seq), a targeted nanopore sequencing workflow for SARS-CoV-2 diagnostics and genotyping

Biology Methods and Protocols
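
The scoring rule quoted in the last citing excerpt above (a sample is called positive when its on-target read count is at least 5-fold the average count in the negative controls) can be sketched as follows. This is a minimal Python illustration of the rule as worded in that excerpt; the function name, sample names and read counts are hypothetical, and it is not claimed to be the exact STArS scoring rule.

```python
from statistics import mean


def classify_samples(sample_reads, negative_control_reads, fold=5.0):
    """Call a sample positive if its read count is >= fold x the control mean.

    sample_reads: dict mapping sample name -> on-target read count
    negative_control_reads: list of on-target read counts in negative controls
    """
    threshold = fold * mean(negative_control_reads)
    return {name: count >= threshold for name, count in sample_reads.items()}


# Invented counts: the controls average 4 reads, so the threshold is 20 reads.
controls = [2, 4, 6]
samples = {"sample_1": 25, "sample_2": 20, "sample_3": 19}
calls = classify_samples(samples, controls)
```

A fold-change threshold relative to negative controls keeps the call robust to background reads that arise from barcode cross-talk or contamination, at the cost of possibly missing very low-abundance positives.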