Marzia Angela Cremona

Laval University | ULAVAL · Department of Operations and Decision Systems

PhD

Publications: 28
Reads: 3,070
Citations: 219
Additional affiliations
July 2017 - June 2019
Pennsylvania State University
Position
  • Professor (Assistant)
February 2016 - June 2017
Pennsylvania State University
Position
  • PostDoc Position
February 2016 - June 2019
Pennsylvania State University
Position
  • PostDoc Position

Publications (28)
Preprint
We develop a new method to locally cluster curves and discover functional motifs, i.e. typical "shapes" that may recur several times along and across the curves capturing important local characteristics. In order to identify these shared curve portions, our method leverages ideas from functional data analysis (joint clustering and alignment of curv...
Article
Full-text available
Supplementary information: Supplementary data are available at Bioinformatics online.
Article
Full-text available
We investigate patterns of COVID-19 mortality across 20 Italian regions and their association with mobility, positivity, and socio-demographic, infrastructural and environmental covariates. Notwithstanding limitations in accuracy and resolution of the data available from public sources, we pinpoint significant trends exploiting information in curve...
Article
Full-text available
Significance: Multiple human genetic diseases are caused by mutations in the maternally transmitted DNA of mitochondria, the powerhouses of the cell. It is important to study how these mutations arise and accumulate with age, especially because humans in many societies now choose to have children at an older age. However, this is difficult to accomp...
Preprint
Full-text available
Modern sequencing technologies are not error-free, and might possess systematic biases in their error distributions. A potential cause for non-randomly occurring errors is the formation of alternative DNA structures (non-B DNA), such as G-quadruplexes (G4s), Z-DNA, or cruciform structures, during sequencing. Approximately 13% of the human genome ha...
Preprint
Full-text available
We consider functional data where an underlying smooth curve is composed not just with errors, but also with irregular spikes. We propose an approach that, combining regularized spline smoothing and an Expectation-Maximization algorithm, allows one to both identify spikes and estimate the smooth component. Imposing some assumptions on the error dis...
Preprint
Full-text available
Motivation: Protein-DNA binding sites of ChIP-seq experiments are identified where the binding affinity is significant based on a given threshold. The choice of the threshold is a trade-off between conservative region identification and discarding weak, but true binding sites. Results: We argue the biological relevance of weak binding sites and the i...
Article
Full-text available
Approximately 13% of the human genome can fold into non-canonical (non-B) DNA structures (e.g. G-quadruplexes, Z-DNA, etc.), which have been implicated in vital cellular processes. Non-B DNA also hinders replication, increasing errors and facilitating mutagenesis, yet its contribution to genome-wide variation in mutation rates remains unexplored. H...
Preprint
We investigate patterns of COVID-19 mortality across 20 Italian regions and their association with mobility, positivity, and socio-demographic, infrastructural and environmental covariates. Notwithstanding limitations in accuracy and resolution of the data available from public sources, we pinpoint significant trends exploiting information in curve...
Article
Long Interspersed Elements-1 (L1s) constitute >17% of the human genome and still actively transpose in it. Characterizing L1 transposition across the genome is critical for understanding genome evolution and somatic mutations. However, to date, L1 insertion and fixation patterns have not been studied comprehensively. To fill this gap, we investigat...
Article
Full-text available
Mutations create genetic variation for other evolutionary forces to operate on and cause numerous genetic diseases. Nevertheless, how de novo mutations arise remains poorly understood. Progress in the area is hindered by the fact that error rates of conventional sequencing technologies (1 in 100 or 1,000 base pairs) are several orders of magnitude...
Article
Supplementary information: Supplementary material is available at Bioinformatics online.
Article
Full-text available
Coadaptation between bacterial hosts and plasmids frequently results in adaptive changes restricted exclusively to the host genome, leaving plasmids unchanged. To better understand this remarkable stability, we transformed naïve E. coli cells with a plasmid carrying an antibiotic-resistance gene and forced them to adapt in a turbidostat environment. We t...
Preprint
In the last two decades, several biclustering methods have been developed as new unsupervised learning techniques to simultaneously cluster rows and columns of a data matrix. These algorithms play a central role in contemporary machine learning and in many applications, e.g. in computational biology and bioinformatics. The H-score is the evaluation...
Article
Full-text available
DNA conformation may deviate from the classical B-form in ~13% of the human genome. Non-B DNA regulates many cellular processes, including transcription and telomere maintenance; however, its effects on DNA polymerization speed and accuracy have not been investigated genome-wide. Such an inquiry is critical for informing neurological diseases and c...
Preprint
Full-text available
DNA conformation may deviate from the classical B-form in ~13% of the human genome. Non-B DNA regulates many cellular processes; however, its effects on DNA polymerization speed and accuracy have not been investigated genome-wide. Such an inquiry is critical for understanding neurological diseases and cancer genome instability. Here we present the...
Article
Summary: With increased generation of high-resolution sequence-based “Omics” data, detecting statistically significant effects at different genomic locations and scales has become key to addressing several scientific questions. IWTomics is an R/Bioconductor package (integrated in Galaxy) that, exploiting sophisticated Functional Data Analysis techni...
Chapter
We consider thousands of endogenous retroviruses detected in the human and mouse genomes, and quantify a large number of genomic landscape features at high resolution around their integration sites and in control regions. We propose to analyze these data employing a recently developed functional inferential procedure and functional logistic regression...
Article
Full-text available
Railway wheel wear prediction is essential for reliability and optimal maintenance strategies of railway systems. Indeed, an accurate wear prediction can have both economic and safety implications. In this paper we propose a novel methodology, based on Archard's equation and a local contact model, to forecast the volume of material worn and the cor...
Article
Full-text available
Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration and fixation of ERVs. Here we conducted a genome-...
Article
Full-text available
ChIP-seq experiments are widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications. However, downstream analysis of ChIP-seq data is currently restricted to the evaluation of signal intensity and the detection of enriched regions (peaks) in the genome. Other features of peak shape are...
Chapter
Full-text available
BarCamp is a fairly new type of event for the scientific and technological community. In full generality, it is an “unconference”, a meeting where everyone can contribute, presenting a topic and generating a discussion. In this paper, we propose the BarCamp as an innovative way of producing and communicating statistical knowledge, and we describe th...
Conference Paper
In recent years many techniques have been developed to study genetic and epigenetic processes. Here we focus on a particular Next Generation Sequencing method called ChIP-Seq (Chromatin ImmunoPrecipitation Sequencing), which makes it possible to investigate protein-DNA interactions. At present, in the relevant literature, the analysis of ChIP-Seq data is main...
Conference Paper
In recent years many techniques have been developed to study genetic and epigenetic processes. Among Next Generation Sequencing methods, ChIP-Seq (Chromatin ImmunoPrecipitation Sequencing) makes it possible to investigate protein-DNA interactions. At present only signal intensity is considered in the analysis of ChIP-Seq data, with the aim of detecting highly e...
