Roberto Corizzo
American University Washington D.C. | AU · Department of Computer Science

PhD

About

36
Publications
11,396
Reads
504
Citations
Additional affiliations
August 2020 - present
American University Washington D.C.
Position
  • Professor (Assistant)
July 2019 - July 2019
American University Washington D.C.
Position
  • Research Associate
October 2018 - present
Università degli Studi di Bari Aldo Moro
Position
  • PostDoc Position

Publications (36)
Article
Full-text available
Lifelong learning addresses the challenge of acquiring new knowledge and tackling new tasks in a continually evolving environment. Although this thread of research has recently received increased interest, most lifelong machine learning approaches proposed thus far focus on object recognition or classification tasks. In contrast, lifelong approache...
Article
Full-text available
The Growing Hierarchical Self-Organizing Map (GHSOM) algorithm has shown its potential for performing several tasks such as exploratory analysis, anomaly detection and forecasting on a variety of domains including the financial and cyber-security domains. GHSOM is a dynamic variant of the SOM algorithm which generates a multi-level hierarchy of SOM...
Article
Full-text available
The increasing presence of geo-distributed sensor networks implies the generation of huge volumes of data from multiple geographical locations at an increasing rate. This raises important issues which become more challenging when the final goal is that of the analysis of the data for forecasting purposes or, more generally, for predictive tasks. Th...
Article
Full-text available
Gravitational waves represent a new opportunity to study and interpret phenomena from the universe. In order to efficiently detect and analyze them, advanced and automatic signal processing and machine learning techniques could help to support standard tools and techniques. Another challenge relates to the large volume of data collected by the dete...
Conference Paper
Full-text available
Detecting relevant changes in dynamic time series data in a timely manner is crucially important for many data analysis tasks in real-world settings. Change point detection methods have the ability to discover changes in an unsupervised fashion, which represents a desirable property in the analysis of unbounded and unlabeled data streams. However,...
Preprint
Full-text available
Detecting relevant changes in dynamic time series data in a timely manner is crucially important for many data analysis tasks in real-world settings. Change point detection methods have the ability to discover changes in an unsupervised fashion, which represents a desirable property in the analysis of unbounded and unlabeled data streams. However,...
Conference Paper
Full-text available
Recent advances in medical imaging and deep learning have enabled the efficient analysis of large databases of images. Notable examples include the analysis of computed tomography (CT), magnetic resonance imaging (MRI), and X-ray. While the automatic classification of images has proven successful, adopting such a paradigm in the medical healthcare...
Conference Paper
Structural concept complexity, class overlap, and data scarcity are some of the most important factors influencing the performance of classifiers under class imbalance conditions. When these effects were uncovered in the early 2000s, understandably, the classifiers on which they were demonstrated belonged to the classical rather than Deep Learning...
Chapter
Long-tailed distributions and class imbalance are problems of significant importance in applied deep learning where trained models are exploited for decision support and decision automation in critical areas such as health and medicine, transportation and finance. The challenge of learning deep models from such data remains high, and the state-of-t...
Chapter
The huge amount of data generated by sensor networks enables many potential analyses. However, one important limiting factor for the analyses of sensor data is the possible presence of anomalies, which may affect the validity of any conclusion we could draw. This aspect motivates the adoption of a preliminary anomaly detection method. Existing meth...
Preprint
Full-text available
Structural concept complexity, class overlap, and data scarcity are some of the most important factors influencing the performance of classifiers under class imbalance conditions. When these effects were uncovered in the early 2000s, understandably, the classifiers on which they were demonstrated belonged to the classical rather than Deep Learning...
Conference Paper
Full-text available
Learning from imbalanced data poses significant challenges for the classifier. This becomes even more difficult when dealing with multi-class problems. Here, relationships among classes are no longer well-defined, and it is easy to lose performance on one class while gaining on another. In recent years this topic has gained increased interest...
Article
Full-text available
Air pollution is a global problem, especially in urban areas where the population density is very high due to the diverse pollutant sources such as vehicles, industrial plants, buildings, and waste. North Macedonia, as a developing country, has a serious problem with air pollution. The problem is highly present in its capital city, Skopje, where ai...
Article
Full-text available
Air pollution is becoming a rising and serious environmental problem, especially in urban areas affected by an increasing migration rate. The large availability of sensor data enables the adoption of analytical tools to provide decision support capabilities. Employing sensors facilitates air pollution monitoring, but the lack of predictive capabili...
Article
Full-text available
Applied machine learning in bioinformatics is growing as computer science slowly invades all research spheres. With the arrival of modern next-generation DNA sequencing algorithms, metagenomics is becoming an increasingly interesting research field as it finds countless practical applications exploiting the vast amounts of generated data. This stud...
Preprint
Full-text available
Class imbalance is a problem of significant importance in applied deep learning where trained models are exploited for decision support and automated decisions in critical areas such as health and medicine, transportation, and finance. The challenge of learning deep models from imbalanced training data remains high, and the state-of-the-art soluti...
Preprint
Full-text available
Class imbalance is a problem of significant importance in applied deep learning where trained models are exploited for decision support and automated decisions in critical areas such as health and medicine, transportation, and finance. The challenge of learning deep models from imbalanced training data remains high, and the state-of-the-art solutio...
Chapter
The next-generation sequencing revolution has impacted biological research by allowing the collection and analysis of very large datasets. However, despite the large availability of data, current computational methods used by biologists present some limitations in challenging domains, such as extremely imbalanced datasets characterized by almost on...
Article
Full-text available
The increasing presence of renewable energy plants has created new challenges such as grid integration, load balancing and energy trading, making it fundamental to provide effective prediction models. Recent approaches in the literature have shown that exploiting spatio-temporal autocorrelation in data coming from multiple plants can lead to better...
Conference Paper
Full-text available
The effects of air pollution on people, the environment, and the global economy are profound and often under-recognized. Air pollution is becoming a global problem. Urban areas have dense populations and a high concentration of emission sources: vehicles, buildings, industrial activity, waste, and wastewater. Tackling air pollution is an immediate...
Article
Full-text available
Smart grids are power grids where clients may actively participate in energy production, storage and distribution. Smart grid management raises several challenges, including the possible changes and evolutions in terms of energy consumption and production, that must be taken into account in order to properly regulate the energy distribution. In thi...
Article
Full-text available
Remote Sensing (RS) image classification has recently attracted great attention for its application in different tasks, including environmental monitoring, battlefield surveillance, and geospatial object detection. The best practices for these tasks often involve transfer learning from pre-trained Convolutional Neural Networks (CNNs). A common appr...
Article
Full-text available
Scene classification relying on images is essential in many systems and applications related to remote sensing. The scientific interest in scene classification from remotely collected images is increasing, and many datasets and algorithms are being developed. The introduction of convolutional neural networks (CNN) and other deep learning techniques...
Article
Full-text available
Aim: The analysis of network traffic plays a crucial role in modern organizations since it can provide defense mechanisms against cyberattacks. In this context, machine learning algorithms can be fruitfully adopted to identify malicious patterns in network sessions. However, they cannot be directly applied to a raw data representation of network tr...
Conference Paper
In many real-world applications, the characteristics of data collected by activity logs, sensors and mobile devices change over time. This behavior is known as concept drift. In complex environments, which produce high dimensional data streams, machine learning tasks become cumbersome, as models become outdated very quickly. In our study, we assess...
Conference Paper
Following a series of deep learning breakthroughs in the area of image segmentation, multiple objects in an image input can be finely sub-categorized. Although Convolutional Neural Networks (CNNs) are known for their state-of-the-art performance in image classification, they present drawbacks when used to analyze different data types, such as time...
Article
Full-text available
Recent developments in sensor networks and mobile computing have led to a huge increase in generated data that need to be processed and analyzed efficiently. In this context, many distributed data mining algorithms have recently been proposed. Following this line of research, we propose the DENCAST system, a novel distributed algorithm implemented in Ap...
Article
Full-text available
In renewable energy forecasting, data are typically collected by geographically distributed sensor networks, which poses several issues. (i) Data represent physical properties that are subject to concept drift, i.e., their characteristics could change over time. To address the concept drift phenomenon, adaptive online learning methods should be con...
Chapter
Full-text available
In recent years, there has been growing interest in the use of Big Data models to support advanced data analysis functionalities. Many companies and organizations lack the IT expertise and budget to benefit from them. To fill this gap, a model-based approach for Big Data Analytics-as-a-service (MBDAaaS) can be used. The proposed...
Article
Full-text available
In this paper, we tackle the problem of power prediction of several photovoltaic (PV) plants spread over an extended geographic area and connected to a power grid. The paper is intended to be a comprehensive study of one-day ahead forecast of PV energy production along several dimensions of analysis: i) The consideration of the spatio-temporal auto...
Chapter
Full-text available
Predicting the output power of renewable energy production plants distributed over a wide territory is a valuable goal, both for marketing and energy management purposes. The Vi-POC (Virtual Power Operating Center) project aims to design and implement a prototype able to achieve this goal. Due to the heterogeneity and the high volum...
Conference Paper
Full-text available
Predicting the output power of renewable energy production plants distributed on a wide territory is a valuable goal, both for marketing and energy management purposes. In this paper, we describe Vi-POC (Virtual Power Operating Center) – a distributed system for storing huge amounts of data, gathered from energy production plants and weather predic...
Chapter
Full-text available
The problem of accurately predicting the energy production from renewable sources has recently received increasing attention from both the industrial and the research communities. It presents several challenges, such as coping with the rate at which data are provided by sensors, the heterogeneity of the data collected, power plant efficiency, as well as...
Article
Full-text available
The problem of accurately predicting the energy production from renewable sources has recently received increasing attention from both the industrial and the research communities. It presents several challenges, such as coping with the high rate at which data are provided by sensors, the heterogeneity of the data collected, power plant efficiency, as we...