
Hadi Fanaee-T
Halmstad University · ITE - School of Information Technology
PhD in Computer Science
Natural Learning: A Revolution in Explainability and Interpretability
Paper: https://arxiv.org/pdf/2404.05903.pdf
About
Publications: 46
Reads: 55,450
Citations: 1,016
Introduction
Hadi Fanaee-T received his PhD degree (with distinction) in Computer Science from the Faculty of Science of the University of Porto in 2015. His main research interests are tensor analysis, anomaly detection and spatiotemporal data mining, and he has published over 20 peer-reviewed articles in these areas. He has served as a PC member for leading conferences such as IJCAI, AAAI, ECML-PKDD, DS, DSSA, and so forth. He also reviews articles for high-impact journals such as TDKE, DAMI, MLJ, KAIS, KBS, and CSUR. He gave an invited talk at the summer school on mining big and complex data. Dr. Fanaee is the leader of the SimTensor project (simtensor.eu.org) and was a finalist in the ERCIM Cor Baayen Young Researcher Award 2017. Presently, he is co-advising three PhD students and a Master student.
Education
February 2011 - November 2015
September 2008 - September 2010
Publications (46)
Several approaches have been developed for improving the ship energy efficiency, thereby reducing operating costs and ensuring compliance with climate change mitigation regulations. Many of these approaches will heavily depend on measured data from onboard IoT devices, including operational and environmental information, as well as external data so...
iHelm Project Summary Report
Time series forecasting is an important problem with various applications in different domains. Improving forecast performance has been the center of investigation in the last decades. Several research studies have shown that old statistical methods, such as ARIMA, are still state-of-the-art in many domains and applications. However, one of the main...
The increasing use of AI methods in various applications has raised concerns about their explainability and transparency. Many solutions have been developed within the last few years to either explain the model itself or the decisions provided by the model. However, the number of contributions in the field of eXplainable AI (XAI) is increasing at s...
Many real-world tensors come with missing values. The task of estimation of such missing elements is called tensor completion (TC). It is a fundamental problem with a wide range of applications in data mining, machine learning, signal processing, and computer vision. In the last decade, several different algorithms have been developed, couple of th...
Electronic Health Records (EHR) data is routinely generated patient data that can provide useful information for analytical tasks such as disease detection and clinical event prediction. However, temporal EHR data such as physiological vital signs and lab test results are particularly challenging. Temporal EHR features typically have different samp...
Densification events in time-evolving networks refer to instants in which the network density, that is, the number of edges, is substantially larger than in the remaining instants. These events can occur at a global level, involving the majority of the nodes in the network, or at a local level involving only a subset of nodes. While global densification even...
Social networks are becoming larger and more complex as new ways of collecting social interaction data arise (namely from online social networks, mobile device sensors, ...). These networks are often large-scale and of high dimensionality. Therefore, dealing with such networks has become a challenging task. An intuitive way to deal with this complexit...
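As a rough illustration of the global case described in the abstract above, the sketch below flags snapshots whose edge count is far above the typical level. The threshold rule (median plus `k` scaled median absolute deviations), the function name `densification_events` and the parameter `k` are assumptions made for this example; they are not taken from the paper.

```python
from statistics import median


def densification_events(snapshots, k=3.0):
    """Return indices of snapshots with an unusually high number of edges.

    snapshots: list of edge collections, one per time step.
    k: hypothetical sensitivity parameter (scaled MADs above the median).
    """
    counts = [len(edges) for edges in snapshots]
    med = median(counts)
    # Median absolute deviation, a robust estimate of spread.
    mad = median([abs(c - med) for c in counts])
    # 1.4826 makes the MAD comparable to a standard deviation for normal data;
    # fall back to 1.0 when the MAD is zero (e.g. all counts identical).
    scale = 1.4826 * mad or 1.0
    threshold = med + k * scale
    return [t for t, c in enumerate(counts) if c > threshold]


# Toy example: seven snapshots; the fifth one has many more edges.
sizes = [10, 12, 11, 9, 40, 10, 11]
snapshots = [{(i, i + 1) for i in range(n)} for n in sizes]
events = densification_events(snapshots)
```

A robust statistic is used here so that the dense snapshot itself does not inflate the baseline. Local events, which involve only a subset of nodes, would need a per-subgraph analysis that this sketch does not attempt.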
Oil spills cause environmental pollution with a serious threat to local communities and sustainable development. Accidental oil spills can be modelled as a stochastic process where each oil spill event is described by its spatial locations and incidence-time and hence allow for space-time cluster analysis. Space-time cluster analysis can detect spa...
Tensor decompositions are multi-way analysis tools which have been successfully applied in a wide range of different fields. However, there are still challenges that remain little explored, namely the following: when applying tensor decomposition techniques, what should we expect from the result? How can we evaluate its quality? It is expected that, w...
We introduce a new concept called “Iterative Multi-Mode Discretization (IMMD)” which is a new type of efficient data sparsification that can scale up many tasks in data mining. In this paper we demonstrate the application of IMMD in co-clustering, i.e. simultaneous clustering of the rows and columns in a matrix. We propose IMMD-CC, a novel co-clust...
The increasing presence of renewable energy plants has created new challenges such as grid integration, load balancing and energy trading, making it fundamental to provide effective prediction models. Recent approaches in the literature have shown that exploiting spatio-temporal autocorrelation in data coming from multiple plants can lead to better...
Anomaly detection in time-evolving networks has many applications, for instance, traffic analysis in transportation networks and intrusion detection in computer networks. One group of popular methods for anomaly detection in evolving networks is robust online subspace trackers. However, these methods suffer from the problem of insensitivity to drast...
The number of tuberculosis (TB) cases in Pakistan ranks fifth in the world. The National TB Control Program (NTP) has recently reported more than 462,920 TB patients in Khyber Pakhtunkhwa province, Pakistan from 2002 to 2017. This study aims to identify spatial and space-time clusters of TB cases in Khyber Pakhtunkhwa province Pakistan during 2015–...
Extracting operation cycles from the historical reading of sensors is an essential step in IoT data analytics. For instance, we can exploit the obtained cycles for learning the normal states to feed into semi-supervised models or dictionaries for efficient real-time anomaly detection on the sensors. However, this is a difficult problem due to this...
Existing approaches for detecting anomalous events in time-evolving networks usually focus on detecting events involving the majority of the nodes, which affect the overall structure of the network. Since events involving just a small subset of nodes usually do not affect the overall structure of the network, they are more difficult to spot. In thi...
Dimension reduction (DR) methods play an essential role in analyzing and visualizing high-dimensional multi-source data. In recent decades, many variants of these methods have been developed in various disciplines and domains. Due to the diversity and an ever-increasing number of developed techniques, choosing the right method for the given pro...
Motivation:
Visualization of high-dimensional data is an important step in exploratory data analysis and knowledge discovery. However, it is challenging, because the interpretation is highly subjective. If we see dimensionality reduction (DR) techniques as the main tool for data visualization, they are like multiple cameras that look into the data...
Due to the scale and complexity of today's social networks, it becomes infeasible to mine them with traditional approaches. A possible solution to reduce such scale and complexity is to produce a compact (lossy) version of the network that represents its major properties. This task is known as graph summarization, which is the subject of this resea...
Identifying the abnormally high-risk regions in a spatiotemporal space that contains an unexpected disease count is helpful to conduct surveillance and implement control strategies. The EigenSpot algorithm has been recently proposed for detecting space-time disease clusters of arbitrary shapes with no restriction on the distribution and quality of...
The data on suspected measles cases and the population at risk.
The software solution for the proposed algorithm in MATLAB.
Mobility mining has many applications in urban planning and transportation systems. In particular, extracting mobility patterns gives service providers a global insight into mobility behaviors, which consequently leads to better services for citizens. In recent years, several data mining techniques have been present...
SimTensor is a multi-platform, open-source software for generating artificial tensor data (either with CP/PARAFAC or Tucker structure) for reproducible research on tensor factorization algorithms. SimTensor is a stand-alone application based on MATLAB. It provides a wide range of facilities for generating tensor data with various configurations. It...
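To make "artificial tensor data with CP/PARAFAC structure" concrete, here is a minimal NumPy sketch that builds a three-way tensor as a sum of rank-one terms. It is not SimTensor code (SimTensor is MATLAB-based); the function name `cp_tensor`, its parameters, and the choice of Gaussian factors and noise are assumptions made for illustration.

```python
import numpy as np


def cp_tensor(shape, rank, noise_std=0.0, seed=0):
    """Generate a three-way tensor with CP structure.

    X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r], plus optional noise.
    Returns the tensor and the list of factor matrices [A, B, C].
    """
    rng = np.random.default_rng(seed)
    # One random factor matrix per mode, each with `rank` columns.
    factors = [rng.standard_normal((dim, rank)) for dim in shape]
    A, B, C = factors
    # Sum of `rank` outer products a_r o b_r o c_r.
    X = np.einsum("ir,jr,kr->ijk", A, B, C)
    if noise_std > 0:
        X = X + noise_std * rng.standard_normal(X.shape)
    return X, factors


# A noise-free 4 x 5 x 6 tensor of CP rank 2.
X, (A, B, C) = cp_tensor((4, 5, 6), rank=2)
```

Because the ground-truth factors are returned alongside the tensor, a factorization algorithm can be scored by how well it recovers `A`, `B` and `C`, which is the reproducibility use case the abstract describes.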
A traffic tensor is a new data model for conventional origin/destination (O/D) matrices. Tensor models are traffic data analysis techniques which use this new data model to improve performance. Tensors outperform other models because both temporal and spatial fluctuations of traffic patterns are simultaneously taken into account, obtainin...
Traditional spectral-based methods such as PCA are popular for anomaly detection in a variety of problems and domains. However, if data includes tensor (multiway) structure (e.g. space-time-measurements), some meaningful anomalies may remain invisible with these methods. Although tensor-based anomaly detection (TAD) has been applied within a variet...
Tensor analysis is a powerful tool for multiway problems in data mining, signal processing, pattern recognition and many other areas. Nowadays, the most important challenges in tensor analysis are efficiency and adaptability. Still, the majority of techniques are not scalable or not applicable in streaming settings. One of the promising frameworks...
Syndromic surveillance systems continuously monitor multiple pre-diagnostic daily streams of indicators from different regions with the aim of early detection of disease outbreaks. The main objective of these systems is to detect outbreaks hours or days before the clinical and laboratory confirmation. The type of data that is being generated via th...
Hotspot detection aims at identifying subgroups in the observations that are unexpected with respect to some baseline information. For instance, in disease surveillance, the purpose is to detect sub-regions in spatiotemporal space where the count of reported diseases (e.g. cancer) is higher than expected with respect to the population. The s...
Failure detection in telecommunication networks is a vital task. So far, several supervised and unsupervised solutions have been provided for discovering failures in such networks. Among them, unsupervised approaches have attracted more attention since no labeled data is required. Often, network devices are not able to provide information about the typ...
Space and time are two critical components of many real-world systems. For this reason, the analysis of anomalies in spatiotemporal data has been of great interest. In this work, the application of tensor decomposition and eigenspace techniques to spatiotemporal hotspot detection is investigated. An algorithm called SST-Hotspot is proposed which accounts...
Event labeling is the process of marking events in unlabeled data. Traditionally, this is done by involving one or more human experts through an expensive and time consuming task. In this article we propose an event labeling system relying on an ensemble of detectors and background knowledge. The target data are the usage log of a real bike sharing...
Bike sharing systems are a new generation of traditional bike rentals where the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent a bike from a particular position and return it at another position. Currently, there are over 500 bike-sharing programs around the world wh...
Online forums enable users to discuss various topics together. One of the serious problems of these environments is the high volume of discussions and the resulting information overload. Unfortunately, without considering users' interests, traditional Information Retrieval (IR) techniques are not able to solve the problem. Therefore, employment...
So far, several supervised and unsupervised solutions have been provided for detecting failures in telecommunication networks. Among them, unsupervised approaches have attracted more attention since no labeled data is required [1]. Principal component analysis (PCA) is a well-known unsupervised technique to solve this type of problem when data is organiz...
Top-k spatial preference queries have a wide range of applications in service recommendation and decision support systems. In this work we first introduce three state-of-the-art algorithms and apply them to a real data set which includes geographic coordinates and quality data of over 355 hotels, 276 points of interest and 563 restaurants in Lisbon,...
Nowadays, a vast amount of spatio-temporal data is being generated by devices like cell phones, GPS and remote sensing devices, and therefore discovering interesting patterns in such data has become an interesting topic for researchers. One of these topics has been spatio-temporal clustering, which is a novel subfield of data mining, and recent researc...
The ranking of geographically referenced objects based on the amenities in their neighborhood has been recently addressed in [2, 3]. These works present a methodology to specify queries using spatial and nonspatial features to compute an overall score for the given objects. The spatial features considered are the location of the objects to be ranke...
In this paper, the use of an Adaptive Neural-Fuzzy Inference System (ANFIS) to study the design of Short-Term Load Forecasting (STLF) systems for the east of Iran was explored. This paper forecasts consumed load by using multi ANFIS. The inputs to the presented multi-ANFIS model include the date of the day, maximum and minimum temperature,...
Questions (5)
We want to do research on CDR data. The best data set for this purpose is the D4D challenge data set (http://www.d4d.orange.com/en/Accueil). Unfortunately, the challenge is closed and the data set is no longer available. Could anyone who has already downloaded it please provide me with a private link?
Does anyone know any other public CDR dataset?
Thanks
I would like to simulate a time-evolving network (or social network) with the ability to:
- Pre-define a desired number of communities.
- Pre-define a desired number of anomalies (edges, nodes or substructure, or time-stamp)
Does anyone know any software or publication related to this issue?