Fernando Bação
  • PhD in Information Management
  • Full Professor at Universidade NOVA de Lisboa

About

133
Publications
186,433
Reads
6,975
Citations
Current institution
Universidade NOVA de Lisboa
Current position
  • Full Professor

Publications

Article
Full-text available
This study examines the interplay between text summarization techniques and embeddings from Language Models (LMs) in constructing expert systems dedicated to the retrieval of legal precedents, with an emphasis on achieving cost-efficiency. Grounded in the growing domain of Artificial Intelligence (AI) in law, our research confronts the perennial ch...
Article
Full-text available
Although the imbalanced learning problem is best known in the context of classification tasks, it also affects other areas of learning algorithms, such as regression. For regression, the problem is characterized by the existence of a continuous target variable domain and the need for models capable of making accurate predictions about rare events....
Article
Full-text available
The e-commerce industry’s rapid growth, accelerated by the COVID-19 pandemic, has led to an alarming increase in digital fraud and associated losses. To establish a healthy e-commerce ecosystem, robust cyber security and anti-fraud measures are crucial. However, research on fraud detection systems has struggled to keep pace due to limited real-worl...
Article
The importance of legal precedents in ensuring consistent jurisprudence is undisputed. Particularly in jurisdictions following the Common law, but even in Civil law systems, uniformity in case law requires adherence to precedents. However, with the growing volume of cases, manual identification becomes a bottleneck, prompting the need for automatio...
Article
In recent years, the field of Topic Modeling (TM) has grown in importance due to the increasing availability of digital text data. TM is an unsupervised learning technique that helps uncover latent semantic structures in large sets of documents, making it a valuable tool for finding relevant patterns. However, evaluating the performance of TM algor...
Chapter
The illicit activity in Blockchain reached an all-time high in 2021. In this work, we combined two machine learning techniques, Autoencoder (AE) and Extreme Gradient Boosting (XGBoost), to improve the performance of predicting illicit activity at the account level. The choice of autoencoding technique allows us to detect new MOs (modus o...
Article
Full-text available
Judges frequently base their reasoning on precedents. Courts must preserve uniformity in decisions while, depending on the legal system, previous cases compel rulings. The search for methods to accurately identify similar previous cases is not new and has been a vital input, for example, to case-based reasoning (CBR) methodologies. This literature...
Article
Full-text available
Competitive Intelligence allows an organization to keep up with market trends and foresee business opportunities. This practice is mainly performed by analysts scanning for any piece of valuable information in a myriad of dispersed and unstructured sources. Here we present MapIntel, a system for acquiring intelligence from vast collections of text...
Article
Full-text available
Organized retail crime (ORC) is a significant issue for retailers, marketplace platforms, and consumers. Its prevalence and influence have increased rapidly in lockstep with the expansion of online commerce, digital devices, and communication platforms. Today, it is a costly affair, wreaking havoc on enterprises’ overall revenues and continually jeopa...
Article
Full-text available
The objective of this article is to provide a comparative analysis of two novel genetic programming (GP) techniques, differentiable Cartesian genetic programming for artificial neural networks (DCGPANN) and geometric semantic genetic programming (GSGP), with state-of-the-art automated machine learning (AutoML) tools, namely Auto-Keras, Auto-PyTorch...
Article
Full-text available
The generation of synthetic data can be used for anonymization, regularization, oversampling, semi-supervised learning, self-supervised learning, and several other tasks. Such broad potential motivated the development of new algorithms, specialized in data generation for specific data formats and Machine Learning (ML) tasks. However, one of the mos...
Article
Full-text available
Due to the difficulties inherent in diagnostics and prognostics, maintaining machine health remains a substantial issue in industrial production. Current approaches rely substantially on human engagement, making them costly and unsustainable, especially in high-volume industrial complexes like fulfillment centers. The length of time that fulfillmen...
Article
Full-text available
Apples are ranked third, after bananas and oranges, in global fruit production. Fresh apples are more likely to be appreciated by consumers during the marketing process. However, apples inevitably suffer mechanical damage during transport, which can affect their economic performance. Therefore, the timely detection of apples with surface defects ca...
Article
Full-text available
Active learning (AL) is a well-known technique to optimize data usage in training, through the interactive selection of unlabeled observations, out of a large pool of unlabeled data, to be labeled by a supervisor. Its focus is to find the unlabeled observations that, once labeled, will maximize the informativeness of the training dataset, therefore...
Preprint
Full-text available
Precedent is the cornerstone of the Common law system. Even in jurisdictions that follow Civil law, precedents constrain decisions when case law is sufficiently uniform. A systematic disregard of precedents makes judgments less coherent and the law less just. Nevertheless, relying on precedents can also make courts more efficient, whereas recent ad...
Article
Full-text available
Fraud, corruption, and collusion are the most common types of crime in public procurement processes; they produce significant monetary losses, inefficiency, and misuse of the public treasury. However, empirical research in this area to detect these crimes is still insufficient. This article presents a systematic literature review focusing on the mo...
Preprint
Full-text available
Judges frequently base their reasoning on precedents. In every circumstance, courts must preserve uniformity in case law and, depending on the legal system, previous cases compel rulings. The search for methods to accurately identify similar previous cases is not new and has been a vital input, for example, to case-based reasoning (CBR) methodologi...
Preprint
Judges frequently base their reasoning on precedents. Courts must preserve uniformity in decisions while, depending on the legal system, previous cases compel rulings. The search for methods to accurately identify similar previous cases is not new and has been a vital input, for example, to case-based reasoning (CBR) methodologies. This literature...
Preprint
Competitive Intelligence allows an organization to keep up with market trends and foresee business opportunities. This practice is mainly performed by analysts scanning for any piece of valuable information in a myriad of dispersed and unstructured sources. Here we present MapIntel, a system for acquiring intelligence from vast collections of text...
Chapter
Full-text available
Precedents constitute the starting point of judges’ reasoning in national legal systems. Precedents are also an essential input for case-based reasoning (CBR) methodologies. Although considerable research has been done on CBR applied to legal practice, the precedent retrieval techniques are a relatively new and unexplored field of AI & Law. Only a...
Chapter
Competitive Intelligence allows an organization to keep up with market trends and foresee business opportunities. This practice is mainly performed by analysts scanning for any piece of valuable information in a myriad of dispersed and unstructured sources. Here we present MapIntel, a system for acquiring intelligence from vast collections of text...
Preprint
Full-text available
In the Machine Learning research community, there is a consensus regarding the relationship between model complexity and the required amount of data and computation power. In real world applications, these computational requirements are not always available, motivating research on regularization methods. In addition, current and past research have...
Article
Full-text available
In the age of the data deluge there are still many domains and applications restricted to the use of small datasets. The ability to harness these small datasets to solve problems through the use of supervised learning methods can have a significant impact in many important areas. The insufficient size of training data usually results in unsatisfact...
Article
In this study, we use panel data to analyse the impact of an R&D tax credit on R&D personnel, particularly the impact on the allocation of Ph.D. holders, comparing low R&D intensity firms with medium-high and high R&D intensity firms. The results show that, in medium-high and high R&D intensity firms, the R&D tax credit had a significant impact on allocat...
Article
Learning from imbalanced data sets is known to be a challenging task. There are many proposals to tackle the challenge for classification problems, but regarding regression the solutions are few. In the context of regression, imbalanced learning means that there is a concern with the accurate prediction of the target values in a subset of the conti...
Chapter
Public procurement fraud is a plague that produces significant economic losses in any state and society, but empirical studies to detect it in this area are still scarce. This article presents a review of the most recent literature on public procurement to identify techniques for fraud detection using Network Science. Applying the PRISMA methodolog...
Article
Full-text available
Shopping through Live-Streaming Shopping Apps (LSSAs) as an emerging consumption phenomenon has increased dramatically in recent years, especially during the COVID-19 lockdown period. However, insufficient studies have focused on the psychological processes undergone in different customer demographics while shopping via LSSAs under pandemic conditi...
Article
Full-text available
Fraud in public funding can have deleterious consequences for societies’ economic, social, and political well-being. Fraudulent activity associated with public procurement contracts accounts for losses of billions of euros every year. Thus, it is of utmost relevance to explore analytical frameworks that can help public authorities identify agents t...
Article
Full-text available
In remote sensing, Active Learning (AL) has become an important technique to collect informative ground truth data “on-demand” for supervised classification tasks. Despite its effectiveness, it is still significantly reliant on user interaction, which makes it both expensive and time consuming to implement. Most of the current literature focuses on...
Article
Full-text available
India has proven to be one of the most diverse and dynamic economic regions in the world. Its industry focuses predominantly on the service sector, and immediate economic growth seems to be steering India toward becoming an economic superpower. India's unique business landscape is felt at a regional level, where massive urbanization has become an unavoidable conseq...
Article
Full-text available
Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. With significant upsides, the automatic production of Land Use/Land Cover maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge...
Article
Traditional supervised machine learning classifiers are challenged to learn highly skewed data distributions as they are designed to expect classes to equally contribute to the minimization of the classifier's cost function. Moreover, the classifier's design expects equal misclassification costs, causing a bias for overrepresented classes. Different...
Article
Full-text available
New technologies applied to transportation services in the city enable the shift to sustainable transportation modes, making bike-sharing systems (BSS) more popular in the urban mobility scenario. This study focuses on understanding the spatiotemporal station and trip activity patterns in the Lisbon BSS, based on 2018 data taken as the baseline, an...
Preprint
Full-text available
Fraud in public funding can have deleterious consequences for the economic, social, and political well-being of societies. Fraudulent activity associated with public procurement contracts accounts for losses of billions of euros every year. Thus, it is of utmost relevance to explore analytical frameworks that can help public authorities identify ag...
Article
Full-text available
Injuries have become devastating and often under-recognized public health concerns. In Canada, injuries are the leading cause of potential years of life lost before the age of 65. The geographical patterns of injury, however, are evident both over space and time, suggesting the possibility of spatial optimization of policies at the neighborhood sca...
Article
Full-text available
Cities are moving towards new mobility strategies to tackle smart cities’ challenges such as carbon emission reduction, urban transport multimodality and mitigation of pandemic hazards, emphasising the implementation of shared modes, such as bike-sharing systems. This paper poses a research question and introduces a corresponding systematic lite...
Article
Wealth in the Greater Toronto Area (GTA) continues to grow each year as Toronto's consumer market and population increase. Using a machine learning segmentation based on self-organizing maps, this paper examines the demographics, socioeconomics, and expenditure consumption patterns of the GTA's consumers. The results suggest that SOM may contribute...
Article
Full-text available
Owing to its convenience, reliability and contact-free nature, mobile payment (M-payment) has been widely adopted in China during the COVID-19 pandemic to reduce the direct and indirect contacts in transactions, allowing social distancing to be maintained and facilitating stabilization of the social economy. This paper aims to comprehensi...
Conference Paper
Mobile payment (M-payment), as an emerging financial transaction method, has been widely adopted in various contexts. In order to investigate the significant factors and espoused cultural moderators impacting users' M-payment continuance usage intention in China, this study proposes a comprehensive model integrating Unified Theory of Acceptance and...
Article
Food delivery apps (FDAs), as an emerging online-to-offline mobile technology, have been widely adopted by catering businesses and customers. Especially, as they have provided two-way beneficial catering delivery services in rescuing catering enterprises and satisfying customers’ technological and mental expectations under the COVID-19 global pandemic...
Article
Full-text available
The field of data science has had a significant impact in both academia and industry, and with good reason [...]
Article
Full-text available
The automatic production of land use/land cover maps continues to be a challenging problem, with important impacts on the ability to promote sustainability and good resource management. The ability to build robust automatic classifiers and produce accurate maps can have a significant impact on the way we manage and optimize natural resources. The d...
Conference Paper
This paper presents a systematic literature review investigating the factors affecting mobile payment adoption from the user perspective. A total of 58 selected papers were analyzed through the proposed five-step systematic literature review process. The results show that culture is an important factor affecting user adoption intention, m...
Article
Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. The SMOTE algorithm, as well as any other oversampling method based on the SMOTE m...
Article
Massive open online courses (MOOCs) contribute significantly to individual empowerment because they can help people learn about a wide range of topics. To realize the full potential of MOOCs, we need to understand their factors of success, here defined as use and user satisfaction, along with the individual and organizational performance resulting fro...
Article
Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versa...
Article
This article presents an analysis of the global digital divide, based on data collected from 45 countries, including the ones belonging to the European Union, OECD, Brazil, Russia, India, and China (BRIC). The analysis shows that one factor can explain a large part of the variation in the seven ICT variables used to measure the digital development...
Chapter
The interest in using information to improve the quality of living in large urban areas and the efficiency of its governance has been around for decades. Nevertheless, recent developments in information and communications technology have sparked new ideas in academic research, all of which are usually grouped under the umbrella term of Smart Cities...
Article
Full-text available
Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versa...
Article
Learning from imbalanced datasets is challenging for standard algorithms, as they are designed to work with balanced class distributions. Although there are different strategies to tackle this problem, methods that address the problem through the generation of artificial data constitute a more general approach compared to algorithmic modifications....
Article
Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. The SMOTE algorithm and its variations generate synthetic samples along a line seg...
Article
Learning from imbalanced datasets is a frequent but challenging task for standard classification algorithms. Although there are different strategies to address this problem, methods that generate artificial data for the minority class constitute a more general approach compared to algorithmic modifications. Standard oversampling methods are variati...
Article
Full-text available
This paper analyzes the digital development of 110 countries and its relationship with economic development. Using factor analysis, we combined seven ICT-related variables into a single measure of digital development. This measure was then used as the dependent variable in an OLS model that allows non-linear effects, with the GDP per capita of coun...
Article
In many remote-sensing projects, one is usually interested in a small number of land-cover classes present in a study area and not in all the land-cover classes that make up the landscape. Previous studies in supervised classification of satellite images have tackled the specific class mapping problem by isolating the classes of interest and combining...
Article
Full-text available
In many remote sensing projects on land cover mapping, the interest is often in a subset of the classes present in the study area. Conventional multi-class classification may lead to a considerable training effort and to the underestimation of the classes of interest. On the other hand, one-class classifiers require much less training, but may overe...
Data
E-learning systems are emerging in many settings of our society. Schools, universities, and several other organizations use these systems. E-learning systems allow learning anytime and anywhere. This medium may seem to be the answer to all learning barriers, but the effect of non-cognitive skills on the success of e-learning systems is yet to be ex...
Article
E-learning systems are enablers in the learning process, strengthening their importance as part of the educational strategy. Understanding the determinants of e-learning success is crucial for defining instructional strategies. Several authors have studied e-learning implementation and adoption, and various studies have addressed e-learning success...
Article
This paper addresses the international and internal digital divides that exist across and within the European member states according to the educational attainment of their populations. Our results suggest that even for those European countries that are outperforming their counterparts in terms of digital development, such as Finland, some internal...
Article
Full-text available
E-learning systems have witnessed a usage and research increase in the past decade. This article presents the e-learning concepts ecosystem. It summarizes the various scopes on e-learning studies. Here we propose an e-learning theoretical framework. This theory framework is based upon three principal dimensions: users, technology, and services rela...
Book
This book is a collection of articles that will be submitted as full papers to the AGILE annual international conference. These papers go through a rigorous review process and report original and unpublished fundamental scientific research. Those published cover significant research in the domain of geographic information science systems. This...
Article
Full-text available
There is a clear belief among academics and policy makers about the importance of ICT for sustainable development and welfare. Thus, all across the world, a variety of strategies to promote the digital development have been proposed and implemented by national and international authorities. Simultaneously, academics have been dedicating their effor...
Conference Paper
Full-text available
E-Learning systems play an important role in our society; they support instructors in the teaching process and also enable learners to access knowledge. However, e-learning is not the first concept that refers to the use of computerized systems in the learning process. This paper describes a bibliometric study. In this paper, we present the e-lear...
Conference Paper
Full-text available
Massive open online courses (MOOCs) are black swans. A black swan is an unexpected event that emerges from reality and alters the reality itself. MOOCs have affected supply and demand in higher education. MOOCs distribute knowledge and classes in many areas of expertise for free. Millions of users from all over the world are enrolling in MOOC courses....
Article
Full-text available
A plethora of national and regional applications need land-cover information covering large areas. Manual classification based on visual interpretation and digital per-pixel classification are the two most commonly applied methods for land-cover mapping over large areas using remote-sensing images, but both present several drawbacks. This paper tes...
Chapter
Full-text available
Portugal is a country with a high per capita consumption of medical drugs. High levels of medication imply not only risk to the patient but also a heavy burden on the National Health Service (Serviço Nacional de Saúde, SNS). Polymedication, according to many authors, is the consumption of at least five different drugs. Polymedication can have ser...
Conference Paper
Full-text available
E-learning systems are widely used from academia to industry. The usage of e-learning systems raises new research contexts. Multiple collaborative learning systems were implemented to improve people's interaction, communication, working, coordinating activities, socializing and learning. E-learning systems play a significant role in the learning act...
Chapter
A problem that Portugal is facing, which needs urgent effective health policies, is the socio-economic differences and inequalities that arise in access to health care. In this study we used data from the National Health Survey of 2005/2006 to investigate if socio-economic differences are related both to the frequency with which health services are used and...
Article
Our research analyses the digital divide within the European Union 27 between the years of 2008 and 2010. To accomplish this we use multivariate statistical methods, more specifically factor and cluster analysis, to address the European digital disparities. Our results lead to an identification of two latent dimensions and five groups of countries....
Article
Clustering constitutes one of the most popular and important tasks in data analysis. This is true for any type of data, and geographic data is no exception. In fact, in geographic knowledge discovery the aim is, more often than not, to explore and let spatial patterns surface rather than develop predictive models. The size and dimensionality of the...
Conference Paper
Full-text available
Our research aims to analyze the digital divide within the European Union 27 (EU-27). Hence we used a multivariate approach, more specifically Factor Analysis, to study the digital disparities between European Countries. Two latent dimensions on this subject were found. We also found statistical evidence that one of the dimensions on digital develo...
Article
Full-text available
Not all wildfire ignitions result in burned areas of a similar size. The aim of this study was to explore whether there was a size-dependent pattern (in terms of resulting burned area) of fire ignitions in Portugal. For that purpose we characterised 71,618 fire ignitions occurring in the country in the period 2001–2003, in terms of population densi...
Article
Full-text available
Clustering constitutes one of the most popular and important tasks in data analysis. The size and dimensionality of the existing geospatial databases stress the need for efficient and robust spatial clustering algorithms. In this paper we present the GeoSOM suite as a spatial clustering tool. GeoSOM suite implements the GeoSOM algorithm, which allo...
Article
Full-text available
Portugal has the highest density of wildfire ignitions among southern European countries. The ability to predict the spatial patterns of ignitions constitutes an important tool for managers, helping to improve the effectiveness of fire prevention, detection and firefighting resources allocation. In this study, we analyzed 127 490 ignitions that occ...
Article
Full-text available
A new methodology is presented that measures density in urban systems. By combining highly detailed height measurements with, amongst others, topographical data we are able to quantify urban volume. This new approach is demonstrated in two separate case studies that relate to the temporal and spatial dimension of the urban environment, respectively...
Conference Paper
Full-text available
The large amount of spatial data available today demands the use of data mining tools for its analysis. One of the most used data mining techniques is clustering. Several methods for spatial clustering exist, but many consider space as just another variable. We present in this paper a tool particularly suited for spatial clustering: the GeoSOM suit...
Conference Paper
Full-text available
This paper presents a simple way to compensate the magnification effect of Self-Organizing Maps (SOM) when creating cartograms using Carto-SOM. It starts with a brief explanation of what a cartogram is, how it can be used, and what sort of metrics can be used to assess its quality. The methodology for creating a cartogram with a SOM is then pres...
Article
Full-text available
The basic idea of a cartogram is to distort a map. This distortion comes from the substitution of area for some other variable (in most examples population). The objective is to scale each region according to the value it represents for the new variable, while keeping the map recognizable. The use of cartograms predates the use of computerize...
Conference Paper
Full-text available
The method proposed in this paper supports UAV network path definition in an autonomous way, taking into consideration the density of the detected events at each moment, in each place. We use self-organizing maps to detect event patterns in the field of view of the sensors, allowing unmanned aerial vehicle (UAV) path definition based on e...
Conference Paper
Full-text available
To deal with the huge volume of information provided by remote sensing satellites, which produce images used for agriculture monitoring, urban planning, deforestation detection and so on, several algorithms for image classification have been proposed in the literature. This article compares two approaches, called Expectation-Maximization (EM) and S...
Chapter
Full-text available
According to the statistics, Portugal has the highest density of wildfire ignitions among southern European countries. The ability to predict ignition occurrence constitutes an important tool for managers, helping to improve the effectiveness of fire prevention, detection and firefighting resource allocation. In this study we used a database with...
