
Paolo Giudici
- Professor of Statistics
- Professor (Full) at University of Pavia
SAFE (Sustainable, Accurate, Fair and Explainable) machine learning
About
Publications: 294
Reads: 111,275
Citations: 7,085
Introduction
Professor of Statistics and of machine learning.
Author of several scientific publications, which mainly concern statistical learning models to obtain predictions and/or risk measures in economics and finance.
H-index: 49 (Google Scholar), 37 (Scopus), 32 (Clarivate Web of Science).
Current institution
Additional affiliations
February 2024 - present
May 2016 - December 2016
April 2010 - December 2016
Credito Valtellinese Banking Group
Position
- Managing Director
Education
December 1990 - October 1993
September 1989 - December 1990
October 1984 - July 1989
Publications (294)
The paper arises from the experience of Applied Stochastic Models in Business and Industry which has seen, over the years, more and more contributions related to Machine Learning rather than to what was intended as a stochastic model. The very notion of a stochastic model (e.g., a Gaussian process or a Dynamic Linear Model) can be subject to change...
Several explainable AI methods are available, but there is a lack of a systematic comparison of such methods. This paper contributes in this direction, by providing a framework for comparing alternative explanations in terms of complexity and robustness. We exemplify our proposal on a real case study in the cybersecurity domain, namely, phishing we...
The coefficient of variation, which measures the variability of a distribution from its mean, is not uniquely defined in the multidimensional case, and neither is the multidimensional Gini index, which measures the inequality of a distribution in terms of the mean differences among its observations. In this paper, we connect these two notions of sparsit...
Measuring distances in a multidimensional setting is a challenging problem, which appears in many fields of science and engineering. In this paper, to measure the distance between two multivariate distributions, we introduce a new measure of discrepancy which is scale invariant and which, in the case of two independent copies of the same distributi...
Measuring the degree of inequality expressed by a multivariate statistical distribution is a challenging problem, which appears in many fields of science and engineering. In this paper, we propose to extend the well known univariate Gini coefficient to multivariate distributions, by maintaining most of its properties. Our extension is based on the...
This paper introduces a collaborative, human-centered taxonomy of AI, algorithmic and automation harms. We argue that existing taxonomies, while valuable, can be narrow, unclear, typically cater to practitioners and government, and often overlook the needs of the wider public. Drawing on existing taxonomies and a large repository of documented inci...
Machine learning models are widely used to decide whether to accept or reject credit loan applications. However, similarly to human‐based decisions, they may discriminate between special groups of applicants, for instance based on age, gender, and race. In this paper, we aim to understand whether machine learning credit lending models are biased in...
This paper investigates the effects of the economic shock produced by the COVID-19 outbreak and diffusion on households. Through a survey administered to Italian households, without loss of generality, we investigate changes in financial and economic decisions and the households' ability to cope with daily purchases, repay their debt obligations a...
Predictions arising from deep neural networks may be very accurate but not very robust, leading to uncertainty in their outcome. This critical problem is receiving growing attention from the Machine Learning (ML) community. A practical solution that is increasingly applied is calculating confidence bounds for the ML predictions. Most confidence bou...
Artificial Intelligence relies on the application of machine learning models which, while reaching high predictive accuracy, lack explainability and robustness. This is a problem in regulated industries, as authorities aimed at monitoring the risks arising from the application of Artificial Intelligence methods may not validate them. No measurement...
A key point to assess statistical forecasts is the evaluation of their predictive accuracy. Recently, a new measure, called Rank Graduation Accuracy (RGA), based on the concordance between the ranks of the predicted values and the ranks of the actual values of a series of observations to be forecast, was proposed for the assessment of the quality o...
Inequality measures are quantitative measures that take values in the unit interval, with a zero value characterizing perfect equality. Although originally proposed to measure economic inequalities, they can be applied to several other situations, in which one is interested in the mutual variability between a set of observations, rather than in the...
Phishing is a fraudulent practice aimed at convincing individuals to reveal sensitive information, such as account credentials or credit card details, by clicking the links of malicious websites. To reduce the impacts of phishing, the timely identification of these websites is essential. For this purpose, machine learning models are often devised....
Artificial Intelligence relies on the application of machine learning models which, while reaching high predictive accuracy, lack explainability and robustness. This is a problem in regulated industries, as authorities aimed at monitoring the risks arising from the application of AI methods may not validate them. No measurement methodologies are y...
Financial technologies, stemming from the application of artificial intelligence to big data in finance, are continuously expanding, across different markets and financial services. While financial technologies bring many opportunities, such as reduced costs and extended inclusion, they also bring risks, among which cyber risks, which are constantl...
Financial technologies, stemming from the application of artificial intelligence to big data in finance, are continuously expanding, across different markets and financial services. While financial technologies bring many opportunities, such as reduced costs and extended inclusion, they also bring risks, among which cyber risks, which are constantly...
A trustworthy application of Artificial Intelligence requires measuring its possible risks in advance. When applied to regulated industries, such as banking, finance and insurance, Artificial Intelligence methods lack explainability and, therefore, authorities aimed at monitoring risks may not validate them. To solve this issue, explainable machin...
Phishing is a very dangerous security threat that affects individuals as well as companies and organizations. To fight the risks associated with this threat, it is important to detect phishing websites in a timely manner. Machine learning models work well for this purpose as they can predict phishing cases, using information on the underlying websi...
This paper shows how to improve the measurement of credit scoring by means of factor clustering. The improved measurement applies, in particular, to small and medium enterprises (SMEs) involved in P2P lending. The approach explores the concept of familiarity which relies on the notion that the more familiar/similar things are, the closer they are i...
Artificial intelligence methods, based on machine learning models, are rapidly changing financial services, and credit lending in particular, complementing traditional bank lending with platform lending. While financial technologies improve user experience and possibly lower costs, they may increase risks and, in particular, the model risks that de...
The diffusion of Environmental, Social and Governance (ESG) metrics is increasingly affecting corporate behaviour and companies' ability to attract investors. Corporate ESG practices are nowadays considered as a key element in evaluating creditworthiness and the cost of capital, to direct funds to the best‐performing companies that limit the harmful im...
Background: Short-term forecasts of infectious disease contribute to situational awareness and capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise forecasts’ predictive performance by combining independent models into an ensemble. Here we report the performance of ensemb...
Artificial Intelligence methods, based on either statistical or machine learning models, are rapidly changing financial services, credit lending in particular, complementing traditional bank lending with platform lending. While financial technologies improve user experience, and possibly lower costs, they may increase risks and, in particular, the m...
Reconciliation enforces coherence between hierarchical forecasts, in order to satisfy a set of linear constraints. However, most works focus on the reconciliation of the point forecasts. We instead focus on probabilistic reconciliation and we analyze the properties of the reconciled distributions by considering reconciliation via conditioning. We p...
The assessment of the health impacts of the COVID-19 pandemic requires the consideration of mobility networks. To this aim, we propose to augment spatio-temporal point process models with mobility network covariates. We show how the resulting model can be employed to predict contagion patterns and to help in important decisions such as the distribu...
Artificial Intelligence methods, based on machine learning from data, are rapidly changing financial services, and credit lending in particular, complementing bank lending with platform lending. While financial technologies improve user experience, and possibly lower costs, they may increase risks and, in particular, the model risks that derive from...
Since the start of the 21st century, the world has not confronted a more serious threat to global public health than the COVID-19 pandemic. While governments initially took radical actions in response to the pandemic to avoid catastrophic collapse of their health care systems, government policies have also had numerous knock-on socioeconomic, polit...
The Data Atlas is the centerpiece of the PERISCOPE project’s data-driven research. The Atlas constitutes a centralized access point for the exploration, visualization and analysis of the original data produced by PERISCOPE partners, integrated with the most relevant information about the COVID-19 pandemic and its effects on health, economics, polic...
The COVID‐19 pandemic has highlighted the importance of reliable statistical models which, based on the available data, can provide accurate forecasts and impact analysis of alternative policy measures. Here we propose Bayesian time‐dependent Poisson autoregressive models that include time‐varying coefficients to estimate the effect of policy covar...
Background: Short-term forecasts of infectious disease burden can contribute to situational awareness and aid capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise the predictive performance of such forecasts if multiple models are combined into an ensemble. Here we report...
We construct a network volatility index (NetVIX) via market interconnectedness and volatilities to measure global market risk. The NetVIX multiplicatively decomposes into a network volatility effect and a network contagion effect. It also additively decomposes into volatility contributions of each market. We apply our measure to study the relations...
This paper investigates the impact of network centrality on borrowers’ and lenders’ behavior in P2P lending. The empirical analysis on a leading Chinese lending platform reveals that the lenders who are in the center of a network not only invest by larger amounts but also more swiftly than their peers, reflecting the experience and information adva...
Many investors have been attracted by Crypto assets in the last few years. However, despite the possibility of gaining high returns, investors bear high risks in crypto markets. To help investors and make the markets more reliable, Robot advisory services are rapidly expanding in the field of crypto asset allocation. Robot advisors not only reduce...
One of the main consequences of the digital revolution, which for the last few years has been transforming almost every economic activity, has been an unprecedented availability of big data. At the same time, recent technological breakthroughs have provided tools (technological infrastructures and analytical methodologies) capable of processing the...
Cyber incidents are becoming more sophisticated and their costs difficult to quantify. Using a unique database of cyber events across sectors in the US, we document the characteristics and drivers of cyber incidents. Cyber costs are higher for larger firms and for incidents that impact several organisations simultaneously. Events with malicious int...
While multilateral climate negotiations are at a deadlock, climate finance faces a crossroads as the lending community needs to develop renewed strategies on the ‘Future of Environment Funds’. Most policy and scholarly attention has been directed at how to improve the largest multilateral climate fund – the Green Climate Fund (GCF) – own funding,...
Feature selection is a popular topic. The main approaches to deal with it fall into the three main categories of filters, wrappers and embedded methods. Advancement in algorithms, though proving fruitful, may not be enough. We propose to integrate an explainable AI approach, based on Shapley values, to provide more accurate information for feature...
The aim of this paper is to propose a portfolio selection methodology capable of taking into account asset tail co-movements as additional constraints in the Markowitz model. We apply the methodology to the observed time series of the 10 largest crypto assets, in terms of market capitalization, over the period 20 September 2017–31 December 2020 (1200 dai...
This paper describes a multidimensional machine learning model and its implementation in a financial technology (fintech) application. The aim of the model is to learn from both micro economic financial data and macro economic trends the credit rating of companies that ask for credit. We show that the proposed model is able to reward the companies...
The detection of money laundering is a very important problem, especially in the financial sector. We propose a mathematical specification of the problem in terms of a classification tree model that "automates" expert-based manual decisions. We operationally validate the model on a concrete application that originates from a large Italian bank. The...
Robot advisory services are rapidly expanding, responding to a growing interest people have in directly managing their savings. Robot-advisors may reduce costs and improve the quality of asset allocation services, making user’s involvement more transparent. Against this background, there exists the possibility that robot advisors underestimate mark...
In credit risk estimation, the most important element is obtaining a probability of default as close as possible to the effective risk. This effort quickly prompted new, powerful algorithms that reach a far higher accuracy, but at the cost of losing intelligibility, such as Gradient Boosting or ensemble methods. These models are usually referred to...
We propose an endemic-epidemic model: a negative binomial space–time autoregression, which can be employed to monitor the contagion dynamics of the COVID-19 pandemic, both in time and in space. The model is exemplified through an empirical analysis of the provinces of northern Italy, heavily affected by the pandemic and characterised by similar non...
Artificial intelligence (AI) methods are becoming widespread, especially when data are not sufficient to build classical statistical models, as is the case for cyber risk management. However, when applied to regulated industries, such as energy, finance, and health, AI methods lack explainability. Authorities aimed at validating machine learning mo...
Interest in production structures compatible with environmental protection, greater social balance and sound governance practices is an increasingly felt need among banks, which are encouraged to pursue such objectives by ever more pervasive regulation and by the growing awareness of investors and customers...
This work investigates financial volatility cascades generated by SARS-CoV-2 related news using concepts developed in the field of seismology. We analyze the impact of socio-economic and political announcements, as well as of financial stimulus disclosures, on the reference stock markets of the United States, United Kingdom, Spain, France, Germany...
We present a statistical model that can be employed to monitor the time evolution of the COVID‐19 contagion curve and the associated reproduction rate. The model is a Poisson autoregression of the daily new observed cases and dynamically adapts its estimates to explain the evolution of contagion in terms of a short‐term and long‐term dependence of c...
We aim to understand the dynamics of crypto asset prices and, specifically, how price information is transmitted among different bitcoin market exchanges, and between bitcoin markets and traditional ones. To this aim, we hierarchically cluster bitcoin prices from different exchanges, as well as classic assets, by enriching the correlation based min...
The paper aims to assess, from an empirical viewpoint, the advantages of a stablecoin whose value is derived from a basket of underlying currencies, against a stablecoin which is pegged to the value of one major currency, such as the dollar. To this aim, we first find the optimal weights of the currencies that can comprise our basket. We then emplo...
The paper proposes an explainable Artificial Intelligence model that can be used in credit risk management and, in particular, in measuring the risks that arise when credit is borrowed employing peer to peer lending platforms. The model applies correlation networks to Shapley values so that Artificial Intelligence predictions are grouped according...
The paper examines the relationships among market assets during stressful times, using two recently proposed econometric modeling techniques for tail risk measurement: the extreme downside hedge (EDH) and the extreme downside correlation (EDC). We extend both measures taking into account the sensitivity of asset's return to innovations not only fro...
The aim of this paper is to propose a portfolio selection methodology capable of taking into account asset tail co-movements, frequent among crypto assets. To achieve this aim we consider both systemic and tail risks as additional constraints in the Markowitz model. We apply the methodology to the observed time series of the ten largest crypto-assets, in te...
RISK MANAGEMENT MAGAZINE, Year 15, Issue 3, September-December 2020
We propose an Explainable AI model that can be employed in order to explain why a customer buys or abandons a non-life insurance coverage. The method consists in applying similarity clustering to the Shapley values that were obtained from a highly accurate XGBoost predictive classification algorithm. Our proposed method can be embedded into a techn...
In a world that is increasingly connected on-line, cyber risks become critical. Cyber risk management is very difficult, as cyber loss data are typically not disclosed. To mitigate the reputational risks associated with their disclosure, loss data may be collected in terms of ordered severity levels. However, to date, there are no risk models for o...
Financial contagion among countries can arise from different channels, the most important of which are financial markets and bank lending. The paper aims to build an econometric network approach to understand the extent to which contagion spillovers (from one country to another) arise from financial markets, from bank lending, or from both. To achie...
At the beginning of the COVID-19 pandemic, Intesa Sanpaolo developed a contagion model aimed at calibrating the measures to be taken to safeguard its employees and the provision of banking services, according to the risk deriving from the external environment. The model is based on both external and internal views: the combination of such eleme...
Explainability of artificial intelligence methods has become a crucial issue, especially in the most regulated fields, such as health and finance. In this paper, we provide a global explainable AI method which is based on Lorenz decompositions, thus extending previous contributions based on variance decompositions. This allows the resulting Shapley...
A key point in the process of COVID-19 contagion control is the introduction of effective policy measures, whose results have to be continuously monitored through accurate statistical analysis. To this aim we propose an innovative statistical tool, based on the Gini-Lorenz concentration approach, which can reveal how well a country is doin...
This paper extends the extreme downside correlation (EDC) and extreme downside hedge (EDH) methodology to model the interdependence in the sensitivity of assets to the downside risk of other financial assets under severe firm-level and market conditions. The model is applied to analyze both systematic and systemic exposures in the Iranian Food Indu...
We present a statistical model which can be employed to understand the contagion dynamics of COVID-19, which can heavily impact health, economics and finance. The model is a Poisson autoregression of the daily new observed cases, and can reveal whether contagion has a trend, and where each country is on that trend. Model results are exemplified...
We propose to study the dynamics of financial contagion by means of a class of point process models employed in the modeling of seismic contagion. The proposal extends network models, recently introduced to model financial contagion, in a space-time point process perspective. The extension helps to improve the assessment of credit risk of an instit...
Questions (2)
I am looking for an explicit formula that weighs the predictions from each database into a combined one.