Article

Probability and the Weighing of Evidence

Authors:
I. J. Good

... The scientific - or any other - community has not developed a general method for moving from a quantitative value for a parameter to a quantitative statement about a proposition, such as a scientific hypothesis. However, Peirce described an approach in 1878 (Peirce 2014), which has been modified and applied by many others (Good 1950; Jaynes 2003; Edwards 1992; Pardo and Allen 2007; Allen and Pardo 2019; Fairfield and Charman 2022). Essentially, it is a comparison of the relative likelihood or plausibility of the evidence (E) under two competing hypotheses or explanations (H1 and H0), expressed as a ratio of one hypothesis to another, giving a likelihood ratio (LR). ...
... By convention, a base-10 logarithm is used and the resulting log-likelihood ratio is multiplied by 10 to give decibel units: 10 log10(LR*). This quantity is known as the weight of evidence (WoE) (Peirce 2014; Good 1950). Logarithms have several advantages. ...
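For reference, the two quantities described in the excerpts above can be written out explicitly. This is a standard restatement using the excerpt's own symbols (E, H1, H0), not text from the cited preprint; Good and Turing called the 10·log10 units decibans.

```latex
\[
\mathrm{LR} = \frac{P(E \mid H_1)}{P(E \mid H_0)}, \qquad
\mathrm{WoE} = 10 \,\log_{10}(\mathrm{LR}) \quad \text{(deciban units)}
\]
```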
Preprint
Full-text available
In a criminal investigation, an inferential error occurs when the probability that a suspect is the source of some evidence -- such as a fingerprint -- is taken as the probability of guilt. This is known as the ultimate issue error, and the same error occurs in statistical inference when the probability that a parameter equals some value is incorrectly taken to be the probability of a hypothesis. Almost all statistical inference in the social and biological sciences is subject to this error, and replacing every instance of "hypothesis testing" with "parameter testing" in these fields would more accurately describe the target of inference. Parameter values, and quantities derived from them such as p-values or Bayes factors, have no direct quantitative relationship with scientific hypotheses. Here, we describe the problem and its consequences, and suggest options for improving scientific inference.
... The weight of evidence. Good (1985) points out that the concept of weight of evidence, which is used in many areas (e.g., in science, medicine, law, and daily life), is a function of the probabilities of the data under two hypotheses (see also Good, 1950, 1965, 1979, 1994, 1995). Formally, this relation takes the form ...
... When H1 and H2 are simple (point) hypotheses the Bayes factor is equal to the likelihood ratio (Royall, 2017). Good defined the weight of evidence as the logarithm of the Bayes factor (Good, 1950, 1985, 1994), because it is additive and symmetric (e.g., log(BF = 10) = 2.3 and log(BF = 1/10) = −2.3, the average of which is 0). In contrast, the Bayes factor scale is not symmetric - the average of a Bayes factor of 10 and 1/10 is larger than 1. ...
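A minimal numerical check of the symmetry point made in the excerpt (natural logarithms, as in the quoted values of ±2.3):

```python
import numpy as np

# Symmetry on the log scale: log(10) and log(1/10) are equal in magnitude and
# opposite in sign, so their average is 0.
bf = np.array([10.0, 0.1])
log_bf = np.log(bf)
print(log_bf)          # [ 2.3026 -2.3026]
print(log_bf.mean())   # 0.0

# On the raw Bayes factor scale the same pair is not symmetric: the average exceeds 1.
print(bf.mean())       # 5.05
```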
Article
Full-text available
Bayes factor hypothesis testing provides a powerful framework for assessing the evidence in favor of competing hypotheses. To obtain Bayes factors, statisticians often require advanced, non-standard tools, making it important to confirm that the methodology is computationally sound. This paper seeks to validate Bayes factor calculations by applying two theorems attributed to Alan Turing and Jack Good. The procedure entails simulating data sets under two hypotheses, calculating Bayes factors, and assessing whether their expected values align with theoretical expectations. We illustrate this method with an ANOVA example and a network psychometrics application, demonstrating its efficacy in detecting calculation errors and confirming the computational correctness of the Bayes factor results. This structured validation approach aims to provide researchers with a tool to enhance the credibility of Bayes factor hypothesis testing, fostering more robust and trustworthy scientific inferences.
... This, of course, takes us to the fraught question of the "best" measure of confirmation. For my purposes in this paper in interacting with the phenomenon of "explaining away," the simple Bayes factor ratio (Good, 1950) will be the most convenient measure, though I am well aware of other measures. The same points can be readily seen using the popular r measure (Keynes, 1921), which has also been used in the literature on explaining away. ...
... For further discussion of measures of support and of coherence, see Crupi et al. (2007), Fitelson (1998), Good (1950), Keynes (1921), McGrew (2003), McGrew (2016). ...
Article
Full-text available
I examine the concept of granting for the sake of the argument in the context of explanatory reasoning. I discuss a situation where S wishes to argue for H1 as a true explanation of evidence E and also decides to grant, for the sake of the argument, that H2 is an explanation of E. S must then argue that H1 and H2 jointly explain E. When H1 and H2 compete for the force of E, it is usually a bad idea for S to grant H2 for the sake of the argument. If H1 and H2 are not positively dependent otherwise, there is a key argumentative move that he will have to make anyway in order to retain a place at the table for H1 at all—namely, arguing that the probability of E given H2 alone is low. Some philosophers of religion have suggested that S can grant that science has successfully provided natural explanations for entities previously ascribed to God, while not admitting that theism has lost any probability. This move involves saying that the scientific explanations themselves are dependent on God. I argue that this “granting” move is not an obvious success and that the theist who grants these scientific successes may have to grant that theism has lost probability.
... The expected number of false positives when searching a database of random individuals, based on a "positive" criterion LR ≥ x, where x is a specified threshold, can be derived from the Turing expectation: Pr(LR ≥ x | Hd) < 1/x [34][35][36]. If a database is of size N we never expect more than N/x adventitious matches. ...
... In practice, comparisons with large intelligence databases of size N will require high LRs in order to reduce the number of false positive matches. For example, if the database size is 1m and the LR recovered is 10,000, from the Turing expectation c. 100 adventitious matches may be observed in such a database (Pm = N/LR) [34][35][36]. Smaller databases (N = 55), such as that used here, will result in much lower adventitious match rates. ...
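The arithmetic behind the example in the excerpt (a database of one million and an LR of 10,000) is simply the database size divided by the LR threshold; a short sketch:

```python
# Rough illustration of the expected number of adventitious (false positive) matches
# implied by the Turing bound Pr(LR >= x | Hd) < 1/x, using the figures quoted above.
def expected_adventitious_matches(database_size: int, lr_threshold: float) -> float:
    """Upper bound on expected chance matches when declaring a match at LR >= threshold."""
    return database_size / lr_threshold

print(expected_adventitious_matches(1_000_000, 10_000))  # ~100, as in the excerpt
print(expected_adventitious_matches(55, 10_000))         # ~0.0055 for the small database
```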
Preprint
Full-text available
Humans constantly shed DNA into the surrounding environment. This DNA may either remain suspended in the air or settle onto surfaces as house dust. In this study, we explored the potential use of human DNA recovered from air and dust to investigate crimes where there are no visible traces available – for example, from a recently vacated drugs factory where multiple workers had been present. Samples were collected from three indoor locations (offices, meeting rooms and laboratories) characterized by different occupancy types and cleaning regimes. The resultant DNA profiles were compared with the reference profiles of 55 occupants of the premises. Our findings showed that household dust samples are rich sources of DNA and provide an historical record of occupants within the specific locality of collection. Detectable levels of DNA were also observed in air and dust samples from ultra-clean forensic laboratories which can potentially contaminate casework samples. We provide a Bayesian statistical model to estimate the minimum number of dust samples needed to detect all inhabitants of a location. The results of this study suggest that air and dust could become novel sources of evidence to identify current and past occupants of a crime scene.
... We do not go into detail on all of them, but the reader can find a brief but nonetheless quite exhaustive literature overview of these indicators in [18,12]. This section focuses on presenting the "Weight of evidence" (WoE) [7] and its comparison with the Shapley values proposed in the previous section, since this indicator is (i) close to the equation presented above (equation 10) and (ii) among the most widely used indicators for the naive Bayes classifier. ...
... We used scipy.stats.kendalltau with the default parameter, i.e. τ-b. It would also be interesting to see the correlations of only the most important variables (e.g. the top five), since usually only a few of the most important features are perceptible to humans. ...
Preprint
Full-text available
This paper has been accepted at the workshop AIMLAI of ECML-PKDD 2023 - "Variable selection or importance measurement of input variables to a machine learning model has become the focus of much research. It is no longer enough to have a good model, one also must explain its decisions. This is why there are so many intelligibility algorithms available today. Among them, Shapley value estimation algorithms are intelligibility methods based on cooperative game theory. In the case of the naive Bayes classifier, and to our knowledge, there is no "analytical" formulation of Shapley values. This article proposes an exact analytic expression of Shapley values in the special case of the naive Bayes classifier. We analytically compare this Shapley proposal to another frequently used indicator, the Weight of Evidence (WoE), and provide an empirical comparison of our proposal with (i) the WoE and (ii) KernelShap results on real world datasets, discussing similar and dissimilar results. The results show that our Shapley proposal for the naive Bayes classifier provides informative results with low algorithmic complexity so that it can be used on very large datasets with extremely low computation time."
Conference Paper
Full-text available
(Preprint version) - Variable selection or importance measurement of input variables to a machine learning model has become the focus of much research. It is no longer enough to have a good model, one also must explain its decisions. This is why there are so many intelligibility algorithms available today. Among them, Shapley value estimation algorithms are intelligibility methods based on cooperative game theory. In the case of the naive Bayes classifier, and to our knowledge, there is no "analytical" formulation of Shapley values. This article proposes an exact analytic expression of Shapley values in the special case of the naive Bayes classifier. We analytically compare this Shapley proposal to another frequently used indicator, the Weight of Evidence (WoE), and provide an empirical comparison of our proposal with (i) the WoE and (ii) KernelShap results on real world datasets, discussing similar and dissimilar results. The results show that our Shapley proposal for the naive Bayes classifier provides informative results with low algorithmic complexity so that it can be used on very large datasets with extremely low computation time.
... These strategies help to explore data and screen variables. The underlying theory of WOE was provided by Good (1950), and the expression describes how strongly the evidence counts for or against some hypothesis. Although frequently employed in scientific and social science research, WOE analysis is rarely used in education research (Weed, 2005). ...
... Although frequently employed in scientific and social science research, WOE analysis is rarely used in education research (Weed, 2005). It calculates the percentage of events vs nonevents for a given attribute (Good, 1950). An event stands for something that has already happened, such as a student's dropout from university, and a nonevent represents the opposite, non-dropout. ...
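As a concrete illustration of the event/non-event calculation described above, here is a small sketch with invented counts. The attribute, bin, and all numbers are hypothetical, and the sign convention for WOE varies across sources; here a positive value means the bin is relatively richer in events (dropouts).

```python
import numpy as np

# Hypothetical counts for one attribute bin, e.g. "failed more than four modules".
events_in_bin, total_events = 120, 400          # dropouts in this bin / all dropouts
nonevents_in_bin, total_nonevents = 150, 3600   # non-dropouts in this bin / all non-dropouts

pct_events = events_in_bin / total_events
pct_nonevents = nonevents_in_bin / total_nonevents
woe = np.log(pct_events / pct_nonevents)
iv_contribution = (pct_events - pct_nonevents) * woe   # this bin's share of the information value
print(round(woe, 3), round(iv_contribution, 3))        # ~1.974 and ~0.510
```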
Article
Full-text available
Background: Higher education authorities continue to be concerned about dropout rates among university students. Dropping out affects cost efficiency and tarnishes the reputation of the institution. As a result, profiling students at risk of dropout is crucial. Purpose: To build a profile of students at risk of dropout using administrative university data. Methods: The researcher employed a data mining technique in which predictors were chosen based on their weight of evidence (WOE) and information value (IV). The chosen predictors were then used to build a profile of students at risk of dropout. Findings: According to the findings, a student is at risk if they were born in the years 1931 to 1967 or 1994 to 2001; fail more than four modules in a year with a participation average mark of 43 percent or less; and joined the university in the second academic year. Conclusions: This study concludes that matric scores had no bearing on the chance of a student dropping out for the cohorts of students who attended the institution between 2008 and 2018. However, the variables number of modules failed, participation average mark, entry level and year of birth had a bearing on building a profile of the students at risk of dropout for the cohorts of students who attended the institution between 2008 and 2018.
... 9-11), independently of both Abraham Wald and George Barnard (who had also come up with the idea for wartime applications). Good's 1950 book treats sequential analysis in Section 6.2 (pp. 64-66). ...
... At first glance, the paper seems like a curiosity: a jumble of simple results and techniques in statistical inference. In fact, it is clear in retrospect that what the paper actually does is lay out the sequence of steps in the statistical attack on the Enigma, each step being an integral part in that attack: Here, the link with Good's book and some of his papers (Good, 1950, 1953, 1956, 1961, 1969, and Good and Toulmin, 1956 and 1968), not merely to Turing but to Bletchley Park and cryptanalysis, was revealed, but no detail given. The deciban, for example, is described as being used as part of "an important classified process called Banburismus" (but we are not actually told what Banburismus is), and that the main application of the deciban "was to sequential analysis, not for quality control but for discriminating between hypotheses" (but we are not told what those hypotheses were). ...
... The model, including the candidates for the prior distributions, is tested in prior predictive checks, introduced by Good (1950). In prior predictive checks, a large amount of replication data is simulated from the given model and prior distributions with the aim of deciding whether the model and parameter distributions are (biologically) plausible. ...
... However, the aim is not to choose prior distributions that exactly reflect the distribution of the historical data but rather to select weakly-informative prior distributions that lead to a prior predictive distribution that is less informative (flatter) compared to the actual observed historical data distribution but that excludes values that seem implausibly high in the context of the historical data. The theory is that, for a large number of generated data sets, the empirical distribution of the simulated data approximates the prior predictive distribution of the data (see Good (1950)). ...
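A minimal sketch of the simulation loop described in these excerpts. The normal prior and the residual SD are arbitrary placeholders, not values from the cited preprint; the point is only the workflow of drawing from the prior, simulating data, and judging plausibility.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical weakly-informative prior for a treatment effect and an assumed residual SD.
n_rep = 10_000
theta = rng.normal(loc=0.0, scale=1.0, size=n_rep)   # draws from the candidate prior
sigma = 0.5
y_rep = rng.normal(loc=theta, scale=sigma)           # one simulated observation per prior draw

# Inspect the prior predictive distribution: somewhat wider than the historical data,
# but without mass on biologically implausible values?
print(np.percentile(y_rep, [2.5, 50, 97.5]))
```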
Preprint
Full-text available
The planning and conduct of animal experiments in the European Union is subject to strict legal conditions. Still, many preclinical animal experiments are poorly designed. As a consequence, discoveries made in one animal experiment cannot be reproduced in another animal experiment, or discoveries in translational animal research fail to be translated to humans. When designing new experiments in a classical frequentist framework, the sample size for the new experiment is chosen with the goal of achieving at least a certain statistical power, given a statistical test for a null hypothesis, a significance threshold and a minimally relevant effect size. In a Bayesian framework, inference is made by combining the information from newly observed data with a prior distribution that represents a priori information on the parameters. In translational animal experiments, a priori information is present in previously conducted experiments on the same outcome in similar animals. The prior information can be incorporated in a systematic way in the design and analysis of a new animal experiment by summarizing the historical data in a (Bayesian) meta-analysis model and using the meta-analysis model to make predictions for the data in the new experiment. This is called the meta-analytic predictive (MAP) approach. In this work, concepts of how to design translational animal experiments by MAP approaches are introduced and compared to classical frequentist power-oriented sample size planning. Current opportunities and challenges that exist in the practical application of these approaches in translational animal research are discussed. Special emphasis is put on the construction of prior distributions and sample size calculation by design analysis. The considerations are motivated by a real-world translational research example.
... When making a decision, the decision-making process often relies on a set of alternative consequences, whether rational or irrational. However, the rationality of the decision will depend on our degree of belief in a scenario occurring [14]. Of course, our degree of belief is influenced by evidence. ...
... Accordingly, the Bayes factor is the ratio of the posterior odds of H1 to its prior odds, regardless of the value of the prior odds [19]. The logarithm of Eq. 3 is called the Weight of evidence (WoE) [40], which dates back to work by Alan Turing during World War II at Bletchley Park - although Turing referred to this factor as the factor in favour of H [14]. High-risk areas, such as medicine and law, often present more than one alternative hypothesis. To this end, we can use the generalised Bayes factor (GBF) [12,41]. ...
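A small numeric illustration of the relationship stated above (the Bayes factor multiplies prior odds into posterior odds, and the weight of evidence is its logarithm). The numbers are arbitrary and the log base (here base 10, giving bans) is a convention choice.

```python
import numpy as np

prior_odds = 0.25          # e.g. P(H1) = 0.2 against P(H2) = 0.8
bayes_factor = 40.0        # the evidence favours H1 by a factor of 40

posterior_odds = bayes_factor * prior_odds   # 10.0
woe_bans = np.log10(bayes_factor)            # weight of evidence in bans
print(posterior_odds, round(woe_bans, 2))    # 10.0 and 1.6
```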
Conference Paper
Full-text available
In both technical as well as ethics of AI domains, there have been more and more calls for a turn to critical machine learning with the emphasis on the fairness (bias), accountability and transparency (FAT) of machine learning (ML) systems in general, but specifically of automated decision-making (ADM) systems powered by the newest deep neural network research and technology. In recent years, the field of eXplainable artificial intelligence (XAI) has exploded as a crucial feature for fostering trust in ML systems. The goal of XAI is to enable objectives such as fairness, accountability and transparency (FAT). How exactly XAI addresses these matters is not clear by following a technical approach only. There is a gap between the explanations generated by XAI and questions related to, for example, discrimination against minority groups, who should be held accountable for the decision involving an ML system or whether users trust the system. To this end we propose a multidisciplinary approach to consolidate explanations generated by XAI, which requires the expertise of ML technicians, and how users interpret these explanations along the guidelines of FAT, which requires the expertise of social scientists. We illustrate how an XAI technique, most relevant explanation (MRE), is an appropriate metric to evaluate fairness and accountability by allowing for comparison of competing explanations and alternative reasoning. Notions from Aristotle's ethics and from restorative justice are considered in order to effect a better harmony between outcomes of MRE and social science requirements in terms of FAT. Our contribution is novel as it offers a concrete demonstration of how the tech community can heed the call for recognising that ML systems are socio-technical systems and actively respond to FAT concerns of end-users. Furthermore, our paper situates ML systems in a multi- and inter-disciplinary context in a clear and effective way, as engagement in the way we suggest is an intentional act reflecting sensitivity to FAT concerns.
... The other way is to calculate the prior that you would need to believe in order to achieve an FPR of, say, 5%. This so-called reverse Bayes approach was mentioned by Good (1950) and has been used, for example, by Carlin & Louis (1996), Matthews (2001, 2018), Held (2013) and Colquhoun (2017, 2018). In both cases it would be up to you to persuade your readers and journal editors that the value of the prior probability is reasonable. ...
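A sketch of the reverse-Bayes idea described in the excerpt, using the simple "p less than alpha" formulation with an assumed power; it illustrates the concept rather than reproducing the exact calculation in the cited papers.

```python
# What prior probability of a real effect would be needed so that a "significant"
# result (p <= alpha) carries a chosen false positive risk?
def required_prior(alpha: float, power: float, target_fpr: float) -> float:
    # Solve target_fpr = alpha*(1-pi) / (alpha*(1-pi) + power*pi) for pi.
    return alpha * (1 - target_fpr) / (alpha * (1 - target_fpr) + target_fpr * power)

pi = required_prior(alpha=0.05, power=0.8, target_fpr=0.05)
print(round(pi, 3))   # ~0.543: the effect would need to be more likely than not a priori
```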
Preprint
Full-text available
It is widely acknowledged that the biomedical literature suffers from a surfeit of false positive results. Part of the reason for this is the persistence of the myth that observation of a p value less than 0.05 is sufficient justification to claim that you've made a discovery. It is hopeless to expect users to change their reliance on p values unless they are offered an alternative way of judging the reliability of their conclusions. If the alternative method is to have a chance of being adopted widely, it will have to be easy to understand and to calculate. One such proposal is based on calculation of the false positive risk. It is suggested that p values and confidence intervals should continue to be given, but that they should be supplemented by a single additional number that conveys the strength of the evidence better than the p value. This number could be the minimum false positive risk (that calculated on the assumption of a prior probability of 0.5, the largest value that can be assumed in the absence of hard prior data). Alternatively one could specify the prior probability that it would be necessary to believe in order to achieve a false positive risk of, say, 0.05.
... Formulating QUBO matrices is rather straightforward; however, depending on the data used to form the matrix, numerical problems might arise. To mitigate the complications arising from missing values in the data when performing correlation calculations, we chose to apply a WoE transformation 10,11 to the original data. WoE is a statistical measure with the convenient property of handling missing values and outliers. ...
Article
Full-text available
The emergence of quantum computing proposes a revolutionary paradigm that can radically transform numerous scientific and industrial application domains. However, realizing this promise in industrial applications is far from being practical today. In this paper, we discuss industry experiences with respect to quantum computing, and the gap between quantum software engineering research and state-of-the-practice in industry-scale quantum computing.
... Alternatively, the empirical Bayes estimate of the mixture weight may be used, which represents the value that maximizes the marginal likelihood function (3). Finally, in order to assess prior sensitivity, a reverse-Bayes approach (Good, 1950; Best et al., 2021; Held et al., 2022a) may be used to find the mixture weight such that a certain posterior distribution is obtained. ...
Preprint
Full-text available
Replication of scientific studies is important for assessing the credibility of their results. However, there is no consensus on how to quantify the extent to which a replication study replicates an original result. We propose a novel Bayesian approach based on mixture priors. The idea is to use a mixture of the posterior distribution based on the original study and a non-informative distribution as the prior for the analysis of the replication study. The mixture weight then determines the extent to which the original and replication data are pooled. Two distinct strategies are presented: one with fixed mixture weights, and one that introduces uncertainty by assigning a prior distribution to the mixture weight itself. Furthermore, it is shown how within this framework Bayes factors can be used for formal testing of scientific hypotheses, such as tests regarding the presence or absence of an effect. To showcase the practical application of the methodology, we analyze data from three replication studies. Our findings suggest that mixture priors are a valuable and intuitive alternative to other Bayesian methods for analyzing replication studies, such as hierarchical models and power priors. We provide the free and open source R package repmix that implements the proposed methodology.
... Jack Good has noted that marginal likelihoods operate just like likelihood functions, calling them Bayesian likelihoods. He used weight of evidence for the logarithm of the ratio of marginal likelihoods for two hypotheses (i.e., the logarithm of the Bayes factor), providing an additive (rather than multiplicative) measure of how much the available evidence favors one hypothesis over the other; the terminology originates with Alan Turing (Good, 1950, 1985). The values of marginal likelihoods depend more sensitively on properties of prior distributions than do posterior PDFs for parameters; in particular, they are roughly inversely proportional to the prior ranges of parameters. ...
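Spelled out, the quantities referred to in this excerpt are the marginal (integrated) likelihoods and their log ratio. This is the standard textbook form, with notation chosen here rather than taken from the cited tutorial.

```latex
\[
m(D \mid H_i) = \int p(D \mid \theta_i, H_i)\, p(\theta_i \mid H_i)\, d\theta_i,
\qquad
W(H_1 : H_2) = \log \frac{m(D \mid H_1)}{m(D \mid H_2)} = \log \mathrm{BF}_{12}
\]
```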
Preprint
Full-text available
Bayesian inference gets its name from *Bayes's theorem*, expressing posterior probabilities for hypotheses about a data generating process as the (normalized) product of prior probabilities and a likelihood function. But Bayesian inference uses all of probability theory, not just Bayes's theorem. Many hypotheses of scientific interest are *composite hypotheses*, with the strength of evidence for the hypothesis dependent on knowledge about auxiliary factors, such as the values of nuisance parameters (e.g., uncertain background rates or calibration factors). Many important capabilities of Bayesian methods arise from use of the law of total probability, which instructs analysts to compute probabilities for composite hypotheses by *marginalization* over auxiliary factors. This tutorial targets relative newcomers to Bayesian inference, aiming to complement tutorials that focus on Bayes's theorem and how priors modulate likelihoods. The emphasis here is on marginalization over parameter spaces -- both how it is the foundation for important capabilities, and how it may motivate caution when parameter spaces are large. Topics covered include the difference between likelihood and probability, understanding the impact of priors beyond merely shifting the maximum likelihood estimate, and the role of marginalization in accounting for uncertainty in nuisance parameters, systematic error, and model misspecification.
... Once the notion of specifying a prior distribution for the parameter is accepted, the framework of Bayesian inference can be developed deductively from one of several systems of axioms (for example, Ramsey [3]; Good [4]; Savage [5]; De Groot [6]); for a detailed evaluation, see Fishburn [7]. ...
Chapter
Full-text available
In statistics, the frequentist approach has often been considered the only appropriate way to carry out scientific and applied work. However, since the 1950s, Bayesian statistics has been progressively gaining ground in academia. The purpose of this study is to demonstrate the points of encounter between these two apparently opposite currents of thought. To that end, several topics are reviewed, explaining what Bayes' Theorem is by means of didactic examples. On the other hand, it is shown that frequentists reject the central postulate of the Bayesian approach but are forced to replace it with alternative solutions, the most generalized being Maximum Likelihood. Facing this discrepancy, it is suggested that there may be a misinterpretation between the two approaches, and examples are offered in which Bayes' postulate and the Maximum Likelihood principle yield the same numerical answer. Then, inferences from a priori information, both non-informative and informative, are analyzed and the inferential proposals of both schools are explored. In addition, the fiducial approach, which works with sufficient statistics, is discussed. All these aspects are discussed from the mathematical perspectives of renowned statisticians such as Fisher, Keynes, Carnap, Good, Durbin, Box, Giere, Neyman, Pearson, among others. In addition, philosophical assumptions that philosophers such as Lakatos, Popper and Kuhn, among others, have failed to offer are sought in order to establish a possible reconciliation between these currents of statistical thought in apparent conflict.
... Conditional on these hypothetical values, the expert's opinion about the median values of the variables should change and she is asked to revise her median assessments. The idea of using hypothetical sample data dates back at least as far as Good (1950), who called it 'the device of imaginary results'. For i = 1,. . . ...
Article
Full-text available
This paper presents an elicitation method for quantifying an expert's subjective opinion about a multivariate normal distribution. It is assumed that the expert's opinion can be adequately represented by a natural conjugate distribution (a normal inverse-Wishart distribution), and the expert performs various assessment tasks that enable the hyperparameters of the distribution to be estimated. There is some choice in the way hyperparameters are determined from the elicited assessments and empirical work underlies the choices made. The method is implemented in an interactive computer program that questions the expert and forms the subjective distribution. An example illustrating use of the method is given.
... We can model the probability of a false consistency using the birthday paradox [35], shown as a green line in Figure 5. The birthday paradox is a classic technique used to find the probability of a collision of two or more randomly chosen elements in a set. ...
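The birthday-paradox calculation mentioned above, in generic form. The actual identifier space used in the SNIPS false-consistency model is not given in this excerpt, so the second call uses an arbitrary 32-bit space purely for illustration.

```python
import numpy as np

def collision_probability(n_items: int, n_bins: int) -> float:
    """Birthday-paradox probability that at least two of n_items uniformly random
    draws from n_bins possible values coincide."""
    i = np.arange(n_items)
    return 1.0 - np.exp(np.sum(np.log1p(-i / n_bins)))

# Classic sanity check: 23 people, 365 birthdays -> just over 50%.
print(round(collision_probability(23, 365), 3))
# Hypothetical: 1,000 items hashed into a 32-bit space.
print(collision_probability(1_000, 2**32))
```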
Preprint
Data synchronization in decentralized storage systems is essential to guarantee sufficient redundancy to prevent data loss. We present SNIPS, the first succinct proof of storage algorithm for synchronizing storage peers. A peer constructs a proof for its stored chunks and sends it to verifier peers. A verifier queries the proof to identify and subsequently requests missing chunks. The proof is succinct, supports membership queries, and requires only a few bits per chunk. We evaluated our SNIPS algorithm on a cluster of 1000 peers running Ethereum Swarm. Our results show that SNIPS reduces the amount of synchronization data by three orders of magnitude compared to the state-of-the-art. Additionally, creating and verifying a proof is linear with the number of chunks and typically requires only tens of microseconds per chunk. These qualities are vital for our use case, as we envision running SNIPS frequently to maintain sufficient redundancy consistently.
... For general references about the probability calculus and concepts and specific probability distributions see Jaynes 2003; MacKay 2005; Jeffreys 1983; Gregory 2005; Bernardo & Smith 2000; Hailperin 1996; Good 1950; Fenton & Neil 2019; Johnson et al. 1996; 2005; 1994; Kotz et al. 2000. ...
Preprint
Full-text available
In fields such as medicine and drug discovery, the ultimate goal of a classification is not to guess a class, but to choose the optimal course of action among a set of possible ones, usually not in one-one correspondence with the set of classes. This decision-theoretic problem requires sensible probabilities for the classes. Probabilities conditional on the features are computationally almost impossible to find in many important cases. The main idea of the present work is to calculate probabilities conditional not on the features, but on the trained classifier's output. This calculation is cheap, needs to be made only once, and provides an output-to-probability "transducer" that can be applied to all future outputs of the classifier. In conjunction with problem-dependent utilities, the probabilities of the transducer allow us to find the optimal choice among the classes or among a set of more general decisions, by means of expected-utility maximization. This idea is demonstrated in a simplified drug-discovery problem with a highly imbalanced dataset. The transducer and utility maximization together always lead to improved results, sometimes close to theoretical maximum, for all sets of problem-dependent utilities. The one-time-only calculation of the transducer also provides, automatically: (i) a quantification of the uncertainty about the transducer itself; (ii) the expected utility of the augmented algorithm (including its uncertainty), which can be used for algorithm selection; (iii) the possibility of using the algorithm in a "generative mode", useful if the training dataset is biased.
... We would like to point out the not-so-widely-known fact that the frequentist bootstrap is actually an instance of Bayesian nonparametric inference, as follows. Suppose that the context C of the problem under study by You (Good, 1950: a person wishing to reason sensibly in the presence of uncertainty) implies that Your uncertainty about real-valued observables {Y 1 , Y 2 , . . . }, which have not yet been observed, is exchangeable. ...
Preprint
In this discussion note, we respond to the fascinating paper "Martingale Posterior Distributions" by E. Fong, C. Holmes, and S. G. Walker with a couple of comments. On the basis of previous research, a theorem is stated regarding the relationship between frequentist bootstrap and stick-breaking process.
... In recent decades there has been extensive discussion of whether Keynes, after receiving Ramsey's critique, changed his ideas about probability and adhered to a personalist view of probability. Some authors have argued that this is the case (Bateman, 1987, 1996; Good, 1950), while others have defended the opposite position (Carabelli, 1988; O'Donnell, 1989, 1991). The question is particularly delicate since Keynes, after working on the subject for almost two decades, never again presented his ideas on probability systematically. ...
Article
Full-text available
Before turning to the economic field, J. M. Keynes dedicated more than a decade of his life to working almost exclusively on the concept of probability. The Keynesian conception of probability differs from the two main interpretative currents that dominate the field today, namely the frequentist and the subjective currents. This article seeks to reconstruct Keynes's criticism of the frequentist conception of probability, leaving aside its differences with the subjective vision. The arguments that Keynes used against the frequentist interpretation have not been sufficiently covered in the specialized literature, being usually overshadowed by other topics. The article shows how, from Keynes's perspective, the adoption of frequentism leads to the abandonment of probability as a guide to conduct, especially within the framework of inductive reasoning. Attempts to amend this problem inevitably lead to judgments of relevance such as those that Keynes made in his theory. Keywords: J. M. Keynes, A Treatise on Probability, logical probability, frequentist theory.
... [55] While Ref. [56] discusses a slightly different "frequency distribution", the following citation from page 60 there is exactly to the point here: "But it is often possible to find a simple form that fits the frequency distribution approximately. If this can be done it has the advantage of describing the results of the statistics briefly. ...
Article
Full-text available
We apply Bayesian statistics to the estimation of correlation functions. We give the probability distributions of auto- and cross-correlations as functions of the data. Our procedure uses the measured data optimally and informs about the certainty level of the estimation. Our results apply to general stationary processes and their essence is a nonparametric estimation of spectra. It allows one to better understand the statistical noise fluctuations, assess the correlations between two variables, and postulate parametric models of spectra that can be further tested. We also propose a method to numerically generate correlated noise with a given spectrum.
... In other words, the Bayes factor quantifies the degree to which observed measurements discriminate between competing propositions. The Bayes factor is the ingredient by which the prior odds in favor of a proposition are multiplied in virtue of the knowledge of the findings (Good, 1958): ...
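The excerpt is cut off before the displayed formula; the standard relation it describes in words (the Bayes factor as the multiplier taking prior odds to posterior odds) can be written, in generic notation, as:

```latex
\[
\underbrace{\frac{P(H_1 \mid E)}{P(H_2 \mid E)}}_{\text{posterior odds}}
= \mathrm{BF}_{12} \times
\underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}}
\]
```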
Book
Full-text available
Bayes Factors for Forensic Decision Analyses with R provides a self-contained introduction to computational Bayesian statistics using R. With its primary focus on Bayes factors supported by data sets, this book features an operational perspective, practical relevance, and applicability—keeping theoretical and philosophical justifications limited. It offers a balanced approach to three naturally interrelated topics: – Probabilistic Inference: Relies on the core concept of Bayesian inferential statistics, to help practicing forensic scientists in the logical and balanced evaluation of the weight of evidence. – Decision Making: Features how Bayes factors are interpreted in practical applications to help address questions of decision analysis involving the use of forensic science in the law. – Operational Relevance: Combines inference and decision, backed up with practical examples and complete sample code in R, including sensitivity analyses and discussion on how to interpret results in context. Over the past decades, probabilistic methods have established a firm position as a reference approach for the management of uncertainty in virtually all areas of science, including forensic science, with Bayes' theorem providing the fundamental logical tenet for assessing how new information—scientific evidence—ought to be weighed. Central to this approach is the Bayes factor, which clarifies the evidential meaning of new information, by providing a measure of the change in the odds in favor of a proposition of interest, when going from the prior to the posterior distribution. Bayes factors should guide the scientist's thinking about the value of scientific evidence and form the basis of logical and balanced reporting practices, thus representing essential foundations for rational decision making under uncertainty. This book would be relevant to students, practitioners, and applied statisticians interested in inference and decision analyses in the critical field of forensic science. It could be used to support practical courses on Bayesian statistics and decision theory at both undergraduate and graduate levels, and will be of equal interest to forensic scientists and practitioners of Bayesian statistics for driving their evaluations and the use of R for their purposes.
... From a Bayesian perspective, specificity contributes to severity by means of increasing the diagnosticity or expected evidential value of a hypothesis test. This concept can be operationalized as the expected absolute log-Bayes factor of the experiment (Good, 1950, 1975, 1979; Lindley, 1956; Nelson, 2005; Schönbrodt & Wagenmakers, 2018; Stefan et al., 2019):
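A toy operationalisation of the "expected absolute log-Bayes factor" mentioned in the excerpt, for two simple hypotheses about a coin. The hypotheses, sample size, and equal prior weights are invented; the cited paper's own formalisation may differ in detail.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(7)

# H0: theta = 0.5 vs H1: theta = 0.7 for n_flips coin tosses (invented values).
n_flips, theta0, theta1 = 50, 0.5, 0.7
n_sim = 20_000

h1_true = rng.random(n_sim) < 0.5                      # equal prior weight on each hypothesis
k = rng.binomial(n_flips, np.where(h1_true, theta1, theta0))
log_bf = binom.logpmf(k, n_flips, theta1) - binom.logpmf(k, n_flips, theta0)

print(round(np.mean(np.abs(log_bf)), 2))               # expected evidential value of the design
```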
Article
Full-text available
A tradition that goes back to Sir Karl R. Popper assesses the value of a statistical test primarily by its severity: was there an honest and stringent attempt to prove the tested hypothesis wrong? For "error statisticians" such as Mayo (1996, 2018), and frequentists more generally, severity is a key virtue in hypothesis tests. Conversely, failure to incorporate severity into statistical inference, as allegedly happens in Bayesian inference, counts as a major methodological shortcoming. Our paper pursues a double goal: First, we argue that the error-statistical explication of severity has substantive drawbacks; specifically, the neglect of research context and the specificity of the predictions of the hypothesis. Second, we argue that severity matters for Bayesian inference via the value of specific, risky predictions: severity boosts the expected evidential value of a Bayesian hypothesis test. We illustrate severity-based reasoning in Bayesian statistics by means of a practical example and discuss its advantages and potential drawbacks.
... Naive Bayesian networks (NB) are composed of directed acyclic graphs with only one parent (representing the unobserved node) and several children (corresponding to observed nodes), under a rigid independence assumption between the input variables [52]. The naïve Bayes classifier calculates the conditional probability from the likelihood and prior distribution using Bayes' theorem: ...
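The conditional-independence calculation described above amounts to multiplying a class prior by per-feature likelihoods and renormalising. A toy sketch with invented priors and likelihoods (the class names borrow the application's flavour only; none of the numbers come from the cited study):

```python
import numpy as np

priors = {"sugarcane": 0.4, "other": 0.6}
likelihoods = {                       # P(feature value | class), assumed independent given the class
    "sugarcane": [0.8, 0.7],
    "other":     [0.3, 0.4],
}

unnormalised = {c: priors[c] * np.prod(likelihoods[c]) for c in priors}
total = sum(unnormalised.values())
posteriors = {c: p / total for c, p in unnormalised.items()}
print(posteriors)                     # class probabilities given the observed features (~0.76 / 0.24)
```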
Article
Full-text available
Crop phenology monitoring is a necessary action for precision agriculture. Sentinel-1 and Sentinel-2 satellites provide us with the opportunity to monitor crop phenology at a high spatial resolution with high accuracy. The main objective of this study was to examine the potential of the Sentinel-1 and Sentinel-2 data and their combination for monitoring sugarcane phenological stages and evaluate the temporal behaviour of Sentinel-1 parameters and Sentinel-2 indices. Seven machine learning models, namely logistic regression, decision tree, random forest, artificial neural network, support vector machine, naïve Bayes, and fuzzy rule based systems, were implemented, and their predictive performance was compared. Accuracy, precision, specificity, sensitivity or recall, F score, area under curve of receiver operating characteristic and kappa value were used as performance metrics. The research was carried out in the Indo-Gangetic alluvial plains in the districts of Hisar and Jind, Haryana, India. The Sentinel-1 backscatters and parameters VV, alpha and anisotropy and, among Sentinel-2 indices, normalized difference vegetation index and weighted difference vegetation index were found to be the most important features for predicting sugarcane phenology. The accuracy of models ranged from 40 to 60%, 56 to 84% and 76 to 88% for Sentinel-1 data, Sentinel-2 data and combined data, respectively. Area under the ROC curve and kappa values also supported the supremacy of the combined use of Sentinel-1 and Sentinel-2 data. This study infers that combined Sentinel-1 and Sentinel-2 data are more efficient in predicting sugarcane phenology than Sentinel-1 and Sentinel-2 alone.
... Here, the estimated diagnosticity of the weather forecast will be defined as π (where 0.5 ≤ π ≤ 1). The diagnosticity of information will influence the level of certainty Thomas believes he will possess after receiving that information and updating his belief, as it equates to the strength of the evidence in Bayes' theorem (Good, 1950). He can then calculate the EU of the future states he may enter. ...
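One simple way to make the role of diagnosticity π concrete is to treat it as both the hit rate and the correct-rejection rate of the forecast in a Bayesian update; the prior and π below are invented, and the thesis's own formalisation may differ.

```python
# Toy Bayesian update with a forecast of diagnosticity pi (0.5 <= pi <= 1).
def posterior_rain(prior_rain: float, pi: float, forecast_says_rain: bool) -> float:
    like_rain = pi if forecast_says_rain else (1 - pi)
    like_dry = (1 - pi) if forecast_says_rain else pi
    return like_rain * prior_rain / (like_rain * prior_rain + like_dry * (1 - prior_rain))

print(round(posterior_rain(prior_rain=0.3, pi=0.8, forecast_says_rain=True), 3))   # ~0.632
```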
Thesis
When trying to form accurate beliefs and make good choices, people often turn to one another for information and advice. But deciding whom to listen to can be a challenging task. While people may be motivated to receive information from accurate sources, in many circumstances it can be difficult to estimate others’ task-relevant expertise. Moreover, evidence suggests that perceptions of others’ attributes are influenced by irrelevant factors, such as facial appearances and one’s own beliefs about the world. In this thesis, I present six studies that investigate whether messenger characteristics that are unrelated to the domain in question interfere with the ability to learn about others’ expertise and, consequently, lead people to make suboptimal social learning decisions. Studies one and two explored whether (dis)similarity in political views affects perceptions of others’ expertise in a non-political shape categorisation task. The findings suggest that people are biased to believe that messengers who share their political opinions are better at tasks that have nothing to do with politics than those who do not, even when they have all the information needed to accurately assess expertise. Consequently, they are more likely to seek information from, and are more influenced by, politically similar than dissimilar sources. Studies three and four aimed to formalise this learning bias using computational models and explore whether it generalises to a messenger characteristic other than political similarity. Surprisingly, in contrast to the results of studies one and two, in these studies there was no effect of observed generosity or political similarity on expertise learning, information-seeking choices, or belief updating. Studies five and six were then conducted to reconcile these conflicting results and investigate the boundary conditions of the learning bias observed in studies one and two. Here, we found that, under the right conditions, non-politics-based similarities can influence expertise learning and whom people choose to hear from; that asking people to predict how others will answer questions enhances learning from observed outcomes; and that it is unlikely that inattentiveness explains why we observed null effects in studies three and four.
... There are sophisticated measures for assessing calibration experiments quantitatively, but one simple one is known as "Turing's rule". This follows from a theorem, attributed by I. J. Good to Alan Turing [66,67], which shows that the "expected value of a LR in favour of a wrong proposition is one". This implies that the average value of the LR in a large Hd-true calibration experiment should be approximately one: in practice, this means a highly skewed distribution, with most LRs close to zero and the odd one or two substantially greater than one. ...
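A quick simulation of the "Turing's rule" behaviour described above, using an arbitrary pair of Gaussian score models rather than the calibration models of any particular LR system: the mean LR over Hd-true cases is close to one, while the distribution is highly skewed.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Scores follow N(0,1) under Hd and N(2,1) under Hp (a toy choice).
scores = rng.normal(loc=0.0, scale=1.0, size=1_000_000)
lr = norm.pdf(scores, loc=2.0, scale=1.0) / norm.pdf(scores, loc=0.0, scale=1.0)

print(round(lr.mean(), 3))      # close to 1 (Turing's rule)
print(round(np.median(lr), 3))  # ~0.135: most LRs are well below 1
print((lr > 1).mean())          # only a minority of Hd-true cases exceed 1
```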
Article
Full-text available
The forensic community has devoted much effort over the last decades to the development of a logical framework for forensic interpretation, which is essential for the safe administration of justice. We review the research and guidelines that have been published and provide examples of how to implement them in casework. After a discussion on uncertainty in the criminal trial and the roles that the DNA scientist may take, we present the principles of interpretation for evaluative reporting. We show how their application helps to avoid a common fallacy and present strategies that DNA scientists can apply so that they do not transpose the conditional. We then discuss the hierarchy of propositions and explain why it is considered a fundamental concept for the evaluation of biological results and the differences between assessing results given propositions that are at the source level or the activity level. We show the importance of pre-assessment, especially when the questions relate to the alleged activities, and when transfer and persistence need to be considered by the scientists to guide the court. We conclude with a discussion on statement writing and testimony. This provides guidance on how DNA scientists can report in a balanced, transparent, and logical way.
... Bayesian reasoning is an inductive approach, consistent with the laws of probability, to analyze degrees of plausibility among multiple explanations. Incorporating probability into logical analysis of propositions goes back to the work of Good [17] and De Finetti [18], who examined logical combinations of propositions that may be expressed as a composition of negations, conjunctions, and disjunctions, where antecedents retain some degree of uncertainty. ...
Preprint
Balancing model complexity against the information contained in observed data is the central challenge to learning. In order for complexity-efficient models to exist and be discoverable in high dimensions, we require a computational framework that relates a credible notion of complexity to simple parameter representations. Further, this framework must allow excess complexity to be gradually removed via gradient-based optimization. Our n-ary, or n-argument, activation functions fill this gap by approximating belief functions (probabilistic Boolean logic) using logit representations of probability. Just as Boolean logic determines the truth of a consequent claim from relationships among a set of antecedent propositions, probabilistic formulations generalize predictions when antecedents, truth tables, and consequents all retain uncertainty. Our activation functions demonstrate the ability to learn arbitrary logic, such as the binary exclusive disjunction (p xor q) and ternary conditioned disjunction ( c ? p : q ), in a single layer using an activation function of matching or greater arity. Further, we represent belief tables using a basis that directly associates the number of nonzero parameters to the effective arity of the belief function, thus capturing a concrete relationship between logical complexity and efficient parameter representations. This opens optimization approaches to reduce logical complexity by inducing parameter sparsity.
... Under what conditions do we get commutativity in this model? Hartry Field (1978), presumably inspired by the old Bayesian idea (Good 1950, 1983) that ratios of new to old odds furnish the correct representation of what is learned from new evidence alone, established the remarkable result that the classical PK formula (3.1) could be transformed into the "re-parameterized" form ...
Article
Richard Jeffrey’s Conditioning, Kinematics, and Exchangeability is one of the foundational documents of probability kinematics. However, the section entitled Successive Updating contains a subtle error involving updating by so-called relevance quotients in order to ensure the commutativity of successive probability kinematical revisions. Upon becoming aware of this error Jeffrey formulated the appropriate remedy, but never discussed the issue in print. To head off any confusion, it seems worthwhile to alert readers of Jeffrey’s paper to the aforementioned error, and to document his remedy, placing it in the context of both earlier and subsequent work on commuting probability kinematical revisions.
... This requires that the evidential weight of an unprecedented finding is sufficient to put it in conflict with the sceptical prior rendering it non-credible. In the AnCred framework, this implies a finding possesses intrinsic credibility at level α if the estimate θ̂ is outside the corresponding sceptical prior interval [−SL, SL] extracted using Reverse-Bayes from the finding itself, i.e. θ̂² > SL², with SL given in (10). Matthews showed this implies an unprecedented finding is intrinsically credible at level α = 0.05 if its p-value does not exceed 0.013. ...
Article
Full-text available
It is now widely accepted that the standard inferential toolkit used by the scientific research community - null-hypothesis significance testing (NHST) - is not fit for purpose. Yet despite the threat posed to the scientific enterprise, there is no agreement concerning alternative approaches for evidence assessment. This lack of consensus reflects long-standing issues concerning Bayesian methods, the principal alternative to NHST. We report on recent work that builds on an approach to inference put forward over 70 years ago to address the well-known “Problem of Priors” in Bayesian analysis, by reversing the conventional prior-likelihood-posterior (“forward”) use of Bayes's Theorem. Such Reverse-Bayes analysis allows priors to be deduced from the likelihood by requiring that the posterior achieve a specified level of credibility. We summarise the technical underpinning of this approach, and show how it opens up new approaches to common inferential challenges, such as assessing the credibility of scientific findings, setting them in appropriate context, estimating the probability of successful replications, and extracting more insight from NHST while reducing the risk of misinterpretation. We argue that Reverse-Bayes methods have a key role to play in making Bayesian methods more accessible and attractive for evidence assessment and research synthesis. As a running example we consider a recently published meta-analysis from several randomized controlled trials (RCTs) investigating the association between corticosteroids and mortality in hospitalized patients with COVID-19.
... Hence, providing that the defence hypothesis is simply 'not the prosecution hypothesis' (i.e. 'prosecution hypothesis is false'), the LR is a genuine and meaningful measure of the probative value of evidence in the sense originally popularised in [35], since it translates directly to changes in probability of the hypotheses. ...
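Under the condition described in the excerpt (the defence hypothesis is simply the negation of the prosecution hypothesis), the LR maps prior probability to posterior probability directly; in generic notation:

```latex
\[
P(H_p \mid E) \;=\; \frac{\mathrm{LR}\cdot P(H_p)}{\mathrm{LR}\cdot P(H_p) + \bigl(1 - P(H_p)\bigr)},
\qquad \text{when } H_d \equiv \neg H_p .
\]
```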
Article
Full-text available
The likelihood ratio (LR) is a commonly used measure for determining the strength of forensic match evidence. When a forensic expert determines a high LR for DNA found at a crime scene matching the profile of a suspect, they typically report that 'this provides strong support for the prosecution hypothesis that the DNA comes from the suspect'. Our observations suggest that, in certain circumstances, the use of the LR may have led lawyers and jurors into grossly overestimating the probative value of an LTDNA mixed profile 'match'.
Preprint
Full-text available
Classical statistical methods have theoretical justification when the sample size is predetermined by the data-collection plan. In applications, however, it's often the case that sample sizes aren't predetermined; instead, investigators might use the data observed along the way to make on-the-fly decisions about when to stop data collection. Since those methods designed for static sample sizes aren't reliable when sample sizes are dynamic, there's been a recent surge of interest in e-processes and the corresponding tests and confidence sets that are anytime valid in the sense that their justification holds up for arbitrary dynamic data-collection plans. But if the investigator has relevant-yet-incomplete prior information about the quantity of interest, then there's an opportunity for efficiency gain, but existing approaches can't accommodate this. Here I build a new, regularized e-process framework that features a knowledge-based, imprecise-probabilistic regularization that offers improved efficiency. A generalized version of Ville's inequality is established, ensuring that inference based on the regularized e-process remains anytime valid in a novel, knowledge-dependent sense. In addition to anytime valid hypothesis tests and confidence sets, the proposed regularized e-processes facilitate possibility-theoretic uncertainty quantification with strong frequentist-like calibration properties and other Bayesian-like features: they satisfy the likelihood principle, avoid sure loss, and offer formal decision-making with reliability guarantees.
Article
Full-text available
F. P. Ramsey, R. B. Braithwaite, and all of their many supporters over the last 103 years never read Keynes's A Treatise on Probability. It is easy to show this simply by studying pp. 4-6 of chapter I of A Treatise on Probability and comparing Keynes's analysis to page 3 of Ramsey's 1922 review, which was published in Cambridge Magazine and republished in 1989 in The British Journal for the Philosophy of Science. Pages 4-6 provide an excellent introduction to Keynes's formal analysis contained in Part II of A Treatise on Probability. All of Keynes's analysis is based on the work of G. Boole. Consider Boole's basic, introductory statements in chapter I of his 1854 The Laws of Thought: “Instead, then, of saying that Logic is conversant with relations among things and relations among facts, we are permitted to say that it is concerned with relations among things and relations among propositions.… Among such relations I suppose to be included those which affirm or deny existence with respect to things, and those which affirm or deny truth with respect to propositions. Now let those things or those propositions among which relation is expressed be termed the elements of the propositions by which such relation is expressed. Proceeding from this definition, we may then say that the premises of any logical argument express given relations among certain elements, and that the conclusion must express an implied relation among those elements, or among a part of them, i.e. a relation implied by or inferentially involved in the premises… As the conclusion must express a relation among the whole or among a part of the elements involved in the premises, it is requisite that we should possess the means of eliminating those elements which we desire not to appear in the conclusion, and of determining the whole amount of relation implied by the premises among the elements which we wish to retain. Those elements which do not present themselves in the conclusion are, in the language of the common Logic, called middle terms; and the species of elimination exemplified in treatises on Logic consists in deducing from two propositions, containing a common element or middle term, a conclusion connecting the two remaining terms.” (Boole, 1854, pp. 7-8; underline and italics added). Now there is one conclusion that we can derive from Boole, which is that in an argument form the conclusion must be related to the premises. Other terms besides “related” that one could use would be the words relevant, similar, or like. Keynes, in fact, uses all four terms: related, like, similar, and relevant. It is impossible to deploy Boole's relational, propositional logic in the case where the propositions, premises and conclusion, are unrelated, irrelevant, dissimilar, or unlike each other. This is, however, exactly what Ramsey did. Ramsey presented a series of argument forms where the premises and conclusion are unrelated to each other, unlike each other, irrelevant to each other, or dissimilar to each other, so that it would be impossible to compare the premises and conclusion to each other.
Note that Keynes explicitly rejects this in 1921 by arguing that the premises and conclusion must be logically connected to each other (Keynes, A Treatise on Probability, p. 5) or must always be comparable to each other (Keynes, ibid., pp. 137-138): “We can only be interested in our final results when they deal with actually existent and intelligible probabilities—for our object is, always, to compare one probability with another—and we are not incommoded, therefore, in our symbolic operations by the circumstance that sums and products do not exist between every pair of probabilities.” (Keynes, 1921, pp. 137-138). Ramsey does exactly the opposite. Ramsey chooses to deal with actually nonexistent and unintelligible probabilities, so that it is impossible to compare one probability with another. Ramsey makes the absurd, idiotic, preposterous, and incomprehensible claim that Keynes’s logical theory of probability is based on propositions which are completely unrelated to each other. An example of this is C. Misak’s favorite, cited example taken from Ramsey’s 1922 review (Misak, 2020, p. 114), that “...My carpet is blue, Napoleon was a great general...” (Ramsey, 1922, p. 3). Bertrand Russell, unfortunately without explicitly identifying Ramsey by name in his 1922 review, gave the following nonsensical type of Ramsey example: “2+2=4, Napoleon disliked poodles.” (Russell, 1922, p. 120, *ft).
Chapter
In this chapter, we give an overview of predictive modeling as used by actuaries. Historically, we moved from relatively homogeneous portfolios to tariff classes, and then to modern insurance, with the concept of “premium personalization.” Modern modeling techniques are presented, starting with econometric approaches and moving on to machine-learning techniques.
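As a hedged sketch of the kind of econometric starting point such chapters usually describe (a Poisson GLM for claim frequency with a log-exposure offset), the snippet below fits a frequency model on simulated data; the variables, parameter values, and use of the statsmodels library are assumptions for illustration, not content from the chapter itself.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulate a small, hypothetical motor portfolio.
rng = np.random.default_rng(0)
n = 5_000
df = pd.DataFrame({
    "age": rng.integers(18, 80, n),        # policyholder age
    "exposure": rng.uniform(0.5, 1.0, n),  # fraction of the year insured
})
lam = np.exp(-2.0 + 0.01 * (60 - df["age"])) * df["exposure"]
df["claims"] = rng.poisson(lam)

# Poisson GLM for claim frequency, with exposure entering as an offset.
freq_model = smf.glm("claims ~ age", data=df,
                     family=sm.families.Poisson(),
                     offset=np.log(df["exposure"])).fit()
print(freq_model.params)  # fitted frequency effects; pure premium = frequency x severity
```

A personalized premium then combines the predicted frequency with a severity model, and the machine-learning techniques mentioned in the chapter can play the same predictive role.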
Article
Full-text available
Ramsey's criticism, that "… the obvious one is that there really do not seem to be any such things as the probability relations he describes", ignored Keynes's analysis, for example on page 36 of A Treatise on Probability, which required that the propositions linking the premises and conclusion be similar. Ramsey ignored the fact that Keynes's method was based on using Boole's relational, propositional logic, which required that the propositions be connected, related or similar before the formal, mathematical, symbolic logic could be applied. Keynes pointed out that the "…analogy between orders of similarity and probability is so great that its apprehension will greatly assist that of the ideas I wish to convey" (Keynes, 1921, p. 36; italics added). Ramsey failed to grasp Keynes's analogy between similarity and probability before criticizing Keynes. Ramsey's main example of supposed errors in Keynes's analysis relies on examples such as his "My carpet is green, Napoleon was a great general" (Ramsey, 1922, p. 3), which involve dissimilar and unrelated propositions that are not connected. Keynes's introductory comments on p. 36 were then explored in far greater depth and detail in Part III of A Treatise on Probability. Keynes's objective probability relations are simply objective relations connecting old, known situations with new situations that can be shown to be related. Human pattern-recognition skills involve using resemblance functions based on past memory that project past knowledge of old situations onto new situations, where similarities are seen to exist between the old, known situation and a new, unexplored situation. One can then come up with a rational degree of belief regarding how some new situation will play out, given the similarities that exist between the old and new situations.
Article
Full-text available
Concussions are a serious public health problem, with significant healthcare costs and risks. One of the most serious complications of concussions is an increased risk of subsequent musculoskeletal injuries (MSKI). However, there is currently no reliable way to identify which individuals are at highest risk for post-concussion MSKIs. This study proposes a novel data analysis strategy for developing a clinically feasible risk score for post-concussion MSKIs in student-athletes. The data set consists of one-time tests (e.g., mental health questionnaires) and relevant information on demographics, health history (including details regarding the concussion, such as day of the year and time lost) and athletic participation (current sport and contact level), all collected at a single time point, together with clinical assessments (i.e., cognitive, postural stability, reaction time, and vestibular and ocular motor testing) collected at multiple time points (baseline and follow-up time points after the concussion). The follow-up time point measurements were treated as individual variables and as differences from the baseline. Our approach used a weight-of-evidence (WoE) transformation to handle missing data and variable heterogeneity, and machine learning methods for variable selection and model fitting. We applied a training-testing sample-splitting scheme and performed variable preprocessing with the WoE transformation. Then, machine learning methods were applied to predict the MSKI indicator, thereby constructing a composite risk score for the training-testing sample. This methodology demonstrates the potential of using machine learning methods to improve the accuracy and interpretability of risk scores for MSKI.
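A minimal sketch of a weight-of-evidence (WoE) transformation of a single predictor, of the general kind the abstract refers to, is shown below; the column names, the treatment of missing values as their own bin, and the smoothing constant are illustrative assumptions rather than details of the study's pipeline.

```python
import numpy as np
import pandas as pd

def woe_table(df, feature, target, eps=0.5):
    """WoE per bin of `feature` for a binary `target`; missing values form their own bin."""
    g = df.assign(bin=df[feature].fillna("MISSING")).groupby("bin")[target]
    events = g.sum() + eps                 # e.g. MSKI occurred
    nonevents = g.count() - g.sum() + eps  # no MSKI
    woe = np.log((events / events.sum()) / (nonevents / nonevents.sum()))
    iv = ((events / events.sum()) - (nonevents / nonevents.sum())) @ woe  # information value
    return woe, iv

# Hypothetical toy data with a missing contact level.
toy = pd.DataFrame({"contact_level": ["high", "low", None, "high", "low", "high"],
                    "mski": [1, 0, 1, 1, 0, 0]})
woe, iv = woe_table(toy, "contact_level", "mski")
print(woe, "\nIV:", iv)
```

Replacing each category (including "MISSING") by its WoE value gives a numeric coding that downstream machine-learning models can use directly, which is one reason the transformation is convenient for heterogeneous variables with missing data.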
Chapter
There is a growing consensus in the social sciences on the virtues of research strategies that combine quantitative with qualitative tools of inference. Integrated Inferences develops a framework for using causal models and Bayesian updating for qualitative and mixed-methods research. By making, updating, and querying causal models, researchers are able to integrate information from different data sources while connecting theory and empirics in a far more systematic and transparent manner than standard qualitative and quantitative approaches allow. This book provides an introduction to fundamental principles of causal inference and Bayesian updating and shows how these tools can be used to implement and justify inferences using within-case (process tracing) evidence, correlational patterns across many cases, or a mix of the two. The authors also demonstrate how causal models can guide research design, informing choices about which cases, observations, and mixes of methods will be most useful for addressing any given question.
Chapter
Why is deciding to do something sometimes so slow and difficult? How do we make decisions when lacking key information? When making decisions, the higher areas of the brain deliberately suppress lower areas capable of generating much faster but ill-considered responses while they develop ones that are more sophisticated, based on what can be gained in return. In this engaging book, the authors explore the increasingly popular neural model that may explain these mechanisms: the linear approach to threshold with ergodic rate (LATER). Presenting a detailed description of the neurophysiological processes involved in decision-making and how these link to the LATER model, this is the first major resource covering its applications in describing human behaviour. With over 100 illustrations and a thorough discussion of the mathematics supporting the model, this is a rigorous yet accessible resource for psychologists, cognitive neuroscientists and neurophysiologists interested in decision-making.
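A hedged simulation sketch of the core LATER idea as it is commonly described (a decision signal rising linearly from a start level to a threshold at a rate drawn afresh from a normal distribution on each trial, so that the reciprocal of latency is approximately normally distributed) is given below; the parameter values are arbitrary and nothing here is taken from the book itself.

```python
import numpy as np

# LATER-style simulation: latency = (threshold - start) / rate, rate ~ Normal.
rng = np.random.default_rng(2)
distance = 1.0                                       # threshold minus start level, arbitrary units
rates = rng.normal(loc=5.0, scale=1.0, size=10_000)  # rise rate, redrawn per trial
rates = rates[rates > 0]                             # discard non-rising trials for simplicity
latencies = distance / rates
recip = 1.0 / latencies
print("median latency:", np.median(latencies))
print("1/latency mean, sd:", recip.mean(), recip.std())  # ~ the normal the model predicts
```

Plotting latencies on a reciprocal (probit) scale would give the roughly straight "reciprobit" lines associated with this class of model.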
Conference Paper
Full-text available
Artificial Intelligence (AI) is being used to improve customer satisfaction, create more efficiencies, and lower costs in several sectors of the business environment. The study explores consumer perspectives on AI and humanized Artificial Intelligence (HAI) and their implications for customer satisfaction. Using an exploratory-cum-descriptive design, the study implemented a convergent mixed-methods approach to collect data from respondents. Through a snowball sampling strategy, 29 respondents completed the survey. For the qualitative component, two focus groups were held remotely, in which various aspects of AI and HAI were explored with the participants. The findings indicate that chatbots and HAI were viewed more favorably than other forms of AI technologies for advancing customer satisfaction. It is recommended that organizations implement HAI and AI technologies with caution, without over-humanizing these technologies, striking a balance between consumer acceptance and the priority of enhancing customer satisfaction.
Article
Evidential support is often equated with confirmation, where evidence supports hypothesis H if and only if it increases the probability of H. This article argues against this received view. As the author shows, support is a comparative notion in the sense that increase-in-probability is not. A piece of evidence can confirm H, but it can confirm alternatives to H to the same or greater degree; and in such cases, it is at best misleading to conclude that the evidence supports H. The author puts forward an alternative view that defines support in terms of measures of degree of confirmation. The proposed view is both sufficiently comparative and able to accommodate the increase-in-probability aspect of support. The author concludes that the proposed measure-theoretic approach to support provides a superior alternative to the standard confirmatory approach.
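A toy numerical example of the comparative point (the priors and likelihoods are invented purely for illustration): evidence E raises the probability of H1, yet raises the probability of a rival H2 by more, so reporting only that E "supports" H1 would be misleading in exactly the sense the article describes.

```python
# Three mutually exclusive hypotheses; E confirms H1 but confirms H2 more.
priors = {"H1": 0.2, "H2": 0.2, "H3": 0.6}
likelihoods = {"H1": 0.5, "H2": 0.9, "H3": 0.1}  # P(E | H)
p_e = sum(priors[h] * likelihoods[h] for h in priors)
posteriors = {h: priors[h] * likelihoods[h] / p_e for h in priors}
print(posteriors)  # H1: 0.2 -> ~0.29 (confirmed); H2: 0.2 -> ~0.53 (confirmed far more)
```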
Article
This article develops and explores a robust Bayes factor derived from a calibration technique that makes it particularly compatible with elicited prior knowledge. Building on previous explorations, the particular robust Bayes factor, dubbed a neutral-data comparison, is adapted for broad comparisons with existing robust Bayes factors, such as the fractional and intrinsic Bayes factors, in configurations defined by informative priors. The calibration technique is furthermore developed for use with flexible parametric priors (that is, mixture prior distributions with components that may be symmetric or skewed), and demonstrated in an example context from forensic science. Throughout the exploration, the neutral-data comparison is shown to exhibit desirable sensitivity properties, and to show promise for adaptation to elaborate data-analysis scenarios.
Chapter
Full-text available
This chapter presents an overview of statistics in forensic science, with an emphasis on the Bayesian perspective and the role of the Bayes factor in logical inference and decision. The chapter introduces the reader to three key topics that forensic scientists commonly encounter and that are treated in this book: model choice, evaluation and investigation. For each of these themes, Bayes factors will be developed in later chapters and discussed using practical examples. Particular attention will be given to the distinction between feature- and score-based Bayes factors, representing different approaches to deal with input information (i.e., measurements). This introductory chapter also provides theoretical background that analysts might need during data analysis, including elements of forensic interpretation, computational methods, decision theory, prior elicitation and sensitivity analysis.
Chapter
Alan Turing (1912–1954) made seminal contributions to mathematical logic, computation, computer science, artificial intelligence, cryptography and theoretical biology. In this volume, outstanding scientific thinkers take a fresh look at the great range of Turing's contributions, on how the subjects have developed since his time, and how they might develop still further. The contributors include Martin Davis, J. M. E. Hyland, Andrew R. Booker, Ueli Maurer, Kanti V. Mardia, S. Barry Cooper, Stephen Wolfram, Christof Teuscher, Douglas Richard Hofstadter, Philip K. Maini, Thomas E. Woolley, Eamonn A. Gaffney, Ruth E. Baker, Richard Gordon, Stuart Kauffman, Scott Aaronson, Solomon Feferman, P. D. Welch and Roger Penrose. These specially commissioned essays will provoke and engross the reader who wishes to understand better the lasting significance of one of the twentieth century's deepest thinkers.
Article
Why is it good to be less, rather than more, incoherent? Julia Staffel, in her excellent book “Unsettled Thoughts,” answers this question by showing that if your credences are incoherent, then there is some way of nudging them toward coherence that is guaranteed to make them more accurate and reduce the extent to which they are Dutch-bookable. This seems to show that such a nudge toward coherence makes them better fit to play their key epistemic and practical roles: representing the world and guiding action. In this paper, I argue that Staffel’s strategy needs a small tweak. While she identifies appropriate measures of epistemic value, she does not identify appropriate measures of practical value. Staffel measures practical value using Dutch-bookability scores. But credences have practical value in virtue of recommending actions that produce as much utility as possible. And while susceptibility to a Dutch book is a surefire sign that one’s credences are needlessly bad at this task, one’s degree of Dutch-bookability is not itself a good measure of how well they recommend practically valuable actions. Strictly proper scoring rules, I argue, are the right tools for measuring both epistemic and practical value. I show that we can rerun Staffel’s strategy swapping in strictly proper scoring rules for Dutch-bookability measures. So long as one’s epistemic scoring rule and practical scoring rule are “sufficiently similar,” there is some way of nudging incoherent credences toward coherence that is guaranteed to yield more of both types of value.
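A toy calculation in the spirit of the accuracy half of this argument (the credences are invented and this is not an example from the paper): incoherent credences of 0.4 in A and 0.4 in not-A are Brier-dominated by the nearby coherent credences (0.5, 0.5), i.e. the coherent point scores strictly better in every possible world.

```python
# Brier score of a credence pair against a world (1 = true, 0 = false).
def brier(credences, world):
    return sum((c - w) ** 2 for c, w in zip(credences, world))

incoherent, coherent = (0.4, 0.4), (0.5, 0.5)
for world in [(1, 0), (0, 1)]:  # A true, A false
    print(world, brier(incoherent, world), brier(coherent, world))
# (1, 0): 0.52 vs 0.50   (0, 1): 0.52 vs 0.50  -> dominance in accuracy
```

The paper's claim is that strictly proper scores of this kind, rather than degrees of Dutch-bookability, are also the right way to measure the practical value gained by such a move toward coherence.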
Chapter
Full-text available
Humans constantly search for and use information to solve a wide range of problems related to survival, social interactions, and learning. While it is clear that curiosity and the drive for knowledge occupies a central role in defining what being human means to ourselves, where does this desire to know the unknown come from? What is its purpose? And how does it operate? These are some of the core questions this book seeks to answer by showcasing new and exciting research on human information-seeking. The volume brings together perspectives from leading researchers at the cutting edge of the cognitive sciences, working on human brains and behavior within psychology, computer science, and neuroscience. These vital connections between disciplines will continue to lead to further breakthroughs in our understanding of human cognition.