Article

Improving police decision making: General principles and practical applications of Receiver Operating Characteristic analysis


Abstract

Receiver operating characteristic (ROC) analysis is a widely used and accepted method for improving decision making performance across a range of diagnostic settings. The goal of this paper is to demonstrate how ROC analysis can be used to improve the quality of decisions made routinely in a policing context. To begin, I discuss the general principles underlying the ROC approach and demonstrate how one can conduct the analysis. Several practical applications of ROC analysis are then presented by drawing on a number of policing tasks where the procedure has been used already (bite mark identification and linking serial crimes) or where it could be used in the future (statement validity assessment and determining the veracity of suicide notes). I conclude by considering briefly some of the potential difficulties that may be encountered when using ROC analysis in the policing context and offer some possible solutions to these problems. Copyright © 2005 John Wiley & Sons, Ltd.


... We will call this approach the "pairwise linking task". Like other two-alternative (yes-no) type diagnostic decisions that must be made based on ambiguous evidence, the goals when using this approach are to identify what behaviors are best suited for distinguishing between crimes committed by the same offender versus different offenders, and to determine how similar two crimes should be before a decision is made that they have been committed by the same offender (i.e., establish an appropriate decision threshold; Bennell, 2005; Swets et al., 2000). Research has shown that it is possible to achieve both these goals and to accomplish the pairwise crime linkage task with a reasonable degree of accuracy (Bennell et al., 2014). ...
... As is typical in crime linkage research, when across-crime similarity scores are calculated in this fashion, they tend to be larger, on average, for crime pairs committed by the same offender. Borrowing from work in other diagnostic fields such as radiology (e.g., Swets et al., 2000), it has been argued that the degree of overlap between these distributions indicates how useful the behaviors in question will be for discriminating between crimes committed by the same offender versus different offenders (i.e., the more overlap, the more difficult it will be; Bennell, 2005). For example, if the distributions overlap completely, it will be impossible for the similarity scores that gave rise to those distributions to be used for discriminatory purposes because every score is just as likely to be associated with crimes committed by the same offender as by different offenders.
[Figure 1: Hypothetical distributions of across-crime similarity scores for crimes committed by the same offender (the right distribution) versus different offenders (the left distribution).] ...
... Researchers have also argued that a threshold can be set anywhere along the x-axis in this figure (the dashed line in Figure 1) to indicate when a "linked decision" should be made for a particular pair of crimes, and that the decision outcomes resulting from possible thresholds can be examined to determine an "optimal" threshold (Bennell, 2005). More specifically, when conceptualizing crime linkage decisions in this way, there are four possible decision outcomes. ...
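To make the four outcomes concrete, here is a minimal sketch (in Python, with invented similarity scores and linkage labels) of how a single decision threshold partitions crime pairs into hits, false alarms, misses, and correct rejections. It illustrates the idea described above; it is not code from any of the cited studies.

```python
import numpy as np

# Illustrative only: similarity scores for eight crime pairs and the
# ground truth of whether each pair shares an offender.
scores = np.array([0.82, 0.74, 0.35, 0.61, 0.12, 0.55, 0.28, 0.90])
linked = np.array([True, True, False, True, False, False, False, True])

threshold = 0.5                      # call a pair "linked" at or above this score
called_linked = scores >= threshold

hits = np.sum(called_linked & linked)                  # same offender, called linked
false_alarms = np.sum(called_linked & ~linked)         # different offenders, called linked
misses = np.sum(~called_linked & linked)               # same offender, called unlinked
correct_rejections = np.sum(~called_linked & ~linked)  # different offenders, called unlinked

print(hits, false_alarms, misses, correct_rejections)  # 4 1 0 3
```

Raising the threshold trades false alarms for misses; lowering it does the reverse, which is why threshold placement matters.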
Article
Deciding whether two crimes have been committed by the same offender or different offenders is an important investigative task. Crime linkage researchers commonly use receiver operating characteristic (ROC) analysis to assess the accuracy of linkage decisions. Accuracy metrics derived from ROC analysis—such as the area under the curve (AUC)—offer certain advantages, but also have limitations. This paper describes the benefits that crime linkage researchers attribute to the AUC. We also discuss several limitations in crime linkage papers that rely on the AUC. We end by presenting suggestions for researchers who use ROC analysis to report on crime linkage. These suggestions aim to enhance the information presented to readers, derive more meaningful conclusions from analyses, and propose more informed recommendations for practitioners involved in crime linkage tasks. Our reflections may also benefit researchers from other areas of psychology who use ROC analysis in a wide range of prediction tasks.
... The literature in respect of the optimal forager suggests they are serial offenders. In the absence of forensic, visual or witness evidence, the most effective way of identifying the presence of serial offending is through an accurate crime linkage process (Rossmo, 2000 and Bennell and Jones, 2005). Once serial foraging offending has been confirmed, it is then possible to target responses more accurately and effectively. ...
... It is in this context that the technique will be used within this study. Several studies have been done to date using this technique to build on the aforementioned regression analysis to predict crime linkage, calibrate the validity of crime-linking features and, most importantly, identify and produce decision-making thresholds (Bennell, 2005 and Bennell and Jones, 2005). The knowledge provided in this chapter will continue to build upon these studies by testing them in a new context, namely that of foraging burglary offenders. ...
... After the plotting of this data is complete it will usually produce a curve known as the ROC curve, which runs diagonally from left to right. The start of the curve, situated in the bottom left-hand corner, indicates a strict decision-making threshold; the curve finishes in the top right-hand corner, where a lenient threshold lies (Bennell and Jones, 2005:27 and Swets, 1998). The area underneath the ROC curve is known as the 'area under the curve' (AUC), and it is this area that provides the overall measure of decision-making accuracy. ...
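As a companion to the excerpt above, here is a minimal sketch of how an ROC curve is traced by sweeping the threshold from strict (bottom left) to lenient (top right), with the AUC obtained by trapezoidal integration. The helper name `roc_curve_points` and the toy data are assumptions for illustration only.

```python
import numpy as np

def roc_curve_points(scores, linked):
    """Sweep the decision threshold from strictest to most lenient and
    record the (false-alarm rate, hit rate) point at each step."""
    order = np.argsort(-np.asarray(scores))  # highest similarity scores first
    linked = np.asarray(linked)[order]
    hit_rate = np.cumsum(linked) / linked.sum()
    fa_rate = np.cumsum(~linked) / (~linked).sum()
    # prepend (0, 0), the point for a threshold stricter than every score
    return np.concatenate(([0.0], fa_rate)), np.concatenate(([0.0], hit_rate))

scores = [0.82, 0.74, 0.35, 0.61, 0.12, 0.55, 0.28, 0.90]
linked = [True, True, False, True, False, False, False, True]
fpr, tpr = roc_curve_points(scores, linked)

# Trapezoidal integration gives the area under the curve (AUC).
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
print(round(auc, 2))  # 1.0 here: the toy scores separate the two groups perfectly
```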
Thesis
Drawn from ecology, the optimal forager predictive policing methodology has been identified as the primary tasking tool used by police services to tackle domestic burglary. Built upon established findings that the target selection behaviour of foraging domestic burglary offenders can be predicted, this thesis examines the physical offending and geographical characteristics of foraging offenders in greater detail. This study extends established research evidence by drawing upon criminological methods that have the potential to increase the approach's effectiveness before testing their applicability in respect of foraging criminals. Ecological research evidence relating to assumptions of foraging behaviour is used to devise theoretical manifestations within criminal behaviour, which are subsequently tested for and used to build a theoretical model to combat them. The study achieves all of this through a number of key research chapters, which include (1) identifying predictive thresholds for linking burglary offences committed by foraging criminals, (2) drawing on existing assumptions within ecology to identify their presence within foraging criminals, including the presence of significant crime displacement, and (3) identifying and testing geographical profiling as a potential solution to combat the evasive behaviour of foraging offenders in response to the increased police presence that the optimal forager model is designed to co-ordinate. Underpinning the study throughout is an examination of the enablers and blockers that impact upon the effectiveness of such transitions of theory into practice. Overall, the thesis provides new theoretical material by creating a framework of foraging offender typologies. The key practical implications for policing include a model for tackling the identified theoretical foraging typologies to increase crime prevention and reduction efforts in respect of domestic burglary.
... understand the processes underlying the linking task and to systematically determine the degree to which it is possible to successfully link a series of crimes (e.g. Bennell & Canter, 2002; Bennell & Jones, 2005; Ewart, Oatley, & Burn, 2005; Grubin et al., 2001; Santtila, Fritzon, & Tamelander, 2005; Santtila, Junkkila, & Sandnabba, 2005; Santtila, Korpela, & Hakkanen, 2004; Woodhams, Grant, & Price, 2007; Woodhams, Hollin, & Bull, 2007; Woodhams & Toye, 2007). ...
... This method, borrowed directly from the field of signal detection theory (Green & Swets, 1966), is known as receiver operating characteristic (ROC) analysis. The principles underlying this analytical technique have been discussed elsewhere (Swets, 1996), as has its relevance to the area of policing (Bennell, 2005). The purpose of the current article is rather to: (1) present theoretical and practical arguments supporting the use of this approach for studying/conducting linkage analysis over alternative methods; (2) illustrate the practical application of this approach to the linking task through an empirical analysis of serial rape data; and (3) challenge commonly held assumptions about linkage analysis based on the empirical findings that emerge from this analysis. ...
... Both methods are rational and, arguably, both yield results that are more sensible in producing the desired balance of decision outcomes than would occur if an arbitrary threshold were selected. Technically, however, neither of these approaches can be considered optimal (Bennell & Jones, 2005). The optimal approach for selecting a threshold would ideally account for the base-rate probabilities of encountering crimes committed by the same offender versus different offenders in the jurisdiction under consideration, along with the costs and benefits of the various linking decisions (Swets, 1992). Unfortunately, at the moment, it is difficult to assign quantitative values to some of these terms (e.g. ...
Article
Full-text available
Purpose. Through an examination of serial rape data, the current article presents arguments supporting the use of receiver operating characteristic (ROC) analysis over traditional methods in addressing challenges that arise when attempting to link serial crimes. Primarily, these arguments centre on the fact that traditional linking methods do not take into account how linking accuracy will vary as a function of the threshold used for determining when two crimes are similar enough to be considered linked.Methods. Considered for analysis were 27 crime scene behaviours exhibited in 126 rapes, which were committed by 42 perpetrators. Similarity scores were derived for every possible crime pair in the sample. These measures of similarity were then subjected to ROC analysis in order to (1) determine threshold-independent measures of linking accuracy and (2) set appropriate decision thresholds for linking purposes.Results. By providing a measure of linking accuracy that is not biased by threshold placement, the analysis confirmed that it is possible to link crimes at a level that significantly exceeds chance (AUC = .75). The use of ROC analysis also allowed for the identification of decision thresholds that resulted in the desired balance between various linking outcomes (e.g. hits and false alarms).Conclusions. ROC analysis is exclusive in its ability to circumvent the limitations of threshold-specific results yielded from traditional approaches to linkage analysis. Moreover, results of the current analysis provide a basis for challenging common assumptions underlying the linking task.
... Linking crimes behaviorally can result in four different outcomes, two of which are correct (hit, correct rejection) and two of which are incorrect (false alarm, miss) (Bennell, 2005). ROC analysis plots the probability of a hit versus the probability of a false alarm at each decision threshold (e.g., from 0 to 1 if using Jaccard's coefficient) rather than just one. ...
... Interested readers are referred to Bennell (2005). ... The analysis yielded an AUC of .75, which can be considered good according to published standards (Swets, 1988). ...
... 95% CI = .86-.89), representing a significant (p<.001) and excellent level of predictive accuracy (Hosmer & Lemeshow, 2000). The ROC curve can be seen in Fig. 2. Youden's index was calculated to identify the decision threshold (for deciding when a pair should be considered linked) at which the proportion of hits would be maximised whilst the proportion of false alarms would be minimised (Bennell, 2005). ...
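Youden's index, mentioned in this excerpt, has a direct computational form: J = hit rate minus false-alarm rate, maximized over candidate cutoffs. A minimal sketch follows, reusing invented toy data; the helper name is illustrative, not from the cited paper.

```python
import numpy as np

def youden_threshold(scores, linked):
    """Return the cutoff maximizing Youden's J = hit rate - false-alarm rate."""
    best_cut, best_j = None, -1.0
    for cut in np.unique(scores):
        called = scores >= cut
        j = np.mean(called[linked]) - np.mean(called[~linked])
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

scores = np.array([0.82, 0.74, 0.35, 0.61, 0.12, 0.55, 0.28, 0.90])
linked = np.array([True, True, False, True, False, False, False, True])
print(youden_threshold(scores, linked))  # (0.61, 1.0) for these toy data
```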
Article
Full-text available
Case linkage involves identifying crime series on the basis of behavioral similarity and distinctiveness. Research regarding the behavioral consistency of serial rapists has accumulated; however, it has its limitations. One of these limitations is that convicted or solved crime series are exclusively sampled whereas, in practice, case linkage is applied to unsolved crimes. Further, concerns have been raised that previous studies might have reported inflated estimates of case linkage effectiveness due to sampling series that were first identified based on similar modus operandi (MO), thereby overestimating the degree of consistency and distinctiveness that would exist in naturalistic settings. We present the first study to overcome these limitations; we tested the assumptions of case linkage with a sample containing 1) offenses that remain unsolved, and 2) crime series that were first identified as possible series through DNA matches, rather than similar MO. Twenty-two series consisting of 119 rapes from South Africa were used to create a dataset of 7021 crime pairs. Comparisons of crime pairs that were linked using MO vs. DNA revealed significant, but small differences in behavioral similarity with MO-linked crimes being characterized by greater similarity. When combining these two types of crimes together, linked pairs (those committed by the same serial offender) were significantly more similar in MO behavior than unlinked pairs (those committed by two different offenders) and could be differentiated from them. These findings support the underlying assumptions of case linkage. Additional factors thought to impact on linkage accuracy were also investigated. Keywords: Comparative case analysis; Linkage analysis; Behavioral linking; Sexual assault; Sexual offense
... When these points are plotted on a graph (hits on the y-axis and false alarms on the x-axis) and the points are connected, the result is a concave curve. The area under the curve (AUC) can be used as a measure of linking accuracy (Bennell, 2005). The AUC can range from 0 (total inaccuracy) to 1 (total accuracy), although most ROC curves fall above the positive diagonal on the graph, which represents an AUC of 0.50 (chance accuracy). ...
... Importantly, given that the AUC represents the location of the entire ROC curve in the ROC graph, this measure provides an index of linking accuracy (for both J and S) that is not specific to any single decision threshold. In this way, the AUC provides a more valid measure of linking accuracy (for other advantages associated with ROC analysis, see Bennell, 2005; Bennell et al., 2009). All ROC analyses were performed with the use of the ROC analysis subroutine in SPSS (v. ...
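The threshold-free character of the AUC described in this excerpt can be made concrete: the AUC equals the proportion of linked/unlinked score pairs in which the linked pair scores higher (with ties counted as one half), the rank-based interpretation formalized by Hanley and McNeil later on this page. A small sketch with invented data:

```python
import numpy as np

def auc_threshold_free(scores, linked):
    """AUC as the probability that a randomly chosen linked pair scores
    higher than a randomly chosen unlinked pair (ties count one half)."""
    linked = np.asarray(linked, dtype=bool)
    pos, neg = np.asarray(scores)[linked], np.asarray(scores)[~linked]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

scores = [0.82, 0.74, 0.35, 0.61, 0.12, 0.55, 0.28, 0.90]
linked = [True, True, False, True, False, False, False, True]
print(auc_threshold_free(scores, linked))  # 1.0 for these toy data
```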
Article
When relying on crime scene behaviours to link serial crimes, linking accuracy may be influenced by the measure used to assess across-crime similarity and the types of behaviours included in the analysis. To examine these issues, the present study compared the level of linking accuracy achieved by using the simple matching index (S) to that of the commonly used Jaccard's coefficient (J) across themes of arson behaviour. The data consisted of 42 crime scene behaviours, separated into three behavioural themes, which were exhibited by 37 offenders across 114 solved arsons. The results of logistic regression and receiver operating characteristic analysis indicate that, with the exception of one theme where S was more effective than J at discriminating between linked and unlinked crimes, no significant differences emerged between the two similarity measures. In addition, our results suggest that thematically unrelated behaviours can be used to link crimes with the same degree of accuracy as thematically related behaviours, potentially calling into question the importance of theme-based approaches to behavioural linkage analysis.
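The two similarity coefficients compared in this study differ only in whether jointly absent behaviours count toward similarity. A small sketch, with an invented pair of binary behaviour vectors, makes the contrast explicit:

```python
import numpy as np

# Two crimes coded as binary behaviour vectors (1 = behaviour present).
a = np.array([1, 0, 1, 1, 0, 0, 0, 1])
b = np.array([1, 0, 0, 1, 0, 0, 1, 1])

both = np.sum((a == 1) & (b == 1))     # behaviours present in both crimes
either = np.sum((a == 1) | (b == 1))   # present in at least one crime
neither = np.sum((a == 0) & (b == 0))  # absent from both crimes

jaccard = both / either                      # J ignores joint absences
simple_matching = (both + neither) / len(a)  # S rewards joint absences too

print(jaccard, simple_matching)  # 0.6 0.75
```

Because S credits crimes for behaviours that neither exhibits, it tends to run higher than J on the same data, which is why the choice of coefficient can matter for linkage decisions.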
... As with the different methods of conducting crime linkage, these different decision thresholds haven't been explored, nor is it clear whether a lower (or higher) threshold relative to other analysts is consistently adopted by an individual across their practice. As has been explained in papers testing the validity of the crime linkage principles, the adoption of a strict versus lenient threshold will likely affect the accuracy of crime linkage decisions (i.e., the relative proportion of hits, false alarms, misses and correct rejections; Bennell, 2005). ...
Chapter
Crime linkage can be a useful tool in the investigation of sexual offenses when other, physical evidence is unavailable or too costly to process. It involves identifying behavior that is both consistent and distinctive, and thus forms an identifiable pattern through which a series of offenses committed by the same offender can be distinguished. While there is a substantial body of research to support the principles of crime linkage, samples often contain only one type of sexual offense, and further research is needed into offenses such as voyeurism and exhibitionism. In practice, there are a number of ways in which crime linkage can be conducted, and a variety of terms are used to describe these different processes. While writings from practitioners provide insight into how crime linkage is conducted, research now needs to focus more on systematically mapping its practice and documenting procedural differences. There are also a number of additional considerations that require further research attention where the practice of crime linkage is concerned, such as the utility of computerised databases designed to assist with the process, the human decision-making element of linking and how bias can affect this, and the effects of expertise and training on linkage efficacy.
... The results of the present study clearly demonstrate that effective discrimination between genuine suicide notes and diaries depends on the inherent discriminatory power of a given variable, or possibly a combination of several variables, especially with respect to linguistic factors. In a practical context, it is futile to rely on the discriminatory power of only a single piece of evidence in quantitative linguistic analysis to arrive at a given scientific conclusion (Bennell, 2005; Bennell & Jones, 2005). Therefore, one of the primary advantages of using this methodology to discriminate between suicide notes and normal diaries is that accuracy and utility may be considered simultaneously, without one variable biasing the other. ...
Article
The objective of this study was to explore linguistic and psychological differences in suicide notes and the diaries of non-suicidal people. Fifty-six suicide notes and 56 personal diary entries were analyzed to provide basic descriptive data on linguistic and psychological variables using the Korean Linguistic Inquiry and Word Count Program. The results revealed that the two groups noticeably differed—suicide notes had more words or phrases per sentence; fewer modifiers, numerals, and affixes/suffixes associated with logically elaborated sentences; and more first- and second-person pronouns than the diaries of non-suicidal individuals. Suicide notes also used fewer positive words and future tense verbs, as well as more negative, sadness/depression-related words. Inter-correlations between the linguistic and psychological domains revealed some heterogeneity between the groups. Further research is needed on the practical applications of this study.
... ROC analysis is a useful measure of predictive accuracy because it provides an estimate that is independent of specific decision thresholds (e.g. Bennell, 2005). Furthermore, the AUC is flexible in terms of being able to evaluate a wide variety of offender behaviours and able to compare across samples that differ in terms of base rate and composition (Bennell, 2002; Liu et al., 2011). ...
... Canter (2000) mentions that, in order to manage the sometimes complex process of decision-making, heuristics or "shortcuts" are used in information processing, which may introduce potential biases (e.g., confirmation bias; Oskamp, 1965; Oatley, Ewart, & Zeleznikow, 2006). Moreover, according to Bennell (2005), these decisions are difficult to make for two reasons ...
[Table 1: Indicators of positive outcomes and imminent danger during critical incidents]
Article
Full-text available
The aim of the study is to examine several combinations of risk factors known to the police – i.e., perpetrator and situational characteristics of the incidents – leading to two possible outcomes in domestic hostage and barricade incidents (HBI): autoaggressive and heteroaggressive behaviors from the perpetrator. Using a series of conjunctive analyses on a sample of 534 hostage and barricade incidents, results show that, depending on the specific combination of factors, the likelihood of violence could very much vary, suggesting that the checklist approach is not adequate for risk assessment in hostage and barricade incidents. HIGH-RISK represents a decision support system that can be easily integrated to any crisis-intervention structure (CIS) or police unit dealing with hostage and barricade incidents.
... ROC analysis is an analytical technique commonly implemented in case linkage analysis to represent the varying decision thresholds used by police investigators (Bennell, 2005; Bennell & Jones, 2005). A ROC graph is produced that represents how the ratio of hits (correctly predicting that two crimes are linked) to false alarms (predicting that two crimes are linked when they are not) varies across different Jaccard scores. ...
Article
The present study investigated behavioural consistency across sexual offending. Variations in behavioural consistency may arise from an increased influence of situational and contextual aspects. However, there is a paucity of research exploring variations in behavioural consistency relative to the temporal sequence of the behaviour (e.g., occurring prior to or during the offence). A sample of 49 male serial stranger sexual offenders responsible for 147 offences across four temporal phases of a sexual offence was used in the current study. For each offence, four crime phases were identified: 1) pre-crime, 2) victim selection, 3) approach, and 4) assault. Behavioural consistency within and across offence series was examined utilizing Jaccard's Coefficient and Receiver Operating Characteristic (ROC) analysis. Results indicated a high degree of behavioural consistency across all crime aspects; behaviours that were more dependent on situational influences were inherently less predictable and demonstrated to be less consistent. Further, increased behavioural consistency was associated with offender characteristics of a more stable nature. The implications of these findings are discussed.
... Where they exist, the focus of psychological studies of detective work is often investigative decision-making (e.g., Alison, Barrett, & Crego, 2007; Almond, Alison, Eyre, Crego, & Goodwill, 2008; Bennell, 2005; Brandl, 1993a; Hall, 2005; Mullins et al., 2008; O'Keefe, 2002; Rossmo, 2008b). According to Markman and Medin: "decision situations are generally defined as those in which the decision maker has some unsatisfied goal and a set of options that might satisfy the goal" (2002, p. 413). ...
Thesis
Full-text available
This thesis explores psychological mechanisms underlying the acquisition, interpretation and exploitation of information in complex criminal enquiries. Detective work is conceptualised as problem-solving and the importance of sense-making is highlighted. A model of investigative sense-making is presented, grounded in social-cognitive psychological and criminological research and bringing together several theoretical concepts within one coherent framework. Two studies explored aspects of this framework. First, 42 UK police officers gave written responses to four crime-related vignettes. Content analysis of the answers showed how sense-making about what had occurred varied according to the vignettes and between participants. Building on this pilot, a simulated investigation method was developed and tested with 22 UK detectives. Qualitative content analysis of ‘think aloud’ transcripts (using the qualitative analysis package N-Vivo) focused on how participants made sense of the victim’s story, the characteristics of the offender and the plausibility of potential suspects. Participants spontaneously generated and tested multiple hypotheses about investigative information using mental simulation, tolerating high levels of uncertainty throughout the ‘investigation’ and paying particular attention to investigative opportunities. This research suggests that successful detectives need the ability to imagine multiple potential explanations for investigative data and the knowledge to identify the opportunities for action such data affords. Download at http://etheses.bham.ac.uk/353/
... life threatening, as might be the case with cancer screening). Although certainly not the only measure that can be used to assess linking accuracy, the AUC does have several advantages associated with it (Bennell, 2005), which may explain why it has become so popular in linking studies. This is especially true when the AUC is being used to examine levels of linkage accuracy across different studies. ...
Article
The number of published studies examining crime linkage analysis has grown rapidly over the last decade, to the point where a special issue of this journal has recently been dedicated to the topic. Many of these studies have used a particular measure (the area under the receiver operating characteristic curve, or the AUC) to quantify the degree to which it is possible to link crimes. This article reviews studies that have utilised the AUC and examines how good we are currently at linking crimes (within the context of these research studies) and what factors impact linking accuracy. The results of the review suggest that, in the majority of cases, moderate levels of linking accuracy are achieved. Of the various factors that have been examined that might impact linking accuracy, the three factors that appear to have the most significant impact are crime type, behavioural domain, and jurisdiction. We discuss how generalisable these results are to naturalistic investigative settings. We also highlight some of the important limitations of the linking studies that we reviewed and offer up some strategies for moving this area of research forward.
... Dimensional versus multivariate behavioural linking Before a crime analyst or behavioural investigative advisor has to evaluate whether to link a target offence to a larger sample or to compare a limited string of sexual offences, he or she has to decide which characteristics are sufficiently salient and promising for such a task. Relatively consistent but common behaviours have the potential for large amounts of hits and false alarms, whereas rather unique and rare offender behaviours reduce the amount of false alarms but increase the possibility of missing out on other relevant offences (for a detailed discussion of crime linking as a diagnostic task, see Bennell, 2005;Bennell & Jones, 2005). One pertinent consideration is whether to rely on several exact offence characteristics successively or simultaneously (i.e. ...
Article
Full-text available
The empirical support for linkage analysis is steadily increasing, but the question remains as to what method of linking is the most effective. We compared a more theory‐based, dimensional behavioural approach with a rather pragmatic, multivariate behavioural approach with regard to their accuracy in linking serial sexual assaults in a UK sample of serial sexual assaults (n = 90) and one‐off sexual assaults (n = 129). Their respective linkage accuracy was assessed by (1) using seven dimensions derived by non‐parametric Mokken scale analysis (MSA) as predictors in discriminant function analysis (DFA) and (2) 46 crime scene characteristics simultaneously in a naive Bayesian classifier (NBC). The dimensional scales predicted 28.9% of the series correctly, whereas the NBC correctly identified 34.5% of the series. However, a subsequent inclusion of non‐serial offences in the target group decreased the amount of correct links in the dimensional approach (MSA–DFA: 8.9%; NBC: 32.2%). Receiver operating characteristic analysis was used as a more objective comparison of the two methods under both conditions, confirming that each achieved good accuracies (AUCs = .74–.89), but the NBC performed significantly better than the dimensional approach. The consequences for the practical implementation in behavioural case linkage are discussed. Copyright © 2012 John Wiley & Sons, Ltd.
... It is important to note here that the statistical dependence between the linked and unlinked crime pairs violated the assumption of independence for logistic regression (Bennell & Canter, 2002). ROC analysis does not have such an assumption and has additional advantages over logistic regression when assessing predictive accuracy (Bennell, 2005). The results of the ROC analysis should therefore be given greater credence. ...
Article
Whilst case linkage is used with serious forms of serial crime (e.g. rape and murder), the potential exists for it to be used with volume crime. This study replicates and extends previous research on the behavioural linking of burglaries. One hundred and sixty solved residential burglaries were sampled from a British police force. From these, 80 linked crime pairs (committed by the same serial offender) and 80 unlinked crime pairs (committed by two different serial offenders) were created. Following the methodology used by previous researchers, the behavioural similarity, geographical proximity, and temporal proximity of linked crime pairs were compared with those of unlinked crime pairs. Geographical and temporal proximity possessed a high degree of predictive accuracy in distinguishing linked from unlinked pairs as assessed by logistic regression and receiver operating characteristic analyses. Comparatively, other traditional modus operandi behaviours showed less potential for linkage. Whilst personality psychology literature has suggested we might expect to find a relationship between temporal proximity and behavioural consistency, such a relationship was not observed. Copyright © 2010 John Wiley & Sons, Ltd.
Article
Full-text available
An increasing amount of research has been conducted on crime linkage, a practice that has already been presented as expert evidence in some countries; however, it is questionable whether standards of admissibility, applied in some jurisdictions, have been achieved (e.g., the Daubert criteria). Much research has assessed the two basic assumptions underpinning this practice: that offenders are consistent in the way they commit their crimes and that offenders commit their crimes in a relatively distinctive manner. While studies of these assumptions with stranger sex offenses exist, they are problematic for two reasons: (1) small samples and (2) the exclusive sampling of serial offenders' offenses. The current study addressed these limitations with a larger sample (N = 50 series comprising 194 offenses, and N = 50 one-off offenses), sampling the offenses of both serial and one-off sex offenders and thereby representing a more ecologically valid test of the assumptions. The two assumptions were tested simultaneously by assessing how accurately 365 linked crime pairs could be differentiated from 29,281 unlinked crime pairs through the use of Leave-One-Out Cross-Validation logistic regression followed by Receiver Operating Characteristic analysis. An excellent level of predictive accuracy was achieved providing support for the assumptions underpinning crime linkage.
Article
Full-text available
Much previous research on behavioural case linkage has used binary logistic regression to build predictive models that can discriminate between linked and unlinked offences. However, classification tree analysis has recently been proposed as a potential alternative owing to its ability to build user‐friendly and transparent predictive models. Building on previous research, the current study compares the relative ability of logistic regression analysis and classification tree analysis to construct predictive models for the purposes of case linkage. Two samples are utilised in this study: a sample of 376 serial car thefts committed in the UK and a sample of 160 serial residential burglaries committed in Finland. In both datasets, logistic regression and classification tree models achieve comparable levels of discrimination accuracy, but the classification tree models demonstrate problems in terms of reliability or usability that the logistic regression models do not. These findings suggest that future research is needed before classification tree analysis can be considered a viable alternative to logistic regression in behavioural case linkage. Copyright © 2012 John Wiley & Sons, Ltd.
Article
Two studies examined the degree to which training could improve participants’ ability to determine the authenticity of suicide notes. In Study 1, informing participants about variables that are known to discriminate between genuine and simulated suicide notes did not improve their decision accuracy beyond chance, nor did this training allow participants to perform as accurately as a statistical prediction rule. In Study 2, the provision of additional training instructions did enhance participants’ decision accuracy but not to a level achieved by the statistical prediction rule. However, training that included all instructions simultaneously resulted in a slight performance decrease attributable to the fact that certain instructions proved problematic when applied to the sample of suicide notes upon which decisions were being made. The potential implications of these findings for police decision making and training are discussed.
Article
Counterterrorism investigations commonly suffer from information overload problems that make the identification of relevant patterns difficult. Geographic prioritization models can be useful tools in such situations. We applied the general theories and principles of the environmental criminology perspective, and the specific ideas and concepts developed in geographic profiling, as a basis for understanding the geospatial patterns of terrorist cell behavior in Ankara and Istanbul, Turkey. From a unique access to police investigation files, we collected spatial data on terrorism incidents and terrorist cell sites, mapped these locations, and measured the distances from cell sites to incident sites and the distances between cell sites. The resulting probability distributions provide the basis for the development of a geospatial model for intelligence management.
Article
Full-text available
Human performance on the geographic profiling task—where the goal is to predict an offender's home location from their crime locations—has been shown to equal that of complex actuarial methods when it is based on appropriate heuristics. However, this evidence is derived from comparisons of ‘X-marks-the-spot’ predictions, which ignore the fact that some algorithms provide a prioritization of the offender's area of spatial activity. Using search area as a measure of performance, we examine the predictions of students (N = 200) and an actuarial method under three levels of information load and two levels of heuristic-environment fit. Results show that the actuarial method produces a smaller search area than a concentric search outward from students' ‘X-marks-the-spot’ predictions, but that students are able to produce search areas that are smaller than those provided by the actuarial method. Students' performance did not decrease under greater information load and was not improved by adding a descriptive qualifier to the taught heuristic. Copyright © 2008 John Wiley & Sons, Ltd.
Article
The suicide note is a valuable source of information for assisting police forces in equivocal death investigations. The present study endeavored to develop statistical prediction rules to discriminate between genuine and simulated suicide notes. Discriminant function analysis was performed on a sample of 33 genuine and 33 simulated notes to identify variables that serve as best predictors of note authenticity. Receiver operating characteristic analysis was then applied to validate these models and establish decision thresholds. The optimal model yielded an accuracy score of .82, with average sentence length and expression of positive affect being particularly effective at discriminating between the notes. Theoretical implications are discussed as are the practical advantages of applying receiver operating characteristic analysis in the investigation of equivocal deaths.
Chapter
Full-text available
Problems of classification in the field of Investigative Psychology are defined and examples of each problem class are introduced. The problems addressed are behavioural differentiation, discrimination among alternatives, and prioritisation of investigative options. Contemporary solutions to these problems are presented that use smallest space analysis, receiver operating characteristic analysis, and probability functions.
Article
Full-text available
This paper examines whether patterns in communication behavior over time can predict the outcome of crisis negotiations. A sample of 189 interaction episodes was transcribed from 9 resolved negotiations and coded according to differences in the degree and type of behavior. Partial order scalogram analysis (POSAC) was used to produce a graphical representation of the similarities and differences among episodes while simultaneously uncovering the role of each behavior in shaping the negotiation process. Results showed that episodes could be represented along a partially ordered scale of competitiveness, which was structured by the occurrence of two types of behavior: Distributive-Expressive and Integrative-Instrumental. The likelihood of negotiation success reduced with movement up the competitive scale, and negotiations involving episodes that passed a threshold of extreme competition on the scale inevitably ended unsuccessfully regardless of future developments. As negotiations developed over time, behavior alternated between periods of increasing cooperation and periods of increasing competition, with unsuccessful negotiations associated with a concluding trend of increasing competitive behavior.
Article
Full-text available
The purpose of this study is to determine if readily available information about commercial and residential serial burglaries, in the form of the offender's modus operandi, provides a statistically significant basis for accurately linking crimes committed by the same offender. Logistic regression analysis is applied to examine the degree to which various linking features can be used to discriminate between linked and unlinked burglaries. Receiver operating characteristic (ROC) analysis is then performed to calibrate the validity of these features and to identify optimal decision thresholds for linking purposes. Contrary to crime scene behaviours traditionally examined to link serial burglaries, the distance between crime site locations demonstrated significantly greater effectiveness as a linking feature for both commercial and residential burglaries. Specifically, shorter distances between crimes signalled an increased likelihood that burglaries were linked. Thus, these results indicate that, if one examines suitable behavioural domains, high levels of stability and distinctiveness exist in the actions of serial burglars, and these actions can be used to accurately link crimes committed by the same offender. Copyright © 2005 John Wiley & Sons, Ltd.
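As an illustration of the two-stage analysis this abstract describes (logistic regression to model a linking feature, then ROC analysis of the resulting predictions), here is a hedged sketch using scikit-learn. The exponential distance distributions, sample sizes, and seed are assumptions for demonstration only, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Simulated data (not the study's): one row per crime pair, with the
# distance between crime site locations (km) as the single linking feature.
rng = np.random.default_rng(0)
linked_dist = rng.exponential(scale=2.0, size=80)     # same offender: short distances
unlinked_dist = rng.exponential(scale=10.0, size=80)  # different offenders: longer
X = np.concatenate([linked_dist, unlinked_dist]).reshape(-1, 1)
y = np.concatenate([np.ones(80), np.zeros(80)])       # 1 = linked pair

model = LogisticRegression().fit(X, y)
p_linked = model.predict_proba(X)[:, 1]               # predicted P(pair is linked)

# Threshold-independent summary of how well distance discriminates.
print(round(roc_auc_score(y, p_linked), 2))
```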
Article
Full-text available
The area under the ROC curve is a common index summarizing the information contained in the curve. When comparing two ROC curves, though, problems arise when interest does not lie in the entire range of false-positive rates (and hence the entire area). Numerical integration is suggested for evaluating the area under a portion of the ROC curve. Variance estimates are derived. The method is applicable for either continuous or rating scale binormal data, from independent or dependent samples. An example is presented which looks at rating scale data of computed tomographic scans of the head with and without concomitant use of clinical history. The areas under the two ROC curves over an a priori range of false-positive rates are examined, as well as the areas under the two curves at a specific point.
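A minimal sketch of the partial-area idea in this abstract: trapezoidal integration of an ROC curve restricted to an a priori range of false-positive rates. The function name and toy curve are illustrative assumptions; the paper's variance estimates are not reproduced here.

```python
import numpy as np

def partial_auc(fpr, tpr, fpr_max=0.10):
    """Trapezoidal area under the ROC curve restricted to false-positive
    rates in [0, fpr_max]; fpr and tpr must be sorted by increasing fpr."""
    fpr, tpr = np.asarray(fpr), np.asarray(tpr)
    tpr_at_max = np.interp(fpr_max, fpr, tpr)  # interpolate the hit rate at the cut-off
    keep = fpr <= fpr_max
    xs = np.append(fpr[keep], fpr_max)
    ys = np.append(tpr[keep], tpr_at_max)
    return np.sum(np.diff(xs) * (ys[1:] + ys[:-1]) / 2)

# Toy ROC curve: partial area over false-positive rates up to 0.10.
print(partial_auc([0.0, 0.05, 0.2, 1.0], [0.0, 0.4, 0.7, 1.0], 0.10))  # 0.0325
```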
Article
Full-text available
A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. It is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject. Moreover, this probability of a correct ranking is the same quantity that is estimated by the already well-studied nonparametric Wilcoxon statistic. These two relationships are exploited to (a) provide rapid closed-form expressions for the approximate magnitude of the sampling variability, i.e., standard error that one uses to accompany the area under a smoothed ROC curve, (b) guide in determining the size of the sample required to provide a sufficiently reliable estimate of this area, and (c) determine how large sample sizes should be to ensure that one can statistically detect differences in the accuracy of diagnostic techniques.
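The closed-form standard-error expression this abstract refers to can be written down directly. Below is a sketch implementing the published approximation, with Q1 = A/(2 − A) and Q2 = 2A²/(1 + A); the AUC value and group sizes in the example are invented.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc**2)
           + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg)
    return math.sqrt(var)

# Hypothetical example: AUC = .75 from 60 diseased and 60 non-diseased subjects.
se = hanley_mcneil_se(0.75, 60, 60)
print(round(se, 3), (round(0.75 - 1.96 * se, 3), round(0.75 + 1.96 * se, 3)))
# ~0.045 and an approximate 95% CI of (0.663, 0.837)
```

Because the variance shrinks with n_pos × n_neg, the same relationship can be inverted to judge how large a sample is needed for a target precision, which is the paper's sample-size use case.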
Article
Full-text available
This paper uses statistical models to test directly the police practice of utilising modus operandi to link crimes to a common offender. Data from 86 solved commercial burglaries committed by 43 offenders are analysed using logistic regression analysis to identify behavioural features that reliably distinguish between linked and unlinked crime pairs. Receiver operating characteristic analysis is then used to assign each behavioural feature an overall level of predictive accuracy. The results indicate that certain features, in particular the distances between burglary locations, lead to high levels of predictive accuracy. This study therefore reveals some of the important consistencies in commercial burglary behaviour. These have theoretical value in helping to explain criminal activity. They also have practical value by providing the basis for a diagnostic tool that could be used in comparative case analysis.
Article
E.S. Shneidman and N. L. Farberow (1957) preselected writers of simulated suicide notes to eliminate vulnerable subjects. Subsequent comparisons of genuine and simulated notes have perpetuated the methodological misstep of the original study. In this study, a new set of genuine notes were selected from completed suicides by men and women who left at least one note, who were White, and who were older than 18 years of age. The simulated note writers (SNWs) were unpreselected, unpaid community volunteers. Genuine note writers in the current and 1957 samples were not found to differ; SNWs from the samples did differ. Problems with the interpretation of differences between genuine and simulated notes are discussed, with a focus on the role-playing nature of the simulated notes.
Article
The Criteria-Based Content Analysis (CBCA) technique was developed to distinguish children's truthful from fabricated allegations. Research results suggest some use for the procedure, but many important theoretical and empirical issues remain unresolved, including wide differences in the apparent usefulness of individual CBCA criteria, inconsistencies in the number of criteria used, and the absence of decision rules for evaluating an individual child. Other issues include the effect of the child's age and cultural background, motivation to lie, being coached to lie, and belief in the validity of a false memory. Although the CBCA technique shows some promise in enabling raters to differentiate true from false statements, the authors conclude that the presentation of expert testimony derived from CBCA analyses of an individual child would be premature and unwarranted.
Article
Henkelman, Kay, and Bronskill (HKB) showed that although the problem of ROC analysis without truth is underconstrained and thus not uniquely solvable in one dimension (one diagnostic test), it is in principle solvable in two or more dimensions. However, they gave no analysis of the resulting uncertainties. The present work provides a maximum-likelihood solution using the EM (expectation-maximization) algorithm for the two-dimensional case. We also provide an analysis of uncertainties in terms of Monte Carlo simulations as well as estimates based on Fisher Information Matrices for the complete- and the missing-data problem. We find that the number of patients required for a given precision of estimate for the truth-unknown problem is a very large multiple of that required for the corresponding truth-known case.
Article
Reports an error in the original article by S. T. Black (Journal of Consulting and Clinical Psychology, 1993[Aug], Vol 61[4], 669–702). On page 701, the 4th column heading in Table 2 appears as "Dunnett's t test probability." The correct column heading is "Dunn's t test probability." (The following abstract of this article appeared in record 1993-45704-001.) E. S. Shneidman and N. L. Farberow (1957) preselected writers of simulated suicide notes to eliminate vulnerable Ss. Subsequent comparisons of genuine and simulated notes have perpetuated the methodological misstep of the original study. In this study, a new set of genuine notes were selected from completed suicides by men and women who left at least one note, who were White, and who were older than 18 yrs of age. The simulated note writers (SNWs) were unpreselected, unpaid community volunteers. Genuine note writers in the current and the 1957 samples were not found to differ; SNWs from the samples did differ. Problems with the interpretation of differences between genuine and simulated notes are discussed, with a focus on the role-playing nature of simulated notes.
Article
Describes statement validity assessment (SVA), which is a set of techniques and procedures for obtaining and evaluating statements by children who allegedly have been sexually abused. SVA consists of a structured interview of the child witness, a system for analyzing the content of the child's recorded statement (criteria-based content analysis), and an overall method (validity checklist) for analyzing the validity of the allegations contained in the statement. The need for SVA in sexual abuse cases is discussed, together with the history and rationale of its development and details of the interview and assessment procedures.
Article
An automated detector designed to warn a system operator of a dangerous condition often has a low positive predictive value (PPV); that is, a small proportion of its warnings truly indicate the condition to be avoided. This is the case even for very sensitive detectors operating at very strict thresholds for issuing a warning because the prior probability of a dangerous condition is usually very low. As a consequence, operators often respond to a warning slowly or not at all. Reported here is a preliminary laboratory experiment designed in the context of signal detection theory that was conducted to examine the effects of variation in PPV on the latency of participants' response to a warning. Bonuses and penalties placed premiums on accurate performance in a background tracking task and on rapid response to the warnings. Observed latencies were short for high values of PPV, bimodal for middle-to-low values, and predominantly long for low values. The participants' response strategies for different PPVs were essentially optimal for the cost-benefit structure of the experiment. Some implications for system design are discussed.
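The low-PPV phenomenon described in this abstract follows directly from Bayes' rule. A worked sketch with assumed numbers (not the article's) shows why even a sensitive detector at a strict threshold produces mostly false warnings when the condition is rare:

```python
# Positive predictive value of a warning system via Bayes' rule.
sensitivity = 0.99       # P(warning | dangerous condition), assumed
false_alarm_rate = 0.02  # P(warning | no dangerous condition), assumed
prior = 0.001            # assumed base rate of the dangerous condition

ppv = (sensitivity * prior) / (
    sensitivity * prior + false_alarm_rate * (1 - prior)
)
print(round(ppv, 3))  # ~0.047: fewer than 1 in 20 warnings is genuine
```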
Contents: Introduction and Scope (p. 1); Components of Diagnostic Decision Making (p. 4); Statistical Machinery (p. 5); Examples of Enhanced Decision Making (p. 10); Conclusion and Discussion (p. 20); Acknowledgements (p. 23); References (p. 23); Appendix: Some Concepts of Possibility (p. 25)
Article
The Criteria-Based Content Analysis (CBCA) technique has been proposed as a way to differentiate truth from falsehood in interviews of children. We studied adults, utilizing 14 of the 19 CBCA criteria. In a 2 × 2 design, 114 students estimated the truthfulness of the statements of 12 adults; 6 were true and 6 described an invented traumatic personal experience. Subjects viewed a videotape or read a written transcript; half were trained in CBCA and half were not. Trained subjects who saw videotapes performed significantly better than chance and were significantly more accurate than each of the other 3 groups. For the trained subjects, 10 of the 14 CBCA criteria yielded significant differences in the predicted direction between evaluations of truthful and invented statements.
Article
Contents: Introduction; Definitions; Chronic Hepatitis: An Example; The Performance of Cross-Validation, the Jackknife, and the Bootstrap in Simulations; The Relationship between Cross-Validation and the Jackknife; Conclusions; References
Article
Many diagnostic tasks require that a threshold be set to convert evidence that is a matter of degree into a positive or negative decision. Although techniques of decision analysis used in psychology help one select the particular threshold that is appropriate to a given situation and purpose, just the concept of adjusting the threshold to the situation is not appreciated in many important practical arenas. Testing for the human immunodeficiency virus (HIV) and for dangerous flaws in aircraft structures are used here as illustrations. This article briefly reviews the relevant techniques and develops those two examples with data. It suggests that use of the decision techniques could substantially benefit individuals and society and asks how that use might be facilitated.
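The threshold-setting logic this abstract describes (and the base-rate/cost reasoning quoted earlier on this page from Swets, 1992) can be expressed as expected-utility maximization over cutoffs. The function below is an illustrative sketch under assumed payoff values, not the article's procedure:

```python
import numpy as np

def best_threshold(scores, positive, base_rate,
                   benefit_hit=1.0, cost_miss=1.0,
                   benefit_cr=1.0, cost_fa=1.0):
    """Pick the cutoff with the highest expected utility, weighting the four
    decision outcomes by the base rate and their assumed costs/benefits."""
    positive = np.asarray(positive, dtype=bool)
    best = (None, -np.inf)
    for cut in np.unique(scores):
        called = scores >= cut
        tpr = np.mean(called[positive])   # hit rate at this cutoff
        fpr = np.mean(called[~positive])  # false-alarm rate at this cutoff
        eu = (base_rate * (tpr * benefit_hit - (1 - tpr) * cost_miss)
              + (1 - base_rate) * ((1 - fpr) * benefit_cr - fpr * cost_fa))
        if eu > best[1]:
            best = (cut, eu)
    return best

# Toy data; a low base rate and costly false alarms favour a strict cutoff.
scores = np.array([0.82, 0.74, 0.35, 0.61, 0.12, 0.55, 0.28, 0.90])
positive = np.array([True, True, False, True, False, False, False, True])
print(best_threshold(scores, positive, base_rate=0.05, cost_fa=5.0))
```

Changing the assumed base rate or payoffs moves the selected cutoff, which is exactly the situation-specific threshold adjustment the article argues is under-appreciated in practice.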
Article
The area under the receiver operating characteristic (ROC) curve is a popular measure of the power of a (two-disease) diagnostic test, but it is shown here to be an inconsistent criterion: tests of indistinguishable clinical impacts may have different areas. A class of diagnosticity measures (DMs) of proven optimality is proposed instead. Once a regret(-like) measure of diagnostic uncertainty is agreed upon, the associated DM is uniquely defined and, indeed, calculable from the ROC curve configuration. Two scaled variants of the ROC are introduced and used to advantage in the analysis. They may also be helpful to students of medical decision making.
Article
The ability to detect lying was evaluated in 509 people including law-enforcement personnel, such as members of the U.S. Secret Service, Central Intelligence Agency, Federal Bureau of Investigation, National Security Agency, Drug Enforcement Agency, California police and judges, as well as psychiatrists, college students, and working adults. A videotape showed 10 people who were either lying or telling the truth in describing their feelings. Only the Secret Service performed better than chance, and they were significantly more accurate than all of the other groups. When occupational group was disregarded, it was found that those who were accurate apparently used different behavioral clues and had different skills than those who were inaccurate.
Article
Receiver operating characteristic (ROC) analysis, the preferred method of evaluating diagnostic imaging tests, requires an independent assessment of the true state of disease, which can be difficult to obtain and is often of questionable accuracy. A new method of analysis is described which does not require independent truth data and which can be used when several accurate tests are being compared. This method uses correlative information to estimate the underlying model of multivariate normal distributions of disease-positive and disease-negative patients. The method is shown to give results equivalent to conventional ROC analysis in a comparison of computed tomography, radionuclide scintigraphy, and magnetic resonance imaging for liver metastasis. When independent truth is available, the method can be extended to incorporate truth data or to evaluate the consistency of the truth data with the imaging data.
Article
Diagnostic systems of several kinds are used to distinguish between two classes of events, essentially "signals" and "noise". For them, analysis in terms of the "relative operating characteristic" of signal detection theory provides a precise and valid measure of diagnostic accuracy. It is the only measure available that is uninfluenced by decision biases and prior probabilities, and it places the performances of diverse systems on a common, easily interpreted scale. Representative values of this measure are reported here for systems in medical imaging, materials testing, weather forecasting, information retrieval, polygraph lie detection, and aptitude testing. Though the measure itself is sound, the values obtained from tests of diagnostic systems often require qualification because the test data on which they are based are of unsure quality. A common set of problems in testing is faced in all fields. How well these problems are handled, or can be handled in a given field, determines the degree of confidence that can be placed in a measured value of accuracy. Some fields fare much better than others.
Article
Tasks in which an observation is the basis for discriminating between two confusable alternatives are used widely in psychological experiments. Similar tasks occur routinely in many practical settings in which the objective is a diagnosis of some kind. Several indices have been proposed to quantify the accuracy of discrimination, whether the focus is on an observer’s capacity or skill, on the usefulness of tools designed to aid an observer, or on the capability of a fully automated device. The suggestion treated here is that candidate indices be evaluated by calculating their relative operating characteristics (ROCs). The form of an index’s ROC identifies the model of the discrimination process that is implied by the index, and that theoretical form can be compared with the form of empirical ROCs. If an index and its model yield a grossly different form of ROC than is observed in the data, then the model is invalid and the index will be unreliable. Most existing indices imply invalid models. A few indices are suitable; one is recommended.
Article
Classic studies of written suicide notes have sought to develop criteria for discriminating genuine from simulated notes. In this article, the authors provide a method of discourse analysis and apply this method to the discrimination of genuine from simulated notes used in previous studies. Reports of significant differences among language measures as well as the results of a multiple discriminant analysis using the discourse analysis are reported. In addition, a language profile of the suicidal individual is given along with suggestions for research and clinical use of the method.
Article
Sensitivity and specificity are key measures of the performance of a given test in detecting a given disorder. For tests yielding numerical scores, sensitivity and specificity usually vary inversely over the range of theoretically possible cutoff scores, complicating the task of quantifying and comparing the diagnostic accuracy of tests. Receiver operating characteristic (ROC) analysis approaches this problem by plotting the curve of sensitivity versus 1 − specificity for all possible cutoff scores of the test. The area under the ROC curve (AUC) can be used to describe the diagnostic accuracy of the test. Parametric and non-parametric methods exist that allow the calculation of the AUC and the comparison of tests. A disadvantage of parametric formulations is the assumption of a normal or Gaussian distribution of test scores. The present article presents a computer program that utilizes non-parametric formulations that do not require a normal distribution of test scores. The program calculates the sensitivity and specificity of a test at all possible cutoff scores, plots the ROC curve, calculates the AUC, its standard error and 95% confidence limits, and allows the comparison of tests on independent and correlated samples.
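As an illustration of the computations such a program performs, the sketch below calculates sensitivity and 1 − specificity at every observed cutoff, obtains the non-parametric AUC as a Mann–Whitney probability, and attaches a Hanley–McNeil standard error and 95% confidence interval. It is a minimal sketch with simulated data, not the published program.

import numpy as np

def roc_curve(pos, neg):
    """Sensitivity and 1 - specificity at every observed cutoff."""
    cutoffs = np.unique(np.concatenate([pos, neg]))
    sens = np.array([(pos >= c).mean() for c in cutoffs])
    fpr = np.array([(neg >= c).mean() for c in cutoffs])
    return cutoffs, sens, fpr

def auc_with_ci(pos, neg):
    # Non-parametric AUC as the Mann-Whitney probability that a
    # randomly chosen positive outscores a randomly chosen negative
    # (ties count half).
    gt = (pos[:, None] > neg[None, :]).mean()
    eq = (pos[:, None] == neg[None, :]).mean()
    auc = gt + 0.5 * eq
    # Hanley & McNeil (1982) standard error of the non-parametric AUC.
    n1, n0 = len(pos), len(neg)
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    se = np.sqrt((auc * (1 - auc) + (n1 - 1) * (q1 - auc**2)
                  + (n0 - 1) * (q2 - auc**2)) / (n1 * n0))
    return auc, se, (auc - 1.96 * se, auc + 1.96 * se)

rng = np.random.default_rng(1)
pos = rng.normal(1.2, 1.0, 80)   # scores for cases with the disorder
neg = rng.normal(0.0, 1.0, 120)  # scores for cases without it
cuts, sens, fpr = roc_curve(pos, neg)  # points for plotting the curve
auc, se, (lo, hi) = auc_with_ci(pos, neg)
print(f"AUC = {auc:.3f}, SE = {se:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")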
Article
E. S. Shneidman and N. L. Farberow (1957) preselected writers of simulated suicide notes to eliminate vulnerable subjects. Subsequent comparisons of genuine and simulated notes have perpetuated the methodological misstep of the original study. In this study, a new set of genuine notes was selected from completed suicides by men and women who left at least one note, who were White, and who were older than 18 years of age. The simulated note writers (SNWs) were unpreselected, unpaid community volunteers. Genuine note writers in the current and 1957 samples were not found to differ; SNWs from the two samples did differ. Problems with the interpretation of differences between genuine and simulated notes are discussed, with a focus on the role-playing nature of the simulated notes.
Article
Evaluating children for possible sexual abuse is widely regarded as a difficult clinical endeavor. Practitioners are concerned with both the basis for professional opinions and the accuracy of their ultimate judgments. Current approaches are critically analyzed for conceptual integrity and empirical support. The authors conclude that improvements in practice will be more productive than efforts to devise a procedure for classification of cases. Implications of this approach and recommendations for further research are discussed.
Article
We show that truth-state runs in rank-ordered data constitute a natural categorization of continuously distributed test results for maximum likelihood (ML) estimation of ROC curves. On this basis, we develop two new algorithms for fitting binormal ROC curves to continuously distributed data: a true ML algorithm (LABROC4) and a quasi-ML algorithm (LABROC5) that requires substantially less computation with large data sets. Simulation studies indicate that both algorithms produce reliable estimates of the binormal ROC curve parameters a and b, the ROC-area index Az, and the standard errors of those estimates.
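For orientation, the sketch below illustrates the binormal model that these algorithms fit. It simply assumes normally distributed scores, in which case the estimates of a and b reduce to sample means and standard deviations; it is a didactic stand-in, not a reimplementation of the truth-state-run ML algorithms described above, and the simulated data are illustrative.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
neg = rng.normal(0.0, 1.0, 150)   # actually-negative test results
pos = rng.normal(1.3, 1.4, 100)   # actually-positive test results

a = (pos.mean() - neg.mean()) / pos.std(ddof=1)   # binormal intercept
b = neg.std(ddof=1) / pos.std(ddof=1)             # binormal slope

# ROC curve implied by the model: TPF = Phi(a + b * Phi^-1(FPF)).
fpf = np.linspace(0.001, 0.999, 99)
tpf = norm.cdf(a + b * norm.ppf(fpf))

# Area under the fitted binormal curve (the Az index).
az = norm.cdf(a / np.sqrt(1 + b**2))
print(f"a = {a:.3f}, b = {b:.3f}, Az = {az:.3f}")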
Article
Fifty colour prints of human bite marks were sent to 109 observers who were asked to decide, using a six-point rating scale, whether the marks had been produced by the teeth of an adult or a child. The observers consisted of accredited senior forensic dentists, accredited junior forensic dentists, general dental practitioners, final-year dental students, police officers and social workers. The results were compared against a "gold standard", which was the actual verdict from the case. Comparison of the results between the groups of observers and the standard was made using receiver operating characteristic (ROC) methodology. The best decisions were made by senior/junior experts or final-year dental students. General dental practitioners and police officers were least able to differentiate correctly between adult and child bite marks. The effect of training is important, and it needs to be assessed in more detail in future studies.
Article
We review the principles and practical application of receiver operating characteristic (ROC) analysis for diagnostic tests. ROC analysis can be used for diagnostic tests with outcomes measured on ordinal, interval or ratio scales. The dependence of the diagnostic sensitivity and specificity on the selected cut-off value must be considered for a full test evaluation and for test comparison. All possible combinations of sensitivity and specificity that can be achieved by changing the test's cut-off value can be summarised using a single parameter: the area under the ROC curve. The ROC technique can also be used to optimise cut-off values with regard to a given prevalence in the target population and the cost ratio of false-positive and false-negative results. However, plots of optimisation parameters against the selected cut-off value provide a more direct method for cut-off selection. Candidates for such optimisation parameters are linear combinations of sensitivity and specificity (with weights selected to reflect the decision-making situation), the odds ratio, chance-corrected measures of association (e.g., kappa) and likelihood ratios. We discuss some recent developments in ROC analysis, including meta-analysis of diagnostic tests, correlated ROC curves (paired-sample design) and chance- and prevalence-corrected ROC curves.
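The cut-off optimisation idea described above can be sketched directly: compute an optimisation parameter at every candidate cut-off and take the maximiser. In the sketch below the parameter is an expected-benefit combination of sensitivity and specificity; the prevalence, cost ratio, and simulated scores are illustrative assumptions, not values from the reviewed paper.

import numpy as np

rng = np.random.default_rng(3)
neg = rng.normal(0.0, 1.0, 400)  # test scores, disorder absent
pos = rng.normal(1.5, 1.0, 100)  # test scores, disorder present

prevalence = 0.10   # assumed prevalence in the target population
cost_ratio = 2.0    # assumed cost of a false positive relative to
                    # the cost of a false negative

cutoffs = np.unique(np.concatenate([neg, pos]))
sens = np.array([(pos >= c).mean() for c in cutoffs])
spec = np.array([(neg < c).mean() for c in cutoffs])

# Optimisation parameter: expected benefit per case, rewarding true
# positives and penalising false positives at the assumed cost ratio.
benefit = prevalence * sens - cost_ratio * (1 - prevalence) * (1 - spec)
i = np.argmax(benefit)
print(f"optimal cut-off = {cutoffs[i]:.2f} "
      f"(sens = {sens[i]:.2f}, spec = {spec[i]:.2f})")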
Article
Thirty-two certified diplomates of the American Board of Forensic Odontology (ABFO) participated in a study of the accuracy of bitemark analysis. Examiner experience as board-certified odontologists ranged from 2 to 22 years. Examiners were given sets of photographs (a cast in one case) of four bitemark cases and asked to report their certainty that each case was truly a bitemark and the apparent value of the case as forensic evidence. Participants also received seven occluding sets of dental casts, one correct dentition for each case and three unrelated to any of the cases, and were asked to rate how certain they were that each set of teeth had made each bitemark. Receiver operating characteristic (ROC) analysis resulted in an accuracy score of 0.86 (95% CI = 0.82–0.91). Youden's index was used to determine a cutoff point for computing an accuracy score for each case. Accuracy scores were significantly correlated with bitemark certainty and forensic value (P < 0.001 in both cases) but not with examiner experience (P = 0.958). The use of individual ROC analysis with a weighted Youden's index to calibrate individual accuracy was also demonstrated.
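Below is a minimal sketch of the Youden-index cut-off selection mentioned above, including the weighted variant used to calibrate individual accuracy. The rating data and the weight w are illustrative, not values from the study; w = 0.5 recovers the ordinary index J = sensitivity + specificity − 1.

import numpy as np

def youden_cutoff(pos, neg, w=0.5):
    """Cut-off maximising the (weighted) Youden index
    J_w = 2 * (w * sensitivity + (1 - w) * specificity) - 1."""
    cutoffs = np.unique(np.concatenate([pos, neg]))
    sens = np.array([(pos >= c).mean() for c in cutoffs])
    spec = np.array([(neg < c).mean() for c in cutoffs])
    j = 2 * (w * sens + (1 - w) * spec) - 1
    i = np.argmax(j)
    return cutoffs[i], j[i]

rng = np.random.default_rng(4)
pos = rng.normal(1.0, 1.0, 60)   # e.g., ratings for correct dentitions
neg = rng.normal(0.0, 1.0, 180)  # e.g., ratings for unrelated dentitions
cut, j = youden_cutoff(pos, neg)
print(f"Youden-optimal cut-off = {cut:.2f} (J = {j:.2f})")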
Article
The problem of signal detectability treated in this paper is the following: Suppose an observer is given a voltage varying with time during a prescribed observation interval and is asked to decide whether its source is noise or is signal plus noise. What method should the observer use to make this decision, and what receiver is a realization of that method? After a discussion of the theoretical aspects of this problem, the paper presents specific derivations of the optimum receiver for a number of cases of practical interest. The receiver whose output is the value of the likelihood ratio of the input voltage over the observation interval is the answer to the second question, no matter which of the various optimum methods current in the literature is employed, including the Neyman–Pearson observer, Siegert's ideal observer, and Woodward and Davies' "observer." An optimum observer required to give a yes or no answer simply chooses an operating level and concludes that the receiver input arose from signal plus noise only when this level is exceeded by the output of his likelihood ratio receiver. Associated with each such operating level are the conditional probability of a false alarm and the conditional probability of detection. Graphs of these quantities, called receiver operating characteristic (ROC) curves, are convenient for evaluating a receiver. If the detection problem is changed by varying, for example, the signal power, then a family of ROC curves is generated. Such things as betting curves can easily be obtained from such a family. The operating level to be used in a particular situation must be chosen by the observer. His choice will depend on such factors as the permissible false alarm rate, a priori probabilities, and the relative importance of errors. With these theoretical aspects serving as an introduction, attention is devoted to the derivation of explicit formulas for the likelihood ratio, and for the probability of detection and the probability of false alarm, for a number of particular cases. Stationary, band-limited, white Gaussian noise is assumed. The seven special cases presented were chosen from the simplest problems in signal detection that closely represent practical situations. Two of the cases form a basis for the best available approximation to the important problem of finding the probability of detection when the starting time of the signal, the signal frequency, or both, are unknown. Furthermore, in these two cases uncertainty in the signal can be varied, and a quantitative relationship between uncertainty and the ability to detect signals is presented for these two rather general cases. The variety of examples presented should serve to suggest methods for attacking other simple signal detection problems and to give insight into problems too complicated to allow a direct solution.
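The likelihood-ratio receiver and its ROC curves can be sketched for the simplest situation the paper treats: a known signal in Gaussian noise, so a single observation is N(0, 1) under noise and N(d, 1) under signal plus noise. The value of d and the grid of operating levels below are arbitrary illustrative choices.

import numpy as np
from scipy.stats import norm

d = 1.5  # assumed signal strength (the detectability index d')

def likelihood_ratio(x, d):
    """LR of an observation x: f(x | signal + noise) / f(x | noise)."""
    return norm.pdf(x, loc=d) / norm.pdf(x, loc=0.0)

# Sweeping the operating level traces the ROC curve; thresholding x is
# equivalent to thresholding the LR because the LR is monotone in x.
levels = np.linspace(-3.0, d + 3.0, 7)
p_false_alarm = norm.sf(levels)          # P(x > level | noise)
p_detection = norm.sf(levels, loc=d)     # P(x > level | signal + noise)
for lvl, pfa, pd_ in zip(levels, p_false_alarm, p_detection):
    print(f"level {lvl:5.2f}: LR {likelihood_ratio(lvl, d):7.2f}, "
          f"P(false alarm) {pfa:.3f}, P(detection) {pd_:.3f}")

# For equal-variance Gaussian noise the whole curve is indexed by d
# alone; its area is Phi(d / sqrt(2)).
print(f"area under the ROC curve for d = {d}: {norm.cdf(d / np.sqrt(2)):.3f}")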
[Chapter author and title missing]. In D. V. Canter, & L. J. Alison (Eds.), Interviewing and deception (pp. 129–156). Aldershot, UK: Ashgate Publishing.
Tully, B. (1999). Statement validation. In D. V. Canter, & L. J. Alison (Eds.), Interviewing and deception (pp. 85–103). Aldershot, UK: Ashgate Publishing.
Beiden, S. V., Campbell, G., Meier, K. L., & Wagner, R. F. (2000). On the problem of ROC analysis without truth: The EM algorithm and the information matrix. Proceedings of the International Society for Optical Engineering, 3981, 126–134.
Köhnken, G., & Wegener, H. (1982). Zur Glaubwürdigkeit von Zeugenaussagen: Experimentelle Überprüfung ausgewählter Glaubwürdigkeitskriterien [On the credibility of witness statements: An experimental examination of selected credibility criteria]. Zeitschrift für Experimentelle und Angewandte Psychologie, 29, 92–111.