Against Jury Comparisons
Abstract
This article explains why jurors should not be invited, or allowed, to identify persons in crime-related images and sound recordings. It draws directly on scientific research, never seriously considered by Australian courts, which shows that ordinary people are unexpectedly error-prone at face and voice comparisons. The article rejects the longstanding legal commitment to the view that identifying unfamiliar persons in images and voice recordings is commonplace, that is, that it resembles our everyday social interactions. It also explains how accusatorial trials generate cognitive biases (eg through expectation, suggestion and confirmation) that irreparably contaminate juror perception and cognition. Scientific research indicates that jurors are likely to identify the defendant even when the defendant does not feature in the images or recording. The article also explains why investigators (eg police and translators) should not be allowed to express their biased and inexpert opinions in criminal proceedings, for investigators are vulnerable to many of the same risks confronting jurors (and judges and appellate courts).
... Briefer reviews of admissibility in Australia and in Canada are included in Morrison & Enzinger [4]. For criticism of this practice, see Edmond [5]. In R v Flynn, the England & Wales Court of Appeal stated: ...
... There is concern in the earwitness literature that context could influence a listener to expect to hear a particular individual and bias them toward identifying a speaker whom they hear as that individual, e.g., if a listener is asked to identify a speaker in a showup scenario rather than in a well-designed voice lineup [18,51]. Similar concerns apply when a trier of fact is asked to compare a voice on a recording with the voice of the defendant [5]. Abundant psychology research indicates that judges would not be immune from the potential effects of contextual bias [52]. ...
... Edmond [5] provides additional arguments for why judges and juries should not attempt to perform their own speaker identification and why they should not be presented with and should not consider lay or "ad hoc expert" speaker-identification testimony. Judges and juries are invited to perform their own speaker identification in the suggestive context of the accusatorial trial. ...
Expert testimony is only admissible in common law if it will potentially assist the trier of fact to make a decision that they would not be able to make unaided. The present paper addresses the question of whether speaker identification by an individual lay listener (such as a judge) would be more or less accurate than the output of a forensic-voice-comparison system that is based on state-of-the-art automatic-speaker-recognition technology. Listeners listen to and make probabilistic judgements on pairs of recordings reflecting the conditions of the questioned- and known-speaker recordings in an actual case. Reflecting different courtroom contexts, listeners with different language backgrounds are tested: Some are familiar with the language and accent spoken, some are familiar with the language but less familiar with the accent, and others are less familiar with the language. Also reflecting different courtroom contexts: In one condition listeners make judgements based only on listening, and in another condition listeners make judgements based on both listening to the recordings and considering the likelihood-ratio values output by the forensic-voice-comparison system.
Several forensic sciences, especially of the pattern-matching kind, are increasingly seen to lack the scientific foundation needed to justify continuing admission as trial evidence. Indeed, several have been abolished in the recent past. A likely next candidate for elimination is bitemark identification. A number of DNA exonerations have occurred in recent years for individuals convicted based on erroneous bitemark identifications. Intense scientific and legal scrutiny has resulted. An important National Academies review found little scientific support for the field. The Texas Forensic Science Commission recently recommended a moratorium on the admission of bitemark expert testimony. The California Supreme Court has a case before it that could start a national dismantling of forensic odontology. This article describes the (legal) basis for the rise of bitemark identification and the (scientific) basis for its impending fall. The article explains the general logic of forensic identification and the claims of bitemark identification, and reviews relevant empirical research on bitemark identification, highlighting both the lack of research and the lack of support provided by what research does exist. The rise and possible fall of bitemark identification evidence has broader implications: it highlights the weak scientific culture of forensic science and the law's difficulty in evaluating and responding to unreliable and unscientific evidence.
Many of the problems that people have identifying speakers solely by their voices are similar to those that people have as eyewitnesses. The amount of exposure, the nature of the identification process, and the number of exposures all matter in determining how likely a witness is to make a correct identification. Yet while the reliability of eyewitness identification has been a focal point in the news, scholarly literature, and the courts, the reliability of earwitness identification has gone virtually unnoticed in the case law and legal literature. The reluctance of the legal system to deal with this problem stems from a confluence of ignorance, rigid adherence to historical positions that are no longer tenable, and some interesting judicial missteps concerning the accuracy of "voiceprints" that have made courts unreceptive to voice identification research. This Article examines the law governing both lay and expert identification of speakers, and evaluates existing doctrine in light of a great deal of basic research into how accurate people are at identifying an individual by his voice. While the doctrine concerns itself with reliability, many of the factors that make voice identifications either more or less reliable are not considered in the case law. Additional research shows that experts in phonetics are more accurate in identifying speakers than are lay people, suggesting that experts may play some role in the process. However, this Article does not support the admission of voiceprint analysis to identify speakers, despite some recent appellate decisions that have accepted this approach. The Article concludes with a series of recommendations, including less suggestive identification techniques, the limited use of experts, a more sophisticated approach to assessing reliability, and the use of informative jury instructions.
A protocol for the collection of databases of audio recordings for forensic-voice-comparison research and practice is described. The protocol fulfils the following requirements. (1) The database contains at least two non-contemporaneous recordings of each speaker. (2) The database contains recordings of each speaker using different speaking styles, which are typical of speaking styles found in casework, and which are elicited as natural speech. (3) The database is usable for research and casework involving recording- and transmission-channel mismatch. The protocol includes three speaking tasks: (1) an informal telephone conversation; (2) an information exchange task over the telephone; and (3) a pseudo-police-style interview. Technical issues are also discussed.
Since the 1980s the volume of identification evidence derived from surveillance devices and telephones has increased dramatically. This article offers a critical analysis of the forensic use of voice comparison and identification evidence. First, it reviews the contemporary jurisprudence in common law and uniform Evidence Act jurisdictions, then explains some of the limitations with our current responses to voice evidence, particularly the dramatic rise in the reliance placed upon the opinions of investigators, interpreters and (other ad hoc) 'experts' as well as the willingness to leave voice comparison evidence (and exercises) to juries. Employing an original multi-disciplinary methodology, the article then problematises legal practice through the introduction of relevant social science research on voice comparison (and recognition). As the authors explain, relevant scientific research and opinions are rarely adduced by lawyers or referred to by trial judges when instructing or cautioning juries. In consequence, it is suggested that current legal rules and procedures do not adequately represent what is known beyond the courts and thereby fail to embody fundamental criminal justice principles concerned with truth and fairness.
The article provides a comprehensive analysis of the dangers of using voice identification and recognition evidence and proposes safeguards for its use in the pretrial and trial procedure.
This paper will describe the preparation and implementation of a voice lineup or voice parade which formed part of the police investigation into a rape. The police supplied, as a sample of a suspect's voice, the statutory recording of a police interview with the suspect, and also speech samples from eight of their panel of volunteers for identity parades to act as 'foils'. The task was to create a voice parade which would be fair to the suspect. Fragments free of incriminating content were edited from the suspect's sample to provide a 30-second composite sample of his speech, and a similar editing procedure was followed for the 'foils'. A further speaker, for whom an interview recording was available, was also added. To ensure the fairness of the samples, two experiments were carried out in accordance with recommendations formulated at the Forensic Laboratory of the Dutch Justice Ministry. The first provided an estimate of the relative perceived distance of each of the foils from the suspect. The two speakers least like the suspect were eliminated. The second ensured that neither the suspect nor any of the foils sounded stereotypically like a rapist. The problems inherent in such experiments will be discussed, as will the procedure and outcome of the voice parade.
A witness to a crime may be required to identify a speaker based on voice samples from a language which is not their first language. Previous experimental work has shown that knowledge of a language has an effect on an individual's ability to identify speakers. This paper examines whether this ability increases over the course of the British four-year language degree. The results from a series of different open-test voice line-up presentations showed that listener ability improved on beginning to study a foreign language, yet showed no unambiguous improvement after the second semester of study.
An account is given of a case in which a voice parade contributed significantly to prosecution evidence. A witness had overheard his landlord arranging for another younger man to set fire to a house (where a fire later that night resulted in a woman's death), and claimed to know the voice. A voice parade was constructed using composite samples from this suspect's interview tapes, and, as foils, composite samples from police interviews with similar young men from the London Asian community. The witness identified the man from the voice parade, and also recognized him in a visual parade. This, together with other evidence, resulted in both men being convicted. The paper outlines the problems involved in picking foils from the interview tapes supplied by the police, discusses the format and conduct of the resulting parade including the question asked of the witness, and summarizes challenges in court to the fairness of the parade. In conclusion, ways are suggested in which the procedure might be streamlined and its reliability improved.
Two studies were conducted examining voice recognition testimony and its impact on jurors. In the first experiment, subjects listened to a tape recording of a brief sales pitch. After a retention interval of either 0, 7 or 14 days, subjects were unexpectedly asked to pick the salesperson's voice out of a five-voice taped lineup. Retention interval did not have a significant effect on hit rates or false alarms. Accuracy and pre-lineup confidence were not significantly correlated, although accuracy was related to post-lineup willingness to testify. In the second experiment, undergraduate subjects were asked to read a summary of a trial, describing a situation similar to that studied in experiment 1; the independent variables were the presence of an earwitness, the gender and confidence of the earwitness, and the retention interval. Only the presence of an earwitness had a significant main effect upon mock jurors' verdicts. However, there was a significant interaction between witness confidence and witness gender when an earwitness identification was presented.