Inter-rater reliability of vehicle color perception for forensic intelligence

RESEARCH ARTICLE
Khai Lee¹, Anis Amanina Abdul Fatah², Nuryuhanis Mohd Norizan², Zakiah Jefrey², Fatin Hanani Md Nawi³, Wan Fatihah Khairunisa Wan Nor³, Huan Xin Wong⁴, Saiful Fazamil Mohd Ali⁵‡, Poh Ying Lim⁶‡, Kah Haw Chang¹‡, Ahmad Fahmi Lim Abdullah¹‡*
¹ Forensic Science Program, School of Health Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia
² Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak, Malaysia
³ School of Marine and Environmental Sciences, Universiti Malaysia Terengganu, Kuala Terengganu, Terengganu, Malaysia
⁴ Department of Chemistry, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
⁵ Criminalistic Section, Forensic Division, Department of Chemistry, Petaling Jaya, Selangor, Malaysia
⁶ Department of Community Health, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
These authors contributed equally to this work.
‡ These authors also contributed equally to this work.
*fahmilim@usm.my
Abstract
The topcoat color of motor vehicles offers vital information when investigating vehicular
accidents, especially hit-and-run cases, since witnesses seldom perceive and retain the plate
details. Differences in color perception among individuals with normal vision may lead to
confusion in determining the color of the car involved. In this way, witnesses of crashes could
potentially initiate flawed leads in a forensic investigation, and thus affect the
administration of justice. In this study, the inter-rater reliability of vehicle color
determination by different volunteers was explored. Six individuals observed the topcoat colors
of 500 stationary and 500 moving vehicles from five locations, employing a common system of
color gradation. The outcome was binary: the vehicle color was either a “match” or a
“non-match”. This was followed by statistical analysis of the colors’ frequencies and
inter-rater reliability, based on which a more suitable color description was determined for
the subsequent comparison of stationary and moving vehicles. Higher match frequencies and
greater inter-rater reliability were observed when color gradations were disregarded. The
frequency of correct matches could have been closely related to the colors’ relative
on-the-road distribution, regardless of whether the observed vehicles were stationary or
moving. It was also found that black and white were associated with a greater number of matches
than were intermediate colors, which should be carefully interpreted during forensic
investigation to avoid wrong leads. In conclusion, the present study demonstrated the forensic
significance of vehicle topcoat color determination, particularly in cases where witness
statements are crucial.
PLOS ONE | https://doi.org/10.1371/journal.pone.0218428 June 18, 2019 1 / 10
OPEN ACCESS
Citation: Lee K, Abdul Fatah AA, Mohd Norizan N,
Jefrey Z, Md Nawi FH, Wan Nor WFK, et al. (2019)
Inter-rater reliability of vehicle color perception for
forensic intelligence. PLoS ONE 14(6): e0218428.
https://doi.org/10.1371/journal.pone.0218428
Editor: Barry Rosenfeld, Fordham University,
UNITED STATES
Received: August 30, 2018
Accepted: June 3, 2019
Published: June 18, 2019
Copyright: ©2019 Lee et al. This is an open access
article distributed under the terms of the Creative
Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in
any medium, provided the original author and
source are credited.
Data Availability Statement: All relevant data are
within the paper.
Funding: This research is funded by Universiti Sains Malaysia Short Term Grant
(304/PPSK/61313156). Abdullah AFL is the author who
received the funding. The funders had no role in
study design, data collection and analysis, decision
to publish, or preparation of the manuscript.
Competing interests: The authors have declared
that no competing interests exist.
Introduction
Multiple coatings of automotive paints are applied to vehicles for both protective and decora-
tive purposes [1,2]. These paints are frequently encountered as trace evidence in vehicular
accidents, allowing for the association of questioned samples recovered from a scene with con-
trol samples of known sources through forensic examination. In certain vehicular accidents
such as hit-and-run, the investigative team would look for the plate number and color infor-
mation alleged by a witness or victim of the accident in order to begin their investigation.
Unfortunately, in cases where witnesses either did not notice or cannot remember plate num-
bers, the colors of vehicles, the descriptions of which are based solely on the perceptions of wit-
nesses, become the main lead. Therefore, the inability to correctly describe the colors of
vehicles may lead investigations astray.
During a trial, investigative officers or forensic scientists are often required to explain the
evidential value of questioned paint evidence recovered from a vehicular accident, including
its color [3,4]. Surveys and compilations of vehicle color distributions generalized as topcoat
color [5] (technically, unlike the solid paint system, the top layer of a metallic paint system is a
colorless clear coat that covers the metallic basecoat) have been conducted to assess the probative
value of a vehicle’s color in supporting forensic conclusions [3,4,6–11]. In addition, color
determination through careful interpretation of submitted paint samples can also be correlated
with the color momentarily perceived by a witness during an accident, although this is often
difficult. Any lack of accuracy in describing the color perceived by a witness, and especially so
if there are two or more witnesses, would impact credibility during cross-examination [12].
Though a standardized color coding system for paint has been established in the forensic com-
munity [13], it is not readily available to the public, complicating the process of accurate iden-
tification of topcoat color. Previous literature has suggested the possibility of variations in
color perception by individuals with normal vision [14–16], supporting the likelihood of such
differences generating flawed investigative leads or contradictory testimony.
Inter-rater differences in color determination could be related to categorical differences in
observers’ knowledge and perceptions [17]. Color perception is an observer’s ability to percep-
tually differentiate between colors [18], which could be subjectively affected by personal, cul-
tural, and national beliefs, values, prejudices, and other unknown factors [19]. In view of this,
the inter-rater reliability of vehicle topcoat color perception is a topic of interest in forensic
intelligence, and the aim of this study was to evaluate inter-rater differences in descriptions of
vehicle topcoat color in both static and moving conditions. Based on a prevailing system [3],
the topcoat colors of stationary and moving vehicles were surveyed and statistically analyzed.
A better color description system was highlighted to increase the agreement percentage among
observers. The inter-rater reliability of vehicle color determination among observers was eval-
uated and colors that could potentially lead to incorrect determination were also identified. To
the authors’ knowledge, a survey of this type has not been reported thus far. Such information
can serve as an initial lead for investigative teams to verify a witness statement, and subse-
quently assist forensic teams in tracing the vehicle involved in an accident.
Materials and methods
Survey
This study was conducted in five locations within the boundary of the Health Campus of Uni-
versiti Sains Malaysia, where a total of 500 stationary and 500 moving vehicles were randomly
sampled and studied. Topcoat colors of vehicles were observed at noon in clear weather by six
students (aged between 20 and 22 years) from various universities undergoing industrial
attachment training in the forensic science program. In this study, only passenger-type vehi-
cles, such as cars and vans, were included; heavy-duty vehicles such as trucks and buses were
excluded.
Prior to the survey, the authors conducted an introductory session with the six observers
using a standard auto color chart available in forensic laboratories as a guide. This was to
assure consistency in color determination, noting the choice of colors available and the color
naming system. Buckle et al.’s [3] chart consisting of 29 grades of colors was utilized; this infor-
mation is depicted in Table 1. Those colors that could not be included in any of the listed col-
ors in Table 1 were counted as “miscellaneous,” as done in our report on vehicle surveys [6].
In each location, the six observers simultaneously noted the topcoat color of each car at a
distance of approximately two meters. For both stationary and moving vehicles, each observer
was given three seconds to write down their observation on a sheet of paper provided. No
attempt was made to identify whether a vehicle’s paint system was solid or metallic since this is
difficult to determine at a glance. As the survey was conducted within the university, the speed
of moving vehicles could not have exceeded 60 km/h.
The observers’ perceptions were tabulated. Subsequently, the extent of agreement among the
observers regarding topcoat color was checked and interpreted. Regarding agreement, the outcome
was binary: it was either a “match” or a “non-match.” It was a match when all six observers
recorded the same color code, and a non-match when even one observer differed in his/her
description of the topcoat color.
Table 1. Color chart.
| No. | Color [3] |
|---|---|
| 1 | Black |
| 2 | Light blue |
| 3 | Medium blue |
| 4 | Dark blue |
| 5 | Green-blue |
| 6 | Light brown |
| 7 | Medium brown |
| 8 | Dark brown |
| 9 | Red-brown |
| 10 | Red |
| 11 | Red-orange |
| 12 | Gold-bronze |
| 13 | Light gray |
| 14 | Medium gray |
| 15 | Dark gray |
| 16 | Light green |
| 17 | Medium green |
| 18 | Dark green |
| 19 | Yellow-green |
| 20 | Light yellow |
| 21 | Medium yellow |
| 22 | Dark yellow |
| 23 | Maroon |
| 24 | Orange |
| 25 | Purple |
| 26 | Pink |
| 27 | White |
| 28 | Off-white |
| 29 | Miscellaneous |
https://doi.org/10.1371/journal.pone.0218428.t001
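The all-or-nothing match rule described in this subsection can be illustrated with a minimal Python sketch. The function name `classify_observation` and the two example vehicles are hypothetical and are not taken from the study data; the sketch only assumes that each vehicle has one recorded color per observer.

```python
from typing import Sequence


def classify_observation(color_codes: Sequence[str]) -> str:
    """Return "match" if every observer recorded the same color, else "non-match"."""
    if not color_codes:
        raise ValueError("at least one observation is required")
    # A single distinct value means all observers agreed.
    return "match" if len(set(color_codes)) == 1 else "non-match"


# Hypothetical records for two vehicles, six observers each.
vehicle_a = ["white"] * 6
vehicle_b = ["white"] * 5 + ["off-white"]

print(classify_observation(vehicle_a))  # match
print(classify_observation(vehicle_b))  # non-match
```

Under this rule a single dissenting observer is enough to turn a vehicle into a non-match, which is why the criterion is strict when six observers are involved.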
Statistical analysis
Statistical analysis was conducted using Stata software version 12 (StataCorp, USA). Data
cleaning and descriptive analyses were performed to ensure there were no errors.
Evaluation of agreement percentage in relation to color description
Using the colors listed above [3], the frequency and percentage of matches and non-matches
in observers’ determinations of topcoat colors were calculated. The statistical output using
these colors formed the basic data. Subsequently, shade variations in the colors described in
Table 1 were clubbed together to correspond to the basic color, with matches and non-matches
calculated in this situation as well.
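As an illustration of the grouping step, the following Python sketch collapses the 29 chart entries of Table 1 into basic colors. The helper name `to_basic_color` and the prefix-stripping approach are assumptions made for this example; the only facts taken from the paper are the chart entries themselves, the merging of light, medium, and dark shades, and the merging of "off-white" into "white".

```python
# The 29 chart entries of Table 1.
CHART_COLORS = [
    "Black", "Light blue", "Medium blue", "Dark blue", "Green-blue",
    "Light brown", "Medium brown", "Dark brown", "Red-brown", "Red",
    "Red-orange", "Gold-bronze", "Light gray", "Medium gray", "Dark gray",
    "Light green", "Medium green", "Dark green", "Yellow-green",
    "Light yellow", "Medium yellow", "Dark yellow", "Maroon", "Orange",
    "Purple", "Pink", "White", "Off-white", "Miscellaneous",
]

SHADE_PREFIXES = ("light ", "medium ", "dark ")


def to_basic_color(chart_color: str) -> str:
    """Map a chart entry to its basic color (hypothetical helper)."""
    name = chart_color.strip().lower()
    if name == "off-white":
        return "white"
    for prefix in SHADE_PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    # Intermediate colors such as "green-blue" are kept as they are.
    return name


basic_colors = sorted({to_basic_color(c) for c in CHART_COLORS})
print(len(basic_colors))  # 18
```

The count of 18 agrees with the number of basic colors reported in the Results once shade variations are eliminated.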
Comparison of inter-rater reliability of color determination between two
different color descriptions
Inter-rater reliability of color determination was investigated for both color descriptions.
Kappa (κ) statistics were used to assess agreement among the six observers. The values
were interpreted as poor agreement (0.00–0.20), fair agreement (0.21–0.40), moderate
agreement (0.41–0.60), good agreement (0.61–0.80), and very good agreement (>0.80) [20]. In this
study, κ values >0.60 were considered indicative of good inter-rater reliability and values
<0.60 of poor inter-rater reliability. A p-value <0.05 was considered statistically significant.
Based on the statistical output, the color description with better inter-rater reliability was
identified. The inter-rater reliability for moving vehicles was then assessed using that color
description.
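The paper does not state which multi-rater form of kappa was computed in Stata. As a rough sketch under that caveat, the example below uses Fleiss' kappa, one common generalization to more than two raters, together with the interpretation bands quoted above [20]. The function names and the three example vehicles are hypothetical.

```python
from collections import Counter
from typing import Sequence


def fleiss_kappa(ratings: Sequence[Sequence[str]]) -> float:
    """Fleiss' kappa for a list of subjects, each rated by the same number of raters."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    category_totals: Counter = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this subject.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))

    p_observed = agreement_sum / n_subjects
    total_ratings = n_subjects * n_raters
    p_expected = sum((t / total_ratings) ** 2 for t in category_totals.values())
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


def altman_band(kappa: float) -> str:
    """Interpretation bands as quoted in the text [20]."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical data: three vehicles, six observers each.
example = [["white"] * 6, ["black"] * 6, ["gray"] * 5 + ["white"]]
kappa = fleiss_kappa(example)
print(round(kappa, 3), altman_band(kappa))  # 0.832 very good
```

Applied to the overall values reported later in Table 2, `altman_band(0.69)` gives "good" and `altman_band(0.85)` gives "very good".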
Determination of “non-match” color combinations among observers
The observational data set was further analyzed to determine the colors with a greater possibil-
ity of non-matches among the six observers. The frequency and percentage of the matches and
non-matches for each color were demonstrated and compared. Colors that could easily be
described differently by the six observers were identified.
Results
Among the 500 stationary vehicles, 214 matches were recorded, 72 fewer than the number of
non-matches. This finding indicates that all six observers concurred in the descriptions of
only 42.8% of topcoat colors when using the color shades described earlier [3]. Then, basic
colors alone were used by combining the light, medium, and dark shades into one group. For
instance, “light gray,” “medium gray,” and “dark gray” were all grouped as one basic color:
“gray.” When the variations in shade were eliminated, 18 colors remained and the frequency of
matches increased by 153, giving a total match rate of 73.4%.
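The figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes that the counts refer to the 500 stationary vehicles, which is consistent with the match total of 367 given for stationary vehicles in Table 3; the variable names are illustrative only.

```python
TOTAL_STATIONARY = 500

matches_with_shades = 214
non_matches_with_shades = TOTAL_STATIONARY - matches_with_shades  # 286

# Matches gained when shades are merged into basic colors.
matches_basic = matches_with_shades + 153  # 367

pct_with_shades = round(100 * matches_with_shades / TOTAL_STATIONARY, 1)
pct_basic = round(100 * matches_basic / TOTAL_STATIONARY, 1)

print(non_matches_with_shades - matches_with_shades)  # 72
print(pct_with_shades)  # 42.8
print(pct_basic)  # 73.4
```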
Inter-rater reliability (κ) values for each color under the two descriptions (one including
the variations in shade and the other using only the basic colors) were computed (Table 2).
When the variations in shade were included, as in the prevailing color description [3],
black, orange, pink, purple, and light gray recorded very good agreement (κ > 0.80), followed
by maroon, dark green, white, medium blue, red, light brown, light green, light blue, and dark
gray with good agreement (κ between 0.61 and 0.80). These colors therefore demonstrated good
inter-rater reliability (κ > 0.60). The intermediate colors (i.e. green-blue, yellow-green,
red-orange, gold-bronze, and red-brown), as well as several shades of basic colors (e.g.
medium brown and medium green), exhibited poor inter-rater reliability (κ < 0.60).
Using basic color descriptions, the number of matches decreased by 12 (from 367 to 355) when
vehicles were in motion (Table 3), but vehicle status, whether stationary or in motion, did not
significantly affect the correct determination of topcoat colors under the observational
conditions. Regardless of whether the vehicles were stationary or moving, white, gray, and
black were the three most frequently matched colors, in the same rank order. The high match
frequency of blue among stationary vehicles decreased for moving vehicles, and red was the
fourth most frequent match for moving vehicles. Overall, inter-rater reliability for both
stationary and moving vehicles demonstrated very good agreement, with κ values of 0.85
and 0.84, respectively.
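The paper does not name the test behind the comparison of stationary and moving vehicles. As one plausible check, and only as an assumption, the sketch below runs a Pearson chi-square test (without continuity correction) on the 2 × 2 table of matches and non-matches, using the totals reported in Table 3 (367 and 355 matches out of 500 vehicles each). The function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square statistic and p-value (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: stationary, moving. Columns: match, non-match.
chi2, p_value = chi_square_2x2(367, 500 - 367, 355, 500 - 355)
print(round(chi2, 3), round(p_value, 2))  # 0.717 0.4
```

With a p-value of roughly 0.40, this check is in line with the statement that vehicle status did not significantly affect the frequency of matches.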
Table 2. κ values of each color in two different color descriptions for stationary vehicles.
Inter-rater reliability (κ) values**
| Color | Shade | Color chart [3] | Basic color descriptions |
|---|---|---|---|
| Black | | 0.96* | 0.96* |
| Blue | Light | 0.65* | 0.81* |
| | Medium | 0.72* | |
| | Dark | 0.44 | |
| Green-blue | | 0.56 | 0.56 |
| Brown | Light | 0.68* | 0.65* |
| | Medium | 0.15 | |
| | Dark | 0.44 | |
| Red-brown | | 0.22 | 0.22 |
| Red | | 0.68* | 0.68* |
| Red-orange | | 0.52 | 0.52 |
| Gold-bronze | | 0.32 | 0.32 |
| Gray | Light | 0.89* | 0.91* |
| | Medium | 0.45 | |
| | Dark | 0.63* | |
| Green | Light | 0.67* | 0.79* |
| | Medium | 0.12 | |
| | Dark | 0.77* | |
| Yellow-green | | 0.52 | 0.52 |
| Yellow | Light | 0.31 | 0.62* |
| | Medium | 0.40 | |
| | Dark | 0.23 | |
| Maroon | | 0.78* | 0.78* |
| Orange | | 0.92* | 0.92* |
| Purple | | 0.91* | 0.91* |
| Pink | | 0.92* | 0.92* |
| White | Pure white | 0.73* | 0.98* |
| | Off-white | 0.41 | |
| Miscellaneous | | 0.40 | 0.40 |
| Overall | | 0.69* (good agreement) | 0.85* (very good agreement) |
* Good inter-rater reliability (κ > 0.60)
** Reliability test was significant, p < 0.05
https://doi.org/10.1371/journal.pone.0218428.t002
In this study, a non-match was scored if there was even a single difference among observers.
Therefore, cases in which the six observers reported two or more colors were recorded as
non-matches in each of the respective color categories. White had the lowest percentage of
non-matches, followed by black and gray. In contrast, some colors, such as red-brown,
gold-bronze, and yellow-green, had only non-matches; no agreement was achieved among observers
for these colors. The colors that led to differences in color determination were similar
regardless of whether the vehicles were stationary or moving.
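The per-color tallying described above can be sketched as follows. This is one reading of the scoring rule, assuming that a non-matching vehicle adds one non-match to every color named by at least one observer; the function names and the three example vehicles are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence


def tally_by_color(vehicles: Iterable[Sequence[str]]) -> Dict[str, Dict[str, int]]:
    """Count matches and non-matches per color category."""
    tallies: Dict[str, Dict[str, int]] = defaultdict(lambda: {"match": 0, "non-match": 0})
    for observations in vehicles:
        distinct = set(observations)
        if len(distinct) == 1:
            tallies[next(iter(distinct))]["match"] += 1
        else:
            # A disputed vehicle is recorded under every color that was named.
            for color in distinct:
                tallies[color]["non-match"] += 1
    return dict(tallies)


def match_percentage(tally: Dict[str, int]) -> float:
    """Share of matches within one color category, in percent."""
    total = tally["match"] + tally["non-match"]
    return round(100 * tally["match"] / total, 1)


# Hypothetical observations for three vehicles, six observers each.
vehicles = [["white"] * 6, ["white"] * 5 + ["gray"], ["gray"] * 6]
print(tally_by_color(vehicles))

# Reproducing one Table 3 entry from its reported counts (black, stationary).
print(match_percentage({"match": 56, "non-match": 13}))  # 81.2
```

Because a disputed vehicle is counted once per color named, the non-match counts summed over all categories can exceed the number of non-matching vehicles.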
Discussion
The majority of non-matches reported in this study arose from the existence of multiple shades
of the same basic color. For example, the six observers did not agree in distinguishing
“white” from “off-white” in 81 cases, which accounted for 28.3% of the non-matches. “Dark gray”
and “medium gray,” with non-matches recorded in 22 cases (7.7%), and “light gray” and “medium
gray,” with non-matches recorded in 21 cases (7.3%), showed similar trends. The higher
proportion of matches obtained with the basic color description than with the shade-based
description [3] indicates the value of basic colors.
When shade variations in blue, brown, gray, green, yellow, and white were disregarded, the
inter-rater reliability increased considerably. White, which was initially coded separately as
“white” and “off-white,” became the most reliable color, replacing black. In the basic color
description, only the crossover colors (i.e. green-blue, yellow-green, red-orange, gold-bronze,
and red-brown) demonstrated poor inter-rater reliability (κ < 0.60), suggesting differences in
agreement on such intermediary shades of colors.
Table 3. Percentages of match and non-match frequencies based on color description.
Stationary vehicles: “match” n = 367. Moving vehicles: “match” n = 355. Values are frequency (%).
| Color | Stationary: match | Stationary: non-match | Moving: match | Moving: non-match |
|---|---|---|---|---|
| Black | 56 (81.2)* | 13 (18.8) | 56 (76.7)* | 17 (23.3) |
| Blue | 26 (44.1)* | 33 (55.9) | 17 (42.5)* | 23 (57.5) |
| Green-blue | 1 (20.0) | 4 (80.0) | 0 (0) | 4 (100.0)** |
| Brown | 8 (18.6)* | 35 (81.4) | 12 (22.2) | 42 (77.8) |
| Red-brown | 0 (0) | 28 (100.0)** | 0 (0) | 15 (100.0)** |
| Red | 7 (19.4) | 29 (80.6) | 30 (58.8)* | 21 (41.2) |
| Red-orange | 1 (4.2) | 23 (95.8)** | 1 (5.6) | 17 (94.4)** |
| Gold-bronze | 0 (0) | 14 (100.0)** | 0 (0) | 12 (100.0)** |
| Gray | 120 (70.2)* | 51 (29.8) | 82 (59.0)* | 57 (41.0) |
| Green | 7 (28.0) | 18 (72.0) | 10 (27.8) | 26 (72.2) |
| Yellow-green | 0 (0) | 7 (100.0)** | 0 (0) | 5 (100.0)** |
| Yellow | 3 (23.1) | 10 (76.9) | 4 (28.6) | 10 (71.4) |
| Maroon | 4 (28.6) | 10 (71.4) | 1 (20.0) | 4 (80.0) |
| Orange | 2 (66.7) | 1 (33.3) | 3 (37.5) | 5 (62.5) |
| Purple | 7 (58.3) | 5 (41.7) | 11 (91.7) | 1 (8.3) |
| Pink | 3 (60.0) | 2 (40.0) | 1 (33.3) | 2 (66.7) |
| White | 122 (92.4)* | 10 (7.6) | 121 (92.4)* | 10 (7.6) |
| Miscellaneous | 0 (0) | 3 (100.0)** | 6 (26.1) | 19 (73.9) |
* top five “match” scores
** top five “non-match” percentages
https://doi.org/10.1371/journal.pone.0218428.t003
Although an introductory session was conducted to calibrate the observers prior to sampling,
the variations in color determination could have been due to differences in their ability
to discriminate colors as well as their personal experiences [19,21]. A substantial increase in
the overall inter-rater reliability, from 0.69 to 0.85, was observed when shades were
disregarded, resulting in very good agreement. Higher reliability in single-color
determination was also achieved by disregarding the shade variations in basic colors. This
observation agrees with Bae et al. [22], who found that colors were more easily described by a
single term (e.g. gray) than by terms that include shades (e.g. light gray, medium gray, or
dark gray), since the boundaries between these shades differ among individuals. Merging shades
into one color may reduce color-specific biases and give a better appreciation of the
categorical boundaries of basic colors [22]. This was supported by the better scores and
reliability obtained when the different shades of a color were combined.
Since all six individuals in this study made their observations in similar conditions, the use
of only basic colors, which ensures good inter-rater reliability, is proposed for describing vehi-
cle topcoat colors during forensic investigations, particularly when recording witness state-
ments. In addition, such basic color-based enquiries would limit the use of jargon, thus
conforming to the suggestion that an observer feels more comfortable describing a color in a
few words rather than having to rely on a spectrum of colors with broad and hardly
discriminable shades [23]. However, the investigative team should still gather as much
information as a witness is able to provide.
Higher match frequencies for white, gray, and black could be linked to the on-the-road top-
coat color distribution found in an earlier survey [6]. A greater number of vehicles top-coated
with these colors could have been observed during the present survey, accounting for approxi-
mately 80% of the total matches. However, it has to be emphasized that the on-the-road distri-
bution of topcoat colors could not be linked to the inter-rater reliability in color determination
since certain colors such as pink, orange, and purple that have been reported to have very
good inter-rater reliability were not among the common topcoat colors of vehicles in the coun-
try [6]. The rarity of a color does not appear to influence inter-rater reliability of color
perception.
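The share of matches attributable to white, gray, and black can be recomputed from the match counts in Table 3. The short sketch below does this for stationary vehicles, moving vehicles, and both combined; the variable names are illustrative only.

```python
# Match counts for white, gray, and black taken from Table 3.
stationary = {"white": 122, "gray": 120, "black": 56}
moving = {"white": 121, "gray": 82, "black": 56}

STATIONARY_MATCHES = 367
MOVING_MATCHES = 355

share_stationary = round(100 * sum(stationary.values()) / STATIONARY_MATCHES, 1)
share_moving = round(100 * sum(moving.values()) / MOVING_MATCHES, 1)
share_combined = round(
    100 * (sum(stationary.values()) + sum(moving.values()))
    / (STATIONARY_MATCHES + MOVING_MATCHES),
    1,
)

print(share_stationary)  # 81.2
print(share_moving)      # 73.0
print(share_combined)    # 77.1
```

On these counts the three colors account for about 81% of matches among stationary vehicles and about 73% among moving vehicles, so the figure of approximately 80% quoted in the text is closest to the stationary case.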
Good inter-rater reliability indicates greater consistency in the estimation of a phenomenon;
in this case, matches during color determination. The use of basic colors has been associated
with more consistent determination and less confusion than the use of intermediate colors
[24]. In this study too, basic colors such as white and black had high inter-rater reliability
values of 0.98 and 0.96, respectively. In contrast, for colors such as green-blue,
yellow-green, red-orange, gold-bronze, and red-brown, the poor inter-rater reliability values
were attributable to a large percentage of non-matches under the criterion that disagreement by
even a single observer was categorized as a non-match (Table 3). Intermediate colors were
likely to be identified incorrectly, for example in the combinations red/red-brown,
red/red-orange, brown/gold-bronze, and red/red-orange/red-brown. However, it is important to be
aware of the possibility of wrong matches even when a particular color demonstrates good
agreement in terms of its inter-rater reliability value. For example, the determination of the
color gray was associated with a relatively large percentage of non-matches (29.8%), wherein
gray could be confused with blue, brown, or even white.
This study suggests that witness statements regarding intermediate colors should be carefully
interpreted during forensic investigations to avoid following wrong leads. Additionally, the
lack of exact agreement among the observers, particularly in the determination of intermediate
colors, could have been due to personal variations [18,19]. This is exemplified by the
non-matches for white (Table 3), a color that appears unique and the least confusing. While it
is highly likely that such a color will be correctly determined by a witness, it is not certain
because of differences in individual perception [17,18,22]. In fact, the outcome of this study
could aid investigative procedures in which a search for a vehicle of a specific color can be
broadened to other possibilities whenever a witness can provide more detailed color
information.
According to Bae et al. [22], an observer’s visual system can spontaneously assign category
labels to signals that interact with encoded shade content to produce bias during response,
particularly when an observer is required to describe the vehicle topcoat color after having
seen it just once. In other words, observing moving vehicles could lead to delays in encoding
colors, which is unlikely to happen for stationary vehicles; the consequently greater bias
could be the cause of the slightly lower inter-rater reliability values for moving vehicles
[22]. In this study, although the observational results demonstrated a slight decrease in the
overall inter-rater reliability for moving vehicles, no significant association was found
between vehicle status (stationary or moving) and matches in color determination.
Previous literature suggests that memory retrieval delays could affect observers’ color per-
ception [25,26], especially because of distractors such as surface illumination and the motion
of an object [27,28]. In this study, possible distortions or biases caused by delayed memory
were minimized through the provision of sufficient time to assign a color to the moving vehi-
cles, leading to no significant effect on observers’ color perception. Short-term memory could
be one factor leading to variations in color determination [28,29]; nonetheless, future studies
on its relationship with color perception, which could offer useful retrievable information to
law enforcement authorities, are recommended.
This study was conducted in optimal rating conditions, with adequate illumination and a fixed
distance, for young observers to determine topcoat colors. Further collation of information,
including vehicle color perception under different conditions and accounting in particular for
environmental factors and observers’ vision and attention toward color determination, is
necessary for broader forensic intelligence. Subsequent studies could also examine the
percentage of agreement with a greater number of observers and a smaller number of objects,
building on the suitable color system and the combinations of “non-match” colors identified in
the current study.
In general, the frequency of matches in topcoat color determination, both for stationary
and moving vehicles, could be related to their relative on-the-road distribution. This study
indicated that using basic colors without shade variations could lead to better determination of
color by an observer, resulting in a greater frequency of matches. The motion of vehicles did
not have much effect on scoring a match, given that the environmental conditions were
adequate for an observer to encode the color. This study supports the view that, for forensic
intelligence purposes, cases involving descriptions of vehicle topcoat colors require greater
investigative effort, including more careful interpretation of witness testimony. It should
also be emphasized that individual differences could lead to differences in color perception,
especially when intermediate colors are involved.
Conclusion
A survey to investigate the inter-rater reliability of color determination among observers who
visually perceived the topcoat colors of both stationary and moving vehicles indicated that the
frequencies of matches, and subsequently inter-rater reliability of color determination among
observers, significantly increased when using basic color descriptions, disregarding their
shades. White and black had the greatest matches, while intermediate colors like green-blue,
yellow-green, red-orange, gold-bronze, and red-brown were considered confusing, and thus
require careful interpretation during forensic investigation. Information from this study can
prove useful in interpreting witness descriptions of vehicle topcoat colors for more reliable
statements.
Acknowledgments
Special thanks to Associate Professor PT Jayaprakash and Associate Professor Lim Boon Huat
for editorial assistance.
Author Contributions
Data curation: Khai Lee, Saiful Fazamil Mohd Ali.
Formal analysis: Poh Ying Lim, Kah Haw Chang, Ahmad Fahmi Lim Abdullah.
Investigation: Anis Amanina Abdul Fatah, Nuryuhanis Mohd Norizan, Zakiah Jefrey, Fatin
Hanani Md Nawi, Wan Fatihah Khairunisa Wan Nor, Huan Xin Wong.
Methodology: Kah Haw Chang, Ahmad Fahmi Lim Abdullah.
Supervision: Ahmad Fahmi Lim Abdullah.
Validation: Poh Ying Lim, Kah Haw Chang, Ahmad Fahmi Lim Abdullah.
Writing – original draft: Khai Lee.
Writing – review & editing: Saiful Fazamil Mohd Ali, Poh Ying Lim, Kah Haw Chang,
Ahmad Fahmi Lim Abdullah.
References
1. Pfanstiehl J. Automotive Paint Handbook: Paint Technology for Auto Enthusiasts & Body Shop Profes-
sionals. New York: HP Books; 1998.
2. Toda K, Salazar A, Saito K. Automotive Painting Technology: A Monozukuri-Hitozukuri Perspective.
Netherlands: Springer; 2012.
3. Buckle J, Fung T, Ohashi K. Automotive Topcoat Colours: Occurrence Frequencies in Canada. Can
Soc Forensic Sci J. 1987; 20(2):45–56.
4. Ryland SG, Kopec RJ. The Evidential Value of Automobile Paint Chips. J Forensic Sci. 1979; 24
(1):140–7.
5. Bentley J. Composition, manufacture and use of paint. In: Candy B, editor. Forensic Examination of
Glass and Paint: Analysis and Interpretation. New York: Taylor & Francis; 2001. p. 123–42.
6. Abdullah AFL, Chang KH, Mohd Ali SF. A Survey of Vehicle Top Coat Colour in Malaysia. Malaysian J
Forensic Sci. 2014; 5(2):27–30.
7. Lee CT, Sandercock PML. A survey of automotive topcoat colours in Edmonton, Alberta. Can Soc
Forensic Sci J. 2011; 44(4):130–43.
8. Ryland SG, Kopec RJ, Somerville PN. The Evidential Value of Automobile Paint. Part II: Frequency of
Occurrence of Topcoat Colors. J Forensic Sci. 1981; 26(1):64–74.
9. Stone H, Murphy KJ, Rioux JM, Stuart AW. Vehicle Topcoat Colour and Manufacturer: Frequency Dis-
tribution and Evidential Significance: Part II. Can Soc Forensic Sci J. 1991; 24(3):175–85.
10. Taylor MC, Cousins DR, Holding RH, Locke J, Wilkinson JM. A data collection of vehicle topcoat col-
ours. 3. Practical considerations for using a national database. Forensic Sci Int. 1989; 40(2):131–41.
11. Volpe GG, Stone H, Rioux JM, Murphy KJ. Vehicle Topcoat Colour and Manufacturer: Frequency Distri-
bution and Evidential Significance. Can Soc Forensic Sci J. 1988; 21(1&2):11–8.
12. Croucher JS. Assessing the statistical reliability of witness evidence. Aust Bar Rev. 2003; 23:173–83.
13. Fouweather C, May RW, Porter J. The application of a standard color coding system to paint in forensic
science. J Forensic Sci. 1976; 21(3):629–35. PMID: 956751
14. Jordan G, Mollon JD. Rayleigh matches and unique green. Vis Res. 1995; 35(5):613–20. PMID:
7900300
Inter-rater reliability of vehicle color perception
PLOS ONE | https://doi.org/10.1371/journal.pone.0218428 June 18, 2019 9 / 10
15. Neitz J, Carroll J, Neitz M. Color vision: Almost reason enough for having eyes. Opt Photonics News.
2001; 12(1):26–33.
16. Neitz J, Jacobs GH. Polymorphism of the long-wavelength cone in normal human colour vision. Nature.
1986; 323(6089):623–5. https://doi.org/10.1038/323623a0 PMID: 3773989
17. Harnad S. Categorical perception. Encycl Cognitive Sci. 2003; 67(4):1–5.
18. Goldstone RL, Hendrickson AT. Categorical perception. Wiley Interdiscip Rev-Cognitive Sci. 2010; 1
(1):69–78.
19. Stampouli D, Brown M, Powell G, editors. Fusion of soft information using TBM. 2010 13th International
Conference on Information Fusion; 2010 26–29 July 2010; Edinburgh International Conference Centre
(EICC), Edinburgh, United Kingdom: IEEE.
20. Altman DG. Practical Statistics for Medical Research. London: Chapman and Hall; 1991.
21. Lagouvardos PE, Diamanti H, Polyzois G. Effect of individual shades on reliability and validity of observ-
ers in colour matching. Eur J Prosthodon Restor Dent. 2004; 12(2):51–6.
22. Bae GY, Olkkonen M, Allred SR, Flombaum JI. Why some colors appear more memorable than others:
A model combining categories and particulars in color working memory. J Exp Psychol-Gen. 2015; 144
(4):744–63. https://doi.org/10.1037/xge0000076 PMID: 25985259
23. Linhares JMM, Pinto PD, Nascimento SMC, editors. The number of colors perceived by dichromats
when appreciating art paintings under standard illuminants. Society of Imaging Science and Technology
- 4th European Conference on Colour in Graphics, Imaging, and Vision and 10th International Sympo-
sium on Multispectral Colour Science, CGIV 2008/MCS’08; 2008.
24. Boynton RM. Eleven colors that are almost never confused. Proc SPIE Int Soc Opt Eng. 1989;
1077:322–32.
25. De Fez MD, Capilla P, Luque MJ, Pe
´rez-Carpinell J, Del Pozo JC. Asymmetric colour matching: Mem-
ory matching versus simultaneous matching. Color Res Appl. 2001; 26(6):458–68.
26. Jin EW, Shevell SK. Color memory and color constancy. J Opt Soc Am A: Opt Image Sci Vis. 1996; 13
(10):1981–91.
27. Hong SW, Kang MS. Motion Alters Color Appearance. Sci Rep. 2016; 6:1–11. https://doi.org/10.1038/
s41598-016-0001-8
28. Olkkonen M, Allred SR. Short-term memory affects color perception in context. PLoS ONE. 2014; 9(1).
29. Ling Y, Hurlbert A. Role of color memory in successive color constancy. J Opt Soc Am A: Opt Image Sci
Vis. 2008; 25(6):1215–26.
Inter-rater reliability of vehicle color perception
PLOS ONE | https://doi.org/10.1371/journal.pone.0218428 June 18, 2019 10 / 10