Automated Scoring of a Neuropsychological Test:
The Rey Osterrieth Complex Figure.
R.O.Canham S.L. Smith and A.M. Tyrrell
Department of Electronics, University of York,
Heslington, York, England.
The Rey Osterrieth Complex Figure (ROCF) is a widely used neuropsychological test for visual perception and long term visual memory. Many scoring systems are used to quantify the accuracy of the drawings; these are currently implemented by hand in a subjective manner. This paper gives details of the current progress of a novel technique to locate the scoring sections of the most common of these systems (the Osterrieth Scoring System), with the ultimate goal of automating the scoring system. High levels of distortion are possible, making this an extremely difficult task; however, location and perceptual grading of the basic geometric features (triangles, rectangles and diamonds) have been most successful. All but one section in the test data was located (99.3% success) and 78% of the perceptual grades calculated were within 5% of grades generated by independent raters. Unary spatial metrics have been implemented to reduce the possible section candidates by an average of 75% without the loss of a single section.
1. Introduction
Neuropsychologists (concerned with the behavioural expression of brain dysfunction [19]) make use of many tests when assessing neurological dysfunction in a subject. Much information can be obtained by the use of advanced scanning techniques (such as CAT and MRI scans); however, there are still cases where a simple paper and pencil test can give additional, valuable information. In many situations the size of a lesion (a localised abnormal tissue change) does not accurately reflect the degree of dysfunction. Neuropsychological tests can provide valuable data concerning the progress of patients through treatment and provide a key tool for research into the organisation of brain activity and its translation into behaviour, brain disorders and behavioural disabilities. One such test is the Rey Osterrieth Complex Figure (ROCF), which was devised by Rey [24] and standardised by Osterrieth [24] to test visual perception and long term visual memory function. It is used as a neurological evaluation tool for both children and adults for a diverse range of conditions, from child developmental problems to dementia, trauma and infectious processes. The test requires the subject to copy, and later reproduce from memory, the diagram shown in figure 1.
Supported by the Engineering and Physical Sciences Research Council.
Figure 1. The Rey-Osterrieth Complex Figure.
Typically 20cm in length.
The ROCF is widely used in both clinical and research environments, and numerous studies have been performed upon it (see [19, 29] for extensive lists). The order and accuracy with which the figure is copied and drawn from recall provide useful information concerning the location and extent of any damage. To derive a more quantitative value for the accuracy of a subject’s drawing, various scoring systems have been employed. The most widely used is the Osterrieth system [19, 24]; the figure is split into eighteen identifiable areas (see figure 2), each of which is considered separately and marked on the accuracy of its position and the distortion exhibited, using the scale shown in table 1.
1: Cross
2: Large Rectangle
3: Diagonal Cross
4: Horizontal Line
5: Vertical Line
6: Small Rectangle
7: Small Segment
9: Triangle
10: Line
11: Circle with 3 Dots
12: Parallel Lines
13: Triangle
14: Diamond
15: Line
16: Line
17: Cross
18: Square
Figure 2. The Osterrieth Scoring System
Unit                                              Placement          Mark
Unit correct                                      Placed properly    2
Unit correct                                      Placed poorly      1
Unit distorted, incomplete but recognisable       Placed properly    1
Unit distorted, incomplete but recognisable       Placed poorly      1/2
Absent or unrecognisable                                             0
Table 1. Osterrieth marking allocation
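The marking allocation can be written as a small lookup. The following is a minimal sketch only: the names `Unit`, `osterrieth_mark` and `total_score` are hypothetical, and the half mark for a distorted, poorly placed unit is assumed from the conventional Osterrieth scheme.

```python
from enum import Enum


class Unit(Enum):
    """Quality of a drawn scoring section (hypothetical names)."""
    CORRECT = "correct"
    DISTORTED = "distorted or incomplete but recognisable"
    ABSENT = "absent or unrecognisable"


def osterrieth_mark(unit: Unit, placed_properly: bool) -> float:
    """Mark for one scoring section under the allocation in table 1."""
    if unit is Unit.ABSENT:
        return 0.0
    if unit is Unit.CORRECT:
        return 2.0 if placed_properly else 1.0
    # Distorted but recognisable; the 0.5 is the assumed conventional value.
    return 1.0 if placed_properly else 0.5


def total_score(sections) -> float:
    """Sum the marks over (unit, placed_properly) pairs, one per section."""
    return sum(osterrieth_mark(unit, placed) for unit, placed in sections)
```

With eighteen sections each worth at most two marks, a perfect drawing scores 36 under this allocation.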
Limitations of this scoring system include the lack of organisational information (such as whether the drawing was produced in a piecemeal or a logical fashion) and failure to differentiate the diagnostic importance of different sections. Consequently a number of scoring systems have been developed, including those by Waber and Holmes [31, 32], Bennett-Levy [2], Hamby [14], Fastenau [10] and the Boston Qualitative Scoring System by Stern et al. [28].
These scoring systems are currently applied by hand in what tends to be a subjective manner, which is open to interpretation. The Osterrieth system was accurately defined by Taylor (reproduced in [27]); however, this definition is not universally adhered to. The individual scoring sections tend to have very poor inter-rater reliability [29] and the system has been criticised for its lack of thorough testing [10]. It has also been noted that, to aid marking, the criteria have been set artificially strict or lenient [2]. Various aspects of the Waber and Holmes system have also produced low inter-rater reliability. The Boston Qualitative Scoring System makes use of guides and templates to produce a very comprehensive score; however, one drawing takes between 5 and 15 minutes to mark.
It is proposed to produce an automated implementation of the Osterrieth scoring system. Such automation would not only provide an objective and consistent result but would also relieve a highly skilled clinician of a tedious and time-consuming task.
Recording a subject’s drawing using a digitising tablet would provide an unobtrusive method of recording the constructional sequence that is required in some scoring systems (no satisfactory process is currently available). In addition, the tablet records dynamic data, which has been shown to contain valuable information in simpler neuropsychological copying tests [9], opening up an interesting avenue of research.
The first step in this automation is the location of the relevant scoring sections within an off-line, scanned image. This paper gives an overview of the current progress of this work, with a detailed description of the ROCF to place the work in context. Full technical details can be found in [5, 6] by the authors.
Section 2 of the paper gives a review of previous work and provides an overview of the difficulties of the problem. The proposed technique is given in section 3 and the results are shown in section 4. Finally, the paper’s conclusions are given in section 5.
2. Overview of Problems and Previous Work
Automating the location of the ROCF scoring sections is a very difficult problem. The ROCF is, by definition, a complex figure and the reproductions by patients typically have very high levels of distortion, many of which have clinical implications. The figure can be drawn in a piecemeal fashion, with sections misplaced, repeated or missing altogether. Sections can have large gaps in the sides or corners, be constructed using multiple strokes, or be squashed or twisted with curved or stepped sides.
Hand drawn line figures and sketches are generated in a number of applications and a great deal of work has been performed to provide robust techniques to interpret them. Both on-line and off-line applications have been considered, including computer graphical user interfaces [7, 26] and conversions for CAD input or for tidying plans or schematics [1, 3, 4, 13, 22, 23]. These images contain an inherent degree of distortion and inaccuracy that the techniques must be able to accommodate; however, the distortion produced by the ROCF is beyond the capabilities of these systems. Neural networks [20, 30] and other adaptive/learning techniques [11, 12, 16, 17] have been applied to hand drawn figures and similar applications, but with much simpler data sets with, by comparison to the ROCF, minimal distortion. With such a vast degree of variation possible in the ROCF it would be difficult to produce the large training set required for such systems, and no suitable technique has been identified.
Although much recent effort in computer vision techniques has, understandably, been aimed at adaptive and automated systems, there is a large body of work that reduces an image to a complex line image before applying a suitable knowledge based system (see [8] for a comprehensive survey). Many make use of Gestalt psychology to identify features; this identifies perceptually significant grouping properties in human vision based upon features such as co-termination, continuation along a straight or smoothly changing path, symmetry and closure [15]. A survey of perceptual organisation in computer vision can be found in [25]. There is a significant difference between these computer vision systems and the application detailed here: the computer vision problem is inherently probabilistic in nature, where a fundamentally correct image is distorted by noise, optical imperfections or problems associated with the feature extraction, while the ROCF data is inherently fuzzy in nature.
3. Approach Adopted and Implementation to
The approach taken to locate the scoring sections is to first identify all the suitable basic geometric shapes within the figure. In order to facilitate development, only scoring sections based on triangles, rectangles, diamonds and simple lines are to be considered at this time. However, this will still demonstrate the suitability of the techniques employed. With such large levels of distortion it is difficult to crisply categorise a shape as being present or not, and so a grade of perceptual distortion is calculated, which not only provides a scale to which a cut-off point can be assigned but can also be used as a metric in further processing.
3.1. Basic Geometric Feature Location and Rating
The geometric features are located by searching an Attributed Relational Graph (ARG) that is generated to represent the vectorised binary scanned image. ARGs have been used in similar applications such as [21, 22]. The vectorisation is a simple process of thinning the image and then applying a line following algorithm. Line segments that are collinear are then joined using a novel collinearity metric based upon fuzzy metrics of the lines’ closeness and difference in angle [6].
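As an illustration of how closeness and angle difference might be combined into a fuzzy collinearity grade, the sketch below uses linear membership functions and a minimum operator. It is not the metric of [6]: the function names, the thresholds `max_angle` and `max_gap_ratio`, and the choice of minimum are all assumptions made for this example.

```python
import math


def falling(x, zero_at):
    """Membership of 1 at x = 0, falling linearly to 0 for x >= zero_at."""
    return max(0.0, 1.0 - x / zero_at)


def angle_difference(seg_a, seg_b):
    """Smallest angle in degrees between two undirected line segments."""
    def direction(seg):
        (x1, y1), (x2, y2) = seg
        return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    d = abs(direction(seg_a) - direction(seg_b))
    return min(d, 180.0 - d)


def endpoint_gap(seg_a, seg_b):
    """Distance between the closest pair of end points of the two segments."""
    return min(math.dist(p, q) for p in seg_a for q in seg_b)


def collinearity(seg_a, seg_b, max_angle=20.0, max_gap_ratio=0.25):
    """Fuzzy grade in [0, 1] that two non-degenerate segments are collinear.

    The gap is measured relative to the mean segment length; both
    thresholds are illustrative values, not taken from the paper.
    """
    mean_len = (math.dist(*seg_a) + math.dist(*seg_b)) / 2.0
    mu_angle = falling(angle_difference(seg_a, seg_b), max_angle)
    mu_close = falling(endpoint_gap(seg_a, seg_b) / mean_len, max_gap_ratio)
    return min(mu_angle, mu_close)
```

Two segments would then be joined only if their grade exceeds a chosen threshold, which is one way to restrict joining to strongly collinear lines.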
The ARG is constructed to represent the connectivity of the collinear lines; each node of the ARG represents a collinear line and each joining arc represents a connection between two lines. However, due to the distortion the connectivity is often broken and so a ‘closeness’ metric is used. The ARG is searched independently for each shape. Each node is used as a starting point and all the connected nodes are tested. If the test succeeds, that line becomes the current line and the process is repeated, while failure results in chronological backtracking to perform an exhaustive depth first search.
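The search can be sketched as a recursive depth first traversal with backtracking. This is a simplified illustration rather than the authors' implementation: the graph is assumed to be a plain adjacency dictionary, `is_corner` is a hypothetical stand-in for the production rules, and only corner transitions are handled (line continuation and corner truncation are omitted).

```python
def find_closed_shapes(arcs, n_sides, is_corner):
    """Exhaustive depth first search for closed shapes with n_sides sides.

    arcs      -- dict mapping each line (node) to the lines connected to it
    n_sides   -- number of sides, and hence corners, of the target shape
    is_corner -- hypothetical test: do two connected lines form a corner?
    """
    found = {}

    def extend(path):
        current = path[-1]
        for nxt in arcs[current]:
            if not is_corner(current, nxt):
                continue  # test failed: try the next connected line
            if nxt == path[0] and len(path) == n_sides:
                # Back at the starting line with the right number of corners.
                # Key on the set of undirected arcs so that the same shape
                # reached from another start or direction is stored once.
                key = frozenset(
                    frozenset((path[i], path[(i + 1) % n_sides]))
                    for i in range(n_sides)
                )
                found.setdefault(key, list(path))
            elif nxt not in path and len(path) < n_sides:
                extend(path + [nxt])
        # Returning from this call is the chronological backtracking step.

    for start in arcs:
        extend([start])
    return list(found.values())
```

For example, calling it with `n_sides=3` on a graph in which three lines are mutually connected returns that single triangle, regardless of which line the search starts from.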
The test is in the form of a set of production rules that describe a corner, a line continuation and a corner truncation. The distortion can be so high that the lines constructing a straight side deviate so much that, under other circumstances, the join could be considered a corner; the joining of line segments to form collinear lines can therefore only be practised on strongly collinear lines. To accommodate continuous sides with greater distortions it is necessary to include a definition of a straight line continuation in the ARG search.
A common problem with the large rectangular scoring section (section number 2) is the truncation of a corner, as shown in figure 3. Another rule set has been created to account for a truncated corner.
Figure 3. Example of a truncated rectangle
If the original node is encountered and the correct number of corners has been found then the shape data is stored.
Each candidate shape found is then ‘rated’ to describe its level of perceptual distortion. This is naturally described by fuzzy sets and linguistic variables [33] that allow an intuitive description of a shape to be produced based upon Gestalt principles of perceptually significant features. This calculates a membership to rate how “good” the shape is.
The metric used is the collinearity of each side, based upon the difference of angle and closeness of line termination. However, this gives a ‘local’ collinearity metric and so a global metric of the maximum perpendicular deviation from the straight path is also included. Corner properties of the deviation from 90° (for rectangles) and closeness of line termination are used, together with a symmetry metric for diamonds. The metrics are combined using an appropriately weighted Yager intersection function [18]. Many of the metrics make use of relative sizes and distances (hence the rating process could not be integrated into the location process, since the size of the feature is required). However, it was found that relative measures had a tendency to score very large or very small structures either too leniently or too harshly; there is a more complex relationship due to the figure’s overall size, and so an absolute element was also included. The parameters used to control the fuzzy metrics are currently set by hand; however, they are in a format that can be automated using a suitable genetic algorithm.
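For reference, the Yager intersection of two memberships a and b with parameter w is 1 - min(1, ((1 - a)^w + (1 - b)^w)^(1/w)). The sketch below applies its n-ary form to a list of metric memberships. The function name and the default `w` are assumptions, and the per-metric weighting used by the authors is not reproduced because it is not specified here.

```python
def yager_intersection(memberships, w=2.0):
    """n-ary Yager intersection of fuzzy memberships in [0, 1].

    As w grows the result approaches min(memberships); smaller w
    penalises several moderately poor metrics more heavily.
    The default w is an illustrative value only.
    """
    total = sum((1.0 - m) ** w for m in memberships)
    return 1.0 - min(1.0, total ** (1.0 / w))
```

A shape whose metrics are all 1 keeps a grade of 1, while any combination of shortfalls whose combined distance from 1 reaches unity drives the grade to 0.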
Due to the multi-stroke nature of the drawn figures it is possible for a number of very similar features to be found and categorised as separate shapes. A fuzzy closeness metric is used to group similar features into a single element. For full technical details see [6].
3.2. Identification of Scoring Sections
The collection of geometric shapes must now be examined to identify the correct features. This is a considerable problem, since the variation of the figure is extreme. There is no guaranteed datum within the figure and most metrics must be relative to an uncertain base. The computational expense of calculating these relative metrics is great and so a first pass is performed to remove the most unsuitable features using unary metrics, where only absolute features are considered. The goal of this process is to remove as many unsuitable features as possible without the loss of any scoring sections.
To locate the sections a number of basic spatial relations are considered, grouped into the approximate categories of position, orientation, size and basic features. Once again fuzzy logic is employed to accommodate the distortion and variations possible in a natural and intuitive manner.
Basic Shape Features. The most fundamental basic
shape feature is the shape type itself, whether it is a triangle,
rectangle or diamond. Rectangles have an aspect ratio while
triangles can be a right angled triangle and have symmetry
in a given plane.
Size. The size of the feature is taken as the ratio of the
feature’s area and the total figure area to give an indication
of the size in both planes. However, the square root is taken
to give a more linear attribute.
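A minimal sketch of the size attribute follows. It assumes the feature is given as an ordered list of corner points and that the total figure area is supplied by the caller (for example the area of the figure's bounding box, which is an assumption); the function names are hypothetical.

```python
import math


def polygon_area(corners):
    """Area of a simple polygon from its ordered corners (shoelace formula)."""
    n = len(corners)
    twice_area = sum(
        corners[i][0] * corners[(i + 1) % n][1]
        - corners[(i + 1) % n][0] * corners[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def relative_size(corners, figure_area):
    """Square root of the ratio of feature area to total figure area."""
    return math.sqrt(polygon_area(corners) / figure_area)
```

Taking the square root means that a feature scaled by a factor k in both directions changes its size attribute by k rather than by k squared.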
Position. The position of a feature is considered independently in the x and y planes since it is possible for it to be correctly positioned in one plane but not the other. All corner points are considered and the worst case used.
Figure 4. Example of an incorrectly placed triangle with symmetry
Orientation. The basic orientation of a feature is a function of its side angles compared to the orientation angle being considered. Diamond features use the angle of the axes. Triangles also have directional orientations. A triangle with a vertical side can be facing to the left or right. Using triangular section 13 in figure 2 as an example, it is clear that it should have a right facing orientation. Triangles with a horizontal side can also have an up or down facing orientation, and a suitable right angle triangle can have both a left/right and an up/down orientation (see section 9 in figure 2).
Figure 5. Example of pre-processing steps. a. original b. thinned image c. vector representation (consists of 241 lines) d. collinear lines (consists of 119 lines)
The features can be misplaced but must still be identified. Symmetry is identified by Gestalt psychology [15] as an important factor. Hence a misplaced feature with symmetrical properties to its correct location must be considered as more significant than one without. Thus, when considering the direction of a triangle, the direction from the centre line is also included, again in the appropriate plane. If the triangular section 13 is used as an example and placed in a symmetrical position as shown in figure 4 then it will face away from the centre line.
The metrics calculated for each section are aggregated to
form a single measure using a generalised mean [18]. See
[5] for further technical details.
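The generalised mean of n memberships with parameter alpha is ((a_1^alpha + ... + a_n^alpha) / n)^(1/alpha), which always lies between the minimum and maximum of its arguments. The sketch below is a plain, unweighted version; the function name is hypothetical and the value of alpha used by the authors is not stated here.

```python
import math


def generalised_mean(values, alpha):
    """Generalised mean of memberships in [0, 1] with parameter alpha.

    alpha = 1 gives the arithmetic mean, alpha = -1 the harmonic mean,
    and alpha = 0 is treated as its limit, the geometric mean.
    """
    n = len(values)
    if alpha == 0:
        return math.prod(values) ** (1.0 / n)
    if alpha < 0 and any(v == 0 for v in values):
        return 0.0  # limit value; avoids raising on 0 ** negative power
    return (sum(v ** alpha for v in values) / n) ** (1.0 / alpha)
```

Lower values of alpha move the aggregate towards the worst metric, so alpha acts as a control on how strictly a single poor spatial metric counts against a candidate.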
4. Results
The techniques were tested using a random sample of 31 drawings of the ROCF produced by children attending the Institute of Child Health, London, who displayed a typical spread of illnesses seen by the neuropsychological unit. Of these, 16 drawings were produced by copying the figure and 15 from recall. Only the rectangular, triangular and diamond scoring sections were considered, which constitute scoring sections 2, 6, 9, 13, 14 and 18 (see figure 2).
Many of the 31 drawings had sections that were missing, or so highly distorted that they cannot be located at this stage in the processing and are only identifiable by a human observer with the use of contextual information. Hence a total of 140 scoring sections were considered; their composition is shown in table 2.
Section number under consideration    2     6     9     13    14    18
Number of sections present            28    20    23    26    22    21
Table 2. Composition of scoring sections
The pre-processing stages of thinning, vectorisation and grouping of collinear line segments performed well, reducing the mean number of collinear lines to 126 from 286 original lines. Some examples are given in figure 5.
The location process also performed very well, locating all sections except one (99.3% success). The nature of the distortion of the missed feature is such that it is very difficult to locate and hence it is best located in a later processing stage with the aid of contextual information. A selection of example scoring sections found is given in figure 6.
The perceptual grades calculated by the process were compared to scores generated by six independent raters. To simplify this grading process the raters were asked to grade each shape into a class that mapped onto a band of values within the automated scale. If the calculated grade fell outside the band then the distance to the edge of the band was expressed as a percentage error. The subjective results were quite diverse and so the modal average was taken to remove the extreme scores.
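The comparison can be sketched as follows. This is an illustration under stated assumptions: the automated scale is taken to run from 0 to 1, the error is expressed as a percentage of that full scale, and the band limits and function names are hypothetical.

```python
from statistics import mode


def band_error(grade, band_low, band_high, scale=1.0):
    """Percentage error of a calculated grade against a rater's band.

    Zero if the grade lies inside the band, otherwise the distance to the
    nearest band edge as a percentage of the full scale (an assumption).
    """
    if grade < band_low:
        distance = band_low - grade
    elif grade > band_high:
        distance = grade - band_high
    else:
        distance = 0.0
    return 100.0 * distance / scale


def consensus_class(rater_classes):
    """Modal class chosen by the raters for one shape."""
    return mode(rater_classes)
```

For six raters who place a shape in classes 3, 3, 4, 2, 3 and 5, the consensus class is 3, and the calculated grade is then compared with the band that class 3 maps onto.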
Figure 6. Example of scoring sections located. Rectangle 2 and diamond 14 are highlighted in (a), (c) and (e). Rectangle 6, triangle 9, triangle 13 and square 18 are highlighted (if present) in (b), (d) and (f).
The rating process generated very good approximations for the degree of distortion. When compared to the grades generated by the independent raters, some 78% of the calculated results had an error of 5% or less and all but 2 features (98.6% of the data) had an error of 10% or less. Table 3 shows the percentage of shapes with an error of 5% or less and 10% or less, broken down into the individual scoring sections.
The average number of shapes found with a perceptual grade higher than the working threshold was 53.5 rectangles, 103.2 triangles and 48.8 diamonds per figure. The unary metrics were used to discard features that were unsuitable for each scoring section and hence discarded an average of 75% of these features without the loss of a single scoring section. The breakdown for the individual scoring sections is given in table 4.
It is noticeable that the rectangular scoring section 6 performed less well compared to the other sections. This is understandable when the data is examined; it is in an area of high line density with a high degree of variability in size and position possible for that feature. Hence it is not possible to discard too many features without danger of discarding a scoring section.
Section                 2     6     9     13    14    18
Error of 5% or less     75    75    77    81    77    86
Error of 10% or less    100   95    100   100   95    100
Table 3. Percentage of features with calculated grades within given errors of independent raters’ grades
Section number    2      6      9      13     14     18
Reduction (%)     68.0   50.4   71.6   81.8   89.0   88.9
Table 4. Percentage of features discarded, using unary metrics, for each scoring section
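The headline figures can be checked against the tabulated data with a few lines of arithmetic. The sketch assumes that the quoted average reduction is the unweighted mean over the six scoring sections; the variable names are illustrative.

```python
# Number of sections present (table 2) and percentage discarded (table 4).
sections_present = {2: 28, 6: 20, 9: 23, 13: 26, 14: 22, 18: 21}
reduction = {2: 68.0, 6: 50.4, 9: 71.6, 13: 81.8, 14: 89.0, 18: 88.9}

total = sum(sections_present.values())           # 140 sections considered
location_success = 100.0 * (total - 1) / total   # all but one located
within_10 = 100.0 * (total - 2) / total          # all but two within 10%
mean_reduction = sum(reduction.values()) / len(reduction)
```

These give 140 sections, about 99.3% location success, about 98.6% of grades within 10%, and a mean reduction of 74.95%, in line with the figures quoted in the text.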
5. Conclusions
The Rey Osterrieth Complex Figure (ROCF) is a “pen and paper” neuropsychological test used to evaluate neurological dysfunction in visual perception and long term visual memory. A subject is asked to copy the complex figure and then reproduce it from memory. It is widely used in research and clinical environments. The Osterrieth scoring system is the most popular of the many scoring systems available that produce a quantitative score for the accuracy of the drawing. Currently the scoring is undertaken manually in a subjective manner and has been criticised for its unreliability in a number of publications. Automating the scoring process, as described in the paper, produces an objective result and removes a time-consuming and tedious task from a skilled clinician (some schemes can take up to 15 minutes per figure).
The first stage of this automation is the ability to identify the scoring sections within the possibly highly distorted figure. A novel process that employs fuzzy metrics based upon Gestalt psychology has been described that locates and grades the basic geometric shapes on a scale of perceptual distortion. This process functioned extremely well, locating all but one feature from a random set of test drawings (99.3% success). The grading process also performed well when compared to subjective grades produced by 6 independent raters; 78% of the features were within 5% of the subjective grades and 98.6% within 10%. The process to identify the relevant scoring section from within all the geometric shapes found requires a computationally expensive process using binary metrics. A set of unary metrics has been implemented to remove unsuitable features and hence speed the binary metric calculations. This process removed an average of 75% of the features without removing a single scoring section.
Acknowledgements
The authors are extremely grateful to Dr Elizabeth Issacs
and her colleagues at the Institute of Child Health, London,
for access to example patient responses to the ROCF, and
their contribution to this work.
References
[1] A. Apte, V. Vo, and T. Kimura. Recognizing multistroke geometric shapes: An experimental evaluation. Proc. 6th ACM Symposium on User Interface and Software Technologies, 3:121–8, 1993.
[2] J. Bennett-Levy. Determinants of performance on the Rey Osterrieth complex figure test: An analysis and a new technique for a single-case assessment. British Journal of Clinical Psychology, 23:109–119, 1984.
[3] Y. Bin and Y. Baozong. A consistent attributed graph-based hand-drawn circuit diagram reading system. Chinese Journal of Electronics, 4(1):1–11, 1995.
[4] H. Bunke. Attributed programmed graph grammars and their applications to schematic diagram interpretation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 4(6):574–582, 1982.
[5] R. Canham, S. Smith, and A. Tyrrell. Fuzzy spatial metrics for the identification of features from within a highly distorted, complex line drawing. BMVC 2000. In Press.
[6] R. Canham, S. Smith, and A. Tyrrell. Recognition of severely distorted geometric shapes from within a complex figure. Pattern Analysis and Applications. In Press.
[7] C. Chen and S. Xie. Fuzzy freehand drawing system. IEEE International Conference on Fuzzy Systems, II(367):1012–1017, 1994.
[8] D. Crevier and E. Lepag. Knowledge-based image understanding systems: A survey. Computer Vision and Image Understanding, 67(2):161–185, 1997.
[9] M. Fairhurst and S. Smith. Application of image analysis to neurological screening through figure-copying tasks. Int J Biomed Comput, 28:269–287, 1991.
[10] P. Fastenau, J. Bennett, and N. Denburg. Application of psychometric standards to scoring system evaluation: Is ’new’ necessarily ’improved’? Journal of Clinical and Experimental Neuropsychology, 18(3):462–472, 1996.
[11] A. Fred, J. Marques, and P. Jorge. Hidden Markov models vs syntactical modeling in object recognition. Proceedings of the International Conference on Image Processing, 1:892–6, 1997.
[12] R. Greenspan, R. Goodman, and C. Anderson. Learning texture discrimination rules in a multiresolution system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(9):894–901, 1994.
[13] M. Gross. Recognizing and interpreting diagrams in design. Proceeding of the Workshop on Advanced Visual Interfaces, pages 88–94, 1994.
[14] S. Hamby, J. Wilkins, and N. Barry. Organisational quality on the Rey-Osterrieth and Taylor Complex Figure Tests: a new scoring system. Psychological Assessment, 5(1):27–33, 1993.
[15] E. Hilgard, R. Atkinson, and R. Atkinson. Introduction to Psychology. Harcourt Brace Jovanovich Inc, 6th edition, 1975.
[16] M. Katagiri and M. Nagura. Recognition of line shapes using neural networks. IEICE Transactions on Information and Systems, 7:754–60, 1994.
[17] J. Keller and R. Kirshnapuram. Fuzzy decision models in computer vision. In R. Yager and Z. Zadeh, editors, Fuzzy Sets, Neural Networks and Soft Computing, pages 213–232. Van Nostrand, 1994.
[18] G. Klir and T. Folger. Fuzzy Sets, Uncertainty, and Information. Prentice-Hall International, 1988.
[19] M. Lezak. Neuropsychological Assessment. Oxford University Press, New York, 2nd edition, 1983. pp. 395–402.
[20] Y. Liu. Encoding more difficult shapes using a neuron like model. Neural Network World, 7(6):729–38, 1997.
[21] J. Llados, K. Lopez, and E. Marti. A system to understand hand-drawn floor plans using subgraph isomorphism and Hough Transforms. Machine Vision and Applications, 10:150–158, 1997.
[22] B. Messmer and H. Bunke. Automated learning and recognition of graphical symbols in engineering drawings. Lecture Notes in Computer Science: Graphics, Recognition, Methods and Applications. First International Workshop. Selected Papers, pages 123–134, 1996.
[23] B. Pasternak. Processing imprecise and structural distorted line drawings by an adaptable drawing interpretation kernel. International Association for Pattern Recognition systems, pages 318–37, 1994.
[24] A. Rey and P. Osterrieth. Translations of excerpts from Rey’s ‘Psychological Examination of Traumatic Encephalopathy’ and Osterrieth’s ‘The Complex Figure Test’. The Clinical Neuropsychologist, 7:2–21, 1993.
[25] S. Sarker and K. Boyer. Perceptual organization in computer vision: A review and a proposal for a classification structure. IEEE Transactions on Systems, Man and Cybernetics, 23(2):382–99, 1993.
[26] E. Saund and T. Moran. A perceptually-supported sketch editor. UIST ’94. The Seventh Annual Symposium on User Interface Software and Technology. Proceedings, pages 175–84, 1994.
[27] O. Spreen and E. Strous. A Compendium of Neurological Tests. Administration Norms and Commentary. Oxford University Press, 1991.
[28] R. Stern, L. Singer, N. Duke, C. Morey, E. Daughtrey, and E. Kaplan. The Boston Qualitative Scoring System for the Rey Osterrieth Complex Figure: Description and inter-rater reliability. The Clinical Neuropsychologist, 8(3):309–322.
[29] L. Tupler, K. Welsh, Y. Asare-Aboagye, and D. Dawson. Reliability of the Rey-Osterrieth Complex Figure in use with memory-impaired patients. Journal of Clinical and Experimental Neuropsychology, 17(4):566–579, 1995.
[30] F. Ulgen, A. Flavell, and N. Akamatsu. On-line shape recognition with incremental training using binary weights algorithm. Applied Intelligence, 6(3):225–240, 1996.
[31] D. Waber and J. Holmes. Assessing children’s copy production of the Rey-Osterrieth Complex Figure. Journal of Clinical and Experimental Neuropsychology, 7(3):264–280.
[32] D. Waber and J. Holmes. Assessing children’s memory production of the Rey-Osterrieth Complex Figure. Journal of Clinical and Experimental Neuropsychology, 8(5):563–580, 1986.
[33] L. Zadeh. Fuzzy sets. Information and Control, 8(3):338–353, 1965.
... ; /2022 One major limitation of this quantitative scoring system is that the criteria of what position and distortion is considered "accurate" or "inaccurate" may vary from clinician to clinician (Groth-Marnat 2000;Shin et al. 2006b;Canham, Smith, and Tyrrell 2000a). In addition, the scoring might vary as a function of motivation and tiredness or because the clinicians may be unwittingly influenced by interaction with the patient. ...
... Given the wide application of the ROCF, it is not surprising that we are not the first to take steps towards a machine-based scoring system. (Canham, Smith, and Tyrrell 2000b) have developed an algorithm to automatically identify a selection of parts of the ROCF (6 of 18 elements) with great precision. This approach provides first evidence for the feasibility of automated segmentation and feature recognition of the ROCF. ...
... Figure 1). Our analysis (see section: Clinicians' Scoring) and previous work (Canham, Smith, and Tyrrell 2000a) suggested that the scoring conducted by clinicians may not be consistent, because the clinicians may be unwittingly influenced by the interaction with the patient/participant or the scoring might vary as a function of their motivation and tiredness. For this reason, we have harnessed a large pool (~5000) of human workers (crowdsourced human intelligence). ...
Memory deficits are a hallmark of many different neurological and psychiatric conditions. The Rey-Osterrieth complex figure (ROCF) is the state-of-the-art assessment tool for neuropsychologists across the globe to assess the degree of non-verbal visual memory deterioration. To obtain a score, a trained clinician inspects a patient's ROCF drawing and quantifies deviations from the original figure. This manual procedure is time-consuming, slow and scores vary depending on the clinician's experience, motivation and tiredness. Here, we leverage novel deep learning architectures to automatize the rating of memory deficits. For this, a multi-head convolutional neural network was trained on 20225 ROCF drawings. Unbiased ground truth ROCF scores were obtained from crowdsourced human intelligence. The neural network outperforms both online raters and clinicians. Our AI-powered scoring system provides healthcare institutions worldwide with a digital tool to assess objectively, reliably and time-efficiently the performance in the ROCF test from hand-drawn images.
... Several automated scoring systems based on multi-stroke sketch recognition and machine learning have been published previously (Canham et al., 2000;Davis et al., 2015;Niemann et al., 2018;. Such systems typically score assessments based on their scoring schemes, thereby simulating the work of a clinician, and need to include task dependent knowledge for each type of assessment. ...
... MoCA (Nasreddine et al., 2005) | 10 min | 17% | Clock, digits, lines
ROCF (Canham et al., 2000) | 15 min | 100% | Circles, rectangles, triangles, ... ...
... An even more complex example of a neuropsychological assessment that is rated entirely based on pen input is the ROCF (Duley et al., 1993; Canham et al., 2000). A printed Rey-Osterrieth figure template is presented to the subject, who is then asked to copy it onto a blank piece of paper (see bottom of Figure 1). ...
Digital pen features model characteristics of sketches and user behavior, and can be used for various supervised machine learning (ML) applications, such as multi-stroke sketch recognition and user modeling. In this work, we use a state-of-the-art set of more than 170 digital pen features, which we implement and make publicly available. The feature set is evaluated in the use case of analyzing paper-pencil-based neurocognitive assessments in the medical domain. Most cognitive assessments, for dementia screening for example, are conducted with a pen on normal paper. We record these tests with a digital pen as part of a new interactive cognitive assessment tool with automatic analysis of pen input. The physician can, first, observe the sketching process in real-time on a mobile tablet, e.g., in telemedicine settings or to follow Covid-19 distancing regulations. Second, the results of an automatic test analysis are presented to the physician in real-time, thereby reducing manual scoring effort and producing objective reports. As part of our evaluation we examine how accurately different feature-based, supervised ML models can automatically score cognitive tests, with and without semantic content analysis. A series of ML-based sketch recognition experiments is conducted, evaluating 10 modern off-the-shelf ML classifiers (i.e., SVMs, Deep Learning, etc.) on a sketch data set which we recorded with 40 subjects from a geriatrics daycare clinic. In addition, an automated ML approach (AutoML) is explored for fine-tuning and optimizing classification performance on the data set, achieving superior recognition accuracies. Using standard ML techniques our feature set outperforms all previous approaches on the cognitive tests considered, i.e., the Clock Drawing Test, the Rey-Osterrieth Complex Figure Test, and the Trail Making Test, by automatically scoring cognitive tests with up to 87.5% accuracy in a binary classification task.
... Visual analysis techniques include studies that attempt to analyze completed responses of graphomotor-based neuropsychological tasks such as the Necker's cube [52], ROCF [53], BGT [54], and CDT [55]. Most of these tests include geometrically inspired shapes that involve linearity, circularity, curvilinearity and angularity components, and aim at the visual-perceptual orientation of the subjects. ...
... Heuristic-based techniques have also been employed in the literature [53,58,63,69,91]. Their prime objective is to provide explainable solutions for the target users, i.e. the clinical practitioners. ...
... We have observed that studies [52,53,91] that evaluate the quality of primitive components or analyze their spatial organization [58,63] are characterized by the challenges of localization and segmentation. The localization of the intended segment is mainly affected by the inherent imprecision and ambiguity of a free-hand drawing. ...
To date, Artificial Intelligence systems for handwriting and drawing analysis have primarily targeted domains such as writer identification and sketch recognition. Conversely, the automatic characterization of graphomotor patterns as biomarkers of brain health is a relatively less explored research area. Despite its importance, the work done in this direction is limited and sporadic. This paper aims to provide a survey of related work to provide guidance to novice researchers and highlight relevant study contributions. The literature has been grouped into 'visual analysis techniques' and 'procedural analysis techniques'. Visual analysis techniques evaluate offline samples of a graphomotor response after completion. On the other hand, procedural analysis techniques focus on the dynamic processes involved in producing a graphomotor reaction. Since the primary goal of both families of strategies is to represent domain knowledge effectively, the paper also outlines the commonly employed handwriting representation and estimation methods presented in the literature and discusses their strengths and weaknesses. It also highlights the limitations of existing processes and the challenges commonly faced when designing such systems. High-level directions for further research conclude the paper.
... The ROCF test [6] is based on a 36-point scoring system, meaning that the final score is based on 18 different features of the figure. In order to commit the image to memory [1], the users should mentally simplify it into simpler, easy-to-recall shapes. ...
... The users see three buttons. The "Home" button returns the user to the original image, shown in the first step. The "Clear" button cleans the canvas and erases the user's drawing. ...
[Figure caption: The [6] image divided into simpler shapes [1]]
... The ROCF test [6] uses a 36-point scoring method [7]. That is the reason why, as mentioned, the final score is based on 18 different features of the figure [1]. The test's scoring method is based on the users' ability to memorize an image and to recreate it as accurately as possible. ...
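The relationship between the 18 features and the 36-point maximum can be illustrated with a minimal sketch. It assumes the conventional Osterrieth scale (2, 1, 0.5 or 0 points per element, depending on accuracy and placement), which is not spelled out in the snippets above; the function names and the boolean flags are hypothetical.

```python
# Illustrative sketch only: the per-element scale below is the conventional
# Osterrieth scale and is an assumption, not taken from the cited papers.
N_ELEMENTS = 18          # scoring sections of the figure
MAX_PER_ELEMENT = 2.0    # 18 elements x 2 points = 36-point maximum


def element_score(present, accurate, well_placed):
    """Score one element from three (hypothetical) rater judgements."""
    if not present:
        return 0.0                 # absent or unrecognisable
    if accurate and well_placed:
        return MAX_PER_ELEMENT     # correct and properly placed
    if accurate or well_placed:
        return 1.0                 # either distorted or misplaced
    return 0.5                     # distorted and misplaced, but recognisable


def total_score(elements):
    """Sum the element scores of one drawing (expects 18 judgement triples)."""
    if len(elements) != N_ELEMENTS:
        raise ValueError("expected %d elements" % N_ELEMENTS)
    return sum(element_score(*e) for e in elements)
```

A drawing in which every element is judged present, accurate and well placed reaches the 36-point maximum; any omission or distortion lowers the total element by element.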
Neuroeducation of the human brain can be achieved with modern means and even with the use of computer applications. Thus, the idea for a painting application was born. This application focuses on the recreation of a two-dimensional geometric shape with fingers. Users are asked to draw a simple image with geometric shapes, from memory with the primary goal of correctly displaying the shapes of the original one. Based on the Rey-Osterrieth complex figure test, users are confronted with an image for a few seconds and are asked to locate the shapes in that image as well as to represent it as accurately as possible. The main concern is the categorization of users based on their score with the aim of mentally stimulating and training their brain. This categorization depends on how many shapes of the original image they have identified as well as how they have been designed. The main purpose of the application is the neuropsychological evaluation of its users, where it could be used to study information such as the degree of cognitive development in patients.
... However, methods for directly predicting RCFT scores based on the 36-point scoring system, which is widely used in clinical fields, have been very limited. A method to score the RCFT was first developed by segmenting six relevant scoring sections [27]. However, it offered only six of the 18 scoring sections, so could not be applied to the 36-point scoring system. ...
... However, little attention has been given to direct automatic scoring that was comparable to human experts' scoring. Although several attempts have been made to develop RCFT scoring systems, none of the studies has reported a scoring system that is comparable to human experts and sufficiently validated [27,28]. Generally, DL methods have been proven to outperform other methods in terms of prediction and to improve generalization if a sufficiently large dataset is guaranteed [20,31]. ...
Background: The Rey Complex Figure Test (RCFT) has been widely used to evaluate neurocognitive functions in various clinical groups with a broad range of ages. However, despite its usefulness, the scoring method is as complex as the figure. Such a complicated scoring system can reduce the extent of agreement among raters. Although several attempts have been made to use RCFT in clinical settings in a digitalized format, little attention has been given to developing direct automatic scoring that is comparable to experienced psychologists. Therefore, we aimed to develop an artificial intelligence (AI) scoring system for RCFT using a deep learning (DL) algorithm and confirmed its validity. Methods: A total of 6,680 subjects were enrolled in the Gwangju Alzheimer’s and Related Dementia cohort registry, Korea from January 2015 to June 2021. We obtained 20,040 scanned images using three images per subject (copy, immediate recall, and delayed recall) and scores rated by 32 experienced psychologists. We trained the automated scoring system using the DenseNet architecture. To increase the model performance, we improved the quality of training data by re-examining some images with poor results (mean absolute error (MAE) 5 [points]) and re-trained our model. Finally, we conducted an external validation with 150 images scored by five experienced psychologists. Results: For five-fold cross-validation, our first model obtained MAE = 1.24 [points] and R-squared (R²) = 0.977. However, after evaluating and updating the model, the performance of the final model was improved (MAE = 0.95 [points], R² = 0.986). Predicted scores differed significantly among the cognitively normal, mild cognitive impairment, and dementia groups. For the 150 independent test images, the MAE and R² between the AI and the average scores of five human experts were 0.64 [points] and 0.994, respectively.
Conclusion: We concluded that there was no fundamental difference between the rating scores of experienced psychologists and those of our AI scoring system. We expect that our AI psychologist will be able to contribute to screen the early stages of Alzheimer’s disease pathology in medical checkup centers or large-scale community-based research institutes in a faster and cost-effective way.
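The two agreement measures reported in this abstract, MAE and R-squared, can be computed as in the following sketch. It assumes that R-squared is the coefficient of determination (the abstract does not say whether a squared correlation was used instead); the function names and the example scores are made up for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between rater scores and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical scores on the 36-point scale (not data from the study).
human = [36.0, 30.0, 24.0, 18.0]
model = [35.0, 31.0, 24.0, 16.0]
print(mean_absolute_error(human, model))  # 1.0
print(r_squared(human, model))
```

On the 36-point scale, an MAE below one point means that the predicted total is, on average, less than half of one element's maximum score away from the raters' total.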
... The order and accuracy in which the RCF is copied and drawn from the recall can provide information on the location and extent of neuropsychological disorder [6]. Studies using RCF have revealed visual memory disturbances in individuals with dementia. Poorer copying of a figure by a given patient in comparison with that by normal healthy controls could suggest Alzheimer's disease (AD). ...
... [11] In the 36-point version of the RCF scoring system, the figure is split into eighteen identifiable areas [6,8]. Each line category is considered separately and marked on the accuracy of its position and the exhibited distortion with scales shown in Appendix 1. The clinical decision was a major criterion for differentiating MCI and mild dementia subjects. ...
Background and purpose: Interpreting the Rey complex figure (RCF) requires a standard RCF scoring system and clinical decision by clinicians. Interpreting the RCF by clinical decision might be less accurate for diagnosing mild cognitive impairment (MCI) or dementia than using the RCF scoring system. For this reason, a machine-learning algorithm was used to demonstrate that scoring the RCF by clinical decision is not as accurate as the RCF scoring system in distinguishing MCI or mild dementia patients from normal subjects. Methods: The RCF dataset consisted of 2,232 subjects with formal neuropsychological assessments. The RCF dataset was divided into 2 datasets. The first dataset was to compare normal vs. abnormal and the second dataset was to compare normal vs. MCI vs. mild dementia. Models were trained using a convolutional neural network for machine learning. Receiver operating characteristic curves were used to compare the sensitivity, specificity, and area under the curve (AUC) of models. Results: The trained model's accuracy for predicting cognitive states was 96% with the first dataset (normal vs. abnormal) and 88% with the second dataset (normal vs. MCI vs. mild dementia). The model had a sensitivity of 85% for detecting abnormal with an AUC of 0.847 with the first dataset. It had a sensitivity of 78% for detecting MCI or mild dementia with an AUC of 0.778 with the second dataset. Conclusions: Based on this study, the RCF scoring system has the potential to present more accurate criteria than the clinical decision for distinguishing cognitive impairment among patients.
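The quantities named in this abstract (sensitivity, specificity and AUC) can be illustrated with a short sketch for a binary normal vs. abnormal classifier. The function names, the threshold and the example values are hypothetical; the AUC is computed here as the fraction of (abnormal, normal) pairs that the model ranks correctly, which equals the area under the ROC curve.

```python
def sensitivity_specificity(labels, scores, threshold):
    """labels: 1 = abnormal, 0 = normal; a score >= threshold predicts abnormal."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs (not data from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(sensitivity_specificity(labels, scores, 0.5))
print(auc(labels, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality over all thresholds, which is why the abstract reports both.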
... The standard Rey-Osterrieth complex figure task will assess local and global processing in participants differing in autistic and ADHD traits. Although tablet devices have been used previously for the task in autism and ADHD (Canham et al., 2000; Hyun et al., 2018), none of them have yet combined spatial and temporal drawing measurements. Therefore, we cannot predict how temporal measures will be different between our participant groups. ...
This paper describes a smart tablet-based drawing app to digitally record participants’ engagement with the Rey-Osterrieth complex figure (ROCF) task, a well-characterised perceptual memory task that assesses local and global memory. Digitisation of the tasks allows for improved ecological validity, especially in children attracted to tablet devices. Further, digital translation of the tasks affords new measures, including accuracy and computation of the fine motor control kinematics employed to carry out the drawing. Here, we report a feasibility study to test the relationship between two neurodevelopmental conditions: autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD). The smart tablet app was employed with 39 adult participants (18-35) characterised for autistic and ADHD traits, and scored using the ROCF perceptual and organisational scoring systems. Trait scores and conditions were predictor variables in linear regression models. Positive correlations were found between the attention-to-detail, attention-switching and communication subscales of the autistic trait questionnaire and organisational scores on the ROCF task. These findings suggest that autistic traits might be linked to differential performance on the ROCF task. Novelty and future applications of the app are discussed.
Mild Cognitive Impairment (MCI) is a condition which may lead to a more serious neurodegenerative disease called dementia, affecting between 12% and 18% of the global population aged 60 or older. Neuropsychological tests conducted by professionals allow for early detection of MCI and early treatment of this condition to prevent further development. Several authors have attempted to automate the assessment process of these types of tests, which enables faster screening of the population and therefore better prevention of the symptoms of neurodegenerative diseases. However, most previously published works rely on classical Machine Learning techniques, which require handcrafted features and whose effectiveness depends on the quality of these features. Also, the more advanced Deep Learning models used in the automation of these tests require large amounts of training data in order to be accurate, and they are also sensitive to noise and variability in the data. In this work, we propose a novel approach to automating one of these tests, the Rey-Osterrieth Complex Figure (ROCF) test, using Recursive Cortical Networks (RCN). The RCN framework addresses the disadvantages of the previously mentioned techniques: it is resilient to noise and variability, and it uses automatic hierarchical feature construction instead of hand-crafted features while requiring only a very small amount of training data. This work describes the properties of RCN and how they can be of use in the development of an automatic scoring algorithm for the ROCF test.
During early stages of development, the loss of a sensory system can lead to profound neural reorganization, specifically leading to an enhancement of the remaining modalities, a phenomenon termed cross-modal plasticity. Thus, in the absence of hearing, vision is put under great demand, resulting in use-dependent plasticity pertaining to visual functions. The objective of the current study is to assess visual memory among congenitally deaf children and compare the results with those obtained from normal hearing individuals. The study included 60 congenitally deaf children and 50 normally hearing subjects. 30 subjects from each group with intelligence scores between 25-75 percentile were tested for visual memory. The results of the present study disclosed that deaf individuals were superior to normally hearing subjects with respect to immediate recall and delayed recall. Since deaf individuals demonstrate increased sensitivity to visual stimuli, their visual strengths can be utilized for better communication and academic achievement.
A new system for measuring organizational quality on the Complex Figure Test (Rey, 1942; Taylor, 1969) is described. This system extends the traditional use of the test as a measure of constructional ability and figural memory. The new system is easy to learn, quick to score, and shows very good interrater reliability. Organizational quality was found to correlate moderately with copy accuracy, half-hour recall, and percentage retained. In an initial application of the system (N = 63), organizational quality of the Rey-Osterrieth Complex Figure successfully discriminated between symptomatic (those with acquired immunodeficiency syndrome [AIDS] or AIDS-related complex [ARC]) and asymptomatic subjects positive for human immunodeficiency virus (HIV), but organizational quality ratings of the Taylor figure did not. The results suggest that the Taylor figure is easier to organize than the Rey-Osterrieth. Thus, the Taylor figure may not be an appropriate alternative to the Rey-Osterrieth for the assessment of organizational quality.
Previous neural studies on recognizing multiple shapes have found that adequately organized neural networks can address the problem to some degree, but these explorations are still limited. One of the major limitations is that the neural procedures work adequately only with closed figures, not with line drawings. This paper proposes a method which generalizes the combined procedure of neural networks and visual attention in order to recognize line drawings or a primary shape consisting of single lines. Owing to its computational flexibility, the mechanism of searchlight attention gives the neural networks the potential to address practical shape problems, including different scales, different proportions and even simple line drawings.
We apply neural networks to implement a line shape recognition/classification system. The purpose of employing neural networks is to eliminate target-specific algorithms from the system and to simplify the system. The system needs only to be trained on samples. The shapes are captured by the following operations. Lines to be processed are segmented at inflection points. Each segment is extended at both ends by a certain percentage. The shape of each extended segment is captured as an approximate curvature. The curvature sequence is normalized by size in order to obtain a scale-invariant measure. Feeding this normalized curvature data to a neural network leads to position-, rotation-, and scale-invariant line shape recognition. According to our experiments, almost 100% recognition rates are achieved against 5% random modification and 50%-200% scaling. The experimental results show that our method is effective. In addition, since this method captures shape locally, partial lines (caused by overlapping etc.) can also be recognized.
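The idea of a size-normalized curvature sequence can be sketched as follows. This is one possible discretisation, not the method of the cited paper: curvature at each interior vertex of a polyline is approximated by the turning angle divided by the local segment length, then multiplied by the total line length so that uniform scaling cancels out. The function name is hypothetical, and the sketch assumes at least three points with no repeated consecutive points.

```python
import math


def normalized_curvature(points):
    """Approximate, scale-normalized curvature at each interior polyline vertex.

    points: list of (x, y) tuples with no repeated consecutive points.
    The result depends only on turning angles and length ratios, so it is
    unchanged by translation, rotation and uniform scaling of the line.
    """
    segs = [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]
    lengths = [math.hypot(dx, dy) for dx, dy in segs]
    total = sum(lengths)
    curvature = []
    for (ax, ay), (bx, by), la, lb in zip(segs, segs[1:], lengths, lengths[1:]):
        # Signed turning angle between consecutive segments.
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        # Angle per unit arc length, rescaled by the total line length.
        curvature.append(turn / (0.5 * (la + lb)) * total)
    return curvature
```

A sequence of this kind could then be resampled to a fixed length and used as the input vector of a classifier; the resampling and the network itself are outside the scope of this sketch.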
This report describes a qualitative scoring method for the Rey-Osterrieth Complex Figure (ROCF) that provides scores for fragmentation, planning, organization, presence and accuracy of various features, placement, size distortions, perseveration, confabulation, rotation, neatness, symmetry, and immediate and delayed retention. Interrater reliability is reported for productions drawn by 60 adolescents and adults. Each production received 16 initial scores based on specific criteria. Six additional summary scores were then calculated using scores from all three conditions (i.e., copy, immediate, and delayed recalls). Kappa statistics and intraclass correlations indicated excellent interrater reliability. In sum, the Boston Qualitative Scoring System for the ROCF appears to be a highly reliable and informative method for scoring this widely used instrument.