The University of Reading
Knowledge construction in typography:
the case of legibility research and the
legibility of sans serif typefaces
Ole Lund
Thesis submitted for the degree of Doctor of Philosophy
Department of Typography & Graphic Communication
October 1999
Abstract
This thesis is first of all an epistemological study preoccupied with
typographic knowledge construction. The thesis is also in its own right
an idiographic historical study about legibility research and about the
discourse on the legibility of sans serif typefaces.
Changing theoretical and operational definitions of legibility are
discussed, an historical outline of legibility research is presented, and
critiques of legibility research that from time to time have surfaced are
analysed around recurring topics. Central to the thesis is a detailed and
critical review of 28 typeface legibility studies based on a wide variety of
rationales and operational methods, the first published in 1896, the last
in 1997.
The thesis effectively reveals that nearly all of the 28 reviewed
studies (of a surprising total of 72 identified studies) lack internal
validity (the intra-paradigm sine qua non of experimental research).
Other methodological flaws are also revealed. The detailed survey
provides a thorough empirical substantiation of criticism that has been
raised only in a sweeping manner on earlier occasions. The thesis shows
that traditional experimental legibility research has provided a non-
productive approach to typographic knowledge construction.
However, the thesis reveals that an increasing number of typeface
legibility studies have been carried out during the last two decades. This
stands in contrast to prevailing notions that legibility research to a large
extent vanished in the early 1980s, or alternatively, that ‘too few’
legibility studies have been carried out during the last two decades. The
thesis also shows that dubious and even seriously flawed legibility
studies are frequently and indiscriminately cited in the field of
information design.
Although the thesis primarily contributes knowledge to a self-
reflective conversation in typography and information design, it also
contributes knowledge of value to design history, design epistemology,
reading research history and social science history.
Contents
Abstract 2
Contents 3
Acknowledgements 7
Author’s declaration 8
1 INTRODUCTION 9
2 THE CONSTRUCT OF LEGIBILITY 15
Terminology and theoretical definitions 15
Ergonomics as a framework 21
Operational definitions 21
Experimental performance studies 23
Continuous reading 23
Threshold visibility 28
Search task 30
Subjective preference studies 31
Typeface topology studies 32
3 A CENTURY OF LEGIBILITY RESEARCH 34
Introduction 34
The hectic activity in the 1960s and 1970s 37
Changing paradigms: from ‘legibility’ to ‘usability’ 41
Legibility research today 46
Psychology 46
Graphic design 47
Digital typography 48
Ergonomics and information design 50
Vision research 51
Many recent studies 52
4 CRITIQUES OF LEGIBILITY RESEARCH 54
Introduction 54
Post-positivistic critique, and notions of tacit craft knowledge 55
Lack of internal validity 61
Peripherality to the reading process 63
Lack of theory 65
‘The hypothesis of habit’: empiricism vs. rationalism 67
Critique from design practitioners 71
Postmodernist critique 78
5 A REVIEW OF EMPIRICAL STUDIES 80
Approach 80
Terminology 82
Bibliographic sources 84
Selection criteria 84
Studies assessed 86
Studies not assessed 87
A note on ‘semantic’ studies 89
Griffing and Franz 1896 92
Roethlein 1912 94
Legros and Grant 1916 96
Legros 1922 100
Pyke 1926 102
Crosland and Johnson 1928: the serifs that never were 105
Moede 1932 107
Paterson and Tinker 1932 111
Webster and Tinker 1935 114
Luckiesh and Moss 1937 116
Luckiesh and Moss 1942 118
Tinker 1944 118
English 1944 120
Burt, Cooper and Martin 1955 / Burt 1959 122
Christie and Rutley 1961: The legibility of traffic signs and the
public debate on Jock Kinneir’s Motorway alphabet 126
The Motorway alphabet 127
The public controversy 130
The experiments 133
Discussion 136
In retrospect 140
The aftermath 143
Poulton 1965 147
Zachrisson 1965 153
Wendt 1969 155
Robinson, Abbamonte, and Evans 1971: Why serifs
are (still) important 161
The content of ‘Why serifs are important’ 162
A cognitive science approach 165
A multi-level chain of theory 169
Feature detection theory 170
Degradation theory 173
Implementation, interpretation, and reasoning 175
Theoretical assumptions or physiological facts? 176
The paper’s reception 177
Harris 1973 179
Hvistendahl and Kahl 1974 182
Vanderplas and Vanderplas 1980 185
Suen and Komoda 1986 187
Taylor 1990 190
de Lange, Esterhuizen and Beatty 1993 199
Silver and Braun 1993 205
Silver, Kline, and Braun 1994 210
Wheildon [1984] 1995 215
van Rossum 1997 220
6 DISCUSSION: KNOWLEDGE PRODUCTION AND TECHNICAL
RATIONALITY 224
A comment on the reviewed studies 224
Domain knowledge 226
An extreme position 227
Negative knowledge 228
Research that breaks with ‘received wisdom’ 229
Shadow positivism 234
Operationalism 236
Translation of ‘findings’ 238
Parallels between legibility research and human-computer
interaction research 239
Epistemic alternatives to legibility research 240
The epistemic alternative of ‘design-based theory’ 242
7 CONCLUSION 247
REFERENCES 249
Acknowledgements
My thanks go to Gjøvik College for awarding me a research fellowship,
and especially to the former rector Einar Flaten and the former faculty
dean Jarle Nordengen. My thanks go also to the Committee of Vice-
Chancellors and Principals of the Universities of the United Kingdom for
awarding me the substantial Overseas Research Student Award for a
period of three years.
I am grateful to everyone in the Department of Typography &
Graphic Communication at the University of Reading for all their
support and warm friendship. I am indebted to my supervisor, Paul Stiff,
who on no occasion can be held responsible for the inadequacies of my
thesis.
I must thank Chris Burke and Mary Dyson at the Department of
Typography & Graphic Communication, and Peter Hacker at St John’s
College, Oxford, for reading a draft of the section ‘Robinson, Abbamonte,
and Evans 1971: Why serifs are (still) important’ in chapter 5.
My thanks go to Robin Kinross, London, for making me aware of,
and helping me to access, important source material, and for reading a
draft of the section ‘Christie and Rutley 1961: The legibility of traffic
signs and the public debate on Jock Kinneir’s Motorway alphabet’ in
chapter 5. I must also thank Margaret Calvert, London, who gave me
access to various documents, and permission to cite letters written to or
by Jock Kinneir.
Furthermore, my thanks go to Hans Jakob Ågotnes, University of
Bergen, Rolf Petter Amdam, The Norwegian School of Management,
Oslo, Ove Bjarner, Molde College, and Jan Michl, The Oslo School of
Architecture, for reading a draft of the introduction.
My thanks are also due to Linda Reynolds at the Department of
Typography & Graphic Communication, who helped me to access a
couple of articles important for my thesis.
The kind and efficient library staffs at Reading University and
Gjøvik College also deserve my thanks.
Finally, thanks to my patient wife Sigrun, my two children Martin
and Marie, and my supportive parents.
Author’s declaration
The section ‘Robinson, Abbamonte, and Evans 1971: Why serifs are
(still) important’, in chapter 5, ‘A review of empirical studies’, has
previously, except for a few minor amendments, been published as
‘Why serifs are (still) important’, in Typography Papers, no. 3, 1997,
pp. 91–104 (Lund 1997a).
The section ‘Wheildon [1984] 1995’, in chapter 5, ‘A review of
empirical studies’, has previously, except for a few minor amendments,
been published as a book review, in Information Design Journal, vol. 9,
no. 1, 1997, pp. 74–77 (Lund 1997b).
About this PDF file:
Except for the correction of some typos, the addition of two missing entries
in the list of references (referred to in the main text, i.e. Bigelow 1989 and
Stiff 1994), and the deletion of two duplicate entries in the list of references,
this PDF file appears as an identical facsimile of the pages of the original
hardbound thesis deposited at the British Library in 1999.
1 Introduction
This thesis is first of all an epistemological study preoccupied with
typographic knowledge production. The thesis is also in its own right an
idiographic1 historical study about legibility research and about the
history of a discourse, that is, the discourse about the relative legibility
of serif and sans serif typefaces, while focusing on the marked category
sans serif typefaces.
Changing theoretical and operational definitions of legibility are
discussed, and an historical outline of legibility research is presented.
The main focus is on the hectic activity in the 1960s and 1970s, the
relative loss of legitimacy and the changing paradigm from ‘legibility’ to
‘usability’ in the 1980s and 1990s, and furthermore, the status of
legibility research today. Critiques of legibility research that from time
to time have surfaced are analysed around recurring topics. Central to
the thesis is a detailed and critical review of 28 typeface legibility studies
based on a wide variety of rationales and operational methods, the first
published in 1896, the last in 1997.
The thesis is in one respect a design historical study. However, the
focus is not on designed artefacts, individual designers, period, style, movement, or, for that matter, design in a broader socio-economic context; it is rather on a discourse about the legibility of a particular
1. The term ‘idiographic’ appears in historiography and science philosophy. It was coined
by the German historian Wilhelm Windelband in 1884. It refers to individual matters
and the unique and nonrecurrent, as opposed to the generalizing or lawful, described
by the term ‘nomothetic’ (or ‘nomological’). ‘History’ is often referred to as
‘idiographic’, as opposed to the ‘nomothetic’ natural sciences, while social science (and
especially positivist social science) has ‘nomothetic’ aspirations. However, these
distinctions are certainly debatable. See for example Nagel 1961, pp. 547–551, and Baxandall 1985, pp. 12–15.
kind of designed artefact, sans serif typefaces. Thus, if the thesis is
regarded as design history, then it is different from much design history
with regard to subject matter, and it is also different in that, through the
reviews, it engages in a critical dialogue with the empirical material
which constitutes the discourse.
The primary source material that is reviewed consists of theses,
dissertations, reports, papers in conference proceedings, and articles
published in academic journals, in design journals, and in printing trade
journals.2 I am not implying that this material necessarily represents a
‘complete’, well defined and coherent historical discourse, if such a thing
ever existed. Nevertheless, from the perspective of this thesis the
dispersed texts in question all orbit around the same or similar
questions, employ the same or similar concepts and methods, and
furthermore, intersect each other in various ways.
My aim has been to combine the approach of a more traditional
‘disinterested’ historical study with an ‘engaged’ critical dialogue with
the source material. However, the bias is towards the latter approach,
by focusing more on ‘worklike’ aspects of the source-material than on
‘documentary’ aspects.3
2. Sometimes the same piece of research appears in several guises. However, by using
the expression ‘in several guises’ I do not imply that meaning relies on linguistic
content alone and that concrete instantiations of a research ‘paper’ are arbitrary and
without importance for interpretation. It makes of course a great difference whether a
research paper is published or not, how it is published, when it is published, and by
whom it is published. The source material’s bibliographic history is therefore of great
importance. Jerome McGann cogently makes this point in his writings on literary
studies, textual criticism and bibliography. See for example McGann 1988a, as well as
the condensed argument in his review essay ‘Theory of texts’ (McGann 1988b).
3. For a stimulating advocacy of such a combined approach to historical writing (that is, especially in the context of intellectual history or history of ideas), see Dominick LaCapra’s essay ‘Rethinking intellectual history and reading texts’ (1983, esp. pp. 27–33, 61–69). ‘Documentary’ refers to the referential function of historical sources with regard to empirical reality. ‘Worklike’ refers to the intellectual content of historical sources and questions of interpretation and dialogue. LaCapra points out the duality between the ‘documentary’ and the ‘worklike’, both dimensions being present in parish registers as well as in philosophical texts. It is worth pointing out
that it is exactly a combined approach LaCapra advocates; he does not advocate
abolishment of documentary or empirical procedures. His position might be
understood as a negotiation between more traditional history writing and intellectual
history’s more traditional interpretation-biased approach. There is however the
possibility that focusing too much on ‘worklike’ aspects of the source material will
The thesis is idiographic, in that it approaches the ‘legibility
discourse’ in question in its historical uniqueness. That is, by historically
tracing, documenting, describing and discussing the socio-behavioural
research in question, and further by contextualizing4 the research by
looking into its dissemination and reception.
The thesis is also to some extent generalising,5 that is, by
suggesting regularities based on a systematic dialogue with the primary
source material in question: 28 typeface legibility studies. In these
studies the typical mode of discourse is to refer briefly to the face results
of a few pieces of previous research on the same topic, shown to be either
heterogeneous or homogeneous, followed by a few critical remarks about
technical aspects of the predecessors’ operational methods, before
presenting the author’s own research on the same topic, which purports
to straighten out the matter.
The systematic dialogue with the primary source material is
performed not only by explanation and contextualisation but also by
critically examining the research on its own terms with regard to
construct validity, internal validity and external ecological validity,6
as well as by questioning its appropriateness or relevance from a
perspective ‘outside’ the research paradigm. However, I am primarily
focusing on the internal validity (the essential intra-paradigm condition
for meaningful experimental research), by focusing on the typographic
yield something ahistoric, showing disregard for the source material’s ‘pastness’. For
criticism of LaCapra, see for example Thompson 1993, where the author accuses
LaCapra of conveying a confused concept of historicity and advocating techniques of
interpretation that are not properly historical. In a critical assessment of what the
author perceives as pros and cons of literary reception theory history (cf. Hans Robert
Jauss) on one hand and the so-called ‘new’ history of political thought (cf. Quentin
Skinner) on the other hand, Thompson identifies one of the cons of the reception
theory history with LaCapra’s approach reception theory history’s attempt to
combine literary history with literary history. Nevertheless, in contrast to Thompson, Femia’s (1981) attack on the historiographical position of Quentin Skinner can be read, although implicitly only, as a vigorous support of LaCapra’s position.
4. For interesting discussions on the role of ‘context’ in historical writing, pointing to difficulties posed by the textuality and constructedness of ‘context’, see Bal and Bryson 1991, pp. 176–180; Bryson 1994; as well as LaCapra 1983, pp. 26–27.
5. Note however that the thesis has no ambitions of synthesising arguments about
typeface legibility or ‘findings’ from experimental research.
6. For definitions of these concepts, see the section ‘Terminology’, in chapter 5, ‘A review of empirical studies’.
‘stimulus material’. This strategy is based on an assumption that
undetected confounding factors that pose a threat to the internal validity
of the experiments easily reside in the ‘stimulus material’; for example
typographic variables that unintentionally vary systematically with the
independent experimental variable. The reason for focusing on the
stimulus material is that this is exactly where I can utilise my expertise
as a typographer.
By questioning both the technical validity and the appropriateness
of this legibility research, the thesis becomes an epistemological study
about knowledge construction in typography. Thus the question of ‘the
legibility of sans serif typefaces’ becomes a manageable lens (or put
another way: an extensive empirical and historical instrumental case
study)7 for posing important epistemological questions about a certain
kind of scientific knowledge, its claim to truth, and its interaction with
quality judgements in the world of design. This knowledge is
presupposed to be not only valid but also superior to craft-based and
designerly domain knowledge, and it is very often produced with the
explicit aim of guiding practical design activities within the fields of
typography and information design.
It is somewhat unusual that a large part of a thesis is organised in
such a way that the primary source material (the legibility studies) is
reviewed as discrete units (one by one in a chronological order) and that
the review is not organised as a category-based analysis around
recurrent topics. The approach is not chosen as an easy way out, and
should not be confused with something encumbering the reader with
tedious ‘raw data’ or a tedious record of the researcher’s way through his
source material. The intention has been to do justice to the intellectual
content and possible ramifications of each piece of research in a dialogic
way. Furthermore, the intention has also been to avoid on one hand a
more or less critical but reductive ‘literature-review-like’ summarisation
of research ‘findings’; and on the other hand to avoid a reductive
treatment of this historical source material as mainly referential
documentary evidence.
7. For a useful discussion on the characteristics of ‘case studies’, see Stake 1994.
The approach will admittedly to some extent benefit (or suffer) from
the advantage of hindsight. Nevertheless, as numerous writers of both
historiography and hermeneutics have repeatedly and forcefully pointed
out: any writing based on material of the past necessarily projects values
or knowledge of the present, and further, that historico-temporal
distance does not necessarily imply myopic ‘presentism’, but rather
represents a productive source of understanding. It is partly the
intellectual content of the source material, partly its claim to truth and
partly the fact that the source material is frequently treated as having
more or less transhistorical validity, that demands the approach I have
chosen (the prolific legibility researcher Miles Tinker once admitted that
‘the findings are not necessarily good for all time’ (1965, p. 122)). That
means appearing judgemental in many instances; however, not without
an awareness of the danger of displaying overt enthusiasm or righteous
indignation in an anachronistic or ahistoric manner. Although entering
into a ‘dialogue’ with the source material, my aim has been to avoid
examining it as if in a dehistoricised vacuum, independent of time and
conditions of production.8
The thesis represents an attempt to contribute to a self-reflective
conversation about certain aspects of the history of modern typography
and information design, examining epistemological and theoretical pre-
suppositions that may have relevance to design work or that even in
some instances may have guided design work.
Legibility research has during the last few decades been rejected by
several academics as well as by several designers, although most often in
a sweeping manner only. The usefulness and adequacy of a construct like
‘legibility’ has also to a certain extent been questioned. However, this
8. Examples of an emerging body of historiographical writing on the method, philosophy,
content, role and purpose of design history are: Dilnot [1984] 1989; Margolin 1988; Walker 1989; Margolin 1992 [1995], 1998; Visible Language, vol. 28, no. 3, 1994 (special issue on graphic design historiography); Design Issues, vol. 11, no. 1, 1995 (special issue on design historiography); as well as the introduction to Robin Kinross’ Modern typography: an essay in critical history, 1992, pp. 7–14. For a reasonably
balanced but also vivid account of contemporary historiographical discourse, history
writing and the so-called ‘linguistic turn’, see Samuel 1991/1992. Samuel’s account
deals with problems of historical representation but nevertheless refuses to accept a
‘post-modern’ denial of the existence of a past independent of written representation
or claims that reality is a product of discourse.
relative loss of legitimacy should not force researchers working on the
history and theory of type and typography to exclude this phenomenon
just in order to avoid guilt by association. It should be pointed out that
although most legibility studies can be seen as atheoretical in their scope
from a behavioural science perspective, legibility studies have, at least in
quantitative terms, constituted an important if not major part of 20th-
century attempts to bring some kind of theory to typography.9 So, there
should be a place for a historical and critical involvement with the
question of legibility, a question that researchers, designers, and
perhaps not least design educators, have paid, and still pay, much
attention to. Furthermore, the word ‘legibility’ today often surfaces in
graphic design discourse, with a variety of notions and motives attached,
often revealing a salient lack of historical knowledge. Aside from that,
the past authority of legibility research and its methodology still lives on
in some academic and other contexts. It is therefore necessary to place
this research into a larger context in order to gain understanding of its
past and current status, validity and function.
[Figure 1 consists of the same short sample passage set in each of the four typefaces named in the caption.]
FIGURE 1. Text set in the serif (roman) typefaces Times and Baskerville (11 points nominal size), and the sans serif typefaces Arial (10 points nominal size) and Gill Sans (11 points nominal size).
9. Michael Macdonald-Ross even equates ‘legibility research’ with ‘typographic research’
(1994, p. 4691).
2 The construct of legibility
Terminology and theoretical definitions
Legibility has to do with the effect of design variables on reading
performance, that is, ‘the effect of different typographical arrangements
on the reader’s ability to carry out the reading task most easily, comfort-
ably and effectively’ (Katzen 1977, p. 8). This seems as good a summary
as any, and as wide in scope. However, before going any further, an
attempt to clarify the sometimes confusing term ‘legibility’ will be made.
The term has been used when measuring (under a variety of
conditions): the speed of reading of continuous text (combined with
various kinds of comprehension tests), and the visibility or perceptibility
of words or letters (either grouped or isolated, in either sense-making or
non-sense combinations, at a variable distance, under variable illumi-
nation, or under variable time of exposure). Thus, when assessing the
literature on the ‘legibility’ of sans serif typefaces we have to discrimi-
nate between studies where ‘legibility’ denotes the speed or ease of
reading of continuous texts under ‘normal’ reading conditions; and where
‘legibility’ denotes the visibility or perceptibility of isolated displays of
letters or words read at a distance or under other constrained reading
conditions; and where ‘legibility’ denotes the comparative perceptibility
of individual letters within the same typeface; and where ‘legibility’
denotes something else. R.L. Pyke, in his extensive review of early
legibility research, suggested in 1926 that the term ‘legibility’ should be
reserved to the reading of continuous text. However, this point of view
alone does not answer the question of what operational criteria to apply:
To read means to obtain meaning from written or printed symbols. But
does ‘easily’ mean accurately, or rapidly, or both? Or does it mean with
little effort, and if so, does that mean without eye-strain or without
fatigue … ? It does not follow, because you read accurately, or rapidly, or
both, that you do so without effort …; nor that you read accurately
or fast because over a long period you read without eye-strain or fatigue.
To decide which of these terms to retain as elements of our definition apply the test of reductio ad absurdum. (Pyke 1926, pp. 25–26)
From the late 1930s the term ‘readability’ has to a certain extent
been used more or less interchangeably with the term ‘legibility’.10
Around 1940 the influential legibility researcher Miles Tinker as well as
the somewhat less influential legibility researcher Matthew Luckiesh
preferred the term ‘readability’ to the term ‘legibility’ to describe their
prolific work even though they based their ‘readability’ construct on
different theoretical and operational criteria. In correspondence between
Tinker and Luckiesh in 1943, quoted by Sutherland (1989, pp. 77–81),
both claimed to have used the term ‘readability’ first. In this argument
Tinker explicitly pointed to his and Donald Paterson’s book How to make
type readable, published in 1940.11 Nevertheless, Luckiesh and Moss
had already used the term ‘readability’ in the title of an article published
in 1938 (i.e. Luckiesh and Moss 1938).
It is likely that Tinker wanted to use the term ‘readability’ as a
narrower term as opposed to the then ambiguous and broadly embracing
term ‘legibility’: ‘readability’ denoting ease or speed of reading of
continuous text under ‘normal reading conditions’ only (Tinker 1944,
p. 385). The fact that Luckiesh and Moss used the term ‘readability’ as a
broad embracing term complicated the picture (see Luckiesh and Moss
1942, pp. 47, 390, 393).
However, formulas based on lexical, syntactic and semantic
variables for measuring comprehensibility of educational and
10. I am here disregarding occasional historical precedence, in literature and elsewhere,
for various usages of the two terms, including interchangable usage (cf. OED 1989).
11. In Tinker and Paterson’s book the terms ‘legibility’ and ‘readability’ were used
interchangeably, both terms referring exclusively to the reading of continuous text
(see Paterson and Tinker 1940, p. xvii).
instructional reading material in order to match the abilities of specific
groups of readers, which first had appeared in the 1920s, and which from
the mid 1930s more and more often were referred to as ‘readability
formulas’,12 had by the 1940s and 1950s become very influential. This
made it difficult for Tinker and like-minded researchers to maintain the
term ‘readability’ for their area of research (Tinker 1963, p. 4;
Sutherland 1989, p. 108). In the summary of his oeuvre, Legibility of
print (1963), Tinker acknowledged the confusion and stated that he had
now confined himself to the term ‘legibility of print’. This acknowledge-
ment was accompanied by the fact that both his Legibility of print and
George Klare’s influential The measurement of readability were published
the very same year in the same series by the same publisher. Tinker,
who himself mainly relied on the speed of reading (of continuous text,
combined with a comprehension test) in his legibility studies, in 1963
suggested that ‘legibility’ was best used as a broad comprehensive term
and that ‘each of several research techniques contributes something of
value to the study and understanding of legibility.’ Tinker suggested that
legibility ‘is concerned with perceiving letters and words, and with the
reading of continuous textual material’ (Tinker 1963, pp. 5, 7). Thus, in
this respect, Tinker was back at square one, in exactly the ambiguous
situation he tried to avoid some twenty years earlier.
In spite of all this, Herbert Spencer’s research unit established at
the Royal College of Art in London in 1966 was named the ‘Readability of
Print Research Unit’. However, Spencer did not refer to ‘readability’ in
his book The visible word, where he exclusively used the term
‘legibility’, that is, as a broad embracing term (see Spencer 1969). Later,
sometime between 1977 and 1979, the research unit at the Royal
College of Art changed its name to the ‘Graphic Information Research
Unit’.
In 1965, the researcher Christopher Poulton seems to have
preferred the term ‘readability’ to refer to the reading of continuous text
(i.e. in accordance with Tinker’s earlier preferred usage) and ‘legibility’ to
refer to the recognizability of individual characters (cf. Cheetham and
12. See Klare 1963, especially the extensive bibliography; and Zakaluk and Samuels
1988.
Grimbly 1965, p. 68; and Poulton 1965). However, only three years later,
in 1968, Poulton seems to have changed his mind, and by then he used
the term ‘legibility’ to refer to the reading of continuous text (i.e. in
accordance with Pyke 1926), as well as for search tasks (Poulton 1968).
In 1968 Jeremy J. Foster pointed out that the terms ‘legibility’,
‘visibility’ and ‘readability’ needed to be standardised and that ‘only
recently have research workers bothered to make the distinction explicit
(for example: Zachrisson, 1965).’ (Foster 1968, pp. 279–280). Foster
suggested that ‘legibility’ should be reserved for ‘the ease with which
running text matter can be understood under normal reading conditions’
(i.e. in accordance with Pyke’s (1926) and Poulton’s (1968) suggestions;
as well as Zachrisson 1965), and that ‘visibility’ should be reserved for
the ‘identifiability of a printed character or form’ (Foster 1968, p. 279).
However, only five years later, still hanging on to his definition of ‘visibility’, Foster, now reflecting the many new areas of investigation at the time, uses the term ‘legibility’ as a very broad, inclusive and more figurative term, embracing not only ‘running text matter’ on paper but also information on all kinds of graphic displays; not only with regard to medium and substrate, but also with regard to the various modes of symbolisation and configuration of various media types:
Legibility research is concerned with studying the effects of visual
information format on the responses made to it by the reader. It
therefore embraces not only typography, but also the use and design of
signs, illustrations, maps, symbols, colour-coding systems’ … (‘Visibility’
is used in a more restricted sense, to refer to the ease with which a form,
or character can be identified, and ‘readability’ usually denotes stylistic
complexity of text). (Foster 1973, p. 20)
Foster acknowledges that the definition of ‘legibility’ is ‘too wide’, and he therefore, in order ‘to make sense of the field’, tabulated a classification system to map legibility research: Here, either
characteristics of the message (text, alphanumerics, or non-verbal
marks) or characteristics of the environment (e.g. illumination or state of
motion) represent the independent variables of most interest to the
experimenter. Foster leaves out characteristics of the reader as an
independent variable and places research where characteristics of the
reader are the independent variable in an even wider context than
‘legibility research’, which he calls ‘visual communication research’.
Foster pointed out that ‘traditionally, most legibility research has
compared typographic designs in terms of their effects on recognition of
characters or reading of texts’ (p. 21). This kind of legibility research is
what I hereafter will refer to as ‘traditional legibility research’. The
studies into the relative legibility of sans serif typefaces as compared to
serif (roman)13 typefaces, which are examined in this thesis, can all be
described as ‘traditional legibility research’.
In 1980, Foster in his discursive bibliography of ‘legibility research
1972–1978’ still uses the term ‘legibility’ as a broad term not only
embracing the reading of continuous text and the visibility of words and
letters, but also ‘reading’ and comprehension of symbols and signposting
systems, illustrations in text, typographical factors in cartography,
graphs, numerical tables, forms and algorithms. However, on page 8,
where he equates ‘symbol identifiability’ with ‘visibility’, he equates ‘text
reading performance’ with ‘legibility’ (Foster 1980).
As late as 1987, the International Standards Organisation in a
proposal for an international standard for computer screen ergonomics,
which included recommendations for ‘legibility testing of visual display
screens’, used the terms legibility to refer to intrinsic characteristics of
typefaces and readability to refer to the quality of the typographical
arrangement on the page (Östberg et al. 1989, p. 146). This is in accord
with the usage of the English typographer Geoffrey Dowding and the
American typographer Marshall Lee (cf. Dowding 1957, pp. 1–8; Lee 1979, pp. 89–90).
An illegible type, set it how you will, cannot be made readable. But the
most legible of types can be made unreadable if it is set to too wide a
measure, or in too large or too small a size for a particular purpose.
(Dowding 1957, p. 5)
In accordance with the ISO 1987 (and Dowding and Lee’s) definition
of legibility, but not in accordance with their definition of readability,
type designer Walter Tracy with subtlety (and in accordance with
Poulton’s 1965 definitions), refers to legibility as denoting the clarity of
13. The expressions ‘serif typeface’ and ‘roman typeface’ are used interchangeably in this
thesis.
single letters of a typeface, whereas readability refers to the ‘visual
comfort’ provided by the quality of typefaces when applied in long
stretches of continuous text, what he refers to as ‘two different aspects of
visual effectiveness’ (Tracy 1986, pp. 30–32). Tracy exemplifies his
definitions the following way:
The difference in the two aspects of visual effectiveness is illustrated by
the familiar argument on the suitability of sans-serif types for text
setting. The characters in a particular sans-serif face may be perfectly
legible in themselves, but no one would think of setting a popular novel
in it because its readability is low. (Those typographers who specify a
sans-serif for the text columns of a magazine may be running the risk of
creating discomfort in the reader to the ultimate benefit of a rival
journal.) (Tracy 1986, p. 31)
To make things even more complicated: some typographers are
using the term ‘readability’ to refer to how pleasurable and interesting
(but also how comprehensible) the subject matter of a text is, while using
‘legibility’ as a broad embracing term like in Tinker’s 1963 usage (see for
example Palmer 1979–1981; Rannem 1981). This usage of the term
‘readability’ is in accordance with lay everyday usage (think of the word
‘readable’), as well as historical precedence (see OED 1989).
As we can see from the exposition in this section, the use of the term
‘legibility’, and ‘readability’ for that matter, is far from straightforward
and agreed upon.14
Nevertheless, it is not unique to legibility research to have problems
with a mass of different definitions of what the object of study is, and
how to measure it. There is a great diversity of theoretical definitions of
constructs in the sociobehavioural sciences, and this applies not least to
empirical operational definitions. It follows from this that different
measures generate different results (see Pedhazur and Schmelkin 1991,
p. 171).
14. Not to mention an ever more figurative use of the word ‘legibility’, particularly with
regard to the built environment (paralleling the ever more figurative use of ‘to read’).
Cf. also Abraham Moles’ modernist and socially concerned notion of graphic design as
a means for making the man-made world as accessible, understandable, and
intelligible as possible, introducing his concern with the phrase ‘the legibility of the
world’ (Moles 1989).
Ergonomics as a framework
In the above-mentioned article ‘Legibility research: the ergonomics of print’ (Foster 1973), the author suggested that legibility research can best be described as research into the ergonomics of print. He also
suggested that legibility research ‘may be seen as one branch of
ergonomics’.
Since the mid 20th century the term ‘ergonomics’ (or, especially in the USA, ‘human factors’) has been used to describe research and other activities related to the design, adjustment, and organisation of tools, procedures, systems, and work environments, but also consumer products and any man-made environment, in accordance
with the biological characteristics of humans. Since legibility research is
preoccupied with the interaction between the physical appearance of
graphic displays and the perceptual capabilities of humans, the term
‘ergonomics’ seems appropriate to broadly describe and contextualize
legibility research.
Another useful contexualization of legibility research is provided by
Robert Waller. He borrows a framework of three ‘domains’ from
educational theory related to the articulation of objectives for an
educational outcome. While outlining typographic research undertaken
by applied psychologists he suggests that legibility can be labelled as
belonging to ‘the psycho-motor domain’ as opposed to semantic and
aesthetic dimensions of typography belonging to ‘the affective domain’,
and typographic cueing belonging to ‘the cognitive domain’ (Waller 1987,
pp. 28ff; Waller 1991, pp. 341ff).
Operational definitions
The existence of multiple operational criteria (methods) indicates that a
construct (for example ‘legibility’) and its operational definition15 (the
actual method employed) are not synonymous. Operational constructs
15. ‘Operational definitions’ are sometimes referred to as ‘empirical definitions’ in the
research literature.
are by nature circular. A construct is defined in terms of the operations
necessary to measure it, and the measurements are defined to be
measures of the psychological construct.16 The emerging criticism of
positivism in the social sciences in the 1960s focused, among other things, on the easygoing use in psychology of operational definitions divorced from theoretical definitions, and furthermore, on the charge that ‘operationalism’ limited psychology to being nothing but experimental psychology.17 Nevertheless,
useful psychological experiments for evaluating legibility must in the
last instance be based on convenient operational criteria, and a too
persistent insistence on theoretical definitions of all constructs carries
the seeds of infinite regress.18
Tinker (1963, pp. 9–31) and Zachrisson (1965, pp. 44–71) outline
and thoroughly discuss various operational definitions, that is methods,
techniques, or procedures applied by researchers to measure legibility.
Shorter but useful overviews and discussions can be found in Spencer
1969 (pp. 21–24), in Reynolds 1984, and in Wendt 1994 (pp. 277–283, 291–293). Foster presents a discursive outline of a wide variety of
operational methods employed by legibility researchers in the 1970s,
including reading speed combined with comprehension tests (1980,
pp. 4–8). A critical discussion on how the results of various legibility methods relate to each other is given by Salcedo et al. (1972).
The various methods will be discussed or touched upon when
necessary while examining each individual legibility study in the next
section. However, a discursive overview of some of the most common
methods of traditional legibility research, as well as some ad hoc
‘typeface topology’ methods almost exclusively used in typeface studies,
is provided below in order to serve as a clarifying point of reference.
The various methods can be grouped as depending on either
‘outcome measures’ or ‘process measures’ (Schumacher and Waller 1985;
Dillon 1994). However, I find these labels awkward to use in a
meaningful and consistent manner when trying to classify typeface
16. For accounts of the historical role of ‘operationalism’ (or ‘operationism’) in psychology,
see Rogers 1992; Bickhard 1992; Green 1992; and Koch 1992.
17. See Skjervheim 1963.
18. See Pedhazur and Schmelkin for a useful discussion on ‘theoretical’ and ‘empirical’
definitions (1991, pp. 164–179).
legibility studies. These labels are better used for describing a facet of
individual methods. For a broad categorisation of operational methods
and types of legibility studies, I find the following categories helpful:
‘Experimental performance studies’, ‘subjective preference studies’, and
‘typeface topology studies’.
Experimental performance studies
Continuous reading
Speed of reading. This method (the quicker the better, the slower the
worse) is probably the most frequently used method for measuring
legibility and it was also the preferred method of Tinker and Paterson.
‘Reading speed is the best measure of type quality that has been found’
(Rubinstein 1988, p. 36). The method has been applied in various ways:
with either time or amount of reading as invariants, and has often been
combined with various forms of comprehension checks. Most often it has
dealt with silent reading. Aside from criticism of Tinker’s methods by his
‘arch-rival’ Luckiesh, both Zachrisson and Poulton critically discuss
various aspects of Tinker’s combined method based on the Chapman-
Cook reading test. Zachrisson and Poulton are particularly critical of
Tinker’s comprehension measure. Zachrisson thinks it is too much a
measure of comprehension, while Poulton thinks that it is hardly a
measure of comprehension at all, since comprehension is only required
above a minimum level.19 Even if substantial differences in speed can be
found in a legibility study, such differences may not decide the picture.
For example: Don Bouwhuis suggests in an interesting study that elderly
readers may need larger letters, but that they will read more slowly as
a result (Bouwhuis 1993).
19. See Zachrisson 1965 (p. 46); Poulton 1960, 1968 (p. 72). Rayner and Pollatsek,
although in the context of ‘readability formulas’, present a useful discussion on
problems of what comprehension is and how possibly to measure it (1989,
pp. 316–320). For a recent study on comprehension as a ‘psychophysical measure of reading performance’, see Legge et al. 1989. The authors conclude that comprehension is a poor ‘psychophysical measure’.
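To make the arithmetic of the speed-of-reading measure concrete, the following short Python sketch illustrates the kind of rate computation involved when the amount of reading is held constant and time is measured, with comprehension used as a crude filter. The data, the comprehension threshold and the function names are invented for illustration only and are not drawn from any of the studies reviewed in this thesis.

def words_per_minute(word_count, seconds):
    # Reading rate when the amount of text is fixed and the time is measured.
    return word_count * 60.0 / seconds

# Hypothetical raw observations: (typeface condition, words read, seconds
# taken, comprehension score out of 10) for a handful of subjects.
observations = [
    ('serif', 400, 95.0, 8),
    ('serif', 400, 102.0, 9),
    ('sans serif', 400, 98.0, 8),
    ('sans serif', 400, 105.0, 7),
]

MIN_COMPREHENSION = 6  # discard trials falling below a minimum comprehension level

rates = {}
for typeface, words, seconds, score in observations:
    if score >= MIN_COMPREHENSION:
        rates.setdefault(typeface, []).append(words_per_minute(words, seconds))

for typeface, values in rates.items():
    mean = sum(values) / len(values)
    print('%s: mean reading speed %.1f words per minute' % (typeface, mean))

The point of the sketch is only that the measure itself is simple; whether a difference between such means can be attributed to the typeface at all is the question of internal validity pursued in chapter 5.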
Eye movements. Measurements of oculomotor behaviour have also been
used in legibility studies. For the sake of clarity, ‘eye movements’ refers here only to eyeball motion and not to contraction and expansion of the lens. Several automatic unconscious eye movements during reading have been identified: saccades (frequent ballistic movements or jumps along the line), fixations (when the eye is nearly20 immobile and visual information is being extracted by foveal vision; fixations make up 90–95% of the total reading time), regressions (alternatively: regressive saccades; when the eye goes back and re-reads characters or words), and return
sweeps (when the eyes move from somewhere near the end of one line to
somewhere near the beginning of the next line).
The perceptual span, according to Rayner and Pollatsek (1989), refers to the effective visual field; that is, the amount of text or number of characters perceived in one fixation (5–20 characters, depending on contrast from low to high). In O’Regan’s and Legge’s alternative measure, the visual span, the corresponding number of characters (those recognised at each glance) is 2–10, depending on contrast from low to high. Nevertheless, the point is that the eye’s acuity is increasingly
diminished just outside the centre of fixation (that is, starting to
diminish already within the effective visual field).21
Parafoveal information extraction refers to the repeated glimpses of
text at a coarser scale ahead of foveal fixation. The psychologist
Raymond Dodge theorised as early as 1907 that ‘It seems probable that
the normal reading pause [i.e. the fixation] represents a comparatively
late moment in the total process of perception of the fixated object’
(Paterson and Tinker 1947, p. 388).
The eye movements in question were first identified and described
in the latter part of the 19th century by the French ophthalmologist
Emile Javal. His ‘observation that eye movements in reading occurred in
jumps (“per saccades”) contrasted sharply with the prevailing wisdom of
the time’ (Venezky 1984, p. 7). Today, cognitive psychologists working in
the field of reading psychology attempt to use eye movement data as
20. The eye oscillates continuously, something which is assumed to be necessary for
seeing (Rubinstein 1988, p. 27).
21. See Morrison and Inhoff 1981; Rayner and Pollatsek 1989, pp. 113–152; O’Regan
1990; and Legge et al. 1997.
indices of the cognitive processes believed to be involved in reading, as
well as for studying visual perception in general (O’Regan 1990).
However, researchers are also aware that, since eye movements are sequential and cognition is likely not to be, there can be no
simple correspondence.22
The length of saccades (the longer the better), the frequency of
regressions (the fewer the better), the frequency and duration of
fixations (the fewer and shorter the better), and the width of the
perceptual span (the wider the better), have been viewed as indices of the relative ease or difficulty of discriminating characters and reading the test material. For example, a text of low contrast will make the visual span shrink: reading speed slows down because fewer characters are recognised at a glance, and the reader advances in smaller but more frequent steps (Legge et al. 1997). In other words, the oculomotor
measures mentioned are determined by the visual features of text.
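The oculomotor indices just listed can be made concrete with a small Python sketch. The fixation record below is invented (positions in character units, durations in milliseconds) and the computation is deliberately naive; it only shows how fixation count, mean fixation duration, mean forward saccade length and regression frequency fall out of a sequence of fixations on a single line of text.

# Invented record of successive fixations on one line of text:
# (horizontal position in character units, duration in milliseconds).
fixations = [(2, 220), (9, 200), (17, 240), (14, 180), (22, 210), (30, 230)]

fixation_count = len(fixations)
mean_duration = sum(duration for _, duration in fixations) / fixation_count

# Differences between successive fixation positions: positive values are
# forward saccades, negative values are regressions.
saccades = [b[0] - a[0] for a, b in zip(fixations, fixations[1:])]
forward = [s for s in saccades if s > 0]
regressions = [s for s in saccades if s < 0]

mean_saccade_length = sum(forward) / len(forward)        # in characters
regression_frequency = len(regressions) / len(saccades)  # proportion of saccades

print('fixations: %d, mean duration: %.0f ms' % (fixation_count, mean_duration))
print('mean forward saccade: %.1f characters' % mean_saccade_length)
print('regression frequency: %.2f' % regression_frequency)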
Tinker, who extensively recorded eye movement data while
performing his impressive range of legibility experiments, which
involved more than 30,000 subjects, claimed ‘that eye movement
measures provide an excellent supplement to reading performance’
(1963, p. 26). Paterson and Tinker actually photographed23 the eye
movements and recorded the number of fixations, the numbers of words
per fixation, the duration of pauses, the perception time, and the number
of regressions. They also published many papers explicitly on eye
movement measures.24 In a thorough survey of work on ‘visual factors
22. Two collections of papers on eye movements in reading are Rayner 1983, and Kowler
1990. Recently the journal Vision Research hosted a heated debate on whether wordspaces guide eye movements or not, and, for that matter, whether wordspaces are important in facilitating easy reading or not (see Epelboim et al. 1994; Rayner and Pollatsek 1996; Epelboim et al. 1996). Anyway, see Paul Saenger’s interesting account of the historical role of wordspaces for the facilitation of easy reading (1997).
23. However, as Andrew Dillon points out, use of eye movement recording equipment is
rarely non-intrusive (1994, p. 38). See for example the picture of a subject and the
‘Minnesota eye movement camera’ in Paterson and Tinker 1940a (inserted between
pp. 4 and 5). However, current methods are less intrusive.
24. For example Paterson and Tinker 1940b (‘Influence of line width on eye movements’); Paterson and Tinker 1947 (‘The effect of typography upon the perceptual span in reading’); and Tinker and Paterson 1955 (‘The effect of typographical variations upon eye movement in reading’).
and eye movements in reading’, published in 1981, Robert E. Morrison
and Albrecht-Werner Inhoff conclude that:
Although comprehension requirements (e.g., skimming, reading for
detail) and text difficulty may also have effect on eye movement
measures within readers, and despite wide differences in eye behaviour
between individuals, the oculomotor effect of the physical characteristics
of text are undeniable and should not be ignored. (p. 143)
The explanatory and theoretical potential of focusing on eye
movement measures in legibility studies becomes clear when the
typographic variable line length is considered. Morrison and Inhoff, like
researchers before them, focus on the important role of parafoveal
information extraction in reading, and Tinker and Paterson accordingly
theorised that short lines prevented the full use of parafoveal vision.
This was supported by the observation that in the reading of lines with
‘optimal line length’ the fixation duration decreases over the last half of a
line, presumably due to advance parafoveal information extraction. With
line lengths beyond the ‘optimum’, regression frequency increases, and
with extra long lines Tinker found problems with the execution of the
return sweeps to the next line.25 Tinker also found that interlinear spacing
expands the acceptable range of line widths. Mackworth (1965) theorised
that extended interlinear spacing probably reduces lateral masking and
widens the perceptual span. ‘Lateral masking’ refers to the decreased
perceptibility of stimuli when surrounded by adjacent contours. This is
supported by the observation that performance increases drastically
when letters are presented in isolation within parafoveal vision, and
decreases when not in isolation (Morrison and Inhoff 1981, pp. 135–136).
Mary Dyson and Gary Kipping have recently carried out three
studies on screen legibility.26 These studies involved the variable paging
versus scrolling and typographic variables such as number of columns
and line length. Surprisingly, Dyson and Kipping found that reading was
faster at longer line lengths.27 They point out that this is in line with
25. See Paterson and Tinker 1940b; Tinker and Paterson 1955; and Morrison and Inhoff
1981.
26. Interestingly, these legibility studies are among several legibility studies that
recently have been supported by the Microsoft Corporation.
27. See Dyson and Kipping 1997, 1998a, 1998b.
early findings on reading from screen (i.e. Duchnicky and Kolers 1983),
but contradict traditional findings from print legibility research, such as
Tinker’s (the authors refer to Tinker 1963). It also contradicts common
craft knowledge, as well as the theoretical arguments of Morrison and
Inhoff, referred to above. While discussing their findings Dyson and
Kipping suggest that it may be the screen medium that causes the sur-
prising result; we tend to sit further away when we read on screen, thus
allowing for a wider visual angle (1998b, pp. 151–152).28 Dyson and Kipp-
ing also suggest that the time spent in many more return sweeps for text
set in short line lengths can partly explain their findings (1998a, p. 303).
Nevertheless, in one of their studies they admit that very narrow line
lengths have been compared with very long line lengths, and that none of
them have been compared with moderate line lengths (1997, p. 711).
Introducing moderate line lengths (as Paterson and Tinker did in their
1940b study) could very well have altered the results or put them in a
different light. Interestingly, the authors point out that the longer lines
were (subjectively) judged by the subjects as least easy to read.
Anyhow, it is not my intention to dwell on line lengths or eye
movements. My intention is simply to foreground that eye movement
recordings could seem to provide a basis for a more fertile and theory-
inclined approach to legibility studies than for example theory-exempt
speed of reading studies (in spite of contradictory findings).
Blink rate (the rate of involuntary eyelid blinking). This method was
used extensively by Luckiesh and Moss. Tinker, in numerous writings,
as well as other researchers,29 strongly contested the validity and
reliability of this method. Foster seems to reflect a consensus when he
refers to the method as ‘the now-discredited blink-rate method used by
Luckiesh and Moss’ (1980, p. 8). The controversy between Tinker and
Luckiesh has recently been described, although biased in Tinker’s
favour, by Sandra Sutherland, who provides an interesting look into the
personal and professional relationship between Tinker and Luckiesh.
Their non-amicable and strained relationship was based on the exchange
28. See also Dillon 1994, p. 47, and further references there.
29. See for example English 1944, p. 219, and Carmichael and Dearborn 1947.
of letters and polemical articles, in which they argued back and forth against each other’s methods.30
Nevertheless, a paper recently published in Human Factors (Stern
et al. 1994), which includes a thorough review of the literature on blink
rate as a measure of fatigue (47 items, old and new, typically related to
either reading, driving, or aircraft pilot performance), contains a severe
criticism of how Tinker as well as Carmichael and Dearborn (i.e. 1947),
and others, reached their conclusions. The paper rehabilitates the
possibility of linking increase in blink rate to fatigue and time on task.
However, the authors also show that blink rate is a function of how
demanding a task is: the more demanding the task, the lower the blink
rate, and vice versa. Thus, the blink rate is significantly lower when
reading than when not reading. In order to embrace both blink rate
phenomena described here, the authors suggest that increase in blinking
(as well as increase in blink closure duration and decrease in blink
amplitude) during task performance is a result of decrease in inhibitory
control.
Threshold visibility
Variable time of exposure (also referred to as ‘speed of perception’, ‘brief
exposure’ and ‘time-threshold’). By this threshold method, test material
is exposed to the subjects for only very short periods of time, either incrementally increased (e.g. from 1/10 of a second), or by counting short
successive exposures (e.g. each of 1/1000 of a second). This is performed
with the help of specially designed instruments: tachistoscopes, also called t-scopes. This method is mainly used for ‘investigating the
legibility of individual letters and symbols, and in the field of word-
perception research’ (Rehe 1984, p. 18).
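The logic of the first variant, incrementally increased exposure, can be sketched as a simple ascending threshold procedure: the exposure time is increased in small steps until the subject first identifies the test material correctly. The Python fragment below is schematic only; the starting value, step size and simulated observer are invented and do not correspond to any particular tachistoscopic procedure described in the literature. The same ascending logic applies, mutatis mutandis, to the variable distance method described next, with the distance decreasing rather than the exposure increasing.

def ascending_time_threshold(identifies, start_ms=10.0, step_ms=10.0, max_ms=1000.0):
    # Increase the exposure duration step by step and return the shortest
    # exposure (in milliseconds) at which identification first succeeds.
    exposure = start_ms
    while exposure <= max_ms:
        if identifies(exposure):
            return exposure
        exposure += step_ms
    return None  # never identified within the tested range

# A stand-in for the observer: the letter is 'identified' once the exposure
# reaches some hidden true threshold (70 ms in this invented example).
true_threshold_ms = 70.0
estimate = ascending_time_threshold(lambda ms: ms >= true_threshold_ms)
print('estimated time threshold: %s ms' % estimate)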
Variable distance (also referred to as ‘perceptibility at a distance’ and ‘distance threshold’). This method, too, has found application primarily in
investigation into the legibility of individual letters and symbols, and
30. See Sutherland 1989 (pp. 72–87). The polemical articles in question appeared during the 1940s, most of them in the Journal of Experimental Psychology and the Journal of
Applied Psychology.
more obviously, road signs. The method is certainly applicable for
display purpose situations, such as road signs and instrument panels.
The results from visibility tests often contradict results from
experiments where speed of reading is measured. ‘Keeping the difference
between these two measures in mind helps avoid apparent contra-
dictions and paradoxes in reporting the quality of text’ (Rubinstein 1988,
p. 176).
Visibility is a function of visual angle, but within limits. Visibility
rises steadily with visual angle, though the rate of rise diminishes with
larger angles (Rubinstein 1988, p. 174ff). With sufficient magnification,
interactions with the visual field will occur, that is, details of letterforms
will start to fill up the visual field (Bouwhuis 1993).
Note however that traditional typeface design did not (necessarily)
treat as linear the transformation of shape and proportions when scaling
from size to size. That is, the different typeface sizes had individual
designs (in traditional metal technology a ‘font’ equalled one typesize, as
opposed to today when one ‘font’ may generate all applicable sizes). This
design feature is referred to as optical scaling: a smaller type size results
in heavier serifs, thicker strokes, bigger x-height with shorter ascenders
and descenders, a more open form (bigger counters), wider letters, and
larger default space between the letters. And the opposite applies for
larger typesizes. This was done in order to compensate for side-effects of
linear scaling; for example, a linear reduction to a smaller size will
create brittle, weak and anemic letterforms clogged together, posing
technical reproduction problems, perception problems, and visual
inconsistency. Based on research in visual perception, it has been
theorised that ‘such proportional changes are necessary because the
human visual system has a non-linear sensitivity to visual features of
different spatial frequencies’ (Bigelow 1989, p. 77).31 Observe that the
sophisticated knowledge of optical scaling is knowledge that is
represented in or by artefacts, that is, typefaces. This ‘within artefact
31. For good introductions to optical scaling in type design, and to contemporary
approaches to optical scaling, see the following complementary papers: Harry Carter’s
classic paper [1937] 1984; Bigelow 1981; Seybold 1991; Benedek 1991; André and
Vatton 1994; and Burke 1995/1996.
knowledge’32 is not knowledge that could be invented and generated on
the basis of experiments. It is of course possible that experiments could
validate or falsify the underlying ‘theory’. However, to find procedures
fine and accurate enough, or procedures that really measure the effect
of optical scaling and not something else, would be difficult. Note also
that the concept of optical scaling suggests that to measure type size
‘objectively’ by referring to the visual angle subtended (as in some recent
legibility research) is not necessarily as relevant as suggested.
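For reference, the visual angle measure mentioned here follows from elementary geometry (the example values below are merely illustrative):

\[
\alpha = 2 \arctan\left(\frac{h}{2d}\right)
\]

where h is the height of the letter (or its x-height) and d is the viewing distance. An x-height of 1.5 mm viewed at 40 cm, for instance, subtends roughly 0.21 degrees, or about 13 minutes of arc.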
‘Optical scaling’ can explain why in some comparative legibility
studies (on proportional width typefaces like Times vs. fixed width or
so-called mono-spaced ‘typewriter’ typefaces like Courier) the fixed width
typefaces score best in very small sizes (see the results in Arditi et al.
1970 and Mansfield et al. 1996). A fixed width typeface like Courier has
exaggerated serifs, a more uniform stroke thickness, more space between
the characters, as well as bigger default spaces between the words.
These features of Courier parallel the features of the small size ver-
sions of typefaces generated on the basis of optical scaling, and Courier is
therefore better suited than Times for linear scaling down to small sizes.
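To make the design logic concrete, the sketch below treats optical scaling as interpolation between two hypothetical ‘masters’, one drawn for small text sizes and one for display sizes; the parameter names and values are invented for illustration and are not taken from any actual typeface or from the studies cited above.

```python
# Illustrative sketch only: optical scaling treated as interpolation between
# two hypothetical design 'masters'. The parameters and values are invented.
SMALL_MASTER = {"size_pt": 6,  "stem_ratio": 0.105, "xheight_ratio": 0.54, "tracking_em": 0.030}
LARGE_MASTER = {"size_pt": 72, "stem_ratio": 0.075, "xheight_ratio": 0.46, "tracking_em": 0.000}

def optical_parameters(size_pt: float) -> dict:
    """Linearly interpolate design parameters for a requested optical size."""
    lo, hi = SMALL_MASTER, LARGE_MASTER
    clamped = min(max(size_pt, lo["size_pt"]), hi["size_pt"])
    t = (clamped - lo["size_pt"]) / (hi["size_pt"] - lo["size_pt"])
    return {k: lo[k] + t * (hi[k] - lo[k]) for k in lo if k != "size_pt"}

# Smaller sizes receive relatively heavier stems, a larger x-height and looser spacing:
print(optical_parameters(8))
print(optical_parameters(24))
```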
Other threshold visibility measures. These are based on a variety of
unfavourable conditions, such as variation of focus, presentation of the
stimulus material at an angle, poor illumination, non-optimal brightness
contrast between the text and the substrate on which it appears, and
testing of peripheral vision.
Search task
Search task. A measure directly related to the task of the reader: the
time it takes to look up something in non-continuous text like
dictionaries, timetables, telephone directories, bibliographies, or tables
is measured (Spencer 1969, p. 24). This method has not been
used much in ‘traditional’ legibility performance studies, but became
more common in the 1970s in studies that focused more on macro level
aspects of typography.
32. See Carroll 1990, pp. 277–284; Carroll and Campbell 1989; and Cross 1999. See also
the section ‘The epistemic alternative of “design-based theory”’, in chapter 6.
Subjective preference studies
Non-experimental preference studies constitute a particular genre of
comparative typeface studies. Some, but not all, of these studies are
labelled as legibility studies. Preference studies are most
often individually published, but sometimes appear as subsidiary
supplements to behavioural experimental legibility studies.33
Paterson and Tinker published a typeface preference study in order
to ‘determine the trustworthiness of readers’ opinions’ in their 1940
monograph How to make type readable: a manual for typographers,
printers and advertisers (pp. 18–20). The ten typefaces employed were
the same as had been employed in an earlier speed of reading study
(Paterson and Tinker 1932b), and were now ranked according to ‘ease
and speed of reading’. The results showed great differences among the
typefaces. However, with one exception, the textura typeface Cloister
Black, the results showed no agreement with ‘the objectively determined
results’ from the 1932 speed of reading study. Paterson and Tinker
actually suggested that ‘reader preference’ is of greater value for ‘the
printing industry’ since legibility differences according to ‘speed of
reading results’ are only slight for most ordinary typefaces. Thus, by
choosing a preferred typeface among equally legible typefaces, the text
‘will give the impression of being easily and speedily read’. However, this
argument only applies to ‘equally legible typefaces’, and the authors also
concluded that
The failure of the students to recognize that American Typewriter is less
legible than standard type faces in common use may be accepted as
impressive evidence of the difficulty and probably the impossibility of
deciding questions of legibility by mere inspection of printed material.
The results of this special study should be accepted as evidence that
mere opinions concerning matters of typography are unsafe guides
(pp. 19–20)
In a paper published two years later, ‘Reader preferences and
typography’, the authors reported some of the results on other
typographic variables from the same series of preference studies as the
1940 typeface study cited above (Tinker and Paterson 1942). On these
33. See in chapter 5, ‘A review of empirical studies’.
other typographic variables, contrary to the results in the typeface
preference study, the authors found that in many cases there was
agreement between speed of reading studies and preference studies.
What is more, here they also reported that they had obtained preference
data on both ‘judged legibility’ (‘opinions on legibility’) and ‘judged
pleasingness’ (‘aesthetic values’). They found ‘in all cases’ a close
agreement between the two preference measures.
The typeface results reported in these studies by Tinker and
Paterson have been misinterpreted by some writers, who claim that the
preferred typefaces in Tinker and Paterson’s studies were also the most
legible. They seem to have wrongly extrapolated to typefaces, the
agreement between performance and preference reported on other
variables in the 1942 article (see for example Kostelnick 1995, p. 187;
and Schriver 1997, p. 302). Nevertheless, there seems to be near
consensus in the literature that, regardless of typographic variable, user
preference or perception of ease of reading tends to be inconsistent with
user performance (see for example Spencer 1969, p. 23; Salcedo et al.
1972; and Dyson and Kipping 1998b). For empirical instances of typeface
studies where preference and performance have been measured in the
very same study, see several studies that include subsidiary preference
studies and which are assessed in the next chapter of this thesis (for
example Hvistendal and Kahl 1974; and Taylor 1990).
Typeface topology studies
By ‘typeface topology studies’ I refer to some scientific studies that
differ widely with regard to method. These studies are not based on
behavioural experiments, but rather on a priori theories of typeface
legibility, based on properties of the shape of the glyphs34 of individual
34. The common term ‘character’ is to some extent used interchangeably with the term
‘glyph’ in this thesis. However, the term ‘glyph’ has been introduced in digital
typography to distinguish the physical instantiation (the mark on a substrate) from
the abstract ‘character’. There is no one-to-one relationship between ‘glyph’ and
‘character’. The glyph A in a particular typeface can represent the character A in the
Latin alphabet and the character A in the Greek alphabet. Accordingly, the character
typefaces (belonging to either the serif or sans serif category). These
shapes are then measured, processed and compared by the researcher
according to particular operational constructs.
One example is Legros and Grant’s typeface legibility study from
1916, in which the degree of similarity in the area covered by certain
superimposed glyphs (based on physical measurements) constituted the
legibility construct.35
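Purely to illustrate how such an ‘area overlap’ construct might be operationalised with present-day tools (this is not Legros and Grant’s procedure, which rested on physical measurements), one could superimpose rendered glyph bitmaps and compute a shared-ink index; the rendering step, the font file and the particular index below are my own assumptions.

```python
# Illustrative sketch only: superimposing two rendered glyphs and computing
# the proportion of shared ink (intersection over union). Not a reconstruction
# of Legros and Grant's 1916 measurements; font path and index are hypothetical.
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def glyph_mask(char: str, font_path: str, size: int = 200) -> np.ndarray:
    """Render a single glyph as a boolean ink/no-ink bitmap."""
    font = ImageFont.truetype(font_path, size)
    img = Image.new("L", (size * 2, size * 2), 0)
    ImageDraw.Draw(img).text((size // 2, size // 2), char, fill=255, font=font)
    return np.array(img) > 127

def overlap_index(a: np.ndarray, b: np.ndarray) -> float:
    """Shared ink area divided by total ink area of the superimposed glyphs."""
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

# Hypothetical usage: how much do 'c' and 'e' overlap in some font file?
# print(overlap_index(glyph_mask("c", "SomeFont.ttf"),
#                     glyph_mask("e", "SomeFont.ttf")))
```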
Another example is Robinson, Abbamonte and Evans’ study from
1971, in which they used a computer model of human vision: the
computer processed input information about the shapes of individual
glyphs and output resulting shapes, which were then visually
inspected by the researchers.36
A third example is based on contemporary information processing
theories about human vision and the sensitivity of the human eye to
‘spatial frequencies’ of typographic material. This interest has led to
legibility studies and theories about how (the objective and measurable)
spatial frequencies of typographic material determine its legibility.37
The relative legibility of serif and sans serif typefaces has also been
theorised on the basis of their ‘spatial frequencies’, albeit briefly and in a
vague and not very convincing manner.38
A fourth example is a recent typeface legibility study (van Rossum
1997). Van Rossum has devised an operational method based on an
operational construct of legibility where images of text are blurred in a
particular way. The resulting blurred images are not exposed to
‘subjects’, but visually inspected by the researcher in order to determine
the relative legibility of the typefaces involved.39
A in the Latin alphabet can be represented by different glyphs, that is, by As from
different typefaces (see Bigelow and Holmes 1993). Compare with the similar
linguistic distinctions between ‘allograph’ and ‘grapheme’ and between ‘allophone’ and
‘phoneme’.
35. See the section ‘Legros and Grant 1916’, in chapter 5.
36. See the section ‘Robinson, Abbamonte and Evans 1971’, in chapter 5.
37. See the section ‘Legibility research today’, in chapter 3. See also Morris 1988;
Rubinstein 1988; Clark 1989; Morris 1989; Morris, Berry and Hargreaves 1993;
Hallberg 1992.
38. See the section ‘Legibility research today’, in chapter 3. See also Bigelow and Day
1983, p. 102; Morris 1988; Rubinstein 1988, pp. 43–47; Clark 1989; Hallberg 1992,
pp. 106–108; Gelderman 1999, pp. 101–102.
39. See the section ‘van Rossum 1997’, in chapter 5.
3 A century of legibility research
Introduction
The history of legibility research goes back two hundred years and
constitutes the oldest strand of reading research, although serious
research first appeared in the late 19th century (Venezky 1984,
pp. 22–23). This large body of mostly experimental research seems most
often to have been carried out by psychologists under labels such as
applied or educational psychology, and more recently, under the label
ergonomics. Research has also been carried out by ophthalmologists,
engineers, and since the 1940s under the labels ‘journalism’ and ‘mass
communication studies’, and further, since the 1960s to some extent by
typographers in co-operation with trained researchers.40
Until the 1970s legibility research was to a large extent preoccupied
with typography on a detailed level, and not the overall organisation of
larger groups of elements on the page or within a document. Thus,
research has been performed to answer questions about the relative
legibility of: 1) single characters compared to each other regardless of
typeface; 2) certain typefaces or typeface categories compared with each
other; 3) type size, interlinear spacing, and line length, i.e. the influence
of typographic micro and meso variables. However, the colour and
qualities of paper, the colour of ink, and illumination, were also part of
40. Foster (1978) presents an overview, which includes statistical information, of
scholarly journals containing papers on legibility research in the 1970s. This overview
gives an indication of the affiliations of legibility research at that time.
the repertoire of traditional legibility research. After World War II,
especially in the United States, in addition to involvement with print,
legibility research became more and more involved with, for example,
road traffic signs and military applications like trans-illuminated cockpit
displays (cf. the bibliography Cornog and Rose 1967), and since the 1960s
new text-carrying media substrates like microforms and computer
screens.
Early legibility research is admirably reviewed by Pyke (1926). The
work of Miles Tinker and Donald Paterson, the most prolific legibility
researchers ever, is reviewed by Tinker (1963). The whole area of
legibility research is eclectically and lucidly surveyed by Spencer (1969).
Legibility research carried out in the 1970s is reviewed by Foster (1980).
Foster provides a comprehensive assessment of the vivid field of
legibility research in the 1970s, although the ‘field’ by then was both
moving astray in various directions and including many new approaches.
Legibility research seems to some extent to have grown, peaked and
dwindled together with the behaviourist paradigm of psychology of this
century. It is however not possible to link legibility research directly to
behaviourist psychology. As Venezky points out while describing the
development of reading research in the early part of the 20th century,
legibility research existed before and was unaffected by the advent of
behaviourism (1984, p. 23). However, one thing behaviourism and
legibility research have in common is the focus on the controlled
behavioural experiment.
Keith Rayner claims that the emergence of behaviourist psychology
in the early 20th century unfortunately cut off an interesting strand
within experimental psychology: a strand preoccupied with the process
of reading, and with memory and language processing, linked by Rayner to
modern cognitive psychology. According to Rayner this activity reached
its peak with Huey’s acclaimed book The psychology and pedagogy of
reading, published in 1908. ‘Since cognitive processes involved in skilled
reading cannot be observed and directly measured, interest in reading
waned between 1920 and 1960’ (Rayner and Pollatsek 1989, p. 6).41
Although Rayner acknowledges Tinker’s work on eye movements in
41. See also Rayner 1981 and Venezky 1984.
reading, he dismisses most of Tinker’s work, and thus legibility research,
as dealing with ‘purely peripheral components of reading’.
The rise and fall of legibility research as a ‘research program’42 can
be described as having gone through four phases. In the first phase, from
around the turn of the century, legibility research was a visible part of
reading research. Richard Venezky points out that around the turn of
the century reading research (which then included legibility research)
was prominent in the psychological literature, and that very few
psychology journals existed: Psychological Review and American
Journal of Psychology in the USA, Mind in England, and a few in
Germany and France (1984, pp. 7, 23, 28). It is no coincidence that the
first two experimental legibility studies involving serif and sans serif
typefaces appeared in the journals Psychological Review and American
Journal of Psychology.43
The second phase is the ‘Tinker and Paterson phase’; that is, the
period from the 1920s until the 1960s. This phase is very much
dominated by Tinker and Paterson’s extremely prolific output of
typography-oriented research papers in mainstream psychology journals
like Journal of Applied Psychology and Journal of Experimental
Psychology.44 In addition, during and after the Second World War
legibility research expanded within new areas such as ergonomics and
engineering, where studies of the legibility of instrument displays and
road signs were typical research topics.45
The third phase is represented by the hectic and expanding activity
in the 1960s and 1970s, with a wider range, and at the same time also a
more dedicated range, of journals functioning as outlets for the studies
(described below).46 The fourth phase is represented by the dwindling
but far from vanished research activity since the 1970s and until today
(this is also described below).
42. Lakatos’s term ‘research program’ (1970), is here used in a loose way.
43. That is, Griffing and Franz 1896; and Roethlein 1912. Both papers are assessed in
this thesis.
44. See for example Sandra Sutherland’s biographical study of Tinker (1989).
45. See for example Cornog and Rose’s bibliography (1967).
46. See Foster 1978.
The hectic activity in the 1960s and 1970s
Curiously, in the 1960s, at the same time as behaviourist psychology
was in decline and cognitive psychology was emerging and reaching for
its later dominant status (but independently of this shift), interest in
legibility research grew rapidly. It was nevertheless a time with a spirit
of optimism, when positivism, not only as method but also
philosophically, still reigned in the social sciences. In the period from the
mid 1960s to the early 1980s legibility research witnessed a culmination
with a huge output of research papers, before it more or less vanished
from the mainstream.
The hectic activity in this period may have several explanations.
One is that many new type-carrying media appeared in this period,
many with high constraints like low resolution capitals-only displays
and print-out devices, typewriter-based ‘typesetting’, n’th generation
photocopies, microfilm, etc. Another reason, pointed out by Foster, is
that this period also coincided with the tremendous growth in the area of
ergonomics (Foster 1973, p. 20). Nevertheless, in this climate, traditional
typographic micro-variable legibility studies related to the print medium
also flourished.
Interestingly, the culminating period of the 1960s and 1970s
coincided with other comparable activities which may all be described as
attempts to rationalise typography and graphic design. I am referring to
the attempts to standardise and metricate typographic measurement.47
I am also referring to the many efforts in the making of classification
systems for typefaces.48 Furthermore, I am referring to attempts to
theorise typography by applying metaphorical models from mathematical
information theory; see for example Gui Bonsiepe’s article bearing the
suggestive title ‘A method of quantifying order in typographic design’
(1968).49 The article appeared in the same year in the journal Ulm and
in the Journal of Typographic Research. This happened around the same
47. See Boag 1996 for a comprehensive introduction to the history of typographic
measurement systems.
48. See for example Lund 1993.
49. For perceptive retrospective comments on Bonsiepe’s article, see Kinross 1985, and
Waller 1987, pp. 23–24.
time when Abraham Moles and others developed aesthetic theories
based on mathematical information theory in order to quantify style,
originality and harmony. These highly imaginative, but nevertheless
scientistic, theories used terms from information theory such as
‘redundancy’, ‘information’, ‘entropy’ and ‘noise’; terms that made more
sense as a set of eye-opening metaphorical labels than the attempts to
quantify style or originality.50
It is perhaps not a coincidence that the 1960s was also a period
when the ‘cool and rational’ idiom of minimalist and high
modernist ‘Swiss typography’ was influential and popular throughout
the western world. The 1960s and 1970s were also the
heyday of the ‘design methods’ movement,51 and, furthermore, the
heyday of the conception of design as a ‘science’, with Herbert Simon’s
The sciences of the artificial (1969) as a basic and influential text.52
And further, another relevant example could be mentioned:
‘Educational technology’ and ‘programmed learning’ were pedagogical
methods and movements developed in the hey-day of positivism.
‘Programmed learning’ is an extremely goal oriented pedagogical theory,
focusing on stringent predetermined goals of learning, ‘knowledge’
dissemination to the learner in small and measured doses (stimulus),
answers (response), and instant reply (feedback) from the teacher
(knowledge-administrator), and permission to progress to the next step
granted only after mastering the current step (Kvernbekk 1995, p. 23).
The basic idea behind ‘educational technology’ and the like is that all
learning can be organised as an algorithmic pre-programmed production
process with effective product evaluation and quality control during the
process, not unlike the management theory of ‘Management By
Objectives’, also developed in the 1950s and 1960s (Hellesnes 1975,
pp. 141–157; 1998). Not only did these rationalistic and technocratic
pedagogical methods appear and gain status in parallel to the other
50. See for example Abraham Moles’ Information theory and esthetic perception [1958]
1966.
51. For a balanced introduction to the ‘design methods’ movement, see Cross 1980.
52. For a critical assessment of the conception of design as ‘science’, still alive in many
quarters, see Cross, Naughton and Walker 1981. The authors suggested
‘technological activity’ as an alternative and more fertile conception of design.
‘rationalistic’ phenomena mentioned above, there were actually links to
legibility research and the emerging field of ‘information design’. In
England this had something to do with the establishment of the Open
University in the early 1960s, and its need to develop effective pro-
cedures and methods for ‘distance learning’. It is an irony that Michael
Macdonald-Ross and Robert Waller at the Institute of Educational
Technology at the Open University published their post-positivistic
‘manifesto’, ‘Criticism, alternatives and tests: a conceptual framework for
improving typography’, in none other than the journal Programmed learning and
educational technology (1975).53
The high level of activity related to legibility research was
manifested in numerous ways. In 1967 ATypI (Association Typo-
graphique Internationale) set up a ‘Legibility research committee’
(Zachrisson 1968). The list of members included Prof. G. Willem Ovink in
Amsterdam, Dr E.C. Poulton at the Medical Research Council’s Applied
Psychology Unit in Cambridge, François Richaudeau in Paris, Alison
Shaw at the Library Association in London, Herbert Spencer at the
Royal College of Art in London, Dr Dirk Wendt at the department of
psychology at the University of Hamburg, and the chairman of the
committee, Dr Bror Zachrisson at ‘Grafiska Institutet’ in Stockholm.
In the same year as ATypI set up its legibility research committee,
the direct predecessor of the scholarly journal Visible Language was
established as The Journal of Typographic Research. Four of the five
members on the editorial board listed in the first issue were well known
‘legibility researchers’: Dr Ovink from Holland, Dr Poulton from Great
Britain, Dr Tinker from the United States and Dr Zachrisson from
Sweden. This surely reflects the great optimism for legibility research of
that time.
In addition to the large amount of research carried out by many
researchers scattered around in several countries, some researchers or
research teams stood out as especially prolific during this peak period.
Tinker published two books in the mid 1960s, mainly surveying his
and Paterson’s work (i.e. Tinker 1963; Tinker 1965). In Germany Dirk
53. This paper is mentioned below, in the section ‘Post positivistic critique, and notions of
tacit craft knowledge’, in chapter 4, ‘Critiques of legibility research’.
Wendt performed a series of studies. In Great Britain Christopher
Poulton published many studies on legibility. In Sweden Bror Zachrisson
was heavily engaged in legibility research; his doctoral thesis on the
subject, written in English, was published in 1965 (i.e. Zachrisson 1965).
In France, François Richaudeau was involved in legibility research (see
Richaudeau 1984), and he published a facsimile edition of Emile Javal’s
Physiologie de la lecture et de l’écriture (Javal [1905] 1978).
Legibility studies related to the problems of certain categories of
readers were published in the form of monographs, with topics like
legibility for the partially sighted (i.e. Shaw 1969), and legibility in
children’s books (i.e. Watt and Nisbet 1974). Also several extensive
bibliographies on legibility54 and related55 areas were published in the
United States, Great Britain and elsewhere, during this relatively short
period.
Legibility research did not, however, exist only within the boundaries
of academe in this period. Research was also mediated to designers and
to craftsmen in the typesetting and printing industry. Design magazines
and printer’s trade journals every now and then either summarised
legibility research or published actual studies. This was certainly the
case in the United States, Great Britain, Germany and the Scandinavian
countries. What is more, the preoccupation with legibility research at
educational institutions, like the Royal College of Art in London (see
below) and ‘Grafiska Institutet’ in Stockholm, may have had an impact
on the institutions’ graphic design students. And in 1972, Rolf F. Rehe
published the first of at least five editions in English and at least one
edition in German of his compilation of findings from legibility research,
Typography: how to make it most legible (see Rehe 1984). This book was
distributed in Europe to its members by the International Association
for Newspaper and Media Technology (IFRA), intended for use by
editors, journalists, designers and compositors.56
54. Cornog and Rose 1967; Foster 1971; Foster 1972; Tomaszewski 1973; Foster 1980.
55. Macdonald-Ross and Smith 1977; Felker 1980.
56. Rehe also published in American and German printing industry trade journals; e.g.
Rehe 1970, and 1971.
Changing paradigms: from ‘legibility’ to ‘usability’
In Great Britain the typographer Herbert Spencer published his
acclaimed and influential review of legibility research, The visible word,
in 1968 (Spencer 1969). For Spencer this book marked the start of a
large number of legibility studies carried out together with Linda
Reynolds and Brian Coe at the ‘Readability of Print Research Unit’
(established in 1966) at the Royal College of Art during the 1970s (see
Spencer 1974; and Reynolds 1979a). Interestingly, the research carried
out by this team moved away from legibility researchers’ preoccupation
with details in print, to the structuring and articulating of information
segments on actual57 pages, text more complex than continuous prose
(for example bibliographies), and operational methods such as search
tasks. It also moved further away, from print to microform and screen
legibility (areas with a high degree of constraints), and even further to
environmental wayfinding graphics (directional signing).
The typographer Peter Burnhill and the psychologist James
Hartley, making up another prolific research team, published many of
their papers in Applied Ergonomics and other journals in the 1970s. This
team not only included a typographer, as was the case with the team at
the Royal College of Art, but also moved in the same direction as the
RCA team, away from legibility researchers’ preoccupation with details,
to the structuring and articulating of information on actual pages. To use
the framework suggested by Waller (1987), cited above58 in this thesis:
Both teams moved away from research primarily related to the ‘psycho-
motor domain’ to research related to the ‘cognitive domain’.
This move away from preoccupation with micro-features to macro-
features parallels a contemporaneous development within linguistics,
expanding from the earlier formalist preoccupation with elements within
the sentence level as units of investigation, to contextualist macro-
approaches such as discourse analysis, pragmatics and text linguistics.
Today, it might be said that the much broader field of ‘information
design’ which emerged in the 1970s, with the Information Design
57. As opposed to the former common practice where the stimulus material often
consisted of isolated letters, words, or isolated lines of text, or isolated chunks of text.
58. See the section ‘Ergonomics as a framework’, in chapter 2.
Journal launched in 1979 and the first ‘Information design conference’
held in Holland in 1979,59 as important landmarks, has in one sense, but
only in one sense,60 replaced the former field of ‘legibility’ studies.
Information design as it understands itself today61 is an expanded area
with fuzzy borders having its roots not only in legibility research, but
also in technical writing, graphic design, instructional technology,
human–computer interaction, psychology, and linguistic areas of
investigation like stylistics, rhetoric, pragmatics, discourse analysis and
text linguistics.62
Information design claims to consist of theory and academic
research on the one hand and design practice on the other hand.
Information design studies are to a large extent based on other methods
and concerns than legibility studies. Where legibility research was
preoccupied with typographic micro and meso variables and regarded
research in the form of laboratory experiments as the sole method of
generating valid (and universal) knowledge, information design takes
quite another approach.
First: Information design is more concerned with the design of
documents on a macro level, and it has a much greater scope with regard
to media substrate and mode of graphic symbolisation and mode of
graphic configuration.63 Information design is not only preoccupied with
verbal graphic media types and pictorial graphic media types, but also
with ‘schematic’ graphic media types like diagrams and maps.
Information design prefers to focus on relatively complex media
interfaces (or systems) intended for use in relatively demanding
situations, like static graphic artefacts such as forms, contracts, public
59. The papers presented at the conference were published under the title Information
design in 1984 (i.e. Easterby and Zwaga 1984).
60. See the next section, ‘Legibility research today’.
61. For self-reflective contributions on the roots of ‘information design’ and what it is or
might be, see Waller 1979; Kinross 1985; 1992, pp. 141–143; Schriver 1997,
pp. 13–149; 1998; Sless 1994; 1998; Wright 1998; Horn 1998; 1999; Taylor 1998; and
Burke 1998. For a highly relevant description of antecedents to the overlapping field
of ‘human–computer interaction’, and its development during the last two decades, see
Carroll 1997.
62. See for example Waller 1991; Campbell 1995; and Schriver 1997.
63. The concept-pair ‘mode of symbolisation’ and ‘mode of configuration’ appears in
Twyman 1979.
transport timetables, warnings, user documentation and environmental
wayfinding systems. And more vigorously,64 on interactive graphic
artefacts like the graphical user interfaces of computer software,
multimedia applications and Internet web sites.
Second: Information design involves not only laboratory
experiments as a behavioural methodology but strongly advocates the
alternative epistemology of iterative usability testing (process oriented
formative evaluation)65 as a pragmatic methodology for giving ‘real-time’
support during actual design processes (Stiff 1995). In other words, focus
is on the user’s behaviour at the interface, and not on traditional
experimental research. Not only contextual usability testing, but also the
advantage of focusing on the conversational metaphor,66 as well as user
participation in the design process,67 has recently been forcefully
advocated. ‘Usability’ and ‘the user’ are perhaps the two most central
rhetorical concepts in ‘information design’, as they are in ‘human–computer
interaction’.68
Information design not only advocates a multi-disciplinary
approach with regard to research, but also advocates multi-disciplinary
participation in the activity of designing, regarding authoring,
editing, information structuring, and graphic design as an integral
process.
64. Wright 1994, p. 1; Stiff 1995, p. 65.
65. For the sake of order: ‘usability testing’ is not like hypothesis testing in science, but
rather like fault diagnosis in technology and engineering. Although the ideal is
formative evaluation (iterative testing) during the design process, most usability
evaluations (in software design and development) are still primarily summative
(Landauer 1997, p. 204). The concepts of ‘formative’ and ‘summative’ evaluation were
introduced by Scriven (1967). For a widely used textbook on usability testing (in the
context of the design and development of computer software), see Jacob Nielsen’s
Usability engineering (Nielsen 1993). John Carroll, in the context of human–computer
interaction, places usability evaluation in an historical and comparative context
(1997).
66. See Sless 1996, 1998.
67. See for example Kyng and Greenbaum 1991 (in the context of ‘computer systems’);
and Buur and Bagger 1999 (in the context of ‘solid user interfaces’).
68. For two papers that problematise the concepts ‘usability’, ‘user’, and ‘user interface’,
see Grudin 1993; and Cooper and Bowers 1995.
Actually, as early as 1965 the researcher Christopher Poulton and
co-authors, in connection with typeface design, suggested an iterative
process oriented approach to legibility research:
Using such an experimental technique … type designers could compare
the high with the low scoring designs, attempt by insight to discover
which factor in the designs had made them more or less readable, and
redesign the high scoring faces to make them even better. A second
round of precise experimental evaluation could then be undertaken.
This process of scientific evaluation followed by intuitive design followed
by scientific evaluation would continue until some extremely readable
type faces had been developed. (Cheetham, Poulton, and Grimbly 1965,
p. 51)
A similar idea (and empirical approach) had actually been pioneered
by the famous industrial designer Henry Dreyfuss in the 1940s (see the
chapter ‘The importance of testing’ in Designing for people, Dreyfuss
1955, pp. 62–69). John Carroll points out that Dreyfuss’ approach
incorporated four central ideas: early prototyping, the involvement of
real users, introduction of new functions through familiar ‘survival
forms’, and many cycles of design iteration (Carroll 1997, pp. 503–504).
Michael Macdonald-Ross and Robert Waller anticipated the
usability testing trend in information design in their short but
substantial article published in 1975 with the telling title ‘Criticism,
alternatives and tests: a conceptual framework for improving
typography’.69 Besides criticising legibility research as impotent, they
forcefully advocated an approach for improving practice by moving away
from the experimental paradigm of ‘discovering universal truth’. They did
this by exchanging the experimental approach for iterative test procedures
prefaced with informed criticism based on the tacit knowledge of
typographers and designers. They also stressed the value of parallel
testing of alternatives. They drew on Michael Polanyi’s concept of ‘tacit
knowledge’70 and on a particular feature of Karl Popper’s philosophy of
science stressing the fruitfulness of an approach of ‘conjectures and
refutations’.71 Popper’s philosophy (and ‘falsificationism’) was based on a
69. Macdonald-Ross and Waller 1975. See also Schumacher & Waller (1985).
70. See Polanyi 1958, and 1966.
71. See Popper 1963.
deductive logic, where induction (generalisations or theory solely based
on observations) was rejected as an appropriate scientific method.
A line of thought similar to Macdonald-Ross and Waller’s was in the
early 1970s advocated in theory on the architectural design process and
design education, most notably in a substantial paper by Bill Hillier,
John Musgrove and Pat O’Sullivan.72 They advocated the iterative
‘conjecture-test’ model of design, i.e. a deductive procedure, where form is
generated first (based on a repertoire of already conceived solution types)
and then checked against constraints. This model stands in contrast to
the positivist principle of inductive analysis-synthesis and to what is
claimed to be the modern movement’s conception (or rationalisation) of
design as an inductive process where form has hardly any relative
autonomy and is believed to be primarily generated by the specific job’s
pre-existing requirements and constraints (purpose, specification,
content, available materials and available production technology).73 It is
realised today that even highly formalised engineering design processes
with many participants, where scale and complexity demand that
pre-articulated specifications to a large extent drive the problem solving
(and problem definition) process, are deductive processes characterised
by frequent iterations and reformulations of aims and objectives.
Two prominent names in 20th century legibility research are linked to
the (British) Medical Research Council: R.L. Pyke, who published The
legibility of print in 1926, and the researcher E.C. Poulton, at the
Medical Research Council’s Applied Psychology Unit in Cambridge, who
was very active in the field of legibility research in the 1960s. One of the
leading figures in the ‘information design’ movement for two decades,
Patricia Wright, was for many years affiliated with the Medical Research
72. Hillier et al. [1972] 1984; Gelernter 1981, pp. 271–292; 1993. See also Cross et al.
1981; Buchanan 1992; and Jonas 1993. Hillier and his co-authors focus on the role of
‘preconceptions’ or ‘prestructures’ in designing (the designer’s repertoire of solution
types and his knowledge of genres). This resembles Heidegger’s concept of ‘fore-
structure’ and Gadamer’s concept of ‘prejudice’. However, no references to Gadamer’s
Wahrheit und Methode (1969) or the translation Truth and method (1975) are given in
their [1972] 1984 article.
73. See also Jan Michl’s relevant paper on the historical role of the inductivist, and
impossible, slogan ‘form follows function’ as a designer-centred carte blanche for
modernist aesthetic preferences (1995); as well as his paper: ‘On the rumor of
functional perfection’ (Michl 1991).
Council’s Applied Psychology Unit in Cambridge. I choose to interpret
Patricia Wright’s work as an index of the paradigm shift from ‘legibility’
to ‘usability’, and from legibility research to information design, that has
gone on in the last few decades.
Legibility research today
Psychology
Interest in legibility research is today far removed from mainstream
psychology. An indication of this can be found in the four-volume
Encyclopaedia of psychology published by John Wiley in 1994. The
encyclopedia does not contain any entry or information on ‘legibility’ or
on Miles Albert Tinker who for four decades published prolifically in
journals like The Journal of Applied Psychology and The Journal of
Experimental Psychology. Tinker’s fellow researcher Paterson was even
the editor of the Journal of Applied Psychology for twelve years, as well
as being the Secretary of the American Psychological Association for six
years (Sutherland 1989, p. 131). Furthermore, in The thesaurus of
psychological index terms, published by the American Psychological
Association, ‘legibility’ appeared as an index term from 1978 until
1993.74 The fact that interest in legibility studies is nearly non-existent
in psychology today does not necessarily say anything about the
prospects for or value of legibility studies; it probably says more about
researchers’ aversion to being associated with such highly practical
matters as the legibility of reading matter, for fear of being marginalised
within psychology and cognitive science. Nevertheless, although legibility
research has dwindled and lost status and legitimacy since its hey-day in
the 1960s and 1970s,75 it has certainly not vanished.
74. See the thesaurus, i.e. Walker 1994, pp. 120, 1282.
75. See the previous section, ‘Changing paradigms: from “legibility” to “usability” ’, and
chapter 4, ‘Critiques of legibility research’.
Graphic design
Legibility research is today less visible in typographic and graphic design
discourse than at any time before. There are some scattered exceptions, for
example a popular summary by Linda Reynolds about legibility and
legibility research published in a fairly recent issue of the typography
journal Baseline (Reynolds 1988). Legibility research is rarely mentioned
in textbooks on typography or graphic design. One exception is Ruari
McLean’s widely read manual of typography (1980), of which the 1992
edition is still in print. However, McLean is sceptical about legibility
research. Baird and Turnbull’s widely read textbook on typography for
journalism students, The graphics of communication, used to report
legibility research extensively in earlier editions; however, in its sixth edition
(1993) all references to legibility research have been removed. A scholarly
journal like Visible Language, which to some extent addresses the
international scholarly graphic design community, might occasionally
carry a legibility research paper (for example Wendt et al. 1997, and
Dyson and Kipping 1998b).76 However, the influential and leading
graphic design journals Eye and Print hardly ever carry articles about
legibility research. The same applies to printing industry trade journals;
journals that earlier, and especially in the 1970s, every now and then
carried articles about legibility research.77 One conspicuous exception in
the context of graphic design is a book on legibility research published by
the Graphic College in Denmark a couple of years ago (Pedersen and
Kidmose 1993).78 Another exception is Colin Wheildon’s Type & layout:
how typography and design can get your message across or get in the
way (1995).79
76. It is not often that legibility research is used in humanities texts; however, a recent
example is Paul Saenger’s historical study Space between words: the origins of silent
reading (1997, pp. 26–29, 305–314).
77. This assessment is not based on a systematic quantitative study of any kind, but
neither is it impressionistic; it is rather based on acquaintance with relevant primary
sources and some searches in relevant bibliographical databases.
78. Recommended by the graphic designer Colin Banks in Visible Language, vol. 28,
no. 1, 1994, p. 88; and reviewed by the writer of this thesis in Information Design
Journal (Lund 1995).
79. Assessed in chapter 5; and reviewed by the writer of this thesis in Information Design
Journal (Lund 1997).
Digital typography
There are three broadly defined areas where legibility research has
legitimacy and appeal today: within ‘digital typography’, within ‘vision
research’, and within the ‘paradigm’ that in some sense took over from
legibility research, ‘ergonomics and information design’.
Legibility issues are a recurring concern within ‘digital typography’
(and to some extent within the wider area of ‘computer graphics’). The
canonical texts (e.g. Tinker 1963; and Zachrisson 1965) and individual
empirical studies are every now and then discussed or briefly referred
to,80 or individual studies are published.81
A particular strand of legibility research within ‘digital typography’
is the evaluation of the legibility of ‘greyscale’ representations of fonts on
computer screens.82 ‘Greyscale bitmapping’ is today performed by special
applications or by the operating system. It represents one of several
strategies for compensating for the relatively coarse resolution of the screen
when representing high resolution outline fonts on the screen. That is,
compensation beyond the enhancing rasterization managed by some
fonts due to their format, or managed by special applications like Adobe
Type Manager, or managed by the operating system.83 Alternative
methods to ‘greyscaling’ are: ‘binary bitmapping’ of particular size
instances, like the complementary screenfonts to PostScript Type 1
outline fonts; the alternative or complementary method of ‘hinting’;84
and not least, the design of typefaces where the primary font is the
bitmap font dedicated for the screen, while the outline version for high
resolution output is generated on a subsidiary basis; for example Apple’s
80. See for example Rubinstein 1988; Clark 1989; Bowden and Brailsford 1989; Morris
1989; Naiman 1991; Farrell 1991; van Nes 1991; André 1993; Bigelow and Holmes
1993; Dyson 1993; Kahn and Lenk 1993; Karow 1997.
81. See for example Morris 1989; Morris et al. 1991; de Lange et al. 1993; Morris, Berry
and Hargreaves 1993; Wendt 1994; Dyson and Kipping 1997, 1998a.
82. See for example Naiman 1991; Black and Boag 1992; Hersch et al. 1995; O’Regan
et al. 1996; Morris et al. 1998; and Boyarski et al. 1998.
83. There is an ever growing literature on the www on this dynamic area of
digital typography. A good starting point for information is the ‘Microsoft Typography
Home Page’: http://www.microsoft.com/typography/ (15 September, 1999).
84. See for example Deach 1992; Hersch 1993; and Stamm 1998.
system font Geneva,85 Zuzana Licko’s typeface Base (from Emigre),86
and Matthew Carter’s typeface Verdana (from Microsoft).87 Interest-
ingly, the research and development carried out in this area has involved
first of all computer scientists and typographers. Furthermore, the
efforts in this clearly critical area of making typefaces as legible as
possible on the relatively coarse screen have primarily been based on
type design domain knowledge,88 advances in software engineering, and
visual inspection, and less on behavioural experiments.89
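Purely as an illustration of the basic idea (real rasterisers add hinting, gamma handling and sub-pixel positioning, and no particular vendor’s algorithm is reproduced here), greyscale bitmapping can be thought of as rendering a glyph at a multiple of the target resolution and box-filtering it down, so that partially covered screen pixels receive intermediate grey values:

```python
# Illustrative sketch only: box-filter 'greyscale bitmapping' of a glyph mask.
import numpy as np

def greyscale_bitmap(hi_res_mask: np.ndarray, factor: int) -> np.ndarray:
    """hi_res_mask: boolean ink coverage at `factor` times the target resolution.
    Returns an 8-bit greyscale bitmap at the target resolution."""
    h, w = hi_res_mask.shape
    h, w = h - h % factor, w - w % factor              # crop to a multiple of factor
    blocks = hi_res_mask[:h, :w].reshape(h // factor, factor, w // factor, factor)
    coverage = blocks.mean(axis=(1, 3))                # fraction of each screen pixel covered by ink
    return np.round(coverage * 255).astype(np.uint8)

# Hypothetical usage: a 64x64 'glyph' (here just a filled disc) down to 16x16 greys.
y, x = np.ogrid[:64, :64]
fake_glyph = (x - 32) ** 2 + (y - 32) ** 2 < 28 ** 2
print(greyscale_bitmap(fake_glyph, 4).shape)           # -> (16, 16)
```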
Another particular strand of legibility research affiliated with
‘digital typography’ is based on contemporary information processing
theories about human vision and the sensitivity of the human eye to
‘spatial frequencies’ of typographic material. Here, one starting point
was to investigate the necessary spatial resolution for output devices
such as imagesetters, printers and computer screens in order to satisfy
the requirements of human vision.90 This interest has led to legibility
studies and theories about how (and at least in one respect, the objective
and measurable) spatial frequencies of typographic material determine
its legibility.91 As mentioned above, the relative legibility of serif and
sans serif typefaces has also been theorised on the basis of their ‘spatial
frequencies’.92
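Simply to indicate what ‘spatial frequency of typographic material’ refers to (none of the cited studies is reproduced here, and the viewing distance, pixel pitch and synthetic stimulus are hypothetical), one can take the luminance profile of a rendered line of text, locate the dominant frequency in its spectrum, and convert it to cycles per degree of visual angle:

```python
# Illustrative sketch only: dominant horizontal spatial frequency of a text-like
# image, converted to cycles per degree under a small-angle approximation.
import numpy as np

def dominant_frequency_cpd(text_image: np.ndarray,
                           pixel_mm: float = 0.25,
                           distance_mm: float = 400.0) -> float:
    """text_image: 2-D greyscale array (rows x columns), ink against a plain ground."""
    profile = text_image.mean(axis=0)                   # 1-D luminance profile across the line
    profile = profile - profile.mean()                  # remove the constant (DC) component
    spectrum = np.abs(np.fft.rfft(profile))
    freqs = np.fft.rfftfreq(profile.size)               # cycles per pixel
    peak = freqs[np.argmax(spectrum[1:]) + 1]           # skip the zero-frequency bin
    pixels_per_degree = np.deg2rad(1.0) * distance_mm / pixel_mm
    return peak * pixels_per_degree                     # cycles per degree

# Hypothetical usage with a synthetic stroke pattern standing in for a line of text:
image = np.tile((np.arange(600) // 5 % 2) * 255.0, (40, 1))
print(f"{dominant_frequency_cpd(image):.1f} cycles per degree")
```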
85. See Bigelow and Holmes 1991.
86. See Licko 1996a.
87. See Boyarski et al. 1998.
88. For example: the (scalable) binary bitmap screen font Verdana is characterised by
extra large x-height, extra space between glyphs, particularly even space between
glyphs, special care with the design of glyphs such as 1, I, l, i and J, and special curve
treatment.
89. Exceptions are: Black and Boag 1992; O’Regan et al. 1996; Morris et al. 1998; and
Boyarski et al. 1998.
90. See for example Bigelow 1982; Bigelow and Day 1983; Rubinstein 1988; and Naiman
1991. See Farrell 1991 for a good overview of basic sources.
91. See for example Legge et al. 1985a, 1985b, 1987; Morris 1988; Rubinstein 1988; Clark
1989; Morris 1989; Morris, Berry and Hargreaves 1993; Hallberg 1992.
92. See Bigelow and Day 1983, p. 102; Morris 1988; Rubinstein 1988, pp. 43–47; Clark
1989; Hallberg 1992, pp. 106–108; Gelderman 1999, pp. 101–102.
Ergonomics and information design
Importantly, legibility research is alive within the broadly designated
area ‘ergonomics and information design’. I here refer to areas such as
ergonomics and human factors,93 human–computer interaction (where
an eager interest in computer screen legibility fits in),94 technical docu-
mentation and information design,95 instructional technology,96 safety
science and warning design research,97 and transportation research.98
93. Ergonomics at work, a standard textbook, devotes a whole chapter to ‘Man–man
communication: words and symbols’ (Oborne 1987). A subsection of this chapter is a
presentation of findings from legibility research (pp. 73–80). In Ergonomic Abstracts,
under the main category ‘visual communication’ several subcategories are today open
to what can be described as traditional legibility studies (for example subcategory 8.1.
‘design of alphanumeric characters’, and sub-sub category 8.1.2. ‘shape of characters’).
94. A comprehensive introduction to many studies of various aspects of screen
ergonomics (including differences of reading from screen vs. paper) is found in
Andrew Dillon’s Designing usable electronic text (1994, pp. 28–58); earlier published
in Ergonomics (Dillon 1992). For a comprehensive review of the ‘legibility of
continuous text on computer screens’, see Frenckner 1990. For examples of recent
individual studies of the legibility of various typographic formats on screen, see Dyson
and Kipping 1997, 1998a, 1998b. For studies that compare serif and sans serif
typefaces on computer screens, see Engman 1988; Smedshammar et al. 1990;
Williams 1990; Holleran 1992; Garcia and Caldera 1996; Boyarski et al. 1998; and
Stone et al. 1999. A typeface screen legibility study by Geske (1996) compares various
sizes, weights and postures of sans serif typefaces only.
95. See for example the prominent textbooks by Schriver (1997) and by Kostelnick and
Roberts (1998); the annotated bibliography by Kempson and Moore (1994); Sidney
Berger’s The design of bibliographies (1991); overview articles like Benson 1985 in
Technical Communicator, and Gribbons 1991 in IEEE Transactions on Professional
Communication; and Gallagher and Jacobson’s overview article (1993). For examples
of brief references given to the legibility research literature in prominent ‘information
design’ publications, see Kosslyn 1994 (pp. 90, 283); and Campbell 1995 (p. 4).
96. See for example the PhD dissertations Stahl 1989; Taylor 1990; and Kravutske 1994.
97. See for example the studies by Braun and Silver 1995; Silver 1993; Silver et al. 1994.
98. See for example the comprehensive review by Zwalhen et al. 1995.
Vision research
And further, a vigorous strand of legibility research exists within the
field of ‘vision research’, where researchers are preoccupied not only with
the understanding of human vision, but also with human vision’s role in
the task of reading, with the design of optimal reading conditions, with
the design of improved reading material, with developing aids for
readers with impaired vision, with understanding how visual
disorders affect or hinder reading, and with the design of clinical reading
acuity tests.99 Examples of legibility research carried out within the field
of vision research include the typeface legibility study by Yager et al.
(1998) published in Vision Research; a study by Roger Watt published in
the collection Computers and typography (1993); Arnold Wilkins’
intriguing monograph Visual stress (1995);100 as well as parts of the
literature on ‘spatial frequencies’ referred to above in the context of
‘digital typography’.
And, importantly, there is the impressive publishing program of influential
articles under the series title ‘Psychophysics of reading’, published by
Gordon Legge and various co-authors starting in 1985 and still
running,101 with the eighteenth article published in 1998.102 Many of
the articles in this series have appeared in the journal Vision Research,
while some of them have appeared in other ophthalmology and vision
research journals. Gordon Legge’s affiliation with Miles Tinker’s
Department of Psychology at the University of Minnesota is perhaps not
without some significance.
99. See for example Legge et al. 1985a; 1987, p. 1165; 1989, p. 51; Mansfield et al. 1996,
p. 1492.
100. For mention of Wilkins’ book, see the section ‘Research that breaks with “received
wisdom” ’, in chapter 6, ‘Discussion: knowledge production and technical rationality’.
101. To a large extent based on ‘fastest reading rate’ as a measure of performance.
102. See for example Legge et al. 1985a (1: ‘Normal vision’); 1985b (2: ‘Low vision’); 1987
(5: ‘The role of contrast in normal vision’); 1989 (7: ‘Comprehension in low
vision’); 1997 (16: ‘The visual span in normal and low vision’); Mansfield et al. 1996
(15: ‘Font effects in normal and low vision’); and Chung et al. 1998 (18: ‘The effect of
print size on reading speed in normal peripheral vision’).
Many recent studies
Although the absolute number of ‘traditional’ legibility studies seems to
be much lower today than in the 1960s and 1970s, and although the
relative number of these studies has diminished due to a much larger
total research output today, a surprisingly large number of legibility
studies are still carried out, at least a surprisingly large number of
studies that compare serif and sans serif typefaces. These studies appear
as PhD theses or as papers published in journals and conference
proceedings. They are carried out under a variety of labels, and some of
them are, as already mentioned, about computer screen legibility. Some
of the studies appear in clusters, like three PhD theses under the label
‘instructional technology’ from Wayne State University,103 and a row of
papers on ‘warning design’ by Clayton Silver and Curt Braun and co-
authors published in journals like Safety Science and Ergonomics as well
as in human factors conference proceedings.104
A flurry of computer screen legibility studies and reviews were
published by the Royal Institute of Technology in Stockholm in the late
1980s and early 1990s (for example the following studies and reviews,
solely or in part on typeface legibility: Engman 1988; Smedshammar
et al. 1990; Frenckner 1990; and Frenckner et al. 1991). Interestingly,
Hans Smedshammar represents a link back to Bror Zachrisson.
Together, they published papers and reports on legibility for the
partially sighted in the early and mid 1970s (e.g. Zachrisson and
Smedshammar 1971, 1973a, 1973b).
If the number of studies comparing serif and sans serif typefaces
identified for this thesis (see Table 1 below) is representative of
the number of legibility studies carried out today (although excluding
computer screen studies), then Robert Waller was wrong in 1991 when he
claimed that ‘since the late 1960s, research on ‘simple matters of
legibility’ has tended to be undertaken only in special circumstances
such as for new display technologies’ (p. 343). Likewise, Karen Schriver’s
complaint that there ‘sadly … has been very little empirical research on
103. Stahl 1989; Taylor 1990; and Kravutske 1994.
104. Two of these papers are reviewed in chapter 5 in this thesis (i.e. Silver and Braun
1993; and Silver, Kline and Braun 1994).
typography during the 1980s and 1990s’ (1997, p. 276n) ought at least to
be modified. And Susan Roth’s plea in a recent issue of Design Issues
(1999, p. 21), for more legibility studies to be performed, seems ill-
informed. As many as two thirds of the 72 typeface legibility studies
identified for this thesis have been carried out in the relatively short
period between 1970 and 1999. And the record number for any decade is
the 20 studies carried out in the 1990s.105
1890s 1900s 1910s 1920s 1930s 1940s 1950s 1960s 1970s 1980s 1990s
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
1 0 2 2 4 4 3 8 10 18 20
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
TABLE 1. Number (per decade) of legibility studies identified for this thesis on
the relative legibility of serif and sans serif typefaces (or on the relative legibility
of individual typefaces where both serif and sans serif typefaces are included).
The total number is 72. The studies are predominantly experimental, except for
7 preference studies.
105. See the introduction to chapter 5, ‘A review of empirical studies’.
4 Critiques of legibility research
Introduction
Traditional experimental legibility research has been attacked from
various positions, especially during the last few decades. This is first of
all a reflection of the general reaction against positivism in the social
sciences, and later also of postmodernist thought. These
criticisms have come from ‘internal’ positions (limited criticisms of
certain methodological aspects only), and from ‘external’ positions
(a rejection of legibility research, or a rejection of prevalent notions of
legibility, or a rejection of the need for legibility per se). It is also worth
pointing out that criticism of legibility research has not been confined to
academe only, but has also been voiced in an articulate manner by type
designers and typographers. The material below shows that serious
criticism is not something new, made possible only in a climate of
post-modern problematising scepticism, but is in fact much older.
Whether or not the current ‘low status’ of legibility research and its
demise as a ‘research program’ is partly a reflection of these criticisms is
difficult to establish.
Post-positivistic critique, and notions of tacit craft
knowledge
The conventional view of ‘technical rationality’ (see Schön 1983) is well
epitomised in the 1965 article ‘The case for research’ by Christopher
Poulton and co-authors (one of whom is a typographer). While describing
designers' knowledge as mere 'guesses', as opposed to knowledge
generated from research as 'facts', the authors state:
But how much is really known about the effectiveness of printed graphic
design? There are plenty of opinions, plenty of guesses – but how much
soundly based fact? … there is no real body of knowledge about graphic
design – slogans substitute for fact. (Cheetham, Poulton, and Grimbly
1965, p. 48)
The overall validity of an approach to typography where
professional knowledge, tacit craft experience, visual sensibility, and
even careful common sense reasoning are totally discarded in favour of a
positivist belief in 'science', and where experiments are regarded as the
only method of generating valid knowledge, has been questioned, most
notably by Michael Macdonald-Ross, Robert Waller, and Joycelyn
Chapman.106
Reviewing the educational psychology literature on text design, it is
quite common to find what at first sight appears to be a staggeringly
naïve view of what counts as knowledge. In effect, a game is being
played where a new ‘fact’ is admitted to the circle of those playing only
when an experiment has appeared in the literature to support it. No
other knowledge counts. The game is played in code: ‘nothing is known
about ...’ or ‘we do not know ...’ means ‘no one has published an
experiment about ...’ (Waller 1987, p. 73)
Macdonald-Ross and Waller partly based their arguments on Michael
Polanyi's concept of 'tacit knowledge', which he attached to the silent,
unarticulated, and informal knowledge of experienced craftsmen and
designers (see Polanyi 1958; 1966).
106. See Macdonald-Ross and Waller 1975; Macdonald-Ross and Smith 1977; Macdonald-
Ross 1978, 1989; Waller 1987, pp. 70–95; and Chapman 1978.
Notions of 'tacit knowledge'107 have, during the last three
decades,108 gained considerable ground in social theory and philosophy,
and have even opened up several discursive domains. These notions take
part in contemporary epistemological discourse on a broad spectrum: as
a part of the critique of conceptions about human knowledge in ‘artificial
intelligence’ research and ‘expert system’ engineering;109 within ‘Marxist
labour process sociology’ preoccupied with the social power embedded in
technology and the influence of skills and de-skilling on the degree of
workers’ control of the labour process versus various forms of
managerial control seen as important for collective bargaining strength
and political influence;110 in the same vein, ‘action research’ for
empowering vulnerable groups in the labour market, like the ‘working
life’ sociology at ‘Arbetslivscentrum’ in Stockholm;111 as a rhetorical aid
in the professionalisation efforts of semi-professionalised occupations
such as nursing (that is, in addition to the dual strategy of establishing a
‘nursing science’ and conquering management positions in the health
services); in writings on the roles of skill versus self-expression and
individuality, and on skill as bodily competence, in fine art, craft, and
design;112 in the ‘sociology of science and technology’ concerned with the
construction of scientific knowledge;113 and not least within pedagogy
related to the education, training, and practice of professionals.114
Critical voices have been raised,115 and disputes about what
exactly 'tacit knowledge' is and is not, whether 'tacit knowledge' can be
articulated or not, and what types of 'tacit knowledge' exist, are also
107. First of all relying on Michael Polanyi (see Rolf 1991), but also to some extent on
Ludwig Wittgenstein (see Janik 1988, 1990), and on Gilbert Ryle (1949; cf. his
distinction between ‘knowing how’ and ‘knowing that’).
108. Since Thomas Kuhn referred to Michael Polanyi’s ideas about tacit knowledge in the
second and enlarged edition of The structure of scientific revolutions in 1970 (see
Kuhn 1970, pp. 44, 191ff).
109. See e.g., Dreyfus and Dreyfus 1986; and Göranson and Josefson 1988.
110. See e.g., Manwaring and Wood 1984.
111. See e.g., Göranzon and Florin 1990.
112. See e.g., Dormer 1994; and McCullough 1996.
113. See e.g., Collins 1987; and Cambrosio and Keating 1988.
114. See e.g., Schön’s extremely influential books (1983, 1987); Dreyfus and Dreyfus 1986;
Polkinghorne 1992; and Molander 1996.
115. See e.g., Rolf 1991; and Fuller 1992.
present.116 A related concern can be found in writings on the role of
visual or non-verbal thinking in the design process.117
The focus on tacit knowledge has meant that what counts as
knowledge has undergone a kind of democratisation (for example:
nurses' tacit knowledge should count as knowledge, since nursing embodies
a distinctive competence that nobody else has). However, it is one thing to
be aware of the tacit knowledge of, for example, design practitioners, of its
important role in the design process, and of the rich source of
knowledge about process and product that it embodies. It is another thing
to recognise the limitations in the explanatory potential of tacit knowledge. Tacit
knowledge can be the practitioner's accumulated knowledge, but it can
also be highly contingent and situated action-oriented knowledge.
Furthermore, practitioners’ tacit knowledge may differ even more than
discursive knowledge among individual practitioners; tacit knowledge is
by definition local knowledge. There are, for example, practitioners who
find sans serif typefaces 'illegible', and practitioners who do not hesitate
to use sans serif typefaces for whatever application, as well as
practitioners who prefer to use sans serif typefaces only for specific
purposes.
The philosopher Steve Fuller is a rare example of someone who
vigorously questions the explanatory power of the epistemic concept
‘tacit knowledge’. He points out that by relying too much on a concept
like ‘tacit knowledge’ one can easily mistake passable for optimal
performance.
The claim that a coal shoveler is an expert in what she does remains
persuasive only as long as coal shovelling is treated as a sui generis
activity with no basis for comparison or examination outside the
vicinity in which the shovelling normally takes place. However, once the
observer expands her horizons, even if it merely involves comparing
shovelers at different locations, then the coal shoveler starts to lose the
patina of expertise. The same may be said of the scientist, though
several conceptual obstacles must be overcome before seeing the full
implications. (Fuller 1992, p. 414)
116. See e.g., Janik 1988, 1990; Cambrosio and Keating 1988; Rolf 1991; Pleasants 1996;
and Molander 1996.
117. See e.g., Ferguson 1992; Goldschmidt 1994; and Henderson 1995.
Accordingly, the philosopher Bertil Rolf claims that tacit knowledge
has become a magic concept of honour that may function to conserve
current practice. As he points out, it may be difficult to discriminate
between 'tacit knowledge' and 'tacit intolerance' – or for that matter
between 'tacit knowledge' and 'tacit stupidity' (Rolf 1991, p. 53). Thus,
does the apprentice internalise expert knowledge or deep-rooted
inadequate patterns of behaviour? The German philosopher of pedagogy
Erich Weniger (1990) has in a similar vein introduced the concept of ‘the
tyranny of practice’ to describe educational theory where, according to
Weniger, practice is held to be self-sufficient (Kvernbekk 1995).
Nevertheless, attempts to solicit, through discursive reflection, the
knowledge on which design practitioners base their designing (in
addition to soliciting knowledge and theories from 'within' artefacts118)
might possibly produce more relevant and interesting data than yet
another of the countless legibility experiments. Macdonald-Ross and Smith,
and Macdonald-Ross and Waller, proposed that designers' under-
estimated knowledge should become the starting point for 'more fruitful
typographic research'; they were thus far from advocating the abandonment
of theory and research. I quote extensively:
How well ‘pre-scientific’ crafts solved difficult technical problems
without the benefit of formal empirical tests! This is something of a
mystery, but a mystery which can be dispelled by learning to use our
eyes! The eye is a stupendous device for making comparisons; we must
entertain the possibility that an expert craftsman can by the intelligent
use of his eyes reach better solutions to particular problems than the
researcher in his laboratory. (Macdonald-Ross and Smith 1977, p. 40)
Any casual acquaintance with typographers and designers will show
that their skills are backed by tacit knowledge which is the product of
experience rather than empirical tests or ‘book learning.’ Usually
researchers dismiss such personal knowledge or even deny its existence
simply because it has not been developed by orthodox experimental
methods. But actually, as Polanyi (1962) has shown, all scientific
knowledge (indeed all human knowledge) has its roots in the ‘tacit
dimension.’ … So we should value the personal skills of typographers
and designers, and take them as the starting point for more fruitful
typographic research. (Macdonald-Ross and Waller 1975, pp. 77–78)
118. See Carroll and Campbell 1989. See also chapter 6, ‘Discussion: knowledge
production and technical rationality’.
Tinker showed that too little or too much ‘leading’ (nowadays, inter-
linear spacing) would affect legibility adversely. Fair enough. But this
kind of knowledge has been the stock-in-trade of the master-printer for
centuries! So is empirical research just a kind of ‘aid to rhetoric’
whereby educated people (who don’t understand or won’t accept the
experiential basis of craft knowledge) will take heed of advice if it has a
'scientific' seal of approval? (Macdonald-Ross and Smith 1975, p. 42)
However, Macdonald-Ross does not want to throw the baby out with the
bath water:
Tinker’s work on the cumulative effect of suboptimal arrangements and
his ideas about the ‘hygienic reading situation’ are both relevant and
practical. Imagine a text full of conceptual muddle, unclear expression,
visual confusion, and suboptimal setting being read by someone who is
tired and disinterested, sitting in poor light, with the text not at right
angles to the line of sight, and a noisy room full of interruptions and
distractions. This gives you some idea of a real-life reading situation
(we might get fewer no-difference results in conditions like these).
Suboptimal settings may not produce large effects when tested one by
one in clinically perfect reading conditions, but they do produce most
significant effects when several occur together and the text is read
under poor conditions. (Macdonald-Ross 1978, p. 77)
We see here two epistemologies (knowledge from controlled experiments
vs. knowledge from personal trial and experience) and two tasks
(universal knowledge vs. particular solutions). The question is, shall
these differences continue to cause misunderstanding, or can they be
mutually supportive? (Macdonald-Ross and Smith 1975, p. 43)
Not only the notion of expert practitioners’ tacit knowledge but also
a notion of careful common sense reasoning is central to Macdonald-Ross’
argument.
Also, reflection (thinking) is a powerful research tool, much under-
estimated in my opinion. (Macdonald-Ross 1989, p. 147)
Thus, Macdonald-Ross is not advocating (as a platform for practice)
the use of layman’s ‘contradictory words of wisdom’, as in the caricature
of non-scientific knowledge that often appears in introductory social
science textbooks (see Furnham 1988, p. 24). Macdonald-Ross is
advocating the utilisation of professional expertise and 'common sense' in
the sense of careful reflection, which can then be brought together with research
methodologies for mutual benefit.
To put this in perspective: aphoristic 'words of wisdom' can be
viewed not first and foremost as silly contradictory 'folk wisdom', but as pragmatic
'theories' that are valid in certain circumstances. Similarly, theories,
findings and guidelines of ‘scientific psychology’ are also often
contradictory and only valid under certain circumstances (see Kelley
1992, p. 14). With regard to Macdonald-Ross’ insistence on the
importance of careful common sense reasoning, several psychologists
have during the last few decades voiced similar concerns, some of whom
have gone one step further and claimed that the technical language of
scientific psychology is in fact often nothing but jargonized common
sense.119 In 1974 psychologist R.B. Joynson cogently formulated this
kind of criticism the following way:
it is altogether too easy to assume that, when the psychologist’s conclu-
sions run counter to common sense, it is the psychologist’s conclusions
which are correct and common sense which was wrong; and it is
altogether too easy to assume that, when the psychologist’s conclusions
agree with common sense, what was previously only guessed has now
been reliably established. These assumptions are not necessarily
justified. … The layman’s understanding, though often imperfect, is not
to be universally dismissed as intuitive guesswork, necessarily inferior
to the special methods of the scientific psychologist. On the contrary, the
layman’s conclusions may well be based on long and varied experience,
frequently interpreted, of course by a highly trained experience.
Experiment in psychology, by contrast, typically operates over short
periods of time, in very restricted environments, and on narrow
segments of behaviour. It would not be surprising if common sense often
proved to be as reliable as experiment, and sometimes more reliable.
So we should be prepared for the possibility that when the psychologist’s
conclusions differ from common sense, it is common sense which is
correct; and also for the possibility that, when the psychologist’s
conclusions agree with common sense, it is the psychologist who has
made the lucky guess. (Joynson 1974, pp. 8–9)
119. See Lee 1988, pp. 17–26; Furnham 1988, pp. 22–46; Kelley 1992; and Harré and
Gillett 1994, pp. 5ff.
Lack of internal validity
Legibility research has also been accused – very much on its own terms –
of inappropriately relying on single variable laboratory experiments.
That is, on experiments where only one typographic variable120 is
manipulated as the independent experimental variable while other
typographic variables are 'controlled' and treated as invariants –
whereas in the real work of typographic design a large number of
typographic variables are interdependent and interact simultaneously in
a complex way.121 The experiments may thus become confounded by
‘nuisance variables’ which influence the experimental ‘dependent’
variable as much as or more than the experimental ‘independent’
variable (thus allowing for plausible alternative interpretations of the
results). There is a strong case for claiming that most legibility
researchers have simply been unaware122 of this specific threat to
internal validity, residing in the ‘stimulus material’.
Take for example Pedersen and Kidmose’s experiment set up to
measure the importance of the x-height for the relative legibility of
typefaces (1993; reviewed by the writer of this thesis, i.e. Lund 1995).
They measured reading speed of text set in ITC New Baskerville with its
relatively large x-height and Monotype Baskerville with its relatively
small x-height. However, the two Baskerville typefaces are, as is clearly
shown by an illustration in their study,123 two rather different typeface
120. Typographic ‘variables’ have often (perhaps most often) been referred to as ‘factors’
in the literature. Although the word ‘variable’ is ambiguous and used differently in
various contexts, it seems to me that the connotations yielded by the ‘variable’
metaphor better convey and reveal the functional plasticity of interdependent
typographic variables than the ‘factor’ metaphor.
121. See Buckingham's early and poignant paper on this issue (1931, pp. 103–106). He
also proposed, at least in principle, a solution to get around the problem. See also
Carmichael and Dearborn 1947, pp. 113–116ff; Tarr 1949, p. 31; Cheetham, Poulton,
and Grimbly 1965, pp. 103–104; Wendt 1972; Watts and Nisbett 1974, pp. 38ff;
Burnhill and Hartley 1975; Macdonald-Ross and Waller 1977; Wright 1978, p. 291;
Rubinstein 1988, p. 176; Naiman 1991, pp. 14–15, 132–133; Waller 1991; Macdonald-
Ross 1994, p. 4691.
122. ‘Hidden variables are ignored because they are not perceived.’ (Ions 1977, p. 4).
123. In some cases, researchers do not even bother to specify the values of important
typographic variables (such as interlinear spacing); and illustrations that could at
least have provided the opportunity for visual inspection of the stimulus material, are
also omitted in the published material.
designs. It is thus difficult to say whether the reported differences in
reading speed are caused by the different x-heights or by other design
differences. Furthermore, since the same nominal interlinear space is
used for both texts, it is hard to tell whether the differences in reading
speed are caused by the different x-heights or by the different ratios of
x-height to interlinear space (a possible optimal versus a possible non-
optimal ratio). Thus, we have two typographic variables that
unintentionally vary systematically with the independent experimental
variable.
This example suggests that a solution is not as easy as proposed by
Dirk Wendt (1969, p. 6; 1994, pp. 295–296). He suggested the creation of
two nearly identical typefaces (by computer manipulation) where, 'under
completely controlled conditions', only the parameter to be tested varies
while all other parameters are constant. If we look at Pedersen and
Kidmose’s experiment, we realise that Wendt’s proposal eliminates one
part of the problem: the x-height vs. other design differences. However,
we also realise that we are still stuck with the confounding problem
posed by the different ratios of x-height to interlinear space.
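To make the confound concrete, the following schematic uses purely hypothetical x-height values (they are not taken from Pedersen and Kidmose's stimulus material) for a nominal interlinear space of 12 pt:

\[
\frac{x_{\text{large}}}{l} = \frac{4.8\,\text{pt}}{12\,\text{pt}} = 0.40
\qquad
\frac{x_{\text{small}}}{l} = \frac{4.1\,\text{pt}}{12\,\text{pt}} \approx 0.34
\]

With the nominal interlinear space l held constant, any difference in x-height x necessarily produces a difference in the ratio x/l as well, so the two candidate explanations cannot be separated within such a design.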
Recently, a Dutch student of psychology, in a dissertation on the
merits of legibility research, poignantly suggested:
Perhaps researchers need to consider that the ultimate aim of
typographic design is not to optimize the usability of every single
typographic aspect. This is simply impossible, because all aspects need
to be balanced and optimizing some will leave a choice of non-optimal
options for the other aspects. (Wijnholds 1996, p. 84)
This thoughtful remark by Wijnholds has only one minor weakness: it
implies that there really exist optimal options for ‘each aspect’,
regardless of context. However, the whole point about interdependent
interacting typographic variables is that isolated optimal options just do
not exist. As Buckingham pointed out back in 1931: ‘No length of line, for
example, can be said to be most desirable, independent of … other
characteristics of the type page.’ (p. 103).
Peripherality to the reading process
One of the most serious objections that has been raised against
experimental legibility research claims that 'factors' such as typeface
are very much peripheral to the reading process, and thus that it will be
very difficult to quantify any meaningful differences in legibility, for
example by employing an operational method which measures the speed
of reading. Put another way, the typography of reading material on
a micro-variable level (as long as it is reasonably legible) can only play a
peripheral part in the whole complex process of reading.124 This may
explain the many results in legibility studies that show little or no
difference. David Sless puts it the following way:
It is therefore hardly surprising that experiments conducted to measure
differences in typographic variables as they relate to reading should be
so disappointing. The subjects are at best operating at the fringes of
their normal awareness of the reading task as it is socially defined. Put
another way, the reading schema is one which ignores typography.
(Sless 1981, p. 170)
A somewhat different argument has been made that supports this
view: the reading researcher Ronald Carver points out that the reading
rate (i.e. of 'normal reading', as opposed to other kinds of reading
processes, or what he calls 'gears of reading', such as 'scanning',
‘skimming’, ‘learning’ or ‘memorising’) is more or less constant for an
individual and is only limited by their comprehension or ‘thinking rate’.
He suggests that individuals move their eyes at the fastest rate they can
think, which he refers to as their 'cognitive speed' (Carver 1992).125
124. Although rarely seen, this point has been made by several researchers and
typographers; among them, W.C. Ruediger as early as 1907; R.L. Pyke in 1926
(pp. 60–61); Tempte 1961, p. 6, while addressing an audience of prominent
Scandinavian book designers, educators, and printing industry leaders; David Sless
1981 (p. 170); Morris, Berry and Hargreaves 1993; Wijnholds 1996, p. 32; Saenger
1997, p. 20; and not least, albeit obliquely and within a slightly different context, by
Andrew Dillon in the chapter ‘Describing the reading process at an appropriate level’,
in his Designing usable electronic text. However, this objection does not necessarily
imply a rejection of the need carefully to design, modify, or choose, appropriate
typefaces for specific applications (as well as the need carefully to manipulate other
interdependent typographic variables; be it on paper, on screen or on any substrate),
on the basis of aesthetic, craft-based, and ergonomic considerations.
125. See similar suggestions in Zachrisson 1965, p. 23; and in Nell 1988, pp. 88–91.
Patricia Wright formulates a similar concern the following way:
‘The information processes concerned with the sensory information on
the printed page are dominated by higher, conceptual and interpretative
levels of analysis’ (Wright 1978, p. 277). In fact, as early as 1907 the
researcher W.C. Ruediger concluded that: ‘reading rate is in the main
determined centrally by the rapidity with which meaning is aroused
after the words are seen’ (Venezky 1984, p. 10).
Similarly, in a recent article in Information Design Journal, Paul
Stiff, while discussing the German design company Metadesign’s work
for the public transport system in Berlin, points out that the 'legibility'
(in the traditional sense) of the lettering of the transport system's
information and wayfinding signs has little to do with how effective
these signs are and how well they answer the questions people
ask when trying to navigate with their help (Stiff 1993b,
pp. 44–45).
While pointing out that there is no easy way to speed up the reading
rate, Carver nevertheless admits that there are many ways to slow down
the reading process. He explicitly refers to an environment of high
constraints and excessively low typographic quality, ‘such as using dim
lighting, dot matrix printers, poor handwriting, or poor screen contrast’
(1992, p. 93). This last point is in line with a point made by Macdonald-
Ross: 'We may well suspect that "no difference" results would be less
frequent in substandard reading conditions' (1977, p. 45), and with one made
by Robin Kinross (though hardly expressed in the literature of legibility studies):
'Prospects for useful enquiry are greater than with ordinary printed
texts in these areas of high constraint and low typographic quality'
(Kinross 1992, p. 142).
These points of view were also anticipated by Pyke in 1926.
I deliberately quote extensively:
The hypothesis is here put forward that extremely large typographical
differences must be present before it is possible to say that there is any
difference in the objective legibility of types. Of types more slightly
differentiated it is impossible to say that one is objectively more legible
than another. … These two ideas – that within limits there is no
objective optimum, and that only extremely violent changes in the
stimulus can produce any significant variation in the reaction – are
found in other psycho-physiological fields. … Another factor which clogs
the influence of objective legibility is the ease with which reading habits
are established. It is partly because this is operating without cease …
that the typographical difference must be large in order to invoke a
change in the reaction. (Pyke 1926, pp. 60–61)
Some sixty years later, Richard Rubinstein, in his textbook Digital
typography, while referring to the extensive work of Tinker, points out
that Tinker’s work ‘supports the idea that individual parameters can be
varied within ranges without great detriment to legibility, but that
severe degradation sets in outside this range' (1988, p. 177). However,
Rubinstein also points out that typographic variables behave differently
together than when viewed in isolation, and thus that several modest
non-optimal factors may cause problems when they are added together
(ibid. p. 176). The reading psychologists Keith Rayner and Alexander
Pollatsek conclude that ‘the fundamental conclusion to be drawn from
the work on typography is that reading appears to proceed at about the
same rate if the type font, size, and length of line are at all reasonable’
(1989, p. 119). But again, this ‘insight’ is hardly surprising.
To round off, the 'peripherality of the reading process' perspective
can also be tied to the often expressed complaints about the lack of
‘significant’ results and about the heterogeneous results of legibility
research.126 However, as Pedhazur and Schmelkin point out, hetero-
geneous results are not uncommon in sociobehavioural studies – 'the
only consistent result is the inconsistency among results’ (1991, p. 186).
Lack of theory
Several legibility researchers and commentators have pointed out what
they perceive as a serious problem with legibility research – that it has
been atheoretical, lacking theories127 guiding research and failing to
126. For such complaints, see for example Whittemore 1948, and Schriver 1997.
127. Kantowitz, while pointing out that human factors research is conducted to answer
pragmatic questions, nevertheless argues cogently for the usefulness of theories in
human factors research (1992, pp. 389, 393).
generate post-hoc theories.128 Foster, although hopeful for the future,
pointed out that legibility research has been accused of having generated
neither much of practical significance nor any theories (1980,
p. 77). Venezky describes the situation the following way:
Legibility studies ... have been and remain totally atheoretical. No
conceptual framework for vision or reading has been invoked for guiding
legibility research; no serious legibility models have been proposed, and
consequently no theories of legibility exist. Yet careful and highly useful
studies have been done on the relative legibilities of almost every
printing variable from type style to page layout. (1984, p. 4)
In this context, it is nevertheless important to point out that the focus of
Tinker's and other legibility researchers' work was not primarily to
understand or explain human behaviour or cognition (or how human behaviour
could be adapted to the environment) but how the environment should be
adapted to the psycho-motor and cognitive characteristics of humans, and in
that perspective one may say that Tinker's approach was to discover simple
and applicable 'functional laws' based on empirical research (cf. Valentine
1982, pp. 91–92). Tinker's concern was improved design of printed material and
practical considerations of typographical variables in order to achieve the
improved design. In an apologetic way, Sandra Sutherland (1989, p. 60) argues
along the same lines as Valentine’s general argument. She also clearly
demonstrates that Tinker ‘expected his work to be applied by typographical
practitioners’ (Sutherland 1989, p. 3). This point is certainly confirmed by the
subtitle of Tinker and Paterson’s first book summarising legibility research:
How to make type readable: a manual for typographers, printers and
advertisers (i.e. Paterson and Tinker 1940). Venezky concludes:
Research on legibility … represents one of the most curious to render
into psychological terms because its goal is not an improved
understanding of human behavior but the improved design of printed
materials. For this reason, legibility studies have been derived from
practical considerations of typographical variables (type face, type size,
line length, etc.) rather than from hypotheses about visual or psycho-
logical processes. Consequently, these studies have not produced
theories (interesting or otherwise) about either reading or the effects of
typographical variation on psychological or physiological abilities.
(1984, p. 22)
128. Foster 1973, p. 23; 1980, p. 77; Burnhill and Hartley 1975, p. 72; Sutherland 1989,
p. 108.
'The hypothesis of habit' – empiricism vs.
rationalism
In the 1960s Dirk Wendt suggested the ‘hypothesis of habit’, or simply
that we read best what we are most used to (Wendt 1970a). Many others
before him (and after) have made similar proposals, among them the
type designers Eric Gill and Adrian Frutiger, and legibility researchers
like Pyke, Tinker, and Burt.129 In fact, this ‘habituation’ point of view
seems ubiquitous in contemporary graphic design discourse where
legibility is mentioned.
This point of view has recently been made by the scholar and
'traditionalist' typographer John Dreyfus (1994, pp. 284–285), and it has
also been proclaimed prominently by the 'non-
traditionalist' type designer Zuzana Licko in the avant-garde-like
graphic design magazine Emigre (Licko 1990, p. 13). The implications of
this thesis are many and interesting and worth discussing – although
the overall validity of the 'hypothesis' is far from clear. Licko's statement
has been frequently cited since it first appeared.130
It’s the same with Blackletter, which was at one point more legible to
people than humanist typefaces. That’s a shocker. I agree with the fact
that if you are setting books and other things that just need to be read
and understood easily, you need to use something other than Oakland
Six. In those cases you need to use something that is not necessarily
intrinsically more legible, but that people are used to seeing. This is
what makes certain typestyles more legible or comfortable. You read
best what you read most. However, those preferences for typefaces such
as Times Roman exist by habit, because those typefaces have been
around longest. When those typefaces first came out, they were not
what people were used to, either. But because these faces were
129. See Gill 1936, p. 44; Eurographic Press 1962, p. 260; Pyke 1926, pp. 60–61; Paterson
and Tinker 1932, p. 612; Burt 1959, p. xii; Burt 1960a, p. 278. See also Dreyfus 1964,
p. 65; Lane 1966; Prince 1967, pp. 37f; Wendt 1969; Kinneir 1971, p. 8; Lee 1979,
p. 72; Reynolds 1979, p. 317; Turnbull and Baird 1980, pp. 87–88; Reynolds 1984,
pp. 197–198; Silver, Kline, and Braun 1994, p. 824; Schriver 1997, p. 301; Kostelnick
and Roberts 1998, p. 142.
130. For example by Pedersen and Kidmose (1993, p. 52), MacKenzie (1994, p. 12), and
Birkvig (1995, pp. 10–11).
frequently used, they have become extremely legible. Maybe some of my
typefaces will eventually reach this point of acceptance, and therefore
become more legible two hundred years from now, who knows? (Licko
1990, p. 13)
A similar statement has been repeated in a mantra-like manner, as
a typeface specimen text, in multiples on almost every page in the many
product and typeface catalogues issued by the Emigre company during
the 1990s:
Typefaces are not intrinsically legible; rather, it is the reader’s
familiarity with faces that accounts for their legibility. Studies have
shown that readers read best what they read most. Legibility is also a
dynamic process, as readers' habits are ever changing.
While issuing a new typeface in 1996, Licko used the opportunity to
once again reiterate her familiarity thesis. The typeface in question was
Mrs Eaves, a free interpretation of one of the famous 18th century
English typefounder and printer John Baskerville’s typefaces; typefaces
which in the 20th century have been copied or interpreted by many
type foundries, while bearing the name of their originator, Baskerville. The
revived Baskerville typefaces have been, and still are, among the most
successful text typefaces in the 20th century. In a promotional booklet for
Mrs Eaves, Licko refers to the classic story often told or alluded to in
contemporary graphic design discourse about the negative reception of
Baskerville's typefaces: that his contemporaries declared his
typefaces illegible and damaging to their eyesight. Since the
Baskerville typefaces have received such respectability in the 20th
century, Licko sees this as proof of her stance, and she explains:
This illustrates once again that readers’ habits do change in time and
are influenced by repeated exposure to particular typefaces, more so
than by any measurable physical characteristics of the typefaces
themselves. (Licko 1996b, p. 2.c).
However, in a review of the typeface family Mrs Eaves in Print,
critic Paul Shaw points out that the reputed ‘illegibility of Baskerville’s
types may be largely mythical’.131 Shaw’s point of view is supported by
the text in Daniel Berkeley Updike’s classic Printing types, and the text
131. Shaw (1996, p. 28D); while referring to F.E. Pardoe’s biography of John Baskerville
(1975, p. 18).
in James Hutchinson's Letters, both of which Licko quotes in her booklet.
Neither source in fact says that Baskerville's types were generally
received as 'illegible'.132
The habituation point of view is in one respect a banal truism. It is
obvious that the quality of text can vary considerably, from chunks of
text set in barely decodable experimental alphabets and more or less
illegible handwriting, to comfortably read text set in standard fonts or
relatively legible handwriting. It is of course possible to familiarise
oneself with experimental alphabets and ‘illegible’ handwriting (or to
overcome the unease we temporarily feel when our favourite newspaper
changes its layout). However, whether such familiarisation parallels the
proclaimed effect of familiarisation among all reasonably standard and
rather similar typefaces is quite another question.
Maureen MacKenzie of the Communication Research Institute of
Australia makes Zuzana Licko's argument her own and adds that some
'recent testing of documents' at the institute (apparently including
testing on serif vs. sans serif typefaces) suggests that legibility is not
'biologically determined' as 'earlier studies suggested' but 'culturally
learned and sub-culturally developed', and that today, in contrast to a
few decades ago, people are as accustomed to sans serif typefaces as they
are to serif typefaces (MacKenzie 1994, p. 12). Birkvig, at the Graphic
College in Copenhagen, although far more reserved, claims that this
point of view is 'at least partly scientifically documented' (1995, p. 10).133
However, in spite of Licko’s claim that ‘studies have shown that
readers read best what they read most’ and similar statements from
MacKenzie, Birkvig, and others, the many statements
about the role of habituation to typefaces and its effect on reading
performance seem to lack empirical verification in the literature.
Maybe it is a question that cannot be solved empirically. The many
statements proclaiming this view are, as far as I can see, a priori
statements, or at best attempts to explain a posteriori aspects of research
132. See Updike [1922] 1962 (pp. 113–114); Hutchinson 1983 (p. 184); and Licko 1996
(pp. 2.b–2.c).
133. See also Twyman (1970), who claims that ‘Research has shown that we … read most
easily, those types we are most familiar with’ (p. 112); and Gribbons who claims that
‘legibility studies tend to dispute the neurological hypothesis’ (1991, p. 44).
findings that may point in such a direction.134 The only apparently
substantial exception I have come across is in one of Dirk Wendt’s
studies (published in 1969 and again in 1994) where on the basis of
seven empirical studies he describes a trend that support his ‘hypothesis
of habit’ with regard to typefaces. However, when we look into the details
of the data he relies on, it turns out that his claim is unsubstantiated.135
Mark Allen Williams represents another exception to the omni-
present a priori claim that legibility is merely a function of familiarity.
He suggests that the question is something that warrants further inves-
tigation, and he simply suggests a replication of Paterson and Tinker’s
1932 study in order to test their familiarity assumptions (1990, p. 11).
What can be said for sure on the familiarity question is that it is
firmly rooted in a long-standing philosophical controversy between
rationalism and empiricism. It is also expressed in other ways: nature
vs. nurture; innateness vs. cognitive adaptation, or biologically
determined factors vs. environmentally and culturally determined
factors. The claim that aspects of normal vision are to a certain extent a
function of visual exercise, and thus that activity shapes structures in
the nervous system (cf. e.g. Kolers 1983, p. 54), is plausible. However, it
is also plausible that there are certain physiological and psychological
characteristics and limits in the psycho-motor and cognitive system of
human beings. Where a possible ‘correct answer’ lies on the continuum
between these two positions, and more importantly, whether a possible
habituation to typefaces operates on the same physiological and
psychological levels as this continuum or not, are still open questions.
134. Without going into detail: Prince 1967 (pp. 37f), and Watts and Nisbet 1974 (pp. 32–33;
relying on Prince 1967), based on empirical research among 'visually
handicapped' readers, present a posteriori explanations of research findings in favour
of the familiarity thesis; while Coghill's empirical findings from a reading experiment
among school children go contrary to the familiarity thesis (1980).
135. See the section 'Wendt 1969', in chapter 5, 'A review of empirical studies'.
Critique from design practitioners
Many designers have also voiced their scepticism about legibility
research; or alternatively, scepticism has been voiced by non-designers in
design magazines. These voices may represent broad currents among
reflective designers. A few of these are documented below, while Sandra
Sutherland quotes some other voices of the same kind (1989,
pp. 95100). Recurrent topics are about lack of validity, the problem of
habituation, and design practitioners’ tacit knowledge. Other topics are
that ‘legibility’ is an unstable concept, that attention to legibility
research inhibits creativity, and that legibility research does not make a
difference.
As early as 1942, Carl Purington Rollins, the Printer to Yale
University, one of the central exponents of ‘fine printing’ in America, and
a friend of leading lights such as Daniel Berkeley Updike, Bruce Rogers
and William Addison Dwiggins, articulated his scepticism about
legibility research the following way:
There has been a deal of nonsense written about the legibility of type,
the proper length of line and the ‘ideal’ size. I once heard a psychological
adept dogmatize on the subject of the proper type for children’s books,
asserting that ten point type and 88 mm. were the correct size and line.
The trouble with the attempt to decide upon the most legible or
readable type face is that there are too many factors involved to be
controlled by any system of valuation so far devised. The trained eye of
the skilled printer is more to be depended upon than all the
investigations so far conducted. (Rollins 1942, pp. 28–29)
Only a few years later, a similar concern was voiced in the
American graphic design magazine Print. There, [Dr] Irving C.
Whittemore, in a belated review of Paterson and Tinker’s 1940 How to
make type readable, chose satire as a tool for questioning the relevance of
legibility research. With a fine sense of humour he staged a fictitious
conversation between Bruce Rolldike [sic], 'director of the University
Press', and professor Tinkerson [sic], 'of the psych department'. Their
conversation opens with Rolldike greeting Tinkerson: 'Ah there,
Tinkerson, how's the psycho-ing?' However, although the conversation is
satirical, it raises serious questions about the relevance of legibility
research. In the latter part of his article, Whittemore leaves the satirical
conversation and focuses on the many different operational methods and
the heterogeneous results of legibility research. He rhetorically asks
what is meant by ‘legibility’:
Do you mean (1) easy to read fast, (2) easy to read at a distance, (3) easy
to read in dim light, (4) easy to read when you haven’t your glasses, (5)
easy on the brain, (6) not tiring to the eyes, (7) possible to grasp in big
gulps of meaning, (8) pleasant to read, (9) inviting to the eye, or (10)
something else? (Whittemore 1948, p. 36)
Whittemore leaves little honour to legibility research, and he
suggests that an operational measure like 'speed of reading' does not
have much relevance when the many different contexts of reading, as
well as aesthetic considerations and age-old craft knowledge, are taken
into consideration.
Soon after, Tinker and Paterson replied in the same journal to
Whittemore’s provocation. Whittemore’s complaint is referred to as ‘an
excellent exposition of the scientific complexities of the problem of
legibility' and a 'laudable attempt to outline the complexity of the problems
involved in legibility'. However, no concessions are made. Whittemore's
complaint about legibility research is brushed off and rhetorically turned
on its head as only confirming the complexity of the problem.
Unfortunately, insight concerning one’s own ignorance of the complexity
of the problem is rarely encountered in the practical printer or the type
designer. For this reason, practical printers and type designers are
likely to disregard findings from the research laboratory or at best to
accept only those findings which happen to coincide with their own
beliefs derived from experience or from erroneously designed laboratory
studies. (Tinker and Paterson 1949, p. 61)
Tinker and Paterson also used this opportunity to reiterate their
dismissal of many of the operational methods that Whittemore mentions.
They left it to those researchers who employ other methods than the
ones they preferred136 to answer for themselves:
We hope the visibility meter and eye blink boys will give their answer
and that the printers who follow the manuals of style which have come
down from the pre-scientific era of printing practice and the type
136. The methods preferred by Tinker and Paterson are: speed of reading, eye movement
recordings, and reader preferences.
designers new and old will, if they can, likewise give answer. (Tinker
and Paterson 1949, p. 61)
At the same time, on the other side of the Atlantic, John C. Tarr,
who had directed the Monotype Corporation’s training program during
the 1920s and had been a colleague of the influential Beatrice
Warde,137 in the article 'A critical discursus on type legibility' published
in Penrose Annual of 1949, also voiced critical remarks on legibility
research (Tarr 1949).138 He acknowledged that ‘The work already done
by the scientist in the field of visibility and readability ... deserves closer
recognition from the printing industry.’ (p. 31). However, one of his
points is that instead of narrowly focusing on the comparative legibility
of different typefaces it would probably be better to focus on certain
typeface variables such as x-height, typeface 'colour', and a–z width, and on
typographic variables such as line length and interlinear spacing
(‘leading’). Tarr, however, is clear in his opinion when it comes to sans
serif typefaces: ‘Experiments with sans serifs and egyptians have value
only in relation to their use as display types, not for continuous text.’
(p. 31).
In 1962 the celebrated Swiss type designer Adrian Frutiger
expressed his scepticism about legibility research, in an interview
published in the trade journal Print in Britain:
As for the legibility of the character, this is to my mind a fictitious
problem. I do not believe in tests of which one is more contradictory
than the other. Certain statistical evidence shows, for example, that
Bodoni is the most legible character [sic]; a little later, identical figures
are produced to prove that Bodoni is the least legible of all the
characters. Which are you to believe? I learnt to read with gothic
characters and I never experienced the slightest difficulty. I think
legibility is solely a matter of habit, and speed in reading depends not so
much on the speed of the eye than on that of the mind. (Eurographic
Press 1962, p. 260)
137. See Badaracco 1995, p. 82.
138. Another article by Tarr, published two years earlier in Schweizer Graphische
Mitteilungen, carrying the title 'Legibility in printing', is (mainly) not about legibility
research (Tarr 1947). This article offers craft-based a priori advice and reasoning
about the physical properties of typefaces and legibility. The approach of this article
is not unlike the approach taken by Alison Black (1990) in the chapter 'Legibility' in
her Typefaces for desktop publishing. That is, in fact claiming the right to discursively
use the term 'legibility' without marrying it to empirical experimental research.
In 1964 typographer John Dreyfus, in a letter to the editor of the
British journal Design – while commenting upon a piece of legibility
research carried out by Christopher Poulton and published in an earlier
issue of the magazine139 – expressed his scepticism about legibility
research. Poulton’s study involved several roman (serif) typefaces and
the sans serif typefaces Gill Sans and Univers. Dreyfus had been heavily
involved in launching and marketing Monotype’s version of Univers, and
it may be that he was offended by the result where Gill Sans (also a
Monotype face) came out as more legible than Univers. However, his
argument is not unreasonable, pointing out the conflict between single
variable experimental designs and interacting typographic variables:
The legibility of a type depends not only upon the design, as supplied by
the manufacturer, but on the relation of line length to size of type and
degree of leading. These variables, added to the different habits of
different nationalities, make it extremely difficult and costly to devise a
valid set of tests. (Design, no. 188, 1964, p. 65)
A couple of years later, in the newly launched Journal of
Typographic Research, the Belgian typographer and educator Fernand
Baudin expressed scepticism about legibility research, while reviewing
Tinker’s latest book Bases for effective reading:
all typographers en bloc, whether expert or not, are presented merely as
introspective aesthetes deserving, on the whole, of contempt (pp. 115,
125, 135, 136, 183). This is a pity. … Expert typographers in fact do
exist, and it is only fair to add on their behalf that there is little if
anything in Professor Tinker’s exposition that goes to counter their
theories or practice. Indeed, it is a matter for regret that the author,
who is also the typographic designer of his book, did not deem it worth
his while to explain how it comes to be that his ‘scientific’ typography so
much resembles many other ‘merely’ typographic designs in the current
book production on the continent as well as in the U.S.A. This is not to
say that all books are well designed or that Professor Tinker’s book is
not a good example of Aldine sobriety. (Baudin 1967a, pp. 204–205)
In a later issue the same year, Baudin simply claimed that legibility
research had run into a blind alley (Baudin 1967b, p. 374).140
139. See the section ‘Poulton 1965’, in chapter 5, ‘A review of empirical studies’.
140. A counterargument by Jeremy Foster appeared in the subsequent volume of the
journal (Foster 1968).
Around the same time, the trade journal Printing Technology
presented three articles on legibility in one issue. In addition to
contributions from E.C. Poulton and Herbert Spencer, the issue
contained an article by lettering artist and type designer David
Kindersley on the legibility of typefaces. Part of this article is a reaction
to Poulton’s recently published test involving Gill Sans and Univers.
Kindersley claimed that inter-character space is one of the most
important variables determining the quality of a typeface, but that it is
never taken into consideration by legibility researchers. According to
Kindersley’s introductory remarks much legibility research is of ‘great
interest’ but needs ‘interpretation in the light of typographical common
sense if it is to be useful’. Later on in the article however, he is less
diplomatic:
I am inclined to think that research into legibility has been of little use
so far. Many designers have stated such feelings. Perhaps only in
extremes of size such as letters for road signs or labels on patent
medicines can it be of assistance, though it is generally ignored; but
then it is more a question of visibility (Kindersley 1968, p. 70)
Kindersley, conscious of his role as a craftsman and artist, also had
another concern which he stressed:
I am rather frightened by all this talk about legibility. I think we are
trying to straighten out the problem at the wrong time. The alphabet
must be allowed to develop. We may be wasting time comparing very
similar objects without clear-cut results. In other arts we don’t as a rule
apply tests, because we know instinctively that standards could result.
We don’t want standards, because they prevent evolution. We need,
above all, by continuous experiment, to avoid fossilization. (Kindersley
1968, p. 69)
Kindersley even anticipated Joel Roth who in 1969 pleaded for
‘typography that makes the reader work’ (Roth 1969, p. 193), and this
idea’s many later manifestations in typographic discourse as well as its
manifestations in designed artefacts:141
It is often said that it takes time to get into a book. Why should it not
take a page or two to get into the typography of a book? It could be equally
rewarding. (Kindersley 1968, p. 70)
141. For some critical comments on this idea, see Stiff 1993a, and Poynor 1999.
In 1979, in the American graphic design magazine Communication
Arts, the designer George E. Mack142 not only voiced concerns similar to
those above, but also clearly expressed what has undoubtedly become a
widespread opinion among designers:
Personally, I think anyone who claims ‘greater legibility’ in more than a
rough sense has a tough fight coming – educators and psychologists are
still slugging out what this term means. The basic concept is so tangled
up in decipherability, pattern recognition, reading speed, retention,
familiarity, visual grouping, aesthetic response, and real life vs. test
conditions that contradictory results can be obtained for the same type
faces under different test conditions. And if that doesn’t send you
reeling, there is still the theory that if you make readers slow down and
work a little harder they pay more attention and remember it longer.
Research seems to find that people can read something (1) if they’re
familiar with the style, (2) if they want to, or have to, and (3) if the
letterforms are even minimally recognizable – subject to your test
conditions. (Mack 1979, p. 96)
Ruari McLean included a chapter on legibility in his widely read
manual of typography (1980, pp. 41–48). The chapter first of all presents
an informed common sense discussion on legibility, though not without
strongly biased views. The author also explicitly discusses legibility
research, and his scepticism is voiced the following way:
A great deal of research has been carried out during this century into
the legibility of print, and also into the psychological and physiological
processes of reading and comprehension. I believe that no research so
far published has been seriously helpful to designers concerned with the
design of straightforward reading matter for literate adults, except
insofar as it has, in general, confirmed their practice. Research in
legibility, even when carried out under the most ‘scientific’ conditions,
has not yet come up with anything fundamental that typographic
designers did not already know or believe with their inherited
experience of five hundred years of printing history and their specialized
observation of the civilization in which they live. For example, one
authority, Tinker, had concluded that ‘black print on a white
background is over ten per cent more efficient than white on black’. …
I am glad to hear it, but after even ten years as a typographer it would
never have occurred to me to think otherwise. (p. 47)
However, McLean does not reject research as such:
142. Not to be confused with the Scottish typographic designer Georgie Mackie.
It is in special cases in which the typographer’s work abounds that
research must be respected. Research into the legibility of lettering and
print has been helpful in specialized areas, such as print for children
and for those with poor vision or for handicapped or elderly people, the
design of bibliographical entries, telephone directories, and motorway
signing – where only research, and nobody's guessing, could discover
how tall, and in what forms and colours, letters should be if they are to
be legible at a safe distance by a car driver travelling at speed on a
motorway. (pp. 47–48)
The voice of the British type designer and Linotype manager,
Walter Tracy, echoes both Kindersley and McLean. The context is a
critical discussion of reading research, orthographic and graphological
reform proposals, and legibility research. Here is an extract:
The fact is, though, that so far from being suspicious of research,
typographers are mostly indifferent to it. And with reason. Naturally,
typographers admire the efforts of those who explore ways and means
of helping children and the handicapped to read; and, too, they can see
the case for research into specific matters like the legibility of directory
entries, the details of military maps, the layout of official
questionnaires, the organising of tabulations in a viewdata system, the
design of alphabets for low-resolution print-out devices, and other
particular tasks. … A great deal of research, though, seems to be
produced by academics for the interest not of designers but of other
academics. Their motives are easy to understand: there is the need to
add another title to their list of publications, that being the way to
academic achievement. … For them [designers], research and reform
are occasionally interesting, frequently bewildering, but rarely of any
consequence in the typographic scene as they know it. (Tracy 1988,
pp. 84–85)
Accordingly, but this time from a psychologist: in a harsh review article in Information Design Journal in 1982, David Bartram encapsulated much of the then existing critique of legibility research and the
emerging view of the way forward:
My own feeling is that any designer who picked up this book in order to
see what psychology had to offer him, would come away from it very
disappointed. … for the most part, those situations in which
psychologists manage to obtain so-called ‘significant’ effects are those
which any self-respecting designer would avoid … Perhaps it is time
that psychology started asking different sorts of questions.
Comprehension is a process involving an interaction between a reader
who has a purpose, and a writer who has an intention – this interaction
being mediated by some form of graphic display. … we need to
understand design and designers in terms of process and function, and
not treat design simply in structural terms as a static object of analysis.
(Bartram 1982, pp. 75–76)
And in the same year, the British graphic designer Ken Garland
described his initial enthusiasm and later disappointment with
‘ergonomics’ and legibility research:
In the late ’fifties, inspired by the zeal of Michael Farr, then editor of
Design magazine, I had been totally involved in the championing of
ergonomics, or human factors studies. It had appeared to us that such a
sensible and promising new aspect of the design process, and one in
which we could happily collaborate with psychologists, anthropometrists
and other applied scientists; we fully accepted that ergonomics would
become a vital study for all design work, graphics included. But this
didn’t happen. In one sense, ergonomics was a crusade that failed.
(Garland [1982] 1996, p. 56)
Postmodernist critique
During the last decade or so the notion of legibility per se has come under
attack from what might be called post-structuralist, anti-modern or even
neo avant-garde modernist positions, exposed in influential graphic
design magazines like Emigre and Eye (see for example Dauppe 1991)143
and not least in professional design practice. For some, traditional
notions of legibility is simply rejected as irrelevant, accompanied with a
few recurring themes such as stressing the importance of allowing
artistic self-expression and rejecting the need for a ‘universally’
understood ‘language’ of typographic form. Another recurring theme is
the claim that typography should literally reflect what is perceived as
the fractured and fluxed condition of post-modernity. And furthermore – in a similar way to the idea behind Bertolt Brecht's 'Verfremdungs'-technique – typography should take a liberating and critical stance
against societal authority and thus problematise apparently smooth
143. For comments upon some such positions (projecting ideas about deconstruction onto the material world of typography), see Stiff 1993, and Robin Kinross' pamphlet Fellow readers (1994a).
texts by disrupting them and making them cumbersome and less legible.
This should be done in order to challenge their authority, and to stop the reader so as to make him think and realise that the world is not as smooth as the smooth text purports it to be.
to engage the audience with the text, to make the audience ‘work’, and
to emphasise the ‘construction’ of meaning. Radical typography might
aim, not to flow seamlessly, legibly, but to halt and disrupt, to expose
meaning and language as problematic. … The cry of ‘legibility’ masks a
reactionary attitude. (Dauppe 1991, pp. 6–7)
5 A review of empirical studies
Approach
I have explained in the introduction to the thesis my purpose in
reviewing the empirical studies. As I pointed out there, I have tried to
avoid two possible approaches. Crudely put, one is to treat the empirical
source material principally as referential documentary evidence of
events or historic change. The other is to treat the empirical source
material as simply a more or less valid source of ‘findings’. My approach
has been to foreground the 'combination approach': to treat the source material as documentary evidence (or discursive events, for that matter) while at the same time focusing critically on the source material's scientific validity. To extend the historiographical deliberations in my introduction to the context and reflexive vocabulary of social science philosophy: I acknowledge that I, to some extent and whether desirable or not, appear both as participant and spectator.
‘Participant and spectator’ is the translated title of an essay by the
Norwegian philosopher Hans Skjervheim ([1957] 1996). However, while
Skjervheim sees a crucial difference between ‘participant’ and ‘spectator’,
the French social philosopher Pierre Bourdieu’s epistemological project is
to suspend or go beyond the contradiction between an objectivist
perspective and a subjectivist perspective (Solli 1998). In An invitation to
reflexive sociology (Bourdieu and Wacquant 1992), Bourdieu introduces
the apparently contradictory concept ‘participating objectification’, where
he suggests a methodological orientation that includes both ‘empathy’
and ‘rupture’.144
The German social philosopher Jürgen Habermas advocates a
position similar to Bourdieu’s in his critique of positivistic objectivism in
The theory of communicative action ([1981] 1984, pp. 102–141). While
relying on Skjervheim as a point of departure for his discourse on this
matter, Habermas asserts that although the active presence of a
‘participant-observer’ ‘unavoidably alters the original scene’, a critical
engagement (or ‘performative attitude’) that confronts knowledge and
validity claims in the object domain and that makes accessible internal
interrelations of meaning, is imperative in order to understand and not
merely describe the object domain. This ‘performative attitude’ or
Bourdieu’s ‘participating objectification’ might appropriately describe
the ‘participant’ and ‘spectator’ approach taken in this thesis.
Within such critical engagement at least two approaches are possible. One
approach (‘paradigm internal’) would be to focus on the studies very
much on their own terms, e.g., by critically examining them with regard
to technical aspects such as the quality of their experimental design or
use of statistical methods. The other approach (‘paradigm external’)
would be to apply a kind of discourse analysis and focus upon the use of
rhetorical tropes and discursive strategies.145 However, my approach is
something in between: a relatively close reading, building on whatever criticism the individual pieces of source material invite, while trying to contextualise wherever possible,
not least by including appropriate references to the target domains of
printing, graphic design and information design.
Nevertheless, I do focus on one particular aspect of the legibility
studies in question, the internal validity of the studies presented. One of
the reasons for this, as explained in my introduction, is my assumption
that problems of internal validity reside in the typographic stimulus
144. Bourdieu’s concept of ‘participating objectification’ is also brought to the fore (and
further explained) in the penultimate chapter of this thesis; although in the context of
legibility research in relation to its target domain of typography.
145. For a successful approach of this kind, in the context of human–computer interaction (and information design, for that matter), see Cooper and Bowers' paper 'Representing the user: notes on the disciplinary rhetoric of human–computer interaction' (1995).
material, and that I can draw on my expertise as a typographer to do
this. Thus, I have been attempting in my criticism to mediate between
target domain (printing and graphic design) and the research domain
(legibility studies and human factors).
I am avoiding any critical and thus rather technical examination of
external population validity and statistical conclusion validity. Among
the reasons for not assessing statistical conclusion validity are that statistical conclusion validity can only make sense if internal validity is present, and an a priori assumption that this part of the experiments in question is most often both elaborate and adequate when looked at in isolation. 'Although researchers are expected to apply rigorous standards
of statistical analysis to their data (and publicly demonstrate this
analysis in their reports), their stimulus materials are rarely subjected
to the same scrutiny.’ (Schumacher and Waller 1985, p. 379).
Terminology
When using the term ‘construct validity’ I refer both to the theoretical
definition of legibility, as well as its correspondence with the operational
definition. That is, whether the theoretical definition of the legibility
construct is reasonable or not, and whether methods and techniques
employed really can be said to represent the construct or not.
When I occasionally refer to ‘internal validity’ and ‘external
validity’, I do it, at least as a point of departure, in accordance with the
use of Cook and Campbell (1979), discussed and lucidly laid out by
Pedhazur and Schmelkin (1991). With respect to ‘external validity’ I also
borrow from the exposition of Bracht and Glass (1968).
When using the term ‘internal validity’ I refer to whether or not
there is a causal relationship between the manipulation of the inde-
pendent experimental variable and the observed results. That is,
whether alternative explanations of the observed result are plausible
or not.146
146. See also chapter 1, ‘Introduction’; and the section ‘Lack of internal validity’, in
chapter 4, ‘Critiques of legibility research’.
When using the term ‘external validity’ I refer to whether or not the
results of an experiment are generalisable to other people and situations
(respectively ‘external population validity’ and ‘external ecological
validity’), in other words: the coherence between the test situation and
target situation. ‘Ecological validity’ in experimental psychology is
sometimes called ‘setting representativeness’, and operational fidelity’ in
human factors research.
Berkowitz and Donnerstein (1982) and Kantowitz (1992) vigorously
refute a common criticism that laboratory experiments lack external validity. They argue that it is important to distinguish between realism
and generalisability, and that the importance of realism (physical
representativeness; i.e. the extent to which the test situation mirrors the
physical target situation) ‘has been vastly overrated’ (Kantowitz 1992,
p. 390). Thus, it is argued that external validity does not necessarily
require ecological validity, and that whether it does or not is an
empirical question in each case. Berkowitz and Donnerstein argue that
laboratory experiments cannot be truly representative designs and that
they are artificial by nature. Laboratory experiments, they argue, are
about uncovering causal relations between variables while controlling
irrelevant variables, and thus, that artificiality is the strength of
experiments (1982, pp. 245, 255–256). Nevertheless, Bracht and Glass
point out the obvious and banal fact that: ‘The intent (sometimes
explicitly stated, sometimes not) of almost all experimenters is to
generalise their findings to some group of subjects and set of conditions
that are not included in the experiment.' (1968, pp. 437–438).
With ‘overall validity’ I refer to the relevance or appropriateness of
research when viewed from a reflexive standpoint 'outside' the realm of
the discipline or paradigmatic framework within which the experiment
has been performed (while not necessarily implying ‘neutrality’ or a
privileged position).
Bibliographic sources
Some of the studies identified have been found through citations in
similar studies, some as references in ‘standard’ texts like Tinker (1965),
and some in review articles like Reynolds 1984, while some of the studies
have been found in printed bibliographies.147 More or less systematic
searches148 have been performed in on-line and off-line electronic
bibliographic databases.149 Complete runs of some journals have also
been scanned.150 Finally, some of the studies have been found by
serendipity.
Selection criteria
Twenty-eight experimental behavioural studies on the relative legibility
of serif and sans serif typefaces are examined in chronological order of
appearance. The first of these studies was published in 1896, and the
last was published in 1997. The 28 selected studies amount to almost
half of the total of 72 legibility studies that I have uncovered on this
147. Among the printed bibliographies scanned were: Cornog and Rose 1967; Foster 1971; Foster 1972; Macdonald-Ross and Smith 1977; Foster 1980; Felker 1980; Kempson and Moore 1994; and the useful periodical Ergonomic Abstracts, unfortunately only for 1994, 1995, until June 1996. The following unpublished bibliographies were also helpful: Brinch 1976; Rannem 1983; 'Information design database: authors A–K' 1992; and 'Information design database: authors L–Z' 1993. I have also scanned volumes of the following periodical, annual or irregular bibliographies: Design and applied arts index (1987–1990); ABM ArtBibliographies Modern (1970–1991); PIRA Printing and publishing abstracts (1975–1994); Aslib Index to Theses (predominantly British doctoral theses) (1980–1994).
148. The search terms have been: ‘legibility’, ‘readability’, ‘typeface’, ‘font’, ‘sans serif’,
‘sanserif ’, and ‘typography’.
149. Among the electronic databases scanned were: ProQuest Dissertation Abstracts (predominantly American doctoral theses); OCLC World Cat (American master dissertations are included in this 'world library catalogue'); OCLC Article 1st; Medline; ERIC Education Resources Information Center (1966–1998); PsycLit (1974–1994); LLBA Linguistics and Language Behaviour Abstracts (1973–1994); ABM ArtBibliographies Modern (1984–1993); and the ISI Science Expanded, Social Science, and Art & Humanities Citation Indexes (from 1987 to present).
150. Journal of Typographic Research; Visible Language; Icographic; Eye; Typos; Typographica; Information Design Journal; and Applied Ergonomics.
topic. Most of them are experimental, a few of them are ‘typeface
topology studies’ either with or without attached behavioural
experiments, and a few of them are preference studies. Nevertheless,
I am certain that a more systematic and persistent search would have
uncovered several more relevant studies.
For the sake of order: Some of the 72 comparative studies are
studies focusing on the legibility of individual typefaces (but which
happen to include both serif and sans serif typefaces).151 Many of the
72 studies are univariate, focusing only on typefaces as the independent
variable, while some of them are multivariate, including variables like
line length and paper stock (as in Neman 1968), or typeface posture and
typeface weight (as in Wendt 1969).
Most of the studies appear as articles in scientific journals, some
appear as papers in conference proceedings, some appear as individually
published reports, some appear as a chapter or subchapter within
monographs on related topics, some appear as doctoral theses only, a few
appear as master dissertations, while some appear in several guises (for
example as a thesis, as an article in a scientific journal, and as a popular
article in a printing trade journal).152
The reason for assessing individual studies as discrete units in a
chronological order has been explained in my introduction. The exercise
of assessing as many as 28 studies has its background in the initial
intention of assessing all the studies I could identify in order to
document this research in a comprehensive empirical manner. However,
as the number of detected studies grew, I realised that some sort of
selection had to be undertaken so that my task could stay manageable on the one hand, and not become a quagmire of tedious, oversaturated empiricism on the other.
151. There exist other 'genres' of comparative typeface legibility studies: for example dichotomous studies on proportional set width vs. fixed set width typefaces (for example on Times vs. the 'typewriter' typeface Courier). Examples of such studies are: Payne 1967; Beldie et al. 1983; Arditi et al. 1990; and Mansfield et al. 1996.
152. See for example de Lange 1992; de Lange 1993; and de Lange et al. 1993. It is interesting to observe that when a part or extract of a thesis or dissertation appears in a scientific journal, it happens that the supervisors are included as authors (compare for example Amachree 1975 with Amachree et al. 1977; de Lange 1993 with de Lange et al. 1993; and Stone 1997 with Stone et al. 1999).
Studies assessed
A list of the studies reviewed in this chapter appears on the content
pages of this thesis (pp. 4–5).153 Partly because my change in strategy,
from assessing all of the relevant studies to assessing only a selection of
the studies, happened en route, most of the omitted studies are from the
period from the 1960s to and including the 1990s (with two exceptions:
Ovink 1938, pp. 84–104; and Nolan 1959). However, the studies assessed
are spread over the whole century, and while 15 of them are from the
period from the 1890s to and including the 1950s, 12 are from the much
shorter period from the 1960s to and including the 1990s.
The studies assessed also represent a wide variety of contexts and rationales, operational methods, typefaces and results, and degrees of formality of publication medium. Note also that as a criterion for inclusion I have
given priority to studies that have been ‘published properly’ as opposed
to ‘unpublished’ doctoral theses or master dissertations. The studies
which I have not reviewed in this chapter have with very few exceptions
all been inspected, and references to their content may appear elsewhere
in the thesis. There is no reason to believe that another selection would
have altered the picture that emerges and my conclusions in any
substantial way. Observe that, except for the two studies mentioned
below, the selection of studies for review has not been based on any advance assessment (for example in order to pick out studies that could
yield a more ‘productive’ result than others).
Note that two of the studies have on purpose received a more
thorough and extended treatment than the others (Christie and Rutley
1961; and Robinson, Abbamonte, and Evans 1971).
The reason for the special treatment of Christie and Rutley’s study
is that it was part of a unique instance of a public debate on legibility
(of the new directional traffic signs for Britain's new motorways and all-
purpose roads in the early 1960s). Many people participated in this
debate, including many designers. Furthermore, it was about alphabets
that would become very prominent in the ‘visual landscape’ of Britain
153. For the sake of order: the reviewed study Crosland and Johnson 1928 is not counted among the 28 reviewed studies, nor among the total number of 72 studies. See the section 'Crosland and Johnson 1928: the serifs that never were', in chapter 5.
and elsewhere. And not least, this debate still surfaces every now and
then, almost forty years later, and it relates to important questions about
the nature of design.
The reason for the special treatment of Robinson, Abbamonte and
Evans’ study is the following. When it appeared in 1971 it was unusual
in two respects: unlike most typeface legibility studies it proposed a
theory for why serif typefaces are more legible than sans serif typefaces, and it
also took a cognitive science approach, by employing an explanatory
computer-based model that claimed to rely on how the human visual
system works. Furthermore, the study is still cited, and several of these
citations suggest that this study stands for something more scientific,
objective, and conclusive than other typeface legibility studies. The study
therefore deserves the extended treatment it has been given.
Studies not assessed
The studies of relative typeface legibility that have not been reviewed in
this chapter can be categorised in the following way:
1. Practical legibility studies based on evaluation of image
degradation of typefaces, for example created by repeated photocopying
or by fax-transmission (Spencer et al. 1977; Bowden and Brailsford 1989;
Adobe 1989; Birkvig 1990).154
2. Experimental studies on typeface legibility for people with
impaired vision (Nolan 1959; Prince 1967; Shaw 1969; Zachrisson and
Smedshammar 1971; Jansen and Thomsen 1986; Plass and Yager 1995;
Yager et al. 1998).155
3. Experimental screen legibility typeface studies. The decision to
omit screen legibility studies is partly arbitrary but also pragmatic:
A line has to be drawn, and some of these studies were published too late
for inclusion. The studies are: Engman 1988; Smedshammar et al. 1990;
154. These studies do not represent a new strand of research. See for example the (general) study by Bullington, 'Legibility determinations of multiple carbon copies' (1948).
155. The rationale behind these studies is certainly commendable. Most of these studies
are based on subjects with impaired vision. However, some experiments are based on
simulations of visual impairment, for example where subjects wear specially designed
glasses.
Williams 1990; Garcia and Caldera 1996; Boyarski et al. 1998; and Stone
et al. 1999.156 Although the media substrate is different from paper,
signboards, or various kinds of one-line displays (like LED displays),
computer screens represent, at least in one respect (‘the psycho-motor
domain’), just another media substrate, and it is the same kind of
questions that are asked as for print legibility studies, the studies have
the same approach and similar operational methods are employed. And
not least: The criticisms of traditional print legibility research reported
elsewhere in this thesis, should to a large extent also apply to screen
legibility research. The important difference to print (with regard to ‘the
psycho-motor domain’) has of course to do with the fact that computer
screens represent an environment of higher constraints (for example
considerably lower image resolution, lower text/background contrast,
disturbing flicker, and disturbing glare). Therefore, many typographers,
who are indifferent to whether serif or sans serif typefaces are most
legible on paper, clearly prefer sans serif typefaces for today’s screen
technologies. This is due to the relatively coarse image resolution of
computer screens and problems with representing the fine serifs and the
finely tuned stroke contrast of serif typefaces.
4. Some of the studies that appear as theses or dissertations only.
For example, only one (Taylor 1990) of three PhD dissertations under the
label ‘instructional technology’ from Wayne State University is included;
the other two are Stahl 1989, and Kravutske 1994. The inclusion of
Taylor 1990 happened because this was the first to arrive on my desk.
Other PhD dissertations omitted are Neman 1968; and Kunst 1972. Two
master theses that are omitted are: Sorg 1985 and Connelly 1998.
5. Three studies on ‘projected’ typefaces (from slides and overhead
projection transparencies) are also omitted. These studies are: Adams
et al. 1965; Grooters 1972; and Phillips 1976.
156. Stone et al. 1999 and Stone 1997 are basically the same study, only appearing in two guises. For the sake of order: a screen legibility study by Geske (1996) does not compare serif and sans serif typefaces, but only variants of a sans serif typeface.
6. A handful or two of other studies of various character are also
omitted, namely Ovink 1938, pp. 84–104; Brachfeldt 1964;157 Jha and
Daftuar 1981; Moriarty and Scheiner 1984;158 Hoffman 1987; Lenze
1989;159 Misanchuk 1989a; Misanchuk 1989b; Regan and Hong 1994;160
Smither and Braun 1994; Klitz et al. 1995;161 and Leat et al. 1999.162
7. Finally, comparative non-experimental typeface ‘preference
studies’ are also omitted. Some of these studies are labelled legibility
studies; however, not all of them are. The studies in question are:
Paterson and Tinker 1940, pp. 18–20 (about 'legibility' and 'ease and
speed of reading’);163 Becker et al. 1970 (about perceived ‘appealingness’
and ‘attractiveness’); Amachree et al. 1977 (about ‘legibility’ and ‘easy to
read type’); Cooper et al. 1979 (about ‘reader preference’); Bell and
Sullivan 1981 (about ‘interest’ and ‘liking’); Holleran 1992 (about
‘favourite fonts’); Schriver 1997, pp. 288303 (about ‘preference’,
‘legibility’, and ‘semantic fit’). Note however that several of the legibility
studies included in the study-by-study assessment in this chapter
contain preference studies as subsidiary supplements to their primary
behavioural experiments.
A note on ‘semantic’ studies
One type of comparative ‘empirical’ typeface study involving both serif
and sans serif typefaces has not been considered, as this type of study
certainly does not relate to legibility: studies on semantic typeface
connotations. That is, studies that relate to the more transient and
culture-contingent ‘affective domain’, and not to the ‘psycho-motor
157. Brachfeldt 1964 is cited in the literature as if it were an empirical study, e.g. by Zachrisson (1965, pp. 37–38) and Wendt (1969). It is a short article based on a priori arguments that contains only a brief and vague mention of an experiment.
158. Primarily a study on ‘close-set type’.
159. Kravutske (1994) refers to this study, but her bibliographic details about the study
are inadequate (pp. 16, 77).
160. Primarily a study about ‘texture-defined letters’.
161. This study appears only as a ‘conference poster’ (meeting abstract).
162. Primarily a study about ‘crowding in central and eccentric vision’.
163. This study has been described above, in the section ‘Subjective preference studies’, in
chapter 2.
domain’.164 Such empirical studies, dating back to the early 1920s
(e.g., Berliner 1920,165 and Poffenberger and Franken 1923), represent
a particular strand of typographic research. Two of the ‘standard’
legibility research monographs, Ovink 1938, and Zachrisson 1965, deal
not only with legibility, but also with typeface connotations (cf. the
expression ‘atmosphere-value’ in the title of Ovink’s book and
Zachrisson’s concern with the equivalent concept of ‘congeniality’).
Examples of contemporary comparative typeface studies of this kind are:
Wendt 1968; Bartram 1982; Rowe 1982; Morrison 1986; Hoffman 1988;
Lewis 1989; Lewis and Walker 1989; Pedersen and Kidmose 1993, pp. 17–23; Tantillo et al. 1995; Johnson 1995; and Braun and Silver 1995.
Note that one of the legibility studies selected for assessment in this
chapter is dual in its approach (Silver, Kline and Braun 1994). Although
this study is about ‘legibility’, it can also be categorised as a ‘semantic
study’ (as well as a ‘preference study’), since it operates with variables
like ‘perceived hazardousness’ in addition to less semantically charged
variables.
Foregrounding connotative semantic aspects in typographic form is
a central concern in the craft domain of typography and graphic
design166 (and much more so in advertising, packaging design, and
corporate identity).167 This concern has also been prominent throughout
the history of graphic communication.168 The wide variety of pluralistic-
historicist display typefaces of the latter part of the 19th century
represents a particularly creative and rich instance of this concern.169 The
concern with semantic foregrounding (whether symbolic or iconic) applies
more to 20th century eclectic-historicist ‘traditionalist’ typography,170
164. See the section ‘Ergonomics as a framework’, in chapter 2.
165. For the sake of order: Anna Berliner used hand-drawn letterforms, not proper typefaces.
166. See Branding with type, by Rögener, Pool and Packhäuser (1995).
167. See for example Ehses and Lupton 1988. For a comprehensive annotated visual index
of trademarks, see Mollerup 1998.
168. For a fascinating visual index, see Massin 1970. Cf. also the concerns of the semiotics journal Word & Image.
169. For a fascinating visual index and groundbreaking study, see Gray [1938] 1976.
170. See for example Vincent Steer’s manual of typography from the 1930s (n.d.), and
Crutchley 1986.
‘New York expressive’ typography,171 and ‘postmodern’ typography,172
than to 20th century ‘modernist’ typography. The concern is expressed in
graphic artefacts and craft discourse, ranging from the obviously
appropriate, via the subtle and intricate, via the speculative, via
blatantly banal clichés, via the disturbingly interventionist and
designer-centred (colliding head-on with authorial intention), to the
absurd.173 A spirited and witty instance of this ‘semantic concern’ in
current craft discourse is represented by Erik Spiekermann and
E.M. Ginger’s best-selling book Stop stealing sheep & find out how type
works (1993).
This ‘semantic concern’ also appears as a joint interest between
typographic studies and literary studies when the genre of (modernist)
‘visual’ or ‘concrete’ poetry is considered.174 The concern is akin to
certain strands within postmodern architecture as well as the semiotic
movement in architectural design of the 1960s and 1970s, and the more
current ‘product semantics’ movement in industrial design.175 Not least,
the concern is expressed through the prevalence of more or less
appropriate visual metaphors and visual analogies in the graphical user
interfaces of the operating systems, application software, multimedia
applications and web sites of digital media. Finally, there is a kind of
kinship between ‘typeface semantics’ and ‘graphology’, that is,
graphology not in its linguistic sense, but graphology in the sense of
handwriting analysis for revealing the psychological characteristics of
the writer.
171. For a visual index, see Snyder and Peckolick 1985.
172. For visual indexes, see for example Aldersey-Williams et al. 1990; and Blackwell 1995.
173. See for example Baker 1985; and Stiff 1995a.
174. See for example Bohn 1986; and Levenston 1992.
175. On ‘product semantics’, see for example the collections of papers in Vihma 1990; and
1992. See also Richard Buchanan’s critical review of Vihma 1990 (Buchanan 1993).
Griffing and Franz 1896*
The first experimental study ever published, which deals with the
legibility of sans serif typefaces compared to serif (‘roman’) typefaces,
was initially presented to the International Congress of Psychology in
Munich in 1896, and subsequently published in the American journal
Psychological Review.176 The study is undertaken from the perspective
of investigating the relationship between visual fatigue in reading and
conditions causing this fatigue. The authors justify this approach by
linking myopia and other eye disorders with visual fatigue; ultimately
caused by the increased amount of reading in modern society. One of
several conditions which they identify as contributing to visual fatigue is
‘the quality of the type’ (pp. 513, 522525). By ‘quality of type’ they
simply refer to typeface categories.
Myopia was a common concern in reading research and legibility
literature around the turn of the century. For example, in an extensive
article on legibility research that appeared in the leading Scandinavian
printing trade journal Nordisk Boktryckarekonst in 1906, myopia was
prominently pictured as caused by inadequate typography (see Schmidt
1906).177 Not to mention Emile Javal, who, as an ophthalmologist, was preoccupied with myopia and its possible causes in typography.
The authors initially suggest that it can be theoretically assumed
that legibility of typefaces is a function of ‘complexity of structure’; i.e.
the simpler the structure (e.g. sans serif typefaces) the more legible the typeface, and vice versa, the more complex the structure (e.g. fraktur typefaces) the less legible the typeface.
* Harold Griffing and Shepherd Ivory Franz. 1896. 'On the conditions of fatigue in reading'. Psychological Review, vol. 3, pp. 513–530.
176. Venezky points out that around the turn of the century reading research (which then included legibility research) was prominent in the psychological literature, and that very few psychology journals existed – Psychological Review and American Journal of Psychology in the USA, Mind in England, and a few in Germany and France (1984, pp. 7, 23, 28).
177. Interestingly this article is based on the work of Javal as well as a book written
jointly by the psychologist Dr Prof. Herman Cohn and the printing ink factory
director Dr Robert Rübencamp: Wie sollen Bücher und Zeitungen gedruckt werden?,
published in Germany in 1903, and addressing the world of printing and typography.
Both the article in the Scandinavian printing trade journal and the German book
serve as an index of an early contact between research domain and target domain.
The operational criteria employed for measuring the relative
legibility of six different typefaces are expressed through the
‘illumination threshold method’.178 By the ‘illumination threshold
method’ both time of exposure and the distance between the stimulus
material and the subject’s eyes are constant. The luminous intensity of
the light source is also constant. However, the distance between the light
source and the stimulus material is varied.
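The quantitative logic of the method can be sketched as follows (my own reconstruction, not the authors'; it assumes that the light source behaves approximately as a point source obeying the inverse-square law, and the symbols are introduced here for illustration only):

\[
E = \frac{I}{d^{2}}, \qquad E_{\mathrm{th}} = \frac{I}{d_{\mathrm{th}}^{2}}
\]

where I is the (constant) luminous intensity of the light source, d the variable distance between light source and stimulus material, E the illuminance falling on the stimulus, and d_th the greatest distance at which the type can still be read. The measured outcome is thus a threshold illuminance E_th: the lower this threshold (that is, the greater d_th), the less light the type requires and the more 'legible' it is taken to be.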
The typefaces in question are: a roman typeface, a ‘German’ gothic
typeface, a bold sans serif typeface of capitals only, a lighter sans serif
typeface of capitals only, a modern fat face, and a normal sans serif
typeface of normal weight of both small letters and capitals. The size of
the letters is referred to as ‘1.5 mm. in height … with slight variations,
to .1 mm. in the individual letters’ (p. 523). The measure 1.5 mm. refers
to the x-height of the typefaces that contains both small letters and
capitals, and to the capital height of the capitals-only typefaces.
Translated into nominal pica points, ‘1.5 mm in height’ should
approximately equal a nominal typesize of 10 points (inferred from
information on p. 514 and p. 518).
The results are not in line with the authors’ initial assumption that
legibility is a function of the ‘complexity of structure’. The results show
that there is only a slight difference of legibility between the roman
typeface, the ‘German’ gothic typeface, the normal sans serif typeface
and the light sans serif capitals-only typeface. However, the modern fat-
face typeface and the bold sans serif capitals-only typeface are
considerably more legible than the other typefaces. The authors point out
that this phenomenon obviously makes legibility a function of stroke
thickness.
That a rather fat typeface requires less light than a typeface of 'normal' weight, and per se becomes more legible, clearly illustrates a deficiency of the authors' 'illumination threshold method'. And the
authors acknowledge that something might be wrong with their
legibility-construct, and that their results therefore can be questioned.
They also acknowledge that it is possible there is no direct connection
178. For the sake of order: the authors also employ the 'speed of reading method' and the 'time of exposure method' when doing experiments on other 'conditions'.
between what is actually perceived at a specific threshold and the level
of fatigue, i.e. that something perceived at a certain threshold may be
either less or more fatiguing than something else perceived at another
threshold.
There is nothing much left of the optimism after the authors have
discussed their own results, and they conclude laconically: ‘The form of
the type is of less importance than the thickness of the letters.’ (p. 530).
Roethlein 1912*
The second experimental study ever published, which deals with the
legibility of sans serif typefaces compared to roman typefaces, was
published in the American Journal of Psychology.
The explicit aim of this large-scale experimental study was not to
compare the legibility of sans serif typefaces to roman typefaces. It is an
investigation into the relative legibility of up to 26 different typefaces,
including, for some of the typefaces, variants such as italic, bold and
condensed. The relative legibility of each letter of the typefaces is
measured, and the average legibility of the letters belonging to each
typeface determines the legibility of the typeface. The study also has a more far-reaching aim: to identify the physical properties of typefaces which determine their degree of relative legibility, in order to improve
legibility. The paper contains a relatively extensive bibliography (40
items) with references to Griffing and Franz’ study (1896); Javal; Huey;
and the American master printer Theodore Low De Vinne’s authoritative
excursion in type technology Plain printing types. A breakdown of the
bibliography into languages shows that it lists 20 English references,
17 German, and 3 French.
By the construct ‘legibility’ the author refers to the visibility of
single letters – both isolated letters and letters appearing in awkward
* Barbara Elisabeth Roethlein. 1912. 'The relative legibility of different faces of printing types'. American Journal of Psychology, vol. 23, no. 1, pp. 1–36. (Also published in the series Publications of the Clark University Library, vol. 3, no. 1. Worcester, Massachusetts: Clark University Press).
nonsense combinations like ‘ksitugy’. The objective for investigating the
legibility of letters in groups is to determine the role of adjacent letters
when determining legibility. The operational criterion for measuring the
legibility is expressed through the variable distance method. The time of
exposure method was discarded by the researchers as too unreliable
after some preliminary experiments due to the interference between
reaction-time and varying levels of concentration (p. 5).
The prints of the typefaces employed in the study were donated by
American Type Founders Company. ATF also provided typographic
expertise; Frank Berry, Linn Boyd Benton and Morris Fuller Benton
contributed by suggesting typefaces and in the interpretation of the
results. The investigation also took into consideration data on
compositors’ and proof-reader’s errors.
The two sans serif typefaces included in the study are Franklin
Gothic and News Gothic.179 All letters were of ten points nominal size.
The author explicitly acknowledges the problematics of nominal type size
(p. 21), and its bearings for her interpretations.
The results of this extensive investigation are many and varied, and
laid out in a large number of tables. About the comparative legibility of
sans serif typefaces to roman typefaces the author concludes:
If legibility is to be our sole criterion of excellence of type-face, News
Gothic [a sans serif typeface] must be regarded as our nearest approx-
imation to an ideal face, in so far as the present investigation is able to
decide this question. The aesthetic factor must always be taken into
account, however, here as elsewhere. And the reader who prefers the
appearance of Cushing Oldstyle or a Century face may gratify his
aesthetic demands without any considerable sacrifice of legibility. (p. 29)
Roethlein is not satisfied with ranking typefaces. She points out
that the letters of the typefaces in question differ not only in specific
form, but also with regard to size and strokewidth (p. 22). She therefore
measured strokewidth plus the height and width of the letters m, o, and
z of 16 of the typefaces ‘In order to obtain a clearer insight into the
relative significance of each of these variable factors as determinants of
179. Although these typefaces carry different names, they are today frequently referred to
as complementary typefaces, Franklin Gothic being a bold companion to the lighter
News Gothic or vice versa.
legibility’. The results of these measurements lead her to suggest that
the specific forms of the typefaces are less important than size and
strokewidth as determinants of legibility (pp. 24–25). (Franklin Gothic, the bolder relative of News Gothic, is however not included in this part of her study.)
Roethlein finally concludes that legibility of individual letters is less
dependent on letter-form than on letter-size, letter strokewidth, letter spacing, position in a group, and the shape and size of adjacent letters (pp. 33–34). Roethlein's study appears to be thorough and well reasoned.
Legros and Grant 1916*
Chapter 11 is devoted to legibility in Lucien Alphonse Legros and John
Cameron Grant’s voluminous and impressive manual on type technology
published in 1916: Typographical printing surfaces: the technology and
mechanism of their production (pp. 156–192). The authors refer to four reading researchers – Sanford, Javal, Cattell and Cohn – and in addition to Theodore Low De Vinne. There are, however, no references to the work of
Griffing and Franz (1896) or Roethlein (1912).
Legros and Grant are not only measuring legibility, they are also
proposing a theory of typeface legibility. Their constructs ‘legibility’ and
‘illegibility’ refer to the degree of topological similarity between two
single letters in certain pairs of similar looking and frequently occurring
characters of the same typeface. They believe that similarity of form
leads to confusion and misreading with enormous consequences for the
total amount of time millions of people spend reading. The study is not
experimental, but a typeface topology study, based on measurements of
the ‘coincident’ (common) and ‘non-coincident’ (peculiar) areas of the two
individual letters of chosen characters-pairs when superimposed on each
other. On the basis of such intra-typeface measurements, typefaces are
compared. According to the theory, a high degree of non-coincident area
* Lucien Alphonse Legros and John Cameron Grant. 1916. Typographical printing surfaces: the technology and mechanism of their production. London: Longmans, Green, and Co.
suggests a high degree of legibility for that typeface. And on the
contrary, a high degree of coincident area suggests ‘illegibility’.
Practical measures were made possible by drawings where the
characters had been enlarged 45 times. The study deals with three separate groups of characters – small letters, capitals, and figures – and presents the result for each group independently. However, the emphasis
is on the group of small letters because they are the most frequently
used. The character-pairs compared are: for small letters: c/o, c/e, e/o,
n/u, i/l, h/b and a/s; for capitals: C/G, O/Q, B/R and X/Z; and for figures:
3/5 and 8/6.
FIGURE 2. An illustration of Legros and Grant’s concept of legibility. A high
degree of non-coincident area suggests a high degree of legibility. A high degree
of coincident area suggests ‘illegibility’. (From Legros and Grant 1916, p. 172:
figure 129.)
The ratio of the total peculiar area of both characters (of a pair) to
the total area of the two characters is called the legibility coefficient
(given as a percentage). If there is no common area, the peculiar area of
the two characters would be as large as the total area of the two
characters (peculiar area / total area = 1; i.e. 100%), and there would be
‘perfect legibility’. The difference between 100% and the legibility
coefficient makes the illegibility coefficient (in percent). For determining
the legibility of a typeface the authors take into consideration the
occurrence ‘given in the fount bill’ of each of the characters of each pair,
and then of each character of all the characters in question. Thus, the
mean legibility coefficient and the mean illegibility coefficient of the
typeface can be calculated (as percentages).
However, the authors realise that if the strokes of the characters of
a typeface are very thin, the peculiar area will be larger than if the lines
of the characters are very thick (imagine the character X of a light
typeface, made of two very thin crossing lines, compared to the character
X of an ultra bold typeface, made of two very thick crossing lines, while
both Xs are of the same point size). To prevent legibility/illegibility from becoming, to a large extent, a function of the leanness/fatness ratio, and to avoid the threat this problem poses to their theory, the authors introduce two more concepts, 'blackness' and 'specific legibility'. The blackness is
expressed by the ratio of the total area of the character to the area of a
horizontal cross-section of the shank of the type (in per cent). Finally, the
specific legibility (in per cent) is expressed by multiplying the mean legibility coefficient (in per cent) by the mean blackness (in per cent).
Specific legibility gives, according to the authors, the best comparative
measure of the legibility of a typeface. The following example can serve
to clarify the idea behind blackness and specific legibility: to prevent the degree of fatness of a typeface from unintentionally becoming the most important determinant of its legibility, the 'mean blackness' (expressed as a percentage) is multiplied by the 'mean legibility coefficient' (expressed as a percentage), resulting in the 'specific legibility' (expressed as a percentage); the multiplication is thus employed as a method to even out (reduce) the influence of the fatness when determining the legibility of a
typeface. The following example can illustrate the relationship between
the mean legibility coefficient and the specific legibility: for a light
typeface with a high mean legibility coefficient of 25%, the coefficient is multiplied by its low mean blackness of only 10%, which makes the specific legibility 2.5%. Accordingly, for an ultra fat typeface with a low mean legibility coefficient of only 5%, the coefficient is multiplied by its high mean blackness of 50%, which actually results in the same specific
legibility of 2.5%.
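For clarity, the chain of definitions can be summarised in formula form (my own notation, not Legros and Grant's; percentages are treated as fractions when multiplied, as in the worked example above):

\[
\ell = \frac{A_{\text{peculiar}}}{A_{\text{total}}}, \qquad
\ell_{\text{illeg}} = 1 - \ell, \qquad
b = \frac{A_{\text{character}}}{A_{\text{shank section}}}, \qquad
s = \bar{\ell}\,\bar{b}
\]

where \(\ell\) is the legibility coefficient of a character pair, \(\ell_{\text{illeg}}\) the corresponding illegibility coefficient, \(b\) the blackness of a character, and \(s\) the specific legibility of a typeface, obtained by multiplying the mean legibility coefficient by the mean blackness. The two examples above then read 0.25 × 0.10 = 0.025 and 0.05 × 0.50 = 0.025, i.e. the same specific legibility of 2.5 per cent.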
The authors’ approach is applied to five more or less generic Latin-
script typefaces: a modern typeface modified by the authors (given serifs
with the thickness of old style serifs), an old-style typeface (with
modified figures ‘with a view to increased legibility’), a mannered
typeface referred to as Blackfriars (‘in which legibility of the lower-case
was specially sought’ and ‘produced under the direction of one of the
authors’, with the main strokes thickening towards the thick serifs), a
sans serif typeface, and a fraktur typeface.
It seems that the authors have problems in deciding whether it is
the specific legibility or the mean legibility coefficient which should serve
as the best measure of legibility. However, the results of the
measurements and the calculations for both measures are laid out below.
The specific legibility of the sans serif typeface compared to the
other four typefaces in question:
Small letters: the sans serif typeface comes out as the second most
legible typeface (3.44%), only after Blackfriars (4.51%), before the
modern typeface (3.33%), the old-style typeface (3.21%), and the fraktur
typeface (2.45%).
Capitals: the sans serif typeface comes out as number three (2.60%),
after Blackfriars (3.44%) and the modern typeface (2.88%), before the
oldstyle (2.28%) and the fraktur typeface (0.89%).
Figures (the fraktur is not included): the sans serif typeface comes
out as number three (4.82%), after the modified modern typeface (9.31%)
and the oldstyle typeface (6.97%), but before Blackfriars (4.39%).
When only the mean legibility coefficient is considered (the
‘blackness factor’ not taken into consideration), the sans serif comes out
far worse. For both small and capital letters it comes out penultimate
only before the fraktur, and for the figures (fraktur not included) it
comes out last. The mean legibility coefficients were:
Small letters: Blackfriars (27.3%), old-style (26.3%), modern
(23.1%), sans serif (18.2%), fraktur (15.1%).
Capitals: modern (15.5%), Blackfriars (15.2%), old-style (15.1%),
sans serif (10.2%), fraktur (3.8%).
Figures: modern (52.3%), old-style (46.8%), Blackfriars (20.4%), sans
serif (18.0%).
The authors acknowledge that the blackness factor is ‘unavoidably’
larger for the sans serif typeface than for the other typefaces, something which gives the sans serif typeface a better ranking on specific legibility than on the mean legibility coefficient. However, they also claim that the blackness factor
does not help, because the specific legibility of the sans serif typeface is
less than proportionally increased compared to the other typefaces when
the blackness factor is included in the calculation (p. 164). The authors
state that ‘Popular belief holds the sans serif, or, as it is popularly called,
block letter, to be very legible, but to printers, and especially those who
do much display work, this view is known to be erroneous.’ (p. 164). It is
clear that the authors, independently of the results of their own
investigation, view sans serif typefaces as inferior to roman typefaces.
Their attitude is illustrated by the following somewhat witty comment:
It may be some comfort to motorists to know that the form of the
characters and figures selected for car numbering by the governments of
this and other countries is less legible than many others which might
have been chosen; in fact, it would be difficult to improve them in the
direction of greater illegibility except by combining German Fraktur
capitals with the existing sans serif figures. (Legros and Grant 1916,
pp. 169–170)
Legros 1922*
In this publication, A note on the legibility of printed matter, Lucien
Legros, now with the initials O.B.E. (Order of the British Empire) added
after his name, maintains his and John Grant’s theory of typeface
legibility. However, his claims are modified – he refers to the method
employed in Typographical printing surfaces as a suggestion, and
proposes that if the method is combined with a time of exposure method,
‘it is probable that a definite measure of legibility would be obtained’
(p. 6). The practical application of the method, as demonstrated in
Typographical printing surfaces, is only briefly referred to. The mean
* Lucien Alphonse Legros. 1922. A note on the legibility of printed matter. Prepared for
the information of the Committee on Type Faces. London: His Majesty's Stationery Office.
legibility coefficient and the specific legibility of the group of small
letters of the old style typeface from the 1916 investigation are referred
to, respectively 26.3% and 3.21%. However, this information is followed
by the following statement: ‘A modern face of the same gauge and
blackness would only have a specific legibility of about 2.8%, sans-serif
about 2.2%, and German Fraktur 1.8%.’ (p. 6). This tells us that the
author in reality has discarded his formula for calculating specific
legibility. He now arrives at a neat legibility hierarchy with old style on
top, then modern, then sans serif, and then fraktur – in line with
contemporary belief. He has discovered that the narrower (condensed) a typeface is, the higher its blackness, and vice versa for broad (expanded) typefaces. Furthermore, the method is applied to an
old style typeface family of eight weights, with the generic old style
typeface referred to above as the starting point. The thickness of the
main stroke has been gradually increased and decreased in each
direction. Here we can see that the mean legibility coefficient and the mean blackness actually vary as a function of weight. Legros admits
that ‘this trial shows that the method gives the highest figure [both the
mean legibility coefficient and the specific legibility] for type blacker
than is generally used for reading matter’ (p. 7). Since the results are
obviously not in accordance with his preconceptions, and common
typographic practice, he then suggests that ‘it would be possible to fix a
limit for the blackness and to require that the specific legibility should
exceed say 2.5 per cent for any selected pair of letters. Such a stipulation would ensure that those characters that are most commonly misread
should be made to a definite percentage of dissimilarity as measured by
the legibility coefficient.’ (p. 7). This statement illustrates the deep
troubles which the theory has now encountered; remember, this is not the first attempt to solve problems encountered – compare the introduction of the concepts of blackness and specific legibility.
However, the theory is dubious. In Typographical printing surfaces
the authors tried to validate their theory by rhetorically referring to the
form of the then abolished long s (identical form to the character f, except
for the lack of the right-hand part of the f’s crossbar), resulting in an
extremely low legibility coefficient. It is not difficult to accept that two
characters with such a degree of sameness easily can be confused and
misread. However, to generalise from this (and possibly other similar examples) into a theory such as the above represents a step that is hardly
defensible. Characters are definitely not read while superimposed upon
each other.
Another fallacy of Legros and Grant is that they induce (without
saying explicitly that they do) from specific typefaces (the ones actually
employed in the investigation) with all their details and idiosyncrasies,
to generic categories (old style, sans serif, etc.) and then they judge the
categories on this basis. Furthermore, the choice of character pairs is obviously not arbitrary: other character pairs could have been selected, and the results might have been different. Actually, in an application of
their method in A note on the legibility of printed matter they have
omitted some of the small-letter character-pairs where one of the eight
typeface variants is identical with one of the typefaces employed in
Typographical printing surfaces, and not unexpectedly the figures differ – something which Legros acknowledges. There is not much left of
the theory.
The typeface designer Walter Tracy commented upon this study in
1988:
The fundamental error … in Legros and Grant’s study of resemblances,
is the assumption that a reader has to recognise each letter before he
can recognise each word. But it has been shown, and experience
confirms it, that the competent reader’s eyes take in a text not by
nibbling at it unit by unit but by gulping groups of words; and such is
the speed at which the perceptive faculty operates – even failing to
notice a wrong letter in a word. (p. 83)
Pyke 1926*
In 1923 the British Medical Research Council, at the request of His Majesty's Stationery Office, appointed the 'Committee upon the legibility
of type’.180 R.L. Pyke at the Psychological Laboratory at the University
* R.L. Pyke. 1926. Report on the legibility of print. Med ical Research C ouncil, Speci al
report series, no. 110. London: His Majesty’s Stationery Office.
180. Lucien Legros was one of six members; however, this committee should not be
confused with the former 'Committee on Type Faces'; cf. Legros 1922.
of Cambridge was engaged by the committee, and in 1926 his Report on
the legibility of print was published by HMSO as a ‘Medical Research
Council Special Report’. The report presents an unusually compre-
hensive discussion of previous legibility research. It also presents Pyke’s
own experimental work on the relative legibility of typefaces.
Pyke refers to Griffing and Franz (1896), Roethlein (1912), Legros
and Grant (1916), and Legros (1922). He politely states that testing for
legibility, in his opinion, can best be done by actual reading. Measuring
legibility by involving real acts of reading ‘springs from a more
comprehensive aspect of the process of reading’ than Legros’ measure of
‘specific legibility’ (Pyke 1926, p. 30). Pyke discards what he calls
objective criteria (e.g. ‘specific legibility’ where ‘objective’ refers to the
exact measures of typeface dimensions), subjective criteria (mental or
physical states like fatigue, eye strain, and ‘aesthetic satisfaction’), and
functional criteria based on ‘unconscious processes’ (e.g. eye movements).
What is left is functional criteria based on ‘conscious processes’. By this
he refers to the following five operational methods: distance threshold,
illumination threshold, focus threshold, speed of reading, and errors. He
then, after some discussion, discards distance threshold, illumination
threshold, and focus threshold. Speed of reading is then, after some more
discussion, also discarded: ‘moderate changes of a realistic sort in the
typographical stimulus are extremely unlikely to produce reactions in
speed directly proportional, or even significant’ (p. 31).
In the end he settles on the rather odd ‘number of errors’ as the ‘best
available criterion’, although he is not completely satisfied with it. He
then, after another elaborate discussion, decides that the tests must be
based on reading aloud as opposed to silent reading, at maximum speed
(as fast as the subjects can), and for four seconds at a time (‘long enough
for about one line of nonsense or two of sense to be read’) (p. 34).
In most of his series of experiments he compares the legibility of
eight typefaces: seven roman and one sans serif typeface. Three of the
romans are described as ‘standard’ typefaces (Lanston Monotype Old
Style No. 2, L.M. Imprint Old Face No. 101, and Caslon Modern Series
No. 23). Of the remaining four roman typefaces, one is ‘narrow’ (L.M.
Modern Condensed No. 39), one is ‘broad’ (L.M. Modern Extended No. 7),
one is ‘thick-limbed’ (L.M. Old Style Antique No. 161), and one is ‘thin’
(L.M. Cushing Series No. 17). The sans serif is Stephenson & Blake,
series no. 10, Lining Grotesque. All typefaces are illustrated as complete
upper and lower case alphabets (pp. 114–115).
On the face of Pyke’s results, the standard Old Style is the most
legible, 18 per cent better than the second-ranked, the sans serif typeface.
However, the biggest difference between any typeface and the next on
the ranking list is between the sans serif and the next (the standard
Modern) which is 30 per cent less legible than the former. Pyke points
out that the best type (the standard Old Style) is the most ordinary.
However, he is not willing to attach any significant meaning to that,
because the second best, the sans serif typeface, ‘is one of the
most uncommon’ (ibid., p. 52). Pyke suggests that the success of the sans
serif might have something to do with the fact that its actual size (as
opposed to its nominal size) was the second largest. However, he does not
pursue this suggestion. He laconically states that the sans serif’s
‘relative legibility agrees with Roethlein’s results, but disagrees with
Legros’ and Grant’s’ (p. 52).
In Pyke’s final conclusion he expresses doubt and scepticism about
his own results. He specifically questions the extent of the differences
and ‘whether the relative legibility of the types in the experiments holds
in the ordinary world’ (p. 60). He elaborates:
The hypothesis is here put forward that extremely large typographic
differences must be present before it is possible to say that there is any
difference in the objective legibility of types. Of types more slightly
differentiated it is impossible to say that one is objectively more legible
than another. Such types will be ‘legible’ according as they suit the
psychological make-up of the reader. Hence the most legible type, in this
subjective sense, is unlikely to be the same for all readers. … These two
ideas – that within limits there is no objective optimum, and that only
extremely violent changes in the stimulus can produce any significant
variation in the reaction – are found in other psychological fields. (p. 60)
Pyke’s final conclusion comes like a deep sigh: ‘The problem
of legibility seemed simple at the outset; it is in fact complex and
elusive’ (p. 61). Accordingly, the anonymous preface (by the Medical
Research Council on behalf of the Committee) concludes that available
methods are inadequate and that experimental work ‘on the same lines’
should therefore be discontinued and integrated into a more ‘comprehensive
scheme of research into the physiology of vision’ conducted by
a new committee (p. [3]).
Crosland and Johnson 1928: the serifs that never were
*
This paper, an article published in the Journal of Applied Psychology,
is every now and then cited in legibility studies. In 1959 Cyril Burt181
referred to Crosland and Johnson’s article the following way:
Our conclusions are fully borne out by the results of H.R. Crosland and
H. [sic] Johnson, who also found serifed letters more legible than
unserifed (J. Appl. Psychol. XII, 1928, p. 121). (Burt 1959, p. 9)
However, the compromising fact is that Burt cannot possibly have
read anything but the short summary of the article (on p. 121) which
states that ‘seraphed letters are more legible than unseraphed
letters’.182 James Hartley and Donald Rooum have pointed out that the
only typeface involved in this study was Caslon (1983, p. 205). When
Crosland and Johnson used the terms ‘unseraphed’ and ‘seraphed’ they
clearly attached the terms to something other than serifs. They mean,
respectively, small letters without ascenders or descenders, and small
letters with ascenders or descenders.
A number of writers on legibility have since erroneously referred to
this article. Such errors of disconnected intertextuality are something
* H.R. Crosland and Georgia Johnson. 1928. ‘The range of apprehension as affected by
inter-letter hair-spacing and by characteristics of individual letters’. Journal of
Applied Psychology, vol. 12, pp. 82–124.
181. For information about Cyril Burt, see the section ‘Burt, Cooper and Martin 1955 /
Burt 1959’ in this chapter.
182. Alternatively, he may have seen this quoted sentence in a secondary source which he
refers to in his bibliography: Carmichael and Dearborn’s Reading and visual fatigue
(1947, p. 106).
anyone can fall victim to. Nevertheless, the sheer number of writers in
question as well as the fact that they all imply that they cite Crosland
and Johnson directly, without acknowledging that their source of
information has been Burt or another faulty secondary one, is thought-
provoking. The alternative explanation, that the writers in question, just
like Burt, have not read the Crosland and Johnson article carefully, is
also possible. In any case, they refer to the article as follows:
Stanley Morison: ‘what other investigators had proved, that “serifed
letters are more legible than unserifed” (in Burt 1959, p. xi)
Miles Tinker: ‘Serifed letters were significantly more legible than
unserifed letters.’ (Tinker 1963, p. 272)
Dirk Wendt: ‘Crossland [sic] & Johnson 1928 fanden eine
Unterlegenheit der Groteskschrift.’ [found the sans serif typeface to be inferior] (1969, p. 20)
Sidney Berger: ‘Crosland and Johnson … say that serifs increase
legibility’ (1991, p. 7)
Pedersen and Kidmose: ‘[Crosland and Johnson] disclosed that the
subjects had more difficulty in reading sans serif typefaces than
Roman ones’ (1993, p. 69)
Dirk Wendt: ‘Crossland [sic] & Johnson (1928) reported an inferiority
of a sans serif typeface’ (1994, p. 305)
De Beaufort Wijnholds: ‘Among others … Crosland and Johnson
(1928) … agree that serif typefaces are more legible’ (1996, p. 32)
Only two papers that I have come across erroneously refer to
Crosland and Johnson without implying that the information is taken
directly from the source. Zachrisson (1965, p. 37) and Jansen and
Thomsen (1985, p. 58) refer to Crosland and Johnson’s ‘findings’ while
acknowledging Burt as their source of information. Kunst (1972) reveals
that he has actually read the paper, and he refers to it accordingly.
Moede 1932
*
In 1931/1932 the Bauer type foundry in Germany issued a promotional
brochure called Futuraheft for the then new and innovative sans serif
typeface Futura designed by Paul Renner. This brochure contained six
pages with five dubious statements, dressed in self-important language
about the excellence of Futura, while radiating some kind of scientific
authority. The statements were made by a doctor, ‘a well known
philosopher’, a medical doctor and ophthalmologist, another medical
doctor, and a doctor and professor of psychology. At least three of the five
statements are unsubstantiated claims where the only link between the
content of the statements and science seems to be the scientific sounding
vocabulary and the titles of the persons uttering them.
Two of the five statements are more elaborate than the others. One
is a statement by Dr Med. G. Secker, an ophthalmologist in Hamburg. It
includes a long and dubious, though referenced, line of reasoning on the
physiology of vision. The reasoning purports to ‘prove’ scientifically that
a typeface like Futura, that is, a typeface without serifs and with a high
degree of regularity, will ‘obviously’ cause less eye fatigue than any other typeface.
The other elaborate statement, seemingly more substantial and
avoiding the pompous language of the others, is the only one which
refers to experimental research and which is thus of special
interest to this thesis. It is based on an unpublished report183 of
experimental work by Prof. Dr Walther Moede at the ‘Institut für
industrielle Psychotechnik und Arbeitstechnik in Berlin-
Charlottenburg’.
It is worth noting that Walther Moede (1888–1958) was a prominent
and influential applied psychologist or ‘psycho-technician’ in the inter-
* Futuraheft. [1931/1932]. Frankfurt am Main, Barcelona and New York: Bauersche
Giesserei.
183. Ovink briefly refers to an ‘unpublished report by Moede’ (without giving further
bibliographical information or mentioning Futuraheft) (1938, p. 111). The text in
Futuraheft indicates that it is based on an independently existing report as opposed
to the other statements in the publication. However, I have not attempted to
track down the original report. A few comments on the content of Futuraheft can be
found in Ovink 1938, p. 111; Zachrisson 1965, pp. 37, 68–69; and Burke 1998a,
pp. 112–113.
war period. He was a teacher at Berlin Technische Hochschule, the
founder of Germany’s first institute for industrial ‘psychotechnics’ and
for many years the president of the professional organisation ‘Verbandes
praktischer Psychologen’. He was an active researcher who published
widely and he was the editor of numerous journals, among them
Praktische Psychologie and Industrielle Psychotechnik. He was also the
author of many books, among them several textbooks, and he was
adviser to the German railways.184
‘Psychotechnik’ was applied psychology in Germany in this period.
Psychotechnics was preoccupied with applying psychology to industry,
for example by psychological testing of individuals in order to match
them to particular occupations and trades. It played a key role in
German ‘Arbeitswissenschaft’, and was seen as an instrument for the
‘rationalisation’ of work, complementary to both ‘Fordism’ and
Tayloristic ‘scientific management’. However, it was also seen as an
instrument for the ‘humanisation’ of work, by aiming at healthy and
attractive work environments. Interestingly, psychotechnicians also
showed an early interest in human beings as consumers, by carrying
out quantitative research on the psychology of advertising and the
consumption of commodities.185
Moede’s experiments, which involved Futura and a roman typeface,
are described in Futuraheft under the title ‘Welche Schrift ist leichter
lesbar?’ [‘Which typeface is easier to read?’] In the first experiment Moede measured the time it took subjects
to read a text set in Futura and a roman typeface.186 On average the
text set in the roman typeface was read in 28.5 seconds and the text set
in Futura in 27.8 seconds. In the second experiment the subjects had to
read the text from a shaking apparatus. The average results were 57.6
seconds for the roman typeface and 52.4 seconds for Futura. The third
experiment investigated the display effect (‘die Reklamewirkung’) of
Futura. It was carried out with a tachistoscope, and here Futura came
out as an even clearer winner: Futura was on average read after only 3.2
184. Campbell 1989, pp. 142, 356; Neue Deutsche Biographie 1994, p. 611.
185. See Joan Campbell’s fascinating study Joy in work, German work: the national
debate, 1800–1945 (1989, especially pp. 131–177).
186. Burke points out that ‘Renner most certainly intended the typeface for use in
[continuous] text.’ (1998a, p. 113).
successive exposures of 1/6 of a second, while the average number for the
roman typeface was 5.4. In addition, Moede refers very briefly to
‘distance’ and ‘low-illumination’ experiments, where, by now unsurprisingly,
Futura again came out as the superior. No wonder Moede’s
conclusion about Futura’s superiority is unambiguous. He even points out
that the results were obtained in spite of the subjects’ habituation to
roman typefaces.
Maybe I am too dismissive of Moede’s research. The fact that all the
results, regardless of operational method, without exception were in
Futura’s favour, and furthermore, that the research itself most likely
was made to order, does not necessarily imply that the results were
made to order. Nevertheless, as long as no details or illustrations of the
‘stimulus material’ are available, it is simply impossible to assess the
validity of the research in a meaningful way.
It was not unusual for type manufacturers during the 1920s and
1930s to market typefaces by referring to scientific research on legibility,
and not necessarily by such dubious statements as those in Futuraheft.
American Type Founders promoted their hugely popular serif typeface
Century Schoolbook, launched in the mid 1920s, by referring to a series
of tests and experiments made by, among others, the British Association
for the Advancement of Science and Clark University. The marketing
material cleverly implied that Century Schoolbook was objectively far
more legible than other typefaces, and claimed that the experimenters,
who had worked independently of each other, all arrived at the same
conclusions.187
Mergenthaler Linotype extensively marketed their (serif) newspaper
typefaces Ionic no. 5 (launched in 1925) and Excelsior (launched in
1931), and some other similar typefaces, as ‘The Legibility Group’ of
typefaces. They hailed ‘The Legibility Group’ of
newspaper typefaces as ‘a scientific solution to a typographic problem’
and Ionic no. 5 was claimed to have been designed to overcome ‘the
187. See for example the 1923 main typeface catalogue from American Type Founders
(1923, p. 203). See also Shaw 1989.
handicap of eye fatigue and impaired vision’ on the basis of ‘a study of
eye movements in reading’.188
When Luckiesh and Moss published their book on legibility
research, Reading as a visual task, in 1942, they emphasised that their
book was printed in Textype ‘designed primarily for … works requiring
intensive study and prolonged reading’ (p. [429]). Textype is similar to
Ionic no. 5 and belongs to Linotype’s ‘legibility group’, but it has a
considerably smaller x-height and was marketed for use in books and
periodicals.
Linotype’s competitor, the Intertype composing machine and
typeface manufacturer, marketed their imitations of Linotype’s
newspaper typefaces, Ideal News and Regal, in a similar manner. Thus,
Ideal News was marketed under the heading ‘Readability, legibility,
durability’ while Regal was marketed under the heading ‘A type face
scientifically designed for easy reading’.189
The strategy of selling a typeface, or any product, by procuring
favourable statements from scientists (I am not necessarily including
Moede’s report as such) is not unique to the inter-war period. A similar
strategy seems to have been employed in the 1980s for a typeface that
later developed into one of the most successful typefaces of the 1990s,
Erik Spiekermann’s sans serif typeface Meta. In an essay by
Spiekermann on the design process of the typeface (which initially was
commissioned and intended to become the new corporate typeface for
Deutsche Bundespost), he claims that before it was submitted (and
subsequently turned down) ‘a professor of applied psychology had
written a very favourable report on it’ (Spiekermann 1986). Paul Stiff,
the editor of Information Design Journal, recently described a similar
contemporary phenomenon the following way: ‘some information design
companies try to enlist the help of “researchers” for much the same
188. See for example the booklet The legibility of type, published by Mergenthaler
Linotype in 1935 (p. 27); and the 1939 main typeface catalogue from Mergenthaler
Linotype (1939).
189. See promotional type specimens for Ideal News and Regal from Intertype issued
during the 1930s. For a general introduction to the history of dedicated newspaper
typefaces, see Level 1989; and for a comprehensive visual index, see Gürtler 1988.
reasons that men in white lab coats appeared in the first toothpaste
adverts on tv’ (Stiff 1993, p. 43).
Parallels to the Futuraheft kind of cynical, or naive, scientism, vaguely
referring to ‘science’ or procuring favourable statements from
medical doctors or psychologists, can be found in abundance in an area
akin to typography. Exactly this kind of rhetorical strategy was
ubiquitous in a long and heated debate on handwriting models for
schoolchildren in post-war Norway. In an essay based on his research on
the introduction and implementation of the Marion Richardson-like
handwriting model ‘formskrift’ – which in the period from 1947 to 1962
managed to take over, monopolistically, from the older copperplate-like
handwriting style – Trond Berg Eriksen discloses an astonishing number
of almost caricature-like scientistic statements. This kind of rhetoric was
applied by all sides of the debate, and the ergonomic arguments were
often mixed with political, moral, social hygienic and aesthetic
arguments as well as more pragmatic arguments (Eriksen 1993).190
Paterson and Tinker 1932
*
The article ‘Studies of typographical factors influencing speed of reading.
10. Style of typeface’ was the tenth of thirteen numbered legibility
studies published by Tinker and Paterson in the Journal of Applied
Psychology in the period from 1928 to 1936. This study was also the first
of four studies – three experimental and one non-experimental – carried
out either by Tinker alone or with co-authors between 1932 and 1944, on
typeface legibility, using the same typefaces but different methods each
time.
190. For a description and survey of Marion Richardson’s and other handwriting models,
albeit in an English context, see Myers 1983. Gordon and Mock 1960 is brief, but still
a comprehensive index of 20th century handwriting models. Sassoon 1999 is a
recently published book on the same topic. For critiques of the methodology of several
research papers on the ergonomics of handwriting and handwriting models, see
Sassoon 1988.
* Donald G. Paterson and Miles A. Tinker. 1932b. ‘Studies of typographical factors
influencing speed of reading 10: Style of type face’. Journal of Applied Psychology,
vol. 16, pp. 605–613.
The authors open by pointing out the tendency of ‘editorial,
advertising and printing experts’ to show much greater interest in the
legibility of various typefaces than in, for example, line spacing and
line width. They then claim that they will settle the question of
the relative legibility of typefaces:
Absence of definite facts tends to perpetuate such a state of affairs. It is
obvious that experiments on the relative legibility of type faces are
needed in order to determine the merits of the claims advanced by
partisan advocates. (p. 605)
They then continue by reviewing the literature, including Roethlein
(1912) and Pyke (1926). They claim that, although Roethlein’s study has
been quoted widely, and was a pioneering experiment, it must be
discounted because it is based on stimulus material of isolated letters
and meaningless groups of letters ‘and thus fails to give information
regarding the relative legibility of typefaces as found in actual reading
situations.’ (pp. 606, 612). They describe Pyke’s experiment as more
ecologically valid than Roethlein‘s because it represents ‘a closer
approach to the normal situation’ (p. 607). However, they also discount
Pyke’s experiment by referring to Pyke’s own doubts about his own
experiments. Elsewhere in the paper they put it more bluntly, though
without referring to Pyke explicitly: ‘... and the failure of any previous
investigator to study the question under ordinary reading conditions led
to the present study.’ (pp. 612–613).
Paterson and Tinker based their selection of typefaces on feedback
from a large number of editors and publishers. On this basis they
selected the common typefaces Caslon Old Style, Scotch Roman,
Garamont, Old Style, Bodoni, Cheltenham, and Antique. They added, in
order to provide strong contrast, the typewriter-like typeface American
Typewriter, the textura typeface Cloister Black, and the sans serif
typeface Kabel light. All the typefaces are illustrated in the article
(p. 608). The authors applied the Chapman-Cook speed of reading test
(which includes a comprehension check), with the typefaces displayed
in text set solid in 10 point nominal type size, at a 19 pica line length, and
read for 1 minute and 45 seconds.
Paterson and Tinker conclude that, except for the typewriter-like
typeface and the textura typeface, all the typefaces in the experiment are
‘equally legible’ (pp. 609, 613). The tabulated results show that the sans
serif typeface Kabel is 2.3 per cent less legible than the standard
typeface in the experiment, Scotch Roman (p. 611). The authors
emphasise the significance of the ‘findings’ for the printing industry
(p. 612). They also point out that their results disagree with Roethlein’s
results, but closely agree with Pyke’s conclusions.
The authors emphasise that the ‘ultra modern’ sans serif typeface
Kabel came out ‘practically as legible as the other type faces in common
use’ (‘however with a slight disadvantage’) (pp. 609, 612). To this they add
that the sans serif typeface might have come out better if the subjects had
been more used to reading it.
The authors do, however, overlook an obvious threat to the
internal validity of their controlled single-variable experimental design.
The typographical variable ‘typeface’ is the independent experimental
variable, while the typographical variables line length, interlinear spacing
and (nominal) typesize are held invariant. However, by applying the same
interlinear spacing to each typeface, the results may very well be caused
not by the difference in typeface, but by differences in the ratio of the
actual appearing typesize to interlinear spacing, or in the ratio of the
x-height (the dominating dimension of a typeface) to interlinear spacing.
At the very least, this will be a confounding factor.
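To make the confound concrete, here is a minimal sketch with purely hypothetical figures (not taken from Paterson and Tinker’s data): two typefaces set solid at the same nominal size share the same line feed, yet their ratios of x-height to line feed differ.

% Hypothetical figures only; a 10 pt body set solid gives a line feed of
% roughly 3.5 mm (one point is approximately 0.35 mm).
\[
  r = \frac{\text{x-height}}{\text{line feed}}, \qquad
  r_{A} = \frac{1.5\ \text{mm}}{3.5\ \text{mm}} \approx 0.43, \qquad
  r_{B} = \frac{1.9\ \text{mm}}{3.5\ \text{mm}} \approx 0.54.
\]

Any difference in measured performance between the two faces may then reflect the difference between these ratios rather than the difference between the letterforms themselves.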
Furthermore, the actual appearing size of Kabel light (as shown in
an illustration) is very large relative to the other typefaces (except for
American Typewriter). Thus, the effective line space becomes
comparatively small, to the possible disadvantage of Kabel light.
Moreover, among typographers in general, Kabel is not regarded as a
sans serif typeface well suited for continuous text, due to certain
idiosyncratic design features such as the uneven distribution of internal
space between individual characters. It is certainly not a
representative sans serif text typeface.
Webster and Tinker 1935
*
This article, ‘The influence of type face on the legibility of print’ by Helen
A. Webster and Miles A. Tinker, was published three years after the
previous article by Tinker that compares the relative legibility of various
typefaces.191
The authors emphasise the study’s potential usefulness to printing
practice (p. 52). The rationale behind this experimental study is to
compare the data generated by the speed of reading method (used in the
previous study) with the variable distance method, by employing the
same typefaces as last time. By the distance method the authors refer to
the maximum distance from the eye which the stimulus material can be
read. Five-letter words are arranged in four lines of four words each. The
horizontal and vertical distances between the words are not given.
The authors conclude that the ranking order of the previous study
hardly relates to the ranking order of the present study. American
Typewriter, one of the least legible typefaces last time, came first this
time, ‘read at a distance distinctly greater than the others.’ (p. 46).
Kabel light came out worse than last time, penultimate on the ranking
list, yet no worse than ‘slightly less legible than the Scotch-Roman
standard’ (against which all the typefaces were measured) (p. 46). The
expression ‘slightly less legible’ refers to a difference in average distance
between Kabel light and Scotch Roman of 140.75 cm compared with
140.68 cm, i.e. less than one millimetre.
The authors point out that the theoretical and operational criteria
differ between the two studies, and that the ‘factors’ which determine
legibility may not be the same (or may even be the opposite) for reading
at a distance as for reading continuous text under ‘ordinary reading
conditions’. They suggest that the relatively large (actual) size and the
relatively large strokewidth of American Typewriter are probable
determining factors for its good result in the present study. (The
* Helen A. Webster and Miles A. Tinker. 1935. ‘The influence of type face on the
legibility of print’. Journal of A pplied P sychol ogy, vol. 19, pp. 4352.
191. Helen Webster did her M.A. thesis at Tinker’s university, the University of
Minnesota. The title of her thesis was ‘The influence of type face and paper surface
on the legibility of print’ (Webster 1933).
relatively heavy typefaces Cheltenham and Antique also do well in this
study). The authors also point to the ‘inadequate adjustment of spacing
around the letters’ of American Typewriter192 as a determining factor for
the poor result in the previous study. They explain this by pointing out
that when text is carefully read as isolated words at a distance, the
individual letters become more important, whereas when read rapidly
as continuous prose under ‘normal conditions’, familiar word-forms and
the context (surrounding word-forms) become more important
(pp. 49–51).
However, how can this explain the poorer result of Kabel light –
another typeface with a large actual size – in the present study? By
looking at the illustration that accompanies the previous study (by
Paterson and Tinker), we can see that while Kabel light is rather large
in actual size (but far from as large as American Typewriter, due to
American Typewriter’s excessive x-height), it is one of the lightest, if not
the lightest, typefaces in the study. Thus, Kabel medium, nearer the weight
of the other typefaces in the study, might have scored better.193
Webster and Tinker are keen to stress that the clear differences
produced by the two studies are not necessarily contradictory, but
depend on the different methods employed. They further stress that this
shows that different criteria of legibility cannot be used interchangeably.
The authors conclude: ‘if … perceptibility of words at the greatest
possible distance from the reader is desired, and speed is less essential,
American Typewriter type face is the best of the ten type faces used in
this investigation’. However, they seem to have forgotten that American
Typewriter had the largest actual size of the typefaces appearing in the
study.
192. American Typewriter is a typewriter-like ‘monospaced’ typeface, not to be confused
with the current ITC American Typewriter, where the characters have individual set-
widths.
193. However, whether Kabel medium existed in the USA at the time of the execution of
the previous study (published in 1932) is not certain. According to Mac McGrew, Sans
Serif Medium – the medium version of American Monotype’s series of copies of
Gebrüder Klingspor’s foundry type Kabel – was designed sometime between 1930
and 1934, while the light version appeared in 1930 (McGrew 1994, p. 279). I assume
that it was Monotype’s Sans Serif light which was employed in Tinker and his co-
authors’ studies, and not German foundry type. However, Tinker does not say
anything conclusive about this.
Luckiesh and Moss 1937
*
‘The visibility of various type faces’, published in the Journal of the
Franklin Institute, by the engineers and prolific legibility researchers
Matthew Luckiesh and Frank K. Moss at General Electric Company’s
Lighting Research Laboratory, does not contain any references to the
previous experiments described in this thesis. In this study they employ
one of their own methods, measuring ‘the relative visibility’ with their
Luckiesh-Moss Visibility Meter.
The article opens by stating very much the same concerns as
Griffing and Franz did in 1896, about impaired vision created by ‘visual
tasks … imposed by civilization’ (p. 77). They reason that such problems
can be eased by correction of ocular deficiency, by proper lighting (not
surprisingly, considering their affiliation), and by increasing the visibility of
reading matter (the subject of the present study). ‘Since reading is a
universal task in the home, schoolroom, and office, the visibility of the
reading matter is of major importance.’ (p. 77).
Luckiesh and Moss acknowledge the validity of craft knowledge and
the knowledge represented by existing artefacts to a larger extent than
Tinker:
In general, an acceptable standard of type size has been more or less
definitely indicated by the characteristics of good typography since these
have been evolved through mass experience for generations. (p. 77)
They also stress that their results should not be taken too seriously
and applied to all kinds of situations, for example where a less ‘visible’
italic typeface is used within continuous prose to catch attention, or
where certain aesthetic considerations are important (p. 78).
The study employed 20 Monotype typefaces (of 8 typeface families)
in 8 point size, including 3 variants of a sans serif typeface: Sans Serif
light, medium, and bold (p. 79). The typeface Sans Serif light is probably
identical to the Kabel ‘lite’ employed by Tinker. The serif typeface
families included are Bodoni, Caslon, Cheltenham, Copperplate Gothic,
Goudy, Cochin, and Garamond.
* Matthew Luckiesh and Frank K. Moss. 1937. ‘The visibility of various type faces’.
Journal of the Franklin Institute, no. 223, pp. 77–82.
The ‘Visibility Meter’ employed is described at length by Luckiesh
and Moss elsewhere.194 It is criticised as inadequate (not measuring
‘optimal’ legibility, but only ‘threshold’ visibility) by Tinker (1963,
pp. 9–10, 17). The main features of this apparatus are two circular filters
whose density varies continuously as they are rotated in front of the eyes
of the reading subjects; the apparatus measures threshold visibility. Although
the typefaces, the paper and the illumination are precisely described, the
visual organisation of the ‘stimulus material’ is not. Due to the nature of
the ‘threshold’ visibility experiment, the stimulus material is obviously
not intended to be read continuously. However, illustrations and
information elsewhere indicate that the ‘stimulus material’ may
actually be continuous text.195
The results show that the weight-variants of the Sans Serif typeface
family are measured to have a better relative visibility than most of the
equivalent weight variants of the roman typeface families (1937, p. 79).
However, Luckiesh and Moss themselves do not discuss the results of the
sans serif typeface variants to any extent. Of the four light typefaces
(I am excluding the capitals-only typeface, Copperplate Gothic, from this
and the following comparisons) Sans Serif light gets the second highest
‘relative legibility’ score after Cochin Light (a mannered serif text
typeface), but before Caslon Light and Goudy Light. Of the four medium
weight typefaces, Sans Serif medium again gets the second highest
legibility score after Goudy Antique (a marginal difference), but before
Cheltenham Wide and Bodoni Book. However, of the seven bold
typefaces Sans Serif bold gets the lowest score.
Perhaps not unexpectedly, Luckiesh and Moss, working for General
Electric as they did, emphasise throughout the paper that deficiencies
with regard to visibility may be compensated for by the level of illumination.
Accordingly, they also express ‘relative visibility’ as ‘footcandles required
(for equal visibility)’. That is, a high score on ‘relative visibility’ matches
a low number of ‘footcandles required’, and vice versa.
194. Luckiesh and Moss 1942, pp. 67–92, and inserted plate facing p. 67.
195. Luckiesh and Moss 1942, p. 398, insert facing p. 67, and insert facing p. 244.
Luckiesh and Moss 1942
*
Luckiesh and Moss’ book Reading as a visual task, published in 1942,
not only includes an exposition of their typeface visibility experiment
from 1937 (see above), but also an exposition of another, until then
unpublished, visibility experiment (pp. 159–162). In this latter
experiment (also based on the Luckiesh-Moss Visibility Meter) the
typefaces employed were Caslon Old Face, Scotch No. 2, Textype, the
sans serif typeface Metrolite No. 2, Bookman, the newspaper typeface
Excelsior, and the slab serif typeface Memphis Medium, all from
Linotype. All typefaces were of ten point size, except for Caslon Old
Face which was 11 points.
The sans serif typeface Metrolite No. 2 was ranked in the middle,
10.3 per cent better than one extreme, Caslon Old Face, and 5.8 per
cent worse than the other extreme, Memphis Medium. Looking at the
illustrated ranking (p. 161), there seems to be a clear correlation between
weight and visibility: the rather bold Memphis Medium is ranked at the
top, and the very light Caslon Old Face at the bottom.
Tinker 1944
**
In ‘Criteria for determining the readability of typefaces’, published in
The Journal of Educational Psychology in 1944, Miles Tinker tries
to establish the superiority of speed of reading as a method for assessing
legibility. The article does, however, also introduce a fourth study by Tinker on
typeface legibility, employing the same ten typefaces as in the previous three
studies (of which one is the preference study referred to in the section
‘Subjective preference studies’ in chapter 2). Tinker juxtaposes the
results of the four studies – ‘speed of reading’,196 ‘perceptibility at a
* Matthew Luckiesh and Frank K. Moss. 1942. Reading as a visual task. New York:
Van Nostrand.
** Miles A. Tinker. 1944. ‘Criteria for determining the readability of typefaces’. Journal
of Educational Psychology, vol. 35, no. 7, pp. 385–396.
196. Paterson and Tinker 1932. See the review in this chapter.
distance’,197 ‘readers’ opinion’,198 and ‘visibility’ (the study in question
here, based on the Luckiesh-Moss Visibility Meter) – and uses his
juxtaposed interpretations to support, in a well-articulated manner, his
advocacy of the speed of reading method (especially on p. 394).
In this fourth typeface study by Tinker, single five-letter words of
ten point size were identified one at a time by each subject through
the rotating filters of the Luckiesh-Moss Visibility Meter. This time
Kabel light came penultimate, only 8.2 per cent better than the
standard typeface in the test, Scotch Roman. Tinker points out that the
typefaces with the highest ‘visibility’ tended to be rather bold; thus Antique
came first, 56.3 per cent better than Scotch Roman. Likewise, the
typefaces with the lowest visibility tended to be rather light, cf.
Kabel light.
These studies illustrate that the different operational methods
employed produce inconsistent results. Tinker claims that ‘there is a slight
tendency for the more visible faces to be read slower’ and that a ‘similar
trend is found in comparing perceptibility and speed of reading.’ (p. 391).
Tinker points out that Scotch Roman, with a very low score on both
‘visibility’ and ‘perceptibility’, ‘is read as fast as all of the other commonly
used typefaces.’ (p. 392). Scotch Roman’s score should, however, not come
as a surprise. Its low score on ‘visibility’, and on ‘perceptibility’ at a
distance, may very well be due to the fact that it is the smallest of the
ten typefaces with regard to actual size. In the reading of continuous text, a
slight difference in size may not be so decisive, and may furthermore be
counteracted by Scotch Roman’s relatively large and generous actual
interlinear spacing, given that all ten typefaces have the same nominal
interlinear spacing.
197. Webster and Tinker 1935. See the review in this chapter.
198. Paterson and Tinker 1940a, pp. 18–20. See the section ‘Subjective preference studies’
in chapter 2.
English 1944
*
The year 1944 also saw another typeface legibility study published:
‘A study of the readability of four newspaper headline types’, by Earl
English in Journalism Quarterly.199 This article was based on a PhD
dissertation at the department of psychology at the State University of
Iowa. The article devotes space to a thorough discussion of theoretical and
operational criteria of legibility. The author refers to work on typeface
legibility by Roethlein; Paterson, Tinker and Webster; and Luckiesh and
Moss. What is more, the author has also corresponded with the type
designer Frederic Goudy, and received material support and advice from
the typeface expert Douglas McMurtrie of the Ludlow Typograph
Company.
English measured the number of words read in three-line headlines
of three different sizes (14, 24, and 30 points) during brief exposures
through a tachistoscope. The typefaces employed were the serif typeface
Bodoni bold, the slab serif typeface Karnak bold, and the sans serif
typeface Tempo medium, all typefaces in common use.200 Specimens of
the actual ‘stimulus material’ are not shown in the article.
Interestingly, this author acknowledges that by controlling all
‘factors’ (i.e. typographic and environmental variables) in order to isolate
the effects of the independent experimental variable ‘one type may gain
an advantage in the comparison by being better suited to the controlled
conditions than another’ (p. 222). Thus, the line-spacing in this
experiment was adjusted differently for the different typefaces: ‘The
Ludlow type designing department considered spacing effects equalised
* Earl English. 1944. ‘A study of the readability of four newspaper headline types’.
Journalism Quarterly, vol. 21, pp. 217–229.
199. This was the first of several legibility studies published in Journalism Quarterly in
the post-war period. The journal carried its first mention of legibility research in
1941, with Douglas McMurtrie’s review of Paterson and Tinker’s How to make type
readable (Neman 1968, p. 3).
200. The Karnak typeface family, designed by Robert Hunter Middleton at the Ludlow
company, was derived from Linotype’s Memphis. The first versions of the Tempo
family, by the same designer, were introduced in 1930 as an answer to the popular
Futura and other Futura-like typefaces of that period (McGrew 1994, pp. 193,
303–304). Nevertheless, note that Tempo is distinctly different from Futura. It is not
an imitation.
under this arrangement.’ (p. 223). This experimental design therefore
represents an important but rare insight in legibility research. It also
indicates an experiment with a high degree of internal validity, at least
with regard to the stimulus material’s influence on the internal validity.
In legibility research, the only way to control possible nuisance variables
is not to hold them constant, but to adjust them differently in an attempt to
equalise ratios and relationships between the experimental typographic
variable and the other typographic variables. Thus, the design of English’s
experiment contains (at least in principle) an answer to the criticism of
legibility studies relying on single-variable (univariate) laboratory tests,
where only one typographic variable is manipulated as the independent
experimental variable while the other typographic variables are held constant,
whereas in the real work of typographic design a large number of
variables interact simultaneously in a complex way.
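Schematically, and again with purely hypothetical numbers rather than anything reported by English, the equalisation principle amounts to choosing the line feed for each typeface so that the ratio of x-height to line feed is held constant, instead of holding the line feed itself constant:

\[
  \frac{x_i}{L_i} = c \quad\Longrightarrow\quad L_i = \frac{x_i}{c}
\]
% e.g. with c = 0.45, a face with an x-height of 1.5 mm would be given a line
% feed of about 3.3 mm, and a face with an x-height of 1.9 mm one of about 4.2 mm.

Whether the Ludlow type designing department proceeded in exactly this way is not recorded in the article; the sketch merely illustrates the kind of adjustment English describes.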
English also arranged a readers’ preference test, but dismissed the
results as accidental. And in another experiment, designed like the first,
he tested family variants of Intertype’s serif typeface Cheltonian, in bold,
bold condensed, and bold capitals only.201
According to the results reported, Bodoni bold, Tempo bold,
Cheltonian bold, and Cheltonian bold condensed were read ‘significantly’
faster than Karnak medium and Cheltonian bold capitals only. The
sans serif typeface Tempo bold got the highest score, though not
‘significantly’ higher than Bodoni and Cheltonian.
201. Cheltonian was a copy of American Type Founders’ Cheltenham. Except in a clarifying
footnote, English consistently refers to Cheltonian as Cheltenham. ATF’s Cheltenham
condensed was used for the size of 30 points.
Burt, Cooper, and Martin 1955 * / Burt 1959 **
Cyril Burt’s and his (fictitious?202) co-authors’ once acclaimed article
‘A psychological study of typography’ was published in 1955 in the
journal Burt edited, The British Journal of Statistical Psychology. Burt’s
almost identical monograph, bearing the same title, with a foreword by
Stanley Morison, was published by Cambridge University Press in 1959.
The two publications’ positive reception until the early 1980s, among
both researchers and designers, has been thoroughly dealt with by
Rooum (1981) and Hartley and Rooum (1983). They have convincingly
shown that Burt’s well-known dubious, if not fraudulent, practices203 also
extended into his work on legibility and typography.204
Neither version of ‘A psychological study of typography’ presents
original experimental research on the legibility of sans serif typefaces.
Burt briefly discusses the legibility of sans serif typefaces in a footnote in
the article (1955, p. 56n). In another footnote Burt mentions preliminary
experiments involving four named typefaces ‘found to be much inferior’,
including the sans serif typeface Gill Sans (1955, p. 32n). The former
footnote is then expanded into four additional paragraphs plus a footnote
in the monograph (1959, pp. 8–9).
Burt opens his discussion by claiming that the ‘four most usual
arguments’ in favour of sans serif typefaces are ‘all unsupported by
experimental evidence.’ He then argues vehemently against the use of
sans serif typefaces, while supporting his a priori arguments with subtle
hints which might give the reader the impression that his views are
based on scientific evidence, for example by employing the expression
‘as psychologists have so often pointed out’. In order to substantiate his
* Cyril Burt, W.F. Cooper, and J.L. Martin. 1955. ‘A psychological study of typography’.
The British Journal of Statistical Psychology, vol. 8, pt. 1, pp. 29–57.
** Cyril Burt. 1959. A psychological study of typography. With an introduction by
Stanley Morison. Cambridge: Cambridge University Press. (Reprinted by Bowker in
1974 for the College of Librarianship, Wales). See also Burt’s article ‘The readability
of type’ in New Scientist (1960a).
202. See Hartley and Rooum 1983, p. 209.
203. See for example Hearnshaw 1979.
204. Robert B. Joynson’s long attempt to rehabilitate Burt (Joynson 1989) ignores Rooum and
Hartley’s articles. Nor are these articles mentioned in Mackintosh’s
collection of articles (1995).
views, Burt cites two named sources (Kerr 1926 and Crosland and
Johnson 1928), which, when examined, turn out not to confirm his
claims.205 Interestingly, Burt refers neither to Pyke nor to the prolific
writings of Tinker, a natural frame of reference for a study like
A psychological study of typography. The picture of dubious rhetoric and
dubious citation practice is in stark contrast to Beatrice Warde’s206
melodramatic review207 in Penrose Annual of Burt’s article:
let me now tell you neighbours that the most reassuring and welcome
document has come to the attention to the typographic fieldworkers this
year (judging by the cries of ‘At last!’ with which they received it) is one
which came fluttering down almost by accident from the very Other
Hemisphere to that of the arts, namely that of the sciences. … It was
welcomed by practising typographers as the first scientific study of their
subject …[and] it was hailed as surprisingly helpful, coming as it did
from the realm of science. (Warde 1956)
It is, however, worth noticing that Burt, at least in one respect,
displays admirable sensitivity compared with most previous legibility
researchers: he points out that the x-height is the most dominant
dimension of a typeface, and, in addition, that nominal
typesize does not necessarily correspond to actual typesize (1955, p. 32;
1959, p. 6). It is reasonable to believe that Burt’s frequent contact with
the typeface expert Beatrice Warde played a role here.
Burt’s biased exposition is reinforced in Stanley Morison’s foreword,
where over several pages Morison proclaims the superiority of serif
typefaces (pp. xi–xv). Morison appeals to the authority of scientific
investigation, while also appearing broadminded and unprejudiced:
Thus, while the investigators found that, in terms of word-recognition,
the design known as sans serif ‘was the worst of all’, and confirmed
what other investigators had proved, that ‘serifed letters are more
legible than unserifed’ (p. 9), it does not follow that sans serif letters are
never appropriate in any typographical composition, literary or
otherwise. (p. xi)
205. See Hartley and Rooum 1983, p. 205; and the section ‘Crosland and Johnson 1928:
the serifs that never were’ in this chapter.
206. See Hearnshaw for an exposition of the extraordinary relationship between Warde
and Burt, who exchanged hundreds if not thousands of letters (1979, pp. 199–203).
Badaracco suggests that Beatrice Warde saw a potential in Burt’s publications for the
marketing of Monotype’s typefaces (1995, pp. 78, 81).
207. Beatrice Warde also reviewed the book in the London Times (Badaracco 1995, p. 223).
Unfortunately for Morison, both the ‘finding’ and the ‘proof’ in the above
quote are the ones which in fact turned out not to confirm Burt’s claims.
Given this background, the final pompous conclusion in Morison’s
foreword becomes absurd:
No degree of consistency with an ideological, artistic, or intellectual
theory[208] is an acceptable substitute for conformity with verifiable
paleographical, physiological and psychological facts. (p. xix)
Soon after the book was published, P.M. Handover, in an extensive
article on the history of sans serif typefaces published in Motif and
elsewhere,209 referred to Burt’s book and bluntly turned Burt’s claims
into the following statement:
The British psychologist, Sir Cyril Burt, has recently recalled and
reaffirmed scientific findings that ‘for word recognition, a sans serif type
face was the worst of all’. (Handover 1961, p. 81)
This statement by Handover was in turn used as evidence by
Reynolds Stone against the use of sans serif letterforms in a public
debate on the lettering on British traffic signs:
Fashionable or not, the use of sans meant ignoring experts like the
psychologist Sir Cyril Burt, who ‘has recently recalled and reaffirmed
scientific findings that “for word recognition a sans serif type face was the
worst of all”.’ (see Letters without serifs by P.M. Handover in Motif 6).
Presumably motorists, as well as Motif readers, are concerned with ‘word
recognition’.210
In modified versions of Handover’s article, which appeared in Monotype
News Letter and Revue suisse de l’Imprimerie, Handover added yet
another dubious piece of information by paraphrasing from a then more
recent article which Burt had published in Yearbook of Education (i.e.,
Burt 1960b):
Independent investigations in the U.S.A. support the result of British
experiments. (Handover 1963, p. 9)
208. That is, modernist advocacy of common use of sans serif typefaces. My comment.
209. Of which in 1963 a slightly altered version was published both in Monotype News
Letter (Handover 1963) and in French in Revue suisse de l’Imprimerie (the French-
language section of the Swiss journal Typografische Monatsblätter). The article
appeared in other guises as well.
210. As an invited comment to Christie and Rutley 1961, p. 61 (appearing in the same
issue).
Interestingly, more or less the same phrase also appeared a few
years later in another context: in ‘Map design and typography’, an
unsigned article in a special issue of Monotype Recorder entitled
‘Precision in map making’:
Scientific investigation has shown sans serif to be the worst of all type
styles for word recognition. (Monotype Recorder, vol. 43, no. 1, p. 45)
As mentioned above, the reception history of A psychological study
of typography has been thoroughly dealt with by Rooum (1981) and
Hartley and Rooum (1983).211 However, in spite of their devastating
critique in the early 1980s, Burt is still often cited in a positive or at
least uncritical manner in contexts such as typography or information
design.212 These citations are not necessarily hidden away in obscure
privately published pamphlets or in printing industry trade journals.
Examples can be found in recent issues of refereed journals such as
IEEE Transactions on Professional Communication and Safety Science,
and in the two most prominent and acclaimed textbooks on information
design available today: Karen Schriver’s Dynamics in document design:
creating texts for readers, and Kostelnick and Roberts’s Designing visual
language: strategies for professional communicators. These citations
range from ‘innocent’ uncritical citations where Burt is treated as any
other source of knowledge, but not necessarily with an explicit acceptance
of his ‘findings’, as in Schriver’s textbook, to positive endorsements,
as in Kostelnick and Roberts’s textbook, where Burt is described as an
211. I dare to mention two prominent publications that Rooum and Hartley overlooked.
The American textbook The graphics of communication: ‘Burt has offered evidence’
(see Turnbull and Baird 1980, pp. 87–88). And the 1960 and 1979 editions (and 1996
reprint) of the well-known Glaister’s glossary of the book, also published as Encyclo-
pedia of the book: ‘Important results have been achieved’ (Glaister 1979, p. 278).
212. See for example Gribbons 1991, pp. 44, 49; Kostelnick 1995, pp. 185–187; Kostelnick
and Roberts 1998, pp. 142, 204; Pedersen and Kidmose 1993; Silver and Braun 1993,
pp. 623f. For more such references, but specifically on the legibility of sans serif
typefaces, see for example Gallagher and Jacobson 1993, p. 101; Lubell 1993; Schriver
1997, pp. 274, 276, 301. Badaracco represents an exception (she is aware that Burt
has been discredited). She nevertheless writes uncritically about the existence of
experiments on the legibility of sans serif typefaces allegedly performed by Burt
(Badaracco 1995, pp. 78–80, 223; cf. Hartley and Rooum 1983). Watts and Nisbet represent
an early exception. They cite Burt, but add cautiously: ‘Unfortunately, many of the
opinions expressed are not substantiated by empirical evidence.’ (1974, p. 32).
‘excellent source’. And similarly, in a recent paper in the IEEE
Transactions on Professional Communication, Kostelnick states that
Studies in the legibility of typefaces, such as those conducted by Tinker
and by Burt, represent the quantum physics of document design
because they address basic performance questions about typefaces.
(Kostelnick 1995, p. 187)
For a more complete picture of the reception of Burt’s book, why not
include an annotation in a recent catalogue from a British antiquarian
bookshop that specialises in books on typography and printing:
A landmark experimental survey of the mind’s capacity to take in a
printed message. The first test of accuracy and speed of reading reveals
the legibility of various styles of setting. The second test considers
choice of typeface. (Wakeman 1999, p. 48)
To conclude: Burt’s suspect contribution to typography has been
used for what it is worth by some, and is still alive in some quarters.
Christie and Rutley 1961: The legibility of traffic signs and the public debate on Jock Kinneir’s Motorway alphabet
*
In August 1961, two researchers at the Road Research Laboratory in
Britain published a paper on the ‘Relative effectiveness of some letter
types designed for use on road traffic signs’ (Christie and Rutley 1961a).
It appeared in Roads and Road Construction. A shorter version was
published in Design the same month (Christie and Rutley 1961b). These
two papers represented the culmination of a vigorous public debate on
letterform legibility which had been going on since March 1959. The
controversy and the Road Research Laboratory’s subsequent
experiments arose in connection with the introduction of direction
signs for Britain’s new high-speed motorways.
* A.W. Christie and K.S. Rutley. 1961a. ‘Relative effectiveness of some letter types
designed for use on road traffic signs’. Roads and Road Construction, no. 39, August,
pp. 239–244; and A.W. Christie and K.S. Rutley. 1961b. ‘[Research] on road signs’.
Design, no. 151, August, pp. 59–60.
The design of these directional and other informational motorway
signs represented the first phase of an overall development of a new
coherent system of traffic signs in Britain between 1957 and 1963. The
new system was a late British adaptation to (but not an adoption of)
European practice and the UN Geneva protocol of 1949 (see Froshaug
1963). The new British system contained a large number of innovative
direction signs of several categories for use on motorways (including
direction signs on ‘all-purpose’ roads pointing to connected motorways),
direction signs for use on major ‘all-purpose’ roads, and direction signs
for use on local ‘all-purpose’ roads. The new British system also included
basic categories of largely pictorial traffic signs (both iconic and
symbolic), such as mandatory signs, prohibitory signs and warning signs.
Although modified and redrawn, these signs were more directly adopted
from the 1949 Geneva Protocol than the direction signs.213
The Motorway alphabet
The Anderson advisory committee on traffic signs for motorways
(1957–1962), set up by the Ministry of Transport, consulted a large
number of interested organisations, and it also did something which was
then groundbreaking:214 in June 1958 it appointed a professional
designer. Jock Kinneir – who by then had already designed the signing
system for Gatwick Airport215 – designed, on the basis of the
committee’s broad recommendations, the elegant and innovative
motorway direction signs, as well as their accompanying letterforms.
213. For background information on this development, see Spencer 1961 (an illustrated
article depicting the chaotic and insular traffic sign situation in Britain before this
work started); Ministry of Transport 1962 (The Anderson report on traffic signs for
motorways); Ministry of Transport 1963 (The Worboys report on traffic signs for ‘all-
purpose’ roads); Froshaug 1963 (a profusely and systematically illustrated article on
the historical development of traffic signs, leading up to the new British traffic sign
system based on the recommendations of the Worboys committee); Krampen 1983
(a special issue of Semiotica, devoted to the origin and development of road sign
systems in a broad international context); Department of Transport 1991 (a brief
history of traffic signs in Britain); Department of Transport 1994 (a design manual
for British directional traffic signs); Department of Transport 1995 (a comprehensive
all-colour exposition of the current British traffic signs).
214. Froshaug 1963, p. 50; also Krampen 1983, p. 110.
215. See Kinneir 1970, pp. 15–16.
Kinneir, who had rejected on ‘aesthetic grounds’ the committee’s
initial wish216 to employ the German DIN sans serif lettering of capitals
and small letters,217 designed a new sans serif alphabet. It resembles, and was probably based on, the typeface Berthold Grotesque from the
German type foundry Berthold.218 Like the DIN alphabet, Kinneir’s new
design is characterised by obliquely cut terminals as well as relatively
open semi-enclosed counters of letters like a, c, e and s. However, the
DIN alphabet suffers from having rather narrow and rectangular
letterforms, and further, from inconsistent orientations with respect to
how the terminals of its capitals with semi-enclosed counters are cut.219
Kinneir’s new design got the committee’s full support, as well as
‘almost unanimous’ support from the consultative organisations.220
216. ‘We have as a committee got into the habit of accepting the general weight &
appearance of the German alphabet as being the sort of things we need! I think
therefore something on these lines is what the committee believes it wants.’ Letter
from Colin Anderson to Jock Kinneir, dated 26 June, 1958. (Location: Margaret
Calvert, London.)
217. Kinneir 1971, p. 6; 1984, p. 344.
218. Berthold Grotesque was issued in 1928. Nobel Grotesque (1929) from the Dutch
type foundry Amsterdam is a similar typeface. Specimens of both typefaces are shown
in Jaspert et al. 1970 (pp. 254, 312–313).
219. The numerals for use in route numbers only (and ditto letters A, B, and M) got a
different design from the main numerals which were used for stating distances and
later for the numbering of motorway exits (see illustrations in Ministry of Transport 1962, pp. 40–41, as well as illustrations in Department of Transport 1995, pp. 47–51).
David Kindersley’s comment: ‘The road numbers, together with their letters, are even
worse than the main alphabet, and do not conform with the simplest rules of legibility
or differentiation.’ (Kindersley 1960, p. 465).
220. Ministry of Transport 1962, p. 4. Specimens of the alphabet appeared, without any
name attached, in the Anderson committee's report (i.e., in Ministry of Transport
1962, pp. 3539). In more recent literature it is referred to as ‘the Motorway alphabet’
(Department of Transport 1994, p. 4). Jock Kinneir was subsequently engaged as
designer to the Ministry of Transport's Worboys committee (1961–1963) on traffic
signs for ‘all-purpose’ roads. In the Worboys report specimens of two slightly modified
variants of the same letterforms carried the names ‘Transport Medium’ (on a dark
background) and ‘Transport Heavy’ (on a light background) (see Ministry of Transport
1963, pp. 97–102). However, an alternative style for route-numbering was not
included for ‘all-purpose roads’, as for the motorways (see the previous footnote).
FIGURE 3. Jock Kinneir’s Motorway alphabet. The letters ABM and numerals in
the lower right corner are for route-numbers only. The original background
colour is blue. (From: Ministry of Transport 1962, pp. 36–41: figures 1–4.)
When designing his sans serif alphabet, Kinneir performed informal
‘low-tech’ experiments: with reflective material in an underground
garage in order to determine a sensible weight; in Hyde Park in London
in order to determine sensible appearance-widths and a sensible
x-height; and in addition, experiments in order to create an appropriate
letterspacing system. Kinneir later commented on other aspects of the
creation of the sans serif letterforms in question:
The basis of the letter design was the need for forms not to clog when
viewed in headlights at a distance. For this reason counters ... had to be
kept open and gaps prevented from closing. Also, as pointillist painting
has shown, forms tend to merge when viewed from a distance, and this
suggested a wider letter spacing than is usual in continuous text.
(Kinneir 1984, p. 344)
The public controversy
It was exactly this lettering that, after appearing on the first experimental motorway signs put up in 1959, provoked public controversy on letterform legibility. The controversy was reflected in the columns of publications as diverse as The New Scientist, Design, Roads and Road Construction, Traffic Engineering & Control, The Times, The Daily Telegraph, The Observer and Cambridge Daily News.221
221. The following chronological list does not claim to be exhaustive. Letters to the editor
in The Times (1959): March 17, p. 11 (by Brooke Crutchley, printer to the University
of Cambridge; attacking Kinneir’s work); March 20, p. 13 (by Noel Carrington, a
designer and member of the advisory committee; supporting Kinneir); March 24,
p. 13 (by J[ohn]. G. Dreyfus, and by G.S. Bagley, both attacking Kinneir’s work).
Editorial note in New Scientist: vol. 5, no. 124, April 2, 1959, p. 731. Main editorial article, 'Which signs for motorways?', in Design no. 129, Sept., 1959, pp. 28–32
(Design 1959). This article reported a discussion organised by the journal, with
contributors such as: a car manufacturer, a traffic sign manufacturer, the Ministry of
Transport, the Road Research Laboratory, a landscape architect, Dr E.C. Poulton
from the Medical Research Council’s Applied Psychology Research Unit in
Cambridge, David Kindersley, Jock Kinneir, as well as other designers, typographers
and Brooke Crutchley. Letters to the editor in Design, no. 132, Dec. 1959, p. 71 (by
Herbert Spencer, and Aidron Duckworth); no. 133, Jan. 1960, pp. 75, 77 (by Ernest
Hoch, and Norbert Dutton). An article by David Kindersley in Traffic Engineering &
Control, Dec. 1960, pp. 463–465 (Kindersley 1960). A note in the 'Peterborough column' in the Daily Telegraph, 8 March, 1961, and a follow-up note a week or so later. Kindersley also appeared in Cambridge Daily News, March 9, 1961.
In particular, the radical solution of employing small letters with initial capitals – never before used on British road signs222 – instead of using capitals only was heatedly debated. Nevertheless, the use of sans serif lettering instead of serif was also debated in this unique instance of a public debate on letterform legibility.
The somewhat disgruntled223 opposition to Kinneir’s solution was
led by the ‘traditionalist’ letter cutter David Kindersley, and Brooke
Crutchley, printer to Cambridge University, both linked to the British
inter-war/mid-century typographic establishment (Stanley Morison and
others). I can only guess that this ‘establishment’, as well as being
suspicious of continental modernism, must have been seriously offended
by the fact that a major national lettering project had been initiated and
partly implemented without having been consulted. It was asserted that
Kinneir’s solution ignored specialised ‘knowledge accumulated over the
years’ and that there had been some ‘misguided work behind the present
proposal’.224
It was argued that lettering for destination names on sign panels is more legible in capitals than in small letters,225 even though words in small letters are more legible in continuous text in books. (It was argued that words in small letters make both irregular word shapes and familiar word patterns in continuous text, and that this aids recognition.)
In March 1961 BBC planned a debate between Kindersley and Kinneir in the
‘Tonight’ TV-program. However, the Ministry of Transport advised Kinneir not to
participate, while reassuring Kinneir that he had the committee’s full support. (Letter
from Kinneir to Ministry of Transport, dated 17 March, 1961. Letter from Ministry of
Transport to Jock Kinneir, dated 24 March, 1961. RTC.51/2/03.) (Location: Margaret
Calvert, London.)
Subsequently, in August 1961, the two papers by Christie and Rutley at the Road Research Laboratory were published in Roads and Road Construction (1961a) and Design (1961b) respectively. To the article in Design were attached comments by the
designers Herbert Spencer, Reynolds Stone and Colin Forbes. Letters to the editor of
Design followed up the debate in subsequent issues: no. 154, October 1961, pp. 87, 89
(by Hans Schmoller, and by David Kindersley); no. 155, November 1961, p. 77 (by
Colin Forbes); no. 156, December 1961, pp. 81, 83 (by A.G. Long).
222. Christie and Rutley 1961a, p. 239.
223. An unpleasant aspect of David Kindersley’s argumentation was his public insults
directed against Kinneir, alleging that Kinneir lacked competence and skill as a
designer (see Kindersley 1960).
224. Brooke Crutchley in The Times, March 17, 1959, p. 11.
225. See especially the elaborate, but also occasionally dubious and rambling argument in
Kindersley 1960.
The reason for preferring all-capitals for sign panels, according to the same argument, is that horizontal eye movements are not an issue on sign panels, and that place names are not familiar word patterns.
It was also argued that capitals are intrinsically clearer than small
letters, especially when compared in the same nominal size. Therefore,
the argument went, all-capital lettering would allow for considerably
smaller sign panels and therefore give large benefits with regard to the
cost of production, as well as creating less impact on the landscape. The
reasoning behind the size argument was that as long as the dominant
dimension of capitals (baseline to top of capitals) is bigger than the
dominant dimension of small letters (the x-height), big conspicuous all-
capital lettering could be applied in a given area without the need to
allocate space for ascenders and descenders.
It was further argued that serifs would strengthen terminals and
thus define letters more clearly from a distance. David Kindersley
actually proposed a theory on how serifs improve the legibility of
letterforms in certain situations:
Try reading a page of sans-serif lower-case, and then a page of ‘normal-
face’ and you will see at once that the normal one is more readable. The
reason for the existence of the serif is clear, and is not just a
meaningless tradition. In very small type, or in larger letters to be read at a great distance – in fact, wherever there is a question of distance in relation to size – there is always a loss of definition. The serif reinforces
the individual character of the letter exactly where this loss is greatest.
(Kindersley 1960, p. 465)
Against the arguments of the traditionalists, it was argued by
supporters of Jock Kinneir that words with initial capitals and small
letters with ascenders and descenders provided more differentiated as
well as more familiar word-shapes, as opposed to the rectangular and
monotonous shapes of all-capital words. Words would therefore be easier
to recognise from a distance. It was also claimed that serifs and
modulated strokes are not very well suited for reflective material.
The experiments
Comparative experimental research was recommended by several of the
participants in the debate,226 and the experiments undertaken by
Christie and Rutley for the Road Research Laboratory were, apparently,
conducted as an answer to this demand. To put these experiments in
perspective it should be taken into account that Kindersley’s challenge
was to a process that was already under way. It was not as if the job had been conceived as a competition between designers, as in big architectural projects. The Anderson committee was fully in favour of Kinneir's proposal, based on parameters set by the committee itself. I think it is correct to say that the committee had no plans whatsoever to abandon Kinneir's solution, regardless of the outcome of the Road Research
Laboratory’s experiment. The Kindersley ‘challenge’ was regarded with
irritation in the Anderson Committee as well as in the Ministry of
Transport. As pointed out in a letter to Jock Kinneir from the ministry:
You are already in a strong position vis-à-vis your detractors; it is you
who were commissioned by the Department to do the job, it is your signs
that have been erected on the motorways, and you can be sure of the
solid support of the Committee for what you have done.227
In fact, Kinneir and the committee members tacitly regarded Kindersley
as an indefatigable crank who carried out a campaign based on
‘tendentious claims and half-truths’ against both the committee’s and
Kinneir’s work.228
226. Both parties claimed support from existing experimental research (mainly on the
capitals versus small letters question, but also on the serif vs. sans serif question).
See Noel Carrington in The Times, March 20 , 1959, p. 13 ; John Dr eyfus in The Times,
March 24, 1959, p. 13; Design 1959, p. 30; and Kindersley 1960, p. 464. See also a
reference to existing research in the Anderson report (Ministry of Transport 1962,
p. 4), as well as a comment from Reynolds Stone in a comment to Christie and
Rutley’s paper in Christie and Rutley 1961b (p. 61); and a letter to the editor from
David Kindersley, in Design, no. 154, 1961, pp. 87, 89.
227. Letter from the Ministry of Transport to Jock Kinneir, dated 24 March, 1961, Ref.
RTC.51/2/03. (This is the letter where the ministry advised Kinneir not to participate
in a TV-debate with Kindersley.) (Location: Margaret Calvert, London.)
228. See for example the letter from Jock Kinneir to the Ministry of Transport, dated
17 March, 1961; and a letter from the committee chairman Colin Anderson to Fred
Salfield of the Daily Telegraph, dated 8 March, 1961, where he complained about the
misinformed pro-Kindersley coverage in the newspaper’s ‘Peterborough’ column the
very same day. (Location: Margaret Calvert, London.)
Kindersley had stated optimistically that: ‘No decision should be
finally and publicly announced on the M1 signs until the facts are
established by the Road Research Laboratory’ (Kindersley 1960, p. 465).
Similarly, Dr E.C. Poulton from the Applied Psychology Research Unit
of the Medical Research Council, who participated in the discussion
organised by Design, ‘was in no doubt that the facts could be established
providing the criteria could be agreed in the first place’ (Design 1959,
p. 129).
Four different types of lettering were employed in the experiments:
– sans serif letters of capitals only, based on designs by Edward Johnston, commissioned by the Road Research Laboratory, from David Kindersley
– serif letters of capitals only, commissioned by the Road Research Laboratory, from the chief critic of Kinneir's solution, letter cutter David Kindersley
– Jock Kinneir's sans serif small letters (with initial capitals); by then already employed on the M1 motorway
– the same letters by Kinneir as above, but in a smaller size and applied with more interlinear space and more generous margins
FIGURE 4. One of the 24 basic signs that were used in the
experiment (here shown in four alphabets). The total number of signs was 96, based on four alphabets and 24 basic signs
(6 single-name destination signs, 6 two-name destination
signs, 6 three-name destination signs, and 6 message signs
like ‘Stop’ and ‘No entry’).
From the top: Kindersley’s ‘Edward Johnston’ alphabet;
Kindersley’s own serif alphabet; Kinneir’s sans serif alpha-
bet; and Kinneir’s sans serif alphabet with generous spacing.
(From: Christie and Rutley 1961a, p. 240: figure 3c.)
The aim of the experiment was to find out which of these types of
letterforms could be read at the greatest distance in order ‘to keep the
angle between the driver’s line of sight and the road ahead as small as
possible’ (Christie and Rutley 1961b, p. 59). However, the experiment
also aimed at investigating the question of capitals versus small letters,
and in addition, investigating ‘the value of serifs ... because it has been
suggested ... that serifed lettering is more legible than sans-serif
lettering’ (1961a, p. 240). The authors pointed out that the question of
sans serif lettering versus serif lettering was only examined with respect
to the two capital lettering styles employed (since no small letters with
serifs were included in this multi-variable experiment).
Altogether 6000 reading distances were recorded. The experiments were conducted, literally, in a field. In order to speed up the experiments the signs were attached to a car moving towards stationary observers, rather than the other way around, as would be natural. Christie and Rutley sensibly pointed out that this reversal should not affect the relative ranking of the letterforms.229 The size of the letters on the test signs was around one fifth of that of Kinneir's real signs already in use on the M1 motorway. However, absolute size, and thus absolute distance, was not at issue here, and had been dealt with experimentally earlier (see Design, no. 129, 1959). Sizes were in any case probably expected to vary for different applications and situations.
FIGURE 5. ‘The experiment was greatly speeded up by mounting the signs on a
vehicle and driving them towards a group of 10–15 stationary observers seated
on a tiered platform.’ (From: Christie and Rutley 1961a, p. 242: figure 4.)
229. Although Cohen (1981), according to Hughes and Cole (1986), demonstrated that
eye movement behaviour is different in a laboratory situation from when actually
driving on the road, Hughes and Cole claim that the pattern of eye movements is
not a critical factor with regard to object conspicuity (Hughes and Cole 1986,
pp. 11081109).
The mean reading distances were, in descending order:
– 247 ft. for David Kindersley's serif capitals
– 240 ft. for Jock Kinneir's sans serif small letters with initial capitals
– 239 ft. for the Edward Johnston-based sans serif capitals
– 212 ft. for Jock Kinneir's letterforms in a smaller size
Discussion
While discussing their results, Christie and Rutley pointed out that the difference in favour of Kindersley's serif capitals over the two sans serif lettering styles was statistically significant, 'about 3 per cent … i.e. the difference is unlikely to be due to chance'. However, they added that spacing, layout, and the width to height ratio of the letters could have been confounding factors. Nevertheless, the authors seem to have overlooked that, according to figures given in the text, the height of Kindersley's serif capital letters is at least 25 per cent larger than the dominant dimension of the largest version of Kinneir's sans serif letters, their x-height. Thus, since reading distance is to a large extent a function of size, adjusting the size of Kinneir's sans serif letters upwards, to a point somewhere within the 0 to 25 per cent differential, while taking into consideration the size of the initial capitals in each destination name as well as the (relatively short) ascenders and descenders, would undoubtedly produce results more in favour of Kinneir's letterforms.
However, the Road Research Laboratory seemed taken in by Kindersley and his supporters' heavily promoted view that the competing letterforms should be compared while crammed tightly into areas of equal size, with hardly any surrounding space, with reference to the production cost per unit area of a sign. Under this unrealistic condition, capitals (that is, letters without ascenders and descenders) will unavoidably create a more prominent visual image than small letters. This is because small letters will have to accommodate extra interlinear spacing for ascenders and descenders. Their visual size, expressed by their dominant dimension, the x-height, will under this condition have to be left smaller. Not only does this 'tightly crammed on
an equal area’ argument rely on an unrealistic condition (both printed
matter and sign panels usually rely on relatively large areas of space
around the text), it also disregards a basic heuristic rule among
designers: that interlinear spacing needs to be larger for text in capitals
than for text in small letters. Furthermore, it also disregards the fact
that capitals are wider than small letters and thus obviously need
considerably more space width-wise, something which might become a
critical factor on sign panels with long destination names. Some
destination names even needed to be abbreviated or contracted on some
sign-panels.230 Nevertheless, the Anderson report concluded this debate
the following way:
We feel, however, that in designing a traffic sign regard must be paid to
the space around the lettering as well as to the lettering itself, and that
a sign that completely filled the space available would be so
unattractive as to be quite unacceptable. (Ministry of Transport 1962,
p. 4)
Kinneir reasonably pointed out that ‘The criterion requiring an
economic use of sign surface was to be largely overridden by the need to
achieve clarity of layout on the more complex signs’ (Kinneir 1971, p. 10).
A similar point of view had also been present in the internal discussions
of the committee, where it was pointed out that instead of filling the
space available, a relatively large area of uninterrupted blue background
against the landscape background was desirable in order to improve the
‘target value’ of the lettering.231 Observe that the experimental signs
employed by the Road Research Laboratory, probably for similar reasons ('target value'), were mounted on a large khaki-painted panel on top of the car that moved towards the stationary observers (see Christie and Rutley 1961a, pp. 241–242).
Christie and Rutley seemed to be fully aware that the technical concept of 'statistical significance' does not in itself express substantive
230. See the discussion in the Anderson report (Ministry of Transport 1962, p. 9). Also of
relevance here is a discussion in the Worboys report on situations ‘when site
conditions restrict the width of signs’ (Ministry of Transport 1963, p. 10).
231. See committee ‘Notes’, dated 29 February, 1960; and committee ‘Minutes’ of the 22nd
meeting, dated 25 April, 1960; as well as a letter from Jock Kinneir to the Ministry of
Transport, dated 17 March, 1961. (Location: Margaret Calvert, London.)
significance or practical meaningfulness.232 They concluded that 'the
most remarkable feature of the results for the three … scripts is that the
reading distances are so nearly equal … the difference is so small that
caution is necessary in interpreting its meaning’ (p. 243), and ‘the results
do clearly indicate … that none of the three scripts tested has any
appreciable advantage over the others with regard to legibility’ (p. 60). It
is thus reasonable to claim that no significant difference in legibility was
found. In their conclusion Christie and Rutley also called attention to the
fact that the small difference between the two capital lettering styles
(one of them serif and one sans serif) was not necessarily based on the
serifs or lack thereof, but might depend on other variables.
Christie and Rutley finally concluded: ‘Since there is little difference
in legibility between the different types of lettering, it seems reasonable
to make the choice on aesthetic grounds’ (1961b, p. 60). They went even
further and perceptively suggested that ‘there are grounds for believing
that aesthetic questions may be at the root of the controversy’ (1961a,
p. 243). In the final Anderson report, it is admitted that ‘taste plays so
important a part, as we believe it should’ (Ministry of Transport 1962,
p. 5). Herbert Spencer, in his comment in Design, stressed that aesthetic considerations – 'taste, tradition, relevance and fashion' – were of utmost importance. He clearly expressed his disapproval of Kindersley's
‘partially’ seriffed letters, which he described as ‘clumsy’ and as ignoring
‘both taste and tradition’ (in Christie and Rutley 1961b, p. 61). Also
Reynolds Stone seemed unhappy about Kindersley’s ‘unusually seriffed
capitals’ and he complimented Kinneir’s sans serif lettering.
Nevertheless, he suggested that if ‘good’ small serif letters had been
included in the tests, they might have outdone the others.233
To no surprise, Cyril Burt's name was called upon in the debate
(by Reynolds Stone): ‘Fashionable or not, the use of sans meant ignoring
experts like the psychologist Sir Cyril Burt, who “has recently recalled
232. For a brief but useful discussion on the technical concept of 'statistical significance', see Pedhazur and Schmelkin 1991, pp. 202–203.
233. Kinneir referred to Kindersley’s alternative lettering as ‘mis-serifs’ (although not in
public): ‘As far as appearance goes I cannot imagine even the most obdurant
philistine wanting to cover England with "mis-serifs"!' (Letter from Jock Kinneir to the Ministry of Transport, dated 17 March 1961.) (Location: Margaret Calvert, London.)
and reaffirmed scientific findings that 'for word recognition a sans serif type face was the worst of all." (in Christie and Rutley 1961b, p. 61).
FIGURE 6. David Kindersley's capitals-only serif alphabet. (From: Design 1959, p. 35.)
David Kindersley defended his design. In a letter to the editor in a
later issue he applauded the tests undertaken by the Road Research
Laboratory. However, he urged that 'the conclusions drawn from it are bad – really bad' (Design, no. 154, 1961, pp. 87, 89). He pointed out that
the tests were not performed at ‘real distances’ (that is, they were
performed at shorter distances). He claimed (with reference to a study by
Paterson and Tinker on the legibility of newspaper headlines), that both
capital legibility and serif letter legibility decrease at a lesser rate (presumably with an increase in distance) than for both small letters and
for sans serif letters. Furthermore, he claimed that his letters for
motorway signs ‘can be read from at least 175 ft further away than the
existing lower case signs [i.e. Kinneir’s] with equal areas’ (p. 89). I can
only guess that this figure is based on calculations where several quantities are included – for example, the differences between the results for his and Kinneir's alphabets, the difference between test distance and a larger distance, and the 'decrease at a lesser rate' thesis
referred to above.
Kinneir’s supporter Herbert Spencer brought up a fundamental
reservation about the value of experimental research, and cautioned:
Such tests of lettering as these are therefore useful in disposing of
pseudo-scientific arguments, but in cases where the results strongly
favour a particular design they must, to be of any practical use to
designers, be elaborated upon so that we can clearly understand why
one design functions more effectively than another. (In Christie and
Rutley 1961b, p. 61)
FIGURE 7. An illustration from Kindersley 1960 that shows a suggested sign by
the author (left) and a Kinneir sign (right). Kindersley’s caption reads: ‘Examples
of signs to illustrate the better legibility [of] upper-case alphabet (left) to lower-
case (right)’. In his main text, Kindersley comments on Kinneir’s sign: ‘Apart
from the ill-chosen type, the height of the sign is still further exaggerated by
large areas of wasted space, resulting from the off-centre and asymmetrical
contemporary typographical fashion.’ (From: Kindersley 1960, p. 464.)
It is interesting to observe in retrospect that the public debate on these new direction signs focused on only one of their aspects: their lettering. One of the letters to the editor of Design in the
aftermath of the presentation of Christie and Rutley’s research came
from A. G. Long (presumably not a designer, perhaps a road engineer).
He applied what today might be called a usability perspective and
accused the research of suffering ‘from an unnecessarily restricted
consideration of some aspects and an unduly indiscriminate study of
others’. He pointed out that the question of colour seemed to have been
ignored, and the same applied to performance in bad weather, or in the
dark while illuminated by different kinds of artificial lighting along the
road and from the car. ‘All these are surely more urgent problems of road
design than a finicking survey of the effect of serifs on capital letters on
large boards displayed in good conditions on the best roads.’ (Design,
no. 156, 1961, pp. 81, 83).
In retrospect
It was decided to use Kinneir’s sans serif lettering of small letters and
initial capitals (as well as his overall design-solution) for both motorways
and all-purpose roads.234 Some of the many ‘prominent’ ‘non-features’ of
his directional motorway signs were: no boxes around destination names,
no barbs on the heads of the arrows that symbolise the road ahead, and
not least: no forced symmetrical or grid-based positioning of destination
names. This represented an ‘exceedingly inferior layout’ and resulted in
‘large areas of wasted space’ according to Kindersley (1960, p. 464).
Although the ‘no boxes feature’ adhered to an emerging modernist norm
in graphic design at that time (groups were to be implied minimal-
istically by spatial relationships alone), it was also based on the wish of
the Anderson committee, and it was in accordance with the 1949 Geneva
protocol.235
FIGURE 8. Left: a pre-Kinneir, pre-motorway, directional road sign. (From:
Christie and Rutley 1961a, p. 239: figure 1.) Right: a Kinneir directional road
sign. (From: Ministry of Transport 1962, p. 59: figure 38.)
Kinneir’s rather ‘neutral’ sans serif letterforms were undoubtedly,
compared to Kindersley’s somewhat unusual serif capitals, more in line
with contemporary aesthetic preferences among designers and taste
trend-setters. To suggest that the final decision had already been taken before the Road Research Laboratory performed their tests (with both explicit and implicit neutrality) is perhaps to overstate the issue, but nevertheless, a feeling that the experiments were some kind of play to
234. See Ministry of Transport, 1962, 1963; and subsequent regulations, referred to in
Department of Transport 1991, p. 12.
235. See letter from Ministry of Transport and Civil Aviation to Jock Kinneir, signed
R.L. Huddy, dated 12 June, 1958. RTC 53/4/024 Pt.4. (Location: Margaret Calvert,
London.) See also the illustration on p. 46 in Froshaug 1963.
the gallery, necessitated only by the public debate and performed in order to shrug it off, is hard to avoid. Nevertheless, Kinneir's lettering solution corresponded more with the practice on the European continent, as well as with the direction signs on the American 'interstate highways'236 – an important imperative.
Kinneir and many other designers at that time strongly believed that sans serif letterforms – exactly in the combination of small letters and initial capitals – were intrinsically more legible for signing systems than serif capitals. In addition to the belief in the importance of the more distinctive word shapes of words in small letters with initial capitals, it was believed that sans serif letterforms were easier to handle, less 'aesthetically sensitive' and generally more 'forgiving' when actually produced; that is, in various modified forms for various applications, especially with the tools available at the time. Furthermore, if small letters were the preferred option, then sans serif small letters were undoubtedly better suited aesthetically than roman small letters to having relatively short ascenders and descenders (i.e., to having a large x-height), and thus to being more practical and less demanding of interlinear space on signboards (Mason 1994).
With the recommendations of the Worboys committee for all-
purpose roads, words in all-capitals became to a large extent reserved for
certain mandatory and prohibitory traffic signs. This distinction
provided the means for a functional differentiation of certain important
signs such as ‘STOP’ and ‘GIVE WAY’ (see Ministry of Transport 1963,
pp. 10, 103).
It was also argued that since serif types are characterised by a high
contrast between thick and thin strokes, they ‘would almost certainly
prove unsuitable when the letters have to consist of reflectionized
material to catch the headlights'.237 To this, it has to be admitted that Kindersley's sturdy serif capitals were unusually low in contrast – not unlikely in order to solve exactly the problem suggested by Carrington.
However, this characteristic of Kindersley’s design, together with its
highly idiosyncratic and unusual serifs, and not its serifness per se,
236. See Design, no. 129, September 1959, p. 31; Design, no. 132, 1959, p. 71; Ministry of Transport 1962, p. 4.
237. Letter to the editor from Noel Carrington, in The Times, March 20, 1959, p. 13.
might very well have created exactly the uneasiness people like Herbert
Spencer felt towards it.
Interestingly, it seems that the leading figure of the British typographic establishment, Stanley Morison, who did not participate in the public
debate, undermined at least part of the argument of his fellow
traditionalists, while addressing a continental public. In a postscript
submitted in 1962, for a German language edition of his famous First
principles of typography, published in Switzerland in 1966, Morison
stressed the practical and pragmatic aspects of using sans serif letter-
forms for applications like traffic signs (see Morison 1996, pp. xiii, 39–40).
Sanserif type is … quicker, easier and therefore cheaper to make. It is in
fact the cheapest of all to make. Its forms can be mastered by the lowest
category of draftsmen. Naturally, municipal architects and others to
whom lettering is no more and no less than a necessary evil, gave the
medium a cordial welcome, and with reason. That is to say with reason
of a natural kind: of self-interest, which is the best because a material
and rational basis for the choice of sanserif. It is not surprising that
sanserif is superseding the serifed style in all transport and street
designations. Its economy of cost cannot but make sanserif the universal
public medium of communication. (Morison 1996, p. 39)
The aftermath
Together with his associate Margaret Calvert, Kinneir came to dominate
the design of public and official wayfinding and information signing
systems and their various types of accompanying sans serif letterforms
in Britain in the next few decades – for motorways and all-purpose roads, airports, the railways, the public hospitals, and the armed forces.
His influence was also felt abroad,238 and his motorway signing system
has ‘been called Britain’s true corporate identity’ (Rainford 1996, p. 13).
Although Kindersley lost, another serif capital letterform designed
by him, a vigorous interpretation of the historic Trajan letterform, which
was approved for street names by the Ministry of Transport in the early
238. For example (either on airports or on motorways) in Australia, in the Middle East, in Greece, on the Continent, and in Scandinavia.
1950s (Dreyfus 1957, p. 38), was later to become the most widely used
letterform for street nameplates throughout Britain.239 Thus, both
Kinneir and Kindersley came to put their decisive graphic marks on the
public ‘landscape’ of post-war Britain.240
However, the dispute did not end in the 1960s. Since then it has
come to the surface on many occasions. Robin Kinross appears
prominently among the 'pro-Kinneir historians', celebrating Kinneir's
work as a great achievement:241
These signs were the first, in any country, in which ‘visual’ and ‘functional’
considerations were fused. They marked a new turn in British typography.
And in the subtleties of their letterforms and of their rules of configuration,
the signs showed a sophistication beyond the grasp of the title page- and
inscription-bound traditionalism. (Kinross 1992, p. 167)
239. Nevertheless, in the road leading into the road where I used to live for a while, there is still an old 'pre-Kindersley' street nameplate, which actually faces a 'Kindersley' nameplate on the other side of the road. Also: a remaining pre-Worboys directional signboard can be seen on a major all-purpose road nearby. The latest replacement date for pre-Worboys directional signs on all-purpose roads is 1 January 2005 (Department of Transport 1994, p. 2).
Recently the journalist, and since then famous author, Helen Fielding wrote a spirited account of the current 'road sign madness' in Britain in the Independent on Sunday (1994). It is not an attack on the British road signs in themselves or their
design, but an attack on the current somewhat inadequate situation on British local
roads and streets where signs too often are either lacking, hidden behind something,
in the wrong place, or cluttered together in such a way that ‘motorists of Britain just
don’t know which way to turn’. She perceptively states that: ‘The trouble is, systems
are usually set up by people who know the way anyway. They ought to be checked by
people who are strangers to the area’.
Donald Norman comments in a similar but more polemical fashion on the signing of British (and American) local roads (as opposed to major national motorways) in his The psychology of everyday things: 'When making long journeys on the secondary roads of England, I learned to go around each roundabout two or three times, each time eliminating a different exit until I finally could select what appeared to be the best' (Norman 1988, pp. 226–227). The reason why there are more problems on ordinary roads than on motorways is that there is less visual competition on motorways and fewer kinds of junctions. Information on all other roads is more dense and complex, and there are more kinds of junctions, and often more complex junctions (see Kinneir 1970, p. 18; and 1971, p. 10).
240. The ultimate winner of the dispute in question, Jock Kinneir, has since written three substantial accounts of his work on signing systems: two on his road signs (one
unpublished article in 1971, and one article published in 1984), and one article on
several of his signing systems (published in Danish only, in 1970).
241. See Kinross 1984, 1989, 1992, 1994b.
Kinross is not only enthusiastic about the end result of the design
process but also attaches great importance to the process itself: as an
exemplary model for all design that aims to fulfil public needs. He sees it
as an index of modernisation and public service democracy in Britain’s
post-war pre-privatisation era: a large-scale unglamorous planning
process open to rational justification, in contradistinction to ‘the recent
cult of the designer, who reveals expensive master-creations to a
boardroom, as a fait accompli’ (Kinross 1989, p. 52). Kinross refers to the
fact that the design process in question involved a broadly composed
committee, an outward look towards continental Europe,242 assessment
of relevant research, consultation with a large number of interest groups,
a public debate, an expert designer, and technical advisers who
conducted experimental research. Furthermore, the committee was
indeed not a passive body. The minutes from the committee’s 27
meetings bear evidence of lively and constructive debates.
Another perspective that can be applied focuses on the ‘system
design’ aspect. In a recent article, David Sless (as others have done
before him), depicts changes in this century from a situation of small
scale crafting of single artefacts (in a pre-designer era), via a situation of
small-scale crafting and planning of individual artefacts for mass
production (in the designer era), to a situation today where ‘projects are
concerned with designing whole systems rather than individual
artefacts’ and where the outcome may be a large number of unique
artefacts produced by others (1998, p. 7).243 Kinneir’s signing system is
certainly an interesting and early example of ‘system design’ with a
graphic designer playing the central role. First: the sheer number of
signs: a whole range of different kinds of traffic signs related to each
other, with separate solutions for motorways, major roads and local
roads. Second: all the practical consideration on layout, lettering, size of
242. Not only did the 1949 UN Geneva protocol inform the work, the committee had also
toured continental roads while taking colour slides (Kinneir 1971, p. 3).
243. Ironically, the emergence of computer assisted publishing (and the subsequent
convergence of tools, production processes, and products) has also created a situation
where many graphic designers today not only plan and specify, they themselves also
often have to do the implementation that earlier was done by typographic craftsmen
(see Stiff 1996b for an account of the historical rise and demise of specification as an
important but hardly recognised part of typographic designing).
lettering, colour, background colour, reflection, illumination, mounting
and siting, and not least, the content of the many different signs of each
category. And third: the design of a system with prototypes and
specifications which included a letter spacing system based on a limited
number of tiles so the signs could ‘design themselves’, that is, so local
sign manufacturers could easily space the letters consistently just by
following the instructions in the manual.
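The general principle can be illustrated with a small, purely hypothetical sketch: each character is assigned a tile whose width already includes its side space, so consistent spacing follows automatically from laying the tiles edge to edge. The tile widths below are invented solely for illustration and are not Kinneir's actual specifications.

# A hypothetical sketch of a tile-based letterspacing system.
# Each character's tile width already includes its surrounding space,
# so spacing emerges from simply abutting the tiles; the sign maker
# needs to make no optical judgements.
# The widths below are invented for illustration only.
TILE_WIDTHS = {
    'B': 14, 'i': 6, 'r': 9, 'm': 16, 'n': 11, 'g': 11, 'h': 11, 'a': 11,
    ' ': 8,  # the word space is itself just another tile
}

def line_width(text):
    """Width of a line of lettering: simply the sum of its tiles."""
    return sum(TILE_WIDTHS[ch] for ch in text)

print(line_width('Birmingham'))  # every manufacturer arrives at the same layout

The point of the sketch is only the design principle described above: the spacing intelligence lives in the tiles and the manual rather than in the individual sign manufacturer.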
Several contemporary writers are vigorous supporters of David Kindersley's position of the early 1960s. In his book on David Kindersley, Montague Shaw describes Kinneir's lettering and Kindersley's involvement in the competition in the following way:
It is one of the misfortunes of the creative person that his sensible work is,
from time to time, set aside in favour of a vastly inferior article, by
ignorant judges who are swayed by fashion and an uneducated taste. …
[David Kindersley’s alphabet] were better in every single case. But the
sanserif was used. (Shaw 1989, p. 19)
This account was recently accentuated in a more politicised manner by the
designer James Souttar in a talk he gave at the Monotype Conference in
Cambridge in 1992. In addition to quoting Shaw, he describes Kinneir’s
solution as reflecting ‘a vision of shabby utopianism.’ (Souttar 1992, p. 5).
However, not only opinionated accounts such as Kinross', Shaw's and Souttar's, but also unreliable accounts of what actually happened
have been perpetuated:
A series of tests carried out by the Road Research Laboratory showed
that in terms of recognition and legibility at speed, Kindersley’s
capitalized serif letters were greatly superior to the modernist sans
serif, upper-and-lower-case letters. (Eason and Rookledge 1991, p. 98)
And similarly, in a recent article by Robert Long on David Kindersley and
his oeuvre, in the American typographic journal Serif:
The all-caps alphabet that he designed used heavy, bracketed square
serifs to promote legibility and intelligibility when seen from a rapidly
moving vehicle. While it appears that practical tests clearly demon-
strated its superiority to the Helvetica [sic] that was much in fashion at
the time, Helvetica set in upper and lower case won. (Long 1995, p. 35)
However, the opposite misinterpretation is also present in the
literature. The Dutch architect, information designer and professor,
Paul Mijksenaar, in a recently published booklet, Visual function: an
introduction to information design, claims that
1963 test models from the Road Research Laboratory in England …
researchers found that place names in small letters with initial capitals
were easier to recognize [than place names in capitals only]
(Mijksenaar 1997, p. 22)244
Poulton 1965
*
In 1965 Dr E.C. Poulton at the Applied Psychology Research Unit at the
Medical Research Council, Cambridge, published a paper in the Journal of Applied Psychology: 'Letter differentiation and rate of comprehension in reading'. The basic content of this paper had been presented the
previous year in an article written by Dennis Cheetham and Brian
Grimbly in the journal Design (Cheetham and Grimbly 1964). Poulton’s
experiment was suggested by the Council of Industrial Design, the
publisher of Design, and Cheetham and Grimbly, who were affiliated
with the council, had assisted Poulton in carrying it out.
The background for these writings and Poulton’s experiment can be
found in contemporary craft discourse, and in marketing efforts, where
the new and innovative sans serif typeface family Univers was hailed as
a remarkable achievement.245 It was alleged that Univers’ subtle
character shape made it an eminent and highly legible typeface. By 1965 – the heyday of high modernist design and 'Swiss typography' – the use of sans serif typefaces for body text in advertising, ephemeral
244. Elsewhere in his booklet Mijksenaar refers to Moore and Christie 1963 (a paper
which I have not consulted), as the source for the accompanying illustrations of
Kindersley’s and Kinneir’s lettering. The same paper is therefore possibly
Mijksenaar’s source of misinformation about the outcome of the Road Research
Laboratory’s research.
* E.C. Poulton. 1965. ‘Letter differentiation and rate of comprehension in reading’.
Journal of Applied Psychology, vol. 49, no. 5, pp. 358–362. See also Dennis Cheetham and Brian Grimbly. 1964. 'Design analysis: typeface'. Design, no. 186, pp. 61–71.
245. See for example Ruder 1957, 1959, 1961; a special issue of Typografische Monatsblätter, no. 1, 1961; Handover 1960; Dreyfus 1961; Biemann 1961; Frutiger 1962; Eurographic Press Interview 1962.
printing, and design magazines had become common, not only in
Switzerland, on the continent and in the USA, but also in England.
Many designers were enthusiastic about the use of sans serif typefaces,
especially the new sans serifs adhering to the ‘Swiss idiom of sans serif
typefaces’,246 for example Helvetica and Folio. ‘Many younger
typographers have felt for some years now that sans serif faces are more
suitable for text settings than has usually been acknowledged.’
(Cheetham and Grimbly 1964, p. 62).
However, this state of affairs was not without dissent – the 'traditionalists' also voiced their opinion, as in this letter from John Peters to Monotype News Letter:
News Letter 67 [set in Univers] is ingenious but unreadable. Having
tried its acres of sans-serif on various friends, it is clear that I am not
the only crank traditionalist. One busy friend declared that any
advertisement he receives in sans-serif goes unread into the basket
simply because the letter form slows down the reading process.
(Monotype News Letter, 68, 1962, [p. 12])
More reasonable criticisms could also be heard. An example here is an article by R.S. Hutchings in British Printer, attacking not only the new sans serif typefaces but also the idiom of 'Swiss typography' (Hutchings 1965).
The contemporaneous enthusiasm for, and interest in, sans serif typefaces manifested itself not only in the widespread use of this kind of typeface, but also in craft and academic discourse. For example, some substantial accounts of various aspects of the history of sans serif letterforms and typefaces were published in this period.
The rationale behind Cheetham and Grimbly’s article in Design
(1964), and Poulton’s research carried out on the request of Design, was
to test the claims about Univers’ superior legibility and suitability for
continuous text. The article in Design examines how Univers was
246. Sans serif typefaces adhering to the ‘Swiss idiom of sans serif typefaces’ were
produced in the late 1950s and early 1960s, primarily in Switzerland, but also by
type foundries in Holland, Germany, France and Italy. They are characterised by
horizontally cut terminals as well as relatively closed semi-enclosed counters of
letters like a, c, e, and s. See Lund 1993, for a detailed exposition.
247. That is, Gray 1960; Handover 1961 (although somewhat speculative); Mosley 1965;
and Falk 1965. Substantial contributions, from the same decade, on the form or
topology of sans serif typefaces, are Gerstner 1963 and Schulz-Anker 1969.
designed. The issue of Design also contains an interview with its Swiss
designer, Adrian Frutiger, carried out by the young British type designer
Matthew Carter. Cheetham and Grimbly state that their aim is ‘to
examine how far it [Univers] meets the claims that have been made on
its behalf.’ (p. 61).
Poulton’s experiments compared six different Monotype typefaces,
three sans serif and three roman. This study does not include one sans
serif typeface among several roman typefaces, but is, unlike other studies, evenly balanced. Thus, the probability of top ranking of at least one of
the sans serif typefaces should be higher than in most of the previous
experiments. The typefaces in question, a priori assumed to be
reasonably readable, represented different subcategories or idioms of
roman and sans serif typefaces. The typefaces, all of normal weight and
appearance-width, were Bembo, Baskerville, Modern Extended,
Grotesque 215, Gill Sans, and Univers. The line length was the same in
all of the test material. Poulton explicitly acknowledges the possible
problem of confounding due to differences not only in typeface design but also in actual size, posed by the discrepancy between the nominal and the actual appearance size of a typeface. In order to avoid this problem, the x-height (usually the dominant vertical dimension of a typeface) was equalised between the typefaces (as far as permitted by type size availability).
Thus, the nominal sizes varied from 9.5 points (Univers) to 12 points
(Bembo). However, Poulton does not explicitly acknowledge another possible source of causal confounding: that different results can also be interpreted as being caused by different ratios of size to interlinear spacing. Nevertheless, Poulton might accidentally have solved this problem – of which he seems to be unaware – while controlling 'for the amount of paper covered by each typeface' (p. 360). He achieves this aim
by letting the sum of the eventual type size (nominal) and interlinear
spacing become roughly the same for all the typefaces. Thus, the
interlinear spacing varies between 1.5 points (for Bembo) and 3 points
(for Univers). Thus, the ratio between the x-height and the interlinear
spacing varies in accordance with 'master craftsman' advice – the larger the x-height, the larger the interlinear spacing, and vice versa.248
248. See for example Dowding [1954] 1966, p. 14; 1957, p. 7.
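A simple check using the figures given above illustrates the equalisation:

\[
12\ \text{pt (Bembo)} + 1.5\ \text{pt} = 13.5\ \text{pt}, \qquad 9.5\ \text{pt (Univers)} + 3\ \text{pt} = 12.5\ \text{pt},
\]

so the vertical space taken up by each line comes out roughly the same for the two extreme cases, despite their differing nominal sizes (the intermediate typefaces are not restated here).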
Poulton makes the point that type size, interlinear spacing and line length are each as near optimal as possible – following Tinker's recommendations – in order to make the typefaces in the experiment the only non-optimal variables.
Poulton’s operational criterion for measuring legibility is ‘rate of
comprehension’. This criterion, developed by Poulton, reflects his
scepticism towards Tinker’s comprehension checks deemed as insensitive
because they are used merely as a means for checking that
comprehension is above a minimum level, and thus to a large extent
mainly relying on the speed of reading as a measure of legibility. Poulton
juxtaposes two equal variables, comprehension and speed of reading, and
the resulting measure is named ‘rate of comprehension’.
Earlier, Poulton had criticised Tinker and Paterson’s experiments
with regard to internal, external and statistical conclusion validity
(Poulton 1960). Poulton strongly advocates the idea that by measuring
the degree of comprehension in conjunction with speed of reading,
legibility experiments would become more sensitive and thus yield
differences that otherwise would go undetected (allegedly the reason for
Tinker’s ‘negative’ results). A further exposition of his concept of ‘rate of
comprehension’ can be found in Poulton (1968). The claim that rather
subtle differences in the design of ‘ordinary’ typefaces can be reflected in
the degree of comprehension, measured by answering questions after having read a given passage for a limited time, seems to me to be
implausible. It represents an extreme position in contradistinction to the
view that typefaces are very much peripheral to the reading process.
Furthermore, what comprehension is, is not easily answered: is Poulton
measuring comprehension, or a facet of comprehension only, or some-
thing else?249 Nevertheless, Poulton claims that: ‘Using this method
I have found statistically reliable differences between typefaces where
previous research workers have failed to do so’ (Poulton 1968, p. 73).
The results of the experiment ranked a sans serif typeface on top – Gill Sans, not Univers. Gill Sans was found to be 'reliably' better than
the two other sans serif typefaces, Grotesque 215 and Univers, which
249. For a brief but useful discussion on comprehension in the context of reading, see Rayner and Pollatsek 1989 (pp. 316–320).
came respectively penultimate and last on the ranking list. However, the
study did not reveal ‘reliable’ differences between any of the sans serif
typefaces and the roman typefaces.
Poulton argues explicitly, in opposition to Burt, that since a sans serif typeface produced the highest average rate of comprehension, it is likely that it is not the serifs (or lack of serifs) which determine the legibility of a typeface. In order to find a theoretical explanation, Poulton refers to his two typographic advisors, Cheetham and Grimbly (1964). They pointed to character differentiation as an answer, claiming that Gill Sans, 'with its geometrical approach allied to humanistic letterforms', has, compared to the two other sans serif typefaces, a stronger character differentiation (the shape of individual characters) and thus less family resemblance. They also suggest the possibility that the rather uniform internal space ('counters') of the characters of Univers interacts with the (presumably relatively small) amount of space between the letters, and thus creates confusion. Poulton does not, however, pursue these points of view any further. Cheetham and Grimbly
conclude that:
The readability tests show that Univers does not in fact possess greater
readability than its older rivals; but if it falls down on this claim, it still
has abundant attraction for typographers. (1964, p. 71)
The exposition of this experiment in Design provoked a heated discussion in subsequent issues, with prominent participants from the world of design: Ken Garland, John Dreyfus, Anthony Froshaug, Noel Carrington, Walter Tracy, and Jock Kinneir. The graphic designer Ken Garland, who had been the designer of the very same journal between 1956 and 1962, opened up the discussion in the same issue as Cheetham and Grimbly's article:
a discovery has come out of the work on this article which should give
great heart to all those who fear that the incursion of science means the
death of whatever freedom designers have managed to wrest from their
employers. For years they have been burdened with the traditional
typographer’s dogma that serif typefaces are much better for continuous
reading than non-serif typefaces. Every time graphic designers
attempted to use non-serif typefaces in any quantity they had this holy
law slung at them. Now, glory be, a scientist has proved that the
difference in the degree of comprehension between commonly used serif
and non-serif typefaces is negligible; and that what small superiority
does exist belongs to a non-serif face. Thus a great and venerable
shibboleth is banished. Now I ask you, fellow designers, is this freedom
or bondage? (Design, 1964, no. 186, p. 29)
John Dreyfus, who had been involved in launching the Monotype
version of the Univers family (see Dreyfus 1961), aggressively expressed
his doubts about the experiment’s validity and about Ken Garland’s
point of view. Dreyfus claimed that since
most people read most easily the types which they are most used to
reading, it is not surprising that Gill Sans, popular in England since
1928, should come out of Dr Poulton’s tests with such flying colours.
(Design, 1964, no. 188, p. 65)
This last criticism was, however, vehemently repudiated by Cheetham and Grimbly in a comment to Dreyfus and other letter-writers, pointing out that Gill Sans was not at all a familiar typeface for continuous text (in the same issue of Design, p. 67).
In a follow-up article, Cheetham, Poulton and Grimbly, while
advocating an extensive programme of experimental research into
graphic design, showed that they had certainly picked up
Garland’s argument:
the fact that what we offered was conjecture rather than dogma (plus
the fact that our results could free designers from a self-imposed
restriction on the use of sans serif faces in text settings) seemed to
suggest that scientific methods were both less rigid and more creative
than had been thought. (Cheetham, Poulton, and Grimbly 1965,
pp. 48–49)
Thus, between the lines: science had a liberating potential and could
become a weapon for young progressive non-dogmatic and creative
designers in their battle with older traditionalist colleagues.
Shortly after, in 1968, the letter cutter, type designer, and former
apprentice of Eric Gill, David Kindersley, discussed Poulton’s experiment
in a conference paper published in Printing Technology.
Kindersley asked how Gill Sans, a sans serif typeface with geometric
shapes and a very small x-height, could oust Univers, ‘a beautifully
tailored modern sans’ with a very large x-height (Kindersley 1968, p. 71).
Kindersley, who in the 1960s eagerly investigated and prescribed a
letter-spacing system (see Kindersley 1976), claimed that both Gill Sans
and Univers were ‘wrongly’ spaced, but that the more generous internal
and inter-letter spacing of Gill Sans (plus a higher degree of
differentiation among characters) probably carried the answer.
Kindersley pointed out both internal and inter-letter space as important
variables which determine the qualities of a typeface, and he suggested
that a type manufacturer’s default inter-letter spacing is not
an intrinsic property of a typeface. He therefore concluded that inter-
letter spacing must be controlled in experiments in order to avoid it as a
confounding factor (1968, p. 71).
Zachrisson 1965
*
In 1965 Bror Zachrisson, the director of the educational institution the
Graphic Institute in Stockholm, published his doctoral thesis, Studies in
the legibility of printed text. The thesis was based on research carried out
in 1954 and 1964, and was an extended version of his licentiate
dissertation from 1957 ‘Studies in the readability of printed text with
special reference to type design and type size’.
Zachrisson, son of the famous Swedish printer Waldemar
Zachrisson, had long experience of the printing industry, and he
was also one of the co-founders of the Graphic Institute. In his thesis, he
investigated both ergonomic aspects (legibility) and semantic aspects
(congeniality) of typefaces, and the problems found suitable for
investigation had been chosen after consulting with ‘leading text-book
publishers’.
A considerable part of Zachrisson’s study is devoted to investigating
the relative legibility of sans serif typefaces and roman typefaces.
Zachrisson refers to what he describes as an assumption often found in
the printing trade: that sans serif typefaces are less legible than roman
typefaces.
* Bror Zachrisson. 1965. Studies in the legibility of printed text. Stockholm: Almqvist &
Wiksell.
While discussing research on reading and comprehension
Zachrisson acknowledges that subtle differences in typeface design are
peripheral to the reading process. He states that: ‘Under normal circum-
stances, a reader is limited in speed only by his rate of comprehension’
(p. 23). And he goes on: ‘There is reason to assume that under normal
conditions no significant differences exist between type faces in common
use by adults for running text.’ He makes the point that because children
are slow readers struggling with comprehending the text, nuances of
type design probably matter less to them than to adults (p. 36). However,
these ‘insights’ do not deter him from performing many comparative
experiments on the relative legibility of typefaces. His subjects are
mainly schoolchildren, for whom, he states, ‘the author is convinced
that the question of type face design is one of the most pressing
problems’ (p. 93). Zachrisson claims that sans serif typefaces are widely
used in school-books and juvenile literature, at least in Sweden and the
United States, though not without controversy. It therefore ‘seems
worthwhile investigating whether there is any real difference between
them’ (p. 93).
Zachrisson refers to Pyke, Tinker and Paterson, Ovink, Burt,
Moede, and Brachfeldt (pp. 36–38), and he employs a wide variety of
operational criteria in his experiments:
errors in oral reading at a normal rate (pp. 97ff)
speed of silent reading at a normal rate plus comprehension check
(pp. 109ff)
instant perception of single words, through a tachistoscope (pp. 115ff)
threshold from blur to focus, through a focal variator (pp. 121ff)
indirect vision of single words, through a perimeter (pp. 124ff)
ocular preference in a situation of binocular rivalry between typefaces,
through a stereoscopic haploscope (pp. 128ff)
readers’ opinion (pp. 131ff)
Speed and errors were the criteria used in the main experiments.
The error criterion (oral reading) was used for pupils aged 7–8 in
grade 1, for whom silent reading would prove difficult. A variety of sans
serif and old style roman typefaces were employed in the various
experiments. In almost all of the experiments the result is described as
‘the null hypothesis may be accepted’, i.e. no significant difference was
found.
Thus, Zachrisson’s conclusion is clear:
The empirical results show that there is no significant difference, in
objectively measured legibility, or subjective opinion regarding ease of
reading between the OF [roman old style] and SS [sans serif] type faces.
(p. 132)
Wendt 1969
*
In 1969, Dirk Wendt published an extensive study on typeface legibility.
Translated to English, the title would read: The influence of typeface
category (Bodoni vs. Futura), typeface posture, and typeface weight, on
the speed of reading.250 This study was one of several in a series
published by the Department of Psychology at the University of
Hamburg under the series title ‘Untersuchungen zur Lesbarkeit von
Druckschriften’ (‘Investigations into the legibility of typefaces’). Dirk
Wendt, who later moved to the University of Kiel where he became a
professor and since then has continued to work on legibility, published
many articles on legibility and related topics in the late 1960s and the
1970s in the Journal of Typographic Research and its successor Visible
* Dirk Wendt. 1969. Einflüsse von Schriftart (Bodoni vs. Futura), Schriftneigung und
Fettigkeit auf die erzielbare Lesegeschwindigkeit mit einer Druckschrift. Bericht Nr. 5,
Untersuchungen zur Lesbarkeit von Druckschriften. Hamburg: Psychologisches
Institut der Universität Hamburg.
250. As late as 1994 the main content of this study was re-published as a substantial
part of a 36 page article on legibility by Wendt (Wendt 1994), in Peter Karow’s
anthology, Font technology, published by Springer-Verlag in both German and
English editions. However, neither Wendt’s article, nor the general bibliography,
contain any reference to the original study from 1969. Furthermore, certain clues and
omissions in the text give the impression that the article is describing research
carried out recently, and an illustration which claims to show ‘specimens of the
typefaces actually used’ (pp. 296–297 in the English edition) most probably shows
more recent (and somewhat different) digital versions of the typefaces.
Language, and not least in German printing trade journals, like Druck-
Print and Papier und Druck.251 He still publishes legibility studies.252
The aim of the study in question is to investigate the influence and
possible interactions of typeface category (roman and sans serif), typeface
posture (upright and cursive/slanted), and typeface weight (light, regular,
bold, and extra bold), upon legibility. The ‘modern’ or neo-classical
typeface Bodoni of the Berthold type foundry and the sans serif typeface
Futura of the Bauer type foundry are chosen to represent respectively the
roman category of typefaces and the sans serif category of typefaces.
These typefaces, or more correctly these typeface families, were chosen
because they have a large number of variants. The large range of family
variants made it possible to investigate the relative legibility of eight
variants (combinations of posture and weight) of each of the two typeface
families – altogether 16 different typeforms. Wendt employed carefully
adapted German versions of Tinker’s Chapman-Cook speed of reading
tests, and generally seems to have invested heavily in designing sound
experimental procedures – the study also contains an extensive
introductory chapter on methodological considerations.
Wendt points out that ‘The question of “roman or sans serif” has
been controversially discussed among typographers for some time’, and
then goes on discussing back and forth several arguments put forward in
the defence of either of the two typeface categories:
The characters of sans serif typefaces are better differentiated and
therefore more legible, because they lack serifs which would have
made them more similar to each other.
The modulation of the stroke of roman typefaces is an archaic
remainder of earlier production methods.
Serifs help to form more compact and typical word shapes.
Serifs constitute a useful rail for the eye to slide along.
With respect to the last argument, Wendt refers to a preference
study of typefaces and interlinear spacing, where the readers found that
for a text set in a sans serif typeface to be appealing, it needed more
251. See for example Wendt 1970b, 1971, 1972.
252. See Wendt 1994; and Wendt, Groggel and Gutschmidt 1997.
interlinear spacing than for a roman typeface (Becker, Heinrich, von
Sichowsky, and Wendt 1970). Wendt suggests that this may be so
because of sans serif typefaces’ missing rail of serifs (1994, pp. 294–295).
In Wendt’s own study, each piece of test material was set in 8 point
size, in five columns, unjustified (uneven right margin), and with slightly
different line lengths so the same lines could be identical from one piece
of test material to another. Two thousand subjects were involved in the
experiment, which measured the number of words read in three minutes.
The results of interest to us here showed that the overall difference
between the two typeface families (all the variants taken into
consideration) was less than three words per minute (0.74%), in
favour of the sans serif typeface family Futura. Wendt’s clear conclusion
is that the difference ‘cannot be considered significant’ (1994, p. 305).
Actually, if we compare only the regular variants of the two typeface
families, which Wendt does not do, the difference was less than half a
word read per minute, in favour of the roman typeface Bodoni.
However, while discussing the overall result, Wendt, surprisingly
but not without ulterior motive as we will see later, states that it
was a surprise in so far as Paterson and Tinker (1932), applying in
principle the same experimental procedure, found a superiority of a
roman typeface (Scotch Roman) over a sans serif (Kabel Light) of 2.3%,
and Pyke (1926) even found a superiority of 18% for a roman typeface
(Monotype No. 2 Old Style) over a sans serif (Stephenson & Blake
No. 10 Lining Grotesque). (1994, p. 305)
Wendt seems to overlook the simple fact that while he himself has
employed one typeface of each category (serif and sans serif), Paterson
and Tinker employed only one sans serif typeface among a total of ten
typefaces in their experiment, most of them roman. Likewise, Pyke
employed one sans serif typeface together with seven roman typefaces in
his experiment. Thus the sheer statistical probability of the sans serif
typeface in Wendt’s experiment being ranked above or on a par with the
roman is considerably higher than in Paterson & Tinker’s and Pyke’s
experiments. Furthermore, Paterson and Tinker’s interpretation of their
own results was that (except for the typewriter and textura typeface) all
the typefaces (including the sans serif) were ‘equally legible’ (Paterson
and Tinker 1932, pp. 609, 613). Although Pyke ‘found a superiority of
18% for a roman typeface ... over a sans serif’ (Wendt), the sans serif
was actually ranked second (of altogether eight typefaces, seven roman
and one sans serif), and the biggest difference between any face and the
next on the ranking list was between the sans serif and the next, which
was 30 per cent less legible than the former. Pyke even emphasized that
his second best typeface (the sans serif) was one of the most uncommon
(Pyke 1926, p. 52).
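A rough back-of-the-envelope illustration (mine, not drawn from any of the studies under discussion) may make the asymmetry explicit. Assume, for simplicity, that all k typefaces in an experiment are in fact equally legible, so that the ranking is effectively a matter of chance:

```latex
% Chance that the single sans serif typeface is ranked first (or ties for
% first) among k typefaces that are in fact equally legible:
P(\text{sans serif first}) \approx \tfrac{1}{k}
\quad\Rightarrow\quad
\tfrac{1}{2}\ (\text{Wendt},\ k = 2) \;>\;
\tfrac{1}{8}\ (\text{Pyke},\ k = 8) \;>\;
\tfrac{1}{10}\ (\text{Paterson and Tinker},\ k = 10)
```

The exact figures matter less than the comparison: a sans serif ‘win’ is far more probable by chance alone in Wendt’s two-family design than in designs with eight or ten typefaces.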
While discussing his results Wendt also refers to Crosland &
Johnson (1928) and Kerr (1926) (both from the same period of time as
the two studies by Paterson & Tinker and Pyke) who ‘reported an
inferiority of a sans serif typeface’. Wendt does this without
acknowledging Cyril Burt as his flawed secondary source of information,
and furthermore, without realising that the two publications in question
in fact do not support either Burt’s claim253 or Wendt’s claim. Never-
theless, he also refers in his discussion to the more recent works of
Brachfeldt (1964)254 and Zachrisson (1965), where very little difference
in legibility between roman and sans serif typefaces was found.255
This leads on to Wendt suggesting that earlier reported differences
between sans serif and serif typefaces in favour of serif typefaces (i.e.,
Paterson & Tinker, Pyke, Kerr, and Crosland & Johnson) have changed
to no difference (Brachfeldt, and Zachrisson) due to increased
habituation to sans serif typefaces, and consequently changed to an
inverted situation:
253. See Hartley and Rooum 1983, p. 205. In fact, Kerr argued a priori that sans serif
typefaces are less legible than serif typefaces due to irradiation (p. 552), while at the
same time, on p. 554, referring to Roethlein’s study of 1912 where the sans serif
typeface News Gothic ‘was found most legible’. However, this inconsistency is likely
based on a confusion on Kerr’s part – he refers to News Gothic as a serif typeface ‘similar to
those on p. 554’, that is, a typical ‘modern’ 19th century serif typeface.
254. The article in the trade journal Verlags-Praxis by Professor Dr Oliver Brachfeldt is
basically an a priori argument against the use of sans serif typefaces in books. Within
the article there is a brief mention of what may have been an experimental
empirical study.
255. However, Brachfeldt’s own brief interpretation of the study in question is that the
relatively small difference of 3 seconds per page in favour of serif typefaces will be
considerably larger when extrapolated to 20 or 30 pages of reading material.
Finally, in our own study[256] we found even a slight superiority of a
sans serif typeface over a roman one, though this finding does not yet
reach statistical significance. If this impression is correct, the decrease
in the difference of legibility between roman and sans serif typefaces
may well be caused by the general increase in use of sans serifs in print.
([1969]/1994, p. [21]/306).
This argument, the ‘hypothesis of habit’, which Wendt also advocated
elsewhere and in designer fora (Wendt 1970a), does not sound
unreasonable. Further on in his discussion, Wendt argues as follows:
Therefore, familiarization with a typographic design is one of the main
difficulties in legibility research: Based on experimental procedures,
only very seldom do we discover whether a new typographic design is
actually inferior to an established one due to intrinsic qualities, or
simply rates lower because readers are less familiar with it. With
respect to the perceptions concerning roman and sans serif typefaces, it
may be the case that sans serifs were initially less legible because they
were less familiar, but due to their increasing daily use in printed
matter of all kinds they became more and more familiar, their
inferiority to roman typefaces decreasing proportionally. (1994, p. 306)
This may very well be so. However, the problem is that the seven
studies (from Pyke onwards, and including his own) which Wendt
painstakingly refers to in chronological order, so as to show a trend
which can substantiate his ‘hypothesis of habit’, do not after all,
as demonstrated above, support his ‘hypothesis’.
Paterson and Tinker only employed one sans serif typeface among a total
of ten typefaces in their experiment. Pyke only employed one sans serif
typeface together with seven roman typefaces in his experiment. And
more importantly, the second best typeface in Pyke’s experiment was not
only a sans serif typeface, but also the most uncommon. And further-
more, the biggest difference between any face and the next on Pyke’s
ranking list was between the sans serif and the next. The studies by
Crosland & Johnson and Kerr do not, after all, support Wendt’s claims.
Whether the Brachfeldt article deserves status as an empirical study,
and whether Wendt’s interpretation of its results is reasonable, are open
questions.
256. That is, an even more recent study (my comment).
Furthermore: although Wendt qualifies his statement about
‘a slight superiority of a sans serif typeface over a roman one’ by stating
that ‘though this finding does not yet reach statistical significance’, he
still relates this ‘finding’ from his own study to a trend in order to
support his familiarity thesis. What is more, it is not even nominally
correct to say that the study showed ‘a slight superiority of a sans serif
typeface over a roman one’. It was the sans serif typeface family that on
average showed a ‘slight superiority’, while the opposite was the case for
the regular variants of the two typeface families (the most common
variant in legibility studies). And finally, the seven studies that are
included in Wendt’s argument represent only a fraction of relevant
studies available for trend comparison.
To conclude: Wendt’s study is an ambitious and large-scale
experimental study, and the results seem reasonable. However, Wendt’s
trend reasoning is false, and his familiarity thesis therefore falls apart.
Nevertheless, as a consequence of his trend reasoning, Wendt elsewhere
suggests that the future, when sans serif typefaces have become more
and more common, will possibly ‘demonstrate objectively a better
legibility of sanserif types’ (Wendt 1970a, p. 43).
Robinson, Abbamonte, and Evans 1971:
Why serifs are (still) important
*
In 1971, at the high point of ‘legibility research’, a study of the relative
legibility of roman and sans serif typefaces was published in the journal
Visible Language. The authors of ‘Why serifs are important: the
perception of small print’ were David Owen Robinson, Michael
Abbamonte, and Selby H. Evans. The legibility of roman and sans serif
typefaces had been explored on many occasions beforehand, and
although interest in ‘legibility research’ would pretty much evaporate
only ten years later,257 experimental studies on this seemingly finicking
question are even today produced at regular intervals. Robinson and his
co-authors’ topic – which is more legible, typefaces with or without
serifs? – is thus in one sense unremarkable. It seems to represent a neat
dichotomous question which has apparently fascinated many
researchers.
But their paper was unusual in two respects compared to most
legibility studies. First: not only did its authors attempt to establish the
* David Owen Robinson, Michael Abbamonte, and Selby H. Evans. 1971. ‘Why serifs
are important: the perception of small print’. Visible Language, vol. 5, no. 4,
pp. 353–359.
257. For some references to criticism of ‘legibility research’ – a modest ‘research program’
that peaked in the 1960s and 1970s – see my review essay in Information Design
Journal (Lund 1995). See also Richard Venezky’s (1984) article ‘The history of reading
research’, where legibility studies are placed within a broader historical context of
socio-behavioural reading research. See Tinker 1963, and Spencer 1969, for
summaries of legibility research.
One of the most serious objections which repeatedly have been raised against
experimental legibility research points out that ‘factors’ such as typeface are very
much peripheral to the reading process, and thus that it will be extremely difficult to
quantify any meaningful differences in legibility, for example by employing an
operational method which measures the speed of reading. This point has been made
by several researchers and typographers; among them, R.L. Pyke in 1926 (pp. 60–61),
David Sless in 1981 (p. 170), and more recently, albeit obliquely and within a slightly
different context, by Andrew Dillon in the chapter ‘Describing the reading process at
an appropriate level’, in his Designing usable electronic text (1994). However, this
objection does not imply a rejection of the need carefully to design, modify, or choose
appropriate typefaces for specific applications (as well as the need carefully to
manipulate other interdependent typographic variables; be it on paper, on screen or
on any substrate) on the basis of aesthetic, craft-based, and ergonomic
considerations.
superior legibility of roman over sans serif typefaces, they actually
suggested a theory for ‘why serifs are important’. This was in striking
contrast to the atheoretical approach of most previous legibility research.
Second: the study was not based on behavioural experiments involving
human ‘subjects’, but on a cognitive science approach, and it used a
computer model of human visual perception.258 The published article
has often been cited, especially in recent years.259 For all these reasons
the article known in the literature of legibility studies as ‘Robinson,
Abbamonte, and Evans 1971’ still deserves attention.
The content of ‘Why serifs are important’
The authors ask in a slightly ironic tone why serifs, which appear
to be ‘at best only decorative and at worst merely superstitious’, have
not disappeared, as long as sans serif typefaces exist as an alternative to
seriffed typefaces (p. 353). After briefly referring to the research of
the psychologists Christopher Poulton and Miles Tinker, and to the
typographer Geoffrey Dowding’s discussion of sans serif typefaces,
Robinson et al. put forward several arguments and counter-arguments
about the usefulness of serifs. They bluntly discard as ‘unconvincing’ the
suggestion that people’s preference for serifs is first of all an expression
of aesthetic judgement. They sweepingly discard the possibility that
preference is merely a question of conditioning or habituation. Further-
more, they discard the possibility that serifs enhance the eye’s horizontal
movement along lines of typeset text. Here they argue that ‘since adults
only make a few eye fixations in reading each average-length line of print, it
seems unlikely that the continuity from one letter to the next should be
an important factor’ (p. 354). Finally, they discard the possibility that
letters with serifs convey more information due to their more complex
structure with the counter-argument that serifs can just as well be
regarded as unnecessary visual ‘noise’.
258. The research in question was ‘supported by the Department of Defense’ and the
authors were affiliated to the ‘Institute for the study of cognitive systems’ at the
Texas Christian University in Fort Worth.
259. For example in 1975, 1982, 1984, 1990, 1991, three times in 1993, 1994, and in 1997.
See the subsection ‘The paper’s reception’ below.
Instead, the authors suggest that ‘the neurological structure of the
human visual system benefits from serifs in the preservation of the main
features of letters during neural processing’ (p. 353). They state that
their theory depends on ‘the physiological structure of the human visual
system’. As a tool for finding support for their theory, the authors employ
a computer model which, they say, simulates human visual perception.
In the next few paragraphs I attempt to describe Robinson and his
co-authors’ procedure. My attempt is based partly on the insufficient
explanation given in their paper, and partly on a more detailed but
nevertheless ambiguous and vague explanation given elsewhere
(Evans et al. 1968).
Dot representations of four letters, each in two different sizes, are
entered into a computer. Since the size of the dots is the same for the
two sizes of letters, the relative representation of the smaller letters is
far cruder than the representation of the larger letters. The letters are
E, T, f, and h; from two IBM Selectric typewriter typefaces, the sans
serif face Artisan and the seriffed face Courier (figure 9). After the input
operation, the representations are detected and processed by the
computer program’s ‘horizontal and vertical line operators’, which are
sensitive respectively to horizontal and vertical lines. These line
operators are meant to imitate ‘feature detectors’ which are assumed to
be located in the brain’s striate cortex as part of the human visual
system. The line operators can be described as matrices of 5 × 5 cells,
while the dot representations of each letter are positioned within a
matrix of 48 × 48 cells (Robinson et al. 1971).
FIGURE 9. The original letters used for the experiment, before they were turned
into dot matrix representations ready for computer input. (From Robinson et al.
1971, p. 356: figure 1.)
The cells within each line operator carry differentiated values,
such that the cells along the centre column of the vertical operator are
assigned the highest values, the cells along the two adjacent parallel
columns lower values, and the cells of the two outer columns zero
value (Evans et al. 1968).
Each operator ‘reads’ the input pattern of the letters in such a
way that it successively covers all ‘distinct’ locations of the input matrix
in discrete overlapping steps. At each location the value of each cell of
each operator is multiplied by either one or zero, depending on whether
or not a ‘dot’ is assigned to the corresponding cell of the input matrix.
Eventually the products of each cell are, in one way or another,260 added
and then transferred to a new resulting matrix where eventually the
cells with the largest sums will be constituted by dots in the print-out
pattern (Evans et al. 1968).
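To make the procedure easier to follow, here is a minimal sketch, in present-day Python (which obviously postdates the study), of the kind of weighted-window operation the two papers appear to describe. The operator weights, the stride of one cell, the thresholding of the result and the stand-in ‘letter’ are all illustrative assumptions of mine; neither Robinson et al. 1971 nor Evans et al. 1968 reports these details (see note 260).

```python
import numpy as np

# Hypothetical 5 x 5 'vertical line operator': highest weights in the centre
# column, lower weights in the two adjacent columns, zero in the outer columns.
# The actual cell values used by Evans et al. (1968) are not reported.
VERTICAL_OPERATOR = np.array([[0, 1, 2, 1, 0]] * 5)
HORIZONTAL_OPERATOR = VERTICAL_OPERATOR.T   # sensitive to horizontal lines instead

def apply_line_operator(letter, operator):
    """Slide the operator over a binary (0/1) dot representation of a letter.

    At each location every operator cell is multiplied by 1 or 0, according to
    whether a dot occupies the corresponding input cell; the products are summed
    and written to the corresponding cell of a result matrix.
    """
    size = operator.shape[0]
    rows, cols = letter.shape
    result = np.zeros((rows - size + 1, cols - size + 1))
    for r in range(result.shape[0]):
        for c in range(result.shape[1]):
            window = letter[r:r + size, c:c + size]   # dots currently under the operator
            result[r, c] = np.sum(window * operator)  # weighted sum at this location
    return result

def to_dots(response, threshold):
    """Keep only the cells with the largest sums as dots in the 'print-out' pattern.

    How the two operators' outputs were combined and cut off is not stated in
    the original papers; a simple threshold on one operator is assumed here.
    """
    return (response >= threshold).astype(int)

# A crude 48 x 48 stand-in 'letter' consisting of a single vertical stroke.
letter = np.zeros((48, 48), dtype=int)
letter[8:40, 22:26] = 1
response = apply_line_operator(letter, VERTICAL_OPERATOR)
print(to_dots(response, threshold=0.8 * response.max()).sum(), 'dots retained')
```

In modern terms the procedure amounts to cross-correlating the letter image with two small kernels, which also helps to explain why a system restricted to these two operators privileges horizontal and vertical strokes over oblique ones – a point taken up below.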
FIGURE 10. Dot matrix representations of the smaller-sized letters, before and after
‘the computer model of the line detectors of the human visual system’ was applied.
(From Robinson et al. 1971, p. 357: figure 2.)
FIGURE 11. Dot matrix representations of the larger-sized letters, before and after
‘the computer model of the line detectors of the human visual system’ was applied.
(From Robinson et al. 1971, p. 357: figure 3.)
260. The authors provide neither information about the exact values of the cells of the
operators, nor details about the further mathematical operations.
The resulting print-out pattern shows that for relatively small
sizes of letters (in ‘ordinary bookprint’ size), serifs help to preserve the
original image (figure 10). For the capital E in particular, the serifs are
crucial. However, for larger letters serifs are not found useful (figure 11).
The authors explain the result for the larger letters by suggesting that
such letters are probably perceived by different kinds of feature detectors
– by edge detectors rather than by line detectors. However, they add that
when larger letters are viewed from a distance, for example on
billboards, the letters will be perceived by line detectors, and serifs will
thus still be of importance.
The authors conclude:
If the computer model has any validity as an imitation of the human
visual system, then one may conclude that serifs are important in
preserving the image of small letters when they are represented in the
neurological structure of the visual system. (p. 359)
The words ‘if’ and ‘may’ reflect their only detectable caution about the
validity of their research. Elsewhere they are more categorical in their
exposition.261 However, as I shall argue below, the validity of this study
does not rest, as its authors suggest, on the fidelity of the computer
model alone, as if it is a matter of fact that ‘line detectors’ exist as
part of the human visual system and that there is a simple and direct
relationship between how the visual system works and the computer
model.262 First of all the study rests on a multi-level chain of theoretical
constructs and assumptions about the human visual system.
A cognitive science approach
In order better to understand the claims which the authors make,
I need to extend my discourse beyond the scope taken by Robinson,
Abbamonte, and Evans.
261. See for example their claim on p. 354: ‘The explanation which this article proposes
depends on the physiological structure of the human visual system.’
262. See for example the categorical references to ‘the computer model of the line detectors
of the human visual system’ and ‘the model which imitates human visual feature
detectors’ on p. 356. See also the abstract on p. 353.
On a fundamental level the authors rely on a cognitive science
approach employing an explanatory computer model of human visual
perception, implying that the brain functions like a digital computer that
runs programs consisting of algorithms. This is in line with ideas which
dominated cognitive science at the time of writing, and which continued
to do so until the mid-1980s.
However, not only is it highly questionable to assume that the
brain is built like a digital computer,263 it is also highly questionable
to assume that it functions like one, that is, that
it performs algorithmic manipulations of symbolic representations. Well-
articulated criticisms of this widespread assumption – an assumption
which is still vigorously defended – have been offered from various
stances.264
For example, the philosophers Hubert Dreyfus and John Searle
have argued against what is both the central conceptual belief of
cognitive science and the basis for its central method of enquiry:265 that
cognition has a computational information-processing character and can
be treated as ‘device-independent’, and thus that cognitive processes,
including visual perception, can be simulated or modelled compu-
263. Think of the physical hardware level with its wires, integrated circuits and magnetic
discs; and the way particular operations are realised in registers.
264. For example, from the philosophers Hubert Dreyfus ([1972] 1992, pp. 163–188), John
Searle (1990; 1992, pp. 197–226) and Peter Hacker (1987); from the neuroscientist
Gerald Edelmann (1994, pp. 211–252); the psychologist Thomas Hardy Leahey (1992,
pp. 444–461); the sociologist Jeff Coulter (Button et al. 1995); and, although more
implicitly but no less forcefully, from the computer scientist Terry Winograd together
with Fernando Flores (1986).
265. I am concerned with Dreyfus’s and Searle’s critique of ‘cognitive reason’ or
‘cognitivism’ (the AI version of cognitive psychology, as opposed to applied AI
engineering or cognitive psychology per se). For the sake of order: I am here not
concerned with their rejection of the possibility of creating computer-based ‘general
purpose’ artificial intelligence, where they basically argue that intelligence cannot be
non-biological, disembodied, and independent of intentionality and of humans’ tacit
and pragmatic background knowledge of the world. See Searle’s famous ‘Chinese
room’ argument (‘syntax is not sufficient for semantics’); it appeared in a paper first
published in Behavioral and Brain Sciences (Searle 1980). The paper has since then
been republished on numerous occasions in anthologies and readers; e.g. together
with Boden 1990 (pp. 67–88). It has also on numerous occasions been both vigorously
contested (e.g. Boden 1990, pp. 89–104), vigorously defended (e.g. Kelly 1993,
pp. 173–184), and vigorously yet conditionally defended (Button et al. 1995,
pp. xi, 1–3, 15–22).
tationally by programs in order to generate relevant empirical data.
To put it another way: that the best way to understand cognition is to
study computer programs which try to reproduce cognition. To push it
further: what the brain does may adequately be described by algorithms:
If a machine could be designed which could ‘see’, its computer program
could constitute the implementation of a theory of how seeing is
achieved by humans. (Bruce and Green 1990, p. 77)
These ideas imply that due to similarities between inputs of both
human cognition and a computer model of human cognition, and
similarities between outputs of both human cognition and a computer
model of human cognition, there is a symmetrical relation between
human cognition on the one hand and a computer model of human
cognition on the other. This is as if minds are non-biological and
disembodied computer programs only accidentally implemented
in brains, and thus that computer programs and minds function on
the same internal principles. John Searle offers the following polemical
argument against such a view:
the word processing program simulates a typewriter better than any AI
program I know of simulates the brain. But no sane person thinks: ‘At
long last we understand how typewriters work, they are
implementations of word processing programs.’ It is simply not the case
in general that computational simulations provide causal explanations
of the phenomena simulated. … we do not suppose that because the
computer simulates a typewriter, therefore the typewriter simulates a
computer. (Searle 1992, p. 218)
And again:
The mistake [of cognitive science] is to suppose that in the sense in
which computers are used to process information, brains also process
information. (p. 223)
In the same vein, but more specifically on visual perception,
the philosopher Peter Hacker attacks David Marr’s currently very
influential computational theory of both human visual perception
and ‘machine vision’:
what Marr describes constitutes, as might be expected from a
specialist in artificial intelligence, the outline of a novel approach to
the problems of the theory of ‘machine-vision’, i.e. the theory
underlying the design of machines that can identify what lies before
them as a consequence of light falling upon a sensor. But Marr’s
theory is neither an explanation nor even a description of animal or
human vision. (Hacker 1991, p. 119)
[Marr’s theory’s] fruitfulness and coherence as a computational
theory for artificial-intelligence research is a matter for evaluation for
computer scientists. … My sole concern is with the question of
whether his theory is (as it purports to be) a theory of vision, in
particular of human vision, or indeed whether it even provides a
coherent framework for such a theory. (p. 121)
Similarly, in an issue of the Annual Review of Psychology, William
P. Banks and David Krajicek spell out a point that is obvious but not
often emphasised:
Does machine perception always tell us something about biological
perception? The AI approach to perceptual theory is but one area of AI
research. Many AI researchers in perception are simply trying to design
machines that accomplish certain perceptual tasks, and they study
human perception in the hope of learning something useful in that
effort. However, it is unlikely that the constraints of biological evolution
and the multiple competencies required of a living organism have
produced solutions that apply directly to such single-purpose
engineering problems. To attempt to implement the biological strategies
of perception in a perceiving machine seems to some as foolish as
designing airplanes with flapping wings. By the same token a machine
model of perception may tell us surprisingly little about human
perception. (Banks and Krajicek 1991, p. 310)
This scepticism points to the danger of invalid inference from
computer models that only partially, or maybe even hardly, corres-
pond to the natural phenomena being modelled, because of too many
disanalogous elements. But that same scepticism does not deny the
general usefulness of metaphorical models in science, in everyday
thought and language, or in other contexts. However, the writers I have
quoted so far point to a conceptual confusion and a too fuzzy boundary
between the model and what is being modelled; a confusion which easily
arises within a paradigm where cognitive models are viewed as
‘functional’ and thus ‘device-independent’.266
266. It is worth pointing out that although the study in question (i.e. Robinson et al. 1971)
is preoccupied only with human vision, the underlying context is not necessarily
limited to vision proper. This is revealed by the authors’ ‘AI-like’ approach, their
institutional ‘cognitive science’ affiliation, and the implicit ‘machine-vision’ aim in one
of the co-authors’ former publications (Evans et al. 1968), on which the paper in
question is partially based. In fact much of the early work on pattern recognition in
an AI context focused exactly on recognising alphanumeric characters, in order to
develop useful applications, for example for automatic sorting of letters with hand-
written postal codes (Bruce and Green 1990, p. 180).
A multi-level chain of theory
With regard to broad theoretical positions on visual perception:
there is a noticeable lack of consensus among both philosophers and
psychologists. The most basic divide can be said to be between
information processing theories on the one hand and ‘Gibsonian’
ecological theory on the other hand. Information processing theories
emphasise the indirect, inferential and constructive nature of visual
perception (whether mediated by computational cognitive processes or
not). ‘Gibsonian’ ecological theory claims that visual perception has a
more direct, unmediated nature; that it relies not on algorithmic
processing of abstract symbolic representations of more or less static
two-dimensional retinal ‘snapshots’, but on non-algorithmic detection,
where a very information-rich surrounding optic array of light provides
continuous input for the vision of locomotive humans.267
Robinson and his co-authors’ theory of ‘human visual processing’
is based on an information processing approach to visual perception.
Their theory includes a ‘feature detection’ theory and a ‘degradation’
thesis, both of which I will discuss below. I will suggest that it is highly
questionable whether their theory reflects the biological reality of human
visual perception. The same doubt applies to the computer model of the
theory, which is basically an algorithm, assumingly encoded in a high-
level programming ‘language’ as a computer program. That doubt
further applies to the actual computer implementation: that is, the
running of the program on a computer, performing operations on
representations of the input, and producing output as results of these
operations.268
267. See for example the exposition in Bruce and Green’s widely referred to textbook on
visual perception (1990). Bruce and Green aim at an integration of aspects of both
theoretical stances; as does Thompson (1995, pp. 215–250). For further expositions of
‘Gibsonian’ ecological theory, see Gibson 1979; Michaels and Carello 1981; Turvey
et al. 1981; and Reed 1988.
268. The encoded program can of course be regarded as part of the implementation.
Feature detection theory
The authors’ digital computer ‘model of human visual processing’
suggests that letterforms are recognised by ‘feature detectors’ in the
visual cortex. The model is apparently based on a low-level ‘feature
analysis’ theory, one among several projective ‘feature analysis’ theories
popular among psychologists and computer scientists between the 1960s
and early 1980s. Such theories in turn represent one of several kinds of
approaches to ‘object recognition’.269 More specifically, the computer
model is based on a postulated ‘feature detector’ theory, which was
inspired by David Hubel and Torsten Wiesel’s classic neurophysiological
research of the late 1950s and 1960s, in tracing the visual pathway in
the brain from the retina to the striate (‘visual’) cortex. For this work
Hubel and Wiesel shared half a Nobel prize in 1981. Hubel and Wiesel
implanted micro-electrodes in single cells in the striate cortex in cats
who were anaesthetised, had their eyes immobilised, and received
artificial respiration. They recorded activity produced by relatively
focused retinal stimulations, each of one-second duration (Hubel and
Wiesel 1959, 1962).270
Hubel and Wiesel showed that many or most cortical cells reacted
to specific geometrical features illuminating corresponding receptive
fields (heavily overlapping areas of photo-receptors in the retina which
feed single cortical cells). This applied especially to thin and long
rectangular slits, edges and bars, with specific orientations, that is:
vertical, horizontal, and all possible obliques. Hubel and Wiesel did
not find any predominant orientation (Hubel and Wiesel 1962, pp. 110,
151; Hubel 1988, p. 71). The horizontal and vertical ‘Hubel and Wiesel
line detectors’ applied by Robinson et al. (p. 355) apparently express
269. See for example Bruce and Green 1990, pp. 52–72, 175–202.
270. There is a broad consensus in the literature implying that it is reasonable to assume
that the visual system of humans is not very different from the visual system of cats,
the most popular non-human research species in vision research. However, Kolers
questions the presupposition by arguing that activity shapes structures in the
nervous system (Kolers 1983, pp. 54–55). Nevertheless, for a thorough discussion on
this question, see Crawford et al. 1990.
a ‘feature detector’ theory inspired by Hubel and Wiesel’s neuro-
physiological findings.271
It should be pointed out that since the 1970s, many psychologists
of various theoretical stances have rejected feature detector theories in
general – although admittedly these rejections have been most strongly
directed against higher-level theories. Nevertheless, low-level feature-
detector theory has also been seriously challenged. Moreover, Hubel and
Wiesel’s experimental conditions and set-up have been questioned,272
and doubts have been expressed about what has been described as a
‘reductive’ approach to visual perception.273
If low-level feature detector theory is accepted on its own terms,
then possibly its most cogent rejection is based on what Bruce and Green
refer to as the ‘problem of ambiguity in feature detection’. Although the
responses of cortical cells to edges and lines and their orientation are
prominent aspects of their activity, they are far from the only aspects.
The response from a cortical cell is in fact a multivariate function: that
is, it is interdependent on many variables, for example orientation,
contrast, position, width, length, direction of movement, velocity of
movement, binocular disparity, and vestibular and auditory
stimulation.274 Thus Bruce and Green conclude that ‘we cannot identify
single cells with feature detectors’ (1990, p. 65), and John and Schwartz
conclude that ‘The very richness of the repertoire of trigger features
seems to undermine the theoretical position that has been inferred from
their existence’ (1978, p. 4). In the same vein, Wade and Swanston argue
that although a particular orientation is optimal with regard to response
from a single cell, other orientations, although decreasingly and within
271. These highly metaphorical and projective ‘feature detectors’, which are referred to in
a literal and teleological manner by Robinson et al. (and elsewhere in the literature on
visual perception), seem not to have originated as such with Hubel and Wiesel. No
trace of the explicit term or concept ‘feature detector’ can be found in Hubel and
Wiesel’s classic papers from 1959 and 1962, or in Hubel’s popular Eye, brain and
vision, published in 1988.
272. See for example Kolers 1983, p. 54; Rock 1984, p. 141; and Wade and Swanston 1991,
p. 193.
273. See for example Wade and Swanston 1991, p. 173; John and Schwartz 1978, pp. 24–25;
Marr 1982, pp. 216, 340–341; and Michaels and Carello 1981, pp. 66–69, 174, 184.
274. Bruce and Green 1990, pp. 61–65; John and Schwartz 1978, pp. 3–4.
certain limits, will also produce a response. Accordingly: a whole range of
‘orientation-selective neurons’ will to a varying extent be excited by a
particular line stimulus on the retina (1991, p. 84). Moreover, Michaels
and Carello cogently demonstrate that perceiving higher-order
properties does not entail detecting lower-order properties and doing
computations (1981, pp. 66–69, relying on Runeson 1977).275
This brief exposition of the disagreements about whether or not
‘feature detectors’ can be said to exist suggests that Robinson and his co-
authors’ explanation, which claimed to simply rely ‘on the physiological
structure of the human visual system’ (pp. 354, 356, 359), in fact relies
on a specific and seriously contested theory or theoretical interpretation
of some physiological findings.
It is a mystery why Robinson and his co-authors chose to include
only horizontal and vertical line operators, and not also obliques at
various orientations. The inclusion of obliques would be indicated
by Hubel and Wiesel’s findings, and also by the references made by
Robinson et al. to an earlier study which one of them co-authored (Evans
et al. 1968). The authors of this earlier study refer to ‘straight lines of
various slopes’, and to several line operators that were ‘successfully’
employed in a similar application which to a limited extent managed to
retrieve seriously distorted images of letter-like shapes (Evans et al.
1968, p. 404). So when Robinson et al. say that ‘In Figure 2 it is clearly
shown that serifs perform an important function in a perceptual system
with horizontal and vertical line detectors’, this points to a problem.
What about line detectors at oblique orientations? Surely, this appli-
275. Keith Rayner and Alexander Pollatsek’s widely read textbook The psychology of
reading prominently gives credibility to ‘feature detection theory’ (1989, pp. 11–15).
They conclude: ‘While there are criticisms of feature detection as a theory of object
perception in general, it provides a reasonably satisfactory model of how letters and
words are processed’ (p. 14). Rayner and Pollatsek refer to Hubel and Wiesel as
literally discovering feature detectors, and claim in a circular way that Hubel and
Wiesel’s work is the ‘best known physiological evidence in favour of feature detection
theory’ (p. 12, my italics). In fact it is the feature detection theory in question that is
an interpretation of Hubel and Wiesel’s findings. However, it should be noted that
Rayner and Pollatsek’s stance first of all seems to reflect their rejection of more
implausible ‘template matching’ theories of human ‘pattern’ or ‘object recognition’
(described on pp. 11–15; also in Bruce and Green 1990, pp. 180–182).
cation of line operators at only two orientations could pose a damaging
threat to the ecological validity of their research.
With regard to the widely acknowledged complexity276 of the
‘global’ human visual system it seems that the research in question by
Robinson et al. is based on a very poor model basically limited to only
two kinds of ‘feature detectors’, and furthermore detectors from only one
particular location in the global human visual system. This can only be
amplified by a point made by Hubel and Wiesel in 1962 (p. 141), and
which Hubel repeats in his Eye, brain and vision: ‘The striate cortex is
just the first of over a dozen separate visual areas, each of which maps
the whole visual field’ (Hubel 1988, p. 219). Hubel modestly adds:
We are far from understanding the perception of objects, even such
comparatively simple ones as a circle, a triangle, or the letter A –
indeed, we are far from even being able to come up with plausible
hypotheses. (p. 220)
Degradation theory
A traditional view of visual perception since Descartes, still current in
various influential guises,277 is that an impoverished and insufficient
two-dimensional retinal image has to be reconstructed by cognitive
operations into a rich three-dimensional image in the brain.278 Robinson
and his co-authors offer an argument based on what seems to be a
similar view.279 By referring to the fact that there are far fewer fibres in
the optic nerve leading from the eye than there are photoreceptors (rods
and cones) in the retina, they imply that the transmission capacity of the
276. See for example Uttal 1990.
277. Also reflected in literature not primarily concerned with visual perception, for
example in Richard Rubinstein’s admired Digital typography: an introduction to type
and composition for computer system design. Writing about legibility and visual
perception, he states: ‘In fact the “wiring” of the retina performs a great deal of
feature detection and signal processing, creating a partially processed image to send
on to the brain for higher-level analysis’ (1988, p. 28).
278. This view is vigorously contested by the ‘Gibsonian’ ecological school. See for example
Gibson 1979; Michaels and Carello 1981; Turvey et al. 1981; and Reed 1988.
279. See Robinson et al. 1971, p. 354 and passim. See also Evans et al. 1968, passim.
optic nerve is too low.280 They further imply that, therefore, the
‘information’ (or nerve impulses) necessarily has to be enhanced when
reaching the visual cortex, and that it is the ‘Hubel and Wiesel line
detectors’ which are effective in ‘retrieving degraded patterns’ and
preserving the main features of letters. Nevertheless, they still refer to
the resulting ‘neural image’ as being degraded to a certain extent. The
idea seems to be this: line detectors and serifs are naturally and
especially adapted to each other in such a way that at the stage of a
resulting ‘neural image’ in the brain, recovered letterforms with serifs
will still be partly degraded (the serifs seem to have gone), but they will
be degraded to a lesser extent than letterforms without serifs (where
part of the main strokes and other structural elements such as the
middle bar in the E have gone) (p. 358).
The authors’ explanation of why, after all, degraded ‘neural images’
of sans serif letters do not have a ‘disastrous’ effect on legibility is more
fantastic than serious:
… the considerable influence of context. For example, if one
erases one third of the letters in a sentence it is still readable:
An *xa*pl*of*se* te*ce*it* mi*si*gl*tt*rs. (p. 359)
I will limit myself to one further comment on the ‘degradation’
line of argument. With regard to the discrepancy in the ratio of photo-
receptors in the retina to fibres in the optic nerve, David Hubel himself
gives a plausible anatomical explanation for why this condition does not
necessarily result in inadequate transmission capacity, and how detailed
visual information can be preserved ‘without our having hopelessly crude
vision’ (1988, pp. 37–39). His explanation is simply that the details of
connections between the retina and the optic nerve vary according to the
distance from the fovea, ‘which corresponds to exactly where we are
looking – our center of gaze, where our ability to make out fine detail is
highest’. That is, that in the fovea a single receptor connects with a
single nerve fibre, and that further out in the periphery of the fovea
there is an escalating degree of convergence from many receptors to
single nerve fibres.
280. There are about 1 million fibres in the optic nerve, and 125 million photoreceptors in
the retina.
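Hubel’s point can be put in round numbers, using the figures in note 280 (the division is mine; the foveal one-to-one wiring is as Hubel describes it):

```latex
% Average convergence over the whole retina (figures from note 280):
\frac{125 \times 10^{6}\ \text{photoreceptors}}{1 \times 10^{6}\ \text{optic nerve fibres}}
\;\approx\; 125\ \text{receptors per fibre on average}
% whereas in the fovea the wiring approaches one receptor per fibre, so the
% global average says little about the capacity available at the centre of gaze.
```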
Implementation, interpretation, and reasoning – some remarks
The implementation of Robinson and co-authors’ theory and model
is problematic. Not only is the very crude dot matrix representation
of the ‘stimulus-material’ dubious, but so is the decision to represent
seriffed typefaces by a monoline and monospaced typewriter typeface
with highly exaggerated serifs. One cannot possibly induce, as the
authors do, from the specificity of IBM Selectric Courier (or similar
typefaces) to seriffed (i.e. roman) typefaces in general. Furthermore,
it is an open question whether the letters T, E, f and h represent an
appropriate sample of the 52 most common alphabetic glyphs.
While the authors state that ‘In Figure 2 it is clearly shown that
serifs perform an important function in preserving the original image
of a small letter’ (p. 358), the figure actually shows us that of the four
letters T, E, f and h, it is only the E which is ‘considerably degraded’.
The E has lost its skeletal/structural form, by losing its original middle
crossbar. The other three letters seem only to have contracted slightly
(see figure 10).
Further, the authors’ statement that ‘Serifs are not useful when
large letters are presented to a line detector system, as shown in
Figure 3 … [and] do not help to preserve the main features of a letter’
(p. 358), is in fact an understatement. If we are to judge according to
what Figure 3 actually shows, we have to conclude that serifs are not
only ‘not useful’ and ‘not helpful’, but directly damaging (see figure 11).
This conclusion is amplified by the additional verbal information given
by the authors: that similar results occurred in the three other letter
pairs.
Finally, the rationale of Robinson and co-authors’ study seems to
be based on a circular and slightly absurd argument. They claim that
since seriffed typefaces are more often preferred to sans serif typefaces,
seriffed typefaces must necessarily possess an intrinsic quality more
easily adapted to how the human visual system works. Then, in their
concluding discussion, the authors claim that readers’ preference for
seriffed typefaces supports their theory (p. 359).
In fact, the paper’s line of logical reasoning is not sound. It goes like
this: seriffed typefaces are, without any obvious reason, more popular
than sans serif typefaces. The theory (interchangeably referred to as an
‘explanation’ and proclaimed to ‘depend on the physiological structure
of the human visual system’) says that serifs are important because they
help to preserve the original shapes of letterforms ‘during neural
processing’. The computer model of the theory is employed in order
to test the theory. The computer simulation shows that serifs are useful.
What is more, the popularity of seriffed typefaces gives further support
for the theory. The theory is confirmed. However, simply to demonstrate
that a model can work in a certain way does of course not demonstrate
that what is being modelled works in the same way.
Theoretical assumptions or physiological facts?
It should by now be clear that Robinson and co-authors’ explanation
relies first and foremost on an intervening multi-level chain of
theoretical constructs and assumptions (right or wrong), and not, as they
claim, simply on ‘the neurological structure of the human
visual system’. In short: what has been described above seems to be
reasonably encapsulated by the following complaint:
In psychology the distinction between the perceptual algorithm and the
biological hardware on which it hypothetically runs has not been so
clearly maintained. Thus several theoretical terms are ambiguous:
‘Feature detectors’ [etc] … suggests physiology but are in fact aspects of
certain algorithms that hypothetically explain processing. … When
theoretical terms masquerade as physiological ones, speculation
advances unchecked by reliable biological underpinnings. In such
circumstances the purely theoretical components of research may not be
properly tested, and the physiological ones fail to be verified by
physiological experiment. (Banks and Krajicek 1991, p. 307, my italics)
Furthermore, not only does the authors’ explanation rely on a chain
of theoretical assumptions while purporting to rely on physiological facts,
but also on poor logical reasoning, a poor model, a crude implementation
of the model, and a dubious interpretation of the results. Much seems to
be wrong. To conclude: Robinson and co-authors’ paper demonstrates
lack of ecological validity and is unconvincing.281
281. In defence of the paper it can be pointed out that it is not based on a pure top-down model, frequently used in cognitive science (Wilkes 1990), where the theory (of what must be going on) is the sole starting point (for making algorithmic descriptions of how), and where physiological data is regarded as irrelevant.
The paper’s reception
All the documents I have come across which cite ‘Why serifs are
important’ accept it more or less uncritically.282 The only somewhat
cautious citation is one in (several editions of) Baird and Turnbull’s
textbook The graphics of communication. Although they treat it as an
important paper, they do refer to it as being ‘based on how the visual
system is believed to function' (1975, pp. 64–65, my italics).
Rolf Rehe discusses the relative legibility of roman and sans serif
typefaces in his book on legibility research, which has been published in
at least five editions in English and one in German. He refers to ‘Why
serifs are important’ in a positive way as ‘an extensive study, [which]
suggested that “the neurological structure of the human visual system
benefits from serifs in the preservation of the main features of letters”’
(1984, p. 32). A similar mention can be found in a fairly recent American
master’s thesis on computer screen legibility which compares roman and
sans serif typefaces (Williams 1990, p. 12). The same applies to a recent
American doctoral dissertation on the legibility of roman versus sans
serif typefaces (Kravutske 1994, pp. 14–19).283
Pedersen and Kidmose are even more positive. They explain that
Robinson et al. ‘decided to establish their [i.e. serifs’] importance’ (1993,
p. 70, my italics). And de Lange, Esterhuizen, and Beatty, in a paper in
the journal Electronic Publishing, claim that this ‘interesting study
282. Admittedly, I do not take into account a possible tacit negative reception. On the
other hand, I do not claim that my list of papers which uncritically cite ‘Why serifs
are important’ is exhaustive. For the sake of order: Jeremy Foster’s annotated
Legibility research abstracts 1971 contains a neutral description of ‘Why serifs are
important’ (1972, p. 29). Michael Macdonald-Ross and Eleanor Smith’s partially
discursive bibliography Graphics in text contains a simple entry (1977, p. 46). ‘Why
serifs are important’ was ignored by the cognitive psychologists who, with various
aims, worked on ‘letter perception’ or ‘letter recognition’ in the 1970s and early 1980s.
It may have been ignored because it was deemed to be improbable, or because it was
seen as irrelevant (having too much of a machine-like optical character recognition
approach). Nevertheless, the research on ‘letter perception’, largely based on various
‘feature models’, addressed a different agenda, and it also suffered from its own
validity problems. For an example of such research, see McClelland and Rumelhart’s
well-known paper (1981).
283. Robinson et al.'s method of investigation is seriously misinterpreted by Kravutske.
determined the importance of serifs in the perception of individual
letters’ (1993, p. 242, my italics). Another research paper which refers to
the Robinson et al. article attaches great importance to it (Gallagher and
Jacobson 1993, p. 101).284
Although Karen Schriver, in her ambitious and well-publicised
Dynamics in document design: creating text for readers, calls attention
to some of the limitations of legibility research in general as well as its
lack of homogeneous results (1997, pp. 276–277, 301), she nevertheless
refers to ‘Why serifs are important’ in such a manner that it can only
be perceived as an important paper (pp. 274, 276, 301).
The re-design of the journal Radiology back in 1982 provides an example where legibility research has explicitly guided, or at least post-rationalised, design decisions. The editors of the journal explained: 'Our
decision [to use a typeface with serifs] was not arbitrary but based on
evidence that serifs enhance legibility’ (Eyler and Stewart 1982, p. 248,
my italics). The ‘evidence’ referred to in the editorial is Robinson et al.’s
paper.
It is tempting to ask why a study like the one in question can
receive such uncritical acceptance. One of several possible answers is
simply the seductive appeal, at least for outsiders, of the authority of
scientific language, and in particular the language of cognitive science.
Sidney Berger, himself a scholar in the humanities, puts it this way:
‘Robinson … gives us a scientific reason for preferring serifs: “the
neurological structure of the human visual system benefits from serifs in
the preservation of the main features of letters during neural
processing”’ (1991, p. 7, my italics). Another reason is simply the way
‘research findings’ often get summarised:
Unfortunately, what is served in the guise of a literature review
is, in many instances, little more than a tiresome listing of studies and
findings without even a hint that the author has been thinking when
reading the literature, not to mention critically evaluating what he or
she has been reading. (Pedhazur and Schmelkin 1991, p. 191)
284. Although Gallagher and Jacobson's paper is seemingly serious and thorough, it offers a remarkably flawed reading of Robinson et al.'s paper.
Finally: the paper’s uncritical reception is obviously related to the fact
that many of the underlying premises of the authors’ approach, though
seriously contested, are still very influential and even dominant.
Certainly, ‘Why serifs are important’, with its cognitive science
approach, and its computer-modelling method of investigation, did not,
after all, answer the question of which are more legible: typefaces with or without serifs? My investigation of this study and of its reception can
only reinforce the truism that the translation of ‘findings’ from experi-
mental research to the practice of typographic design has to be done with
great caution. And, as I believe I have shown, that translation is even
less assured when experimental validity cannot be taken for granted in
the first place.
Harris 1973*
In 1973 John Harris published the paper ‘Confusions in letter
recognition’ in the trade journal Professional Printer. It was based on
work carried out at the Computer Science Division of the National
Physical Laboratory, and with support from Watford College of
Technology.
Although the aim of this study was not primarily to investigate the
relative legibility of serif and sans serif typefaces, it is of interest to us.
Harris’ starting point is that Poulton’s research confirms that some
typefaces are more legible than others. Harris is however more
ambitious, and he wants to find out which features of a typeface
contribute to its legibility. Although he acknowledges that adults do not
read letter by letter, he nevertheless suggests that the key to the
legibility of a typeface is found in the forms of its individual characters.
He supports his view by pointing out that Poulton’s experiments
involved exactly the same pieces of text, only set in different typefaces.
Therefore, he reasons, the differences in legibility found by Poulton must
to a large extent be due to the form of the individual characters.
* J. Harris. 1973. 'Confusions in letter recognition'. Professional Printer, vol. 17, no. 2, pp. 29–34.
Thus, his study is based on the identification of individual letters of three of the typefaces employed by Poulton: the sans serif typefaces Gill Sans medium and Univers medium, and the roman typeface Baskerville.
The operational criterion employed is the time of exposure method,
performed with a ‘two-channel’ tachistoscope, on characters with
equalised x-height.
Harris refers to evidence which shows that it is very likely that the time of exposure method, as well as the variable distance method, favours typefaces with relatively large stroke width. He points out that the employed variants of Gill Sans and Univers are of almost identical stroke width. He slightly increases the exposure time for Baskerville to
compensate for its slightly thinner main strokes. After having performed
the experiments, he then takes into consideration the frequencies in
‘English prose’ of the individual letters in order to calculate the relative
legibility among the three typefaces when used in ‘text’.
With regard to Gill Sans, his results are in accordance with
Poulton’s. Single letters in Gill Sans are more easily identified than
those in the other two typefaces. Harris asserts that the superiority of
Gill Sans over Univers is concentrated in the letters c, e, f, and t. The
ranking order from Poulton’s experiment with regard to Univers and
Baskerville is however not maintained. Harris calculates the ‘deficit in
text’ of Univers compared to Gill to be 4.19% in Gill’s favour, and of
Baskerville compared to Gill to be 11.13% in Gill’s favour. The author
promptly gives an explanation: the apparent inferiority of the roman typeface Baskerville is caused by the 'threshold effect' on the thin strokes of certain letters, something which will probably have no effect when the typeface is read as continuous text. This explanation seems to undermine the rationale of the study.
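Harris's frequency-weighting step can be illustrated by a minimal sketch in Python. The letter scores and frequencies below are invented placeholders, and the weighting formula is a plausible reconstruction of the kind of calculation described, not the formula actually printed in the paper:

# Illustrative reconstruction of a frequency-weighted 'legibility in text' score.
# All numbers are placeholders, not Harris's data.
letter_freq = {'c': 0.028, 'e': 0.127, 'f': 0.022, 't': 0.091}   # approximate English frequencies

scores_gill = {'c': 0.95, 'e': 0.97, 'f': 0.93, 't': 0.96}       # hypothetical identification rates
scores_univers = {'c': 0.88, 'e': 0.91, 'f': 0.90, 't': 0.92}

def weighted_score(scores, freq):
    """Identification score weighted by how often each letter occurs in prose."""
    return sum(scores[ch] * freq[ch] for ch in freq) / sum(freq.values())

gill = weighted_score(scores_gill, letter_freq)
univers = weighted_score(scores_univers, letter_freq)
deficit = (gill - univers) / gill * 100   # 'deficit in text' of Univers relative to Gill Sans, in per cent
print(f'Deficit of Univers relative to Gill Sans: {deficit:.2f}%')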
Harris refers to Tinker, Paterson and Webster, who found neither
significant differences in reading speed nor any correlation between
distance legibility and speed of reading legibility, and he objects to their
methodology. He suggests that the very short periods of reading employed by Tinker and Paterson in their experiments were not long enough to produce significant differences. He also suggests that
differences in stroke width may have confounded the results in their
distance experiments.
Harris also analyses the confusions of individual letters by his
subjects, i.e. mistaking one letter for another. By doing this he arrives at
some interesting conclusions (p. 32):
– serifs on adjacent verticals, like h, n, and u, increase uncertainty about identity: 'Confusions of b for h and n for u are significantly more likely to happen in Baskerville.'
– serifs on isolated verticals may have a beneficial function: 'The letter i was correctly identified more often in Baskerville than in other faces, perhaps because the top serif serves in some way to emphasise the gap between the vertical and the dot. The scores for confusions of i with l and j with l, when combined, were significantly lower in Baskerville.'
– the gap size of the semi-enclosed counters of c and e is important: 'The letters c and e are less well-recognised in Univers than in Gill. … In Univers this gap is much smaller'.
With regard to the perceptive last point made by the author, he nevertheless overlooks that it is not only the gap size which represents a considerable difference between the letters in question of Gill Sans and Univers: the curved end strokes of the same circumference are cut at different angles, vertically ( | ) for Gill Sans and horizontally ( – ) for Univers respectively.285
The author first concludes that ‘the evidence suggests that the
legibility of single letters may be a major determinant of readability’.
Then he more cautiously concludes that ‘Some evidence has been found
to support the view that at least one of the variables underlying the
readability of a typeface is the ease with which its individual characters
can be recognised’ (p. 33). He supports this statement by pointing out
that his results with regard to single letter identification are in
accordance with Poulton’s results generated from reading continuous
text. Nevertheless, he cautiously points out that other determining
variables may exist, and that other experimental procedures (like visual
scanning of arrays of letters) might give other results.
285. For an extended discussion on these morphological or topological properties of sans serif typefaces, see Lund 1993.
Hvistendahl and Kahl 1974*
In 1974 J.K. Hvistendahl, professor of journalism at Iowa State University, and Mary R. Kahl at Kansas State University published their study 'Roman v. sans serif body type: readability and reader preferences'. It
was published in the News Research Bulletin of the American Newspaper
Publishers Association, and later reprinted in the journal Typographic,
published by the International Typographic Composition Association, in
the United States.286
This experimental study was undertaken at the request of the newspaper Minneapolis Tribune, which wanted advice on whether or not
sans serif typefaces are suitable and legible for continuous newspaper
text. This was at a time when many magazines in the United States had
switched from roman type to sans serif type, and many newspapers had
been experimenting with sans serif type. It is not unlikely that the possibility of switching to a sans serif type for 'body text' was seriously considered by many American newspapers in the mid-1970s, a period when radical re-designs of newspapers became more and more usual.
Hvistendahl and Kahl’s study, on the legibility of sans serif
typefaces compared to roman typefaces in newspapers, was a two-part
study one experimental part measuring the speed of reading, and one
part recording the reader’s preferences.
The experimental study, employing altogether 200 subjects,
measured the time it took the subjects to read ‘at normal speed’
newspaper articles of a certain number of words. No comprehension
check was involved. Four different articles of different length were each
* Hvistendal, J.K., and Mary R. Kahl. 1976. 'Roman v. sans serif body type; readability and reader preference', Typographic, vol. 8, no. 2, 4 pages. Published by the International Typographic Composition Association, in the USA. (According to the article in Typographic, and other sources, it was reprinted from The ANPA [American Newspaper Publishers Association] News Research Bulletin, no. 2, 1975, pp. 3–11. However, the British Library's interlibrary loan services found no trace of the article in the 1975 volume (nor in the 1974 and 1976 volumes) of the bulletin. The article
is probably based on Mary Ruth Luna Kahl’s M.S. thesis: ‘A study of reading speed
and reader preferences between roman and sans serif type’, Iowa State University,
1974).
286. Not to be confused with the British journal Typographic of the Society of Typographic
Designers.
printed in one roman version and one sans serif version. Altogether eight typefaces were employed: four dedicated newspaper typefaces with serifs,287 and four sans serif typefaces. However, each typeface from the one category was compared to only one from the other category. Thus:
– Intertype's newspaper typeface Imperial was compared to the sans serif typeface Helvetica;
– Royal, Intertype's version of Linotype's newspaper typeface Corona, was compared with the sans serif typeface Futura;
– the newspaper typeface News no. 2 was compared to the sans serif typeface Sans Heavy;
– the newspaper typeface News Bold was compared with News Sans, a News Gothic look-alike.
The first two pairs listed above were set at a column width of 10.5 picas, while the other two were set at a column width of 14 picas. The
‘stimulus material’ was set by newspapers, and printed in specially made
‘experimental’ tabloid newspapers.
The result, as interpreted by the authors, is that the newspaper text set in roman type was read significantly faster in two of the four comparisons; faster, but not significantly so, in one; and at more or less the same speed as the sans serif type in one. With regard to the preference recordings,
a clear majority preferred roman to sans serif type. The authors also
found that the widest columns gave the best results. Their conclusion is
clear: ‘a well designed Roman type in combination with wider columns
would produce optimum legibility’, and further that ‘Until more evidence
is available, newspaper typographers would be advised to take a cautious
approach toward using sans serif type for textual matter.’
Yet, it is worth noticing that both the 'significantly faster' (in two of the comparisons) and the 'not significantly faster' (in one of the comparisons) results represent a difference of less than two seconds in roughly one minute of reading time. It is therefore again appropriate to stress that statistical significance does not correspond to practical
287. For a general introduction to the history of dedicated newspaper typefaces, see Level 1989; and for a comprehensive visual index, see Gürtler 1988.
significance. The more or less identical result (in the fourth comparison) amounts to 1/5 of a second in favour of the sans serif typeface being compared.
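The practical size of these differences can be made concrete with a trivial calculation; the one-minute baseline is only approximate, taken from the discussion above:

# Relative size of the reported reading-time differences, assuming roughly one
# minute of reading time per article (an approximation, not an exact figure).
reading_time_s = 60.0
for label, diff_s in [("'significantly faster' comparisons", 2.0),
                      ("'more or less identical' comparison", 0.2)]:
    print(f'{label}: {diff_s} s is about {diff_s / reading_time_s:.1%} of the reading time')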
There is one confounding factor which poses a damaging threat to
the internal validity of the experiments. One of the ‘comparisons’
yielding ‘significant’ differences is illustrated in Hvistendahl and Kahl’s
article showing News no. 2 and News Sans. This illustration clearly
shows that while the nominal line spacing, and probably the nominal
point size as well, is the same in both examples, the x-height of the sans
serif is considerably larger. Thus, the roman typeface looks generously
linespaced, displaying distinctive articulated lines of text. The text set in
the sans serif typeface looks badly in need of more interlinear spacing, or
better, a smaller point size, which visually would make it match the text
set in roman type.
To conclude: the internal validity of this ‘controlled readability
experiment’ is flawed with respect to the integrity of its stimulus mater-
ial. The typographic variables are ‘controlled’ with respect to nominal
and arbitrary quantities but not to actual and visual quantities. It is
thus impossible to decide whether the results are based on differences in
typeface style or differences in the x-height to line-space ratio.
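The nuisance variable in question can be made concrete with a small sketch: two typefaces set to the same nominal size and line spacing can differ markedly in the visually decisive x-height to line-space ratio. The x-height factors below are invented placeholders, not measurements of the typefaces used by Hvistendahl and Kahl:

# Same nominal size and line spacing, different apparent size.
point_size = 9.0    # nominal type size, in points
line_space = 10.0   # nominal baseline-to-baseline distance, in points

x_height_factor = {               # x-height as a fraction of nominal size (hypothetical values)
    'News no. 2 (roman)': 0.45,
    'News Sans (sans serif)': 0.53,
}

for face, factor in x_height_factor.items():
    x_height = point_size * factor
    print(f'{face}: x-height {x_height:.2f} pt, x-height/line-space ratio {x_height / line_space:.2f}')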
In the end, the authors explicitly make a small but not unimportant one-word reservation about generalising from the specific typefaces employed in the experiment to typeface categories: 'If the type faces used
in this study are typical, it can be said that on the whole Roman type is
more readable when used for textual matter than is sans serif type.’288
In the mid-1980s, in a manual on Typography and design for
newspapers, published by the IFRA, the International Association for
Newspaper and Media Technology, Hvistendahl and Kahl’s study is the
only legibility study referred to in the section on ‘text type’:
Sans serif type designs are not the best choices for newspaper text type.
The lack of serifs causes an impersonal, monotonous look, resulting in
lower reader appeal. Prof. Hvistendahl of the University of Iowa found,
in a 1974 study, that readers prefer serif over sans serif type in
newspaper text matter. (Rehe 1985)
288. On the first page (of this unpaginated article), my italics.
Vanderplas and Vanderplas 1980*
Title of the paper: ‘Some factors affecting legibility of printed material
for older adults’. Published in the journal Perceptual and Motor Skills.
Authors: James M. Vanderplas and Jean H. Vanderplas at Washington
University, St. Louis.
The rationale provided for this study is reasonable. The authors
display a good command of relevant literature, and they argue that
almost all previous legibility studies have employed as subjects college
students, young adults, children or military personnel. The many visual
impairments that are common among older people suggest that
recommendations based on studies of young adults may not be wholly
applicable for older adults. The authors make a case about older people
who often need to engage in crucial and demanding non-leisurely reading
activities, for example of prescription labels, medicinal instructions,
contracts, and application forms. They suggest that small type or
inappropriate typefaces or particular layout-solutions may be ill suited
for the visual needs of the elderly.
The study seems to be thorough with regard to its experimental
design. Furthermore, the authors reflect in a mature manner on their
object of study. The study consists of two experiments based on
recordings of reading speed, as well as preference ratings. The first
experiment includes the variables type size (several sizes), typeface
‘style’ (‘Roman’ and ‘Gothic’, i.e., serif and sans serif) and typeface ‘font’
(the serif typefaces Century Schoolbook, Times Roman and Bodoni; and
the sans serif typefaces Helvetica, Trade Gothic and Spartan).289
* Vanderplas, James M., and Jean H. Vanderplas. 1980. 'Some factors affecting legibility of printed material for older adults'. Perceptual and Motor Skills, vol. 50, pp. 923–932.
289. Trade Gothic is equivalent to American Type Founders’ well known sans serif
typeface News Gothic. The Spartan typeface in question is an equivalent to the Bauer
type foundry’s well known sans serif typeface Futura (cf. Wheatley 1983).
The second experiment deals with line length and interlinear
spacing.290 Twenty-eight subjects between the ages of 60 and 83
participated in the first experiment (the one primarily of interest to us).
The material presented to each subject was contained in a 90-page
booklet with 30 combinations of type size, type style and typeface. Each set of three pages consists of one page with prose passages (i.e., the stimulus material), one page with a multiple-choice comprehension test (for control purposes only), and one page with a questionnaire for subjective preference ratings.
Among other things, the results
indicates that Roman styles are associated with higher over-all reading
speeds than Gothic styles, with Century Schoolbook most consistently
associated with the higher speed. (p. 927)
However, the authors add some important qualifications: two
typeface/size combinations resulted in higher reading speeds than
others, among them the sans serif typeface Helvetica in 16 point and the
serif typeface Bodoni in 16 point. Increase in type size generally
produced higher reading speed, except for the 14-point type which
produced a decrease in reading speed.
The authors conclude:
If one were to use the present results as a basis for recommendations,
one might not be ill-advised to choose Century Schoolbook or a similar
Roman type style over Gothic styles, with size in the 12- to 14-point
range, as opposed to the usual choice of 8- to 12-point type for younger
persons. (p. 931)
The authors are, however, aware that these style/size combinations will also interact with combinations of line length and interlinear spacing, and possibly produce additional variations in performance and
preference. Furthermore, the authors admirably point out that this
study only included a small fraction of available typefaces of which many
290. The article is not illustrated, but information is given about how to obtain photocopy
or micro-fiche samples of the stimulus material. The authors’ verbal description of the
stimulus material suggests a wise elimination of typographic nuisance variables, for
example by proportionalization of type size to line length from page to page of the
stimulus material; in accordance with recommendations given by Tinker in his 1963
Legibility of print.
might be more legible or less legible than the ones employed. With these
qualifications in mind, they suggest that instead of relying on existing
empirical research results, one should rather be prepared to test for
legibility any particular material that is being prepared for print.
Interestingly, this idea (expressed in 1980) is similar to today’s ‘usability
testing paradigm’ in information design.
Suen and Komoda 1986*
Title of the paper: ‘Legibility of digital type-fonts and comprehension in
reading’. Published in Text processing and document manipul ation,
edited by J.C. van Vliet, and published by Cambridge University press
on behalf of The British Computer Society. Authors: C.Y. Suen and
M.K. Komoda, at with Concordia University in Montreal, respectively at
the Computer science department and the Psychology department. This
paper, which is published as part of the proceedings from an inter-
national electronic publishing conference, presents two experiments.
The authors argue that advances in computer technology have
brought many new methods of conducting research in typesetting and
typography and that this especially benefits evaluation of ‘type-font
design’. Characters from typefaces can be input by an optical scanner
and converted into digital images that can be easily reproduced by
computers and matrix and laser printers (the authors seem unaware of the contemporaneous existence of high-resolution digital typefaces). Therefore, the authors argue, legibility can be studied much
more rigorously than before, because variables such as ‘font style’,
character shape, format and spacing, exposure time, and presentation
speed, can be controlled precisely by the computer. However, when it comes to the authors' own two experiments, it is only exposure time and presentation speed that are in the end relevant. Furthermore, although
* C.Y. Suen and M.K. Komoda. 1986. 'Legibility of digital type-fonts and comprehension in reading'. In Text processing and document manipulation. Proceedings of the international conference, University of Nottingham, 14–16 April 1986. Edited by J.C. van Vliet, pp. 178–187. Cambridge: Cambridge University Press on behalf of The British Computer Society.
the text refers to an illustration of the typefaces employed, the
illustration is not present. It is therefore impossible to judge visually
whether potential nuisance variables are ‘controlled precisely’ or not.
Two ‘typewriter’ typefaces and one primitive printer typeface are
studied, respectively Letter Gothic, Courier, and DECwriter. The Letter
Gothic font represents sans serif typefaces and Courier represents serif
typefaces, whereas DECwriter (with absence of real ascenders and
descenders) represents alternative coarse 'dot-matrix' typefaces. All three fonts have been digitised by an OCR device, with the resulting representation of each letter positioned within a matrix of 21 × 42 cells, presented on a 'high resolution' computer screen. All three fonts are
typewriter/printer fonts, and all are probably monospaced. The fact that
Courier, as opposed to most serif fonts, is monoline, and that its serifs
are highly exaggerated (in order to compensate for it being monospaced)
does not seem to bother the authors.
The first experiment investigated the legibility of individual letters
(both capitals and small letters), while the second investigated the
legibility (‘readability’) of continuous texts.
The first experiment was based on tachistoscope-like short
exposures of individual letters on a computer screen followed with a
masking stimulus after three different periods of time, ‘the inter
stimulus interval’ (0, 16.7 and 33.3 ms). The approach is inspired by
work by the cognitive psychologist Ulrich Neisser, and is referred to as
‘the visual backward masking paradigm’.
the legibility of letters is increased due to the increased amount of time
available for the processing of the letters before they are degraded by
the mask. With respect to the present experiment, then, letters
presented in inherently more legible fonts should lead to superior
performance, especially at the shorter ISIs [inter stimulus intervals].
(p. 179)
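The trial structure of the paradigm can be sketched roughly as follows. The inter-stimulus intervals are those reported by the authors; every function name and the simulated response are merely illustrative stand-ins for real experimental apparatus, not a description of Suen and Komoda's implementation:

import random
import time

ISIS_MS = (0.0, 16.7, 33.3)   # inter-stimulus intervals used in the experiment

def show_stimulus(letter, font):
    """Stub: a real experiment would briefly flash the target letter in the given font."""

def show_mask():
    """Stub: a real experiment would display a pattern mask."""

def collect_response(target):
    """Stub: a simulated 'subject' answers here; a real experiment records a keypress."""
    return target if random.random() < 0.8 else '?'

def run_trial(letter, font, isi_ms):
    show_stimulus(letter, font)     # brief exposure of the target letter
    time.sleep(isi_ms / 1000.0)     # blank inter-stimulus interval before the mask
    show_mask()                     # the mask interrupts further visual processing
    return collect_response(letter) == letter

# The comparison rests on the proportion of correct identifications per (font, ISI)
# cell, for each ISI in ISIS_MS; inherently more legible fonts are expected to do
# better at the shortest intervals.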
Six subjects, with normal vision, participated in the experiment.
The results show that with no interval (0 ms) between exposure and
masking, performance was highest for Letter Gothic and poorest for
DECwriter. With the longest interval (33.3 ms) performance was similar for all three fonts.
These results, then, suggest that the sans serif font Letter Gothic is the
most legible among the three fonts and that the DECfont is the least
legible among the three. (p. 180)
The second experiment, however, is based on an acknowledgement that it is 'not clear' whether the results from the first experiment 'can be generalized to reading' (p. 184). The authors refer to the cognitive
scientist David Rumelhart’s suggestion that ‘reading is accomplished
through a set of interactive processes’ where skilled readers constantly
employ their pre-established knowledge of language to recognise
individual words while reading. The authors here echo an insight that some commentators have expressed in various ways since the latter part of the 19th century: that 'factors' such as typeface are very much
peripheral to the reading process, and thus that it will be extremely
difficult to quantify any meaningful differences in legibility by various
operational methods.291 However, the authors do not seem to realise the
implications of this ‘insight’, and go on devising yet another laboratory
experiment, this time maybe one that has marginally more ecological
validity.
In this second experiment, which is basically a comprehension test,
the subjects read paragraphs taken from Reader’s Digest. The words are
presented on a computer screen one at a time at a rate of 600 words per
minute, and questions for checking comprehension have to be answered.
This method is called ‘the rapid sequential visual presentation (RSVP)
procedure for comprehension’. Thirty-six subjects with normal or
corrected vision participated. Their reading proficiency was tested
beforehand.
The results: As in the first experiment, the DECwriter font appears
to be the least legible. The authors suggest that the explanation lies in the fact that the DECwriter font is a coarse dot-matrix font and that its small letters lack ascenders and descenders. They also point out that the
results of the two experiments are not entirely consistent: whereas the
sans serif font Letter Gothic was most legible in the first experiment, in
the second experiment ‘very little, if any, differences was observed in the
reading performance of texts presented in’ the serif and the sans serif
291. See the section ‘Peripherality to the reading process’, in chapter 4.
typeface. They therefore suggest that ‘the reading skills and strategies,
brought to the reading situation can apparently attenuate the effects of
the purely visual characteristics of the text being read’ (p. 186). Although
this ‘reading situation’ certainly is an extreme laboratory situation, their
suggestion is plausible: it is exactly the extreme unfamiliarity of the
DECwriter font that has played a role in a ‘more’ real reading situation
than in the first experiment.
Bearing in mind that this paper was probably written in the first half of the 1980s, one of its conclusions seems appropriate and timely: the authors question the adequacy of contemporaneous primitive dot-matrix printer fonts, such as the DECwriter font, that lack proper ascenders and descenders.
Taylor 1990*
Jeffrey Taylor is preoccupied with ‘the learning process and how effective
[sic] the transfer of information takes place’ (p. 3). In the rationale
provided for this study, Taylor describes a new situation after the advent
of ‘desktop publishing’, where authors who produce manuals and
technical documentation are making all the decisions, from content to
printing style. Questions such as what typeface to use, which 'have traditionally been the domain of publishing houses and printing companies', are now dealt with by the authors. However, authors and
their organisations
do not have a great deal of research to guide them in making basic
decisions about how to prepare effective and efficient technical manuals.
(pp. 1–2)
* The title of this PhD thesis: ‘The effect of typeface on reading rates and the typeface
preferences of individual readers’. Author: Jeffrey Lynn Taylor. Dissertation
submitted to Wayne State University in Detroit. Major: ‘instructional technology’.
This is the second of three recent PhD theses on the relative legibility of serif and
sans serif typefaces, submitted under the label ‘instructional technology’ at Wayne
State University (the other two are Stahl 1989, and Kravutske 1994).
And Taylor goes on:
There is a plethora of advice, but little in the way of empirical studies to
help make the decision on serif versus sans-serif typeface. Which
typeface, serif or sans-serif, is easiest to read? (p. 4) … The reference
material and literature, currently available, are filled with opinions,
suggestions and inferences regarding the use of a serif or a sans serif
typeface. Most of these opinions are intuitive based or based upon
centuries of past practice by the printing and/or publishing industry.
The past practice is usually provided as the rationale for choosing either
the serif or sans-serif typeface. The printing and/or publishing industry
has traditionally been responsible for the look of texts and technical
documentation. They take the written material and apply their skills to
make the printing of texts and textual material ‘look’ good from a page
perspective. Little effort was taken to evaluate the design of these
printed textual materials from the learners perspective. (p. 15)
And, not unexpectedly, Taylor contrasts the situation he describes with
the potential of his findings:
The significance of this study lies in its potential for providing a
foundation for developing effective and efficient texts, documentation,
and technical manuals. (p. 11; and in the abstract, p. 86)
Taylor's study is based on an unsubstantiated postulate and adjoining premise: that the most basic decision to be taken when designing text is whether to use a serif or a sans serif typeface:
The process of producing texts, technical documentation, or manuals
should begin with the decision that is usually the last. What typeface
should be chosen for the document, i.e. should it be a serif or sans-serif
typeface? (p. 4)
There are many issues to resolve … but they all must start with the
same choice, typeface. (p. 11; and in the abstract, p. 86)
This explicit premise is far from the view of typography in which typefaces play a relatively peripheral role in the reading process.292 What is more, Taylor's point of view is carried to the extreme:
If a serif or sans-serif typeface will permit the reader to read faster,
then the documenter can consciously choose a serif or a sans-serif
typeface to accomplish the readability desired for either the entire
document or various parts. (p. 5)
292. See the section ‘Peripherality to the reading process’, in chapter 4.
Using typefaces that can speed the reader up or slow the reader down
can be important. (p. 22)
Accordingly, in his final conclusion Taylor claims that by changing
between a serif and a sans serif typeface
designers and developers of texts … can move the reader along with a
serif typeface and then slow them down with a sans-serif area which
they are drawn to by a preference for that typeface. (p. 75; and,
prominently, in the abstract, p. 87)
This remarkable conclusion is based on the author’s interpretation of his
data: Text set in serif typefaces was read faster than text set in sans-
serif typefaces, whereas the subjects preferred text set in sans serif
typefaces. However, not only is the author's conclusion sensational and far from convincing; I will argue, further on, that the author's interpretation of his data is also far from convincing.
Nevertheless, elsewhere in his thesis, without fully acknowledging the implications, the author obliquely indicates that factors such as typeface might be peripheral to the reading process. While reasoning
around his results he suggests that ‘good’ readers are the ones who can
benefit the most by clever manipulation of typographic variables such as
typeface, whereas less good readers (who presumably struggle more with merely grasping the content) do not benefit.293 However, the author
seems to forget that it is manuals and technical documentation that
constitute the context-rationale for his study. The test-material actually
used does not match this context-rationale. The test material – generating the data from which his conclusion stems – is not text taken from manuals and technical documentation, but relatively 'easy' subject-matter prose à la Reader's Digest. Manuals and technical documentation
are typically read or consulted in less than ideal situations, and in such a
‘reading for action’294 situation, the perceptual task of ‘mere pick-up and
registration’ of the text more than in other contexts often comes
second to trying to grasp a less than accessible written exposition while
293. In any case, the author also seems to contradict himself: ‘The reader who must
struggle with a typeface that slows the reading process down will also more likely be
distracted by other elements contained in the material.' (p. 41).
294. For an interesting discussion on 'reading as goal-driven behaviour', see Bouwhuis 1990.
trying to solve an awkward problem.295 Thus, in such a situation, the
typeface is probably even more peripheral than in most reading contexts.
This means that the ‘peripherality of the typeface perspective’296 is even
more relevant here than otherwise (because of the mismatch between
context-rationale and test material in the study). Thus, when taking the context-rationale into account, Taylor's conclusion is not only sensational, as suggested above, but also extreme.
Two methods were employed in this study: one 'objective', that is,
by measuring the speed of reading, and one ‘subjective’, by recording the
(implicit) preferences of the readers. The study seems in many respects
to be well designed. It is based on a double blind design, where neither
the subjects nor the administrators of the test knew precisely what was
being tested and when. The tests were integrated with ordinary reading
activities and regular periodic reading rate tests. The 74 subjects (those who participated throughout) were all high school students. High
school students were chosen as population ‘because they are regularly
reading textbooks and technical manuals’.
Only two typefaces were employed, Pacific Data Corporation’s
version of the serif typeface Times Roman, and their version of the sans
serif typeface Helvetica, both in 12 point size. Nominal interlinear
spacing was 20% of the nominal point size. The text was excerpts from
articles carrying titles such as ‘Amazing partnerships in nature’ and
‘Tropical forests’, mainly taken from a collection called Be a better reader.
No illustration of the test material is shown in Taylor’s thesis.
The subjective study showed a clear preference for sans serif
typefaces (pp. 5862, 70). In the experimental study the mean serif
reading rate was 260.3 words per minute, and the mean sans serif
reading rate was 259.0 words per minute (computed by me, on the basis
of a table on p. 50). The reading rate difference between serif and sans
serif typefaces is thus negligible.
295. Nevertheless, I am not implying that the point size of the text does not matter when
consulting an auto manual in a badly lit garage (this example is used in Krull 1997,
p. 171). But Krull’s situational factor is not necessarily relevant to the point I have
just made.
296. See the section ‘Peripherality to the reading process’, in chapter 4.
However, the reading rate result does not deter Taylor from trying
to find information in his data that points to alternative conclusions. He
does this by ranking the reading rates into four quartiles, and excluding
the first and the fourth as ‘extreme’ quartiles. However, not only does he
trim his data in order to obtain a central tendency measure, he also
excludes the second quartile as below average. Taylor is then left with the third quartile (the 'good readers'), which is used to test for significance.
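The selection procedure can be sketched as follows; the reading rates are invented placeholders rather than the figures in Taylor's table on p. 50, and only the quartile-based exclusion mirrors his description:

from statistics import mean

# Hypothetical (serif_wpm, sans_serif_wpm) pairs for 40 students; placeholder data only.
rates = [(250 + i, 249 + i) for i in range(40)]

# Rank the students by overall reading rate and split the ranking into quartiles.
ranked = sorted(rates, key=lambda pair: mean(pair))
n = len(ranked)
quartiles = [ranked[i * n // 4:(i + 1) * n // 4] for i in range(4)]

# Taylor's move: discard the 1st, 2nd and 4th quartiles and test only the 3rd.
third_quartile = quartiles[2]
serif_mean = mean(pair[0] for pair in third_quartile)
sans_mean = mean(pair[1] for pair in third_quartile)
print(f'Third-quartile means: serif {serif_mean:.1f} wpm, sans serif {sans_mean:.1f} wpm')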
By doing so Taylor found that ‘students in the third quartile of
reading rate rankings (the ‘good’ readers) do indeed read serif typeface
text faster than they read sans-serif typeface material’ (p. 62). For the
third quartile the serif mean was 262.7 words per minute, while the sans
serif mean was 254.5 words per minute (see p. 55). He adds (in order to
support this finding) that ‘for all students the overall mean reading rate
for the sans-serif typeface material was slightly lower than the mean for
the serif typeface reading rates' (p. 63, my italics). The expression
‘slightly lower’ is certainly not an understatement. (It can easily be
interpreted as conveying a bigger difference than the actual mean
reading rate difference of 1.3 words per minute.) Taylor concludes that
the difference for the third quartile readers was ‘statistically significant’
(p. 64). He attaches great importance to this finding:
The statistically significant difference between the serif reading rates
over the sans-serif reading rates in the 3rd quartile of rankings is very
important. This group is the good readers who can gain the most by the
techniques and factors of manipulating texts, documentation, and
textual material. … These are the readers who can read above average
and should be able to benefit from manipulations of the text,
presentations, or other factors which may permit them to read faster.
As evidenced by the statistically significant outcome of this group, the
reading of a serif typeface permitted them to read faster than with the
sans-serif typeface. (p. 66)
What is more, in the final summary, ‘statistical significance’ is confused
with substantial or meaningful ‘significance’:
The speed of reading either the serif or the sans-serif typeface had a
statistically significant outcome in this study. The preference study was
also certainly significant. These high school students who were tested in
the study had a strong preference for a sans-serif typeface. (p. 70; my
italics)
The choice of a serif typeface permits [that] ‘good’ readers read
significantly faster than [with] a sans-serif typeface. (p. 74, my
italics)297
Moreover, one page further on in the final summary, the ‘good readers’
have become the ‘readers’:
Even though the readers could read faster with a serif typeface, they
still preferred the sans-serif typeface. (p. 75, my italics)
I feel uneasy about the move constituted by ‘the third quartile
argument’. Furthermore: The sudden and unacceptable (but almost
invisible, when not reading the thesis from beginning to end) semantic
transition in the final summary, from ‘statistical significance’ to
‘significance’, and from ‘good readers’ to ‘readers’, adds to this unease.
In fact, the way the results are interpreted is not convincing, and only
too convenient in relation to the final ‘interesting’ and sensational
conclusion (on deliberately slowing down readers on occasions by using
sans serif typefaces).
In addition, potentially confounding factors are overlooked. Because of the author's inadequate domain knowledge, the study lacks crucial awareness of confounding nuisance variables residing in the stimulus material, related to appearance size,298 x-height, and interlinear
spacing. To compare text set in Times Roman with text set in Helvetica,
with the same nominal attributes with regard to type size and
interlinear spacing (and at the same time disregarding a most likely
crucial difference in x-height), introduces a whole set of nuisance
variables that undermines the internal validity of this study.
The irony in this respect is that the author has made it clear on
pages 2 and 11 that ‘typeface’ may influence ‘the outcomes attributable
to other aspects of text presentation’ (he refers to ‘leading’, ‘type styles’,
297. The fact that it has become common practice in some quarters to drop the word
‘statistical’ before ‘significant’ (see Pedhazur and Schmelkin 1991, p. 202), may
explain the use of the expression ‘significance’ in the final summary. However, it does
not explain the authors’ shift from ‘statistical significance’ to ‘significance’.
298. Without realising the implications, the author touches upon the problem of nominal
size versus appearance size, while explaining basic typographic terms on pp. 10–11. Here, while explaining the (now obsolete) term 'stamp', he writes that 'the printed letters may vary in size from specific type style to type style'.
‘page layout’, ‘cueing’, ‘bolding’ and ‘justification’, ‘etc’), and thus that
‘typeface contamination’ may have influenced the results of other studies
on content, layout, justification, leading, etc. Thus, the author here
acknowledges that the typeface is only one of many typographic
variables that may influence ‘the outcome’. However, the author only
appreciates that the typeface variable may influence other variables. He
does not appreciate that other typographic variables (both the micro and
meso variables I mention above and the meso and macro variables the
author mentions) interact with the typeface variable and thus may
influence the effect of the particular typeface chosen. By isolating and
privileging the typeface variable as a basic prerequisite, typeface is no
longer a variable; it has become a one-way parameter, able to influence,
but not to be influenced.
Further lack of adequate domain knowledge is revealed on many
occasions throughout the thesis. For example: Taylor wrongly claims
that the point size incorporated in most desktop publishing software is
‘slightly less than the 1/72 ... inch’ (p. 10).299 He also uncritically
circulates a piece of information (which certainly lacks empirical
verification) that stems from the graphic design consultant Jan White,
claiming that whereas the serif typeface is the default typeface in the
United States, the sans serif typeface is the default in Europe (p. 72).
More revealing: the author displays a (not uncommon) confusion when
it comes to the use of bold typefaces (see p. 73). ‘Bolding’ is wrongly
understood as an alternative to regular type, and not as a contrastive
navigational device or complementary articulatory device.300 I believe it
is the same kind of confusion that (mis)leads the author to his final core conclusion (about slowing readers down when convenient with a sans serif typeface, based on his 'findings' that readers both prefer, and read more slowly, text set in sans serif typefaces). The author does not seem
to realise that a contrast such as the one created by setting chunks of text within a larger text in another typeface is, in the first place, simply done in order to signal some kind of semantic or structural
299. For introductions to discourse on typographic measurement, see Boag 1993, and 1996.
300. For two papers on the use of bold typefaces for articulatory and navigational purposes, see Twyman 1986, and 1993.
difference, and not necessarily ‘important material’ that the reader will
‘prefer’ to read (see p. 70).
The lack of a more theoretically inclined domain knowledge is also
revealed in Taylor’s thesis. In his final discussion, while discussing other
areas that ‘need to be explored’, Taylor suggests ‘headings or guidelines
used throughout the textual materials’. Certainly, such a high-level
question about the overall topic structure, and consequently the access structure, of written text is presumably of great importance. However,
when the author asks: ‘does the use of headings or guides within the
textual materials give the reader a more structured technique and
therefore is a more efficient and effective textual document?’ (p. 74), he
reveals an understanding of text structure as something posterior and
external to the text itself. Such an understanding contradicts the idea of 'the textual function',301 that is, that any text, in order to be a text, is more than mere content; text is content and structure in the same
instance. Text is thus material and not only a heap of data and loose or
inter-related ideas. The concept of ‘the textual function’ suggests that
structure is intrinsic to text. Content, as text, becomes structured when it becomes text; that is, the textual function not only gives the 'content'
a linear structure, but also an hierarchical semantic structure (on both
the inter-sentential paragraph level and on the overall rhetorical
discourse level). Whether or not to further structure the text (to reveal a
latent structure or by re-arrangement) by inserting headings that reflect, or generate, a semantic and hierarchical topic structure is not
a question that can be solved empirically by posing a dichotomous ‘either
or’ question. Any text, to be a text, depends, more or less and on several
levels, on linguistic and extra-linguistic devices that signal or indicate
structure and thematic progression (such as for example backward and
forward pointing cohesion cues for the bridging of gaps; and punctuation,
301. See Halliday 1970: ‘Finally, language has to provide for making links with itself and
with features of the situation in which it is used. We may call this the textual
function, since this is what enables the speaker or writer to construct “texts”, or
connected passages of discourse that are situationally relevant; and enables the
listener or reader to distinguish a text from a random set of sentences. One aspect of
the textual function is the establishment of cohesive relations from one sentence to
another in a discourse’ (p. 143).
paragraphing and headings for articulation of boundaries and labelling
of segments).302
I point to these examples of inadequate domain knowledge, not in
order to score cheap points, but because I suspect that the examples are
symptomatic of many legibility studies. As I point out on several
occasions elsewhere in this thesis, it is exactly the lack of adequate
domain knowledge that seems to be responsible for the problems many
legibility studies have struggled with. ‘In practice, lack of insight and
expertise in the domain of application can lead to naive or misleading
conclusions and recommendations.’ (Schumacher and Waller 1985,
p. 379).
The concept of ‘domain knowledge’ is used in the field of ‘human
computer interaction’ and ‘usability studies’, while referring to ‘task’,
‘target’ or ‘application’ domains such as accounting or for that matter
typography. It is not an unproblematic concept. There can be no sharp
well-defined borders between ‘domain knowledge’ and ‘non-domain
knowledge’. In typography, domain knowledge can be argumentative,
descriptive or prescriptive, and codified in manuals of typography.
Domain knowledge is also represented by practitioners’ practical
knowledge (whether tacit or discursive). In addition, knowledge from a diversity of academic disciplines such as psychology, computer science, text-preoccupied disciplines such as literary studies and linguistics, and 'text as physical object'-preoccupied disciplines such as analytical bibliography and palaeography, has clearly contributed to establishing a sophisticated typographic 'knowledge domain'.303 It follows from this
that Taylor’s thesis also contributes to this typographic knowledge
domain; it is of course not separate from what it describes. However, it
302. This argument has been informed by the following works: Halliday 1970; Halliday
and Hasan 1976; de Beaugrande 1980, 1984; Nash 1980; Pace 1982; Bernhard 1985;
Southall 1989; Nunberg 1990; Waller 1985, 1991; Stiff 1994; and not least, Vagle
1995. For a fairly recent empirical study on the use of headings in text, see Williams and Spyridakis 1992. Nevertheless, Taylor must be excused if his concern is only with the problem of 'guidelines' and 'over-organised' text, as in some examples of distance-teaching materials; see Whalley 1993. For a relevant discussion on text structures (in plural) and the ontological status of texts, see Renear, Mylonas and Durand 1996; Huitfeldt 1995; and Biggs and Huitfeldt 1997.
303. The references in the previous footnote may well serve as examples of such
contributions.
does not follow from this that all knowledge claims belonging to or
related to a knowledge domain are equally valid, plausible or
persuasive.304
To sum up: the author’s concern withthe learning process’ and his
seemingly timely concern with providing ‘desktop publishers’ with
empirically based knowledge sounds reasonable. However, much seems
to be wrong with his thesis. It is based on an unsubstantiated postulate and adjoining premise (privileging the typeface 'factor' without presenting any sound reason for doing so). The way the results are
interpreted is not convincing. The final (sensational) conclusion is far
from convincing. The way ‘statistically significant’ changes to ‘significant’
is dubious. And finally, and most crucially, due to a lack of adequate domain knowledge, the study lacks internal validity.
de Lange, Esterhuizen, and Beatty 1993*
At the ‘Raster imaging and digital typography’ conference in Germany in
April 1994 (‘RIDT94’), Rudi de Lange from the Department of Graphic
Design at Technikon OFS in Bloemfontein in South Africa presented the
paper: ‘Performance differences between Times and Helvetica in a
reading task’. The paper has been published in the journal Electronic
Publishing, and was based on de Lange’s 1993 dissertation ‘The
legibility of sans serif typefaces, an experimental and comparative study’
(de Lange 1993), where his co-authors appeared as supervisors. The
results of the study had been made public around a year earlier, in the
304. For a contribution to the question of what domain knowledge may or may not be – with references to the 'underdeveloped literature of typography' and 'a rich and diverse body of knowledge won from and tested by practice' (p. 150) – see Paul Stiff's instrumental case study 'The end of the line: a survey of unjustified typography' (1996).
* Rudi W. de Lange, Hendry L. Esterhuizen and Derek Beatty. 1993. 'Performance differences between Times and Helvetica in a reading task'. Electronic Publishing, vol. 6, no. 3, pp. 241–248. See also Rudi Wynand de Lange. 1993. 'The legibility of sans serif typefaces, an experimental and comparative study'. Master's Diploma dissertation. Bloemfontein: Technikon OFS; and [Lange, Rudi de]. 1992. 'The legibility of sans serif typefaces'. Graphix, December/January 1993, pp. 15–16.
South-African printing trade magazine Graphix ([de Lange] 1992). For
clarifications, I will on a few occasions refer to de Lange’s unpublished
dissertation. Otherwise I will only refer to the article in Electronic
Publishing. No further references will be made to the article in Graphix.
The starting point for this study is a situation, as described by the
authors, where typographers and printers even today regard roman
typefaces as more legible for continuous text than sans serif typefaces.
De Lange and his co-authors substantiate their claim by referring to a
whole range of present-day American and British manuals and popular
handbooks on graphic design, typography, and desktop publishing,
including Richard Rubinstein’s textbook Digital typography. In these
sources they find that:
Authors argue that readers prefer serif typefaces, read them faster,
recognize them easier and that there could possibly be a higher
comprehension rate with material printed in these typefaces. (p. 241)
This is an interesting observation. Typographers today often reveal
indifference towards the question of whether sans serif typefaces are less or more legible than roman typefaces, or alternatively see it as a false or dead dichotomy. Sans serif typefaces are used extensively for
many different applications, including continuous text. However, it is
still true that sans serif typefaces are used relatively little for continuous
text in books and newspapers. De Lange and co-authors’ suggestion that
most manuals on typography promote the idea that sans serif typefaces
are not as well suited for continuous text as roman typefaces are, is also
worth noticing. Whether or not this is due to the aesthetic preferences of the authors of these books, according to de Lange et al. this reluctance towards sans serif typefaces nevertheless seems to be justified by reference to a notion of 'legibility'.
De Lange et al. contend that
Most of these opinions are, however, not based on any satisfactory
empirical evidence, but appear to reflect the personal opinion of the
authors. (p. 242)
Assumptions about the superior legibility of serif or roman typefaces
appear to be untested generalizations. The author found no supporting
evidence during an extensive literature study to confirm them. Many
typographical practices are still based on the belief that romans are the
most legible typefaces to use for text. This unsubstantiated belief, the
A review of empirical studies
/
201
importance of legible instructional text, and the central part that
typography plays in the graphic design process, motivated this study.
(p. 243)
De Lange also refers to researchers who suggest equal legibility
between sans serif and roman typefaces (Tinker 1963; Poulton 1965;
Zachrisson 1965; and Moriarty & Scheiner 1984), and researchers who
suggest roman superiority (Robinson, Abbamonte and Evans 1971)
(pp. 242, 246). He also refers to the graphic design consultant and author
of many books on typographic design, Jan White, who suggests equal
legibility.
If we accept these research results as ‘empirical evidence’ (pointing
as they do in another direction than most of the ‘untested’ ‘assumptions’
in manuals of typography), it is reasonable to ask why de Lange et al. felt
it necessary to carry out yet another large-scale experimental study on
typeface legibility. No answer to this question is found in the Electronic
Publishing article. However, an answer is present in the dissertation
(pp. 7, 29–32). There de Lange refers to 'errors' in the research designs of
the studies of Tinker, Poulton, and Zachrisson. According to de Lange,
Tinker’s experimental design suffered from several errors, among them
relying on nominal instead of actual typesize; relying on only one sans
serif typeface among those being tested; and relying on the Chapman-Cook test, which according to Zachrisson is more an attention test than a speed-reading test. De Lange also points out that readers today
are much more used to sans serif typefaces, and that there exists a much larger variety of sans serif typefaces, of which some are possibly more
legible than the one employed by Tinker. Zachrisson’s experiments are
criticised with regard to formal experimental procedure; that the ‘visual’
sizes of typefaces varied; and that he employed too small a sample size
with regard to population. Poulton’s experiment measuring
comprehension rate is implicitly criticised for measuring comprehension
and not ‘legibility’. Robinson, Abbamonte and Evan’s seriously flawed
‘pro-serif’ study, which, in opposition to the above mentioned studies, is
not only mentioned but also evaluated in the Electronic Publishing
article, is referred to in an uncritical manner and only rejected by de
Lange because it did not investigate the legibility of continuous text: ‘[an]
interesting study ... [which] determined the importance of serifs in the
perception of individual letters and not the legibility of continuous text.’
(p. 242).
De Lange’s experiments305 were of four different kinds: a word
recognition test (oral reading of discrete word units); a speed reading test
(silent and subsequent repeated oral reading of discrete word units); a
comprehension marathon test (answering questions after having read a
text), and a scanning test (a search task in continuous text). These
experiments involved 450 subjects, all primary school children. The method with regard to the selection of subjects, experimental procedures, and statistics seems to be well considered and sound. However, the same cannot necessarily be said of the overall experimental design.
Commendably, several sans serif and roman typefaces (cf. the
dissertation) were employed to represent the two typeface categories in
question, and in each experiment the two alternative typefaces were
matched closely with regard to actual size, weight, letter spacing and
line spacing. Times Roman and Helvetica (laser printed) were the
typefaces employed in the scanning experiment, the only one of the several experiments that is described in some detail in the article (hence its
title). This experiment was regarded by the authors as ‘the most
convincing experiment’. In fact, the other three experiments are, in the
dissertation, but not explicitly in the article, more or less discarded as
methodologically invalid.
The authors present the results of their study the following way:
The authors obtained sufficient evidence during the word recognition,
speed reading, and comprehension test not to reject the research
hypothesis of equal legibility between roman and sans serif typefaces.
Serifs do not appear to have a noticeable effect on legibility, as measured
by all the tests employed in this study. The subjects did not read and
recognize words with serifs faster, their comprehension did not increase,
and they were also not able to find a word in a portion of text more
easily, when the text was set in a roman typeface. Sufficient evidence
was also found during the scanning experiment not to reject the
hypothesis that Times Roman and Helvetica are equally legible. (p. 243,
my italics)
But exactly how ‘convincing’ is the scanning experiment for ‘determining’
the legibility of continuous text? The authors contend that since
305. That is, the ones reported in Electronic Publishing.
searching for a certain word does not require ‘verbal fluency’ and
comprehension, the ‘physical structure’ of the text and typeface will
cause the only interference with the scanning process. And since the
‘physical structure’ of the text is the same for both typefaces, the
different typeface designs are what is left as possible causal factors in an otherwise carefully constructed experiment. Since de Lange found no
significant difference between the roman typeface and the sans serif
typeface as measured by the scanning procedure, he concludes that there
is no significant difference between the legibility of the two typefaces
(p. 246).
However, in the dissertation, and only there, de Lange wisely points
out that the experiment might lack internal validity: it is possible that
the experimental design is simply not sensitive enough to measure small
differences in the design of two common typefaces. In order to verify the
sensitivity of the experiment, he therefore devised a second scanning
experiment comparing Times Roman with the ‘obviously less legible’ and
mannered chancery cursive script typeface Zapf Chancery (reported in
the dissertation only). Here he found a significant statistical difference
in favour of the roman typeface, confirming the scanning approach as
sensitive and valid.
Nonetheless, I am not convinced: the fact that a highly mannered
and ‘obviously less legible’ typeface like Zapf Chancery produced the
difference which de Lange found, does not rule out the possibility that a
crude scanning procedure (which undoubtedly is appropriate when
certain aspects of the visual organisation of documents are being tested),
simply is not sensitive enough to detect possible differences of legibility,
whether such differences exist or not, among well designed ‘common’
typefaces such as Times Roman and Helvetica.
Whether this study has fulfilled its aim ‘in the absence of
satisfactory empirical data ... to determine the comparative legibility of
sans serif and roman typefaces’ or not, is an open question.
In conclusion, and in order to justify their study, the authors state that 'The results of this study differ from the opinion of most authors on the subjects of typography, legibility and printing.' (p. 246). Echoing Ken
Garland, and Cheetham, Poulton and Grimbly back in 1965, de Lange
continues:
These results can be interpreted as promising for designers and
typographers, as it appears that legibility will not necessarily be
sacrificed when sans serif typefaces are used for textual matter. (p. 246)
De Lange initially pointed out that ‘Many typographical practices
are still based on the belief that romans are the most legible typeface to
use for text.’ (p. 243). Nevertheless, even if we accept de Lange’s
conclusion at its face value (which with regard to its content (‘equal
legibility’) is very reasonable), one problem still remains, the possibility
that de Lange’s notion of legibility (or any measurable legibility
construct) simply does not correspond with the legibility notion of the
‘many typographical practices’ he refers to. That is, that the preference of
many designers for roman typefaces in continuous text, although
justified by a perhaps vague reference to ‘legibility’, is simply based on
something else.
It is noteworthy that, in addition to the experiments, de Lange also includes an interesting and informed pro et contra discussion where he presents sensible counterarguments to many of the arguments which promote roman typefaces as more legible than sans serif typefaces, a discussion which, incidentally, reinforces the impression that his original motivation and starting point is that of a fond advocate of sans serif typefaces. One of the pro-serif arguments which de Lange refutes
through his counterarguments is the one which says that serifs assist
the horizontal flow of reading and eye movements: de Lange points out
that eye-movement research has shown that the eyes, while reading continuous text, do not follow lines of text in a smooth horizontal flow, but move in quick saccadic jumps.
Silver and Braun 1993*
The authors of this paper are N. Clayton Silver and Curt C. Braun. At
the time of publication, they were both affiliated with the University of
Central Florida in Orlando. Together with various co-authors, they have
published extensively in journals like Safety Science and Ergonomics, as
well as in conference proceedings, during the 1990s on warning design
and similar topics.
The rationale provided for this legibility study is reasonable. The
context is the design of warning labels for potentially dangerous
consumer products. The fact that ‘many consumer product related
injuries occur despite warnings’ prompted this investigation of ‘factors
influencing the willingness to read a warning’. Details of the study’s
design as well as the statistics indicate high standards.
The authors suggest that there are several attributes of warning
labels that influence the willingness to read a warning. Among these
factors are ‘perceived hazardness’, ‘attractiveness and understand-
ability’, and ‘print conspicuity’. ‘Print conspicuity’ has according to the
authors recently received increased attention, and they refer to a
recent study that demonstrated that warnings set in 18 point Times with
orange highlighting ‘was recalled better than’ warnings set in 12 point
Helvetica with no highlighting.306 Against this background Silver and Braun
want to find out whether it is the typeface or the typesize that
contributes the most to such an effect.
The authors refer to two comparative studies on sans serif versus
serif typefaces which involved long passages of text: to Vanderplas and Vanderplas (1980), where text set in the serif typeface Century Schoolbook was read faster than text set in the sans serif typeface Helvetica, and to Moriarty and Scheiner (1984), who found no differences
in reading speed between Helvetica and Times. Although the results of
* N. Clayton Silver and Curt C. Braun. 1993. 'Perceived readability of warning labels with varied font sizes and styles'. Safety Science, vol. 16, no. 5/6, pp. 615–625, in a special issue on 'Warning and risk communication', edited by D.M. DeJoy and M.S. Wogalter.
306. The fact that a warning set in a large highlighted typeface ‘was recalled better’ can
hardly be surprising.
the two cited studies apparently are heterogeneous (with ‘differing
results’ according to Silver and Braun), and only one of the studies thus
found a difference, Silver and Braun still want to find out if there are
‘similar differences’ (sic) on shorter warning labels. Nevertheless, an
important premiss for the study is that Helvetica is considered to be the
‘standard’ typeface to use on warning labels, according to several
manuals on the subject.307 The authors want to verify the rationale
behind this ‘standardisation’.
Four type-related variables ‘associated with readability’ were
examined: typeface (while focusing on the question of serif vs. sans serif
typefaces), typeface weight, typeface size, and size contrast between the
signal word and the main body of the warning. Three typefaces were
included: The sans serif typeface Helvetica, and the two serif typefaces
Times and Goudy. The warnings varied across the four typeface-related
variables and were presented on 24 mock-up detergent labels. Forty
undergraduate students and 22 elderly non-geriatric persons
participated in the study.
The study can best be described as a ‘subjective legibility study’,308
rather than a ‘subjective preference study’, because it deals with
perceptions of how legible the text is (the authors use the expression
‘readable’), and not the participants preferences for this or that
alternative. Confusingly however, the authors use the term ‘readability’
in three different ways. First, they refer to readability as the lexical and
semantic level of understanding of the text used in their material (‘the
Kincaid readability index’ was used on the material in order to ascertain
that the text could easily be understood by the participating subjects).
Second, ‘readability’ is one of the three variables that the subjects should
rate the material according to (the two other variables are: ‘how likely
they would read the warning’ and ‘its salience’). Third, the three
variables (because they showed a high degree of correlation) were
‘merged’ into the composite variable ‘perceived readability’.
307. The authors refer to FMC Corporation 1985; and Westinghouse 1981.
308. For a useful discussion on 'subjective' studies in this context, see the subsections 'experimental studies', 'field studies', and 'subjective studies' in Edworthy and Adams' Warning design (1996, pp. 49–68).
Thus, the final and composite dependent variable ‘perceived
readability’ was based on subjective ratings in response to a 24-page
questionnaire with predetermined anchors of ‘how likely they would read
the warning’, ‘its salience’, and ‘readability’. The authors point out that
the construct ‘perceived readability’ should not be confused with actual
reading performance’, and they modestly admit that this might represent
a limitation of the study.
Among the results: Helvetica was perceived to be more readable
than Times and Goudy (‘a significant main effect’). And bold type was
perceived as more readable than normal type.
The superior perceived readability of Helvetica bold when compared to
the other font type and weight combinations conforms to the current
standards (p. 622)
However, and of great consequence for the validity of the study:
Out of the blue, in the middle of their paper,309 the authors almost by
accident reveal that the Helvetica font used in the study is Helvetica
Condensed (the two serif typefaces included in the study are not
condensed). This information is not provided anywhere else in their
paper, and their final conclusion reads accordingly:
The findings outlined in this study have implications for warning
design. To increase the perceived readability of warnings, warning
labels should be printed in a bold sans serif font such as Helvetica
(p. 623)
Although Helvetica can easily be regarded as a more or less generic
sans serif typeface, the condensed variant is certainly not, something
which undermines the generalisability and thus external validity of this
study (based on the authors’ own conclusion). Also: the condensed
variant of Helvetica is not compared to condensed variants of the serif
typefaces Times and Goudy (whether designed so, or electronically
modified on the fly). This confounding condition, which introduces
appearance width as a nuisance variable, alone undermines the internal
validity of this non-experimental study: the authors' conclusion could just as well be that warning labels should be printed in condensed typefaces.
The distinctly smaller area occupied by the text set in the condensed typeface, compared with the two serif typefaces, only amplifies this confounding condition.
309. In the main text and in a caption on p. 619.
Furthermore, the reason for including the serif typeface Goudy in
the study (of all possible serif typefaces) is not exactly convincing: namely, that it is a serif typeface 'widely used in marketing applications' that 'has yet to be studied in terms of readability'. Moreover, to include two serif typefaces but only one sans serif typeface, when it is precisely generic differences that are in focus, is unreasonable. There is
always a chance that an additional sans serif typeface could have altered
the mean result in either direction for the sans serif category.
Although in their introduction the authors are somewhat cautious
about the validity of their approach (‘not measuring actual
performance’), they are nonetheless far more categorical in their
conclusion (‘warning labels should be printed in’). Systematic
investigation of peoples’ subjective preferences ought to yield valuable
information; humans are after all not automats. However, to ask if a bold
variant of a typeface is more ‘salient’ than a normal variant of the same
typeface, must necessarily produce a near unambiguous answer. This
indicates another problem with this research: that the design of the
questionnaire is not entirely sound. Nevertheless, the invalidating
problem inherent to this investigation is not its main ‘subjective’
approach, but the authors’ lack of relevant domain knowledge about
typographic variables that might function as 'nuisance variables' in
studies like this.
This lack of domain knowledge also manifests itself in other ways
than in the authors’ inability to perceive confounding typographic
variables. The authors’ discussion on the use of bold type versus normal
type is one instance where they display their lack of relevant (and
crucial) typographic domain knowledge. Like Taylor (1990; see above),
the authors display a not uncommon confusion when it comes to the use
of bold typefaces. The use of bold type is wrongly understood as in
general being an alternative to regular type, and not a contrastive
navigational device or a complementary articulatory device. So the
authors inform the reader that: ‘Ralph (1982) suggested that boldface
type should be used for stressing terms and ideas, rather than for all
textual material’. This understanding is hardly groundbreaking,
reflecting, as it does, a practice of textual articulation which has existed
since the mid-19th century.310 Moreover, by citing the discredited researcher Cyril Burt without reservations (p. 623), the authors also display a lack of sufficient knowledge of the relevant research literature.311
There are interesting ‘findings’ in this paper (on size contrast
between the signal word and the main text of a warning), but that is
another story, and not of relevance to my investigation here. But at the
same time, there are also some implausible suggestions. Among other
things, the authors suggest that a possible reason why 10-point type is
more readable than 8-point type, ‘is because the majority of type sizes
used in general text range from 9-point to 11-point' (sic). The 'familiarity thesis' of legibility is here stretched ad absurdum.
The results of this research seem to be taken for granted by people
in the human factors research community, eager to translate ‘research
findings’ into guidelines for practical design. A recent paper on the
development of design guidelines, published in a book on ‘human factors
in consumer products’, simply states: ‘See Silver and Braun (1993) for
readability issues of warning labels’ (Bonner 1998, p. 253). Bonner’s
paper, which for the most part is a reflective and useful piece of writing,
even promises that its aim is to deal with ‘broad issues at a high level of
abstraction from the interaction process’ (p. 241). Nevertheless, Bonner’s
reference to Silver and Braun‘s low level and categorical and
problematic research findings, has slipped through without
reservation. Interestingly, the researchers Judy Edworthy and Austin
Adams, in their recently published monograph Warning design, are more
modest on behalf of the potential of research on ‘conspicuity’ and
‘attention getting’, while suggesting that
a good textbook on graphic design and typography, such as Baird et al.
(1993), will be more informative than an attempt to use any of the
available behavioural research data. (1996, p. 25)
310. For two papers on the use of bold typefaces for articulatory and navigational purposes, see Twyman 1986 and 1993.
311. See Rooum 1981; and Hartley and Rooum 1983; as well as the section ‘Burt, Cooper
and Martin 1955 / Burt 1959’, above in this chapter.
Nevertheless, to conclude: Silver and Braun’s categorical conclusion
is inappropriate, and their study lacks validity, largely due to a lack of
adequate and crucial typographic domain knowledge.
Silver, Kline and Braun 1994*
This study, ‘Type form variables: differences in perceived readability and
perceived hazardousness’, published in the Proceedings of the Human
Factors and Ergonomics Society 38 annual meeting 1994, is another
study by N. Clayton Silver, Curt C. Brown, and a co-author.
The primary object of interest in this study is the effective design of
warning labels, as in Silver and Braun (1993). The rationale provided is
similar: many injuries occur because people don’t read warning labels.
The authors suggest that this can be caused by many factors, like a high
degree of familiarity with the product, underestimation of the risk, or
inadequate labelling (whether lexical-semantically or graphically). While
focusing on the labelling issue, the authors' 'first purpose' is to 'determine
if there is greater perceived readability’ for Helvetica than for the two
serif typefaces Century Schoolbook and Bookman.
Warning labels adapted from a common household insecticide are employed. The three typeface-related variables in question are: typeface; size of the signal word; and size contrast between the signal word and
the main body of the warning. The resulting composite variable
‘perceived readability’ is averaged from two of the questionnaire
variables: the variables ‘salience of the warning’ and ‘warning
readability’.
The results are generated on the basis of subjective preferences expressed by 50 undergraduate students who rated 36 insecticide warning labels which varied across the three type-related variables (a 3 × 3 × 4 factorial design). The 'font variable' covered three typefaces: the sans serif typeface Helvetica and the two serif typefaces Bookman and
* N. Clayton Silver, Paul B. Kline, and Curt C. Braun. 1994. 'Type form variables: differences in perceived readability and perceived hazardousness'. Proceedings of the Human Factors and Ergonomics Society 38th annual meeting 1994, pp. 821–825.
Century Schoolbook. One of the illustrations reveals that this time the
Helvetica in question is not Helvetica Condensed (as in Silver and Braun
1993; see above), but a Helvetica variant of a 'normal' or 'regular'
appearance width.
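To make the structure of the design concrete, the 3 × 3 × 4 factorial design can be thought of as the Cartesian product of the three type-related variables, which yields the 36 label conditions. The following minimal sketch (in Python) uses illustrative level values only; the number of levels per variable is taken from the study, but the assignment of three and four levels to the two size variables, and the concrete size and contrast labels, are my own assumptions:

from itertools import product

# Illustrative (hypothetical) levels; only the 3 x 3 x 4 structure is from the study.
typefaces = ['Helvetica', 'Bookman', 'Century Schoolbook']
signal_word_sizes = ['small', 'medium', 'large']        # assumed three levels
size_contrasts = ['none', 'low', 'medium', 'high']      # assumed four levels

conditions = list(product(typefaces, signal_word_sizes, size_contrasts))
assert len(conditions) == 36    # 3 x 3 x 4 = 36 warning-label conditions

for typeface, signal_size, contrast in conditions[:3]:
    print(typeface, signal_size, contrast)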
Among the results: the serif typeface Century Schoolbook is
perceived as more ‘readable’ than the serif typeface Bookman, and both
serif typefaces are more ‘readable’ than the sans serif typeface Helvetica.
Thus, although it is compared with different serif typefaces than in Silver and Braun 1993, in this study Helvetica comes last, in contradiction to the former study.
Whether this result (and its striking contrast to the result in the
former study) has something to do with individual differences between
the serif typefaces employed in the two studies, or whether the answer
simply is that condensed typefaces are the most readable (see my
argument above, in the section ‘Silver and Braun 1993’), or whether the
two groups of participants have different preferences, is an open
question. However, the only explanation (implicitly) offered by the authors suggests that the difference in stimulus material (domestic
insecticide labels versus domestic detergent labels) has produced the
different results (see below).
Although the authors discuss the results of this study in the light of
heterogeneous results in previous studies conducted by themselves and
other researchers, they still point out that
the superior perceived readability … of Century Schoolbook as
compared to Helvetica and Bookman corroborates with earlier research
(p. 824) 312
By ‘earlier research’ the authors here refer to a paper ‘in press’ by one of
the authors together with another co-author (Smither and Braun 1994)
published in the Journal of Clinical Psychology in Medical Settings, and
the paper by Vanderplas and Vanderplas (1980) (both papers are about
‘reading speed … among older adults’). In this context they also refer to
typographer Ruari McLean’s widely used textbook on typography in
312. However, on the first page in their paper, the authors refer to research that points in
another direction (i.e. Moriarty and Scheiner 1984; Silver and Braun 1993; and an
unpublished paper by Silver, Braun, Zeigler and Witt ('Perceived readability of various fonts', 1993)).
order to substantiate their argument: ‘McLean (1980) pointed out that
serif typefaces are easier to read when longer amounts of text are used.’
However, they refer to McLean as if McLean’s book is a research paper,
without mentioning explicitly that it is a textbook on typography. The
bewildering thing is that the authors, who otherwise seem to trust only 'empirical evidence', suddenly refer to typographer McLean's point of view as if it carries the same authority as 'empirical research', but without mentioning explicitly that McLean's point of view does not represent 'empirical research'.313 Similarly, in Silver and Braun 1993, while
doing a brief literature review, they referred to McLean in an even more
ambiguous manner: the word ‘found’ in the expression ‘McLean (1980)
found that serif typefaces are easier to read’ (p. 622, my italics), can
denote ‘in his opinion’, but in the parlance of research papers like the one
in question it denotes ‘research finding’.
The authors modestly deliver ‘obligatory’ qualifying statements
about the validity of their research: First, the limitation that the
research is not about ‘objective’ performance measurements. And
second, that the stimulus material was insecticide labels: ‘It is equivocal
as to whether these results will generalise to other products that are
perceived as more or less hazardous’. I doubt that these reservations are
the most urgent.
Except for the single qualifying word ‘consideration’, the authors
conclude categorically on the choice of a typeface for warnings:
These results have important implications for warning design. To
increase the perceived readability (or perceived hazardousness),
consideration should be given to printing warning labels in Century
Schoolbook. (p. 824)
The recommendation is even more low-level than the recommendation in Silver and Braun (1993). This time, not a typeface category, but a particular typeface, is recommended. Furthermore, it even belongs to the opposite category to the one recommended last time. Moreover, the fact that a particular typeface has been recommended and that that typeface has only been compared to two
313. McLean carries forward a reasonably plausible argument in favour of using serif
typefaces for running text; but that is not my point here.
other typefaces does not seem to bother the authors, who ignore the large number of readily available typefaces that were not included in the study.
The authors are apparently unaware of the inherent problems connected with the translation of low-level and possibly chance results like the one in question into general recommendations. In addition, the heterogeneity of the results between this study and Silver and Braun's 1993 study suggests on its own that translating this research into recommendations for practical designing is hardly feasible.
The other composite dependent variable in this study, 'perceived hazardousness', and especially its correlation with 'perceived readability', is problematic (see pp. 823–824).
the labels printed in Century Schoolbook were perceived as more
hazardous than those printed in either Bookman or Helvetica (p. 823)
There was a significant linear relationship between perceived
hazardousness and perceived readability … Hence, the greater
perceived readability, the greater the perceived hazard. (p. 824)
First: to subjectively rate or 'perceive' a typeface as more or less readable (and salient!) is more about utilitarian aspects that possibly have something to do with the experience of mediated effectiveness, or some kind of experience of ease of seeing. However, to rate typefaces on 'perceived hazardousness' is more about semantic connotations. Second: the fact that 'perceived readability' and 'perceived hazardousness' strongly correlate should set alarm bells ringing: do the participants
really know what they are rating? Are there any confounding ‘nuisance’
variables? What if other (control) variables were introduced, and they
correlated as well? Maybe then one conclusion that could be drawn is
that the ratings simply have to do with ‘saliency’ (or something else, for
that matter)?
That a typeface perceived as the most ‘readable’ also is associated
most with hazard, is puzzling. It sounds reasonable that a typeface that
is perceived as the most ‘readable’ can also be the most effective (in a
utilitarian way) in conveying hazardousness. However, it is doubtful that
such an explanation is in accordance with a reasonable interpretation of
Silver, Kline and Braun’s exposition, where they recommend that
To increase the perceived readability (or perceived hazardousness), consideration should be given to printing warning labels in Century Schoolbook. (p. 824)
That the (relatively) subtle design differences between, say, two rather anonymous bread-and-butter typefaces like Century Schoolbook and Bookman should on their own have an impact on 'hazardousness' is nothing but fantastic. My suggestion is that the overall design and validity of this study are seriously flawed.
It is admirable that the authors in this paper cite not only the research literature and manuals on warning design, but also Ruari McLean's manual of typography (i.e. McLean 1980). However, it should be pointed out that there is a puzzling lack of references to instances of highly relevant research literature. Although there are references to Paterson and Tinker, Reynolds, and two typeface legibility studies performed by others (i.e. Vanderplas and Vanderplas 1980, and Moriarty and Scheiner 1984), there are no references to any of the many
existing typeface preference studies,314 or to any existing typeface
connotation studies.315 Accordingly, there are no references to papers
where the correlation or non-correlation of various operational constructs
are critically discussed. What is more, Salcedo et al.'s (1972) extensive and fairly well-known legibility study that explicitly discusses316 (and
discards) the correlation of operational constructs (including speed of
reading, comprehension, and reader preference), which is based on the
same concerns as Silver et al.’s study, is not mentioned. The rationale is
the same (people don’t read the warnings), the topic is similar (pesticide
labels), and a similar factorial design is employed (three typeface
variables).
A final note: in a later semantic typeface study (about 'perception of product hazard') published in 1995 (Braun and Silver
314. See the section ‘Subjective preference studies’, in chapter 2.
315. See the subsection ‘A note on “semantic” studies’ above in this chapter.
316. Where the authors, in contradiction to Silver et al., discard such correlation.
1995),317 the authors acknowledge that the results of their legibility studies and the few other studies they have cited are contradictory (pp. 984–985). Furthermore, although the authors believe that the presence or absence of serifs affects legibility, they seem after all to have reached a new understanding, suggesting that the presence or absence of serifs may not affect semantic connotations of hazardousness:
the underlying hazard continuum for typefaces might not be adequately
described by the presence or absence of serifs [sic] (p. 985)
Wheildon [1984] 1995*
Back in 1984 the Australian newspaper and magazine editor Colin
Wheildon published a booklet with the teasing title Communicating, or
just making pretty shapes? (Wheildon 1984). The third edition of this
booklet came out in 1990, reaching a total of 20,000 printed copies
(Wheildon 1995, p. 14). According to the preface in the original booklet,
the study had been accepted as a ‘mass communication research paper’
at the Virginia Commonwealth University in the USA, and had also been
presented at various seminars, among these at the Poynter Institute for
Media Studies in Florida. The aim of the study was to subject ‘some of
typography’s maxims to research’. One of the questions Wheildon carried
out research on was the relative ‘comprehensibility’, and ease of reading,
of continuous text set in either roman or sans serif type.
317. ‘Legibility’ appears in this study only as an independent variable in a factorial
design. That is, there are two levels of legibility, one represented by capitals set in the
‘high legibility’ sans serif typeface Helvetica, and the other by capitals set in the ‘low
legibility’ decorative art noveau display typeface Arabia. (Arabia is an imitation of the
typeface Arnold Böcklin, supplied with the illustration software package
CorelDRAW).
* Colin Wheildon. 1995. Type & layout: how typography and design can get your message across or get in the way. Edited and with an introduction by Mal Warwick. Berkeley, California: Strathmoor Press. See also Colin Wheildon. 1984. Communicating, or just making pretty shapes? a study of validity or otherwise of some elements of typographic design. Sydney: Newspaper Advertising Bureau of Australia. This section of the thesis has, except for a few minor alterations, appeared as a book review in Information Design Journal (Lund 1998).
This research on typeface ‘comprehensibility’ was based on a test
which involved 224 subjects (in two groups, one of which was a control
group) who had read magazine- or newspaper-like articles with content
of ‘direct interest’ to themselves. In addition, answers and comments
from the readers to ‘leading’ questions about their attitudes ‘were
collected for anecdotal rather than scientific value' (1984, pp. 8–9).
Texts set in the newspaper typeface Corona and the sans serif
typeface Helvetica were compared (pp. 18–20). The sensational results
showed that 67 per cent of the readers reached a high comprehension
level and 14 per cent a low comprehension level when reading the
material set in Corona. By contrast, only 12 per cent of the readers
reached a high comprehension level, and 65 per cent a low
comprehension level, when reading the material set in the sans serif
typeface Helvetica.
In addition, of the 67 readers who had reached only a low level of
comprehension when faced with the sans serif type: ‘53 complained
strongly about the difficulty of reading the type’, ‘11 said the task caused
them physical discomfort’, ‘32 said the type was merely hard to read’,
‘10 said they found they had to backtrack continually to try to maintain
concentration’, ‘5 said when they had to backtrack to recall points made
in the article they gave up trying to concentrate’, ‘22 said they had
difficulty in focusing on the type after having read a dozen or so lines’.
Wheildon stressed that when the same group afterwards read an article
in the serif typeface ‘they reported no physical difficulties, and no
necessity to recapitulate to maintain concentration’. In order to stress
the implications of his ‘findings’ Wheildon pointed out that this means
that if the body text of an advertisement in a magazine with one million
readers is set in a sans serif typeface, then ‘the message will be
comprehended thoroughly by only 120,000 of our readers’ (1984, p. 4).
Wheildon did not leave the question with this ‘empirical evidence’.
He extended his argument against the use of sans serif type for
continuous text, by referring to the (postulated) optical phenomenon of
irradiation which allegedly interferes harmfully with the perception of
print.318 He did this by including an oblique reference to R.L. Pyke’s
Report on the legibility of print prepared for the British Medical
Research Council in 1926; but without mentioning the author’s name or
including any bibliographical information:
In research collated by the British Medical Council [sic] in 1926, it was
asserted that the absence of serifs in sans serif body type permitted
what the council referred to as irradiation, an optical effect in which
space between lines of type intruded into the letters, setting up a form
of light vibration, which militated against comfortable reading. Serifs, the research said, prevented this irradiation; thus serif types were
easier to read. (1984, p. 18)
It was not the Medical Research Council or Pyke who ‘asserted’ or ‘said’
this about serifs and irradiation. What Pyke did, in his very
comprehensive review of legibility studies, was simply to include, in a
duty-bound and non-approving manner, coverage of six papers where the
authors on a non-experimental basis justified their preference for serif typefaces by arguing that serifs helped to prevent harmful irradiation (see Pyke 1926, pp. 21, 76, 99–101).
After this rather dubious appeal to the authority of past research,
Wheildon showed a reproduction of a magazine page set in a sans serif
typeface. He commented: ‘An attempt to read this, and comprehend it
thoroughly, will lend a measure of credence to the irradiation theory.’
(1984, p. 19).
Wheildon also vigorously, amusingly, and earnestly, argued against
the idea that the legibility of sans serif typefaces is a question of habit
and of people getting conditioned to them:
This is nonsense. It’s analogous to saying that instead of feeding your
children wheatie pops, you should feed them wood shavings. They'll get used to them and in time will learn to love them. (1984, p. 20)
No wonder Wheildon’s conclusion was unambiguous ‘body type
must be set in serif type if the designer intends it to be read and
understood’ (1984, p. 20).
318. ‘“Irradiation” is the apparent extension of edges of an illuminated object as seen
against a dark ground. A bright point appears bigger than reality, while a dark point
appears smaller’ (Watts and Nisbet 1974, p. 32).
Then, in 1995, Wheildon published Type & layout: how typography
and design can get your message across or get in the way. It is a
collection of research material formerly published in three separate
booklets. The content of his 1984 booklet constitutes a substantial part of
this new book, which claims to be ‘an international benchmark study on
the effects of typographic elements on the reader’ (p. 244).319
If we are to believe Wheildon, his research has been ‘adopted by the
New South Wales State Government, and influences design standards
for all new state legislation aimed at improving comprehension in
Australia’ (p. 244).
It is an interesting publication, not only for its dubious content, but
also for its persistent appeal for external praise and legitimacy. In fact, this
becomes excessive. The eight pages of praise at the front of the book, as well
as a foreword and the editor’s introduction, are nothing short of hilarious.
Among the persons praising the book on these pages are the famous
advertising specialist David Ogilvy, the distinguished graphic designer
Milton Glaser, the renowned graphic design consultant Jan White, a
university professor, an ex-university professor, the director of design at the
University of California Press, and the editor-in-chief and director of
magazine development at Hearst Magazines. Here are some samples:
Now there’s nothing left to argue about! At last, conjecture, tradition,
and hearsay have been replaced by the irrefutable logic of numbers and
scientific survey results.
It’s reassuring to find that the typographic truths I have practised
for years have some basis in hard evidence.
Hitherto designers have to rely on their guesses as to what works
best in choosing the typography and lay out. … Thanks to Colin
Wheildon, they no longer have to guess. No guesswork here. Only facts.
Unfortunately, many of the ‘rules’ of effective design and
typography I’ve offered have been based primarily on age-old truisms or
minor research studies. But Colin Wheildon has changed all that. For
the first time there is a body of statistically significant research to show
which techniques achieve the objective of all printed material:
maximum communication with readers.
It scientifically proves some long-held assumptions and makes
other surprising and important discoveries.
319. The material on sans serif versus roman typefaces is found in the book on pp. 19–24, 53–60.
This book reports the results of nine years of his hard-nosed,
rigorous research.
Wheildon himself contrasts his research with the former state of affairs
in the following way:
Texts on typography frequently allude to research into some of the elements to be examined, but, regrettably, discussion of this research is
usually anecdotal rather than empirical. (1995, p. 186) 320
Moreover, the author elsewhere in the book lists the names of four advisors, of whom three are university professors and one is a Dr and director of research (pp. 190–192). In addition Wheildon claims to have
sought the advice of research consultants and academics in the United
States, Britain, and Australia, and [I] submitted my proposed
methodology, and later the results, to them for reservation, comment, or
dissent. The consensus: that the study was both valid and valuable
(p. 35)
Among the people from whom Wheildon claims to have sought
advice are professor David Sless (the information design specialist),
David Ogilvy, professor Rolf Rehe (who once published a book on
legibility research), members of the academic staff at the University of
Reading, and members of the academic staff at the Royal College of Art
in London (p. 192). I find it highly doubtful that these persons were involved in Wheildon's work in the way that he implies, and I suspect that they would not authorise such use of their names.
Interestingly, Wheildon’s original 1984 publication, together with
one of the two other booklets on which Type & layout is based, recently
received approval in a book from the Policy Studies Institute in London:
Designing public documents: a review of research (Kempson and Moore
1994). In this book, commissioned by the Department of Social Security
and partly based on work funded by the British Library, Wheildon’s
booklets are the most frequently cited sources of recent research on
typography. At least twelve references are made to Wheildon’s research:
on sans serif versus roman type, justified versus unjustified text, lower
case versus upper case, the use of bold type, and the use of italic type
(Kempson and Moore 1994, pp. 51–56, 283–286). Statements like: 'Most
320. Also in Wheildon 1984, p. 8.
readers found it difficult to hold their concentration when reading the
sans serif text’ are offered without reservation. The authors’ statement
‘We hope … that we have covered all the significant research’ (p. 1), is
depressing in light of what a close reading of Wheildon’s work reveals.
It is worth asking whether this lack of discrimination is unique to
Kempson and Moore’s book, or is symptomatic of how ‘research findings’
often get summarised. Karen Schriver, assessing ‘the research literature
on typography’, in her impressive Dynamics in document design, more or
less uncritically includes two references to Wheildon's book (Schriver 1997, pp. 274–278). One is to Wheildon's research on roman and sans serif typefaces. Furthermore, in a review of the book in the journal Technical Communication, it is recommended
to all technical communicators who are interested in making sure that
their content is delivered as comprehensibly as possibly. Furthermore,
I encourage them to use it as a starting point for further research and
consideration of typographic design. (Skarzenski 1996, p. 426)
At the time of its publication Wheildon’s book received coverage in
the American typography magazine x-height (1995). The irony is:
whereas this unpretentious non-scholarly magazine admirably treats
Wheildon with tongue in cheek, two books on information design, as well
as the journal Technical Communication, reveal difficulties in
discriminating between seriousness and blatant outrageousness.321
Van Rossum 1997*
In 1997, a typeface legibility study appeared in the Dutch/Belgian
scholarly journal Quærendo (a leading journal within the field of
analytical bibliography and printing history). The author, Mark van
Rossum, a scientist in theoretical physics, with a doctorate from the
321. Wheildon’s book has produced many more reactions, and among the ones I have come
across, only positive, except for a contra and pro exchange (of rather unsubstantial)
letters to the editor in the American typography magazine Serif (no. 5, 1997, p. 7;
no. 6, 1998, p. 4).
* Mark van Rossum. 1997. 'A new test of legibility'. Quærendo, vol. 27, no. 2, pp. 141–147.
University of Utrecht,322 claimed to have developed a brand new and easy-to-perform method for measuring legibility; a method 'more accurate and more objective than traditional tests' that also 'clarifies the … the function of serifs'. What's more, according to the author, the method is based on nothing less than 'the reading process and the properties of the human eye'. In other words, these are bold claims, put forward by a scientist in a highly regarded scholarly humanities journal.
Van Rossum acknowledges that an operational method like 'speed of reading' has seldom demonstrated any significant differences in legibility between typefaces in general or between serif and sans serif
typefaces. However, he states that it is ‘still very important to know
which typeface is the most legible, even if the differences are slight’. The
reason for this, according to van Rossum, is that the more legible
typefaces can be used in smaller sizes, and thus yield ‘massive savings in
printing and paper costs’.
Van Rossum has devised an operational method where images of text scanned into a computer are blurred by a Gaussian function. The blurred images are inspected visually by the researcher in order to determine the relative legibility of the typefaces involved.323 The theoretical construct behind his method is based on the fact that when we read, in each fixation only a limited number of characters are seen sharply in the visual field, and that the eye's acuity is somehow increasingly diminished outside the centre of fixation (both within and outside the visual field). Against this background van Rossum theorises that
letters have to stay legible also at the edges of the visual field (where
they are increasingly blurred), in order to allow a perceptual span as
large as possible. Van Rossum’s blurred text images are thus claimed to
simulate text as it is perceived around the edges of the visual field.
The typeface that best withstands the detrimental effect of the blurring, in terms of the number of letters that remain recognisable, will be the most legible.
322. Quærendo, vol. 27, no. 2, 'Notes on contributors'.
323. The details about the practical implementation of van Rossum's test procedure do not come through clearly. This is partly due to inconsistencies between what the illustrations show, what the captions say, what the main text says and what examples in the main text say.
Note that all typeface sizes are normalised according to the x-height, the
dominant dimension of a typeface.
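To make the operational procedure concrete, the following minimal sketch (in Python) illustrates the kind of Gaussian blurring van Rossum describes. It is not a reconstruction of his actual implementation; the file names, the blur radius, and the use of the Pillow and scipy libraries are my own assumptions:

import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def blur_text_image(path, sigma=2.0):
    # Load a scanned text image, convert it to greyscale and blur it
    # with a Gaussian function of (assumed) standard deviation sigma.
    grey = np.asarray(Image.open(path).convert('L'), dtype=float)
    return gaussian_filter(grey, sigma=sigma)

# Hypothetical sample images, one per typeface, assumed to have been
# normalised to equal x-height before scanning (as van Rossum normalises sizes).
for sample in ['times_sample.png', 'helvetica_sample.png']:
    blurred = blur_text_image(sample)
    out = Image.fromarray(np.clip(blurred, 0, 255).astype(np.uint8))
    out.save(sample.replace('.png', '_blurred.png'))
    # The blurred images are then inspected visually to judge how many
    # letters remain recognisable for each typeface.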
Interestingly, this legibility study involves not only 'standard' serif and sans serif typefaces, but also the new serif typeface Gulliver,
designed by the author’s fellow countryman, typeface designer Gerard
Unger. The study was actually ‘prompted by the arrival of Gulliver, a
new typeface that has been designed to be more legible than the common
typefaces, meaning that it can be set smaller’.324
The results: serif typefaces were more legible than sans serif typefaces. Gulliver was deemed to be 5.7% more legible than the serif typeface Times, and 7.7% and 8.9% more legible, respectively, than the two sans serif typefaces Helvetica and Argo. A cynic would claim that this research, whether intentionally or not, can easily be mistaken for a not too subtle marketing ploy for the typeface that best fits the test parameters, namely Gulliver.
Nevertheless, not only does van Rossum’s study resemble a
marketing ploy, while being immersed in over-confident rhetoric. His
test and test-results are based entirely on unsubstantiated assumptions.
It is correct that the eyes only fixate on a limited number of letters
during a fixational pause, and that the surrounding letters are somehow
looking increasingly diffuse outside the centre of fixation. However, van
Rossum provides no evidence that suggests that the letters around and
outside the edge of the visual field appear blurred in the same or even
similar smeary way in real human vision. And even if there to some
extent should exist a similar blurring effect, then there is no guarantee
that another similar mathematical blurring function (or an optical filter
of one kind or another) would produce the same results.325
Subsequently the author comes up with something which at first
sight promises to qualify the bold claims about his method’s excellence:
He suggests that the test he has demonstrated is only ‘a first step on the
way to a wholly objective test of the legibility of different typefaces’.
However, it is soon revealed that by this statement van Rossum does not
324. Gulliver has large counters and a relatively large x-height.
325. Compare, for example, the fact that although the author states that Helvetica's ascenders 'soon disappear in the "mist" ', he also admits that Helvetica 'benefits from its clean style' in some letter combinations.
refer to the basic quality of his method, but rather to an aspect of its
technical implementation. He suggests that in the future, computers
using neural networks for executing the visual inspection of his test
procedure ‘will make the test truly objective’. And this is not all; at this
stage the author suggests that type designers, in order to ensure
maximum legibility of their end product, can use his easily implemented
test during the design process.
To conclude: this study, based on unsubstantiated assumptions and couched in categorical and over-confident formulations, yet published in a highly respected scholarly journal, is 'truly' a suspect piece of legibility research.
6 Discussion: knowledge production and technical rationality
A comment on the reviewed studies
As suggested in the introduction, to synthesise or summarise the face
results of the reviewed studies (on the relative legibility of serif and sans
serif typefaces) would represent a betrayal of the intention and approach
of this thesis. This thesis has never been meant as yet another legibility study, only this time in the format of a comprehensive review. On the contrary: the thesis is preoccupied with the production and validity of knowledge that is intended to be applied in typography and information design. The thesis is meant as an epistemological study about the history of legibility research, all the while documenting one 'genre' of legibility research and using that 'genre' as a lens.
However, one thing can be said for sure: The face results are not
conclusive. Some studies found serif typefaces most legible, some studies
found sans serif typefaces most legible, while some studies did not find a
difference. The tremendous effort that has gone into producing the 72
typeface legibility studies,326 based on a wide variety of rationales and of
operational methods, has not resulted in an incremental body of
326. Note that the actual number of publications is bigger, since many of these studies
appear in two or three guises; for example as a thesis, in an academic journal, and in a
printing trade journal.
knowledge. Nor has it resulted in any clarifying theories. By some
arguments, we are no further than we were in 1886.
By assessing the empirical studies one by one, focusing in particular on one assessment criterion (internal validity, the sine qua non of experimental research), the previous chapter has demonstrated that the overall state of past as well as contemporary comparative research on typeface legibility is not too impressive.
However, the quality is uneven. On one end of the scale we find Tinker
and Paterson’s relatively sober studies (and impressive research
program). Their studies certainly have faults, but can nevertheless be
situated in an historical context dominated by positivist views of social
science. Their research program seemed timely and it represented a
laudable effort by social scientists to provide a scientific basis for
typographic designing. It is therefore ahistoric to dismiss the work of
Tinker in the way that David Sless does: ‘His [Tinker’s] work
consistently reflects a choice of inappropriate units of analysis.’ (Sless
1981, p. 165).
On the same end of the scale as Tinker we also find studies that are in many ways interesting, with modest claims, like English 1944, Zachrisson
1965, Harris 1973, Vanderplas and Vanderplas 1980, and de Lange et al.
1993, which are all relatively sober (but nevertheless not without faults
and inherent problems).
On the other end of the scale we find studies of dubious value which are nonetheless categorical and self-confident, like Robinson,
Abbamonte and Evans 1971, Taylor 1990, Silver and Braun 1993, and
van Rossum 1997. At the extremity we find studies that are nothing but
outrageous, like Burt 1959 and Wheildon 1984. Nevertheless, as I have
shown above, these are frequently and indiscriminately cited in the
contemporary discursive realms of ‘document design’ and ‘information
design’.
Domain knowledge
It has become clear that the lack of internal validity reflects
inappropriate manipulation (or lack of manipulation) of the typographic
stimulus material employed. This, in turn, is to a large extent based on
a lack of adequate domain knowledge (of typographic design).327 However, to speculate that 'more' domain knowledge could have weeded out all serious problems with regard to internal validity is probably going too far. There are many possible pitfalls and many more validity
problems and reliability problems than the internal validity problems
posed by the quality of the typographic stimulus material. This last point
is indicated by the three studies mentioned below.
Some of the authors have in their studies actually demonstrated
crucial domain knowledge. That is, an awareness of the necessity to
manipulate typographic variables that might function as confounding
nuisance variables so that they do not vary systematically with the independent experimental variable, with a potentially detrimental effect on the internal validity of the studies in question. I refer specifically to
English 1944, Poulton 1965, and surprisingly, Burt 1959. Interestingly,
this ‘employment’ of target domain knowledge seems to correlate with
extensive contact between researcher and typographic expertise.
Although the study by English (1944) has its limitations (t-scope
tests of newspaper headlines), it is in many ways a sober study, and it
contains among other things a good literature review and a thorough
discussion of theoretical and operational criteria of legibility. More
importantly, the author was in close contact with typeface experts,
namely the famous type designer Frederic Goudy and Douglas
McMurtrie, director of the Ludlow Typograph Company. I assume that
this contact has had something to do with English’s careful manipulation
of interacting typographic variables, in order to avoid confounding of the
results.328
327. See for example my review of Taylor 1990, in chapter 5.
328. See the section ‘English 1944’, in chapter 5.
Although Poulton’s study (1965) suffers from severe weaknesses, it
is balanced (as many sans serif as serif typefaces are included, and each
category is represented by three stylistic idioms). He adjusts for the
dominant x-height dimension by varying the nominal point size from
typeface to typeface, and he also intervenes on the ratio of size to
interlinear spacing by varying the interlinear space. And like English,
Poulton co-operated closely with ‘typographic advisors’, of whom at
least one is a trained typographer.329
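To make the nature of this adjustment concrete, the following sketch (my own illustration, using hypothetical x-height ratios rather than Poulton’s actual figures) shows the simple proportion involved in equalising x-heights by varying the nominal point size:
\[
p_B = p_A \cdot \frac{r_A}{r_B}
\]
where \(p\) is the nominal point size and \(r\) the ratio of x-height to body size. With the assumed values \(r_A = 0.50\), \(r_B = 0.45\) and \(p_A = 10\) pt, typeface B would have to be set at \(p_B = 10 \times 0.50 / 0.45 \approx 11.1\) pt for both faces to show an x-height of about 5 pt.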
Burt’s (1959) otherwise discredited publication reflects a similar
sensitivity by emphasising the lack of correspondence between nominal
type size and actual appearance size, and he also emphasises the
importance of the x-height. It is reasonable to believe that Burt’s
extensive contact with the typeface expert Beatrice Warde of the
Monotype Corporation played a part in enabling him to articulate these
insights about his target domain.330
An extreme position
Perhaps the most serious ‘external’ objection to legibility studies is
expressed by the ‘peripherality of the reading process’ perspective.331
My review has revealed instances of an extreme opposite position
(explained in the next paragraphs). This position is represented by
Poulton (1965), Wheildon (1984), and Taylor (1990). To these three could
be added at least one study that has not been reviewed: Kunst (1972).
In order to determine the relative legibility of typefaces Poulton
bases his study on the speed of reading and comprehension. However,
Poulton’s comprehension measure is not merely a check, but is equally
weighted with the speed of reading measure. Poulton argues vehemently
for the appropriateness of his operational method and refers to it as ‘rate
of comprehension’. However, as pointed out above, to claim that rather
subtle differences in the design of ‘ordinary’ typefaces can be reflected in
the degree of comprehension (measured by answering questions after
having read a passage for a limited time) seems to this writer to be
wholly implausible. It certainly represents an extreme position compared
to the view that the impact of typefaces is very much peripheral to the
reading process.332
329. See the section ‘Poulton 1965’, in chapter 5.
330. See the section ‘Burt, Cooper and Martin 1955 / Burt 1959’, in chapter 5.
331. See the section ‘Peripherality to the reading process’, in chapter 4.
Taylor (1990) concluded that readers could be moved along with a
serif typeface and slowed down with a sans serif typeface. This
(sensational) conclusion was based on his interpretation of his
results, that readers prefer sans serif typefaces while they perform
better with serif typefaces. Taylor’s conclusion certainly represents, from
a perspective where typefaces are regarded as relatively peripheral to
the reading process, an extreme position.333
Accordingly, the typeface legibility study by Kunst (1972), which
relied solely on a comprehension measure, expresses an extreme position
on a par with Poulton’s and Taylor’s positions.
Observe that the introduction of the ‘peripherality perspective’
makes it much more difficult than before to have confidence in the
operational constructs and conclusions of the four studies just mentioned.
Negative knowledge
What can be said is that all the typeface studies assessed in this thesis,
except for occasional insights, at least represent a body of negative
knowledge. Many pitfalls have been revealed in my assessment, and
inappropriate approaches to typographic knowledge production have
been made clear.
The authors of the escalating number of typeface legibility studies
produced since the 1980s seem not to have taken notice of the many criticisms of
legibility studies. Furthermore, the insights reached, and warnings
uttered by people like R.L. Pyke (as early as 1926), Michael Macdonald-
Ross and Robert Waller (1977), David Sless (1981), as well as many
design practitioners,334 seem not to have had the impact they deserve.
332. See the section ‘Poulton 1965’, in chapter 5.
333. See the section ‘Taylor 1990’, in chapter 5.
334. See chapter 4, ‘Critiques of legibility research’.
Research that breaks with ‘received wisdom’
However, this does not mean that all legibility studies (of all ‘genres’) are
of the same doubtful value. Practical legibility studies that resemble
contextual usability tests are often less dubious. These studies might
suffer from problems, for example with regard to internal validity due to
the configuration of the stimulus material. However, external validity is
less of a problem here because these studies do not claim general
validity; and furthermore, construct validity is less of a problem, because
there is a rather direct relationship between theoretical and operational
definitions. I am thinking of studies like Christie and Rutley’s practical
letterform legibility study on distance visibility for the British Road
Research Laboratory,335 and practical typeface legibility studies on the
image degradation of typefaces created by fax-transmission, like Birkvig
1990 (not reviewed in the previous chapter).
Furthermore, as suggested in the material on eye movement
measures,336 there are approaches and studies that bring our
understanding forward, whether these studies express ‘truth’ or not.
There are also studies that go contrary to received wisdom. Such studies
can provide insights of great value, like Emile Javal’s finding that the
eyes move in saccadic jumps when we read. Of similar value, if not scale,
is for example the insight of Don Bouwhuis that elderly people with
partially impaired eyesight may need relatively large type to read with
comfort, in spite of the fact that this large type may actually slow down
the speed of their reading. In some studies, it seems that the researcher
has stumbled over valuable, or at least provocative, insights which are
secondary to the main aim. An example is the comparative typeface
legibility study by Prince (1967) on people with impaired vision, where
perhaps the most interesting and important result is the recommendation
that it is necessary to enlarge the punctuation marks in texts for
people with impaired vision.
335. See the section ‘Christie and Rutley 1961’, in chapter 5.
336. In the section ‘Experimental performance studies’, in chapter 2.
Similarly, Arnold Wilkins’ intriguing monograph Visual stress
focuses on pattern-sensitive epilepsy, headaches and discomfort. He has
discovered through his work that pattern-sensitivity, triggered by the
omnipresent stripes in our modern urban environment (think of clothes,
furnishing fabrics, grills, and computer screen flicker), is more prevalent
than earlier thought. Wilkins has also discovered that the stripy pattern
of text has the same negative effect on predisposed persons. He suggests
that one of the strategies that can be employed to reduce this problem is
to rely on liberal amounts of interlinear spacing and liberal amounts of
(inter) word spacing. The suggestion of employing liberal amounts of
interlinear spacing resonates (within limits) with expert typographic
advice; if not with the idiom of high modernist Swiss typography, where
the even ‘grey colour’ of text areas on a page could seem to be more
important than the visual articulation of each line of text (with ‘white
channels’ of interlinear spacing). However, the second suggestion, of
liberal (inter) word spacing, does not resonate with expert advice. The
rationale behind such an expert consideration is not only based on
aesthetic sensitivity, but also on ideas inspired by, or at least in accord
with, gestalt psychology about perceptual grouping (the distance
between lines should be markedly bigger than the distance between
words, etc).337 If only the interlinear spacing and (inter) word
spacing are increased, but not the type size, the pattern created by the
individual words of the text will visually ‘fall apart’. If the interlinear
spacing is kept relatively narrow while only the (inter) word spacing is
extended, as in an illustration of a preferred text in Wilkins’ book
(pp. 74–75), the result will be ‘rivers’ in the text, something that is
regarded as amateurish, ugly, and bad for legibility by typographic
designers.338 I am not saying this in an attempt to repudiate Wilkins’
argument. On the contrary, I want to point out that research can
sometimes come up with surprising results or suggestions that go
contrary to expert domain knowledge, and that such results may
represent the better solution (at least for special applications). What
I am trying to say (regardless of the validity of Wilkins’ suggestions) is
that it is exactly this role that research should play: to generate
knowledge that breaks with ‘received wisdom’ (cf. Bourdieu and
Wacquant 1992).339
337. For a study on perceptual grouping and typography, see Rivlin 1987.
338. For discussions on ‘rivers’ in typeset text, see Dowding 1966, pp. 4–7; and Rubinstein
1988, pp. 182–184.
There is of course no guarantee that studies that go contrary to
received wisdom are free of validity problems. The contrary is probably
most often the case. One such contrary example appears in a chapter on
‘words and symbols’ in David Oborne’s prominent textbook Ergonomics
at work (1987). On ‘line spacing’, while citing a study by Herman Bouma
from 1980, Oborne claims that ‘within reason, therefore, for long lines of
text the interline spacing should be reduced’ and further, that ‘with very
long text lines … the interline spacing would have to be extremely small’
(pp. 77, 78). This sensational claim, whether it derives from Bouma or
not, is certainly contrary to received wisdom. However, it is not only
contrary to received target domain knowledge, but also contrary to
research findings (on the typographic variables line length and
interlinear spacing). My a priori position is that Oborne’s (and Bouma’s?)
claim is nonsense. The fact that Oborne does not reveal any signs of
awareness of how sensational this claim is reinforces my suspicion that
there is some kind of misunderstanding involved, or alternatively, that
Bouma’s study, or its conclusions, are somehow invalid. Nevertheless,
I choose to interpret Oborne’s recommendations on appropriate
interlinear spacing as yet another instance of inadequate domain
knowledge (as well as inadequate knowledge of the relevant research
literature). In addition, it is an expression of how unreliable
recommendations can be when they are based on uncritical and
indiscriminate citations of whichever research ‘finding’ the collector
stumbles over.
Passing by other points of comparison, Mary Dyson and Gary
Kipping’s finding in the context of computer screen typography
(mentioned above),340 that long line lengths are better, can also pass as
a finding which is contrary to received wisdom. However, as pointed out
earlier, the fact that Dyson and Kipping only compared long lines to
short lines, and not to something in between (a point that can easily be
missed), suggests that caution should be exercised in interpreting their
results.
339. I choose deliberately not to describe the domain knowledge in question as ‘everyday
common sense knowledge’, because I think expert domain knowledge is at least one
level removed from ‘everyday common sense knowledge’.
340. See the section ‘Experimental performance studies’, in chapter 2, ‘The construct of
legibility’.
Nevertheless, communication between research domain and target
domain is helpful. This is exactly what Pierre Bourdieu, in An invitation
to reflexive sociology (1992), refers to as ‘participating objectification’
(briefly referred to above while expounding on my own methodological
strategy in the chapter ‘A review of empirical studies’). ‘Participating
objectification’ is what Bourdieu promotes as the most appropriate
methodological approach in social science research. Bourdieu not only
emphasises that it is impossible to establish watertight walls between
research on the one hand and everyday language and intimacy with the
‘subject’ on the other hand; he also emphasises that it is not desirable.
Bourdieu stresses the necessity of maintaining both intimacy with and
distance from the subject in order to construct a fertile object of study.
Bourdieu is thus eager to transgress the contradiction between
‘participant’ and ‘spectator’, or, phrased in Bourdieu’s own words,
between ‘reductive subjectivism’ and ‘reductive objectivism’. Nevertheless,
he also stresses that what is interesting from a research
viewpoint is the epistemological rupture in relation to the immediate
conceptions or doxa of the subject domain (as well as research reflexivity,
that is, the necessity to reflect on and aim at an epistemological rupture
with the doxa of the researchers’ own domain as well). Bourdieu sees the
systematic aspect of research as a major contribution to achieving
satisfying results (see Bourdieu and Wacquant 1992; as well as the
exposition in Solli 1998).
Conceptions similar to Bourdieu’s ‘participating objectification’ are
expressed in the British sociologist Anthony Giddens’ concept of a
‘double hermeneutics’ (1976, p. 158; 1987). Giddens emphasises that
concepts and insights flow back and forth between research and the
social sphere, and furthermore, that the social scientist depends on the
subjects’ own concepts and motives in order to construct an object of
study.
The question of contact and flow of information in both directions
between the domain of experimental psychology and the target domain of
typography and information design has been dealt with by Patricia
Wright (1978) and by Robert Waller (1979). In the context of language
comprehension Patricia Wright focuses on the contact between ‘pure’ and
‘applied’ research (while acknowledging that it is an oversimplification to
classify research as one or the other). However, she is also concerned
with the contact between research domain and the domain of designing
written communication (information design). Both Wright and Waller
emphasise the importance of mutual respect and the importance (and
difficulties) of bi-directional contact. Wright identifies three existing
viewpoints on the relationship between the domains in question. The
first viewpoint sees successful application as dependent upon advances
in basic knowledge; the second sees applied research as the ideal starting
point for advances in basic theories; while the third considers basic and
applied research as separate but equal. Wright proposes a fourth
viewpoint where ‘a flow of information from applied to pure and back to
applied could be a very fruitful pattern of interaction’ (p. 253). In
addition she also points in the direction of the alternative epistemological
strategy that has become known as usability testing: ‘There will
never be a substitute for pretesting written material empirically’
(p. 297). Furthermore, in a way similar to Michael Macdonald-Ross
(1978, p. 24), Wright also focuses on the problem of extracting applicable
recommendations from research, not least because research findings are
often based on comparisons of incompetently designed materials:
‘Knowing that flow charts are an improvement on legal language is no
guarantee that they are better than good prose’ (p. 297). However,
Wright also acknowledges the problem of not necessarily knowing
beforehand what is better or worse. Nevertheless, although there is no
safe way out of this dilemma, Wright emphasises that researchers
should
be concerned to optimise within formats. It is logically necessary to
know how to write good prose or how to design a good flow chart before
one can ask questions about the circumstances in which one of these
formats is more useful than the other. (p. 297)
and she also echoes Hillier et al.’s ([1972] 1984) and Macdonald-Ross and
Waller’s (1975) papers inspired by Karl Popper:341
In advancing an information flow from applied to pure and back to
applied it has been suggested that perhaps the most worthwhile
starting point is with applied solutions, rather than with applied
problems. (p. 302)
In a way similar to the positions of Bourdieu and Habermas on social science
research, Wright recognises the importance of taking into account the
‘everyday’ or ‘expert’ knowledge of the target domain before undertaking
research meant for application in the very same target domain in the
next round. However, she also acknowledges the problem that
The researcher who starts with a real problem, but extracts only some of
the variables to take back for examination in the laboratory, may very
well find difficulties in moving from the laboratory findings back into
the problem domain. (p. 252)
Thus, intimate contact with and knowledge of the target domain are
necessary in attempting to avoid many of the pitfalls of applied
experimental research. This acknowledgement resonates with the
‘findings’ of this thesis, that lack of adequate (target) domain knowledge
is responsible for faulty stimulus material, which leads to lack of
internal validity, the basic requirement of experimental research.
Shadow positivism
It is now several decades since positivism342 was repudiated as an
adequate philosophy for the social sciences. However, the empirical
approach of this thesis has provided substantiation that positivism
still lives on as a methodological platform for legibility research. This
‘finding’ resonates with recent observations on the state of sociology as
well as the state of psychology (see for example Bickhard 1992;
Wacquant 1993; Giddens 1996).
341. See the section ‘Changing paradigms: from legibility to usability’, in chapter 3, ‘A
century of legibility research’.
342. For a good exposition of positivism, criticism of positivism, and post-positivist theory,
see Outhwaite 1987.
The Norwegian sociologist Ragnvald
Kalleberg has suggested ‘shadow positivism’ as a description of this state
of affairs; that is, when positivism is left behind as a philosophical basis,
but still implicitly informs and permeates practical research (Kalleberg
1996; 1997). This ‘shadow positivism’ influences what are regarded as
valid questions to ask, what are regarded as valid empirical data, what
are legitimate methods to employ (controlled experiments and
quantitative measures), and is characterised by ‘technical rationality’ and a
prevalent ‘nomologism’. ‘Shadow positivism’ is realised through the
‘conventional view’ that only experimental research can generate valid
knowledge and that quantitative data are better than qualitative. As
Robert Waller poignantly put it in 1987 (cited above in chapter 4,
‘Critiques of legibility research’): the expression ‘nothing is known
about’ in the research literature means that no experiment has been
performed on this question (Waller 1987, p. 73).
Kalleberg also suggests that the tendency of ‘shadow positivism’
thrives best in immature ‘hyphen areas’ of the social sciences. This last
suggestion may very well fit the findings of this thesis. Most of the
typeface legibility studies performed during the second half of the
twentieth century do not belong to dominant or major areas of social
science such as psychology, sociology or economics, but rather to modest
and minor areas such as educational technology, journalism, and
ergonomics. Other reasons suggested by Kalleberg for this ‘shadow
positivism’ are a lack of even a minimum of adequate competence in the
theory and philosophy of science and social science within the social
sciences themselves, and furthermore, the rather abstract and
inaccessible character of much philosophy of science.
For the sake of order, note that a rejection of positivism does not
imply a general rejection of quantitative methods or statistical analyses.
Furthermore, opposition to narrow positivist conceptions of science,
and the insight that knowledge is constructed and contingent on culture
and social interest, do not imply that truth and knowledge do not
exist or cannot be achieved. Christopher Norris, while paraphrasing
the British (‘realist’) philosopher of science Roy Bhaskar (1986), asserts
that
Where the relativists err is in confusing ontological with epistemological
issues. Thus they take the sheer variety of truth-claims advanced (and
very often subsequently abandoned) down through the history of scien-
tific thought as evidence that no truth is to be had (Norris 1995, p. 111)
Operationalism
The ‘shadow positivism’ suggested above is also realised through the
prevalent operationalism that has been indirectly demonstrated in my
review of empirical studies: Find something that can be measured and,
voilà, ‘legibility’ is not only operationalised, but also defined.343 The old
saying about positivist science, ‘if you can’t count it, it doesn’t count’,
certainly describes prevalent notions exposed in this thesis. Furthermore,
it may very well be that exactly the easily invented operational
constructs within the frame of ‘legibility’ have distorted questions of
design which are of importance to designers and readers alike.
Interestingly, operationalism has in fact not only been criticised by
philosophers of science, but also by critics of legibility research from the
domain of typography and graphic design. The typeface designer and
retired director of British Linotype, Walter Tracy, perceptively hit the
nail of easy-going operationalism on the head in his 1988 book The typographic
scene:
Psychologists, and engineers like Lucien Legros, need to measure
things: cognition, parts of letters, anything that can be identified and
then evaluated by time or dimension. They make up a theory, devise a
testing procedure, tabulate the results, and draw a conclusion. (Tracy
1988, p. 83)
And R.L. Pyke’s complaint from 1926, in his extensive review of
legibility research, reminds us that much of early legibility research
lacked not only theoretical definitions but even operational definitions:
four times as many writers have measured legibility as have defined it.
Three out of every four writers have been attempting to measure
something the exact nature of which they have not paused to examine.
(Pyke 1926, p. 10)
343. The last study reviewed in the previous chapter (van Rossum 1997) can stand as an
example of an interesting and original idea about a particular measure of legibility
that turns into an uncritical reification of the same measure.
Joycelyn Chapman, in an article about ‘re-thinking research
into visual communication’ published in the late 1970s in an issue of
the design journal Icographic, expressed concerns similar to Walter
Tracy’s:
In complex ‘real’ life situations, speed of reading is only one, rather
unimportant factor amongst a mass of other influences. Yet it is the
main measuring tool used in legibility studies. It is the one human
factor of concern. Why? Partly because you can easily measure it. Time
can be quantified in minutes or seconds. How do you quantify interest?
How do you discover deep motivations affecting reactions to print?
(Chapman 1978, p. 28, my italics)
And Donald Norman, in the context of human–computer interaction,
recently formulated the same concern in the following way:
Unfortunately, things we can measure and quantify, if not selected with
care, can be things that don’t have great practical value. (Norman 1995,
p. 35)
Nevertheless, easy-going ‘operationalism’ is not confined to legibility
research and human–computer interaction. Experimental psychology in
general has also been described in a similar manner, for example by the
psychologist Paul Kline (in a polemical book-length essay on the state of
psychology):
the demand for precision has led empirical psychologists to choose
variables because they can be so measured and not for reasons of
theoretical psychology. This has led to research into trivial issues.
(Kline 1988, p. 222, my italics)
And furthermore, as indicated above,344 the emerging criticism of
positivism in the 1960s focused among other things exactly on easy-going
use of operational definitions in social science research, including
psychology. So, although Kalleberg’s ‘shadow positivism’ may fit as a
descriptive label of contemporary legibility research, it would be wrong
historically to regard legibility research as a ‘shadow’ exception, that is,
as a poor and peripheral relative of ‘flawless’ and dominant social science
research domains (or, for that matter, natural science research domains
like vision research and ophthalmology).
344. See the section ‘Operational definitions’, in chapter 2.
Translation of findings
It goes without saying that the ultimate aim of applied research on
legibility (or, for example, humancomputer interaction research) is to
improve the design of informational artefacts. However, as pointed out
above,345 the problem is not only one of translating ‘findings’ from
experimental research with great care to the practice of typographic
design (or ‘information design’ or ‘human–computer interaction’),
especially not when experimental validity cannot be taken for granted in
the first place. There are frequent instances in the information design literature
(and even more specifically in the ‘document design’ literature) where
the application of findings of experimental research is reduced to a
problem of translation and access for design practitioners. See for
example Mary Beth Debs’ article ‘A history of advice: what experts have
to tell us’ in the collection Effective documentation: what we have learned
from research (Debs 1988, pp. 11–23); Paul Buckley in the article
‘Expressing research findings to have a practical influence on design’ in
the collection Cognitive ergonomics and human–computer interaction
(Buckley 1989); and Robert Krull’s thoughtful and thorough article
‘What practitioners need to know to evaluate research’ recently
published in a special issue on the topic of ‘Bringing communication
science to technical communication – advancing the profession’ in
IEEE Transactions on Professional Communication (Krull 1997).346
345. See the section ‘Robinson, Abbamonte and Evans 1971’, in chapter 5.
346. In the introduction to this special issue, it is revealed that ‘Technical communicators
who base their recommendations on established research are able to support their
recommendations better and over time, establish better credibility with the scientists,
engineers, and technical specialists with whom they work.’ (Grove and Zimmerman
1997, p. 158).
Interestingly, Krull employs Miles Tinker’s legibility research as
example material in his article. Krull’s paper is an interesting and
thorough exposition of all the precautions that ought to be taken when
applied research is to be interpreted and utilised by practitioners.
Nevertheless, Krull’s underlying epistemological assumption is less
pragmatic. As an example he asks (in the opening paragraph): ‘Should
practitioners set body copy in serif type?’. He continues: ‘Practitioners
could answer such questions without themselves conducting research if
they had methods of finding, evaluating, and applying existing social
science research’ (p. 168). Thus, in Krull’s universe, there exists only one
epistemology, that is, one based on experimental research. Either you do
the experimental research yourself,347 or you find the right answers by
(critical) reading of research papers.
Certainly, if the research described in this thesis on the relative
legibility of serif and sans serif typefaces is representative of legibility
research, we may polemically conclude that the existence of the ‘problem’
of translation and access represents an advantage rather than a
problem.
Parallels between legibility research and human–computer interaction research
There are parallels between the critiques of legibility research that have
been described in this thesis (as well as the critique constituted by this
thesis), and a similar critique from the last fifteen years of psychological
human–computer interaction research. The growing field of human–computer
interaction (HCI) is, like information design, a research and
practice domain, and information design and HCI overlap each other.
HCI has ‘inherited from software psychology and human factors both a
laboratory-based, experimental program and a strong belief that such a
program could alone answer whatever question applied and basic
computer and human interaction might arise’ (Nyce and Löwgren 1995,
p. 37). Critiques of this belief have been articulated by several
researchers, most notably by John Carroll and Thomas Landauer.348
347. Note that Krull is not talking about pragmatic ‘on-site’ usability testing, but ‘applied’
research on less context-dependent questions.
Landauer expresses a well-articulated scepticism of borrowings from
psychology, and among these borrowings
we are sceptical of the practical value of classical comparative
experiments to determine the relative merits of two or more features or
systems and conventional testing of significance. (Landauer 1997, p. 224)
and the reasoning behind this scepticism echoes earlier critiques of
legibility research
Because the parts can interact in subtle and unpredictable ways, it is
not sufficient to study them in isolation. (Landauer 1997, p. 224)
And not unexpectedly, Landauer points to ‘human performance analysis’
and ‘formative evaluation’ as epistemic alternatives to experimental
laboratory research. John Long, who in an article is preoccupied with
specifying relations between research and the design of human–computer
interactions, concludes that ‘in spite of that [human–computer
interaction] research would be expected to support design’ and ‘the
optimism that pervades the proceedings of the CHI-conferences’, ‘little of
the knowledge has been incremental’ and ‘effectiveness in supporting
design has not been generally demonstrated.’ (Long 1996, pp. 878–880).
Epistemic alternatives to legibility research
Donald Schön describes the ‘conventional view’ of ‘technical rationality’
as prevalent within most professions. That is, researchers are perceived
as producers of valid knowledge that should be applied by practitioners
(Schön 1983). Schön is extremely sceptical of this view, and he claims
that it creates a wrong picture of the rich knowledge of practitioners.
348. See for example Carroll and Campbell 1986, 1989; Carroll 1990; Landauer 1991;
1997. For valuable meta-commentaries on HCI research and HCI rhetoric, see Nyce
and Löwgren 1995, and Cooper and Bowers 1995.
The view of ‘technical rationality’ is most often implicit, but
sometimes it is made explicit. Frank Smith, the editor of Technical
Communication, the journal of the Society for Technical Communication,
has explicitly expressed this view in the following way (which also reveals
a second, instrumental reason why practitioners should utilise
research: in order not to be challenged by clients on the choices taken):
I submit that we must change our habitual approach to our jobs.
Typically we work on the basis of intuition and folklore, and when a
client asks us why we want to change his expression or his table or his
organization, our only answer is that we think it’s more effective our
way. The client is perfectly justified in that case to say that he thinks it
isn’t. We need to be able to say that experimental research has proven
conclusively that our recommended approach is superior. And if we are
to do that, we must learn what has been proven and who is doing the
work and where it is being published. And those of us who have the
proper training and bent of mind and circumstances must begin or
accelerate controlled experiments designed to test the old saws and
establish new truths (Frank Smith in an editorial of Technical
Communication, p. 5, 4th quarter, 1985; quoted by Doheny-Farina 1988,
pp. 2–3, Doheny-Farina’s italics)
And similarly, from T. Brooks of IBM a couple of years later, in the same
journal, Technical Communication:
most [technical communicators] would probably agree that text set in all
uppercase letters is harder to read than mixed-case text. That a well-
designed serif type is easier to read than sans serif. But are you really
sure why, or do you just know that? If you are challenged on a question
like that, it helps to be able to back up your opinion with published
research results or studies. (Brooks 1991, p. 183; quote taken from
Campbell 1995, p. 4)
However, this thesis has so far substantiated that research on the
relative legibility of serif and sans serif typefaces, in spite of the large
number of studies carried out, has not been incremental or conclusive. It
has also been indicated that traditional legibility research, regardless of
genre, has been a blind alley that probably should be abandoned the
sooner the better. If that is the case, it might be appropriate to ask
whether there exist alternative and more promising epistemologies, that
is, alternative epistemologies for research that is meant to, or can, guide
practice.
As suggested in chapter 4, ‘Critiques of legibility research’: an
epistemic alternative that has been advocated by Michael Macdonald-
Ross and others is to focus on knowledge in the design domain, that is,
the ‘tacit knowledge’ of expert practitioners, and to grant designers’
‘informed reasoning’ higher status. However, my exposition in the same
chapter has shown that the explanatory potential of the concept of ‘tacit
knowledge’ has its limits. But again, design domain knowledge is more
than the personal tacit knowledge and contextual informed reasoning of
expert practitioners; it is also ‘the rich and diverse body of knowledge …
[and] current of critical reflection on practice’ that typography conceals
‘not far below the surface’ of its ‘underdeveloped literature’. This last
point has been demonstrated by Paul Stiff in his instrumental case study
‘The end of the line: a survey of unjustified typography’ (1996a).
Also suggested in chapter 4: usability testing (and other contextual
evaluation methods) represent pragmatic and powerful epistemic
alternatives to experimental laboratory research. Observe that usability
testing has seen a tremendous growth within the areas of ‘information
design’ and ‘human–computer interaction’ during the last two decades.
As suggested above in this chapter (in the section ‘Domain
knowledge’), there is also a certain potential for improving experimental
research. This is suggested by the argument that inadequate domain
knowledge is in the last instance responsible for the lack of validity in
much existing research. Thus, adequate domain knowledge will yield
better research. A potential for such improvement is also suggested by
Patricia Wright’s emphasis on bi-directional contact between research
domain and design domain. Michael Macdonald-Ross takes for granted
the potential for improved research based on adequate domain
knowledge:
There have been few empirical studies of total designs. Such studies require
researchers who are prepared to grasp the basics of typographic design and
the manufacturing processes. (Macdonald-Ross 1994, p. 4691)
The epistemic alternative of ‘design-based theory’
John Carroll, who in the contexts of document design and human–computer
interaction has developed a powerful critique of much
psychological research (including information-processing theories) meant
to service the practical development of user interfaces (e.g. Carroll and
Campbell 1986), has suggested ‘design-based theory’ as an epistemic
alternative to the technical rationality represented by ‘the conventional
view’ of ‘theory-based design’ which ‘doesn’t … seem to work’ (see
Carroll 1990, pp. 277–284; and Carroll and Campbell 1989).
Carroll’s quest for ‘design-based theory’ is partly informed by
Patricia Wright’s focus on the need for a bi-directional flow of
information (see Carroll 1990, p. 283). However, the quest for ‘design-
based theory’ goes beyond Wright’s proposal. Carroll claims that
There is a conventional view of the relationship of scientific research
and the invention, design, and development of practical artefacts. The
idea is that basic science provides understanding of nature that can be
applied deductively in practical contexts. This idea has been rather
thoroughly absorbed, at least by scientists, and is familiar to anyone
who have studied science. (Carroll 1990, p. 277)
Against this ‘conventional view’ Carroll points out that technological
inventions often vastly predate their own scientific analysis.349 As one of
several examples from the history of technology and from the history of
human–computer interaction techniques, he mentions that
current practice in designing texts far outstrips what can be grounded
in the basic psychology of text comprehension (Carroll 1990, p. 281)
and he continues
… Such inversions of theory-based design cannot be understood in the
conventional view. Their resolution lies in a different view of the
relation between science and design, one that takes these artifacts and
the process of invention and development that produces designed
artifacts more seriously. We refer to this view as design-based theory.
(Carroll 1990, p. 281)
Carroll stresses the potential of focusing on, and extracting
knowledge from, existing artefacts. This is possible because designed
artefacts embody implicit theoretical claims. However, artefacts develop
in a ‘task-artifact cycle’ and user interface design depends on understanding
this ‘ecology of tasks and artifacts’. In order to reach such
understanding, several things are needed: a competent understanding of
domain details; scientists who can provide conceptual guidance in
design; and tools that expose the theoretical and psychological claims
embodied in designed artefacts.
349. ‘Optical scaling’ in typeface design may represent such an invention. See my excursus
on ‘optical scaling’ in the section ‘Experimental performance studies’, in chapter 2.
Once HCI is understood in terms of an ecology of tasks and artifacts, it
becomes clear that the conventional views have been looking for
psychology in the wrong place. The psychology in HCI is not to be found,
for the most part, in laboratory methods and information-processing
theories; it is to be found in the user interfaces, the artifacts, that HCI
researchers build and evaluate. (Carroll and Campbell 1989, p. 254)
Carroll’s position may be brushed off as wishful thinking, but if we
transfer Carroll’s concern to text design and typography (ranging from
concerns about comprehension to traditional legibility concerns), we may
very well realise that there already exist examples of systematic inquiry
where design and artefacts are taken seriously on their own terms (as
opposed to instances of design enquiry where design in its own right is
made invisible and is seen through the lenses of more dominant and
colonising fields of enquiry like experimental psychology, art history,
sociology, cultural studies, or literary theory).
I can think of several examples of highly relevant and systematic
scholarly investigation on the function and usability of typographic
artefacts where the focus is on the relationship between ‘content’ and
‘form’, that is, investigations that focus on artefacts on the terms of
design. Take for example a series of papers by the book historian Margaret
Smith on early printed books, where usability and the needs of the reader
are strongly present, although in an implied manner (with ‘textual
articulation’ as an important concept). In contrast to many book
historians and analytical bibliographers, Smith consistently focuses on
the relationship between content and form, as well as the highly relevant
role of navigational and other information-technological devices of the
book (see for example Smith 1983, 1987, 1993, 1994). Richard Southall
has in the context of ‘electronic publishing’ produced a series of highly
relevant theoretical papers that consistently focus on the relationship
between content and form, and thus the artefactual interface between
‘content’ and the reader (see for example Southall 1988, 1989, 1992).
Michael Twyman has produced a series of historical and theoretical papers
on typography that have a similar inclination (see for example Twyman
1979, 1982, 1986, 1993). And Robert Waller has produced a series of papers
where the focus is on ‘the typographic contribution to language’, that is,
on the relationship and border zone between writing as a
linguistic medium and writing as a graphic medium (see for example
Waller 1980, 1982, 1987, 1991). What all these authors have in common
is a will to employ a combined empirical and interpretive approach that
very much focuses on design in its own right, and not through the lens of
for example art history or cultural studies. Furthermore, not only are
insights produced by these studies, but also a conceptual vocabulary
that in the last instance has a potential for guidance in practical design.
It is important to realise that although such preoccupation with the
artefact may signal a turn away from reader-centred design and back to
an earlier traditional preoccupation of design (and design education)
with the artefact and its formal visual structures, this need not be the
case. Rather, the artefact must be understood not as a thing, but
as an interface that negotiates between content and ‘message’ on one
hand and the reader and his/her needs, capacities and intentions on the
other hand.350 If we look at informational artefacts in the way suggested
by Carroll, we realise that by focusing on the artefact, the reader need
not be forgotten, but rather, that concern for the reader can be taken
care of implicitly instead of by explicit rhetoric.
Ideas similar to Carroll’s have been tentatively expressed by Nigel
Cross (1999), and obliquely by Clive Dilnot (1999), on the
question of what ‘design research’ could or should be. Nigel Cross echoes
Bill Hillier et al. ([1972] 1984), and Michael Macdonald-Ross and Robert
Waller (1975),351 when he stresses that design knowledge resides in
‘products themselves’:
Much everyday design work entails the use of precedents or previous
exemplars not because of laziness by the designer but because the
exemplars actually contain knowledge of what the product should be …
knowledge implicit within the object itself of how best to shape, make,
and use it. … we would be foolish to disregard or overlook this informal
product knowledge simply because it has not been made explicit yet;
that is a task for design research. (Cross 1999, p. 6)
350. An abstract ‘logical’ text structure may suggest six chapter levels; and similarly, an
orderly ‘logical’ organisation of content for a web-site may suggest a very deep
hierarchical structure. However, the artefactual and visual instantiation has to take
into account that it may be difficult for readers to discriminate in a meaningful way
between six chapter levels, and unnecessary or even devastating to have to navigate
through an excess of hierarchical levels in a website.
351. See the section ‘Changing paradigms: from legibility to usability’, in chapter 3.
And Clive Dilnot stresses the importance of focusing on the
artefact, especially in a climate of ‘artefact denial’. As he argues:
artefacts are seen everywhere, but remain ‘invisible’, not least in the
university. However, Dilnot’s emphasis on the artefact is not made
without at the same time expressing a warning about ‘the fetishistic
“distraction” [of the artefact itself] that on occasion so blinds design to its
own consequences and implications’ (Dilnot 1999, p. 89).
7 Conclusion
The thesis effectively reveals that nearly all of the 28 studies which have
been reviewed (of a surprising total of 72 identified studies) lack internal
validity (the intra-paradigm sine qua non of experimental research). It is
shown that this lack of internal validity is largely due to confounding
factors that reside in the stimulus material, in the last instance caused
by the researchers’ inadequate domain knowledge (about typography).
Other methodological flaws are also revealed.
The tremendous effort that has gone into producing all these
typeface legibility studies, based on a wide variety of rationales and
operational methods, has not resulted in an incremental body of
knowledge. Nor has it resulted in any clarifying theories. The reviews of
the individual studies provide a thorough empirical substantiation of
criticism that has been raised only in a sweeping manner at earlier
occasions. The thesis shows that traditional experimental legibility
research has provided a non-productive approach to typographic
knowledge production.
However, the thesis has revealed that an increasing number of
comparative typeface legibility studies have been carried out during the
last two decades. This stands in contrast to prevailing notions that
legibility research largely vanished in the early 1980s, or
alternatively, that ‘too few’ legibility studies have been carried out
during the last two decades. What is more, the thesis has also shown
that dubious and even seriously flawed legibility studies are frequently
and indiscriminately cited in the contemporary discursive realms of
‘document design’ and ‘information design’.
It is suggested that much of contemporary legibility research is
naive (far from representing ‘superior knowledge’) and that it can be
described as an expression of ‘shadow positivism’ and easy-going
‘operationalism’, thriving in the ‘hyphen areas’ of the social sciences,
where advances in the philosophy of science made decades ago seem not
yet to have been fully recognised. Epistemic alternatives to traditional
legibility research have also been discussed.
As well as being an idiographic historical study in its own right,
and primarily contributing knowledge to a self-reflective conversation in
typography and information design, the thesis also contributes knowledge
of value to design history, design epistemology, reading research
history and social science history. The thesis has, by its empirical
approach, acknowledged the idea that epistemological studies (or science
history for that matter) should focus on what researchers actually do,
and the works that are actually delivered, rather than on idealised
conceptions of what research is.
References
Bibliographical details about cited letters to the editor of newspapers,
magazines and journals are most often given in footnotes only, and thus do not
appear in this bibliography. The same applies to cited correspondence, cited
archival documents, and electronic bibliographic databases. URLs were correct
at the time of writing.
The format ‘[Year] Year’, as in ‘[1949] 1973’, indicates that the document
was first published in 1949, but that the edition cited was published in 1973.
A
——————
Adams, Sarah, R. Rosemier, and P. Sleeman. 1965. ‘Readable letter size and
visibility for overhead projection transparencies’. Audiovisual
Communications Review, vol. 13, pp. 412–417
Adobe. 1989. ‘Face fax’. Font & Function, Fall 1989, p. 12
Aldersey-Williams, Hugh, Lorraine Wild, Daralice Boles, Katherine McCoy,
Michael McCoy, Roy Slade, and Niels Diffrient. 1990. Cranbrook design: the
new discourse. New York: Rizzoli
Amachree, Tomlinson K.P., Hubertus L. Bloemer, and Bob J. Walter. 1977.
‘Typographic legibility on maps: a comparative study’. Bulletin of the
Society of University Cartographers, 11, pp. 27–39. (Probably based on
Tomlinson Krakrayemo P. Amachree. 1975. ‘Typographic legibility on
maps: a comparison between Sans-serif (Gill) and Serif (Times Roman)
type’. MA thesis. Ohio University)
André, Jacques. 1993. Contribution à la création de fontes en typographie
numérique. Rennes: l’université de Rennes 1, Institut de Formation
Supérieure en Informatique et en Communication. (On the cover it says:
Jacques André. Création de fontes en typographie numérique. Documents
d’habilitation. Rennes: IRISA and IFSIC)
André, Jacques, and Irène Vatton. 1994. ‘Dynamic optical scaling and variable-
sized characters’. Electronic Publishing, vol. 7, no. 4, pp. 231–250
Arditi, A., K. Knoblauch, and I. Grünwald. 1990. ‘Reading with fixed and
variable pitch’. Journal of the Ophthalmological Society of America, vol. 7,
pp. 2011–2015
B
——————
Badaracco, Claire Hoertz. 1995. Trading words: poetry, typography, and
illustrated books in the modern literary economy. Baltimore and London:
The Johns Hopkins University Press
Baker, Steve. 1985. ‘The hell of connotation’. Word & Image, vol. 1, no. 2,
pp. 164–175
Bal, Mieke, and Norman Bryson. 1991. ‘Semiotics and art history’. The Art
Bulletin, vol. 73, no. 2, pp. 174–208
Baird, Russel N., and Arthur T. Turnbull. 1975. The graphics of communication:
typography, layout, design. 3rd ed. New York: Holt, Rinehart and Winston
Baird, Russel N., Duncan McDonald, Ronald K. Pittman, and the late Arthur T.
Turnbull. 1993. The graphics of communication: methods, media and
technology. 6th edition. Fort Worth: Harcourt Brace Jovanovich College
Publishers
Banks, William P., and David Krajicek. 1991. ‘Perception’. Annual Review of
Psychology, vol. 42, pp. 305–331
Bartram, David. 1982. ‘The perception of semantic quality in type: differences
between designers and non-designers’. Information Design Journal, vol. 3,
no. 1, pp. 38–50
Bartram, David. 1982. ‘Jeremy J. Foster, Legibility research 1972–1978: a
summary’. Review article. Information Design Journal, vol. 3, no. 1,
pp. 75–76
Baudin, Fernand. 1967a. ‘Miles A. Tinker, Bases for effective reading’. Review
article. Journal of Typographic Research, vol. 1, no. 2, pp. 204–207
Baudin, Fernand. 1967b. ‘Typography: evolution + revolution’. Journal of
Typographic Research, vol. 1, no. 4, pp. 373–386
Baxandall, Michael. 1985. Patterns of intention: on the historical explanation of
pictures. New Haven and London: Yale University Press
Beaugrande, Robert de. 1980. Text, discourse, and process: toward a
multidisciplinary science of text. Norwood, New Jersey: Ablex
Beaugrande, Robert de. 1984. Text production: toward a science of composition.
Norwood, New Jersey: Ablex
Becker, D., J. Heinrich, R. von Sichowsky, and D. Wendt. 1970. ‘Reader
preferences for typeface and leading’. Journal of Typographic Research,
vol. 4, no. 1, pp. 61–66
Beldie, I.P., S. Pastoor, and E. Schwartz. 1983. ‘Fixed vs. variable letter width for
televised text’. Human Factors, vol. 25, pp. 273–277
Bell, Richard C., and James L.F. Sullivan. 1981. ‘Student preferences in
typography’. Programmed Learning and Educational Technology, vol. 18,
no. 2, pp. 57–61
Benedek, Andy. 1991. ‘The craft of digital type’. Eye, vol. 1, no. 2, pp. 86–87
Benson, Philippa. 1985. ‘Writing visually: design considerations in technical
publications’. Technical Communication, no. 32, pp. 35–39
Berger, Sidney E. 1991. The design of bibliographies: observations, references and
examples. No. 6 of Bibliographies and indexes in library and information
science. London: Mansell, a Cassell imprint; Westport, Connecticut:
Greenwood Press (1992)
Berkowitz, Leonard, and Edward Donnerstein. 1982. ‘External validity is more
than skin deep: some answers to criticisms of laboratory experiments’.
American Psychologist, vol. 37, no. 3, pp. 245–257
Berliner, Anna. 1920. ‘Atmosphärenwert von Drucktypen’. Zeitschrift für
Angewandte Psychologie, 17, p. 165
Bernhardt, Stephen A. 1985. ‘Text structure and graphic design: the visible
design’. In Systemic perspectives on discourse: volume 2. Selected applied
papers from the 9th international systemic workshop. Edited by James
D. Benson and William S. Greaves, pp. 18–38. Norwood, New Jersey: Ablex
Bhaskar, Roy. 1986. Scientific realism and human emancipation. London: Verso
Bickhard, M.H. 1992. ‘Myths of science’. Theory and Psychology, vol. 2, no. 3,
pp. 321–337
Biemann, Emil O. 1961. ‘Univers: a new concept in European type design’. Print,
vol. 15, no. 1, February, pp. 32–36
Bigelow, Charles A. 1981. ‘Technology and the aesthetics of type: maintaining
the “tradition” in the age of electronics’. The Seybold Report, vol. 10, no. 24,
pp. [1–2], 3–16 (This is part 1 of a series of 3 articles by Bigelow in this
journal, under the series title: ‘Aesthetics vs. technology’)
Bigelow, Charles A. 1982. ‘The principles of digital type: quality type for low,
medium and high resolution printers’. The Seybold Report on Publishing
Systems, vol. 11, no. 11, pp. [1], 3–23. (This is part 2 of a series of 3 articles
by Bigelow in this journal, under the series title: ‘Aesthetics vs. technology’)
Bigelow, Charles. 1989. ‘On type: form, pattern, & texture in the typographic
image’. Fine Print, vol. 15, no. 2, pp. 75–82
Bigelow, Charles, and Donald Day. 1983. ‘Digital typography’. Scientific
American, no. 8, pp. 106–110, 112, 114–119
Bigelow, Charles, and Kris Holmes. 1991. ‘Notes on Apple 4 fonts’. Electronic
Publishing, vol. 4, no. 3, pp. 171–181
Bigelow, Charles, and Kris Holmes. 1993. ‘The design of a Unicode font’.
Electronic Publishing, vol. 6, no. 3, pp. 289–305
Biggs, Michael, and Claus Huitfeldt. 1997. ‘Philosophy and electronic publishing:
theory and metatheory in the development of text encoding’. The Monist,
vol. 80, no. 3, pp. 348–367
Birkvig, Henrik. 1990. ‘Frutiger: en sikker vinner’. Bogtrykkerbladet, no. 9,
pp. 11–14
Bi[rkvig], H[enrik]. 1995. ‘Typografiens ekstrapolation’. Rubrik, no. 37,
pp. 10–11
Black, Alison. 1990. Typefaces for desktop publishing: a user guide. London:
Architecture Design and Technology Press
Black, Alison, and Andrew Boag. 1992. ‘Choosing binary or grayscale bitmaps:
some consequences for users’. In EP 92: Proceedings of Electronic
Publishing 1992, edited by C. Vanoirbeek and G. Coray, pp. 247–260.
Cambridge: Cambridge University Press
Blackwell, Lewis. 1995. The end of print: the graphic design of David Carson.
London: Laurence King
Boag, Andrew. 1993. ‘What is the point?’ Print, vol. 47, no. 2, pp. 109–110
Boag, Andrew. 1996. ‘Typographic measurement: a chronology’. Typography
Papers, no. 1, pp. 105–121
Boden, Margaret A. 1990. ‘Escaping the Chinese room’. In The philosophy of
artificial intelligence, edited by Margaret A. Boden, pp. 89–104. Oxford:
Oxford University Press
Bohn, Willard. 1986. The aesthetics of visual poetry: 1914–1928. Cambridge:
Cambridge University Press
Bonner, John V.H. 1998. ‘Towards consumer product interface design guidelines’.
In Human factors in consumer products, edited by Neville Stanton,
pp. 239–258. London: Taylor & Francis
Bonsiepe, Gui. 1968. ‘A method of quantifying order in typographic design’. Ulm,
no. 21, pp. 24–31. (Also published in Journal of Typographic Research,
vol. 2, no. 3, 1968, pp. 203–220. Republished in Readings from Ulm: selected
articles from the journal of the Ulm school of Design, edited by Kirti Trivedi,
pp. 249–258. Bombay: Industrial Design Centre, at the Indian Institute of
Technology, 1989)
Bourdieu, Pierre, and Loïc J.D. Wacquant. 1992. An invitation to reflexive
sociology. Cambridge: Polity Press
Bouwhuis, Don G. 1989. ‘Reading as goal-driven behaviour’. In Working models
of human perception, edited by Ben A.G. Elsendoorn and Herman Bouma,
pp. 341–362. London: Academic Press
Bouwhuis, D.G. 1993. ‘Reading rate and letter size’. IPO Annual Progress Report,
no. 28, pp. 3036
Bowden, Paul R., and David F. Brailsford. 1989. ‘On the noise immunity and
legibility of Lucida fonts’. In Raster imaging and digital typography, edited
by Jacques André and Roger Hersch, pp. 205212. Cambridge: Cambridge
University Press
Boyarski, Dan, Christine Neuwirth, Jodi Forlizzi, and Susan Harkness Regli.
1998. ‘A study of fonts designed for screen display’. CHI 98, pp. 8794
Brachfeld, Oliver. 1964. ‘Zuviel Grotesk’. Ver l ags-Praxis, no. 9, September,
pp. 245246
Bracht, Glenn H., and Gene V. Glass. 1968. ‘The external validity of
experiments’. American Educational Research Journal, vol. 5, no. 4,
pp. 437474
[Brinch, Ole]. 1976. ‘Bibliografi over kommunikationsteori, kommunikations-
midler, læselighetsforskning o.l.’ Copenhagen: The Graphic College of
Denmark. (To a large extent based on bibliographic information from
FOGRA Literaturdienst, Deutsche Gesellschaft für Forschung im
graphischen Gewerbe e.V., München, under the index term ‘Schriften:
lesbarkeit’; and ZIID-Referatkartei Grafische Technik, Institut f. graf.
Technik, Leipzig)
Braun, Curt C., and N.Clayton Silver. 1995. ‘Interaction of warning label
features: determining the contributions of three warning characteristics’.
In Designing for the global village: proceedings of the Human Factors and
Ergonomics Society 39th annual meeting 1995, volume 2, pp. 984988.
Santa Monica, California: Human Factors and Ergonomics Society
Brooks, T. 1991. ‘Career development: filling the usability gap’. Technical
Communication, vol. 38, no. 2, pp. 180–184
Bruce, Vicki, and Patrick Green. 1990. Visual perception: physiology, psychology
and ecology. 2nd edn. Hove, East Sussex; and Hillsdale, New Jersey:
Lawrence Erlbaum. (3rd edition, with Mark A. Georgeson as co-author, was
published in 1996 by Psychology Press, an imprint of Erlbaum (UK) Taylor &
Francis. Hove, East Sussex)
Bryson, Norman. 1994. ‘Art in context’. In The point of theory: practices of
cultural analysis, edited by Mieke Bal and Inge E. Boer, pp. 6678.
Amsterdam: Amsterdam University Press
Buchanan, Richard. 1992. ‘Wicked problems in design thinking’. Design Issues,
vol. 8, no. 2, pp. 5–21
Buchanan, Richard. 1993. ‘Semiotics and design’. Semiotica, vol. 97, no. 1/2,
pp. 189197
Buckingham, B.R. 1931. ‘New data on the typography of textbooks’. Yearbook of
the National Society for the Study of Education, vol. 30, pp. 93–125
Buckley, Paul. 1989. ‘Expressing research findings to have a practical influence
on design’. In Cognitive ergonomics and human–computer interaction, edited
by J. Long and A. Whitefield, pp. 166–190. Cambridge: Cambridge
University Press
Bullington, Edward Weeks. 1948. ‘Legibility determinations of multiple carbon
copies’. MS thesis. Virginia Polytechnic Institute
Burke, Christopher. 1995/96. ‘Typeface review: ITC Bodoni’. Printing Historical
Society Bulletin, no. 40, pp. 3637
Burke, Christopher. 1998a. Paul Renner: the art of typography. London: Hyphen
Press; New York: Princeton Architectural Press
Burke, Christopher. 1998b. ‘From functionalism to information design: a review
of Jan Tschichold’s The new typography’. Information Design Journal,
vol. 9, no. 1, pp. 5158
Burnhill, P., and J. Hartley. 1975. ‘Psychology and textbook design: a research
critique’. In Communication and learning, edited by Jon Baggaley,
G. Harry Jamieson, and Harry Marchant, pp. 6578. Volume 8 of Aspects of
educational technology. Pitman Publishing
Burt, Cyril. 1959. A psychological study of typography. With an introduction by
Stanley Morison. Cambridge: Cambridge University Press. (Reprinted in
1974 for the College of Librarianship, in Wales, by Bowker)
Burt, Cyril. 1960a. ‘The readability of type’. New Scientist, vol. 7, no. 168,
pp. 227–229
Burt, Cyril. 1960b. ‘The typography of children’s books: a record of research
in the U.K.’. In Yearbook of Education, edited by G.Z.F. Bereday and
J.A. Lauwerys. London: Evans
Burt, Cyril, W.F. Cooper, and J.L. Martin. 1955. ‘A psychological study of
typography’. The British Journal of Statistical Psychology, vol. 8, pt. 1,
pp. 2957
Button, Graham, Jeff Coulter, John R.E. Lee, and Wes Sharrock. 1995.
Computers, minds and conduct. Cambridge: Polity Press
Buur, Jacob, and Kirsten Bagger. 1999. ‘Replacing usability testing with user
dialogue’. Communications of the ACM, vol. 42, no. 5, pp. 6366
C
——————
Cambrosio, Alberto, and Peter Keating. 1988. ‘ “Going monoclonal”: art, science,
and magic in the day-to-day use of hybridoma technology’. Social Problems,
vol. 35, no. 3, pp. 244260
Campbell, Joan. 1989. Joy in work: German work, the national debate 1800–1945.
Princeton, New Jersey: Princeton University Press
Campbell, Kim Sydow. 1995. Coherence, continuity, and cohesion: theoretical
foundations for document design. Hillsdale, New Jersey: Lawrence Erlbaum
Carmichael, Leonard, and Walter F. Dearborn. 1947. Reading and visual fatigue.
Boston: Houghton Mifflin.
Carroll, John M. 1990. The Nurnberg funnel: designing minimalist instruction for
practical computer skill. Cambridge, Massachusetts: The MIT Press
Carroll, John M. 1997. ‘Human–computer interaction: psychology as a science of
design’. International Journal of Human–Computer Studies, vol. 46,
pp. 501–522
Carroll, John M., and Robert L. Campbell. 1986. ‘Softening up hard science: reply
to Newell and Card’. Human–Computer Interaction, vol. 2, pp. 227–249
Carroll, John M., and Robert L. Campbell. 1989. ‘Artifacts as psychological
theories: the case of human–computer interaction’. Behaviour and
Information Technology, vol. 8, no. 4, pp. 247–256
Carter, Harry. [1937] 1984. ‘Optical scale in type founding’. Printing Historical
Society Bulletin, no. 13, 1984, pp. 144–148. (Originally published in
Typography, no. 4, 1937, pp. 2–6)
Carver, Ronald P. 1992. ‘Reading rate: theory, research, and practical
implications’. Journal of Reading, vol. 36, no. 2, pp. 8495
Catalogue 43: the book arts. [1999]. Oxford: Frances Wakeman Books
Chapman, Jocelyn. 1978. ‘Re-thinking research into visual communication’.
Icographic, no. 13, pp. 2832
Cheetham, Dennis, and Brian Grimbly. 1964. ‘Design analysis: typeface’. Design,
no. 186, pp. 6171
Cheetham, Dennis, Christopher Poulton, and Brian Grimbly. 1965. ‘The case for
research’. Design, no. 195, pp. 4851
Christie, A.W., and K.S. Rutley. 1961a. ‘Relative effectiveness of some letter
types designed for use on road traffic signs’. Roads and Road Construction,
no. 39, August, pp. 239244
Christie, A.W., and K.S. Rutley. 1961b. ‘[Research] on road signs’. Design, no. 152,
August, pp. 59–60
Chung, S.T.L., J.S. Mansfield, and G.E. Legge. 1998. ‘Psychophysics of reading
18: The effect of print size on reading speed in normal peripheral vision’.
Vision Research, vol. 38, no. 19, pp. 2949–2962
Clark, Malcolm. 1989. ‘Book review: Digital typography: an introduction to type
and composition for computer system design, Richard Rubinstein’. TEXline,
no. 8, January, pp. 3233
Coghill, Vera. 1980. ‘Can children read familiar words in unfamiliar type?’
Information Design Journal, vol. 1, no. 4, pp. 254260
Cohen, A.S. 1981. ‘Car drivers’ pattern of eye fixations on the road and in the
laboratory’. Perceptual and Motor Skills, vol. 52, pp. 515–522
Cohn, Hermann, and Robert Rübencamp. 1903. Wie sollen Bücher und Zeitungen
gedruckt werden? Braunschweig: F. Vieweg & Sohn
Collins, H.M. 1987. ‘Expert systems and the science of knowledge’. In The social
construction of technological systems: new directions in the sociology and
history of technology, edited by Wiebe E. Bijker, Thomas P. Hughes, and
Trevor J. Pinch, pp. 329348. Cambridge, Massachusetts: The MIT Press
Connolly, Kevin G. 1998. ‘Legibility and readability of small print: effects of font,
observer age and spatial vision’. MSc thesis. University of Calgary
Cook, Thomas D., and Donald T. Campbell. 1979. ‘Validity’. Chapter 2 in Quasi-
experimentation: design & analysis issues for field settings, pp. 3794.
Chicago: Rand McNally
Cooper, Geoff, and John Bowers. 1995. ‘Representing the user: notes on the
disciplinary rhetoric of human–computer interaction’. In The social and
interactional dimensions of human–computer interfaces, edited by Peter
J. Thomas, pp. 48–65. Cambridge: Cambridge University Press
Cooper, M.B., H.N. Daglish, and J.A. Adams. 1979. ‘Reader preferences for report
typefaces’. Applied Ergonomics, vol. 10, no. 2, pp. 6670
Cornog, D.Y., and F.C. Rose. 1967. Legibility of alphanumeric characters and
other symbols. Vol. 2: a reference handbook. Miscellaneous publication
262-2.Washington, D.C.: National Bureau of Standards
Crawford, M.L.J., Richard A. Andersen, Randolph Blake, Gerald H. Jakobs, and
Christa Neumeyer. 1990. ‘Interspecies comparisons in the understanding
of human visual perception’. In Visual perception: the neurophysiological
foundations, edited by Lothar Spillmann and John S. Werner, pp. 23–52.
San Diego: Academic Press
Crosland, H.R., and Georgia Johnson. 1928. ‘The range of apprehension as
affected by inter-letter hair-spacing and by characteristics of individual
letters’. Journal of Applied Psychology, vol. 12, February, pp. 82–124
Cross, Nigel. 1980. ‘An introduction to design methods’. Information Design
Journal, vol. 1, no. 4, pp. 242253
Cross, Nigel. 1999. ‘Design research: a disciplined conversation’. Design Issues,
vol. 15, no. 2, pp. 5–10
Cross, Nigel, John Naughton, and David Walker. 1981. ‘Design method and
scientific method’. In Design: science: method. Proceedings of the 1980
Design Research Society Conference. Edited by Robin Jacques and James
A. Powell, pp. 1829. Guildford: Westbury House. (Also published, with a
few minor alterations, in Design Studies, vol. 2, no. 4, 1981, pp. 195201)
Crutchley, Brooke. 1986. ‘ “Congeniality” in the typography of books’. Matrix,
no. 6, pp. 134139, plus 8 inserted pages with illustrations
D
——————
Dauppe, Michèle-Anne. 1991. ‘Get the message?’. Eye, no. 3, pp. 4–7
Deach, Stephen. 1992. ‘Outline font hints and rasterization: a technology primer’.
The Seybold Report on Desktop Publishing, vol. 6, no. 7, pp. 2132
Dearborn, W.F. 1906. The psychology of reading. New York: Columbia University
contribution to philosophy and psychology, vol. 14, no. 7
Debs, Mary Beth. 1988. ‘A history of advice: what experts have to tell us’. In
Effective documentation: what we have learned from research, edited by
Stephen Doheny-Farina, pp. 11–23. Cambridge, Massachusetts: The MIT Press
Department of Transport. 1991. The history of traffic signs. London: Department
of Transport, Traffic Signs Branch, Network Management and Driver
Information Division. (Photocopied leaflet of 35 pages)
Department of Transport. 1994. The design and use of directional informatory
signs. Local transport note, no. 1/94. London: HMSO
Department of Transport. 1995. Know your traffic signs. Fourth edition. London:
HMSO. (First edition 1975).
Design. 1959. ‘Which signs for motorways’. No. 129, pp. 28–30, 32, 35
Design Issues, vol. 11, no. 1, 1995. (Special issue on design historiography)
Dillon, Andrew. 1992. ‘Reading from paper versus screens: a critical review of the
empirical literature’. Ergonomics, vol. 35, no. 10, pp. 12971326
Dillon, Andrew. 1994. Designing usable electronic text: ergonomic aspects of
human information usage. London: Taylor & Francis
Dilnot, Clive. [1984] 1989. ‘The state of design history, part 1: mapping the field’,
pp. 213–232; and ‘The state of design history, part 2: problems and
possibilities’, pp. 233–262. In Design discourse: history, theory, criticism,
edited by Victor Margolin, Chicago and London: University of Chicago
Press, 1989. (Reprinted facsimile-like page for page from Design Issues,
vol. 1, no. 1, 1984, pp. 4–23; and vol. 1, no. 2, 1984, pp. 3–20)
Dilnot, Clive. 1999. ‘The science of uncertainty: the potential contribution of
design to knowledge’. In Doctoral education in design. Proceedings of the
Ohio conference, October 8–11, 1998. Edited by Richard Buchanan, Dennis
Doordan, Lorraine Justice, and Victor Margolin, pp. 65–97. Pittsburgh: The
School of Design, Carnegie Mellon University
Doheny-Farina, Stephen. 1988. ‘Methods and results: a range of possibilities’.
Introduction to Effective documentation: what we have learned from
research, edited by Stephen Doheny-Farina, pp. 1–7. Cambridge,
Massachusetts: The MIT Press
Dormer, Peter. 1994. The art of the maker. London: Thames and Hudson
Dowding, Geoffrey. 1957. Factors in the choice of type faces. London: Wace &
Company
Dowding, Geoffrey. 1966. Finer points in the spacing & arrangement of type.
3rd edition. London: Wace & Company. (A new edition, based on the 3rd
edition, with a foreword by Crispin Elsted, was published in 1995, by
Hartley & Marks, Vancouver.)
Dreyfus, John. 1957. ‘David Kindersley’s contribution to street lettering’. Penrose
Annual, vol. 51, pp. 3841
Dreyfus, John. 1961. ‘Univers in action’. Penrose Annual, vol. 55, pp. 15–19
Dreyfus, John. 1994. ‘ “Who is to design books now that computer [sic] are making
books?” ’. In Into print: selected writings on printing history, typography and
book production, pp. 283–297. London: The British Library. (Based on a
lecture given in 1988. The essay was first published in Classical typography
in the computer age, by Hermann Zapf and John Dreyfus. Los Angeles:
William Andrews Clark Library, 1991)
Dreyfus, Hubert L. 1992. What computers still can’t do: a critique of artificial
reason. 3rd edition. Cambridge, Massachusetts: The MIT Press. (What
computers can’t do: a critique of artificial reason, 1st edition 1972; What
computers can’t do: the limits of artificial reason, 2nd edition 1979)
Dreyfus, Hubert L., and Stuart E. Dreyfus. 1986. Mind over machine: the power
of human intuition and expertise in the era of the computer. New York: Free
Press
Dreyfuss, Henry. 1955. Designing for people. New York: Simon & Schuster
Duchnicky, Robert L., and Paul A. Kolers. 1983. ‘Readability of text scrolled on
visual display terminals as a function of window size’. Human Factors,
vol. 25, no. 6, pp. 683692
Dyson, Mary C. 1993. ‘Improving discrimination of symbols for display at low
resolution’. Electronic Publishing, vol. 6, no. 3, pp. 231239
Dyson, Mary C., and Gary J. Kipping. 1997. ‘The legibility of screen formats: are
three columns better than one?’. Computers and Graphics, vol. 21, no. 6,
pp. 703712
Dyson, Mary C., and Gary J. Kipping. 1998a. ‘Exploring the effect of layout on
reading from screen’. In Electronic publishing, artistic imaging and digital
typography, edited by Roger D. Hersch, Jacques André, and Heather
Brown, pp. 294304. Berlin: Springer-Verlag
Dyson, Mary C., and Gary J. Kipping. 1998b. ‘The effects of line length and
method of movement on patterns of reading from screen’. Visible Language,
vol. 32, no. 2, pp. 150–181
E
——————
Eason, Ron, and Sarah Rookledge (edited by Phil Baines and Gordon Rookledge).
1991. Rookledge’s international handbook of type designers: a biographical
directory. Carshalton Beeches, Surrey: Sarema Press
Easterby, Ronald, and Harm Zwaga (editors). 1984. Information design: the
design and evaluation of signs and technical information. Chichester: John
Wiley
Edelman, Gerald M. 1992. Bright air, brilliant fire: on the matter of the mind.
London: Penguin, 1994
Edworthy, Judy, and Austin Adams. 1996. Warning design: a research
prospective. London: Taylor & Francis
Ehses, Hanno, and Ellen Lupton. 1988. Rhetorical handbook: an illustrated
manual for graphic designers. Design Papers no. 5. Halifax, Nova Scotia:
Design Division, Nova Scotia College of Art and Design
Emigre. 1996. Catalog (of various merchandise and typefaces). Sacramento:
Emigre
English, Earl. 1944. ‘A study of the readability of four newspaper headline types’.
Journalism Quarterly, vol. 21, September, pp. 217229
Engman, Petra. 1988. ‘Fontexperiment’. Stockholm: Department of numerical
analysis and computing science, Royal Institute of Technology
Epelboim, Julie, James R. Booth, and Robert M. Steinman. 1994. ‘Reading
unspaced text: implications for theories of reading eye movements’. Vision
Research, vol. 34, no. 13, pp. 17351766
Epelboim, Julie, James R. Booth, and Robert M. Steinman. 1996. ‘Much ado
about nothing: the place of space in text’. Vision Research, vol. 36, no. 3,
pp. 465–470
Eriksen, Trond Berg. 1993. ‘Form-skrift: en skriftreform’. In Bokspor: norske bøker
gjennom 350 år, edited by Per Strømholm, pp. 233242. Oslo:
Universitetsforlaget
Eurographic Press Interview. 1962. ‘Designer’s profile: Adrian Frutiger’. Print in
Britain, no. 9, vol. 9, January, pp. 258262
Evans, S.H., A.A.J. Hoffman, M.D. Arnoult, and O. Zinser. 1968. ‘Pattern
enhancement with schematic operators’. Behavioral Science, vol. 13, no. 5,
pp. 402404
Eyler, William R., and Donald A Stewart. 1982. ‘Notes on our new format’.
Editorial. Radiology, vol. 142, no. 1, p. 248
F
——————
Falk, Valter. 1965. ‘Grotesk-revy: den serifflösa bokstavsformen under 150 år’.
Biblis, pp. 71–105
Farrell, Joyce E. 1991. ‘Fitting physical screen parameters to the human eye’.
In The man–machine interface, edited by J.A.J. Roufs, pp. 7–23.
Basingstoke: Macmillan Press
Felker, Daniel B. (ed). 1980. Document design: a review of the relevant research.
Washington DC: American Institutes for Research
Femia, Joseph V. 1981. ‘An historicist critique of “revisionist” methods for
studying the history of ideas’. History and Theory, vol. 20, no. 2,
pp. 113–134
Ferguson, Eugene S. 1992. Engineering and the mind’s eye. Cambridge,
Massachusetts: The MIT Press
Fielding, Helen. 1994. ‘Life in the lost lane: motorists of Britain just don’t know
which way to turn: Helen Fielding on road sign madness’. Independent on
Sunday, 2 October
FMC Corporation. 1985. Product safety sign and label system. Santa Clara,
California: FMC Corporation
Foster, Jeremy J. 1968. ‘Commentary: psychological research into legibility’.
Journal of Typographic Research, vol. 2, no. 3, pp. 279–282
Foster, Jeremy J. (ed). 1971. Legibility research abstracts 1970. London: Lund
Humphries
Foster, Jeremy J. (ed). 1972. Legibility research abstracts 1971. London: Lund
Humphries
Foster, Jeremy J. 1973. ‘Legibility research: the ergonomics of print’. Icographic,
no. 6, pp. 20–24
Foster, Jeremy J. 1978. ‘Locating legibility research: a guide for the graphic
designer’. Visible Language, vol. 7, no. 2, pp. 201–205
Foster, Jeremy J. 1980. Legibility research 1972–1978: a summary. [London]:
Graphic Information Research Unit, Royal College of Art
Frenckner, Kerstin. 1990. Legibility of continuous text on computer screens:
a guide to the literature. TRITA-NA P9010. Stockholm: Department of
numerical analysis and computing science, Royal Institute of Technology
Frenckner, Kerstin, Caroline Nordquist, Staffan Romberger, and Hans
Smedshammar. 1991. Legibility of continuous text on computer screens:
a series of experiments. Research report TRITA-NA-P9033. Stockholm:
Department of numerical analysis and computing science, Royal Institute
of Technology
Froshaug, Anthony. 1963. ‘Roadside traffic signs’. Design, no. 178, October,
pp. 3750
Frutiger, Adrian. 1962. ‘How I came to design Univers’. Print in Britain, no. 9,
vol. 9, January, pp. 263–266
Fuller, Steve. 1992. ‘Social epistemology and the research agenda of science
studies’. In Science as practice and culture, edited by Andrew Pickering,
pp. 390428. Chicago: University of Chicago Press
Furnham, Adrian F. 1988. Lay theories: everyday understanding of problems in
the social sciences. Oxford: Pergamon Press
Futuraheft. [1931/1932]. Frankfurt am Main, Barcelona, and New York:
Bauersche Giesserei
G
——————
Gadamer, Hans-Georg. [1960] 1993. Truth and method. Second, revised edition.
London: Sheed & Ward (first English language edition, 1975) [Wahrheit
und Methode: Grundzüge einer philosophischen Hermeneutik]
Gallagher, Thomas J., and Wendy S. Jakobson. 1993. ‘The typography of
environmental impact statements: criteria, evaluation, and public
participation’. Environmental Management, vol. 17, no. 1, pp. 99109
Garcia, Marelys L, and Cesar I. Caldera. 1996. ‘The effect of color and typeface
on the readability of on-line text’. Computers and Industrial Engineering,
vol. 31, no. 1/2, pp. 519524
Garland, Ken. [1982] 1996. ‘Some thoughts. From Ken Garland and Associates
Designers: 20 years work and play, London, 1982’. In Ken Garland, A word
in your eye: opinions, observations and conjectures on design, from 1960 to
the present, pp. 5357. Reading: The University of Reading, Department of
Typography and Graphic Communication
Gelderman, Maarten. 1999. ‘A short introduction to font characteristics’.
TUGboat, vol. 20, no. 2, pp. 96104
Gelernter, Mark George. 1981. ‘The subject-object problem in design theory and
education’. PhD thesis. Bartlett School of Architectural Planning,
University College of London, University of London
Gelernter, Mark. 1993. ‘The concept of design and its application to architecture’.
In Companion to contemporary architectural thought, edited by Ben Farmer
and Hentie Louw, pp. 399403. London and New York: Routledge
Gerstner, Karl. 1963. ‘The old Berthold sans-serif on a new basis’. Druckspiegel,
no. 6. (Also published in Karl Gerstner, Designing programmes, pp. 2948.
London: Alec Tiranti, 1964)
Geske, Joel. 1996. ‘Legibility of sans serif type for use as body copy in computer
mediated communication’. AEJMC Visual Communication Division, 22 pp.
(Paper presented at the 79th annual meeting of the Association for
Education in Journalism and Mass Communication. Anaheim, California,
9–3 August, 1996)
Gibson, James J. 1979. The ecological approach to visual perception. Boston and
London: Houghton-Mifflin. (Republished by Lawrence Erlbaum. Hillsdale,
New Jersey, 1986)
Giddens, Anthony. 1976. New rules of sociological method: a positive critique of
interpretive sociologies. London: Hutchinson
Giddens, Anthony. 1987. ‘Nine theses on the future of sociology’. In Anthony
Giddens, Social theory and modern sociology. Cambridge: Polity Press
Giddens, Anthony. 1996. ‘What is social science?’. In Anthony Giddens,
In defense of sociology, pp. 6577. Cambridge: Polity Press
Gill, Eric. 1936. An essay on typography. 2nd edition. Sheed and Ward
Glaister, Geoffrey Ashall. 1979. Glaister’s glossary of the book. 2nd edition.
London: George Allen & Unwin. (The second edition was republished by Oak
Knoll Press and The British Library in 1996, with a new introduction by
Donald Farren, under the title Encyclopedia of the book, a title which had
been used for the first American edition in 1960.)
Goldschmidt, Gabriela. 1994. ‘On visual design thinking: the vis [sic] kids of
architecture.’ Design Studies, vol. 15, no. 2, pp. 158174
Gordon, V.E.C., and Ruth Mock. 1960. Twentie th-century handwriting. London:
Methuen
Göranzon, Bo, and Ingela Josefson (eds). 1988. Knowledge, skill and artificial
intelligence. London: Springer-Verlag
Göranzon, Bo, and Magnus Florin (eds). 1990. Artificial intelligence, culture and
language: on education and work. London: Springer-Verlag
Gray, Nicolete. 1960. ‘Sans serif and other experimental inscribed lettering of the
early renaissance’. Motif, no. 5, pp. 6676
Gray, Nicolete. [1938] 1976. Nineteenth century ornamented typefaces. New
edition, and with a chapter on ornamented types in America by Ray Nash.
London: Faber and Faber (Originally published as XIXth century
ornamented types & title pages in 1938)
Green, C.D. 1992. ‘Of immortal mythological beasts’. Theory and Psychology,
vol. 2, no. 3, pp. 291320
Gribbons, William M. 1991. ‘Visual literacy in corporate communication: some
implications for information design’. IEEE Transactions on Professional
Communication, vol. 34, no. 1, pp. 4250
Griffing, Harold, and Shepherd Ivory Franz. 1896. ‘On the conditions of fatigue
in reading’. Psychological Review, vol. 3, pp. 513530
Grooters, Lyle E. 1972. ‘The relationship of letter style, letter size, and viewing
distance to the readability of transparent visuals’. PhD dissertation.
University of Oklahoma
Grove, Laurel K., and Donald E. Zimmerman. 1997. ‘Introduction: bringing
communication science to technical communication advancing the
profession’. IEEE Transactions on Professional Communication, vol. 40,
no. 3, pp. 157167
Grudin, Jonathan. 1993. ‘Interface: an evolving concept’. Communications of the
ACM, vol. 36, no. 4, pp. 110119
Gürtler, André. 1988. ‘History of the development of the newspaper: a survey of
newspaper text faces of the 20th century’. Typografische Monatsblätter,
no. 1, pp. 1–36. (Tri-lingual text in German, French and English.)
H
——————
Habermas, Jürgen. [1981] 1984. The theory of communicative action, volume 1:
reason and the rationalization of society. Translated by Thomas McCarthy.
London: Heinemann [Theorie des kommunikativen Handelns, Band 1,
Handlungsrationalität und gesellschaftliche Rationalisierung. Frankfurt
am Main: Suhrkamp Verlag]
Hacker, Peter. 1987. ‘Languages, minds and brains’. In Mindwaves: thoughts on
intelligence, identity and consciousness, edited by Colin Blakemore and
Susan Greenfield, pp. 484–505. Oxford: Basil Blackwell
Hacker, Peter. 1991. ‘Seeing, representing and describing: an examination of
David Marr’s computational theory of vision’. In Investigating psychology:
sciences of the mind after Wittgenstein, edited by John Hyman, pp. 119154.
London and New York: Routledge
Hallberg, Åke. 1992. Typografin och läsprocessen: grafisk kommunikation
med text och bild. Halmstad: Bokförlaget Spektra
Halliday, M.A.K. 1970. ‘Language structure and language function’. In New
horizons in linguistics, edited by John Lyons, pp. 140165.
Harmondsworth: Penguin
Halliday, M.A.K, and Ruqaiya Hasan. 1976. Cohesion in English. London:
Longman
Handover, P.M. 1960. ‘Palette for printers’. Motif, no. 5, pp. 9495
Handover, P.M. 1961. ‘Letters without serifs’. Motif, no. 6, pp. 6681
Handover, P.M. 1963. ‘Grotesque letters: a history of unseriffed type faces from
1816 to the present day’. Monotype News Letter, no. 69, pp. 2–9
Harré, Rom, and Grant Gillett. 1994. The discursive mind. Thousand Oaks,
California: Sage
Harris, J. 1973. ‘Confusions in letter recognition’. Professional Printer, vol. 17,
no. 2, pp. 2934
Hartley, James, and Peter Burnhill. 1977. ‘Understanding instructional text:
typography, layout and design’. In Adult learning: psychological research
and applications, edited by Michael J.A. Howe. London: John Wiley
Hartley, James, and Donald Rooum. 1983. ‘Sir Cyril Burt and typography: a
re-evaluation’. British Journal of Psychology, vol. 74, pt. 1, pp. 203212
Hearnshaw, L.S. 1979. Cyril Burt: psychologist. London: Hodder and Stoughton
Hellesnes, Jon. 1975. Sosialisering og teknokrati: ein sosialfilosofisk studie med
særleg vekt på pedagogikkens problem. Oslo: Gyldendal Norsk Forlag
Hellesnes, Jon. 1998. ‘Skulen og den teknokratiske freistinga’. Syn og Segn,
vol. 104, no. 3, pp. 210215
Henderson, Kathryn. 1995. ‘The visual culture of engineers’. In The cultures of
computing, edited by Susan Leigh Star, pp. 196218. Oxford: Blackwell and
Sociological Review
Hersch, Roger D. 1993. ‘Font rasterization: the state of the art’. In Vi sual and
technical aspects of type, edited by Roger D. Hersch, pp. 78109.
Cambridge: Cambridge University Press
Hersch, Roger D., Claude Bétrisey, Justin Bur, and André Gürtler. 1995.
‘Perceptually tuned generation of grayscale fonts’. IEEE Computer
Graphics and Applications, vol. 15, no. 6, pp. 7889
Hillier, Bill, John Musgrove, and Pat O’Sullivan. [1972] 1984. ‘Knowledge and
design’. In Developments in design methodology, edited by Nigel Cross,
pp. 245264. Chichester: John Wiley. (Originally published in 1972, in
Environmental design: research and practice, edited by W.J. Mitchell. Los
Angeles: University of California)
Hoffman, Valerie. 1987. ‘Effect of typeface on iconic storage capacity’. Paper
presented at the Annual meeting of the Western Psychological Association,
Long Beach, California, April 23–26, 25 pp.
Hoffman, Veronica M. 1988. ‘An investigation of the affects of typefaces upon
reader’s perception of the meanings of messages using the semantic
differential testing technique’. MS thesis. Rochester Institute of Technology
Holleran, Patrick. 1992. ‘An assessment of font preferences for screen-based text
display’. In People and computers 7: proceedings of HCI 92, York, September
1992, British Computer Society Conference Series 5, edited by A. Monk,
D. Diaper, and M.D. Harrison, pp. 447–459
Horn, Robert E. 1998. ‘Comments on “Building the bridge across the years and
the disciplines” ’. Information Design Journal, vol. 9, no. 1, pp. 2123
Horn, Robert E. 1999. ‘Information design: emergence of a new profession’. In
Information design, edited by Robert Jakobson, pp. 1533. Cambridge,
Massachusetts: The MIT Press
Hubel, David H. 1988. Eye, brain, and vision. New York: Scientific American
Library
Hubel, D.H., and T.N. Wiesel. 1959. ‘Receptive fields of single neurones in the
cat’s striate cortex’. Journal of Physiology, vol. 148, pp. 574591
Hubel, D.H., and T.N. Wiesel. 1962. ‘Receptive fields, binocular interaction and
functional architecture in the cat’s visual cortex’. Journal of Physiology,
vol. 160, pp. 106–154
Huey, Edmund B. 1908. The psychology and pedagogy of reading. New York:
Macmillan. (Republished by MIT Press, Cambridge, Massachusetts, 1968.)
Hughes, Philip K., and Barry L. Cole. 1986. ‘Can the conspicuity of objects be
predicted from laboratory experiments?’. Ergonomics, vol. 29, no. 9,
pp. 10971111
Huitfeldt, Claus. 1995. ‘Multidimensional texts in a one-dimensional medium’.
Computers and the Humanities, vol. 28, pp. 235241
Hutchings, R.S. 1966. ‘Neo-grot types and the new Swiss typography’. British
Printer, vol. 79, no. 3, pp. 116130
Hutchinson, James. 1983. Letters. London: The Herbert Press
Hvistendal, J.K., and Mary R. Kahl. 1976. ‘Roman v. sans serif body type:
readability and reader preference’. Typographic, vol. 8, no. 2, 4 pages.
Published by the International Typographic Composition Association, in
the USA. (Allegedly reprinted from The ANPA [American Newspaper
Publishers Association] News Research Bulletin, no. 2, 1975, pp. 3–11;
however the British Library’s interlibrary service did not find it in the 1976
volume of the bulletin. The article is probably based on Mary Ruth Luna
Kahl’s M.S. thesis: ‘A study of reading speed and reader preferences
between roman and sans serif type’, Iowa State University, 1974)
I
——————
‘Information design database: authors A–K’. 1992. Reading: Department of
Typography & Graphic Communication (unpublished bibliography)
‘Information design database: authors L–Z’. 1993. Reading: Department of
Typography & Graphic Communication (unpublished bibliography)
Ingvar, David H.; and Åke Hallberg. 1989. Hjärnan, bokstaven, ordet. Halmstad:
Bokförlaget Spektra
Ions, Edmund. 1977. Against behaviouralism: a critique of behavioural science.
Oxford: Basil Blackwell
J
——————
Janik, A. 1988. ‘Tacit knowledge, working life and scientific method’. In
Knowledge, skill and artificial intelligence, edited by Bo Göranzon and
Ingela Josefson, pp. 5363. London: Springer-Verlag
Janik, A. 1990. ‘Tacit knowledge, rule following and learning’. In Artificial
intelligence, culture and language: on education and work, edited by Bo
Göranzon and Magnus Florin, pp. 4555. London: Springer-Verlag
Jansen, Linda, and Paulli Thomsen. 1986. Typografi for synshæmmende: en
undersøgelse af mulige sammenhænge mellom nogle typografiske faktorer og
læselighed for synshæmmende. Copenhagen: Instituttet for blinde og
svagsynede
Janson, Marius A., and Stephen J. Morrissey. 1991. ‘Legibility of video display
units: one more look’. Behaviour & Information Technology, vol. 10, no. 6,
pp. 525542
Jaspert, W. Pincus, W. Turner Berry, and A.F. Johnson. 1970. The encyclopaedia
of type faces. Third edition. Poole: Blandford Press
Javal, Emile. [1905] 1978. Physiologie de la lecture et de l’écriture. [Paris: Félix
Alcan]. Reprinted as facsimile, with a new preface by François Richaudeau,
in the series Les classiques des sciences humaines. Paris: Retz-CEPL
Jha, S.S., and C.N. Daftuar. 1981. ‘Legibility of typefaces’. Journal of
Psychological Researches, vol. 25, no. 2, pp. 108110
John, E. Roy, and Eric L. Schwartz. 1978. ‘The neurophysiology of information
processing and cognition’. Annual Review of Psychology, vol. 29, pp. 1–29
Jonas, Wolfgang. 1993. ‘Design as problem-solving? or: here is the solution –
what was the problem?’. Design Studies, vol. 14, no. 2, pp. 157–170
Johnson, Edward Allan. 1996. ‘Meanings of typography in trademarks’. PhD
dissertation. The University of Alabama
Joynson, R.B. 1974. Psychology and common sense. London: Routledge & Kegan
Paul
Joynson, Robert B. 1989. The Burt affair. London: Routledge
K
——————
Kahn, Paul, and Krzysztof Lenk. 1993. ‘Typography for the computer screen:
applying the lessons of print to electronic documents’. The Seybold Report
on Desktop Publishing, vol. 7, no. 11, pp. 3–16
Kalleberg, Ragnvald. 1996. ‘Forskningsopplegget og samfunnsforskningens
dobbeltdialog’. In Kvalitative metoder i samfunnsforskningen, edited by
Harriet Holter and Ragnvald Kalleberg, second edition, pp. 2672. Oslo:
Universitetsforlaget
Kalleberg, Ragnvald. 1997. ‘Skjervheim, sosiologi, filosofi og vitenskapsstudier’.
In Regime under kritikk: om Hans Skjervheim i norsk filosofi og samfunns-
debatt, edited by Hermund Slaattelid, pp. 239266. Oslo: Aschehoug
Kantowitz, Barry H. 1992. ‘Selecting measures for human factors research’.
Human Factors, vol. 34, no. 4, pp. 387398
Karow, Peter. 1994. Font technology: methods and tools. Foreword by Gerard
Unger. Berlin: Springer-Verlag. (Original title: Schrifttechnologie). (On the
cover the title reads: Font technology: description and tools)
Karow, Peter. 1997. ‘The quality display of text on computer screens’. Type,
vol. 1, no. 1, pp. 4764
Katzen, May. 1977. The visual impact of scholarly journal articles. Primary
Communications Research Centre, University of Leicester
Kelley, Harold H. 1992. ‘Common-sense psychology and scientific psychology’.
Annual Review of Psychology, vol. 43, pp. 1–23
Kelly, John. 1993. Artificial intelligence: a modern myth. Ellis Horwood series in
artificial intelligence. New York and London: Ellis Horwood
Kempson, Elaine, and Nick Moore. 1994. Designing public documents: a review of
research. London: Policy Studies Institute
Kerr, J. 1926. The fundamentals of school health. London: Allen & Unwin
Kindersley, David. 1960. ‘Motorway sign lettering’. Traffic Engin eer ing &
Control, December, pp. 463465.
Kindersley, David. 1968. ‘Legibility’. Printing Technology, vol. 12, no. 2,
pp. 6971
Kindersley, David. 1976. Optical letter spacing: for new printing systems. London:
Wynkyn de Worde Society
Kinneir, Jock. 1970. ‘Det offentlige skilt’. In Offentlig design, edited by Christian
Ejlers, Erik Ellegaard Fredriksen and Niels Kryger, pp. 1523.
Copenhagen: Christian Ejlers Forlag
Kinneir, Jock. 1971. ‘Technical notes on the redesign of the United Kingdom road
signs’. Unpublished machine-written manuscript, 15 + 7 pp. Written for a
Russian monthly journal, but never published. (Location: Margaret Calvert,
London)
Kinneir, Jock. 1984. ‘The practical and graphic problems of road sign design’. In
Information design: the design and evaluation of signs and technical
information, edited by Ronald Easterby and Harm Zwaga, pp. 341358.
Chichester: John Wiley
Kinross, Robin. 1984. ‘Kinneir, Jock’. In Contemporary designers, edited by Ann
Lee Morgan and Colin Naylor, p. 324. London: Macmillan Publishers
Kinross, Robin. 1985. ‘The rhetoric of neutrality’. Design Issues, vol. 2, no. 2,
pp. 18–30. (Reprinted facsimile-like page for page, in Design discourse:
history, theory, criticism, edited by Victor Margolin, pp. 131–143. Chicago
and London: University of Chicago Press. 1989. Originally presented as a
paper to the first Information Design Conference in the UK, in 1984)
Kinross, Robin. 1989. ‘Road signs: wrong turning?’ Blueprint, October, pp. 5051
Kinross, Robin. 1992. Modern typography: an essay in critical history. London:
Hyphen Press
Kinross, Robin. 1994a. Fellow readers: an essay on multiplied language. London:
Hyphen Press
Kinross, Robin. 1994b. ‘Obituary: Richard Jock Kinneir’. Directions: newsletter of
the Sign Design Society, vol. 1, no. 7, pp. [23]. (Extended and revised
version of obituary first published in The Guardian, August 30, 1994)
Klare, George R. 1963. The measurement of readability. Ames, Iowa: Iowa State
University Press
Kline, Paul. 1988. Psychology exposed or the emperor’s new clothes. London and
New York: Routledge
Klitz, T.S., J.S. Mansfield, and G.E. Legge. 1995. ‘Reading speed is affected by
font transitions’. (Poster presentation / meeting abstract). Investigative
Ophthalmology & Visual Science, vol. 36, no. 4, p. S670
Koch, S. 1992. ‘Psychology’s Bridgman vs. Bridgman’s Bridgman’. Theory and
Psychology, vol. 2, no. 3, pp. 261290
Kolers, Paul A. 1983. ‘Locations and contents’. In Eye movements in reading:
perceptual and language processes, edited by Keith Rayner, pp. 53–61. New
York and London: Academic Press
Kosslyn, Stephen M. 1994. Elements of graph design. New York: W.H. Freeman
and Company
Kostelnick, Charles, and David R. Roberts. 1998. Designing visual language:
strategies for professional communicators. Boston: Allyn and Bacon
Kostelnick, Charles. 1995. ‘Cultural adaptation and information design: two
contrasting views’. IEEE Transactions on Professional Communication,
vol. 38, no. 4, pp. 182196
Kowler, Eileen (editor). 1990. Eye movements and their role in visual and
cognitive processes. Amsterdam and New York: Elsevier Science
Krampen, Martin. 1983. ‘Icons of the road’. Special issue of Semiotica, vol. 43,
nos. 1/2
Kravutske, Mary Eileen. 1994. ‘The effect of serif versus sans serif typeface on
reader comprehension and speed of reading’. PhD dissertation. Wayne
State University
Krull, Robert. 1997. ‘What practitioners need to know to evaluate research’.
IEEE Transactions on Professional Communication, vol. 40, no. 3,
pp. 168–181
Kuhn, Thomas S. [1962] 1970. The structure of scientific revolutions. 2nd edition.
Chicago: University of Chicago Press
Kunst, Robert John. 1972. ‘Type styles as related to reading comprehension’.
DEd dissertation. Arizona State University
Kvernbekk, Tone. 1995. ‘Erfaringstyranni eller teorityranni: et filosofisk
perspektiv på praksis’. In Profesjonsutdanning og forskning: FoU-
perspektiver på praksisfeltet, [edited by] Lærerutdanningsrådet, pp. 1727.
Oslo: Lærerutdanningsrådet
Kyng, Morten, and Joan Greenbaum. 1991. Design at work: cooperative design of
computer systems. Hillsdale, New Jersey: Lawrence Erlbaum
L
——————
LaCapra, Dominick. 1983. ‘Rethinking intellectual history and reading texts’. In
Rethinking intellectual history: texts, contexts, language, pp. 23–71. Ithaca:
Cornell University Press. (The essay was first published in History and
Theory, vol. 19, 1980, pp. 245–276. It was later reprinted with some
changes in the volume above, as well as in Modern European intellectual
history: reappraisals and new perspectives, edited by Dominick LaCapra
and Steven L. Kaplan, pp. 47–85. Ithaca: Cornell University Press)
Lakatos, Imre. 1970. ‘Falsification and the methodology of scientific research
programs’. In Criticism and the growth of knowledge, edited by Imre
Lakatos and Alan Musgrave, pp. 91196. London: Cambridge University
Press
Landauer, Thomas K. 1991. ‘Let’s get real: a position paper on the role of
cognitive psychology in the design of humanly useful and usable systems’.
In Designing interaction, edited by John M. Carroll, pp. 6073. Cambridge:
Cambridge University Press
Landauer, Thomas K. 1997. ‘Behavioural research methods in human–computer
interaction’. In Handbook of human–computer interaction, second,
completely revised edition, edited by Martin G. Helander, Thomas K.
Landauer and Prasad V. Prabhu, pp. 203–227. Amsterdam: Elsevier
Lane, Barnett. 1966. ‘Legibility depends on the reader’. Print in Britain, vol. 14,
no. 8, p. 12
[Lange, Rudi Wynand de]. 1992. ‘The legibility of sans serif typefaces’. Graphix,
December/January, 1993, pp. 1516
Lange, Rudi W. de, Hendry L. Esterhuizen, and Derek Beatty. 1993.
‘Performance differences between Times and Helvetica in a reading task’.
Electronic Publishing, vol. 6, no. 3, pp. 241248
Lange, Rudi Wynand de. 1993. ‘The legibility of sans serif typefaces, an
experimental and comparative study’. Master’s Diploma dissertation.
Bloemfontein: Technikon OFS
Leahey, Thomas Hardy. 1992. A history of psychology: main currents in
psychological thought. 3rd edn. Englewood Cliffs, New Jersey: Prentice Hall
Leat, S.J., W. Li, and K. Epp. 1999. ‘Crowding in central and eccentric vision: the
effects of contour interaction and attention’. Investigative Ophthalmology &
Visual Science, vol. 40, pp. 504–512
Lee, Marshall. 1979. Bookmaking: the illustrated guide to design/production/
editing. Second edition. New York: Bowker
Lee, Vicki L. 1988. Beyond behaviourism. Hillsdale, New Jersey: Lawrence
Erlbaum
Legge, Gordon E., Denis G. Pelli, Gary S. Rubin, and Mary M. Schleske. 1985a.
‘Psychophysics of reading 1: Normal vision’. Vision Research, vol. 25,
no. 2, pp. 239–252
Legge, Gordon E., Gary S. Rubin, Denis G. Pelli, and Mary M. Schleske. 1985b.
‘Psychophysics of reading 2: Low vision’. Vision Research, vol. 25, no. 2,
pp. 253–265
Legge, Gordon E., Gary S. Rubin, and Andrew Luebker. 1987. ‘Psychophysics of
reading 5: The role of contrast in normal vision’. Vision Research, vol. 27,
no. 7, pp. 1165–1177
Legge, Gordon E., Julie A. Ross, Kathleen T. Maxwell, and Andrew Luebker.
1989. ‘Psychophysics of reading 7: Comprehension in normal and low
vision’. Clinical Vision Science, vol. 4, no. 1, pp. 5160
Legge, Gordon E., Sonia J. Ahn, Timothy S. Klitz, and Andrew Luebker. 1997.
‘Psychophysics of reading 16: The visual span in normal and low vision’.
Vision Research, vol. 37, no. 14, pp. 1999–2010
Legros, Lucien Alphonse. 1922. A note on the legibility of printed matter.
Prepared for the information of the committee on type faces. London: His
Majesty’s Stationery Office
Legros, Lucien Alphonse, and John Cameron Grant. 1916. Typographical
printing-surfaces: the technology and mechanism of their production.
London: Longmans, Green, and Co
Lenze, J. 1989. ‘Serif vs. sans serif type fonts: a comparison based on readers’
comprehension’. (Not seen. Ref. in Kravutske 1994)
Level, Jeff. 1989. ‘Face to face with the daily news’. Fine Print, vol. 15, no. 1,
pp. 2330, plus a one-page bibliography on the back of a separately
attached fold-out poster
Levenston, E.A. 1992. The stuff of literature: physical aspects of texts and their
relation to literary meaning. Albany: State University of New York Press
Lewis, Clive. 1989. ‘Typographic factors in reading’. PhD thesis. Lancashire
Polytechnic, School of Psychology, Faculty of Science
Lewis, Clive, and Peter Walker. 1989. ‘Typographic influences on reading’.
British Journal of Psychology, vol. 80, no. 2, pp. 241257
Licko, Zuzana. 1990. ‘Typeface designs’. Emigre, no. 15, pp. 8–13
Licko, Zuzana. 1996a. Base 12/9. Sacramento: Emigre. (Promotional folder with
type specimens and discursive text.)
[Licko, Zuzana]. 1996b. A typeface designed by Zuzana Licko: based on the design
of Baskerville. Sacramento: Emigre. (Promotional pamphlet with type
specimens and discursive text.)
Long, John. 1996. ‘Specifying relations between research and the design of
human–computer interactions’. International Journal of Human–Computer
Studies, vol. 44, pp. 875–920
Long, Robert. 1995. ‘David Kindersley: in love with letters’. Serif: The Magazine of
Type & Typography, no. 3, 1995, pp. 3237
Lubell, Stephen. 1993. ‘Ranged right, ranged left, justified and all that … : a few
thoughts on legibility, readability, typographic tradition and De gustibus
non est disputandum’. TypeLab Gazette, ATypI (Association Typographique
Internationale) conference, Antwerp, 1993, unpaginated, 4 tabloid sized
pages
Luckiesh, Matthew, and Frank K. Moss. 1937. ‘The visibility of various type
faces’. Journal of the Franklin Institute, no. 223, January, pp. 7782
Luckiesh, Matthew, and Frank K. Moss. 1938. ‘Effects of leading on readability’.
Journal of Applied Psychology, vol. 22, pp. 140–160
Luckiesh, Matthew, and Frank K. Moss. 1942. Reading as a visual task. New
York: Van Nostrand
Lund, Ole. 1993. ‘Description and differentiation of sans serif typefaces’.
Postgraduate diploma dissertation. Department of Typography & Graphic
Communication, The University of Reading
Lund, Ole. 1995. Review article of In black & white: an r & d report on typography
and legibility, by Kim Pedersen and Anders Kidmose. Information Design
Journal, vol. 8, no. 1, pp. 9195
Lund, Ole. 1997. ‘Why serifs are (still) important’. Typography Papers, no. 2,
pp. 91–104
Lund, Ole. 1997. Review article of Type & layout: how typography and design can
get your message across or get in the way, by Colin Wheildon. Information
Design Journal, vol. 9, no. 1, pp. 7477
M
——————
Macdonald-Ross, Michael. 1978. ‘Graphics in text’. In Review of research in
education, vol. 5, 1977, edited by Lee S. Shulman, pp. 49–85. Itasca,
Illinois: F.E. Peacock Publishers, a publication of the American Educational
Research Association
Macdonald-Ross, Michael. 1989. ‘Towards a graphic ecology’. In Knowledge
acquisition from text and pictures, edited by Heinz Mandl and Joel R. Levin,
pp. 145154. Amsterdam: North Holland, Elsevier Science Publishers
Macdonald-Ross, Michael. 1994. ‘Print media, production of’. In The international
encyclopedia of education, second edition, volume 8, pp. 46894694
Macdonald-Ross, Michael, and Eleanor Smith. 1977. Graphics in text: a
bibliography. IET monograph no. 6. Milton Keynes: Institute of Educational
Technology, The Open University
Macdonald-Ross, Michael, and Robert Waller. 1975. ‘Criticism, alternatives and
tests: a conceptual framework for improving typography’. Programmed
Learning and Educational Technology, vol. 12, no. 2, pp. 7583
Mack, George E. 1979. ‘Opinion/Commentary’. Communication Arts, vol. 21,
pt. 2, May/June, pp. 9697
MacKenzie, Maureen. 1994. ‘Our changing visual environment: questions and
challenges’. Communication News (of the Communication Research
Institute of Australia), vol. 7, no. 4, pp. 1114
Mackintosh, N.J. (ed). 1995. Cyril Burt: fraud or framed? Oxford: Oxford
University Press
Mackworth, N. 1965. ‘Visual noise causes tunnel vision’. Psychonomic Science,
no. 3, pp. 6768
Mansfield, J. Stephen, Gordon E. Legge, and Mark C. Bane. 1996. ‘Psychophysics
of reading 15: Font effects in normal and low vision’. Investigative
Ophthalmology & Visual Science, vol. 37, no. 8, pp. 1492–1501
Manwaring, Tony, and Stephen Wood. 1984. ‘The ghost in the machine: tacit skills
in the labor process’. Socialist Review, no. 74, March/April, pp. 5583
Margolin, Victor. 1988. ‘A decade of design history in the United States 1977–87’.
Journal of Design History, vol. 1, no. 1, pp. 51–72
Margolin, Victor. 1992. ‘Design history or design studies: subject matter and
methods’. Design Studies, vol. 13, no. 2, pp. 104–116. (Also published in
Design Issues, vol. 11, no. 1, 1995, pp. 4–15)
Margolin, Victor. 1998. ‘Design studies: proposal for a new doctorate’. In The
education of a graphic designer, edited by Steven Heller, pp. 163–170. New
York: Allworth Press
Marr, David. 1982. Vision: a computational investigation into the human
representation and processing of visual information. San Francisco:
W.H. Freeman
Mason, John. 1994. ‘British Airports Authority’. Directions: newsletter of the Sign
Design Society, vol. 1, no. 6, pp. [13]
Massin. 1970. Letter and image. Translated [from French] by Caroline Hillier
and Vivienne Menkes. London: Studio Vista
McClelland, J.L., and D.E. Rumelhart. 1981. ‘An interactive model of context
effects in letter perception: part 1: an account of findings’. Psychological
Review, vol. 88, no. 5, pp. 375407
McCullough, Malcolm. 1996. Abstracting craft: the practical digital hand.
Cambridge, Massachusetts: The MIT Press
McGann, Jerome. 1988a. The beauty of inflections: literary investigations in
historical method and theory. Oxford: Clarendon Press
McGann, Jerome. 1988b. ‘Theory of texts’. London Review of Books, vol. 10, no. 4,
18 February, pp. 2021. (Review essay on D.F. McKenzie’s Bibliography
and the sociology of texts: the Panizzi lectures 1985)
McGrew, Mac. 1993. American metal typefaces of the twentieth century. 2nd ed.
New Castle, Delaware: Oak Knoll Books
McLean, Ruari. 1980. The Thames and Hudson manual of typography. London:
Thames and Hudson
Mergenthaler Linotype. 1935. The legibility of type. Brooklyn, New York:
Mergenthaler Linotype Company
Michaels, Claire F., and Claudia Carello. 1981. Direct perception. Englewood
Cliffs, New Jersey: Prentice-Hall
Misanchuk, Earl R. 1989a. ‘Learner/user preferences for fonts in microcomputer
screen display’. Canadian Journal of Educational Communication, vol. 18,
no. 3, pp. 193205
Misanchuk, Earl R. 1989b. ‘Learner preferences for typeface (font) and leading in
print materials’. ERIC no.: ED307854
Michl, Jan. 1991. ‘On the rumor of functional perfection’. Pro Forma, no. 2,
pp. 6781
Michl, Jan. 1995. ‘Form follows what?: the modernist notion of function as a carte
blanche’. Magazine of the Faculty of Architecture & Town Planning, no. 10,
pp. 31–20 [sic]
Mijksenaar, Paul. 1997. Visual function: an introduction to information design.
Rotterdam: 010 Publishers
Ministry of Transport. 1962. Traffic signs for motorways: final report of advisory
committee. London: Her Majesty’s Stationery Office. (Title on cover:
Motorway signs: final report of advisory committee on traffic signs for
motorways.) (The Anderson report)
Ministry of Transport. 1963. Report of the traffic signs committee: 18th April
1963. London: Her Majesty’s Stationery Office. (Title on cover: Traffic signs
1963: report of the committee on traffic signs for all-purpose roads.) (The
Worboys report)
Moede [1931/1932]. See Futuraheft. [1931/1932]
Molander, Bengt. 1996. Kunskap i handling. Göteborg: Daidalos
Moles, Abraham. [1958] 1966. Information theory and esthetic perception. Urbana
and London: University of Illinois Press (Originally published in 1958 as
Théorie de l’information et perception esthétique)
Moles, Abraham. 1986. ‘The legibility of the world: a project of graphic design’.
Design Issues, vol. 3, no. 1, pp. 4353. (Republished facsimile-like page by
page in Design discourse: history, theory, criticism, edited by Victor
Margolin, pp. 119129. Chicago: University of Chicago Press, 1989)
Mollerup, Per. [1997] 1998. Marks of excellence: the history and taxonomy of
trademarks. Revised edition. London: Phaidon Press
Monotype Recorder. 1964. ‘Map design and typography’, vol. 43, no. 1, pp. 42–52
Moore, R.L., and A.W. Christie. 1963. ‘Research on traffic signs’. In Engineering
for Traffic Conference, July, 1963, p. 116
Moriarty, Sandra Ernst, and Edward C. Scheiner. 1984. ‘A study of close-set text
type’. Journal of Applied Psych ology, vol. 69, no. 4, pp. 700702
Morison, Stanley. 1996. First principles of typography. New edition. Introduction
by Huib van Krimpen. Preface by David McKitterick. Leiden: Academic
Press
Morris, Robert A. 1988. ‘Image processing aspects of type’. In Document
manipulation and typography. Proceedings of the international conference
on electronic publishing, document manipulation and typography, Nice,
April 20–22 1988. Edited by J.C. van Vliet, pp. 139–155. Cambridge:
Cambridge University Press
Morris, R.A. 1989. ‘Rendering digital type: a historical and economic view of
technology’. The Computer Journal, vol. 32, no. 6, pp. 524532
Morris, R.A., K. Berry, K.A. Hargreaves, and D. Liarokapis. 1991. ‘How typeface
variation and typographic scaling affect readability at small sizes’. In
Proceedings of the 7th International congress on advances in non-impact
printing technologies. Portland: Society for Imaging Science and Technology
Morris, R.A., K. Berry, and K.A. Hargreaves. 1993. ‘Towards quantification of the
effects of typographic variation on readability’. 1993 SID international
symposium: digest of technical papers, edited by Jay Morreale, vol. 24,
pp. 6467. Playa del Rey, California: Society for Information Display
Morris, R.A., R.D. Hersch, and A. Coimbra. 1998. ‘Legibility of condensed
perceptually-tuned grayscale fonts’. In Electronic publishing, artistic
imaging and digital typography, edited by Roger D. Hersch, Jacques André,
and Heather Brown, pp. 281–293. Berlin: Springer-Verlag
Morrison, Gary R. 1986. ‘Communicability of the emotional connotation of type’.
Educational Communication and Technology, vol. 34, no. 4, pp. 235244
Morrison, Robert E., and Albrecht-Werner Inhoff. 1981. ‘Visual factors and eye
movements in reading’. Visible Language, vol. 15, no. 2, pp. 129–146
Mosley, James. 1965. ‘The nymph and the grot: the revival of the sanserif letter’.
Typographica, no. 12, pp. 2–19
Mosley, James. 1999. The nymph and the grot: the revival of the sanserif letter.
London: Friends of the St. Bride Printing Library. (Updated version of
Mosley 1965)
Myers, Prue Wallis. 1983. ‘Handwriting in English education’. Visible Language,
vol. 17, no. 4, pp. 333–356
N
——————
Nagel, Ernest. 1961. The structure of science: problems in the logic of scientific
explanation. London: Routledge & Kegan Paul
Naiman, Avi C. 1991. The use of grayscale for improved character presentation.
Dynamic graphics project technical memo DGP-91-01. Toronto: Computer
systems research institute, University of Toronto. (Avi Naiman’s PhD
thesis)
Nash, Walter. 1980. Designs in prose: a study of compositional problems and
methods. London and New York: Longman (Extensively rewritten and
updated as: Walter Nash and David Stacey. Creating texts: an introduction
to the study of composition. London and New York: Longman, 1997)
Nell, Victor. 1988. Lost in a book: the psychology of reading for pleasure. New
Haven and London: Yale University Press
Neman, Thomas Edward. 1968. ‘The relative legibility of three cold type faces,
three line lengths, three paper stocks, and the interaction of these three
variables’. PhD dissertation. Indiana University
Nes, F.L. van. 1991. ‘Visual ergonomics of displays’. In The man–machine
interface, edited by J.A.J. Roufs, pp. 70–82. Basingstoke: Macmillan Press
Neue Deutsche Biographie. 1994. Siebzehnter Band: MelanderMoller. Berlin:
Duncker & Humblot
Nielsen, Jakob. 1993. Usability engineering. Boston: Academic Press
Nolan, Carson Y. 1959. ‘Readability of large types: a study of type sizes and type
styles’. International Journal of Education for the Blind, vol. 9, December,
pp. 4144
Norman, Donald A. 1988. The psychology of everyday things. New York: Basic
Books (This book has since been published under the title: The design of
everyday things)
Norman, Donald A. 1995. ‘On differences between research and practice’.
Ergonomics in Design, April, pp. 3536
Norris, Christopher. 1995. ‘Truth, science, and the growth of knowledge’. New
Left Review, no. 210, pp. 105123. (Also published in Christopher Norris,
Reclaiming truth: contribution to a critique of cultural relativism,
pp. 154179. London: Lawrence & Wishart, 1996)
Nunberg, Geoffrey. 1990. The linguistics of punctuation. Stanford: Center for the
study of language and information
Nyce, James M., and Jonas Löwgren. 1995. ‘Toward foundational analysis in
human–computer interaction’. In The social and interactional dimensions of
human–computer interfaces, edited by Peter J. Thomas, pp. 37–47.
Cambridge: Cambridge University Press
O
——————
Oborne, David J. 1987. Ergonomics at work. 2nd edn. Chichester: John Wiley
OED. 1989. The Oxford English dictionary. 2nd ed. (20 vols). Oxford: Oxford
University Press
O’Regan, J. Kevin. 1990. ‘Eye movements and reading’. In Eye movements and
their role in visual and cognitive processes, edited by Eileen Kowler,
pp. 395–453. Amsterdam and New York: Elsevier Science
O’Regan, Kevin, Nicole Bismuth, Roger D. Hersch, and Alexandros Pappas. 1996.
‘Legibility of perceptually-tuned grayscale fonts’. International conference
on image processing 1996, vol. 1, pp. 537–540
Östberg, Olov, Houshang Shahnavaz, and Rikard Stenberg. 1989. ‘Legibility
testing of visual display screens’. Behaviour & Information Technology,
vol. 8, no. 2, pp. 145–153
Outhwaite, William. 1987. New philosophies of social science: realism,
hermeneutics and critical theory. New York: St. Martin’s Press
Ovink, G.W. 1938. Legibility, atmosphere-value and forms of printing types.
Leiden: A.W. Sijthoff’s Uitgeversmaatschappij
P
——————
Pace, Ann Jaffe. 1982. ‘Analysing and describing the structure of text’. In The
technology of text: principles for structuring, designing, and displaying text,
[volume one], edited by David H. Jonassen, pp. 15–27. Englewood Cliffs,
New Jersey: Educational Technology Publications
Palmer, Carl. 1979–1981. ‘Legibility is the cornerstone of typography’, [which is
the title of the first ten of sixteen articles on typography and legibility
running in] Graphic Arts Monthly, [from May 1979 to June 1981]
Pardoe, F.E. 1975. John Baskerville of Birmingham: letter founder & printer.
London: Frederick Muller
Paterson, Donald G., and Miles A. Tinker. 1932a. ‘Studies of typographical
factors influencing speed of reading. 8, Space between lines or leading’.
Journal of Applied Psychology, vol. 16, pp. 388–397
Paterson, Donald G., and Miles A. Tinker. 1932b. ‘Studies of typographical
factors influencing speed of reading. 10, Style of type face’. Journal of
Applied Psychology, vol. 16, pp. 605–613
Paterson, Donald G., and Miles A. Tinker. 1940a. How to make type readable: a
manual for typographers, printers and advertisers. New York and London:
Harper & Brothers
Paterson, Donald G., and Miles A. Tinker. 1940b. ‘Influence of line width on eye
movements’. Journal of Experimental Psychology, vol. 27, pp. 572–577
Paterson, Donald G., and Miles A. Tinker. 1947. ‘The effect of typography upon the
perceptual span in reading’. American Journal of Psychology, vol. 60,
pp. 388–396
Payne, Donald E. 1967. ‘Readability of typewritten material: proportional versus
standard spacing’. Journal of Typographic Research, vol. 1, no. 2,
pp. 125–136
Pedersen, Kim, and Anders Kidmose. 1993. In black and white: an r & d report
on typography and legibility. Copenhagen: The Graphic College of
Denmark. (Bilingual text in Danish and English)
Pedhazur, Elazar J., and Liora Pedhazur Schmelkin. 1991. Measurement, design,
and analysis: an integrated approach. Hillsdale, New Jersey: Lawrence
Erlbaum Associates
Peters, John. 1962. ‘Sans-serifs and legibility’. Letter from a reader. Monotype
Newsletter, no. 68, p. 12
Phillips, Russel M. 1976. ‘The interacting effects of letter style, letter stroke-
width and letter size on the legibility of projected high contrast lettering’.
EdD dissertation. Indiana University
Pleasants, Nigel. 1996. ‘Nothing is concealed: de-centering tacit knowledge and
rules from social theory’. Journal for the Theory of Social Behaviour, vol. 26, no. 3, pp. 233–255
Poffenberger, A.T., and R.B. Franken. 1923. ‘Appropriateness of type faces’. Journal of Applied Psychology, vol. 7, pp. 312–329
Polanyi, Michael. 1958. Personal knowledge: towards a post-critical philosophy.
London: Routledge & Kegan Paul; Chicago: University of Chicago Press
Polanyi, Michael. 1966. The tacit dimension. London: Routledge and Kegan Paul
Polkinghorne, Donald E. 1992. ‘Postmodern epistemology of practice’. In
Psychology and postmodernism, edited by Steinar Kvale, pp. 146–165.
London: Sage
Popper, Karl. 1963. Conjectures and refutations: the growth of scientific
knowledge. London and Henley: Routledge and Kegan Paul
Poulton, E.C. 1960. ‘A note on printing to make comprehension easier’.
Ergonomics, vol. 3, no. 3, pp. 245–248
Poulton, E.C. 1965. ‘Letter differentiation and rate of comprehension in reading’.
Journal of Applied Psychology, vol. 49, no. 5, pp. 358–362
Poulton, E.C. 1968. ‘The measurement of legibility’. Printing Technology, vol. 12,
no. 2, pp. 72–76
Poynor, Rick. 1999. ‘The great type debate rages on’. Graphis, no. 319, pp. [1–4],
plus one page
Prince, J.H. 1967. ‘Printing for the visually handicapped’. Journal of Typographic Research, vol. 1, no. 1, pp. 31–47
Pyke, R.L. 1926. Report on the legibility of print. Medical Research Council,
Special report series, no. 110. London: His Majesty’s Stationery Office
R
——————
Rainford, Paul. 1996. ‘Motorway signs’. Designing, no. 44, p. 13
Ralph, J.B. 1982. ‘A geriatric visual concern: the need for publishing guidelines’.
Journal of the American Optometric Association, vol. 53, pp. 43–50
Rannem, Øyvin. 1981. ‘Leselighet: typografiens første prinsipp’. Norsk Grafisk
Tidsskrift, November, pp. 6–9
Rannem, Øyvin. 1983. No title. (A comprehensive privately circulated
bibliography on typography and legibility). Oslo
Renear, Allen, Elli Mylonas, and David Durand. 1996. ‘Refining our notion of
what text really is: the problem of overlapping hierarchies’. In Research in
humanities computing 4. Selected papers from the ALLC/ACH conference,
Christ Church, Oxford, April 1992, pp. 263–280. Oxford: Clarendon Press
Rayner, Keith. 1981. ‘Visual cues in word recognition and reading: introduction’.
Visible Language, vol. 15, no. 2, pp. 125–127
Rayner, Keith (ed). 1983. Eye movements in reading: perceptual and language
processes. New York and London: Academic Press
Rayner, Keith, and Alexander Pollatsek. 1989. The psychology of reading.
Englewood Cliffs, New Jersey: Prentice Hall
Rayner, Keith, and Alexander Pollatsek. 1996. ‘Reading unspaced text is not
easy: comments on the implications of Epelboim et al.’s (1994) study for
models of eye movement control in reading’. Vision Research, vol. 36, no. 3, pp. 461–465
Reed, Edward S. 1988. James J. Gibson and the psychology of perception. New
Haven and London: Yale University Press
Regan, D., and X.H. Hong. 1994. ‘Recognition and detection of texture-defined
letters’. Vision Research, vol. 34, pp. 2403–2407
Rehe, Rolf F. 1970. ‘Psychological studies and their impact on modern
typography’. Inland Printer / American Lithographer, part one in vol. 164,
no. 6, p. 53, and part two in vol. 165, no. 1, p. 66
Rehe, Rolf F. 1971. ‘Psychologische Studien und ihre Bedeutung für die Typografie
von heute’. Der Druckspiegel, vol. 26, no. 3, p. 152
Rehe, Rolf F. [1972] 1984. Typography: how to make it most legible. 5th edn.
Carmel, Indiana: Design Research International
Rehe, Rolf F. 1985. Typography and design for newspapers. Darmstadt: The
International Association for Newspaper and Media Technology
Reynolds, Linda. 1979a. ‘The Graphic Information Research Unit: background
and recent research’. Visible Language, vol. 13, no. 4, pp. 428–448
Reynolds, Linda. 1979b. ‘Legibility studies: their relevance to present day documentation methods’. Journal of Documentation, vol. 35, no. 4, pp. 307–340
Reynolds, Linda. 1984. ‘The legibility of printed scientific and technical
information’. In Information design: the design and evaluation of signs and
printed material, edited by Ronald Easterby and Harm Zwaga, pp. 187–208.
Chichester: John Wiley
Reynolds, Linda. 1988. ‘Legibility’. Baseline, no. 10, pp. 26–29
Richaudeau, François (ed.). 1984. Recherches actuelles sur la lisibilité. Paris:
Editions Retz
Rivlin, Christopher. 1987. ‘A study of the relationship between visual perception
and typographic organisation’. PhD thesis. Manchester Polytechnic
Robinson, David Owen, Michael Abbamonte and Selby H. Evans. 1971. ‘Why
serifs are important: the perception of small print’. Visible Language, vol. 5, no. 4, pp. 353–359
Rock, Irvin. 1984. Perception. New York: Scientific American Library
Roethlein, Barbara Elizabeth. 1912. ‘The relative legibility of different faces of
printing types’. American Journal of Psychology, vol. 23, no. 1, pp. 1–36.
(Also published in the series Publications of the Clark University Library,
as vol. 3, no. 1. Worcester, Massachusetts: Clark University Press)
Rögener, Stefan, Albert-Jan Pool, and Ursula Packhäuser. 1995. Branding with
type. Edited by E.M. Ginger, translated by Stephanie Tripier. Mountain
View, California: Adobe Press. (German edition: Typen machen Marken
mächtig. Hamburg: AdFinder, 1995)
Rogers, Tim B. 1992. ‘Antecedents of operationism: a case history in radical
positivism’. In Positivism in psychology: historical and contemporary problems, edited by Charles W. Tolman, pp. 57–65. New York: Springer-Verlag
Rolf, Bertil. 1991. Profession, tradition och tyst kunskap: en studie i Michael
Polanyis teori om den professionella kunskapens tysta dimension. Sweden:
Bokförlaget Nya Doxa
Rollins, Carl Purington. 1942. ‘Gilding the lily: in the designing of books there’s
no sin like complacency’. In Bookmaking & kindred amenities, edited by
Earl Schenck Miers and Richard Ellis, pp. 21–31. New Brunswick, New
Jersey: Rutgers University Press
Rooum, Donald. 1981. ‘Cyril Burt’s “A psychological study of typography”: a
reappraisal’. Typos: A Journal of Typography, no. 4, pp. 37–40. London
College of Printing
Rossum, Mark van. 1997. ‘A new test of legibility’. Quærendo, vol. 27, no. 2,
pp. 141–147
Roth, Joel. 1969. ‘Excerpt: typography that makes the reader work’. Journal of
Typographic Research, vol. 3, no. 2, pp. 193–196
Roth, Susan King. 1999. ‘The state of design research’. Design Issues, vol. 15,
no. 2, pp. 18–26
Rowe, Camille L. 1982. ‘The connotative dimensions of selected display
typefaces’. Information Design Journal, vol. 3, no. 1, pp. 30–37
Rubinstein, Richard. 1988. Digital typography: an introduction to type and
composition for computer system design. Reading, Massachusetts: Addison-
Wesley
Ruder, Emil. 1957. ‘Univers: eine Grotesk von Adrian Frutiger’. Typografische
Monatsblätter, no. 5, 8 pp.
Ruder, Emil. 1959. ‘Univers: eine neue Grotesk von Adrian Frutiger / Univers: a
new sans-serif type by Adrian Frutiger’. Neue Grafik / New Graphic
Design / Graphisme actuel, no. 2, pp. 55–57
Ruder, Emil. 1961. ‘Univers and contemporary typography’. Print in Britain,
no. 1, vol. 9, pp. 22–23
Ruediger, W.C. 1907. ‘The field of distinct vision: with reference to individual
differences and their correlations’. Archives of Psychology, vol. 1, no. 5,
pp. 1–68
Runeson, Sverker. 1977. ‘On the possibility of “smart” perceptual mechanisms’.
Scandinavian Journal of Psychology, vol. 18, pp. 172–179
Ryle, Gilbert. [1949] 1973. The concept of mind. Harmondsworth: Penguin Books
S
——————
Saenger, Paul. 1997. Space between words: the origins of silent reading. Stanford:
Stanford University Press
Samuel, Raphael. 1991/1992. ‘Reading the signs’ [part 1]; and ‘Reading the signs: [Part] 2. Fact-grubbers and mind-readers’. History Workshop, respectively no. 32, 1991, pp. 88–109; and no. 33, 1992, pp. 220–251
Salcedo, Rodolfo N., Hadley Read, James F. Evans, and Ada C. Kong. 1972.
‘A broader look at legibility’. Journalism Quarterly, vol. 49, pt. 2,
pp. 285–289, 295
Sassoon, Rosemary. 1988. ‘Joins in children’s handwriting, and the effects of
different models and teaching methods’. PhD thesis. Department of
Typography & Graphic Communication, The University of Reading
Sassoon, Rosemary. 1999. Handwriting of the twentieth century. London and
New York: Routledge
Schmidt, Nils. 1906. ‘Några ord om huru böcker och tidningar ur ögon-hygienisk
synpunkt böra tryckas’. Föredrag vid sv. graf. Fackskolefören. Nordisk
Boktryckarekonst, vol. 7, no. 3, pp. 105–109
Schön, Donald A. 1983. The reflective practitioner: how professionals think in
action. London: Temple Smith. New York: Basic Books
Schön, Donald A. 1987. Educating the reflective practitioner: toward a new design
for teaching and learning in the professions. San Francisco: Jossey-Bass
Publishers
Schriver, Karen A. 1997. Dynamics in document design: creating text for readers.
New York: John Wiley
Schriver, Karen A. 1998. ‘A review of “Building the bridge across the years and disciplines”’. Information Design Journal, vol. 9, no. 1, pp. 11–15
Schulz-Anker, Erik. 1969. Formanalyse und Dokumentation einer serifenlosen Linearschrift auf neuer Basis: Syntax Antiqua. Frankfurt am Main: Stempel. (Also published in Novum Gebrauchsgraphik, no. 8, 1970, pp. 49–56; and in Druck-Print, no. 1, 1970, pp. 20–23)
Schumacher, Gary M., and Robert Waller. 1985. ‘Testing design alternatives:
a comparison of procedures’. In Designing usable texts, edited by Thomas
Duffy and Robert Waller, pp. 377–403. Orlando and London: Academic
Press
Scriven, M. 1967. ‘The methodology of evaluation’. In Perspectives of curriculum evaluation, edited by R.W. Tyler, R. Gagné, and M. Scriven, pp. 39–83.
Chicago: Rand McNally
Searle, John. 1980. ‘Minds, brains, and programs’. Behavioral and Brain
Sciences, vol. 3, no. 3, pp. 417–424. (Republished on numerous occasions in
collections and readers)
Searle, John R. 1990. ‘Cognitive science and the computer metaphor’. In
Artificial intelligence, culture and language: on education and work, edited
by Bo Göranzon and Magnus Florin, pp. 23–34. London: Springer-Verlag
Searle, John R. 1992. The rediscovery of the mind. Cambridge, Massachusetts:
The MIT Press
Seybold, Jonathan. 1991. ‘Adobe’s “Multimaster” technology: breakthrough in
type aesthetics’. The Seybold Report on Desktop Publishing, vol. 5, no. 7,
pp. 3–7
Shaw, Alison. 1969. Print for partial sight: a report to the Library Association
sub-committee on books for readers with defective sight. London: The
Library Association
Shaw, Montague. 1989. David Kindersley: his work and workshop. Cambridge:
Cardozo Kindersley Editions and Uitgeverij de Buitenkant
Shaw, Paul. 1989. ‘The Century family’. In Fine Print on type: the best of Fine
Print: a magazine on type and typography, edited by Charles Bigelow, Paul
Hayden Duensing and Linnea Gentry, pp. 46–49. San Francisco: Fine Print
and Bedford Arts
Shaw, Paul. 1996. ‘Baskerville revisited’. Print, vol. 50, no. 6, pp. 28D, 28F,
115–116. (Review of the typeface family Mrs. Eaves, designed by Zuzana
Licko, Emigre Fonts)
Silver, N. Clayton, and Curt C. Braun. 1993. ‘Perceived readability of warning
labels with varied font sizes and styles’. Safety Science, vol. 16, no. 5/6, pp. 615–625
Silver, N. Clayton, Paul B. Kline, and Curt C. Braun. 1994. ‘Type form variables:
differences in perceived readability and perceived hazardousness’.
Proceedings of the Human Factors and Ergonomics Society 38th annual
meeting 1994, pp. 821–825
Simon, Herbert A. [1969] 1996. The sciences of the artificial. Third edition.
Cambridge, Massachusetts: The MIT Press
Skarzenski, Emily N. 1996. Book review of Type & layout: how typography and design can get your message across or get in the way, by Colin Wheildon. Technical Communication, fourth quarter, pp. 424–426
Skjervheim, Hans. [1957] 1996. ‘Deltakar og tilskodar’. In Deltakar og tilskodar
og andre essays, [second edition], pp. 71–87. Oslo: Aschehoug. (Published
the very first time as a multiplied typewritten manuscript by the
Department of Sociology at the University of Oslo, 1957)
Skjervheim, Hans. [1963] 1996. ‘Fenomenologi og psykologi’. In Deltakar og
tilskodar og andre essays, [second edition], pp. 169–185. Oslo: Aschehoug.
(Published the very first time in Metapsykologi, edited by Carl Erik
Grenness and Steinar Kvale, Oslo, 1963)
Sless, David. 1981. Learning and visual communication. London: Croom Helm;
New York and Toronto: John Wiley
Sless, David. 1994. ‘What is information design?’ In Designing information for
people: proceedings from the symposium, edited by Robyn Penman and
David Sless, pp. 1–16. Hackett: Communication Research Institute of
Australia
Sless, David. 1996. ‘Better information presentation: satisfying consumers?’. Visible Language, vol. 30, no. 3, pp. 246–267
Sless, David. 1997. ‘Building the bridge across the years and the disciplines’.
Information Design Journal, vol. 9, no. 1, pp. 3–10, 27–28
Smedshammar, Hans, Kerstin Frenckner, Caroline Nordquist, and Steffan
Romberger. 1990. Läslighet på bildskärm hos några teckensnitt. (English
abstract: Legibility of four different typefaces on computer screens).
TRITA-NA-P9022. Stockholm: Department of numerical analysis and
computing science, Royal Institute of Technology
Smith, Margaret M. 1983. ‘Form and its relationship to content in the design of
incunables’. PhD dissertation. University of Cambridge
Smith, Margaret M. 1987. ‘Printed foliation: forerunner to printed page-
numbers?’ Gutenberg Jahrbuch, 1987, pp. 54–70
Smith, Margaret M. 1993. ‘The pre-history of “small caps”: from all caps to
smaller capitals to small caps’. Journal of the Printing Historical Society,
no. 22, pp. 79–106
Smith, Margaret M. 1994. ‘The design relationship between the manuscript and
the incunable’. In A millennium of the book: production, design &
illustration in manuscript & print: 900–1900, edited by Robin Myers and Michael Harris, pp. 23–44. Winchester: St Paul’s Bibliographies. Delaware:
Oak Knoll Press
Smither, J.A., and C.C. Braun. 1994. ‘Readability of prescription drug labels by
older and younger adults’. Journal of Clinical Psychology in Medical Settings, vol. 1, no. 2, pp. 149–159
Snyder, Gertrude, and Alan Peckolick. 1985. Herb Lubalin: art director, graphic
designer and typographer. New York: American Showcase
Solli, Susanna M. 1998. ‘Bourdieu og konstruksjonen av forskningsobjektet’.
In Modernitet–refleksjoner og idébrytninger: en antologi, edited by Elisabeth L’orange Fürst and Øystein Nilsen, pp. 235–253. Oslo: Cappelen
Akademisk Forlag
Sorg, J.A. 1985. ‘An exploratory study of type face, type size, and color paper
preferences among older adults’. Master thesis. Pennsylvania State
University
Southall, Richard. 1988. ‘Visual structure and the transmission of meaning’.
In Document manipulation and typography. Proceedings of the
international conference on electronic publishing, document manipulation
and typography, Nice (France) April 20–22 1988, edited by J.C. van Vliet, pp. 35–45. Cambridge: Cambridge University Press
Southall, Richard. 1989. ‘Interfaces between the designer and the document’.
In Structured documents, edited by J. André, R. Furuta, and V. Quint,
pp. 119–131. Cambridge: Cambridge University Press
Southall, Richard. 1992. ‘Presentation rules and rules of composition in the
formatting of complex text’. In EP92: proceedings of Electronic Publishing
1992, edited by C. Vanoirbeek and G. Coray, pp. 275–290. Cambridge:
Cambridge University Press
Souttar, James. 1992. ‘The voice of the establishment’. The Monotype conference
1992: typographic connections: a European conference about typographic
history, theory and practice: proceedings. Queens’ College, University of
Cambridge, 13 September, 1992
Spencer, Herbert. 1961. ‘Mile-a-minute typography?’. Typographica, series 2, no. 4, pp. 2–28
Spencer, Herbert. 1969. The visible word. 2nd edn. London: Lund Humphries /
Royal College of Art
Spencer, Herbert. 1974. ‘Legibility research in information publishing’. Penrose Annual, vol. 67, pp. 153–162
Spencer, Herbert, Linda Reynolds, and Brian Coe. [1975] 1977. The effect of
image degradation and background noise on the legibility of text and
numerals in four different typefaces. [London]: Readability of Print
Research Unit, Royal College of Art (revised 1977)
Spiekermann, Erik. 1986. ‘Post mortem: or how I once designed a typeface for
Europe’s biggest company’. Baseline, no. 7, pp. 6–7
Spiekermann, Erik, and E.M. Ginger. 1993. Stop stealing sheep & find out how
type works. Mountain View, California: Adobe Press
Stahl, Albert F. 1989. ‘The effects of type style upon readability’. PhD
dissertation. Wayne State University
Stake, Robert E. 1994. ‘Case studies’. In Handbook of qualitative research, edited
by Norman K. Denzin and Yvonna S. Lincoln, pp. 236–247. Thousand Oaks,
California: Sage Publications
Stamm, Beat. 1998. ‘Visual TrueType: a graphical method for authoring font
intelligence’. In Electronic publishing, artistic imaging and digital
typography, edited by Roger D. Hersch, Jacques André, and Heather
Brown, pp. 77–92. Berlin: Springer-Verlag
Stanton, Neville (ed.). 1998. Human factors in consumer products. London:
Taylor & Francis
Steer, Vincent. No date [first edition 1934]. Printing design and layout: the
manual for printers, typographers and all designers and users of printing
and advertising. With a foreword by Beatrice Warde. London: Virtue and
Company
Stern, John A., Donna Boyer, and David Schroeder. 1994. ‘Blink rate: a possible
measure of fatigue’. Human Factors, vol. 36, no. 2, pp. 285–297
Stiff, Paul. 1993a. ‘Stop sitting around and start reading’. Eye, no. 11, vol. 3,
pp. 4–5
Stiff, Paul. 1993b. ‘Graphic design, MetaDesign, and information design’.
Information Design Journal, vol. 7, no. 1, pp. 41–46
Stiff, Paul. 1994. ‘Structuralists, stylists, and forgotten readers’. Information
Design Journal, vol. 7, no. 3, pp. 227–241
Stiff, Paul. 1995a. ‘Design methods, cultural diversity, and the limits of
designing’. Information Design Journal, vol. 8, no. 1, pp. 36–47
Stiff, Paul. 1995b. ‘Public graphics: the Lunteren symposium’. Information
Design Journal, vol. 8, no. 1, pp. 64–71
Stiff, Paul. 1996a. ‘The end of line: a survey of unjustified typography’.
Information Design Journal, vol. 8, no. 2, pp. 125–152
Stiff, Paul. 1996b. ‘Instructing the printer: what specification tells about
typographic designing’. Typography Papers, no. 1, pp. 27–74
Stone, Deborah Berman. 1997. ‘The legibility of text on paper and laptop
computer: a multivariable approach’. PhD dissertation. University of
Maryland at College Park
Stone, Deborah B., Sylvia K. Fisher, and John Eliot. 1999. ‘Adults’ prior exposure to print as a predictor of the legibility of text on paper and laptop computer’. Reading and Writing, vol. 11, no. 1, pp. 1–28
Suen, C.Y., and M.K. Komoda. 1986. ‘Legibility of digital type-fonts and
comprehension in reading’. In Text processing and document manipulation. Proceedings of the international conference, University of Nottingham, 14–16 April 1986, edited by J.C. van Vliet, pp. 178–187. Cambridge:
Cambridge University Press on behalf of The British Computer Society
Sutherland, Sandra Wright. 1989. ‘Miles Albert Tinker and the zone of optimal
typography’. PhD dissertation. [Seattle]: University of Washington
T
——————
Tantillo, J., J. Di Lorenzo-Aiss, and R.E. Mathisen. 1995. ‘Quantifying perceived differences in type style: an exploratory study’. Psychology & Marketing, vol. 12, no. 5, pp. 447–457
Taylor, Jeffrey Lynn. 1990. ‘The effect of typeface on reading rates and the typeface preferences of individual readers’. PhD dissertation. Wayne State
University
Taylor, Conrad. 1998. ‘Comments on David Sless’s Schwarzenberg paper’. Information Design Journal, vol. 9, no. 1, pp. 23–25
Tarr, John C. 1947. ‘Legibility in printing’. Schweizer Graphische Mitteilungen, Jahrgang 66, Heft 3, pp. 97–100
Tarr, John C. 1949. ‘A critical discursus on type legibility’. Penrose Annual, vol. 43, pp. 29–31
Tempte, Nils. 1961. Pedagogiska och typografiska anspråk på läroböckerna.
Nordisk Bokuke, Gausdal, September 1961
Thompson, Evan. 1995. Colour vision: a study in cognitive science and the
philosophy of perception. London and New York: Routledge
Thompson, Martyn P. 1993. ‘Reception theory and the interpretation of historical
meaning’. History and Theory, vol. 32, no. 3, pp. 248–272
Tinker, Miles A. 1944. ‘Criteria for determining the readability of type faces’. Journal of Educational Psychology, vol. 35, no. 7, pp. 385–396
Tinker, Miles A. 1963. Legibility of print. Ames, Iowa: Iowa State University
Press
Tinker, Miles A. 1965. Bases for effective reading. Minneapolis: University of
Minnesota Press
Tinker, Miles A., and Donald G. Paterson. 1942. ‘Reader preferences and
typography’. Journal of Applied Psychology, vol. 26, pp. 38–40
Tinker, Miles A., and Donald G. Paterson. 1949. ‘What is legibility?’ Print, vol. 6,
no. 2, p. 61
Tinker, Miles A., and Donald G. Paterson. 1955. ‘The effect of typographical
variations upon eye movement in reading’. Journal of Educational
Research, vol. 49, pp. 171–183
Tomaszewski, Roman. 1973. Materiały do bibliografii na temat czytelności pism
drukarskich (The materials of bibliography on the thema legibility of
printing types). Poland. (Not seen; source: Les sciences de l’écrit: encyclopédie
internationale de bibliologie, edited by Robert Estivals, Jean Meyriat, and
François Richaudeau, p. 454. Paris: Retz, 1993)
Tracy, Walter. 1986. Letters of credit: a view of type design. London: Gordon
Fraser
Tracy, Walter. 1988. The typographic scene. London: Gordon Fraser
Turnbull, Arthur T., and Russel N. Baird. 1980. The graphics of communication:
typography, layout, design, production. 4th ed. New York: Holt, Rinehart
and Winston
Turvey, M.T., R.E. Shaw, E.S. Reed, and W.M. Mace. 1981. ‘Ecological laws of
perceiving and acting: in reply to Fodor and Pylyshyn (1981)’. Cognition,
vol. 9, pp. 237–304
Twyman, Michael. 1970. Printing 1770–1970: an illustrated history of its development and uses in England. London: Eyre & Spottiswoode. (New
edition published by The British Library, 1998)
Twyman, Michael. 1979. ‘A schema for the study of graphic language’. In
Processing of visible language: volume 1, edited by Paul A. Kolers, Merald
E. Wrolstad and Herman Bouma, pp. 117–150. New York and London: Plenum Press. (Also published, partly rewritten, in Media, knowledge and power, edited by Oliver Boyd-Barrett and Peter Braham, pp. 201–225.
London and Sydney: Croom Helm, in association with The Open University,
1987)
Twyman, Michael. 1982. ‘The graphic presentation of language’. Information
Design Journal, vol. 3, no. 1, pp. 2–22
Twyman, Michael. 1986. ‘Articulating graphic language: a historical perspective’.
In Toward a new understanding of literacy, edited by M.E. Wrolstad and D.F. Fisher, pp. 188–251. New York: Praeger
Twyman, Michael. 1993. ‘The bold idea: the use of bold-looking types in the
nineteenth century’. Journal of the Printing Historical Society, no. 22, pp. 107–143
Typografische Monatsblätter, no. 1, 1961. (A special issue on the typeface family
Univers)
U
——————
Updike, Daniel Berkeley. [1922] 1962. Printing types: their history, forms and
use: a study in survivals. Two volumes. Third edition. Cambridge,
Massachusetts: Harvard University Press; London: Oxford University Press
Uttal, William R. 1990. ‘On some two-way barriers between models and
mechanisms’. Perception and Psychophysics, vol. 48, no. 2, pp. 188–203
V
——————
Vagle, Wenche. 1995. Review article of Skjermtekster: skriftkulturen og den elektroniske informasjonsteknologien, edited by Ture Schwebs. Norsk Medietidsskrift, vol. 2, no. 1, pp. 165–170
Valentine, Elizabeth R. 1982. Conceptual issues in psychology. London: George
Allen & Unwin
Vanderplas, James M., and Jean H. Vanderplas. 1980. ‘Some factors affecting legibility of printed material for older adults’. Perceptual and Motor Skills, vol. 50, pp. 923–932
Venezky, Richard L. 1984. ‘The history of reading research’. In Handbook of
reading research [vol. 1], edited by P. David Pearson, Rebecca Barr, Michael
L. Kamil, and Peter Mosenthal, pp. 3–38. New York and London: Longman.
(This article is based in part on an earlier article by the same author:
‘Research on reading processes: a historical perspective’, American
Psychologist, vol. 32, no. 5, 1977, pp. 339–345)
Vihma, Susann (ed.). 1990. Semantic visions in design. Proceedings from the
symposium on design research and semiotics 17–18.5 1989, at the
University of Industrial Arts, Helsinki. Helsinki: University of Industrial
Arts
Vihma, Susann (ed.). 1992. Objects and images: studies in design and advertising.
Helsinki: University of Industrial Arts
de Vinne, Theodore Low. 1899. Plain printing types. New York: The Century Co.
(The whole title, including the introductory series title and a descriptive
title of the volume, reads: The practice of typography: a treatise on the
processes of type-making, the point system, the names, sizes, styles and
prices of: plain printing types)
Visible Language, vol. 28, no. 3, and no. 4, 1994. (Special issues on graphic
design historiography)
W
——————
Wade, Nicholas J., and Michael Swanston. 1991. Visual perception: an introduction. London and New York: Routledge
Wakeman. [1999]. Catalogue 43: the book arts. Oxford: Frances Wakeman Books
Walker Jr., Alvin (ed.). 1994. Thesaurus of psychological index terms. Twentieth anniversary 1974–1994. 7th ed. Washington, D.C.: The American
Psychological Association
Walker, John. 1989. Design history and the history of design. With a contribution
by Judy Attfield. London: Pluto Press
Waller, Robert. 1979. ‘Functional information design: research and practice’. Information Design Journal, vol. 1, no. 1, pp. 43–50
Waller, Robert H.W. 1980. ‘Graphic aspects of complex texts: typography as macro-punctuation’. In Processing of visible language: volume 2, edited by Paul A. Kolers, Merald E. Wrolstad and Herman Bouma, pp. 241–253. New York: Plenum Press
Waller, Robert. 1982. ‘Text as diagram: using typography to improve access and understanding’. In The technology of text: principles for structuring, designing, and displaying text [volume one], edited by David H. Jonassen, pp. 137–166. Englewood Cliffs, New Jersey: Educational Technology
Publications
Waller, Robert H.W. 1985. ‘Using typography to structure arguments: a critical analysis of some examples’. In The technology of text: volume two: principles for structuring, designing, and displaying text, edited by David H. Jonassen, pp. 105–125. Englewood Cliffs, New Jersey: Educational Technology
Publications
References / 284
Waller, Robert. 1987. ‘The typographic contribution to language: towards a model
of typographic genres and their underlying structures’. PhD thesis.
Department of Typography & Graphic Communication, University of
Reading
Waller, Robert. 1991. ‘Typography and discourse’. In Handbook of reading research, vol. 2, edited by Rebecca Barr, Michael L. Kamil, Peter B. Mosenthal and P. David Pearson, pp. 341–380. New York: Longman
Wacquant, Loïc J.D. 1993. ‘Positivism’. In The Blackwell dictionary of twentieth century social thought, edited by William Outhwaite and Tom Bottomore, pp. 495–498. Oxford: Blackwell
Warde, Beatrice. 1956. ‘New light on typographic legibility’. Penrose Annual, vol. 50, pp. 51–55
Watt, Roger. 1993. ‘The visual analysis of pages of text’. In Computers and typography, compiled by Rosemary Sassoon, pp. 178–201. Oxford: Intellect
Books
Watts, Lynne, and John Nisbet. 1974. Legibility in children’s books: a review of
research. Windsor: NFER Publishing Company
Webster, Helen Agnes. 1933. ‘The influence of type face and paper surface on the legibility of print’. MA thesis. University of Minnesota. (Not seen)
Webster, Helen A., and Miles A. Tinker. 1935. ‘The influence of type face on the legibility of print’. Journal of Applied Psychology, vol. 19, February, pp. 43–52
Wendt, Dirk. 1968. ‘Semantic differentials of typefaces as a method of congeniality research’. Journal of Typographic Research, vol. 2, no. 1, pp. 3–25
Wendt, Dirk. 1969. Einflüsse von Schriftart (Bodoni vs. Futura), Schriftneigung
und Fettigkeit auf die erzielbare Lesegeschwindigkeit mit einer Druckschrift.
Bericht Nr. 5, Untersuchungen zur Lesbarkeit von Druckschriften.
Hamburg: Psychologisches Institut der Universität Hamburg
Wendt, Dirk. 1970a. ‘By what criteria are we to judge legibility?’. In Typographic opportunities in the computer age, pp. 42–46. Papers of the 11th congress of the Association Typographique Internationale, 1969, Prague. (Also published as ‘Mehrdimensionale Kriterien der Lesbarkeit’. Papier und Druck, Fachausgabe Typographie, vol. 18, no. 12, 1969, pp. 261–264)
Wendt, Dirk. 1970b. ‘Probleme und Ergebnisse psychologischer Lesbarkeitforschung’. Druck-Print, vol. 107, no. 1, pp. 16–19
Wendt, Dirk. 1971. ‘Lesen und Lesbarkeit in Abhängigkeit von der Textanordnung’. Papier und Druck, vol. 20, no. 6, pp. 90–92
Wendt, Dirk. 1972. ‘Schriftgestaltung und experimental-psychologische Forschung’. Druck-Print, no. 3, pp. 145–147
Wendt, Dirk. 1994. ‘Legibility’. In Font technology: methods and tools, by Peter Karow, pp. 271–306. Berlin: Springer-Verlag
Wendt, Dirk, Wiebke Groggel, and Georg Gutschmidt. 1997. ‘On the effectiveness of highlighting ads in telephone directories by color’. Visible Language, vol. 31, no. 3, pp. 326–337
Weniger, Erich. 1990. Ausgewählte Schriften: zur geisteswissenschaftlichen
Pädagogik. Weinheim: Verlag Julius Beltz
Westinghouse. 1980. Westinghouse product safety label handbook. Trafford,
Pennsylvania: Westinghouse Printing Division
Whalley, Peter. 1993. ‘An alternative rhetoric for hypertext’. In Hypertext:
a psychological perspective, edited by C. McKnight, A. Dillon, and
J. Richardson, pp. 7–18. New York and London: Ellis Horwood
Wheatly, W.F. 1985. Typeface analogue. Arlington, Virginia: National
Composition Association
Wheildon, Colin. 1984. Communicating, or just making pretty shapes? a study of
validity or otherwise of some elements of typographic design. Sydney:
Newspaper Advertising Bureau of Australia
Wheildon, Colin. 1995. Type & layout: how typography and design can get your message across or get in the way. Edited and with an introduction by Mal Warwick. Berkeley, California: Strathmoor Press
Whittemore, Irving C. 1948. ‘What do you mean, legibility?’. Print, vol. 5, no. 4,
pp. 35–37
Wijnholds, Aernout de Beaufort. 1996. Using type: the typographer’s
craftsmanship and the ergonomist’s research. Literary [sic] review.
Cognitive Ergonomics, Psychonomics Department: Utrecht University
Wilkes, K.V. 1990. ‘Modelling the mind’. In Modelling the mind, edited by K.A.
Mohyeldin Said, W.H. Newton-Smith, R. Viale, and K.V. Wilkes, pp. 63–82.
Oxford: Oxford University Press, Clarendon Press
Wilkins, Arnold J. 1995. Visual stress. Oxford: Oxford University Press
Williams, Mark Allen. 1990. ‘Legibility of serif and sans serif type faces in
computer displays’. MS thesis. Colorado State University.
Williams, Thomas R., and Jan H. Spyridakis. 1992. ‘Visual discriminability of
headings in text’. IEEE Transactions on Professional Communication,
vol. 35, no. 2, pp. 64–70
Winograd, Terry, and Fernando Flores. 1986. Understanding computers and
cognition: a new foundation for design. Norwood, New Jersey: Ablex.
(Reprinted several times by Ablex, and republished as a paperback by
Addison-Wesley; Reading, Massachusetts)
Wright, Patricia. 1978. ‘Feeding the information eaters: suggestions for
integrating pure and applied research on language comprehension’.
Instructional Science, 1978, vol. 7, no. 3, pp. 249–312
Wright, Patricia. 1994. ‘Enhancing the usability of written instructions’. In the
conference proceedings of ‘Public graphics: visual information for everyday
use’; 26–30 September 1994, Lunteren, the Netherlands; edited by Harm J.G. Zwaga, Theo Boersema, and Henriëtte C.M. Hoonhout; pp. 1.1–1.18
Wright, Patricia. 1998. ‘Developments and growth in information design’. Information Design Journal, vol. 9, no. 1, pp. 16–21
X
——————
x-height. 1995. ‘Boiled in oil’, vol. 4, no. 2, pp. 14–15
Y
——————
Yager, Dean, and Robert Plass. 1995. ‘Presentation of video-based text to low
vision readers: rapid serial visual presentation vs. full-page reading’.
Investigative Ophthalmology & Visual Science, vol. 36, p. S71 (Conference
poster / meeting abstract)
Yager, Dean, Kathy Aquilante, and Robert Plass. 1998. ‘High and low luminance letters, acuity reserve, and font effects on reading speed’. Vision Research, vol. 38, no. 17, pp. 2527–2531
Z
——————
Zachrisson, Bror. 1965. Studies in the legibility of printed text. Stockholm:
Almqvist & Wiksell
Zachrisson, Bror. 1968. ‘The ATypI legibility research committee’. Journal of
Typographic Research, vol. 2, no. 3, pp. 271–277
Zachrisson, Bror, and Hans Smedshammar. 1971. Förundersökning av vissa typografiska variabler vid läsning med nedsatt synförmåga. [Rapport
no. 56]. Stockholm: Lärarhögskolan i Stockholm, Pedagogiska Institutionen
Zachrisson, Bror, and Hans Smedshammar. 1973a. Typografi för synsvaga: en
översikt och bibliografi. Rapport nr. 39. Stockholm: Grafiska Institutet
[Zachrisson, Bror, and Hans Smedshammar]. 1973b. ‘Typografi för synsvaga’. Grafiskt Forum, no. 6–7, p. 24
Zakaluk, Beverly L., and S. Jay Samuels (eds). 1988. Readability: its past,
present and future. Newark, Delaware: International Reading Association
Zwahlen, H.T., M. Sunkara, and T. Schnell. 1995. ‘Review of legibility
relationships within the context of textual information presentation’.
In Human performance and safety in highway, traffic, and ITS systems.
Transportation research record, no. 1485, pp. 61–70. Washington, D.C.:
National Research Council and National Academy Press
Colophon
This thesis is written and paginated as one single document in
Microsoft Word on an Apple Macintosh computer. The main text is set in
the serif typeface New Century Schoolbook. The title page, the headings
and the running headline are set in variants of the sans serif typeface
Stone Sans.