Journal of Quantitative Linguistics
ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/njql20
On Stylometric Features of H. Beam Piper’s Omnilingual
Tomi S. Melka & Michal Místecký
To cite this article: Tomi S. Melka & Michal Místecký (2020) On Stylometric Features
of H. Beam Piper’s Omnilingual, Journal of Quantitative Linguistics, 27:3, 204-243, DOI:
To link to this article: https://doi.org/10.1080/09296174.2018.1560698
Published online: 12 Feb 2019.
On Stylometric Features of H. Beam Piper’s Omnilingual
Tomi S. Melka and Michal Místecký
Department of Humanities, Parkland College, Champaign, IL, USA;
Researcher, Las Palmas de G.C., Spain;
Department of Czech Language, Faculty of Arts,
University of Ostrava, Ostrava, Czech Republic
The article will focus on H. Beam Piper’s classical story Omnilingual (1957). This
Piperesque writing has entered the records of science fiction prose for the ‘Martian’
periodic table of elements, being synonymous with a scientific ‘Rosetta-like stone’ in
the decipherment area. The work, while having a search potential in text analysis
and stylistics, may add in a parallel fashion some lustre to the validity of science as a
communicative channel in non-conventional circumstances. In order to capture
stylistic features of the novelette, a number of quantitative indicators are drawn in.
The study will concentrate on vocabulary-richness indexes (TTR, entropy, RR, RRmc,
G, ATL, HL, MATTR, and Lambda), a complex assessment of activity (Busemann’s
coefficient, the chi-square testing classification), and a sketch of the Belza chain
analysis. The goal of the article is to find distinctive features of the piece in question,
and point out ways for further research.
The study of style is probably the most complex problem in text analysis.
There are numerous deﬁnitions and accounts in this respect, but hardly any
of them is thoroughly operational (cf. in the chronological order, Swift,
1930 [1907, 1721]; Coleridge, 1914 [1907, 1818]; Wackernagel, 1873/1888;
Cooper, 1907/1930; Sebeok, 1960; Frye, 1963; Baker, 1966; Freeman, 1970;
Chatman, 1971; Sanders, 1977; and other past and recent scholars, too
many to list here).
In point of fact, we have to counter, after all, the elementary and
legitimate question ‘what is it?’ (cf. Chatman, 1971; Sanders, 1977). In the
quantitative approach, which is based upon pragmatic principles, one can
consider style to be ‘a latent property of language that cannot be measured
directly’ (Bell, Berridge, & Rayson, 2009, p. 3), but the discrepancies of
which can be mathematically captured. Basically, researchers in the field
focus on the differences in the styles of various pieces of language, and not
on the style as an isolated phenomenon. In the present study, the main
attention will be paid to the spheres of vocabulary richness, computable
cohesion, and activity/descriptiveness of a text, with individual indexes
discerned, evaluated, and ‘criticized’ on the bases of their interpretative
potentials (cf. Argamon et al., 2007).
CONTACT Tomi S. Melka email@example.com
This is a revised and expanded version of Melka (2018), published in the journal Glottometrics, no. 43.
© 2019 Informa UK Limited, trading as Taylor & Francis Group
If one develops the problem further, there appear questions that need to
be answered – ‘How does it behave?’ and ‘Why does it behave (in) that
way?’ The hypotheses set up in this domain will never be ideally answered/
tested because of the polysemy of the notion (cf. Carter & Simpson, 1989;
Eckert & Rickford, 2001; Sebeok, 1960). Nevertheless, one can try to
approximate the theoretical level by setting up simple, quantified indicators
or functions, testing them, and seeking the answer to the ‘why?’ question by
finding some other properties with which the given one is correlated.
Style – just as any other structural phenomenon – is the result of
self-regulation, based on ‘. . . its own internal and self-sufficient rules’ (Hawkes,
1977/2004, p. 6). If an observed property A behaves in a certain manner, it
may be thereby queried: what is the cause of this behaviour? The other
property associated with A must be quantified too, and their relation must
be expressed in the form of a testable hypothesis. Procedures of this kind are
well known from synergetic linguistics (cf. Köhler, 2005a, p. 766), and
have already been applied in text analysis, too.
Methodologically, there must be a possibility of comparing at least two
disparate texts, in order to show that there is a stylistic diﬀerence. Such a
diﬀerence will relate to the personality of the author, thematic choices, or
one of her/his characters if the author is the same. However, although the
diﬀerences may be interpreted by and large intuitively, they must be
supported statistically. The number of statistical methods used for this
purpose is suﬃcient and may be (should be) used for any examination.
The question ‘what are the properties of a text?’ – on a microcosmic scale –
and the commitment to unlocking its answers are perhaps not unlike
those of ‘what are the properties of the world?’ – on a macrocosmic one.
And yet, one should not be oblivious to the fact that every inspection,
however useful, is only a reflection of a restricted view of the text, not the
capturing of ‘truth’ (cf. Bunge, 1983, p. 269; Popescu, Altmann et al., 2009,
2. H. Beam Piper and his Story Omnilingual
Technically, it would be not only appropriate but also prudent to
shed light on some aspects of the author and Omnilingual (1957). The
writer does his bit to offer a look at the interpretation of unknown data
(deciphering, in this case), using clever and ad hoc mechanisms of
expression as well as a cache of words that exceeds the daily language
output of many humans. In this sense, useful assumptions can be made
and clues can be possibly found about the author and the deployed
H. Beam Piper (1904–1964) is a US fantasy and science fiction (sf)
writer. The first name, assumed to be Henry or Horace, is just another
indication of the puzzle surrounding many facets of his life, and the
untimely death by suicide. H. Beam Piper (HBP) did not have much literary
success during his lifetime: his first story Time and Time Again was published
in 1947 when he was 42 years old (Beam Piper, 1947). Having a full-time
position as a night watchman at Pennsylvania Railroad Company, he was
able to support his self-education and passion for writing during a
non-negligible period of time. The mystery novel Murder in the Gunroom
(1953) and the sf novels Little Fuzzy (1962), Space Viking (1963), and Lord
Kalvan of Otherwhen (1965) are considered literary highlights of his career.
Biography writers tend to think that a marriage gone awry, an unfriendly
divorce, the death of his literary agent Kenneth White, together with dire
economic woes, caused HBP to self-destruct (Carr, 2008; Hines, 2002). In
recent times, scepticism about the value of his work has been supplanted by
a cult-following movement, by several reprints of the major works, and by a
growing acclaim of his originality and insightfulness.
Omnilingual, published as a novelette in Astounding Science Fiction
magazine (February 1957; Figure 1), subsequently known as Analog, was later
collected in Federation (1981), a compilation of short stories by HBP.
Omnilingual deals with a human survey party – archaeologists included –
looking for clues and/or indigenous relics among the ruins of a very ancient
Martian city. Consider that we are in the realm of fiction and such things do
occur accordingly. On the other hand, science fact suggests that the red planet’s
surface water had likely disappeared before rocks formed about 3.9 billion to
4.6 billion years ago (cf. Carr & Head, 2010; Kramer, 2014). A local culture
given to building Martian universities and complex research facilities could
hardly thrive in the aftermath amidst the cold and barren environment and the
absence of a protective atmosphere against UV showers and other radiation.
It should also be stated at this point that Piper’s is old-fashioned science
fiction, with the Mars theme and its inhabitants (‘natives’ and/or Earth
immigrants) regularly used in several storylines by his predecessors or
contemporaries. Unusual methods of travelling and mis/adventures, in concert with
philosophical, technological, and socioethical quagmires of various kinds,
are chronologically shown, e.g. in Greg (1880); Lasswitz (1897/1971); Wells
(1898); Serviss (2006); Bogdanov (1908/1984); Burroughs (1912/1917);
Tolstoy (1922/1950); or Weinbaum (1934); for more, see Bleiler (1990).
The tradition of the ‘Mars fever’ (Fergus, 2013) continues later with
Bradbury (1950); Clarke (1951), Brown (1954), Heinlein (1961), Dick
(1964), Disch (1976), Pohl (1976) and Shiner (1984/2010), whose
spectacular and polyvalent tales are linked – one way or another – with our
planetary neighbour. Next, the Mars-related literary patterns turn to more
hard science and specific topics. In such works, the human endurance and
resourcefulness are strongly tested in one or more quandaries, with the
nearly far-fetched outcomes upping one’s imagination, or with the
realization of extraordinary feats – e.g. Robinson (1992, 1993, 1996); Bear (1993);
McAuley (1994); Landis (2000); Weir (2011/2014); or Pratchett and Baxter
Figure 1. Cover of Astounding Science Fiction (February 1957; edited by John W.
Campbell, Jr.) featuring the fictional characters M. Dane, H. Penrose, and S. von
Ohlmhorst. In the far background, a mural found amidst the University ruins shows
a ‘heroic-sized Martian’ handling a ‘theodolite’-like apparatus. Illustration by Frank Kelly
Freas; reprinted after Wikipedia (2018) public domain image.
The human expedition, part of which is the protagonist of Piper’s story,
Martha Dane, uncovers some strange writings while muddling through the
remnants of a large-scale habitat. In an act of scientific devotion, Martha
makes assumptions, as she confronts the assertive and ego-driven associate
Anthony Lattimer. While the story is set in the 1990s, Omnilingual itself was
written in the 1950s, with Anthony Lattimer, PhD, not taking well to being
upstaged by his young female colleague. With the ‘mystery’ being pursued, the
breakthrough occurs when an ancient Martian University is located. Within its
confines, scores of books are discovered in what appears to be the Mars University
library. At some moment, a diagram of a simple atom and a table of words and
numbers are seen on one of the walls of the Department of Physics/Chemistry.
Given the occurrence of 92 slots, Martha speculates that the structural layout
corresponds to a periodic table. Hence, she builds up the chart in a piecemeal
fashion, computing the best alignment between Earth and Martian tables:
Hydrogen, No. 1: Sarafaldsorn; Helium, No. 2: Tirfaldsorn; etc. The
deconstruction of the Martian words into base roots and affixes helps her in grasping the
meaning of extra words in a chain-like reaction. The fact is initially suggested in
Section 4, where the archaeologist deduces some similarity between Martian and
the German language, the model being the generation of new words by ‘pasting
existing ones’. Wikipedia (2018) keenly dwells on that idea, and suggests by
analogy the string of chemical elements – hydrogen: Wasserstoff; carbon:
Kohlenstoff; nitrogen: Stickstoff, and oxygen: Sauerstoff, each sharing the root
word ‘stoff’ (‘stuff, matter, substance’), and a different prefix. The ensuing
Greek-based patterns are comparable to some extent to the German-based ones –
Nitrogen, Hydrogen, Oxygen, Cyanogen, etc. – where the common root is
‘gen’, i.e. generating, producing, issuing, with N: ‘generating nitrous gas, or nitrate
substances’; H: ‘generating water’; O: ‘generating acid’; CN: ‘generating the
gaseous compound of carbon and nitrogen’; etc.
With the evidence getting stronger, the interpretation of a long-lost language
begins to fall into place. As the known universe is mostly packed with the element
hydrogen, it is – in retrospect – hydrogen on Earth, Jupiter, on the super-massive
Pistol star (constellation of Sagittarius) or elsewhere (cf. also Beam Piper, 1957,
Section 22), so we tend to think that this kind of Rosetta Stone is, if not fully
warrantable, then conceivable. Without the assistance of an ‘all-language’ text
(the periodic table of chemical elements), it would have taken perhaps more than
a lifetime to crack the Martian writings. Looking at some of the real historical
decipherments, e.g. the names of Egypt-related rulers in the Rosetta Stone (Pope,
1975/1999), or the names of Minoan towns for Linear B (Chadwick, 1958/2000),
it must be concluded that – before penning Omnilingual – H. Beam Piper was
already familiar with the referents.
2.1. Rosetta Stones as Serendipitous Assistants in Real-Life Settings
It should be mentioned at the outset that the discussions of decipherment
(Subsections 2.1, 2.2), rather than being a distraction from understanding
the story and the original idea of HBP, are intended to come to the aid of
the readers. Here, we concentrate on one basic criterion for a feasible
decipherment of an unknown script/language: acquisition of bilingual
inscriptions presumably encoding speech in linearly arranged symbols
(Gelb & Whiting, 1975, pp. 98–99).
The annals of decipherment have shown not infrequently records whose
original content is duplicated, paralleled, or slightly paraphrased in other
languages (see Daniels, 1996; Friedrich, 1957/1971, p. 153; Gelb & Whiting, 1975;
Knight & Sproat, 2009). As language contact situations in pre-modern
civilizations are considered matter-of-fact, the phenomenon of bi- and multilingualism
should not come as a surprise. For ‘dead’ languages, however, the primary evidence
about bilingualism is different from that on which modern linguists investigating
bilingualism in spoken languages can call (cf. Adams, 2003, p. 3; Campanile,
Cardona, & Lazzeroni, 1988). For the decipherer of ancient languages, written
data are the necessary medium. In order to tackle the problem, the precondition
is that scholars should be acquainted sine qua non with one of the languages.
Given the nature of different and/or incomplete sound encodings in two or more
scripts – on the iconic, morphemic, syllabic, segmental levels – consider that a
bilingual text is not a type of external clue that instantly produces a unique
solution to the retrieved portions of ancient writings (cf. Gelb & Whiting, 1975, p.
99). Overall, by virtue of a true (or quasi) bilingual text – a duplicate, e.g. well
done! in English vs ¡bien hecho! in Spanish – or of a virtual bilingual one – place
names, proper names, e.g. a-mi-ni-so, Amnisos; ko-no-so, Knossos; tu-li-so,
Tu/ylissos, after the Linear B decipherment (Pope, 1975/1999, p. 174; Robinson,
2002, p. 99) – epigraphers/script analysts get a foothold allowing them to advance
in the elucidation or decipherment of the available material.
Several instances corroborate the above: the Palmyrene (a form of Aramaic)
script was cracked in 1754 thanks to a bilingual (Parkinson, 1999, p. 16); the
decipherment of the Luwian language was ascertained later through the
Phoenician-Luwian bilingual records of Karatepe hill (Gordon, 1968/1987, pp. 100–101;
Hawkins & Morpurgo Davies, 1978); the Phoenician version of a Cypriot
bilingual provided the key to the Cypriot syllabary (Friedrich, 1957/1971, pp.
136–139, Fig. 60; Steele, 2013, p. 202); the Thugga (modern Dougga) bilingual
text in Tunisia, written in Punic and a variant of the ‘Numidian’ (the so-called
Libyco-Berber) alphabet (Friedrich, 1957/1971, pp. 118–121, Fig. 57; O’Connor,
1996, p. 113) also adds to this list; the so-called Landa’s ‘alphabet’, a fragment
of a copy manuscript written by the Spanish bishop of Yucatán Diego de Landa
about the Maya people and their civilization, assisted Yuri Knorozov in carrying
out the initial substantiation of Maya script on credible phonetic values (Pope,
1975/1999, pp. 199–200; Robinson, 2002, pp. 119–125).
Still, the renowned multi-text inscription held responsible for starting
the decipherment of a real-world script is the Rosetta Stone, as it is
known today. The Rosetta Stone shows three systems
(Egyptian hieroglyphic, local Egyptian demotic, Greek) coded in two
languages (Egyptian and Greek), displaying a decree in honour of King
Ptolemy V Epiphanes dated to 196 BCE. The literature on this
momentous artefact and its role in the understanding and interpretation of
hieroglyphic writings is large; suffice it to take note of some references at this
juncture: Friedrich (1957/1971, pp. 17–25); Pope (1975/1999, p. 61); Ray
(2007); Robinson (2002, pp. 56–60). While the French scholar
Jean-François Champollion is generally credited with using the Rosetta Stone
to decipher the ancient Egyptian script, the intricacies of the decipherment
and the respective contributions are still open to debate (cf. Daniels, 1996,
p. 145; Gordon, 1968/1987; Pope, 1975/1999; Robinson, 2011).
In a narrow sense, the NP-collocation Rosetta Stone has come to indicate
by antonomasia any artefact designed to convey parallel and repeated
information about unknown entities, events, or cultural phenomena, which
assists in explaining and restoring their original structure and meaning.
2.2. Cosmic Rosetta Stones
So far, unidentiﬁed scripts involving human-related settings have been men-
tioned. The cumulative knowledge brings on many challenges of a diﬀerent
nature: unidentiﬁed signals, sign sequences, visual-like data streams derive simi-
larly from other-than-human sources, whether Earth-bound, or not. With each
passing year, establishing contact with non-human entities is viewed as more
than plausible, raising scientiﬁc and philosophical concerns of the highest order.
The actuality of corresponding with other intelligences or sentient life-
forms, plus its complex ramiﬁcations, is discussed in diﬀerent sources (cf.
Dick, 1998; Drake & Sobel, 1992; Engdahl, 2001/2006; Golomb, 1961/1968;
Heidmann, 1995; Michaud, 2007; Vakoch, 2014). The coordinated eﬀorts
are mainly based hitherto on the current understanding of human needs, of
the physics of space, and of communication. Therefore, ascribing human
characteristics, consciously or unconsciously, to the ‘contact language’ or
‘channel’ is regarded as a drawback (cf. Harrison, 2014; Michaud, 2007).
The designed languages and/or devices vary from naturally developed
human languages, to the mathematical ones, radio signals, visual-symbolic
codes, or to the dispatch of robotic space probes carrying messages in
multiple ways. In this context, the most celebrated Rosetta Stone – intended
to be intercepted by any scientifically educated being in outer space – is the
coded message of the Pioneer 10 space probe of 1972 (see Davies, 1995, pp.
55–56; Gombrich, 1982, pp. 150–151). Despite the outcome of Pioneer 10’s
mission (e.g. falling short of achieving its goal for multiple reasons), the
message stands for a deeply symbolic human effort in contacting
intelligences beyond Earth. In a similar manner, it paved the way for additional
communication experiments, each a reminder of humankind’s drive to
expand the frontiers of knowledge in the physical and metaphysical sense.
3. Analysis of Text
Omnilingual (1957) is an autonomous dataset of c. 16,730 tokens
partitioned in 22 sections, with each of them far from being a ‘sufficiently large’
text. At this juncture, we cannot do better than direct readers for purported text
lengths and reliable results to Eder (2010/2013), Popescu, Altmann et al. (2009, p.
3), and Tuldava (2005, p. 370), whereas for definitions of ‘text’ in modern
linguistics and quantitative studies, one should refer to Juola (2008, p. 252);
Kubát (2014, p. 105); Nekula (2002, p. 489); Popescu, Čech, & Altmann (2011, p.
98); and Yesypenko (2008, p. 18). Similarly, observe that in information science,
one definition of ‘text’ is formulated as ‘. . . a collection of signs purposefully
structured by a sender with the intention of changing the image-structure of its
recipient’ (Belkin & Robertson, 1976, p. 201). Douglas Raber (2003, p. 6) points
out that if for the moment what it means to be ‘informed’ is left aside, a collection
of signs, according to Belkin and Robertson (1976), can appear in a variety of
formats and media, including but not limited to writing. Yet, we shall adhere in
our paper to ‘writing’ as a standard textual format for the purpose of extracting
data for quantitative analyses.
The simple arithmetic mean would be c. 760 words per corpus section,
but clearly Figure 2 highlights that the distribution is not uniform in
nature. One realizes that comparing it in favourable terms with other popular
works of H. Beam Piper (1962, 1963, 1965) amounts to a tenuous practice.
Creative literary samples are (apparently) not written by smart automata with a
quasi-perfect rhythm and disposition; they are not invariant and are
characterized by an inherent lack of homogeneity (cf. Bell et al., 2009; Strauß, Grzybek, &
Altmann, 2006). This notion finds support elsewhere. As part of the human
thought processes – which are hierarchical and discontinuous – writing itself is a
reflection of them, contrasting, for example, with spaces distinguished by
continuity and connectivity, such as the physical ones (cf. Khrennikov, 2014). The
point, however, does not suggest that intra-authorial analyses should not be made
one day; it means that given the topic and the lexical specificity of Omnilingual,
observations and results cannot be rid of (some) arbitrariness. A similar
argument can be raised with regard to fiction works of other authors. If we
hypothetically consider weighing the Mars-themed Omnilingual against another
storybook, e.g. the New Grub Street of George Gissing (2016[2008, 1891]), the
discrepancy is pretty obvious: the reliance on sample size may cause bias-related
conclusions (i.e. the sampling variation problem).
Specifically, Gissing crafted a
three-volume novel of c. 220,000 tokens. Another concern is that the plot revolves
around two contrasting characters in London’s literary world of the late 19th
century, and their hardships, attitudes and ethical (or not) choices regarding
professional and social life. In this sense, attributions to these dissimilar spheres
of action and location (Earth’s late Victorian London vs Mars, plus the involved
cognitive challenges) seem to build more corpus-based gaps than bridges. In
sheer size, paralleling or overlaying statistics between H. Beam Piper’s and
George Gissing’s works may lead to some implausible claims and encroachment of
textual realities. What is also of note is the remark found in the thesis of Jack W.
Grieve (2005, p. 21) when reviewing measures of vocabulary richness:
’…the vocabulary of a text depends far more on its subject than on its author.
For while every word in a text must be drawn from its author’s vocabulary,
diﬀerent subjects will activate diﬀerent sections of an author’s vocabulary,
and diﬀerent sections of any author’s vocabulary will not all be equally rich.’
The raw data of Omnilingual (1957) are retrieved from the public domain,
formatted in txt files, with its 22 sections separately arranged. The language data
are processed with statistical software packages designed for such matters:
1 – QUITA: Quantitative Index Text Analyzer (Kubát, Matlach, & Čech,
2 – LancsBox: Lancaster University corpus toolbox (© Vaclav Březina,
2018), for the part addressing Busemann’s coefficient.
Figure 2. The graph conveys the number of tokens per individual sections in HBP’s
Omnilingual (1957). The N-size shows various steep slopes, especially through sections 10
to 20, spiking at # 13.
3.1. Properties of the Vocabulary
Vocabulary richness in an organized text can be measured:
(a) Directly, by naming the individual types – not tokens – because the
number of tokens is automatically greater in synthetic languages. The
types (i.e. distinct words) can be ordered according to some principle, e.g.
ranked according to their frequency, their length, etc. (see e.g. Baayen,
2001; Čech, Popescu, & Altmann, 2014; COCA, 2018; Herdan, 1964;
Johnson, 2008; Köhler & Galle, 1993; Kubát et al., 2014; Leech, Rayson,
Strauß et al., 2006).
(b) Indirectly, by performing some classifications of the types and
setting up new distributions: they can be classified according to the
parts-of-speech (PoS) to which they belong, or according to the role
they play in the sentences. On the other hand, some special classes,
e.g. adnominals, verb valencies, etc., should be separated (cf. Fortis &
Fagard, 2010; Helbig & Schenkel, 2011[1991, 1969]; Herbst, Heath,
Roe, & Götz, 2004; Köhler, 2005b; Pan & Liu, 2014).
(c) By means of indicators, which can be established either directly or
from the existing approaches (a) and (b).
It is re-emphasized that the aim in this article is to study the vocabulary of
H. Beam Piper (1957), and along this line some of its properties are shown.
As expected, the frequencies of distinct words in individual sections are
counted, evaluated and, in addition, the development of the text is studied.
As stated earlier, H. Beam Piper’s (1957) novelette has 22 sections/chapters,
and for the sake of cross-checking and further examination, they are all
listed in Tables 1 and 5.
As to vocabulary richness, many indicators are available; among these, the
classical type-token ratio (TTR) is widely used, though one needs to bear in
mind that it substantially depends on the text-length. Since, in Omnilingual,
the sections are of a comparable size, it was included in the present
research; the count states:
\[ \mathrm{TTR} = \frac{V}{N} \]
where V stands for the number of types and N for the total of the words in a text.
The resulting value relates to vocabulary diversity: the more diverse the V, the
higher the TTR, see Table 5. The numbers of types and tokens found across the
sections are listed in Table 1.
Table 1. Types and tokens as recorded through each section of Omnilingual.
Section          Types  Tokens    Section          Types  Tokens
Omnilingual_1      280     520    Omnilingual_12     261     478
Omnilingual_2      398     844    Omnilingual_13     622    1670
Omnilingual_3      274     557    Omnilingual_14     310     602
Omnilingual_4      281     608    Omnilingual_15     297     545
Omnilingual_5      362     761    Omnilingual_16     464    1136
Omnilingual_6      371     884    Omnilingual_17     399     818
Omnilingual_7      409     869    Omnilingual_18     410     867
Omnilingual_8      425     976    Omnilingual_19     340     747
Omnilingual_9      322     615    Omnilingual_20     214     478
Omnilingual_10     266     487    Omnilingual_21     298     712
Omnilingual_11     419     835    Omnilingual_22     309     700
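As a minimal sketch of the count described above, the TTR of each section can be recomputed as V/N from the figures of Table 1. The names `TABLE_1` and `ttr` are illustrative assumptions and not part of QUITA; the values are copied from Table 1, whose token column sums to 16,709, i.e. a mean of roughly 760 tokens per section.

```python
# Types (V) and tokens (N) per section, copied from Table 1.
# The dictionary name and layout are illustrative assumptions.
TABLE_1 = {
    1: (280, 520), 2: (398, 844), 3: (274, 557), 4: (281, 608),
    5: (362, 761), 6: (371, 884), 7: (409, 869), 8: (425, 976),
    9: (322, 615), 10: (266, 487), 11: (419, 835), 12: (261, 478),
    13: (622, 1670), 14: (310, 602), 15: (297, 545), 16: (464, 1136),
    17: (399, 818), 18: (410, 867), 19: (340, 747), 20: (214, 478),
    21: (298, 712), 22: (309, 700),
}


def ttr(types: int, tokens: int) -> float:
    """Type-token ratio V / N."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return types / tokens


# Overall size and mean section length in tokens.
total_tokens = sum(n for _, n in TABLE_1.values())
mean_tokens = total_tokens / len(TABLE_1)

# TTR per section, keyed by section number.
ttr_by_section = {s: ttr(v, n) for s, (v, n) in TABLE_1.items()}
```

The longest section (13) receives the lowest ratio in this computation, which illustrates the dependence of TTR on text length mentioned above.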
Next, lexical wealth of a text can be evaluated on the basis of entropy.
Derived from the original notion introduced by Shannon (1948), linguistic
entropy measures the degree of vocabulary dispersion in a text; it can also
be interpreted as a measure of its monotony. Its formula is as follows:
\[ H = -\sum_{i=1}^{K} p_i \log_2 p_i \]
with K standing for the inventory size, and p_i for the relative frequency of a given
word. It needs to be pointed out that, as in TTR, the text size has a
considerable impact on the values; moreover, the linguistic type plays its role,
too, as it has been found that entropy gets higher with the level of analytical
character of language (cf. Strauß, Fan, & Altmann, 2008, p. 96). It makes
sense, since these languages tend to use more words in general, which
increases the figure of the measurement.
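The entropy measure can be sketched in a few lines, assuming Shannon's form H = −Σ p_i log2 p_i over the relative frequencies of the types. The base-2 logarithm, the function name `entropy`, and the pre-tokenized input are assumptions of this sketch; the tokenization applied by QUITA may differ.

```python
import math
from collections import Counter


def entropy(tokens):
    """Shannon entropy (in bits) of the word-frequency distribution.

    `tokens` is a sequence of already tokenized words; the base-2
    logarithm is an assumption of this sketch.
    """
    counts = Counter(tokens)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("empty text")
    # Sum over the K distinct types, each with relative frequency f / n.
    return -sum((f / n) * math.log2(f / n) for f in counts.values())
```

A text repeating one word has entropy 0, while a text of K equally frequent types reaches the maximum log2 K, so higher values indicate a more dispersed vocabulary.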
3.1.3. Repeat Rate and RRmc
If there is a necessity to measure repetitiveness of individual words, one can
make use of the repeat rate (RR). George U. Yule’s (1944/2014) ‘characteristic
K’ indicates through inversion that the richer the text is, the smaller the
repetition of words is. Its basic formula reads:
\[ RR = \sum_{r=1}^{V} p_r^2 \]
where r means a rank, V is the number of distinct words (types), and p_r^2 are the
squares of individual relative frequencies. The RR formula can be relativized,
transformed in entropy, or chi-squared (cf. Altmann & Köhler, 2015, p. 38), etc.
This procedure was normalized by McIntosh (1967; see also Popescu,
Altmann et al., 2009), yielding:
The point of McIntosh’s change to the original formula was to link it to
the size of the text, which is expressed in the number of tokens. Thanks
to this amendment, it is not needed to count the minimal value of RR to
find a springboard for comparisons; two texts can thus be contrasted
directly on the bases of the RRmc counts. In the present analysis, both
indexes have been calculated; the results are listed in Table 2.
The sequence of RR is not monotonic, with the steep jump
revealingly shown in Section 14 (Figure 3). Apparently, the boundary
conditions led H. Beam Piper to write the given section in a slightly
different manner. The observation strikes a chord with the
‘change-point’ notion, as shown in Subsection 5 ‘Time series
analysis’ of F. J. Tweedie (2005, pp. 390–391).
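The two repeat-rate indexes can be sketched as follows, under stated assumptions. RR is taken as the sum of squared relative frequencies of the types. The normalized form (1 − √RR) / (1 − 1/√size) and the parameter `size` are assumptions of this sketch rather than a formula quoted from the text; when `size` is set to the number of types V, the Section 14 figures (RR = 0.020, V = 310) give 0.910, the value listed for that section in Table 2.

```python
import math
from collections import Counter


def repeat_rate(tokens):
    """RR: sum of squared relative frequencies of the types."""
    counts = Counter(tokens)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("empty text")
    return sum((f / n) ** 2 for f in counts.values())


def mcintosh_rr(rr, size):
    """McIntosh-style normalisation of RR.

    `size` is the size term of the denominator; which count it
    should be (types or tokens) is an assumption left to the caller.
    """
    if size <= 1:
        raise ValueError("size must exceed 1")
    return (1 - math.sqrt(rr)) / (1 - 1 / math.sqrt(size))


# Section 14, with RR from Table 2 and V from Table 1.
section_14 = mcintosh_rr(0.020, 310)
```

A text consisting of a single repeated word yields RR = 1, and a text in which every type occurs equally often yields the minimum RR = 1/V.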
Figure 3. Repeat-rate development through the Omnilingual (1957) sections. If the
whole text was concentrated in one uniformly replicated word, we would acquire
RR = 1, and that is hardly the case in point.
Table 2. RR and RRmc figures for the Omnilingual sections.
Section          RR     RRmc     Section          RR     RRmc
Omnilingual_1    0.018  0.922    Omnilingual_12   0.013  0.946
Omnilingual_2    0.010  0.946    Omnilingual_13   0.011  0.932
Omnilingual_3    0.011  0.951    Omnilingual_14   0.020  0.910
Omnilingual_4    0.014  0.939    Omnilingual_15   0.012  0.947
Omnilingual_5    0.010  0.949    Omnilingual_16   0.008  0.958
Omnilingual_6    0.011  0.943    Omnilingual_17   0.013  0.934
Omnilingual_7    0.009  0.954    Omnilingual_18   0.013  0.933
Omnilingual_8    0.009  0.949    Omnilingual_19   0.012  0.941
Omnilingual_9    0.014  0.934    Omnilingual_20   0.013  0.949
Omnilingual_10   0.012  0.950    Omnilingual_21   0.011  0.952
Omnilingual_11   0.011  0.941    Omnilingual_22   0.010  0.957
3.1.4. Gini’s Coeﬃcient
One of the many possibilities to account for the richness of text is Gini’s
coeﬃcient (Kubát et al., 2014; Popescu, Altmann et al., 2009, pp. 54–63).
Similarly, the indicator can be used in other scientiﬁc areas that are not
concerned with stylometric experiments (cf. Damgaard & Weiner, 2000,or
Gini’s coefficient is the area between the Lorenz curve and the straight
line joining <0;1> in the two-dimensional coordinate system (Gini, 1921; cf.
Ceriani & Verme, 2012). The Lorenz curve is the stepwise adding of relative
frequencies beginning from the lowest up to the highest (Popescu, Altmann
et al., 2009, p. 56, Fig. 3.11). Since this constitutes an area, one needs to
figure out all individual areas between the two lines. Regardless of the fact,
there are easily computable approximations at our disposal. One of them is
\[ G = \frac{1}{V}\left(V + 1 - \frac{2}{N}\sum_{r=1}^{V} r f_r\right) \]
and rendered as 1 + 1/V – 2μ/V, where μ is the mean of the ranked frequencies,
μ = (1/N) Σ r f_r, with f_r the frequency of the word at rank r and N the number of tokens.
For comparative purposes, one can use the variance of G consistent with:
\[ \mathrm{Var}(G) = \frac{4\sigma^2}{V^2 N} \]
where σ² is the variance of the rank frequencies and N the number of tokens.
The values for individual sections are listed as fractions in Table 3.
On the whole, Gini’s coefficient tells us that the smaller its values, the greater
will be the vocabulary richness (e.g. Popescu & Altmann, 2006). The sequence of
Gini’s coefficients could be captured by a straight line, but Section 13 involves a
climactic value, plus the variation among the other chapters is clearly perceptible.
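The approximation of Gini’s coefficient can be sketched as below, assuming the form G = 1 + 1/V − 2μ/V with μ taken as the mean rank of the ranked frequency distribution, μ = (1/N) Σ r f_r. The function name `gini` and the choice of a plain list of type frequencies as input are assumptions of this sketch.

```python
def gini(frequencies):
    """Approximate Gini coefficient of a word-frequency distribution.

    `frequencies` holds one count per type. Ranks are assigned in
    descending order of frequency (rank 1 = most frequent type).
    """
    ranked = sorted(frequencies, reverse=True)
    v = len(ranked)          # number of types
    n = sum(ranked)          # number of tokens
    if v == 0 or n == 0:
        raise ValueError("empty distribution")
    # Mean rank: each rank r weighted by the frequency found at that rank.
    mean_rank = sum(r * f for r, f in enumerate(ranked, start=1)) / n
    return 1 + 1 / v - 2 * mean_rank / v
```

A distribution in which every type occurs equally often gives G = 0, and the value grows as the tokens concentrate on a few frequent types, in line with the reading that smaller values signal a richer vocabulary.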
Table 3. Results of Gini’s coefficient on individual sections.
Section  Gini     Section  Gini     Section  Gini     Section  Gini
1        0.412    8        0.483    15       0.401    19       0.459
2        0.461    9        0.422    16       0.494    20       0.454
3        0.434    10       0.392    17       0.453    21       0.467
4        0.471    11       0.443    18       0.467    22       0.459
5        0.456    12       0.397
6        0.489    13       0.543
7        0.451    14       0.437
3.1.5. Hapax Legomena and Average Tokens Length
complete, two more measures are figured out. First, hapax legomena (HL)
count is a simple ratio of all the words that occur only once in a text to the
total of them (Popescu, Mačutek et al. 2009, p. 99). As there is no empirically
attested number of hapax legomena to be found in a text, the proportion is not
exclusively exploited as an indicator for richness; its importance lies rather in
comparisons. The development of HL ratios across the investigated text is
illustrated in Figure 4.
Second, the average tokens length (ATL) is estimated as the mean word size in
characters. Characters were chosen as they seem to be the steadiest unit, with
the phonemes being slightly different in individual varieties of English, and
the system of syllabic divisions not unified. Mathematically, it is expressed as:

$$ATL = \frac{1}{N}\sum_{i=1}^{N} p_i,$$

where N = number of tokens, and p_i = individual word size.
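Both measures are simple enough to be illustrated with a short Python sketch. It rests on assumptions that the text does not fix: tokens are taken to be lowercase runs of letters (with internal apostrophes or hyphens), the HL ratio is computed against the number of tokens, and all function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split a text into lowercase word tokens (an assumed, simple scheme)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def hapax_ratio(tokens):
    """Share of words occurring exactly once, relative to all tokens."""
    counts = Counter(tokens)
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(tokens)


def average_token_length(tokens):
    """Mean token size in characters: (1/N) * sum of p_i."""
    return sum(len(token) for token in tokens) / len(tokens)
```

A different tokenization (for instance, one that splits hyphenated compounds) would shift both values, which is one more reason to use them mainly in comparisons.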
The length may be directly linked to complexity or style, as, for
instance, the English vocabulary tends to manifest itself in two or even
three layers (e.g. spell – enchantment [n.]; own – possess [v.]; edgy –
excitable [adj.]; fire – flame – conflagration [n.]; ask – question –
interrogate [v.]; clear – pellucid – transparent [adj.]). These content words,
having originated from diﬀerent sources (Germanic, French, and Latin),
do not only have variegated meanings, but are felt to be situated at various
stylistic levels, the highest ones being reserved for words of the Romance
provenance (cf. Jackson & Amvela, 2000).
Figure 4. HL ratios across the twenty-two sections of Omnilingual.
3.1.6. The Lambda Indicator (Λ)
Every frequency distribution of words in a text, whether ranked or presented as a
spectrum, displays a number of properties which can be measured, compared
and tested. One among the many others developed in recent years is the
so-called lambda indicator, defined on the basis of Euclidean distances between
neighbouring/ranked frequencies (cf. Popescu et al., 2011, p. 3). It may be
relativized in such a way that text size does not have an apparent influence.
Still, under the premises, it can be approximated in a simple form (cf. Popescu &
Altmann, 2015). As the underlying Euclidean distance can be approximated by:
where V is the vocabulary size (or the highest rank), f_1 is the frequency of a
unit at rank 1 (the most commonly used word) and h is the h-point defined as:

$$h = \begin{cases} r, & \text{if there is an } r = f(r) \\ \dfrac{f(i)\,r_j - f(j)\,r_i}{r_j - r_i + f(i) - f(j)}, & \text{if there is no } r = f(r) \end{cases}$$

we obtain for (1) an estimated lambda in the form of
In explicit terms, h can be described as a fixed point along the rank–frequency
distribution, where r and f(r) of a specific linguistic unit coincide during the
counting (cf. Popescu & Altmann, 2006, 2007). For cases where no rank satisfies
r = f(r) and the point is unattainable, h can be decided by the point where the
product of rank and frequency reaches its maximum (Kelih, Rovenchak, & Buk, 2014,
p. 84). The h-point has found a use in quantitative text analysis (e.g. in the
measurement of vocabulary richness) and in cross-linguistic comparisons, covering
synthetic and analytic languages (e.g. Popescu et al., 2011, pp. 10–11). This
measure, however, is all too often dependent on text size in its applications, so
that texts of a similar length, or other indicators that rely on the h-point and
are normalized with respect to text size, are preferable during the analyses
(Kelih et al., 2014, p. 85).
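To make the two notions more tangible, the following Python sketch computes the h-point by the interpolation between the two neighbouring ranks where the frequency drops below the rank, and a lambda value from the summed Euclidean distances between neighbouring ranked frequencies. It is an illustration under assumptions, not the QUITA implementation: lambda is taken here in its arc-length form, L · log10(N) / N, rather than in the approximated form discussed above, and the function names are hypothetical.

```python
import math


def h_point(frequencies):
    """Return the h-point of a list of positive word frequencies."""
    ranked = sorted(frequencies, reverse=True)
    previous = None
    for rank, freq in enumerate(ranked, start=1):
        if freq == rank:
            return float(rank)          # an r with r = f(r) exists
        if freq < rank and previous is not None:
            r_i, f_i = previous         # last rank with f(r) > r
            r_j, f_j = rank, freq       # first rank with f(r) < r
            return (f_i * r_j - f_j * r_i) / (r_j - r_i + f_i - f_j)
        previous = (rank, freq)
    raise ValueError("every frequency exceeds its rank; no h-point found")


def lambda_indicator(frequencies):
    """Arc length between neighbouring ranks, scaled by log10(N) / N."""
    ranked = sorted(frequencies, reverse=True)
    tokens = sum(ranked)
    arc_length = sum(
        math.sqrt((a - b) ** 2 + 1) for a, b in zip(ranked, ranked[1:])
    )
    return arc_length * math.log10(tokens) / tokens
```

With frequencies 14 at rank 9 and 9 at rank 10, for instance, the interpolation gives 59/6, i.e. a non-integer h-point of about 9.83.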
In view of the aforementioned objection by Kubát (2014) concerning the
dependence of lambda upon text length, Pearson and Kendall correlation
coefficients (cf. Zaid, 2015) have been computed; their values (−0.37 and −0.26)
show that there is a feeble inverse dependence, but it does not seem prominent
within the studied context. It may thus be concluded that lambda is a valid
indicator of vocabulary richness for Omnilingual.
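A check of this kind can be sketched in plain Python. The sketch makes two assumptions that the text leaves open: Kendall's coefficient is computed in its simplest tau-a variant (no correction for ties), and the function names are hypothetical. The section lengths needed to reproduce the reported values are not given here, so no attempt is made to recompute them.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / pairs
```

In use, x would hold the token counts of the 22 sections and y their lambda values; negative results of small magnitude would correspond to the feeble inverse dependence described above.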
Further work can be done if one considers comparing the variance (Var Λ*) of
each section, and performing a normal u-test (Table 4). Nevertheless, given the
projected extension which may include other-than-Omnilingual texts, we would
prefer treating the topic in a study of its own.
The results are listed in Table 5.
3.1.7. Busemann’s Coeﬃcient
Another property of text which can be measured is activity. The simplest way
of evaluating it quantitatively is Busemann's coefficient (1925), which is the
ratio of the number of verbs to the sum of verbs and adjectives; namely:

$$Q = \frac{V}{V + A},$$

where V is the number of verbs and A the number of adjectives.
Table 4. Values of h-point and lambda according to QUITA text analyser; cf. also Figure 5
and Figure 6.
Text section                         H-point   Lambda
01. Martha Dane paused...            7.00      1.675
02. There were ten people...         9.83      1.482
03. Selim von Ohlmhorst...           9.00      1.442
04. Photographs...                   10.00     1.366
05. Sachiko was speaking...          10.00     1.460
06. Michael Ventris...               11.00     1.383
07. Three men had come in...         10.50     1.458
08. The library, which was also...   11.00     1.405
09. They all got out...              8.00      1.644
10. The door, one of the double...   7.50      1.593
11. The hallway, too, was thick...   10.00     1.608
12. The sixth floor was...           7.00      1.602
13. They made their way...
14. Lunch at the huts...             7.00      1.678
15. They worked up...                8.00      1.607
16. The next day...                  13.00     1.302
17. Ivan Fitzgerald...               9.00      1.573
18. Martha remembered...
19. She was halfway...               9.50      1.472
20. Ninety-two!                      9.00      1.325
21. Tranter hesitated...             9.50      1.299
22. Sachiko Koremitsu...             10.50     1.315
Table 5. Integrated results of eight stylometric indicators regarding the text of
Omnilingual.
Text TTR Entropy RR RR G ATL HL Λ (Lambda)
Omnilingual_1 0.538 7.301 0.018 0.922 0.412 4.562 0.398 1.675
Omnilingual_2 0.472 7.782 0.010 0.946 0.461 4.344 0.333 1.482
Omnilingual_3 0.492 7.403 0.011 0.951 0.434 4.176 0.341 1.442
Omnilingual_4 0.462 7.268 0.014 0.939 0.471 4.610 0.327 1.366
Omnilingual_5 0.476 7.698 0.010 0.949 0.456 4.281 0.336 1.460
Omnilingual_6 0.420 7.666 0.011 0.943 0.489 4.256 0.273 1.383
Omnilingual_7 0.471 7.908 0.009 0.954 0.451 4.377 0.314 1.458
Omnilingual_8 0.435 7.870 0.009 0.949 0.483 4.403 0.294 1.405
Omnilingual_9 0.524 7.532 0.014 0.934 0.422 4.498 0.387 1.644
Omnilingual_10 0.546 7.429 0.012 0.950 0.392 4.314 0.392 1.593
Omnilingual_11 0.502 7.855 0.011 0.941 0.443 4.623 0.372 1.608
Omnilingual_12 0.546 7.367 0.013 0.946 0.397 4.529 0.406 1.602
Omnilingual_13 0.372 8.103 0.011 0.932 0.543 4.480 0.238 1.387
Omnilingual_14 0.515 7.303 0.020 0.910 0.437 4.711 0.380 1.678
Omnilingual_15 0.545 7.513 0.012 0.947 0.401 4.464 0.400 1.607
Omnilingual_16 0.408 8.025 0.008 0.958 0.494 4.402 0.256 1.302
Omnilingual_17 0.488 7.722 0.013 0.934 0.453 4.581 0.346 1.573
Omnilingual_18 0.473 7.735 0.013 0.933 0.467 4.314 0.337 1.562
Omnilingual_19 0.455 7.602 0.012 0.941 0.459 4.278 0.297 1.472
Omnilingual_20 0.448 7.061 0.013 0.949 0.454 4.153 0.293 1.325
Omnilingual_21 0.419 7.499 0.011 0.952 0.467 4.253 0.249 1.299
Omnilingual_22 0.441 7.564 0.010 0.957 0.459 4.186 0.273 1.315
Figure 5. The graph shows the development of lambda (vocabulary use) across the
sections of Omnilingual. The course of lambda is not even, suggesting changes in the
structure of text. The changes may be related to particular properties of each
section, to pauses, to the thematic yarn, or to a posteriori different presentations,
once HBP finished writing the preceding section.
Figure 6. Graphic distribution of h-point and lambda across the 22 sections of the text.
Further investigations on H. Beam Piper's work and other inter-authorial comparisons
are needed for a reliable statement on the possible correlation of these two indicators.

The verb–adjective ratio has been studied in several works: e.g. Altmann
(1978, 2018a); Altmann and Köhler (2015); Antosch (1969); Bakker (1965);
Boder (1940); Místecký (2018); Těšitelová (1987/1992). One reason to
include activity in the assessment of Omnilingual is that, unlike other
indicators (TTR, RR, or entropy), it is not affected by text length
(cf. Zörnig et al., 2015). According to the results, texts can
be classified into active (Q > 0.5), neutral (Q = 0.5), and descriptive ones
(Q < 0.5). As such a division is quite rough, a chi-square test may be
introduced, which states whether the activity/descriptiveness of a text is
statistically significant. Its formula (cf. Altmann & Köhler, 2015; Zörnig et al.,
2015) is rendered as:

$$\chi^{2} = \frac{(V - A)^{2}}{V + A},$$

with V and A denoting the numbers of verbs and adjectives, respectively.
Given the results, the test ranges the texts on the basis of the following
criteria:
(1) SA – significantly active (Q > 0.55, χ² > 3.84);
(2) AC – active (Q > 0.55, χ² ≤ 3.84);
(3) N – neutral (0.45 < Q < 0.55);
(4) DE – descriptive (Q < 0.45, χ² ≤ 3.84);
(5) SD – significantly descriptive (Q < 0.45, χ² > 3.84).
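The classification can be sketched in Python as follows. Two points are assumptions rather than statements of the source: the chi-square statistic is taken as (V − A)² / (V + A), a form that reproduces the values reported for the individual sections (for instance, about 39.22 for 80 verbs and 18 adjectives), and the significance threshold is taken as 3.84, the critical value of the chi-square distribution with one degree of freedom at the 5% level. The function names are hypothetical.

```python
def busemann_q(verbs, adjectives):
    """Busemann's coefficient: verbs / (verbs + adjectives)."""
    return verbs / (verbs + adjectives)


def activity_chi_square(verbs, adjectives):
    """Chi-square statistic, assumed as (V - A)^2 / (V + A)."""
    return (verbs - adjectives) ** 2 / (verbs + adjectives)


def classify_activity(verbs, adjectives, critical=3.84):
    """Return one of SA, AC, N, DE, SD for a text.

    The 0.45 / 0.55 bounds follow the list above; `critical` is an
    assumed threshold (chi-square, 1 degree of freedom, alpha = 0.05).
    """
    q = busemann_q(verbs, adjectives)
    significant = activity_chi_square(verbs, adjectives) > critical
    if q > 0.55:
        return "SA" if significant else "AC"
    if q < 0.45:
        return "SD" if significant else "DE"
    return "N"
```

Applied to counts of 53 verbs and 40 adjectives, the sketch yields an active but not significantly active text; 94 verbs and 62 adjectives give a significantly active one.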
In the present study, the research into activity has been carried out both via
the LancsBox (2018) software and manually, as the programme is incapable
of discerning verbs and predicates. Moreover, the sf novelette contains
adjectives which may not be deep-rooted in standard language. To forestall
possible objections, meticulous attention was paid to defining both
word classes. Verbs like 'be' and 'have' are not separately counted; rather,
when serving as auxiliaries in compound constructions, or embedded
in fixed sets and idioms, they are treated as part of a single unit, e.g. you're
not going to insist on; you want to be a big shot; was ahead of him; we have to
risk, that was right; he had been afraid of; she was trying to think. Modal
verbs are not counted independently but treated as subsidiary
parts, e.g. must have been carried on; I ought to mention; it would mean
something... Adjectives are easier to classify: as autonomously standing
units, e.g. small houses; flaky stuff; unshaded light; as compound forms,
either with a blank (frontal modifiers, e.g. the Space Force officers; a bar and
lunch counter) or hyphenated, e.g. purple-tinged; brush-grown;
proton-and-neutron; high-level; long-chain; as alphanumeric combinations,
e.g. carbon-14 dating; as acronyms, e.g. A-bomb mushrooms; or as a mixture of
them, e.g. the 4000-f.s. bullet.
Modifiers as part of fixed collocations are not tagged as adjectives, e.g.
'Rosetta Stone'; to be 'a big shot'; the 'old dog'; the 'Dark Ages' in Europe;
'Stone Age'; 'Syrtis Depression'; the 'Wicked Witch' in the 'Wizard of Oz'.
In addition, there are exceptional circumstances when H. Beam Piper
places invented Martian words in the text. We considered such fragments
of the Martian lingo to be 'unknown' and disregarded them in tagging.
Last but not least, if one pursues the scaling of verbs in terms of activity
(e.g. go – jog – run – sprint; with the last one bringing about more 'activity'
than the others), the analysis can be more comprehensive. Another line of
research is to subdivide verbs in keeping with the various semantic
categories of Dixon (1991/2005) or Kipper, Korhonen, Ryant, and Palmer
(2008), and study their ranked frequencies (cf. Levickij & Lučak, 2005).
The same can be said of assigning any 'static', 'semi-static', and 'vibrant'
adjectival quality along a similarly graduated series (e.g. a dead person – a
lethargic person – an awkward person – a dramatic person – a choleric
person). Further refinements can be carried out on the basis of the semantic
orientation (polarity), or its absence, e.g. adjectives that involve a desirable
state – graceful, precise, exultant vs adjectives that often represent a negative
state – broken, furious, tiresome; and those adjectives that have no orientation
as per binary properties, e.g. green, glasslike (cf. Lyons, 1977/1996,
pp. 270–291; Wiebe, 2000). In the article, however, for the sake of simplicity and
traditional pragmatism, we comply with the basic division.
The results are summarized in Table 6. It is noteworthy that almost all
sections in Omnilingual are significantly active, which may have to do with
the general tendency of modern fiction to avoid rich adjectival
embellishments. Moreover, because of the situational development of the storyline,
with members of the Martian expedition constantly investigating, debating,
deploying and re-deploying, dynamic and contingent motions (plus
arguments) are to be expected – finding their direct linguistic expression in
verbs. Otherwise, Těšitelová (1987/1992) mentions that, compared to the
number of different verbs, non-fictional texts have a higher number of
different adjectives. Such a finding seems to be related to the strongly
nominal (substantive-based) character of these texts, given their descriptive
and informational nature.

Table 6. Calculations concerning activity in the Omnilingual novelette.
Text            Verbs  Adjectives  Activity  Chi-square  Text type
Omnilingual_1   53     40          0.57      1.81        A
Omnilingual_2   94     62          0.60      6.56        SA
Omnilingual_3   80     18          0.81      39.22       SA
Omnilingual_4   78     40          0.66      12.24       SA
Omnilingual_5   100    49          0.67      17.46       SA
Omnilingual_6   114    49          0.70      25.92       SA
Omnilingual_7   125    50          0.71      32.14       SA
Omnilingual_8   150    55          0.73      44.02       SA
Omnilingual_9   87     50          0.64      9.99        SA
Omnilingual_10  76     18          0.81      35.79       SA
Omnilingual_11  96     58          0.62      9.38        SA
Omnilingual_12  56     30          0.65      7.86        SA
Omnilingual_13  228    99          0.70      50.89       SA
Omnilingual_14  50     46          0.52      0.17        N
Omnilingual_15  84     23          0.79      34.78       SA
Omnilingual_16  190    41          0.82      96.10       SA
Omnilingual_17  98     61          0.61      8.61        SA
Omnilingual_18  114    40          0.74      35.56       SA
Omnilingual_19  100    45          0.69      20.86       SA
Omnilingual_20  60     27          0.69      12.52       SA
Omnilingual_21  74     30          0.71      18.62       SA
Omnilingual_22  102    20          0.83      55.11       SA
The only exception in the overall activity-infused novelette seems to be
Section 14, where the description of a part of the Martian university is
provided; it is thus much more of an academic-like report than a piece of
ﬁction. Furthermore, Section 14 deviates from the expected numbers in
more than one respect, which is going to receive attention in the Discussion
section of this article.
3.1.8. MATTR (Moving-Average Type-Token Ratio)
Given the fact that TTR is dependent on text length, there have been
attempts to create a vocabulary-richness measure that would be freed
from this restriction. This age-long ambition among quantitative linguists
has been stated and documented in numerous publications (see Popescu et
al., 2011, p. 1; Tweedie, 2005, pp. 389–390). After various tries, the study by
Covington and McFall (2010) developed a normalized TTR formula called
MATTR (Moving-Average Type-Token Ratio). It is based upon the idea of
a text chunk – a 'window' – which moves along the text, always by one
token at a time; the overall figure for the text is then the average of these
window TTRs. Mathematically, the aforementioned is conveyed through

$$MATTR(L) = \frac{\sum_{i=1}^{N-L+1} V_i}{L\,(N - L + 1)};$$

here, N stands for the length of the text, V_i for the number of types in one
window, and L for the arbitrarily chosen length of the window. In the
present analysis, the standard number of 100 tokens has been opted for.
To date, the method has been used in several studies (cf. Kubát, 2016;
Kubát & Milička, 2013; Savoy, 2017), and is also included in a new book on
statistics in corpus linguistics (Březina, 2018).
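The moving-window idea can be illustrated with a short Python sketch. It assumes that the text has already been split into tokens and that the window advances by exactly one token, as described above; the function name and the incremental bookkeeping are illustrative choices, not the implementation of any cited tool.

```python
from collections import Counter


def mattr(tokens, window=100):
    """Moving-Average Type-Token Ratio with a window of `window` tokens."""
    n = len(tokens)
    if window < 1 or n < window:
        raise ValueError("the text must contain at least one full window")
    counts = Counter(tokens[:window])   # types in the first window
    type_total = len(counts)            # running sum of V_i
    for i in range(window, n):
        leaving = tokens[i - window]    # token dropping out of the window
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]
        counts[tokens[i]] += 1          # token entering the window
        type_total += len(counts)
    # average of V_i / L over the N - L + 1 windows
    return type_total / (window * (n - window + 1))
```

Updating the counts incrementally avoids rebuilding the type set for every window, which keeps the computation linear in the length of the text.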
For the text at hand, the MATTR results are listed in Table 7.
The subsequent graph shows the MATTR trend throughout the sections
of the story (Figure 7).
Even though the MATTR-based vocabulary richness is a more suitable
tool when it comes to comparative analysis, an interpretation may be
proposed even here. It is symptomatic that the vocabulary richness drops
in Sections 4 and 14, where the text gets very analytical, as it deals, in the
former, with linguistic issues and in the latter, with a dry scientiﬁc descrip-
tion. In general, there is no single trend to be determined on the basis of the
numbers. The results of MATTR thus conﬁrmed what has been found out
3.1.9. Belza Chains
Belza chains are a numeric attempt at delimiting the degree of cohesion in a
text. A Belza-chain is a later coinage following the original work of the author
(Belza, 1971). Unlike the qualitative approaches, useful as they are (cf.
Dontcheva-Navratilova, Jančaříková, Miššíková, & Povolná, 2017;
Beaugrande & Dressler, 1972/1981; Halliday & Hasan, 1976; Van Dijk,
1977/1992), it tries to ﬁgure out its level with a strictly deﬁned exactitude
within which the phenomenon can be encompassed. The study of Belza
chains has gained momentum recently (Altmann, 2018b; Chen
& Altmann, 2015; Roelcke, Popescu, & Altmann, 2017), and various indexes
have been implemented to make the chain analysis practical for stylometric
purposes (cf. Místecký, 2018; Místecký, Yiang, & Altmann, 2018).
Table 7. MATTR results of Omnilingual by H. Beam Piper (1957).
Section         MATTR     Section         MATTR
Omnilingual_1   0.748957  Omnilingual_12  0.742845
Omnilingual_2   0.755573  Omnilingual_13  0.764581
Omnilingual_3   0.752369  Omnilingual_14  0.720137
Omnilingual_4   0.715745  Omnilingual_15  0.773378
Omnilingual_5   0.768555  Omnilingual_16  0.779495
Omnilingual_6   0.745605  Omnilingual_17  0.757484
Omnilingual_7   0.772853  Omnilingual_18  0.740429
Omnilingual_8   0.783753  Omnilingual_19  0.759548
Omnilingual_9   0.75763   Omnilingual_20  0.737869
Omnilingual_10  0.776663  Omnilingual_21  0.744984
Omnilingual_11  0.763979  Omnilingual_22  0.751442
Figure 7. MATTR development in the 'Omnilingual' (1957) sections.

As to the notion, a Belza chain is a string of the same idea that stretches over
neighbouring sentences (lines in poetry). For example, the devised mini-text:

I was dancing with Annie at the ball. She was wearing a stylish blue dress.
Once I touched it, I knew I was in love with her.

contains two Belza chains: the first is represented by the
sequence (Annie; she; her), and the other one by the string (dress; it). In
other words, there are two concepts that are elaborated in the passage
('Annie' and 'dress'), with the first one being more 'outstretched' than the
other. If a sentence is unlinked to its neighbours, it is assigned the Belza
chain value of 1.
As to the whole of the text, it may be assessed on the basis of several
indicators. First, there is an average Belza chain length, which is defined as:

$$\bar{L}_B = \frac{\sum L_B}{B},$$

where ΣL_B stands for the sum of the Belza chain lengths, and B for the
total number of chains. The result – which, in the present case, is 2.5 – may be
taken as a measure of text association (A). Another, more sophisticated way of
evaluating a text on the basis of Belza chains presupposes a weighting of its
elements. Chen and Altmann (2015) proposed a system which is presented in Table 8;
it was developed in order to rank the elements from the nearest to the most
distant to the core notion. The lower the final value, the more closely linked
the individual elements of a Belza chain are supposed to be. For the sake of
the present analysis, the category of indefinite pronouns has been added to
the elements of weight 5.
In the example, the ﬁrst chain contains a girl’s name and two
pronouns referring to her; they are thus weighted [1; 6; 6], with the
average weight of the chain being 4.33. The second chain comprises
the weights [1; 6], its average thus totals 3.5. As to the entire text, a
formula has been designed to calculate the degree of weight richness;
Table 8. Weighting of Belza chain elements according to Chen and Altmann (2015).
Weight Chain Element
1 Main word, head of the chain
2 Synonym, metaphor, variant
3 Hyponym (=speciﬁcation)
4 Hypernym (=generalization, class)
5 Relative pronoun, relative phrase, rhetoric question, rhetoric answer,
article, interrogative pronoun, demonstrative pronoun, indeﬁnite pronoun
6 Personal pronoun
7 Possessive pronoun
8 Grammatical aﬃx or introﬂection referring to the head
9 Derivation or composition containing the head;
conversion of head to other POS
ðÞstands for a frequency of a given weight, and Nfor the
number of sentences (lines) in a text. In the ball story, the count yields:
It should be noted that the ﬁgures of the chain length count are of use
mostly in comparisons.
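The arithmetic of the ball example can be retraced with a minimal Python sketch. It assumes that each Belza chain is represented simply as the list of the weights of its elements (so that the list length equals the chain length); the function names are hypothetical, and the weight-richness formula is deliberately left out.

```python
def average_chain_length(chains):
    """Sum of the Belza chain lengths divided by the number of chains."""
    return sum(len(chain) for chain in chains) / len(chains)


def average_chain_weights(chains):
    """Mean element weight of every chain, in the order given."""
    return [sum(chain) / len(chain) for chain in chains]


# The devised mini-text: (Annie; she; her) and (dress; it),
# weighted as [1; 6; 6] and [1; 6].
ball_story = [[1, 6, 6], [1, 6]]
ball_length = average_chain_length(ball_story)
ball_weights = [round(w, 2) for w in average_chain_weights(ball_story)]
```

Here `ball_length` evaluates to 2.5 and `ball_weights` to 4.33 and 3.5, the figures quoted for the example.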
Considering the shortage of space, the present investigation will focus on
two Omnilingual sections only, with the goal being to present the method,
further to be elaborated in articles to come. The results of the research are
collated in Tables 9 and 10 below.
The measure of associativity indicates that most chains in both
sections are two-member, which is a standard situation in many
instances of language that have been analysed so far (cf. Místecký,
2018). Moreover, Section 2 tends to be richer in both the types and
the lengths of the chains explored; this may be attributable to its
discrepant character, as it opens with a record of the people present
on the spot (Belza chain 2), but most of it is covered by the dialogue
between two scientists, which holds together much more thanks to
logical, non-linguistic coherence than because of formal cohesive
devices. All in all, a careful scrutiny of the results will only be possible
after many various texts are processed.
Table 9. The results of the Belza chain analysis of Omnilingual's Section 1.
Number  String                                        Length  Weights    Average chain weight
1       [Martha; she; she]                            3       [1; 6; 6]  4.33
2       [–]                                           1                  1
3       [streets; streets]                            2       [1; 1]     1
4       [she; she; she]                               3       [1; 1; 1]  1
5       [machinery; bulldozers, shovels, draglines]   2       [1; 3]     2
6       [she; she]                                    2       [1; 1]     1
7       [pickmen; pickmen]                            2       [1; 1]     1
8       [native; native]                              2       [1; 1]     1
9       [laborer; labor]                              2       [1; 9]     5
10      [something; jack-hammer]                      2       [5; 1]     3
11      [she; she]                                    2       [1; 1]     1
4.1. Quantitative Terms
Most assuredly, interpretation of data is as good as the performed statistical
measures, plus the authenticity, size, and characteristics of a sample. Whilst
not oﬀering pat solutions, statistics can be symptomatic of underlying
Table 10. The results of the Belza chain analysis of Omnilingual's Section 2.
Number String Length Weights Average chain weight
1 [she; she; her] 3 [1; 1; 7] 4.5
2 [people; them; Selim; officer; Colonel; a couple of...; Sir...] 7 [1; 6; 3; 3; 3;
3 [girls; them] 2 [1; 6] 3.5
4 [Sir; he; his] 3 [1; 6; 7] 4.67
5 [she; she] 2 [1; 1] 1
6 [Sachiko; Japanese girl; she; her] 4 [1; 2; 6; 6] 3.75
7[–] 1  1
8[–] 1  1
9[–] 1  1
10 [–] 1  1
11 [I; I] 2 [1; 1] 1
12 [this; it] 2 [1; 6] 3.5
13 [book; it] 2 [1; 6] 3.5
14 [–] 1  1
15 [–] 1  1
16 [–] 1  1
17 [–] 1  1
18 [–] 1  1
19 [–] 1  1
20 [–] 1  1
21 [–] 1  1
22 [–] 1  1
23 [–] 1  1
24 [–] 1  1
25 [–] 1  1
26 [–] 1  1
27 [–] 1  1
28 [–] 1  1
29 [–] 1  1
30 [–] 1  1
31 [Martha; Martha] 2 [1; 1] 1
32 [It; It] 2 [1; 1] 1
33 [–] 1  1
34 [–] 1  1
35 [–] 1  1
36 [–] 1  1
37 [–] 1  1
38 [–] 1  1
39 [–] 2  2
appears equally supportive of a qualitative approach of the text, i.e. close
reading. As already noted, the smaller Gini's coefficient, the greater the
vocabulary richness; this is revealed in Sections 10 and 12, with
values below the 0.4 threshold. At the lowest possible value, i.e. G = 0.0,
all distinct words in Omnilingual (Beam Piper, 1957) would have been
used with the same frequency. The quoted sections show a number of
features bolstering that property: carefully described scenarios strewn
with techno-parlance, quite often bordering on an 'enumerative' and
informational style; sparingly used dialogues (mostly bearing the mark of
a silent monologue); and brevity in terms of tokens. The substantial use
of technical word-forms in line with the size of the text section offsets the
lexical dearth. On the other hand, Section 13 (the longest in the
novelette) shows the highest jump in Gini's coefficient: 0.543. The relative
diminishing of richness in vocabulary could hint at more embedded
dialogues, where colloquial/informal domestic-like speech may 'taint' to
a degree the pool of scientific hapax legomena, V(1, N), or dislegomena,
V(2, N). The other sections do not exhibit the change noticed in Section
13: fluctuations are strong, between c. 0.4 and 0.49, but not that striking
(see Table 3). The data collectively suggest that Beam Piper (1957) might
have taken a respite before writing the section in question. The
outcome is a more 'relaxed' and 'protracted' text, or words to that effect.
Nonetheless, generalizing on the basis of a single written segment should
be cautiously avoided, as it may stand only for a subset or a frame of
the writer's linguistic skills. Overall, the indicator does not suggest a poor
acquisition or management of English vocabulary. This can mean that
H. Beam Piper was sedulously consuming historical/scientific material
about archaeological decipherment, bio-chemistry, interplanetary
travel and exploration, and gadget engineering. The following
standard and non-standard words burst across the chapters (neologisms,
rare or common portmanteaus), and they come in different flavours:
<spraygun>, <airdyne pilot>, <airsealing>, <viviparous>, <gamogenetic>,
<Photostat>, <stenophone>, <oxyacetylene torch>, <spectroscope>,
<loess>, <vibratool>, <tarpaulins>, <radiophone>, <jetticopters>,
<nuclear-electric jackhammer>, <transuranics>, <beryllo-silver alloys>.
They reflect the diverse intellectual concerns and the inventive strain of the
author. The observation finds justification in J. F. Carr (2008), with Piper's
up-to-date information acquired by dint of relatable literature or
participation in sf conventions.
The repeat rate data in Table 2 shows that the ﬂow of narrative in terms
of vocabulary wealth does not manifest dull or relatively dull uniformity.
The fact ﬁts well with the mixed nature of text samples (cf. Bell et al., 2009,
p. 3; Oakes, 2009, p. 1071), and may have to do with the time axis through
which Omnilingual (Beam Piper, 1957) was written. It may be theorized
that ﬂuctuating values do not only act in response to the required situa-
tions/subplots along the sections, but also to the prevailing emotional mood
of the author himself. An interesting observation relates to Section 14,
where the repeat rate (RR) value is doubled or nearly doubled in
comparison, for example, with Sections 2, 5, 8, 21, 22, pointing at lower vocabulary
richness. By contrast, Gini's coefficient registers 0.437 for # 14, whereas
the numeric values for # 2, 5, 8, 21, 22 swing within the range 0.456–0.483.
The doubling of repeat rate (# 14) contrasts with Section 13's value – the
longest in the story and the one with several instances of up-close dialogues.
Section 14, in turn, is a dry and technical report on a sector of the Martian
University and the measures taken by the deployed international team for
camping and its further exploration. The writing at this point, besides being
'underprivileged' in number of tokens, lacks dialogues and is loaded with past
tenses and passive constructions. Furthermore, in referencing Baroni and Evert
(2009, p. 778) on the 'passivization as a cue of formality', the cross-over with
the prior comment on the 'dry' and 'technical' informational style along
Section 14 is apparent.
Whether consulting Gini's coefficient for Sections # 12: 0.397, # 13: 0.543, # 14:
0.437, or the repeat rate, # 12: 0.0130, # 13: 0.0112, # 14: 0.0202, the figures reveal
certain conspicuous and 'anomalous' behaviour around Section 13. Although
there is divergence in the way these indicators perform – Gini's shows relative
lexical richness for # 14, while the repeat rate shows a decrease
in richness – it may be stated with some confidence that Section 13 (or the
circumstances that led to its conception) act/s as a breaking point in the lexical
set-up. The discrepancies in the results of the two indexes may be explained by their
different focuses – whereas RR takes into account the relative frequencies of
words only, Gini's coefficient works with their rank–frequency distribution;
its elevated value thus indicates a diminished number of especially frequent
words in a text, and a lot of those that occur very rarely. As to Section 13, this is in
line with the finding in the domain of hapax legomena, the proportion of which is
largely present here (almost 41%).
We would tend to reconcile the observations with pauses/breaks that the
author took in the interim. Such pauses might have conditioned a slightly
diﬀerent creative impulse in HBP, or aﬀected him psychologically, with the
result of a discrepant use of vocabulary (see especially Fiebelkorn, Pinsk, &
Kastner (2018) on rhythmic brain cycles and alternating attention-related
Next, a brief comment should also be made on the figures of ATL.
Although the indicator seems independent of text length, it has been proved
that the two may be interconnected (Zörnig & Místecký, 2018); this is
probably due to the fact that the texts which are lexically richer tend to use
longer words as well. In case of Omnilingual, the highest value of ATL has
been measured in Section 14, which is, in its aforementioned technicality,
prone to the employment of long, scientiﬁc expressions. The fact that it also
ranks high in repeat rate may, on the other hand, shake the presupposition
that ATL rises with vocabulary richness, as a genre may play a role in the
As to lambda, the results are not easy to interpret; it seems that
there is a passage with high figures (Sections 9–12), which means that the
distances between neighbouring ranked words are constantly high; this is
broken by Section 13, where the definite article 'the' substantially prevails
over all the other words. This part of the novelette focuses on the descrip-
tion of the Martian premises, where objects and events are treated from
diﬀerent viewpoints, and many speculations are made. On the other hand,
the following Section 14 continues in the trend of the part with elevated
To ﬁnish, the text was analysed as to its activity. Here, the core ﬁnding is
that most chapters are signiﬁcantly active, which may lead to various
interpretations –it can be a feature of the Martian subgenre’s conventions,
of the greater part of the 20th-century science ﬁction, or a matter of Piper’s
personal preference. More light will be cast on the issue when the results are
compared to other pieces within the aforementioned domains, or when the
V/A ratio is identiﬁed through a diﬀerent method.
4.2. Qualitative Terms
A few observations that may escape the computer-assisted quantitative
First, if present-day readers have one quibble with Piper's story, it may
concern the mentions of 'smoking' on more than one occasion and of having libations
on planet Mars by way of 'cocktail pitchers' and '(counterfeit) Martinis'.
Admittedly, these 'catchwords' are not accidental: they served to perk the
atmosphere up and may be ascribed to the private baggage of the author
and the time in which he lived. Qualitatively speaking, such lexical choices
function as 'shibboleths' (a peculiarity of speech/writing; cf. Edelstein, 2003,
p. 19; Juola, 2008, pp. 237–238), by which inferences on specific stylistic or
broad social habits can be made. Second, in Section 4, while examining the
string of letters of a few words, Martha Dane deduces that when 'Martians
had needed a new word; they had just pasted a couple of existing words
together.' And H. Beam Piper, who is impersonating the main character,
states without delay, 'It would probably turn out to be a grammatical
horror.' Examples from languages whose morphology is based on
agglutinating features abound. As highly synthetic languages go,
members of the Finno-Ugric family attach affixes to a stem to create many
grammatical forms, e.g. the Hungarian word <legeslegmagasabb> ('the very
highest') has the stem 'magas' ('tall, high') at its core, the prefix 'legesleg' (in
an exaggerated way), and the suffix 'abb' (as a link vowel). Or the German
tongue (of the Indo-European family), e.g. <Bestandsbuchführung>
translated as 'inventory', with the constituents lining up in that order,
<Bestand>, 'stock, supplies', <buch> 'book', and <führung> 'direction,
conduct'. In this sense, the morphological complexity of Hungarian, and to a
lesser degree of German, might have been somewhat intrusive to Piper’s
eyes (a native speaker of English), eventually preconceiving these foreign
Third, in Section 6, through the character of von Ohlmhorst,
Piper voices as a passing detail that 'Cretan language' (i.e. Linear B) was
not read 'until the finding of the Greek-Cretan bilingual in 1953'. Considering
the time of writing, and the fact that Piper was not a professional linguist or
an archaeologist, he took a risk in creating a scientific-like Martian story, and
any risk-taking involves certain mistakes. In May 1953, Carl W. Blegen,
who was excavating at the town of Pylos (Greece), made use of the earlier
suggested sign-list of Michael Ventris to read a freshly found clay tablet,
confirming the correctness of the decipherment (Chadwick, 2000,
pp. 81–84; Robinson, 2002, pp. 14, 100–101). Strictly speaking, this
particular Pylos document is not a 'bilingual' text. It is rather new evidence that
independently validates the hypothesis of M. Ventris regarding an old form
of the Greek language underlying the Linear B script. Similarly, in Section 16,
the author says through Anthony Lattimer that the decipherment of the
'Hittite language' (Anatolian hieroglyphs) was done 'when they found
Hittite-Assyrian bilinguals'. This moment rings false, as the decipherment
was confirmed through the 'Hittite' [Luwian]-Phoenician bilinguals of
Karatepe hill (in Osmaniye Province, modern Turkey), dated to the
late part of the 8th century BCE (see Friedrich, 1957/1971, pp. 98–101; Pope,
1975/1999, pp. 141–142). It so happens that in the selfsame section the
referenced German epigrapher appears under the appellation 'that distinguished
Hittitologist, Johannes Friedrich.'
On the other hand, a point that commends the
Pennsylvanian author is the choice of Martha Dane as the
protagonist of the story – a bright, observant, and strong-willed woman.
The study attempts to make a corpus-based statistical inspection in order to
extract style-related features of H. Beam Piper’s Omnilingual. The feature of
relevance of the written text – vocabulary richness – may shed light on
characteristic traits of Piper’s style and/or his socio-psychological background.
It must be borne in mind, however, that style is a complex
phenomenon, and cannot be captured on the basis of a few indicators only
(cf. Tweedie, 2005, p. 390). In addition, given the neurological structure of
the human brain, it is quite improbable that the present research ‘lays bare
the soul’ (Raleigh, 1897/1904, pp. 126–127) of the author. How much of the
writer’s meta-knowledge – i.e. of his psychological and sociological inclinations
(cf. Daelemans, 2013) – such an investigation reveals thus remains an
open question. Further developments in quantitative methods that accurately
correlate with some intuitions, together with cutting-edge breakthroughs
in neurosciences and AI, may help in gaining a firmer
grasp of meta-knowledge.
Both statistical indicators, the repeat rate and Gini’s coefficient, show that
H. Beam Piper (1957) has on the whole an estimable level of vocabulary
richness. The observation suggests that in spite of his lack of academic
training, Piper was an assiduous reader of fiction and non-fiction literature
(see, for instance, the discrepancy between Sections 13/14 and the rest of the
text). As to MATTR, the piece yields very similar results, as the type-token
ratio seems to be oscillating throughout the text; the only two exceptions are
Sections 4 and 14, the former treating – in a repetitive, explanatory manner –
the workings of the Martian language, the latter describing the premises of the
local university. What is striking, however, are the results of the activity
analysis, as nearly all sections tend to be verb-infused; the role of the classical,
adjective-based description is thus considerably diminished.
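As a rough illustration of the indicators named in the preceding paragraph, the following Python sketch computes a type-token ratio, a repeat rate, a moving-average type-token ratio, and a verb-adjective activity quotient. It is a minimal sketch under stated assumptions, not the procedure used in this study: the formulas are the definitions commonly given in the literature (repeat rate as the sum of squared relative frequencies; MATTR as the mean type-token ratio over all windows of a fixed length; Busemann-style activity as verbs divided by verbs plus adjectives), the tokenizer is deliberately naive and does not reproduce QUITA's segmentation, and all function names and the default window length are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, stripping surrounding punctuation.

    A deliberately naive tokenizer (an assumption of this sketch); it does not
    reproduce QUITA's treatment of contracted forms.
    """
    tokens = (raw.strip(".,;:!?\"'()[]") for raw in text.lower().split())
    return [tok for tok in tokens if tok]


def type_token_ratio(tokens):
    """Number of distinct word forms divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)


def repeat_rate(tokens):
    """Sum of squared relative frequencies of the word forms."""
    n = len(tokens)
    return sum((freq / n) ** 2 for freq in Counter(tokens).values())


def mattr(tokens, window=100):
    """Mean type-token ratio over every run of `window` consecutive tokens."""
    if len(tokens) < window:
        raise ValueError("text is shorter than the window")
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def activity(verb_count, adjective_count):
    """Activity quotient: verbs / (verbs + adjectives)."""
    return verb_count / (verb_count + adjective_count)


if __name__ == "__main__":
    sample = tokenize("the cat saw the dog and the dog saw the cat")
    print(type_token_ratio(sample))
    print(repeat_rate(sample))
    print(mattr(sample, window=4))
    # Hypothetical counts, for illustration only.
    print(activity(verb_count=30, adjective_count=10))
```

Under these definitions, a lower repeat rate and a higher MATTR both point to a more varied vocabulary, while an activity quotient above 0.5 marks a section in which verbs outnumber adjectives.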
The vocabulary richness in various sections of Omnilingual appears
sensitive to a time axis. The novelette very likely was not written at one
sitting, but over diﬀerent sessions, some more non-linear than the others.
We may largely concede that HBP did not plan in advance the length of the
whole story or that of each chapter (cf. Popescu, Altmann et al., 2009, p. 70).
He was an ingenious and spontaneous writer, and was not forced to
create according to fixed instructions or space-constrained norms.
In any event, fixing the temporal gap (hours, days, weeks) between sessions,
i.e. building a time-structured succession, is far beyond the capability
of the applied quantitative measures. All that said, we can only speculate
about such perceived distances and the reasons behind them.
While the quantitative approach is regarded as a potential discriminant
for meta-knowledge, we are unwilling to dismiss salient qualitative aspects
(cf. also Tuldava, 2005, p. 370). Although the application of quantitative
indicators takes priority in this study, downplaying the importance of
qualitative observations is not advisable on our part. For all practical
purposes, a complementary methodology would be more useful;
there are things that quantitative and qualitative approaches can and cannot
do alone (cf. Creswell & Plano Clark, 2007).
Comparative studies involving Omnilingual (Beam Piper, 1957) and other
stories of the author/other authors/diﬀerent languages are scheduled for the
near future. The condition for a comparison can be theoretically ‘satisfied’ if
the range of lexis is near, or comparatively near, Piper’s, with the
inter-authorial tests made with some ‘vintage’ SF text. At first, English as the chosen
language comes in handy, though it would be both interesting and advantageous
to explore science fiction cross-linguistically, too, e.g. in French, Czech,
German, Italian, Spanish (inflectional), or in Finnish, Hungarian (agglutinative),
etc. In this vein, the obtained picture may help in checking if the present
results hold true in various languages, or in defining some kind of difference
as a function of all genre differences (cf. Popescu et al., 2011, p. 49).
From a non-stylometric position, if there is praise for Omnilingual, it
should concern the assumption of science as a universal code of commu-
nication among intelligent cultures. Algis Budrys (1967, p. 168), for exam-
ple, postulates that the translation of an alien dead language by analysing its
periodic table sounds like a perfectly valid proposition, and that archaeol-
ogists (i.e. decipherers) ought to keep this notation on ﬁle.
Now, despite the different perception, organization, and rationalization of
science by other non-human vectors (cf. e.g. Rescher, 1985), the basic scientific
tenet still stands. For instance, the decimal system used to date by many
of the Earth’s cultures is due to the fact that humans anatomically have 10
fingers. A non-terrestrial society, whose members evolved under ‘pressure
of natural selection’ and happen to have ‘12 standard fingers’, may choose a
duodecimal system of counting, i.e. computing by 12s. At any rate, the
structure or ‘language’ spanning these sentient living systems is currently
mathematics, which suggests, in theory, some form of interaction or exchange.
The case in point is as simple and graspable as it can be, though we should be
aware that exploring and confirming these matters is much more complex
(e.g. Ellis, 2004/2005, or Traphagan, 2014, pp. 161–162).
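The point about decimal and duodecimal counting can be made concrete with a short Python sketch, offered only as an illustration: the same quantity receives different notations in base 10 and base 12, yet converts back to the same number. The digit symbols 'A' (ten) and 'B' (eleven) and the function names are assumptions of this sketch, not conventions taken from the story or from the sources cited.

```python
DIGITS = "0123456789AB"  # 'A' = ten, 'B' = eleven (a notational assumption)


def to_base(number, base):
    """Write a non-negative integer in positional notation for bases 2 to 12."""
    if number < 0 or not 2 <= base <= len(DIGITS):
        raise ValueError("unsupported number or base")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def from_base(text, base):
    """Recover the integer from its positional notation in the given base."""
    value = 0
    for char in text:
        digit = DIGITS.index(char)
        if digit >= base:
            raise ValueError(f"digit {char!r} is not valid in base {base}")
        value = value * base + digit
    return value


if __name__ == "__main__":
    quantity = 144  # a gross: twelve dozens
    print(to_base(quantity, 10), to_base(quantity, 12))
    print(from_base(to_base(quantity, 12), 12) == quantity)
```

The round trip through `from_base` shows that the notation changes with the base while the underlying quantity does not, which is the sense in which arithmetic could serve as common ground between a ten-fingered and a twelve-fingered culture.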
1. E.g. B. Gray (1969, p. 7), ‘Few problems in literary scholarship continue to
generate so much endeavor and so much conﬂict as the problem of style.’
2. For additional statistical models, reviews, and problems in stylometry,
authorship, and/or forensic linguistics, see Argamon et al. (2007); Bell et al.
(2009); Holmes (1998); Juola (2008); McMenamin (2002); Oakes (2009);
Rudman (1998); Thisted and Efron (1987); Tuldava (2005); Tweedie (2005),
and referenced literature thereof.
3. It needs to be pointed out that the raw count was performed by the QUITA
software (see later in the text), which treats contracted forms as two separate
tokens; for example, ‘hadn’t’ is segmented as ‘had’ and ‘not’. This is why
other token counters may yield slightly different results, though the
discrepancies have not proved to be large (cf. Melka, 2018). It is hard to say that
our way of counting is more correct, or less correct, than those of other
counters; it is, as long as consistently carried out, a possible way of
segmenting and evaluating the English text of Omnilingual (cf. also Popescu
et al., 2011, p. 14).
4. Cf. for instance, Strauß, Grzybek, & Altmann (2006, p. 293) with regard to
undersized samples, ‘short texts have the disadvantage of not allowing a
property to take appropriate shape.’
5. Examples include Brunet’s ‘W’ (1978); Orlov’s ‘Z’ (in Orlov & Chitashvili,
1983); Simpson’s ‘Diversity index’ (1949)/Yule’s ‘characteristic K’ (1944/2014),
or entropy, as a measure of uniformity (Cover & Thomas, 1991/2006). Yet,
the debate among experts over their real discriminative power is hardly
6. For a summary and literature on the subject, see Esteban and Morales (1995);
cf. also Cover and Thomas (1991/2006); Popescu et al. (2011).
7. M. Kubát (2014, p. 105), however, differs in opinion.
8. On the suggested modiﬁcation of the coeﬃcient, see G. Altmann (1978,
9. The assumed break could have corresponded to any physical or personal
recreational activity of H. Beam Piper: sleeping, sipping coffee/drinking rum,
lighting up his pipe, hiking for an undetermined period of time, cleaning
firearms, hunting, and so forth (e.g. Anonymous, 1953, p. 7).
10. In The Pennsy (Anonymous, 1953, p. 7) it is clearly reported that Mr. Beam
Piper used to drink black Jamaica rum at home and light up his pipe with
Serene tobacco, having smoked that brand for the last 30 years.
11. The linguistic bias towards such a complex morphology finds in particular a
humorous expression in Mark Twain’s (1880) essay ‘The awful German
12. For several different systems of counting and related historical trivia, see T.
Dantzig (1930/2005), J. S. Petersson’s (1996) Numerical Notation and A.
The online repository sites Project Gutenberg, Archive.org, and The Library Service
of Parkland College, Champaign, IL. (USA) have been of assistance with several
No potential conﬂict of interest was reported by the authors.
Adams, J. N. (2003). Bilingualism and the Latin language. Cambridge, UK:
Cambridge University Press.
Altmann, G. (1978). Zur Anwendung der Quotiente in der Textanalyse [About the
application of the quotient in text analysis]. Glottometrika, 1, 91–106.
Altmann, G. (2018a). Some properties of adjectives in texts. Glottometrics, 41, 67–79.
Altmann, G. (2018b). The nature and hierarchy of Belza chain. Glottometrics, 42, 75–85.
Altmann, G., & Köhler, R. (2015). Forms and degrees of repetitions in texts: Detection and
Anonymous. (1953, September 7). Typewriter ‘Killer’: Altoona’s H. Beam Piper.
Watchman – Mystery writer, finds job helps plots. The Pennsy, 2(9),
Pennsylvania Railroad Company, Philadelphia, PA. Retrieved from http://www.
Antosch, F. (1969). The diagnosis of literary style with the verb-adjective ratio. In L.
Doležel & R. W. Bailey (Eds.), Statistics and style (pp. 57–65). New York:
Argamon, S., Whitelaw, C., Chase, P., Hota, S. R., Garg, N., & Levitan, S. (2007).
Stylistic text classiﬁcation using functional lexical features. Journal of the
American Society for Information Science and Technology, 58(6), 802–822.
Retrieved from https://www.researchgate.net/publication/220435559_Stylistic_
Baayen, R. H. (2001). Word frequency distributions. Text, speech and language
technology, Vol. 18. Ide, N., & Véronis, J. (Series Eds.). Dordrecht, Netherlands:
Kluwer Academic Publishers.
Baker, S. (1966). The complete stylist. New York: Thomas Y. Crowell Company.
Bakker, F. J. (1965). Untersuchungen zur Entwicklung des Aktionsquotienten
[Investigations on the development of action’s quotient]. Archiv für die
Baroni, M., & Evert, S. (2009). Statistical methods for corpus exploitation. In A.
Lüdeling & M. Kytö (Eds.), Corpus linguistics: An international handbook.
Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of
Linguistics and Communication Science, Band 29/2 (pp. 777–802). Berlin:
Mouton de Gruyter.
Beam Piper, H. (1947, April). Time and time again. Astounding Science Fiction, 39(2).
Retrieved from http://www.gutenberg.org/ebooks/18831
Beam Piper, H. (1953). Murder in the Gunroom. New York: Alfred A. Knopf.
Retrieved from https://www.gutenberg.org/files/17866/17866-h/17866-h.htm
Beam Piper, H. (1957). Omnilingual. Originally published in Astounding Science
Fiction, 58(6), February 1957, pp. 8–46; with cover and interior illustration by
Frank Kelly Freas. Retrieved from http://www.gutenberg.org/files/19445/19445-
Beam Piper, H. (1962). Little Fuzzy. New York: Avon. Retrieved from http://www.
Beam Piper, H. (1963). Space Viking. New York: Ace Books. Retrieved from https://
Beam Piper, H. (1965). Lord Kalvan of otherwhen. New York: Ace Books. Retrieved
Beam Piper, H. (1981). Federation. Preface by Jerry Pournelle. New York: Ace Books.
Bear, G. (1993). Moving Mars. New York: Tor Books.
Beaugrande, R.-A. de, & Dressler, W. U. (1972/1981). Introduction to text linguistics.
R. de Beaugrande, Trans. London: Longman Group Limited. German
edition © Max Niemeyer Verlag, Tübingen.
Belkin, N. J., & Robertson, S. E. (1976). Information science and the phenomenon of
information. Journal of the American Society for Information Science, 27(4), 197–204.
Retrieved from https://www.researchgate.net/publication/227838588/download
Bell, E. J. L., Berridge, D., & Rayson, P. (2009). Measuring style with the authorship
ratio: An invariant metric of lexical similarity. Retrieved from http://ucrel.lancs.
Belza, M. I. (1971). K voprosu o nekotorych osobennostjach semanticheskoj struk-
tury svjaznych tekstov [On some features of the semantic structure of coherent
texts]. In Skorokhodko, É. F. (Ed.), Semanticheskie problemy avtomatizacii i
informacionnogo poiska [Semantic problems of automation and information
search] (pp. 58–73). Kyiv: Naukova dumka.
Bleiler, E. F. (1990). Science-ﬁction: The early years. Kent, OH: Kent State University
Boder, D. P. (1940). The adjective-verb quotient: A contribution to the psychology
of language. Psychological Record, 3, 310–343.
Bogdanov, A. (1908/1984). Red Star. Engineer Menni. A Martian stranded on Mars.
Ch. Rougle, Trans. Bloomington and Indianapolis: Indiana University Press.
Retrieved from https://archive.org/details/BogdanovRedStar
Bradbury, R. (1950). The Martian chronicles. New York: Doubleday.
Březina, V. (2018). Statistics in corpus linguistics: A practical guide. Cambridge, UK:
Cambridge University Press.
Brown, F. (1954). Martians, go home. Illustrations by Freas. In Astounding Science
Fiction (New York), 44(1), 9–55. Retrieved from https://archive.org/stream/
Brunet, E. (1978). Vocabulaire de Jean Giraudoux: Structure et Évolution; Statistique
et Informatique Appliquées à l’Étude des Textes, à partir du Trésor de la Langue
Française. [The vocabulary of Jean Giraudoux: Structure and evolution; statistics
and informatics applied to the study of texts, based on the thesaurus of the
French language]. Paris: Slatkine.
Budrys, A. (1967). Great science ﬁction stories about Mars, T. E. Dikty (Ed.).
Review by Algis Budrys. In F. Pohl (Ed.). Galaxy bookshelf; cover by Douglas
Chaffee. Galaxy Magazine, April 1967, 25(4), 166–169. New York: Galaxy
Bunge, M. A. (1983). Treatise on basic philosophy. Vol. 6. Epistemology and meth-
odology II: Understanding the world. Dordrecht/Boston/Lancaster: D. Reidel
Publishing Company/Kluwer Academic Publishers Group.
Burroughs, E. R. (1912/1917). A princess of Mars [Original title, Under the moons
of Mars]. Chicago, IL: A. C. McClurg & Co. Retrieved from https://www.guten
Busemann, A. (1925). Die Sprache der Jugend als Ausdruck der
Entwicklungsrhythmik [Youth’s speech as an imprint of the rhythm of develop-
ment]. Jena: Fischer.
Campanile, E., Cardona, G. R., & Lazzeroni, R., Eds. (1988). Bilinguismo e
biculturalismo nel mondo antico. Atti del Colloquio interdisciplinare tenuto a
Pisa il 28 e 29 settembre 1987. Testi Linguistici 13. Pisa: Giardini Editori e
Carr, J. F. (2008). H. Beam Piper: A biography. Series Editors, Palumbo, D. E. &
Sullivan III, C. W. Critical Explorations in Science Fiction and Fantasy, 8.
Jeﬀerson, NC: McFarland & Company, Inc.
Carr, M. H., & Head, J. W. (2010). Acquisition and history of water on Mars. In N.
A. Cabrol & E. A. Grin (Eds.), Lakes on Mars (pp. 31–67). Amsterdam: Elsevier
Science. Retrieved from http://www.planetary.brown.edu/pdfs/3757.pdf
Carter, R., & Simpson, P. (Eds.). (1989). Language, discourse and literature: An
introductory reader in discourse stylistics. London: Unwin Hyman, Ltd.
Čech, R., Popescu, I.-I., & Altmann, G. (2014). Metody kvantitativní analýzy
(nejen) básnických textů [Methods of quantitative analysis of (not only) poetic
texts]. Olomouc: Univerzita Palackého v Olomouci.
Ceriani, L., & Verme, P. (2012). The origins of the Gini index: Extracts from
Variabilità e Mutabilità (1912) by Corrado Gini. The Journal of Economic
Inequality (Springer), 10(3), 421–443.
Chadwick, J. (1958/2000). The decipherment of linear B. Cambridge: The Press
Syndicate of the Cambridge University.
Chatman, S. B. (Ed.). (1971). Literary style: A symposium. London & New York:
Oxford University Press.
Chen, R., & Altmann, G. (2015). Conceptual inertia in texts. Glottometrics, 30, 73–88.
Clarke, A. C. (1951). The sands of Mars. London: Sidgwick & Johnson.
COCA (2018). Word frequency data – Corpus of Contemporary American English
(COCA). Retrieved from http://www.wordfrequency.info/free.asp?s=y
Coleridge, S. (1914 [1907, 1818]). Coleridge’s essays & lectures on Shakspeare &
some other old poets & dramatists. London: J. M. Dent & Sons/New York: E. P.
Dutton & Co. Retrieved from https://ia902303.us.archive.org/0/items/coleridge
Cooper, L. (1907/1930). Theories of style, with especial reference to prose composi-
tion. New York: The Macmillan Company. Retrieved from https://archive.org/
Cover, T. M., & Thomas, J. A. (1991/2006). Elements of information theory. New
York: John Wiley & Sons, Inc. Retrieved from http://www.cs-114.org/wp-con
Covington, M. A., & McFall, J. D. (2010). Cutting the Gordian Knot: The Moving
Average Type-Token Ratio (MATTR). Journal of Quantitative Linguistics, 17(2),
94–100. Retrieved from https://pdfs.semanticscholar.org/5fe8/
Creswell, J. W., & Plano Clark, V. (2007). Designing and conducting mixed methods
research. Thousand Oaks, CA: Sage Publications, Inc.
Daelemans, W. (2013). Explanation in computational stylometry. In A. Gelbukh (Ed.),
Computational linguistics and intelligent text processing. CICLing 2013. Lecture notes
in computer science, 7817 (pp. 451–464). Berlin, Heidelberg: Springer. Retrieved from
Damgaard, C., & Weiner, J. (2000). Describing inequality in plant size or fecundity.
Ecology, 81, 1139–1142. Retrieved from https://www.researchgate.net/publica
Daniels, P. T. (1996). Methods of decipherment. In P. T. Daniels & W. Bright
(Eds.), The world’s writing systems (pp. 141–159). New York: Oxford University
Daniels, P. T., & Bright, W. (Eds.). (1996). The world’s writing system. Oxford, NY:
Oxford University Press.
Dantzig, T. (1930/2005). Number: The language of science. J. Mazur (Ed.), The
Masterpiece Science Edition. New York: Pi Press/An imprint of Pearson
Davies, P. (1995). Are we alone?: Philosophical implications of the discovery of
extraterrestrial life. New York: Basic Books/Harper Collins Publishers.
Dick, P. K. (1964). Martian time-slip. New York: Ballantine Books/Random House.
Dick, S. J. (1998). Life on other worlds: The 20th century extraterrestrial debate.
Cambridge, UK: Cambridge University Press.
Dijk, T. A. V. (1977/1992). Text and context. Sixth Impression. London and New
York: Longman Group UK Limited.
Disch, T. M. (1976). Echo round his bones. New York: Berkley Medallion Books/
Dixon, R. M. W. (1991/2005). A semantic approach to English grammar. Revised
and enlarged second edition. Oxford Textbooks in Linguistics. Oxford: Oxford
Dontcheva-Navratilova, O., Jančaříková, R., Miššíková, G., & Povolná, R. (2017).
Coherence and cohesion in English discourse. Brno: Masaryk University Press.
Drake, F., & Sobel, D. (1992). Is anyone out there? The scientiﬁc search for extra-
terrestrial intelligence. New York: Delacorte Press.
Eckert, P., & Rickford, J. R. (Eds.). (2001). Style and sociolinguistic variation.
Cambridge, UK: Cambridge University Press.
Edelstein, S. (2003). Dubious doublets: A delightful compendium of unlikely word
pairs of common origin. Hoboken, NJ: John Wiley & Sons, Inc.
Eder, M. (2010/2013). Does size matter? Authorship attribution, small samples, big
problem. Literary and Linguistic Computing, 30(2), 167–182. Based on a previous
draft – DH 2010: DIGITAL HUMANITIES, Conference Abstracts. King’s College
London, pp. 132–135. Retrieved from http://dh2010.cch.kcl.ac.uk/academic-pro
Ellis, G. F. R. (2004/2005). True complexity and its associated ontology. In J. D.
Barrow, P. C. W. Davies, & C. L. Harper Jr. (Eds.), Science and ultimate reality:
Quantum theory, cosmology, and complexity (pp. 607–636). Cambridge:
Cambridge University Press.
Engdahl, S., Ed. (2001/2006). Extraterrestrial life. Contemporary Issues ●
Companion. Detroit: Greenhaven Press/An imprint of Thomson Gale.
Esteban, M. D., & Morales, D. (1995). A summary of entropy statistics. Kybernetica,
Fergus, C. (2013, May 1). Beyond Earth: Mars fever. Penn State News. Pennsylvania
State University. Retrieved from http://news.psu.edu/story/140745/2003/05/01/
Fiebelkorn, I. C., Pinsk, M. A., & Kastner, S. (2018). A dynamic interplay within the
frontoparietal network underlies rhythmic spatial attention. Neuron, 99(4), 842–853.
Fortis, J.-M., & Fagard, B. (2010). Space in language. Part IV: Adnominals.
Adnominals: Topological-functional adpositions, spatial phrases and spatial
cases. DGfS-CNRS Summer School on Linguistic Typology. Leipzig, August
15–September 3, 2010. Retrieved from https://www.eva.mpg.de/lingua/confer
Freeman, D. C. (Ed.). (1970). Linguistics and literary style. New York: Holt,
Rinehart and Winston, Inc.
Friedrich, J. (1957/1971). Extinct languages. F. Gaynor, Trans. Westport, CT:
Greenwood Press, Publishers.
Frye, N. (1963). The well-tempered critic. Bloomington, IN: Indiana University Press.
Gastwirth, J. L. (2017). Is the Gini index of inequality overly sensitive to changes in
the middle of the income distribution? Statistics and Public Policy, 4(1), 1–11.
Gelb, I. J., & Whiting, R. M. (1975). Methods of decipherment. Journal of the Royal
Asiatic Society of Great Britain and Ireland, 107, 95–104.
Gini, C. (1921). Measurement of inequality of incomes. The Economic Journal, 31(121),
124–126. Retrieved from https://www.jstor.org/stable/pdf/2223319.pdf?
Gissing, G. (2016 [2008, 1891]). New Grub Street (Vol. 3, 2nd ed.). London: Smith, Elder,
Golomb, S. W. (1961/1968). Extraterrestrial linguistics. Word Ways, 1(4/5), 202–
205. Retrieved from http://digitalcommons.butler.edu/wordways/vol1/iss4/5
Gombrich, E. H. (1982). The image and the eye: Further studies in the psychology of
pictorial representation. Ithaca, NY: Cornell University Press/Phaidon Books.
Gordon, C. H. (1968/1987). Forgotten scripts. New York: Dorset Press/Marboro
Gray, B. (1969). Style: The problem and its solution. The Hague: Mouton.
Greg, P. (1880). Across the zodiac: The story of a wrecked record. London: Trübner
& Co. Retrieved from http://www.archive.org/details/acrosszodiacstor01greg.
Grieve, J. W. (2005). Quantitative authorship attribution: A history and an evalua-
tion of techniques (Master’s Thesis). Simon Fraser University, Burnaby, BC,
Canada. Retrieved from http://www.summit.sfu.ca/system/files/iritems1/8840/
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Routledge.
Harrison, A. A. (2014). Speaking for Earth: Projecting cultural values across deep space
and time. In D. A. Vakoch (Ed.), Archaeology, anthropology, and interstellar communication.
The NASA History Series (pp. 175–191). Washington, DC: Office of
Communications, National Aeronautics and Space Administration.
Hawkes, T. (1977/2004). Structuralism and semiotics. London: Routledge/Taylor &
Hawkins, J. D., & Morpurgo Davies, A. (1978). On the problems of Karatepe: The
hieroglyphic text. Anatolian Studies (British Institute at Ankara), 28, 103–119.
Retrieved from http://www.ling-phil.ox.ac.uk/files/hawkins-amd_karatepe_the_
Heidmann, J. (1995). Extraterrestrial intelligence (2nd ed.). Cambridge: Cambridge
Heinlein, R. (1961). Stranger in a strange land. New York: G. P. Putnam’s Sons.
Helbig, G., & Schenkel, W. (2011 [1991, 1969]). Wörterbuch zur Valenz und
Distribution deutscher Verben [Lexicon of the valency and distribution of
German verbs]. Berlin: de Gruyter.
Herbst, T., Heath, D., Roe, I. F., & Götz, D. (2004). A valency dictionary of English:
A corpus based analysis of the complementation patterns of English verbs, nouns
and adjectives. Berlin: Mouton de Gruyter.
Herdan, G. (1964). Quantitative linguistics. London: Butterworths.
Hines, D. (2002). H. Beam Piper. Retrieved from http://www.mib.org/~hradzka/piper/
Holmes, D. (1998). The evolution of stylometry in humanities scholarship. Literary
and Linguistic Computing, 13, 111–117.
Jackson, H., & Amvela, E. Z. (2000). Words, meaning and vocabulary: An introduc-
tion to modern English lexicology. London: A & C Black.
Johnson, K. (2008). Quantitative methods in linguistics. Malden, MA: Blackwell.
Juola, P. (2008). Author attribution. Foundations and Trends® in Information
Retrieval, 1(3), 233–334. Retrieved from http://www.mathcs.duq.edu/~juola/
Kelih, E., Rovenchak, A., & Buk, S. (2014). Analyzing h-point in lemmatized and
non-lemmatized texts. In G. Altmann, R. Čech, I. Mačutek, & L. Uhlířová (Eds.),
Empirical approaches to text and language analyses – Dedicated to Luděk Hřebíček
on the occasion of his 80th birthday (pp. 81–94). Lüdenscheid: RAM-Verlag.
Khrennikov, A. Y. (2014). Cognitive processes of the brain: An ultrametric model of
information dynamics in unconsciousness. P-Adic Numbers, Ultrametric
Analysis, and Applications, 6(4), 293–302.
Kipper, K., Korhonen, A., Ryant, N., & Palmer, M. (2008). A large-scale classiﬁca-
tion of English verbs. Language Resources and Evaluation Journal, 42, 21–40.
Retrieved from http://verbs.colorado.edu/~kipper/Papers/lrec.pdf
Knight, K., & Sproat, R. (2009). Writing systems, transliteration and decipherment.
Retrieved from http://www.isi.edu/natural-language/people/naac109-print-1x2.pdf
Köhler, R. (2005a). Synergetic linguistics. In R. Köhler, G. Altmann, & R. G. Piotrowski
(Eds.), Quantitative Linguistik/Quantitative linguistics: Ein Internationales Handbuch/
An international handbook. Handbücher zur Sprach- und Kommunikationswissenschaft,
Band 27 (pp. 760–774). Berlin: Walter de Gruyter.
Köhler, R. (2005b). Quantitative Untersuchungen zur Valenz deutscher Verben
[Quantitative investigations on the valency of German verbs]. Glottometrics, 9, 13–20.
Köhler, R., & Galle, M. (1993). Dynamic aspects of text characteristics. In L.
Hřebíček & G. Altmann (Eds.), Quantitative text analysis. Quantitative
Linguistics, 52 (pp. 46–53). Trier: Wissenschaftlicher Verlag Trier.
Kramer, M. (2014, December 16). Curiosity Rover drills into Mars rock, finds water. space.com.
Retrieved from https://www.space.com/28030-mars-water-curiosity-rover.html
Kubát, M. (2014). Moving window type-token ratio and text length. In G. Altmann,
R. Čech, I. Mačutek, & L. Uhlířová (Eds.), Empirical approaches to text and
language analyses – dedicated to Luděk Hřebíček on the occasion of his 80th
birthday (pp. 105–114). Lüdenscheid: RAM-Verlag.
Kubát, M. (2016). Kvantitativní analýza žánrů [Quantitative analysis of genres].
Ostrava: FF OU.
Kubát, M., Matlach, V., & Čech, R. (2014). QUITA – Quantitative index text
analyzer. Lüdenscheid: RAM-Verlag. Retrieved from https://code.google.com/
Kubát, M., & Milička, J. (2013). Vocabulary richness measure in genres. Journal of
Quantitative Linguistics, 20(4), 339–349.
LancsBox (2018). Lancaster University corpus toolbox. © Vaclav Březina, Lancaster
University. Retrieved from http://corpora.lancs.ac.uk/lancsbox/download.php.
Landis, G. A. (2000). Mars crossing. New York: Tor Books.
Lasswitz, K. (1897/1971). Auf Zwei Planeten [Two planets]. Weimar: Emil Felber.
(H. H. Rudnick, Trans.). Carbondale: Southern Illinois University Press.
Retrieved from http://www.gasl.org/refbib/Lasswitz__Auf_2_Planeten.pdf
Leech, G., Rayson, P., & Wilson, A. (2001). Word frequencies in written and spoken
English: Based on the British National Corpus. London: Longman.
Levickij, V., & Lučak, M. (2005). Category of tense and verb semantics in the
English language. Journal of Quantitative Linguistics, 12(2–3), 212–238.
Lyons, J. (1977/1996). Semantics. Vol. 1. Cambridge, UK: Cambridge University Press.
Malvern, D., Richards, B. J., Chipere, N., & Durán, P. (2004). Lexical diversity and
language development: Quantiﬁcation and assessment. Basingstoke, UK: Palgrave
McAuley, P. J. (1994). Red dust. New York: Avon/HarperCollins.
McIntosh, R. P. (1967). An index of diversity and the relation of certain concepts to
diversity. Ecology, 48(3), 392–404. Retrieved from https://www.researchgate.net/
McMenamin, G. R. (2002). Forensic linguistics: Advances in forensic stylistics. Boca
Ratón, FL: CRC Press LLC.
Melka, T. S. (2018). Stylistic study of Omnilingual by H. Beam Piper. Glottometrics,
Michaud, M. A. G. (2007). Contact with alien civilizations: Our hopes and fears
about encountering extraterrestrials. New York: Copernicus Books/Springer
Science + Business Media LLC.
Místecký, M. (2018). Belza chains in Machar’s Letní sonety. Glottometrics, 41, 46–56.
Místecký, M., Yiang, J., & Altmann, G. (2018). Belza chain analysis: Weighting
Nekula, M. (2002). Text. In P. Karlík, M. Nekula, & J. Pleskalová (Eds.),
Encyklopedický slovník češtiny [New encyclopaedic dictionary of Czech language]
(pp. 489). Praha: Nakladatelství Lidové noviny.
O’Connor, M. (1996). The Berber scripts. In P. T. Daniels & W. Bright (Eds.), The
world’s writing systems (pp. 112–116). New York: Oxford University Press.
Oakes, M. P. (2009). Corpus linguistics and stylometry. In A. Lüdeling & M. Kytö
(Eds.), Corpus linguistics: An international handbook. Handbücher zur Sprach- und
Kommunikationswissenschaft/Handbooks of Linguistics and Communication
Science, Band 29/2 (pp. 1070–1090). Berlin: Mouton de Gruyter.
Orlov, J. K., & Chitashvili, R. Y. (1983). Generalized z-distribution generating the well-known
‘rank-distributions’. Bulletin of the Academy of Sciences, Georgia, 110(2), 269–
Pan, X., & Liu, H. (2014). Adnominal constructions in Modern Chinese and their
distribution properties. Glottometrics, 29, 1–30.
Parkinson, R. B. (1999). Cracking codes: The Rosetta stone and decipherment.
Berkeley: The University of California Press.
Petersson, J. S. (1996). Numerical notation. In P. T. Daniels & W. Bright (Eds.), The
world’s writing systems (pp. 795–806). New York: Oxford University Press.
Pohl, F. (1976). Man plus. New York: Random House.
Pope, M. (1975/1999). The story of decipherment: From Egyptian hieroglyphs to
Maya script. Rev. ed. London: Thames & Hudson.
Popescu, I.-I., & Altmann, G. (2006). Some aspects of word frequencies. Glottometrics,
Popescu, I.-I., & Altmann, G. (2007). Writer’s view of text generation. Glottometrics,
Popescu, I.-I., & Altmann, G. (2015). A simplified lambda indicator in text analysis.
Popescu, I.-I., Altmann, G., Grzybek, P., Jayaram, B. D., Köhler, R., Krupa, V., . . .
Vidya, M. N. (2009). Word frequency studies. Berlin: Mouton de Gruyter.
Popescu, I.-I., Čech, R., & Altmann, G. (2011). The lambda-structure of texts.
Studies in Quantitative Linguistics 10. Lüdenscheid: RAM-Verlag.
Popescu, I.-I., Mačutek, J., & Altmann, G. (2009). Aspects of word frequencies.
Studies in Quantitative Linguistics 3. Lüdenscheid: RAM-Verlag.
Pratchett, T., & Baxter, S. (2014). The long Mars. Series The Long Earth. New York:
Raber, D. (2003). The problem of information. Library and Information Science.
Lanham, MD: Scarecrow Press, Inc.
Raleigh, W. (1897/1904). Style. Fifth Impression. London: Edward Arnold. Retrieved
Ray, J. (2007). The Rosetta stone and the rebirth of ancient Egypt. London: Proﬁle.
Rescher, N. (1985). Extraterrestrial science. In E. Regis Jr. (Ed.), Extraterrestrials: Science
and alien intelligence (pp. 83–116). Cambridge: Cambridge University Press.
Robinson, A. (2002). Lost lang