The paper focuses on the dynamics of change in several linguistic and textual properties in the diachronic development of Czech. Specifically, we analyze the proportion of identical word-forms (types), the average type length, text length, the proportion of hapax legomena, the moving-average type-token ratio, and entropy. For the analysis, seven translations of the Gospel of Matthew from the 14th to the 21st century were used. The study reveals some differences in the dynamics of change of particular properties.
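The properties listed above can be illustrated with a minimal sketch. It is not the authors' procedure: the tokenizer, the window size, the normalization of the type and hapax proportions by the number of tokens, and the use of Shannon entropy over word-form frequencies are all assumptions made here for illustration, and the function names (`tokenize`, `mattr`, `text_profile`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a token is a lowercased run of word characters.
    return re.findall(r"\w+", text.lower())


def mattr(tokens, window=100):
    # Moving-average type-token ratio: the mean TTR over every
    # window of `window` consecutive tokens (window size is assumed).
    window = min(window, len(tokens))
    ttrs = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ttrs) / len(ttrs)


def text_profile(text, window=100):
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no tokens")
    n = len(tokens)                      # text length in tokens
    freq = Counter(tokens)               # frequency of each word-form
    hapaxes = sum(1 for c in freq.values() if c == 1)
    # Shannon entropy (in bits) of the word-form frequency distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in freq.values())
    return {
        "text_length": n,
        "type_proportion": len(freq) / n,   # assumption: types / tokens
        "avg_type_length": sum(len(t) for t in freq) / len(freq),
        "hapax_proportion": hapaxes / n,    # assumption: hapaxes / tokens
        "mattr": mattr(tokens, window),
        "entropy": entropy,
    }
```

Applying `text_profile` to each translation separately would give one row of values per text, which could then be compared across centuries.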
The paper addresses two important questions in linguistic research: 1) What do we actually model when we model language usage? and 2) What is an appropriate sample or ‘text unit’ for the analysis of language usage? First, we critically discuss several approaches to the analysis of language behaviour. Then, we introduce the most important characteristics of both Zipf’s linguistic theory and synergetic linguistics, focusing in particular on the aspects of these theories that bear on the above-mentioned questions. Specifically, we emphasize that one of the fundamental features of these theories is the assumption that there are linguistic laws which govern human language behaviour and which can best be detected by observing the language behaviour of an individual (in a particular context). Consequently, if the goal of the research is to examine laws of this kind, the individual text is used as the basic unit of analysis. Mixing texts can, in some cases, “conceal” the laws, as we show in an example. We also offer another example which shows how the characteristics of the same law (in this case, the Menzerath-Altmann law) differ across texts. Finally, we emphasize that using individual texts in linguistic research is but one possible approach to analysis; we do not attempt to make it a dogma of linguistic research.