Ján Mačutek

  • Slovak Academy of Sciences

About

52
Publications
6,079
Reads
314
Citations
Current institution
Slovak Academy of Sciences

Publications

Article
Full-text available
Grammatical cases of nouns are expressed by inflectional endings in Slovak. Therefore, nouns have several word forms, with the nominative considered the basic form. In addition to the endings, some word forms also show morphophonological changes in their stems. The differences between the basic form and inflected forms are evaluated using the Lev...
Article
Full-text available
Declensional morphology of nouns in Czech and Russian is investigated and compared. It is shown that, in general, word forms which are more similar to their lemmas are preferred, but there are differences between animate and inanimate nouns and also among grammatical genders. The frequency distribution of grammatical cases is also studied, with ani...
Chapter
According to the Menzerath-Altmann law, the mean word length is greater in shorter clauses than in longer ones. In Czech, negation is mostly realized by adding the prefix ne- to the beginning of the word, which makes the word longer (and, consequently, it also increases the mean word length in the clause). Therefore, we predict that clauses in whic...
Chapter
Stress position in three languages with free stress is scrutinized. Although there are no deterministic rules for stress in such languages, some statistical tendencies are clearly observable. Simple mathematical models for the mean stress position and for the relative mean stress position are suggested. As a “byproduct,” we show that word length di...
Article
Full-text available
Non-alcoholic fatty liver disease (NAFLD) is a liver pathology affecting around 25% of the population worldwide. Excess oxidative stress, inflammation and aberrant cellular signaling can lead to this hepatic dysfunction and eventual carcinoma. Molecular hydrogen has been recognized for its selective antioxidant properties and ability to attenuate i...
Article
Full-text available
For every discrete probability distribution, there is one and only one partial summation which leaves the distribution unchanged. This invariance property is reconsidered for distributions with one parameter. We show that if we change the parameter value in the function which defines the summation, two families of distributions can be observed. The...
Chapter
The aim of the paper is to test the validity of the Menzerath-Altmann law for Czech poems from K. J. Erben’s ballad collection Kytice z pověstí národních (A Bouquet of Folk Legends). We focus particularly on the relationship between word length and syllable length. The Menzerath-Altmann law predicts that the mean syllable length will be longer in s...
Chapter
Specialists in quantitative linguistics the world over have recourse to a solid and universal methodology. These days, their methods and mathematical models must also respond to new communication phenomena and the flood of data produced daily. While various disciplines (computer science, media science) have different ways of processing this onslaug...
Article
Full-text available
The paper focuses on the dynamics of change in several linguistic and text properties in the diachronic development of Czech. Specifically, we analyze the proportion of identical word-forms (types), the average type length, text length, the proportion of hapax legomena, the moving average type-token ratio, and entropy. For the analysis, seven translations...
Article
Full-text available
Rewriting books was a widespread phenomenon during the Baroque period of Czech literature. The manuscripts were not always “honest copies”; on the contrary, scribes often compiled several sources or added their own texts to the original. The famous book Golden Key of Heaven by Martin of Cochem is compared with a manuscript Key of Heaven from a...
Article
Full-text available
The paper deals with two important questions in linguistic research: 1) What do we actually model when we model language usage? and 2) What is an appropriate sample or ‘text unit’ for the analysis of language usage? In the beginning, we critically discuss several approaches to the analysis of language behaviour. Then, we introduce the most importan...
Article
Full-text available
The Zipf-Mandelbrot distribution serves as a mathematical model for ranked frequencies in many areas of scientific research, including linguistics. Many linguistic units, such as words or word n-grams, follow this distribution. However, in some cases, such as for graphemes in linguistics or species abundance and diversity data in biology, the pa...
Book
Full-text available
In Tackling the Toolkit, we focus on the methodological innovations, challenges, obstacles and even shortcomings associated with applying quantitative methods to poetry specifically and poetics more broadly. Using tools including natural language processing, web ontologies, similarity detection devices and machine learning, our contributors explore...
Article
Full-text available
The problem of iterated partial summations is solved for some discrete distributions defined on finite supports. The power method, usually used as a computational approach to the problem of finding matrix eigenvalues and eigenvectors, is in some cases an effective tool to prove the existence of the limit distribution, which is then expressed as a s...
Article
Annual speeches of Czech and Czechoslovak presidents on the occasion of the end of the year are analyzed in this study. Several stylometric methods are used, namely, vocabulary richness expressed by the moving-average type–token ratio, an index of text activity, mean word length, mean verb distance, and cluster analysis of the most frequent words....
Article
Full-text available
The paper analyzes the relationship between word order positions of pronominal enclitics in the history of Czech. Specifically, we look at Wackernagel’s position and the contact position and try to decide whether these two positions compete, as is usually taken for granted, or whether there is a certain kind of cooperation between the...
Preprint
Full-text available
Bivariate partial-sums discrete probability distributions are defined. The question of the existence of a limit distribution for iterated partial summations is solved for finite-support bivariate distributions which satisfy conditions under which the power method (known from matrix theory) can be used. An oscillating sequence of distributions, a ph...
Preprint
Full-text available
The problem of iterated partial summations is solved for some discrete distributions defined on discrete supports. The power method, usually used as a computational approach to finding matrix eigenvalues and eigenvectors, is in some cases an effective tool to prove the existence of the limit distribution, which is then expressed as a solution of a...
Conference Paper
Full-text available
The paper analyzes the relationship between the full valency of the predicate and the position of enclitics in the clause. For this analysis, some of the oldest Old Czech prose texts were used. We set up the hypothesis that the higher the full valency of the predicate, the lower the probability of the occurrence of the enclitic af...
Conference Paper
Full-text available
Lengths (in words) of projective and non-projective sentences from a Czech UD dependency treebank are compared. It is shown that non-projective sentences are significantly longer (the same result was also obtained in this study for Arabic, Polish, Russian, and Slovak). The hyperpascal distribution, which was suggested as the model for...
Article
The present study deals with the historical development of Czech (en)clitics (AuxP). Based on data from previous research (Kosek 2015a,b, 2017), it focuses on the development of one group of Czech (en)clitics, the preterite auxiliary forms. In the article, three hypotheses are formulated and then tested on the data gained from select...
Article
Full-text available
The paper deals with the word order of reflexive sě, which is an item on the boundary between a pronominal form and a discrete morpheme. In the first part of the study, we investigate the (en)clitic status of sě in eight books of the oldest complete Czech Bible translation. The analysis focuses only on sě that is dependent on a finite verb: it iden...
Article
Full-text available
In this part of the paper, the distribution of clause positions of the reflexive pronoun sě is analyzed statistically. Specifically, the impact of both stylistic factors and the length of the element in the initial position is investigated. The authors also discuss the possible influence of the word order of the Latin pretext (the Vulgate) on the...
Chapter
Full-text available
The article presents a quantitative analysis of some syntactic dependency properties in Czech. A dependency frame is introduced as a linguistic unit and its characteristics are investigated. In particular, ranked frequencies of dependency frames are observed and modelled, and a relationship between particular syntactic functions and the number of...
Article
This paper first discusses the Menzerath-Altmann law in general and then shows that the law is valid in spoken Czech. In particular, the relation between word length (measured in the number of syllables) and the mean syllable length (measured in the number of phonemes) is investigated. In addition, we model the relation between the relative o...
Article
Full-text available
According to the Menzerath-Altmann law, there is a relation between the size of the whole and the mean size of its parts. The validity of the law was demonstrated on relations between several language units, e.g., the longer a word, the shorter the syllables the word consists of. In this paper it is shown that the law is valid also in syntactic dep...
Presentation
Full-text available
The contribution analyzes negation from the perspective of a general linguistic law known as the Menzerath-Altmann law (Altmann 1980; Crammer 2005). This law expresses the relationship between the length of a language construct (in our case the so-called segment, see below) and the mean length of the immediate units of that construct, the so-called constituents (in our...
Article
A new type of mixture of discrete probability distributions is presented. A family of discrete averaged mixed distributions is introduced. Its subclass of averaged mixed logarithmic distributions is analyzed. Probabilistic characterizations and connections with other types of mixing are derived. We also show some examples of the analyzed distribut...
Article
Full-text available
We present a review of the development and the state of the art of syntactic complex network analysis. Some characteristics of such networks and problems connected with their construction are mentioned. Relations between global network indicators and specific language properties are discussed. Applications of syntactic networks (language acquisitio...
Article
Full-text available
The relationship between two important semantic properties of language (polysemy and synonymy) and one of the most fundamental syntactic network properties (the degree of a node) is observed. Based on the synergetic theory of language, it is hypothesized that a word which occurs in more syntactic contexts, i.e., has a higher degree, should be mo...
Article
Full-text available
Ord's graph is a simple graphical method for displaying frequency distributions of data or theoretical distributions in the two-dimensional plane. Its coordinates are proportions of the first three moments, either empirical or theoretical. A modification of Ord's graph based on proportions of indices of qualitative variation is present...
Article
Menzerath's law, the tendency of Z, the mean size of the parts, to decrease as X, the number of parts, increases, is found in language, music and genomes. Recently, it has been argued that the presence of the law in genomes is an inevitable consequence of the fact that Z = Y/X, which would imply that Z scales with X as Z ~ 1/X. That scaling is a ver...
Article
At the very beginning I want to emphasize that this comment is meant neither as a criticism of complex networks in general, nor of the work by Cong and Liu [4] in particular. I consider complex networks a useful tool in language modelling, and the presented review is, in fact, more than a review - not only does it sum up a considerable volume of previo...
Article
The paper questions the use of the Pearson chi-square goodness-of-fit test for discrete models in linguistics. It is argued that stochastic independence, one of the necessary conditions for a correct application of the test, is not realistic for linguistic data. Several alternative possibilities (computational and empirical approaches) are suggeste...
Article
Full-text available
Partial-sums discrete probability distributions have occurred in the description of many stochastic models. They have also been used as a tool for creating new distributions, or as a link between known distributions. It is shown in this paper that every discrete distribution with only non-zero probabilities is a partial-sums distribution, and, moreover, that it...
Article
The syntax of natural language has been the focus of linguistics for decades. Complex network theory, one of the new research tools, opens new perspectives on the syntactic properties of language. Despite numerous partial achievements, some fundamental problems remain unsolved. Specifically, although statistical properties typical for complex networ...
Article
Full-text available
The aim of this article is to find fixed points and regularities in musical texts, set up statistical tests for their comparison and observe their development. The analysis is based on rank-frequency distributions of pitches. The following indicators are described: the h-point and its angle, the a-indicator, the H-point and the H-coverage having an...
Article
A new discrete distribution which is a generalization of the right truncated geometric distribution is presented. Its basic properties are studied. The distribution is applied to modelling rank frequencies of graphemes.
Article
Full-text available
A generalization of the STER summation is presented. Relations between probability generating functions and moments of the generating and generated distributions are analyzed. It is shown that the Yule distribution is invariant with respect to the considered summation.
Article
A generalization of the partial summation given by N. L. Johnson, S. Kotz and A. W. Kemp [“Univariate discrete distributions” (1992; Zbl 0773.62007), p. 448] and by G. Wimmer and G. Altmann [Acta Univ. Palacki. Olomuc., Fac. Rerum Nat., Math. 39, 215–247 (2000; Zbl 1041.62009)] is presented. Relations between probability generating functions and mo...
