Ján Mačutek

Slovak Academy of Sciences | SAV · Mathematical Institute

About

45
Publications
4,030
Reads
219
Citations

Publications (45)
Article
Full-text available
Non-alcoholic fatty liver disease (NAFLD) is a liver pathology affecting around 25% of the population worldwide. Excess oxidative stress, inflammation and aberrant cellular signaling can lead to this hepatic dysfunction and eventual carcinoma. Molecular hydrogen has been recognized for its selective antioxidant properties and ability to attenuate i...
Article
Full-text available
For every discrete probability distribution, there is one and only one partial summation which leaves the distribution unchanged. This invariance property is reconsidered for distributions with one parameter. We show that if we change the parameter value in the function which defines the summation, two families of distributions can be observed. The...
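The invariance property can be illustrated with a minimal sketch. It assumes the partial summation has the form P*_x = c · Σ_{j≥x} g(j) P_j; this form and the helper names `invariant_weights` and `partial_summation` are assumptions for illustration, not taken from the abstract. On a finite support with all probabilities non-zero, choosing c·g(x) = 1 − P_{x+1}/P_x (with the probability beyond the support taken as 0) makes the sum telescope back to P_x.

```python
def invariant_weights(p):
    """Return w[x] = c*g(x) = 1 - p[x+1]/p[x], with p beyond the support taken as 0.

    Assumed form of the summation: P*_x = sum_{j >= x} w[j] * p[j].
    """
    if any(px <= 0 for px in p):
        raise ValueError("all probabilities must be non-zero")
    shifted = list(p[1:]) + [0.0]
    return [1.0 - nxt / cur for cur, nxt in zip(p, shifted)]


def partial_summation(p, w):
    """Apply the assumed summation: result[x] = sum over j >= x of w[j] * p[j]."""
    out = []
    tail = 0.0
    # Accumulate from the right end of the support so each tail sum is built once.
    for wj, pj in zip(reversed(w), reversed(p)):
        tail += wj * pj
        out.append(tail)
    return out[::-1]


# Hypothetical example distribution on the support {0, 1, 2, 3}.
p = [0.5, 0.3, 0.15, 0.05]
w = invariant_weights(p)
p_star = partial_summation(p, w)  # should reproduce p up to rounding
```

Because each term w[j]·p[j] equals p[j] − p[j+1], the tail sum collapses to p[x], which is why the distribution is left unchanged under this particular choice of weights.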
Article
The paper focuses on the dynamics of change in several linguistic and text properties in the diachronic development of Czech. Specifically, we analyze the proportion of identical word-forms (types), the average type length, text length, the proportion of hapax legomena, the moving average type-token ratio, and entropy. For the analysis, seven translations...
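One of the indicators named here, the moving average type-token ratio, can be sketched as follows. This is a minimal illustration of the usual definition (the mean type-token ratio over all windows of a fixed number of consecutive tokens); the function name `mattr`, the default window size, and the fallback for short texts are assumptions, not details from the paper.

```python
from collections import Counter


def mattr(tokens, window=100):
    """Moving-average type-token ratio: mean TTR over all sliding windows."""
    if window < 1:
        raise ValueError("window must be positive")
    if not tokens:
        raise ValueError("empty token list")
    if len(tokens) <= window:
        # Assumed fallback: plain type-token ratio for texts shorter than one window.
        return len(set(tokens)) / len(tokens)

    counts = Counter(tokens[:window])
    type_total = len(counts)  # number of types in the first window
    n_windows = len(tokens) - window + 1
    for i in range(window, len(tokens)):
        # Slide the window by one token: drop the oldest, add the newest.
        old = tokens[i - window]
        counts[old] -= 1
        if counts[old] == 0:
            del counts[old]
        counts[tokens[i]] += 1
        type_total += len(counts)
    return type_total / (window * n_windows)


# Hypothetical toy input; real use would pass the tokenized text.
example = mattr(["a", "a", "b", "b"], window=2)
```

Unlike the plain type-token ratio, this average does not shrink mechanically as the text gets longer, which is why it is convenient for comparing texts of different lengths.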
Article
Full-text available
Rewriting books was a widespread phenomenon during the Baroque period of Czech literature. The manuscripts were not always "honest copies"; on the contrary, scribes often compiled several sources or added their own texts to the original. The famous book Golden Key of Heaven by Martin of Cochem is compared with a manuscript Key of Heaven from a...
Article
Full-text available
The paper deals with two important questions in linguistic research: 1) What do we actually model when we model language usage? and 2) What is an appropriate sample or ‘text unit’ for the analysis of language usage? In the beginning, we critically discuss several approaches to the analysis of language behaviour. Then, we introduce the most importan...
Article
Full-text available
The Zipf-Mandelbrot distribution serves as a mathematical model for ranked frequencies in many areas of scientific research, including linguistics. Many linguistic units, such as words or word n-grams, follow this distribution. However, in some cases, such as for graphemes in linguistics or species abundance and diversity data in biology, the pa...
Book
Full-text available
In Tackling the Toolkit, we focus on the methodological innovations, challenges, obstacles and even shortcomings associated with applying quantitative methods to poetry specifically and poetics more broadly. Using tools including natural language processing, web ontologies, similarity detection devices and machine learning, our contributors explore...
Article
Full-text available
The problem of iterated partial summations is solved for some discrete distributions defined on finite supports. The power method, usually used as a computational approach to the problem of finding matrix eigenvalues and eigenvectors, is in some cases an effective tool to prove the existence of the limit distribution, which is then expressed as a s...
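The link between iterated partial summation and the power method can be sketched on a small finite support. The sketch assumes the summation P*_x ∝ Σ_{j≥x} g(j) P_j, written as an upper-triangular matrix applied to the probability vector and renormalized at each step; the weights `g`, the starting distribution, and the function names are hypothetical and not taken from the paper.

```python
def summation_matrix(g):
    """Upper-triangular matrix A with A[x][j] = g[j] for j >= x, else 0."""
    n = len(g)
    return [[g[j] if j >= x else 0.0 for j in range(n)] for x in range(n)]


def iterate_partial_summation(p, g, steps=200):
    """Repeat p <- A p / sum(A p); this is the power method with L1 normalization."""
    a = summation_matrix(g)
    for _ in range(steps):
        q = [sum(aij * pj for aij, pj in zip(row, p)) for row in a]
        total = sum(q)
        p = [qx / total for qx in q]
    return p


# Hypothetical weights with a single largest value, so that the matrix has
# a unique dominant eigenvalue (the diagonal entries of A are g[0], g[1], g[2]).
g = [1.0, 2.0, 3.0]
limit = iterate_partial_summation([0.2, 0.3, 0.5], g)
```

With these weights the iterates settle on a normalized eigenvector of the matrix belonging to its largest eigenvalue. When several weights share the maximal modulus, convergence of this simple iteration is not guaranteed, so the sketch covers only the favourable case.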
Article
Annual speeches of Czech and Czechoslovak presidents on the occasion of the end of the year are analyzed in this study. Several stylometric methods are used, namely, vocabulary richness expressed by the moving-average type–token ratio, an index of text activity, mean word length, mean verb distance, and cluster analysis of the most frequent words....
Article
Full-text available
The paper analyzes the relationship between the word order positions of pronominal enclitics in the history of Czech. Specifically, we look at Wackernagel's position and the contact position and try to decide whether these two positions compete, as is usually taken for granted, or whether there is a certain kind of cooperation between the...
Preprint
Full-text available
Bivariate partial-sums discrete probability distributions are defined. The question of the existence of a limit distribution for iterated partial summations is solved for finite-support bivariate distributions which satisfy conditions under which the power method (known from matrix theory) can be used. An oscillating sequence of distributions, a ph...
Preprint
Full-text available
The problem of iterated partial summations is solved for some discrete distributions defined on discrete supports. The power method, usually used as a computational approach to finding matrix eigenvalues and eigenvectors, is in some cases an effective tool to prove the existence of the limit distribution, which is then expressed as a solution of a...
Conference Paper
Full-text available
The paper analyzes the relationship between the full valency of the predicate and the position of enclitics in the clause. Some of the oldest Old Czech prose texts were used for this analysis. We set up the hypothesis that the higher the full valency of the predicate, the lower the probability of the occurrence of the enclitic af...
Conference Paper
Full-text available
Lengths (in words) of projective and non-projective sentences from a Czech UD dependency treebank are compared. It is shown that non-projective sentences are significantly longer (the same result was also obtained in this study for Arabic, Polish, Russian, and Slovak). The hyperpascal distribution, which was suggested as the model for...
Article
The present study deals with the historical development of Czech (en)clitics (AuxP). Based on data from previous research (Kosek 2015a,b, 2017), it focuses on the development of one group of Czech (en)clitics, the preterite auxiliary forms. In the article, three hypotheses are formulated and then tested on the data gained from select...
Article
Full-text available
The paper deals with the word order of reflexive sě, which is an item on the boundary between a pronominal form and a discrete morpheme. In the first part of the study, we investigate the (en)clitic status of sě in eight books of the oldest complete Czech Bible translation. The analysis focuses only on sě that is dependent on a finite verb: it iden...
Article
Full-text available
In this part of the paper, the distribution of clause positions of the reflexive pronoun sě is analyzed statistically. Specifically, the impact of both stylistic factors and the length of the element in the initial position is investigated. The authors also discuss the possible influence of the word order of the Latin pretext (the Vulgate) on the...
Chapter
Full-text available
The article presents a quantitative analysis of some syntactic dependency properties in Czech. A dependency frame is introduced as a linguistic unit and its characteristics are investigated. In particular, ranked frequencies of dependency frames are observed and modelled and a relationship between particular syntactic functions and the number of...
Article
This paper first discusses the Menzerath-Altmann law in general and then shows that the law is valid in spoken Czech. In particular, the relation between word length (measured in the number of syllables) and the mean syllable length (measured in the number of phonemes) is investigated. In addition, we model the relation between the relative o...
Article
Full-text available
According to the Menzerath-Altmann law, there is a relation between the size of the whole and the mean size of its parts. The validity of the law was demonstrated on relations between several language units, e.g., the longer a word, the shorter the syllables the word consists of. In this paper it is shown that the law is valid also in syntactic dep...
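For orientation, the law is often written in the following functional form. This is the commonly cited textbook parametrization, given here as an assumption; the abstract does not state which form the paper fits, and sign conventions for the exponential term vary between authors.

```latex
% Assumed textbook form of the Menzerath-Altmann law (not quoted from the abstract):
%   x = size of the whole (e.g. word length in syllables),
%   y = mean size of its parts (e.g. mean syllable length in phonemes),
%   a, b, c = parameters fitted to data.
y(x) = a \, x^{b} \, e^{-c x}
% Frequently used truncated form (c = 0); b < 0 expresses
% "the longer the whole, the shorter its parts":
y(x) = a \, x^{b}
```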
Presentation
Full-text available
The contribution focuses on the analysis of negation from the perspective of a general linguistic law known as the Menzerath-Altmann law (Altmann 1980; Crammer 2005). This law expresses the relationship between the length of a language construct (in our case a so-called segment, see below) and the mean length of the immediate units of that construct, the so-called constituents (in our...
Article
A new type of mixtures of discrete probability distributions is presented. A family of discrete averaged mixed distributions is introduced. Its subclass of averaged mixed logarithmic distributions is analyzed. Probabilistic characterizations and connections with other types of mixing are derived. We show also some examples of the analyzed distribut...
Article
Full-text available
We present a review of the development and the state of the art of syntactic complex network analysis. Some characteristics of such networks and problems connected with their construction are mentioned. Relations between global network indicators and specific language properties are discussed. Applications of syntactic networks (language acquisitio...
Article
Full-text available
The relationship between two important semantic properties (polysemy and synonymy) of language and one of the most fundamental syntactic network properties (a degree of the node) is observed. Based on the synergetic theory of language, it is hypothesized that a word which occurs in more syntactic contexts, i.e. it has a higher degree, should be mo...
Article
Full-text available
Ord's graph is a simple graphical method for displaying frequency distributions of data or theoretical distributions in the two-dimensional plane. Its coordinates are proportions of the first three moments, either empirical or theoretical ones. A modification of Ord's graph based on proportions of indices of qualitative variation is present...
Article
Menzerath's law, the tendency of Z, the mean size of the parts, to decrease as X, the number of parts, increases, is found in language, music and genomes. Recently, it has been argued that the presence of the law in genomes is an inevitable consequence of the fact that Z = Y/X, which would imply that Z scales with X as Z ~ 1/X. That scaling is a ver...
Article
At the very beginning I want to emphasize that this comment is meant neither as a criticism of complex networks in general, nor of the work by Cong and Liu [4] in particular. I consider complex networks a useful tool in language modelling, and the presented review is, in fact, more than a review: not only does it sum up a considerable volume of previo...
Article
The paper questions the use of the Pearson chi-square goodness-of-fit test for discrete models in linguistics. It is argued that stochastic independence, one of the necessary conditions for a correct application of the test, is not realistic for linguistic data. Several alternative possibilities (computational and empirical approaches) are suggeste...
Article
Full-text available
Partial-sums discrete probability distributions occur in the description of many stochastic models. They have also been used as a tool for creating new distributions, or as a link between known distributions. It is shown in this paper that every discrete distribution with only non-zero probabilities is a partial-sums distribution, and, moreover, that it...
Article
The syntax of natural language has been the focus of linguistics for decades. Complex network theory, one of the new research tools, opens new perspectives on the syntactic properties of language. Despite numerous partial achievements, some fundamental problems remain unsolved. Specifically, although statistical properties typical for complex networ...
Article
Full-text available
The aim of this article is to find fixed points and regularities in musical texts, set up statistical tests for their comparison and observe their development. The analysis is based on rank-frequency distributions of pitches. The following indicators are described: the h-point and its angle, the a-indicator, the H-point and the H-coverage having an...
Article
A new discrete distribution which is a generalization of the right truncated geometric distribution is presented. Its basic properties are studied. The distribution is applied to modelling rank frequencies of graphemes.
Article
Full-text available
A generalization of the STER summation is presented. Relations between probability generating functions and moments of the generating and generated distributions are analyzed. It is shown that the Yule distribution is invariant with respect to the considered summation.
Article
A generalization of the partial summation given by N. L. Johnson, S. Kotz and A. W. Kemp [“Univariate discrete distributions” (1992; Zbl 0773.62007), p. 448] and by G. Wimmer and G. Altmann [Acta Univ. Palacki. Olomuc., Fac. Rerum Nat., Math. 39, 215–247 (2000; Zbl 1041.62009)] is presented. Relations between probability generating functions and mo...

Project (1)
Project
The project focuses on the development of the word order of the Czech pronominal (en)clitics mi "to me", si "REFLdat", ti "to you"; ho "him", mu "to him", sě "REFLacc", tě "you". The analysis is based on representative samples from Old and Middle Czech Bible translations (created in the 14th‒18th centuries). The word order of pronominal (en)clitics is investigated: 1. in finite verb phrases, 2. in infinitive, participle, (deverbative) adjective and (deverbative) substantive phrases. The research deals especially with the competition between the second position and the contact (verb-adjacent) position of the (en)clitics, with the (en)clitic cluster, with the change of the originally orthotonic pronominal forms ho, mu, sě, tě into “constant” (en)clitics, and with the proclitization of pronominal (en)clitics. The project methodology builds on the tradition of Czech dependency and functional syntax. As the analysis of the historical development of (en)clitics is also based on frequency characteristics of the observed phenomena, methods of quantitative linguistics are used for further interpretation of the data.