Article

On the Complexity of Finite Sequences

Authors:
Abraham Lempel and Jacob Ziv

Abstract

A new approach to the problem of evaluating the complexity ("randomness") of finite sequences is presented. The proposed complexity measure is related to the number of steps in a self-delimiting production process by which a given sequence is presumed to be generated. It is further related to the number of distinct substrings and the rate of their occurrence along the sequence. The derived properties of the proposed measure are discussed and motivated in conjunction with other well-established complexity criteria.


... Indeed, if an arbitrarily complex map was permitted, outputs could have arbitrary complexities and probabilities, and thereby remove any connection between probability and complexity. Finally (4), because many AIT applications rely on approximations of Kolmogorov complexity via standard lossless compression algorithms [9,10] (but see [11,12] for a fundamentally different approach), another condition that has been proposed is that the map should not generate pseudo-random outputs like π = 3.1415..., which standard compressors cannot handle effectively. ...
... where N_w(x) comes from the 1976 Lempel and Ziv complexity measure [9], and where the simplest strings 0^n and 1^n are treated separately because N_w(x) assigns complexity K = 1 to the string 0 or 1, but complexity 2 to 0^n or 1^n for n ≥ 2, whereas the true Kolmogorov complexity of such a trivial string actually scales as log_2(n) for typical n, because one only needs to encode n. Having said that, the minimum possible value is K(x) ≈ 0 for a simple set, and so, e.g., for binary strings of length n, we can expect 0 ≤ K(x) ≤ n bits. ...
... We now study simplicity bias in the sine map. Wolfram [48] described the sine map x_{k+1} = μ sin(π √x_k) and further displayed the bifurcation diagram for this map, which is broadly similar to that of the logistic map, to illustrate Feigenbaum's discovery of universality. For this map, we sample x_0 ∈ [0.0, 1.0] uniformly 10^6 times and return to the digitisation threshold of 0.5. ...
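As a rough, self-contained illustration of the digitisation procedure described in this excerpt (not the cited study's code), the sketch below iterates the sine map x_{k+1} = μ sin(π √x_k) from uniformly sampled initial values and thresholds the trajectory at 0.5 to produce binary output patterns; the value of μ, the trajectory length, and the number of samples are arbitrary choices made here.

# Hedged sketch: sine-map trajectories digitised at threshold 0.5.
# mu, n_iter and the number of sampled initial conditions are assumptions.
import math
import random

def sine_map_pattern(mu: float, x0: float, n_iter: int = 25) -> str:
    """Iterate x_{k+1} = mu*sin(pi*sqrt(x_k)) and threshold each iterate at 0.5."""
    x, bits = x0, []
    for _ in range(n_iter):
        x = mu * math.sin(math.pi * math.sqrt(x))
        bits.append('1' if x > 0.5 else '0')
    return ''.join(bits)

random.seed(0)
for _ in range(5):
    print(sine_map_pattern(mu=0.9, x0=random.uniform(0.0, 1.0)))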
Article
Full-text available
Arguments inspired by algorithmic information theory predict an inverse relation between the probability and complexity of output patterns in a wide range of input–output maps. This phenomenon is known as simplicity bias. By viewing the parameters of dynamical systems as inputs, and the resulting (digitised) trajectories as outputs, we study simplicity bias in the logistic map, Gauss map, sine map, Bernoulli map, and tent map. We find that the logistic map, Gauss map, and sine map all exhibit simplicity bias upon sampling of map initial values and parameter values, but the Bernoulli map and tent map do not. The simplicity bias upper bound on the output pattern probability is used to make a priori predictions regarding the probability of output patterns. In some cases, the predictions are surprisingly accurate, given that almost no details of the underlying dynamical systems are assumed. More generally, we argue that studying probability–complexity relationships may be a useful tool when studying patterns in dynamical systems.
... Lempel-Ziv complexity of a finite string [2] is a complexity measure inspired by the universal compression algorithms of Lempel and Ziv [3], [4]. Computing the LZ complexity of a string (a sequence of bytes) is inherently a serial process. ...
... As introduced by [2], the Lempel-Ziv complexity of a finite string S is proportional to the minimal number of substrings that are necessary to produce S via a simple copy operation. For instance, the string S = ababcabcabcbaa can be constructed from the five sub-strings, a, b, abc, abcabcb, aa and therefore its LZ-complexity equals 5. ...
... The standard serial algorithm for computing the LZ-complexity of a string of length n takes O(n^2) time. The definition of this complexity [2], [5] is as follows: let S, Q, and R be strings of bytes defined over the alphabet A. Denote by l(S) the length of S, and by S(i) the i-th element of S. We denote by S(i, j) the substring of S which consists of the bytes of S between positions i and j (inclusive). An extension R = SQ of S is reproducible from S (denoted as S → R) ...
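For concreteness, here is a minimal Python sketch of the standard serial O(n^2) computation in the Kaspar-Schuster style (counting the components of the exhaustive production history). It is not the cited parallel implementation, but it reproduces the value 5 for the example string given above.

# Sketch of the standard serial O(n^2) Lempel-Ziv (1976) complexity computation,
# in the spirit of Kaspar & Schuster: count components of the exhaustive
# production history, where each component extends a copyable substring by one
# new symbol.
def lz76_complexity(s: str) -> int:
    n = len(s)
    if n <= 1:
        return n
    i, k, l = 0, 1, 1        # i: candidate copy source, k: current match length, l: start of current component
    k_max, c = 1, 1          # k_max: longest match for this component, c: component count
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:    # the current component reaches the end of the string
                c += 1
                break
        else:
            k_max = max(k, k_max)
            i += 1
            if i == l:       # no earlier copy source works: close the component
                c += 1
                l += k_max
                if l + 1 > n:
                    break
                i, k, k_max = 0, 1, 1
            else:
                k = 1
    return c

print(lz76_complexity("ababcabcabcbaa"))    # 5: a | b | abc | abcabcb | aa
print(lz76_complexity("0001101001000101"))  # 6: 0 | 001 | 10 | 100 | 1000 | 101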
... This study also used L-Z signal complexity [41] to evaluate the probability of emerging signal patterns and the features of neural activity transitions. ...
... In their formulation, the original signal (their Eq. (1)) is segmented into a new sequence S according to l segments. The normalized complexity of the sequence S is then calculated [41], in which c is the complexity of sequence S. ...
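The excerpt does not reproduce the normalization formula it refers to. A commonly used form, stated here as an assumption rather than as the cited paper's exact equation, divides the raw complexity c of a length-n sequence over an alphabet of size α by its asymptotic maximum n / log_α(n):

\[ C_{\mathrm{norm}} = \frac{c \, \log_{\alpha} n}{n} \]

so that highly regular sequences give values near 0 and random-looking sequences give values near 1.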
Article
Full-text available
Alzheimer's disease (AD) constitutes a neurodegenerative disorder marked by a progressive decline in cognitive function and memory capacity. The accurate diagnosis of this condition predominantly relies on cerebrospinal fluid (CSF) markers, notwithstanding the associated burdens of pain and substantial financial costs endured by patients. This study encompasses subjects exhibiting varying degrees of cognitive impairment, encompassing individuals with subjective cognitive decline, mild cognitive impairment, and dementia, constituting a total sample size of 82 participants. The primary objective of this investigation is to explore the relationships among brain atrophy measurements derived from magnetic resonance imaging, atypical electroencephalography (EEG) patterns, behavioral assessment scales, and amyloid β-protein (Aβ) indicators. The findings of this research reveal that individuals displaying reduced Aβ1-42/Aβ-40 levels exhibit significant atrophy in the frontotemporal lobe, alongside irregularities in various parameters related to EEG frequency characteristics, signal complexity, interregional information exchange, and microstates. The study additionally endeavors to estimate Aβ1-42/Aβ-40 content through the application of a random forest algorithm, amalgamating structural data, electrophysiological features, and clinical scales, achieving remarkable predictive performance.
... In this study, the dataset was created by obtaining EEG signals from 33 participants while they were presented with olfactory stimuli by presenting scented and unscented versions of the identical product packaging. Nonlinear characteristics such as HFD [17], HPs [18], and LZC [19] were utilized to examine their effectiveness in olfactory stimuli classification and the analysis of emotional processes using EEG signals [4]. The performance of ML classifiers was tested using different metrics and compared with the authors' previous work, which was conducted using the same dataset but used PSDs of the EEG sub-bands [20]. ...
... This feature calculates the signal's temporal complexity [19]. To calculate LZC, first, a binary sequence is constructed using (2). ...
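The binarization step referred to as "(2)" is not shown in the excerpt. The sketch below illustrates the usual approach of thresholding each sample against the signal mean (the median is another common choice); this is an assumption about the cited paper's exact rule, not a reproduction of it.

# Hedged sketch: binarize a signal before computing Lempel-Ziv complexity.
# Thresholding at the mean is assumed here; the cited work's Eq. (2) may differ.
import numpy as np

def binarize(signal: np.ndarray) -> str:
    threshold = signal.mean()
    return ''.join('1' if v > threshold else '0' for v in signal)

rng = np.random.default_rng(0)
epoch = rng.standard_normal(256)      # stand-in for one EEG epoch
binary_sequence = binarize(epoch)     # this string is what an LZC routine consumes
print(binary_sequence[:32] + "...")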
... Nonlinear dynamical features are used to describe the characteristics of nonlinear behavior within dynamical systems, and many scholars have applied them in the processing of real measured signals [6-9]. Commonly employed nonlinear dynamical features include Lempel-Ziv complexity (LZC) [10], the Lyapunov exponent [11], and entropy [12]. Specifically, LZC can characterize the rate at which new patterns emerge in a signal, whereas entropy quantifies the uncertainty of the signal. ...
... In this section, we conduct comparative experiments on the performance of DCN-TE using simulated signals, including its ability to capture dynamic changes and differentiate between various chaotic models. The comparison metrics include LZC [10], permutation entropy (PE) [31] and dispersion entropy (DE) [30]. Additionally, we also conduct a statistical analysis of the computational consumption of these metrics. ...
Preprint
Full-text available
In signal acquisition, various forms of noise interference are inevitably present, and the resulting nonlinear signals severely limit the applicability of traditional signal processing methods. To address this challenge, this study proposes a novel complexity measurement metric called dispersion complex network-transition entropy (DCN-TE), which integrates the concepts of complex networks and information entropy. Specifically, we use the single cumulative distribution function values as nodes and employ Markov chains to represent the links, thereby transforming the signal into a complex network with directional weights. Then, we assess both the significance of nodes and the links to compute the DCN-TE value, and combine it with classifiers for signal processing tasks. Subsequent experiments comprehensively evaluate the performance of DCN-TE using simulated chaotic models and real hydroacoustic signals. The results indicate that compared with Lempel-Ziv complexity, permutation entropy, and dispersion entropy, DCN-TE can more rapidly and accurately capture dynamic changes in signals. Importantly, DCN-TE also exhibits optimal performance in distinguishing between different categories of chaotic models, ships, and modulation signals, thereby demonstrating its significant potential in signal processing.
... We used the Lempel-Ziv complexity algorithm (59) to examine the complexity of a text as an additional indicator for creativity. This measure was initially developed for the purpose of lossless data compression; the modified Lempel-Ziv complexity evaluates the compressibility of a signal, which, in this instance, is a collection of text strings (rendered from a series of bytes). ...
Preprint
Full-text available
The recent surge in the capabilities of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLM creativity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. We found evidence suggesting that LLMs can indeed surpass human capabilities in specific creative tasks such as divergent association and creative writing. Our quantitative benchmarking framework opens up new paths for the development of more creative LLMs, but it also encourages more granular inquiries into the distinctive elements that constitute human inventive thought processes, compared to those that can be artificially generated.
... Introduced by Claude Shannon in 1948, Shannon entropy measures dataset uncertainty and guides data compression and cryptography [50]. Kolmogorov complexity, proposed by Andrei Nikolaevich Kolmogorov in 1963 and made practically estimable by the Lempel-Ziv complexity measure of 1976, quantifies the minimal bit-length required for sequence generation and reflects sequence complexity [51,52]. Higuchi's Hurst exponent, which measures long-range dependencies in time series, indicates persistence with values over 0.5, playing a crucial role in analyzing data patterns [53]. ...
Article
Electroencephalography (EEG) is essential for diagnosing neurological disorders such as epilepsy. This paper introduces a novel approach that employs the Allen-Cahn (AC) energy function for the extraction of nonlinear features. Drawing on the concept of multifractals, this method facilitates the acquisition of features across multi-scale. Features extracted by our method are combined with a support vector machine (SVM) to create the AC-SVM classifier. By incorporating additional measures such as Kolmogorov complexity, Shannon entropy, and Higuchi's Hurst exponent, we further developed the AC-MC-SVM classifier. Both classifiers demonstrate excellent performance in classifying epilepsy conditions. The AC-SVM classifier achieves 89.97% accuracy, 94.17% sensitivity, and 89.95% specificity, while the AC-MC-SVM reaches 97.19%, 97.96%, and 94.61%, respectively. Furthermore, our proposed method significantly reduces computational costs and demonstrates substantial potential as a tool for analyzing medical signals.
... While we acknowledge here and elsewhere (below) that our choice of algorithmic complexity is motivated by a similarity in spirit and mathematical heritage, we feel that this approach, given the relative novelty of connecting the FEP with theoretical linguistics, is suitably noncommittal with respect to which algorithmic complexity measure is ultimately going to show the most direct sympathy with syntactic derivational architecture, and below we will discuss some other possible measures. Searching the structure from top to bottom, identifying each branching node and its elements (e.g., inputting [α λ α β γ δ ε]), we used a Lempel-Ziv implementation (Faul, 2021) of the classical Kolmogorov complexity algorithm (Kaspar & Schuster, 1987; Lempel & Ziv, 1976) to measure the number of unique sub-patterns when scanning the string of compiled nodes. This Lempel-Ziv algorithm computes a Kolmogorov complexity estimate derived from a limited programming language that permits only copy and insertion in strings (Kaspar & Schuster, 1987). ...
Article
Full-text available
Natural language syntax yields an unbounded array of hierarchically structured expressions. We claim that these are used in the service of active inference in accord with the free-energy principle (FEP). While conceptual advances alongside modelling and simulation work have attempted to connect speech segmentation and linguistic communication with the FEP, we extend this program to the underlying computations responsible for generating syntactic objects. We argue that recently proposed principles of economy in language design—such as “minimal search” criteria from theoretical syntax—adhere to the FEP. This affords a greater degree of explanatory power to the FEP—with respect to higher language functions—and offers linguistics a grounding in first principles with respect to computability. While we mostly focus on building new principled conceptual relations between syntax and the FEP, we also show through a sample of preliminary examples how both tree-geometric depth and a Kolmogorov complexity estimate (recruiting a Lempel–Ziv compression algorithm) can be used to accurately predict legal operations on syntactic workspaces, directly in line with formulations of variational free energy minimization. This is used to motivate a general principle of language design that we term Turing–Chomsky Compression (TCC). We use TCC to align concerns of linguists with the normative account of self-organization furnished by the FEP, by marshalling evidence from theoretical linguistics and psycholinguistics to ground core principles of efficient syntactic computation within active inference.
... A higher permutation entropy value indicates greater complexity or irregularity in the time series, whereas a lower value suggests more regular and predictable behavior. Note that other metrics can also be applied to a symbolic time series to obtain complexity estimates, for instance, Lempel-Ziv, which measures data compressibility and often yields similar results to entropy-based measures (Lempel and Ziv, 1976; Mateos et al., 2020; Pascovich et al., 2022). [Figure 2 caption: Measuring neural complexity. Top left panel: an ECoG recording during wakefulness in a freely behaving rat; 5 seconds shown, one trace per cortical location.] ...
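As an illustration of the ordinal, symbol-based complexity estimate discussed here, the following sketch computes a Bandt-Pompe-style permutation entropy; the embedding order, delay, and base-2 normalization are choices made for this example, not parameters taken from the cited studies.

# Hedged sketch of permutation entropy: symbolize each length-m window by the
# ordinal pattern of its values, then take the Shannon entropy of the pattern
# distribution, normalized by log2(m!).
import math
from collections import Counter
import numpy as np

def permutation_entropy(x: np.ndarray, m: int = 3, tau: int = 1) -> float:
    patterns = Counter(
        tuple(np.argsort(x[i:i + m * tau:tau]))
        for i in range(len(x) - (m - 1) * tau)
    )
    total = sum(patterns.values())
    h = -sum((n / total) * math.log2(n / total) for n in patterns.values())
    return h / math.log2(math.factorial(m))   # normalized to [0, 1]

rng = np.random.default_rng(1)
print(round(permutation_entropy(rng.standard_normal(1000)), 3))     # irregular series: near 1
print(round(permutation_entropy(np.sin(np.arange(1000) / 5.0)), 3)) # regular series: lower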
Preprint
Full-text available
Understanding brain activity across multiple scales is essential for unraveling the complexities of neural function. From the macroscopic to the microscopic level, the brain exhibits diverse dynamics, shaped by billions of neurons and their intricate synaptic connections. However, navigating across these scales presents significant challenges due to technical and conceptual limitations. Complexity analysis provides a promising framework to address these challenges, offering insights into how neural activity spans across scales and how alterations, such as those induced by drugs, impact brain function. This article explores the use of complexity analysis to study brain activity, emphasizing its role in navigating neural scales and elucidating the intricate relationship between microscopic neuronal dynamics and macroscopic brain function. Through this perspective, we aim to foster a deeper understanding of brain complexity and its implications for neuroscience research.
... However, distinguishing the interbeat interval time series of different diseased and healthy states is not necessarily an easy task if only a single scale of the signal is considered. In addition, there is no clear relationship between the entropy-based regularity and the "complexity" (it should be noted that different "complexity" measures of random sequences were proposed in the literature: the linear complexity, the maximum-order complexity, the nonlinear complexity [9], the 2-adic complexity [10], the Lempel-Ziv (LZ) complexity [11,12] in which the signal is converted to a binary sequence by comparing the signal with threshold(s) determined by CG methods (the mean, the median, the midpoint, or the k-means), some variants of the LZ complexity such as its extension to the multiscale [13], the permutation LZ complexity [14] and its extensions to the multiscale [15,16], the dispersion LZ complexity and its extension to the multiscale analysis [17], the eigen complexity [18,19], the statistical complexity based on the Rényi entropy [20], and the T complexity). Moreover, some studies were conducted to define the relationship between complexity measures and Shannon entropy. ...
Article
Full-text available
In various applications, multiscale entropy (MSE) is often used as a feature to characterize the complexity of the signals in order to classify them. It consists of estimating the sample entropies (SEs) of the signal under study and its coarse-grained (CG) versions, where the CG process amounts to (1) filtering the signal with an average filter whose order is the scale and (2) decimating the filter output by a factor equal to the scale. In this paper, we propose to derive a new variant of the MSE. Its novelty stands in the way to get the sequences at different scales by avoiding distortions during the decimation step. To this end, a linear-phase or null-phase low-pass filter whose cutoff frequency is well suited to the scale is used. Interpretations on how the MSE behaves and illustrations with a sum of sinusoids, as well as white and pink noises, are given. Then, an application to detect attentional tunneling is presented. It shows the benefit of the new approach in terms of p value when one aims at differentiating the set of MSEs obtained in the attentional tunneling state from the set of MSEs obtained in the nominal state. It should be noted that CG versions can be replaced not only for the MSE but also for other variants.
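A minimal sketch of the standard coarse-graining step this abstract describes (an average filter of order equal to the scale, followed by decimation by the scale) is given below; it shows only the conventional CG procedure, not the low-pass-filter variant that the paper proposes.

# Sketch of conventional coarse-graining for multiscale entropy: averaging over
# non-overlapping windows of length `scale` is equivalent to an order-`scale`
# moving-average filter followed by decimation by `scale`.
import numpy as np

def coarse_grain(x: np.ndarray, scale: int) -> np.ndarray:
    n = (len(x) // scale) * scale
    return x[:n].reshape(-1, scale).mean(axis=1)

rng = np.random.default_rng(2)
white_noise = rng.standard_normal(10_000)
for scale in (1, 2, 5, 10):
    cg = coarse_grain(white_noise, scale)
    # for white noise the variance of the CG series shrinks with scale,
    # which is why its sample entropy drops across scales
    print(scale, len(cg), round(float(cg.std()), 3))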
... Finally, observe that finite concatenations of itineraries represent a quite regular (although usually nonperiodic) type of finite binary sequences. For example, their Lempel-Ziv complexity (known as a good measure of the repetitiveness of binary sequences, see [41]) is low in comparison with random finite sequences, but we will not prove that fact here. ...
Article
Full-text available
Modeling nerve cells can facilitate formulating hypotheses about their real behavior and improve understanding of their functioning. In this paper, we study a discrete neuron model introduced by Courbage et al. [Chaos 17, 043109 (2007)], where the originally piecewise linear function defining voltage dynamics is replaced by a cubic polynomial, with an additional parameter responsible for varying the slope. Showing that on a large subset of the multidimensional parameter space, the return map of the voltage dynamics is an expanding Lorenz map, we analyze both chaotic and periodic behavior of the system and describe the complexity of spiking patterns fired by a neuron. This is achieved by using and extending some results from the theory of Lorenz-like and expanding Lorenz mappings.
... In other words, if the amplitude after stimulation exceeds the 95th percentile of the baseline, we consider it as a responded activity to the stimulus. Responsivity and complexity are defined as PR_i(t) and the Lempel-Ziv complexity [39] of PR_i(t), respectively. Responsivity measures the amount of the significant responses, and complexity measures how complex the patterns of the responses are. ...
Article
Full-text available
Sickle cell disease (SCD) is a genetic disorder causing painful and unpredictable Vaso-occlusive crises (VOCs) through blood vessel blockages. In this study, we propose explosive synchronization (ES) as a novel approach to comprehend the hypersensitivity and occurrence of VOCs in the SCD brain network. We hypothesized that the accumulated disruptions in the brain network induced by SCD might lead to strengthened ES and hypersensitivity. We explored ES's relationship with patient reported outcome measures (PROMs) as well as VOCs by analyzing EEG data from 25 SCD patients and 18 matched controls. SCD patients exhibited lower alpha frequency than controls. SCD patients showed correlation between frequency disassortativity (FDA), an ES condition, and three important PROMs. Furthermore, stronger FDA was observed in SCD patients with a higher frequency of VOCs and EEG recording near VOC. We also conducted computational modeling on SCD brain network to study FDA's role in network sensitivity. Our model demonstrated that a stronger FDA could be linked to increased sensitivity and frequency of VOCs. This study establishes connections between SCD pain and the universal network mechanism, ES, offering a strong theoretical foundation. This understanding will aid predicting VOCs and refining pain management for SCD patients.
... the classical Lempel-Ziv measure [148] and the OP concept [12]. ...
Book
Full-text available
On the understanding and improvement of the ordinal methodologies applied to time series analysis and forecasting
... Note that a data compressor is software that reduces the size of digital data while preserving the essential information contained in them. The most famous algorithm, the Lempel-Ziv Algorithm (LZA) (Lempel and Ziv, 1976), was improved by Welch (1984). LZA counts the minimal number of distinct patterns in a given time series. ...
Article
Natural fluid flow systems exhibit turbulent and chaotic behavior that determines their high-level complexity. Chaos has an accurate mathematical definition, while turbulence is a property of fluid flow without an accurate mathematical definition. Using the Kolmogorov complexity (KC) and its derivative (KC spectrum), permutation entropy (PE), and Lyapunov exponent (LE), we considered how chaos affected the predictability of natural fluid flow systems. This paper applied these measures to investigate the turbulent, complex and chaotic behaviors of monthly streamflow of rivers from Bosnia and Herzegovina, the United States, and the Mendoza Basin (Argentina) and evaluated their time horizons using the Lyapunov time (LT). Based on the measures applied for river streamflow, we derived four modes of the interrelationship between turbulence, complexity, and chaos. Finally, using the measures, we clustered rivers with similar time horizons representing their predictability.
... The lifetime and probability of each state's occurrence, and three information-theoretic metrics - Lempel-Ziv Complexity (LZC), Block Decomposition Method of Complexity (BDMC), and transition entropy - were computed as previously reported [36] (Fig. 1). LZC and BDMC were computed on 4-bit binarized state timeseries via an LZC algorithm, LZ76 [37], and BDMC algorithms [38-40], respectively. ...
Preprint
Full-text available
One approach to addressing the immense unmet need for treatments of severe Opioid Use Disorder (sOUD) is to understand more about associated changes in the brain’s reward circuitry. It has been shown that during reward anticipation in the Monetary Incentive Delay (MID) task, people with severe substance use disorder (SUD) show blunted responses in reward neural circuitry compared with healthy controls (HC). Conversely, drug-related cues result in heightened responses in the same neural reward circuitry in those with SUD compared with HC. However, it is unclear how such dysfunctional reward processing is related to neural correlates of other processes commonly dysregulated in addiction, such as attention and cognition. The aim of this work was to evaluate whether people with sOUD show different spatiotemporal relationships between reward networks to cognitive and attentional networks. We collected fMRI data while people with sOUD receiving methadone (MD; n = 22) and HC (n = 22) completed the MID and Cue Reactivity tasks. We evaluated differences in functional connectivity (FC) and measures of brain state dynamics. We explored the relationship between FC to µ-Opioid receptor (MOR) and Dopamine D 2 Receptor (DRD2) availability due to their involvement in reward processing. During both the MID and Cue Reactivity tasks, MD participants showed significantly higher mutual information FC between regions in the reward network to those in attention and cognitive networks. We found significant, positive relationships between the higher FC in MD vs HC participants and the sum of MOR and D2 receptor availability during the Cue Reactivity task. In summary, the higher integration among reward, attentional, and cognitive networks in MD participants during both non-drug and drug-related tasks suggests that the relationship between these networks is dysregulated in addiction. These mechanistic insights provide alternative targets for treatment to improve sOUD outcomes.
... The estimation of entropy rate can be challenging since it is difficult to know the joint probability distribution of finite sequences in real-world data. Here, we introduce an estimation algorithm based on Lempel-Ziv data compression [39], which is known to rapidly converge to the real entropy rate of a time series. For a time series of length n, the entropy rate is estimated from the match lengths Λ_i, where Λ_i is the length of the shortest substring starting at position i which doesn't previously appear from position 1 to i − 1. ...
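The estimator itself is not reproduced in the excerpt. The sketch below uses a commonly cited Lempel-Ziv/Kontoyiannis-style form, n·log2(n) / Σ_i Λ_i, with Λ_i as defined above; the exact normalization in the cited work may differ, so this is an assumption for illustration.

# Hedged sketch of a Lempel-Ziv-based entropy-rate estimate (bits per symbol).
# Lambda_i is the length of the shortest substring starting at i that does not
# occur anywhere in positions 0..i-1; the n*log2(n)/sum(Lambda_i) form is one
# commonly used estimator and may not match the cited paper exactly.
import numpy as np

def lz_entropy_rate(bits) -> float:
    s = ''.join(str(int(b)) for b in bits)
    n = len(s)
    total = 0
    for i in range(n):
        prefix = s[:i]
        k = 1
        while i + k <= n and s[i:i + k] in prefix:
            k += 1
        total += k               # total accumulates Lambda_i
    return n * np.log2(n) / total

rng = np.random.default_rng(3)
print(round(lz_entropy_rate(rng.integers(0, 2, 512)), 2))     # i.i.d. bits: near 1 bit/symbol
print(round(lz_entropy_rate(np.tile([0, 1, 1, 0], 128)), 2))  # periodic series: much lower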
Article
Full-text available
Regularity is an important aspect of physical activity that can provide valuable insights into how individuals engage in physical activity over time. Accurate measurement of regularity not only advances our understanding of physical activity behavior but also facilitates the development of human activity modeling and forecasting. Furthermore, it can inform the design and implementation of tailored interventions to improve population health outcomes. In this paper, we aim to assess the regularity of physical activities through longitudinal sensor data, which reflects individuals’ all physical activities over an extended period. We explore three entropy models, including entropy rate, approximate entropy, and sample entropy, which can potentially offer a more comprehensive evaluation of physical activity regularity compared to metrics based solely on periodicity or stability. We propose a framework to validate the performance of entropy models on both synthesized and real-world physical activity data. The results indicate entropy rate is able to identify not only the magnitude and amount of noise but also macroscopic variations of physical activities, such as differences on duration and occurrence time. Simultaneously, entropy rate is highly correlated with the predictability of real-world samples, further highlighting its applicability in measuring human physical activity regularity. Leveraging entropy rate, we further investigate the regularity for 686 individuals. We find the composition of physical activities can partially explain the difference in regularity among individuals, and the majority of individuals exhibit temporal stability of regularity.
... In the realm of nonlinear features, several measures are commonly employed to extract the nonlinear characteristics of respiratory signals, including Central Tendency Measure (CTM), Lempel-Ziv complexity, and Approximate Entropy (ApEn) [49,51,52,97,98]. CTM quantifies the variability degree within a time series [99], Lempel-Ziv complexity measures complexity in finite sequences [100], and ApEn assesses the irregularity of a time series by assigning higher values to higher irregularity [101]. The three nonlinear features can essentially serve as representatives of the nonlinear characteristics of respiratory signals. ...
Article
Full-text available
Background and Objective: Sleep-disordered breathing (SDB) poses health risks linked to hypertension, cardiovascular disease, and diabetes. However, the time-consuming and costly standard diagnostic method, polysomnography (PSG), limits its wide adoption and leads to underdiagnosis. To tackle this, cost-effective algorithms using single-lead signals (like respiratory, blood oxygen, and electrocardiogram) have emerged. Despite respiratory signals being preferred for SDB assessment, a lack of comprehensive reviews addressing their algorithmic scope and performance persists. This paper systematically reviews 2012-2022 literature, covering signal sources, processing, feature extraction, classification, and application, aiming to bridge this gap and provide future research references. Methods: This systematic review followed the registered PROSPERO protocol (CRDXXXXXXX), initially screening 342 papers, with 32 studies meeting data extraction criteria. Results: Respiratory signal sources include nasal airflow (NAF), oronasal airflow (OAF), and respiratory movement-related signals such as thoracic respiratory effort (TRE) and abdominal respiratory effort (ARE). Classification techniques include threshold rule-based methods (8), machine learning (ML) models (13), and deep learning (DL) models (11). The NAF-based algorithm achieved the highest average accuracy at 94.11%, surpassing 78.19% for other signals. Hypopnea detection sensitivity with single-source respiratory signals remained modest, peaking at 73.34%. The TRE and ARE signals proved to be reliable in identifying different types of SDB because distinct respiratory disorders exhibited different patterns of chest and abdominal motion. Conclusions: Multiple detection algorithms have been widely applied for SDB detection, and their accuracy is closely related to factors such as signal source, signal processing, feature selection, and model selection.
... Lower LZC and ER values indicate greater regularity and simplicity in time series. These metrics are utilised to characterise signal complexity in various domains, including EEG, ECG, speech, and music [61][62][63]. Table 7 displays the LZ complexity and ER values of various time series. These measures assess the level of randomness and predictability of the electrical spiking signals produced by the hybrids of proteinoids and ZnO colloids. ...
Article
Full-text available
We are studying the remarkable electrical properties of Proteinoids-ZnO microspheres with the aim of exploring their potential for a new form of computing. Our research has revealed that these microspheres exhibit behavior similar to neurons, generating electrical spikes that resemble action potentials. Through our investigations, we have studied the underlying mechanism behind this electrical activity and proposed that the spikes arise from oscillations between the degradation and reorganization of proteinoid molecules on the surface of ZnO. These findings offer valuable insights into the potential use of Proteinoids-ZnO colloids in unconventional computing and the development of novel neuromorphic liquid circuits.
... Along these lines, promising research has been done on correlating measures of entropy with various kinds of consciousness using Lempel-Ziv complexity (LZc) (Lempel and Ziv, 1976), which quantifies the rate at which non-redundant patterns appear in an EEG signal in time (Boncompte et al., 2021), normalized from 0 to 1, where 0 indicates complete redundancy and 1 indicates no redundancy. The pattern we have largely observed is that average LZc increases across different levels of consciousness: anesthesia-induced sleep results in low LZc (Zhang et al., 2001), dreaming in higher LZc, and full wakefulness in even higher LZc. ...
Preprint
Full-text available
Common mental health pathologies such as depression, anxiety, addiction, and PTSD have recently seen treatment inroads. These issues share a phenomenological core—a dissociative quality that involves the disintegration of self and other. Interestingly, some of the most effective treatments of these pathologies are themselves acutely dissociative. For example, psychedelic therapy has been effective at treating these pathologies and involves highly dissociative altered states. This pattern holds across other dissociative methods such as hypnosis, CBT, and meditation, leading some to hypothesize that there is a common pathway from acutely altered states to long-term treatment of conditions that involve pathological dissociation. Among the proposed mechanisms are pivotal mental states, the entropic brain, REBUS, and pattern breaking. In this paper, I highlight the methods and mechanisms behind the observed clinical efficacy in treating pathologically dissociative mental health issues. I also propose a simplified underlying structure and an experimental approach that could result in effective treatment without complicated pharmacological interventions.
... Finally, the Lempel-Ziv complexity (Lempel and Ziv, 1976) is a measure of the edge of chaos phase transitions (O'Byrne and Jerbi, 2022) indexing complexity and is inversely related to the compressibility of a string of symbols (the temporal sequence of microstates in our case). ...
Article
Full-text available
Background The investigation of mindfulness meditation practice, classically divided into focused attention meditation (FAM), and open monitoring meditation (OMM) styles, has seen a long tradition of theoretical, affective, neurophysiological and clinical studies. In particular, the high temporal resolution of magnetoencephalography (MEG) or electroencephalography (EEG) has been exploited to fill the gap between the personal experience of meditation practice and its neural correlates. Mounting evidence, in fact, shows that human brain activity is highly dynamic, transiting between different brain states (microstates). In this study, we aimed at exploring MEG microstates at source-level during FAM, OMM and in the resting state, as well as the complexity and criticality of dynamic transitions between microstates. Methods Ten right-handed Theravada Buddhist monks with a meditative expertise of minimum 2,265 h participated in the experiment. MEG data were acquired during a randomized block design task (6 min FAM, 6 min OMM, with each meditative block preceded and followed by 3 min resting state). Source reconstruction was performed using eLORETA on individual cortical space, and then parcellated according to the Human Connect Project atlas. Microstate analysis was then applied to parcel level signals in order to derive microstate topographies and indices. In addition, from microstate sequences, the Hurst exponent and the Lempel-Ziv complexity (LZC) were computed. Results Our results show that the coverage and occurrence of specific microstates are modulated either by being in a meditative state or by performing a specific meditation style. Hurst exponent values in both meditation conditions are reduced with respect to the value observed during rest, LZC shows significant differences between OMM, FAM, and REST, with a progressive increase from REST to FAM to OMM. Discussion Importantly, we report changes in brain criticality indices during meditation and between meditation styles, in line with a state-like effect of meditation on cognitive performance. In line with previous reports, we suggest that the change in cognitive state experienced in meditation is paralleled by a shift with respect to critical points in brain dynamics.
... Because the Kolmogorov complexity is not computable, lossless compression algorithms have been employed to approximate an upper bound on the degree of complexity. We applied the Lempel-Ziv algorithm, defined as the number of different substrings encountered as the string is viewed from beginning to the end by the computer program (Lempel & Ziv, 1976;Ziv & Lempel, 1977). ...
Article
Full-text available
Of the four interrelated concepts in the title, only symmetry has an exact mathematical definition. In mathematical development, symmetry is a graded variable, in marked contrast with the popular binary conception of symmetry in and out of the laboratory (i.e. an object is either symmetrical or nonsymmetrical). Because the notion does not have a direct graded perceptual counterpart (experimental participants are not asked about the amount of symmetry of an object), students of symmetry have taken various detours to characterize the perceptual effects of symmetry. Current approaches have been informed by information theory, mathematical group theory, randomness research, and complexity. Apart from reviewing the development of the main approaches, for the first time we calculated associations between figural goodness as measured in the Garner tradition and measures of algorithmic complexity and randomness developed in recent research. We offer novel ideas and analyses by way of integrating the various approaches.
Article
Introduction: Acquisition of a deeper understanding of microvascular function across physiological and pathological conditions can be complicated by poor accessibility of the vascular networks and the necessary sophistication or intrusiveness of the equipment needed to acquire meaningful data. Laser Doppler fluximetry (LDF) provides a mechanism wherein investigators can readily acquire large amounts of data with minor inconvenience for the subject. However, beyond fairly basic analyses of erythrocyte perfusion (fluximetry) data within the cutaneous microcirculation (i.e., perfusion at rest and following imposed challenges), a deeper understanding of microvascular perfusion requires a more sophisticated approach that can be challenging for many investigators. Methods: This manuscript provides investigators with clear guidance for data acquisition from human subjects for full analysis of fluximetry data, including levels of perfusion, single- and multiscale Lempel-Ziv complexity (LZC) and sample entropy (SampEn), and wavelet-based analyses for the major physiological components of the signal. Representative data and responses are presented from a recruited cohort of healthy volunteers, and computer codes for full data analysis (MATLAB) are provided to facilitate efforts by interested investigators. Conclusion: It is anticipated that these materials can reduce the challenge to investigators integrating these approaches into their research programs and facilitate translational research in cardiovascular science.
Preprint
Full-text available
As sequencing becomes more accessible, there is an acute need for novel compression methods to efficiently store this data. Omics technologies can enhance biomedical research and individualize patient care, but they demand immense storage capabilities, especially when applied to longitudinal studies. Addressing the storage challenges posed by these technologies is crucial for omics technologies to achieve their full potential. We present a novel lossless, reference-free compression algorithm, GeneSqueeze, that leverages the patterns inherent in the underlying components of FASTQ files (i.e., nucleotide sequences, quality scores and read identifiers). GeneSqueeze provides several benefits, including an auto-tuning compression protocol based on each sample's distribution, lossless preservation of IUPAC nucleotides and read identifiers, and unrestricted FASTQ/A file attributes (i.e., read length, read depth, or read identifier format). We compared GeneSqueeze to the general-purpose compressor, gzip, and to the domain-specific compressor, SPRING. GeneSqueeze achieved up to three times higher compression ratios as compared to gzip, regardless of read length, read depth, or file size. GeneSqueeze achieved 100% lossless compression, with the original and decompressed files perfectly matching for all tested samples, preserving read identifiers, quality scores, and IUPAC nucleotides, in contrast to SPRING. Overall, GeneSqueeze represents a competitive and specialized compression method optimized for FASTQ/A files containing nucleotide sequences that has the potential to significantly reduce the storage and transmission costs associated with large omics datasets without sacrificing data integrity.
Preprint
Full-text available
In the search for EEG markers of human consciousness, alpha power has long been considered a reliable marker which is fundamental for the assessment of unresponsive patients from all etiologies. However, recent evidence questioned the role of alpha power as a marker of consciousness and proposed the spectral exponent and spatial gradient as more robust and generalizable indexes. In this study, we analyzed a large-scale dataset of 260 unresponsive patients and investigated etiology-specific markers of level of consciousness, responsiveness and capacity to recover. We compare a set of candidate EEG markers: 1) absolute, relative and flattened alpha power; 2) spatial ratios; 3) the spectral exponent; and 4) signal complexity. Our results support the claim that alpha power is an etiology-specific marker, which has higher diagnostic value for anoxic patients. Meanwhile, the spectral slope showed diagnostic value for non-anoxic patients only. Changes in relative power and signal complexity were largely attributable to changes in the spectral slope. Grouping unresponsive patients from different etiologies together can confound or obscure the diagnostic value of different EEG markers of consciousness. Our study highlights the importance of analyzing different etiologies independently and emphasizes the need to develop clinical markers which better account for inter-individual and etiology-dependent differences.
Article
Lempel-Ziv complexity indicator (LZCI), as one of the complexity indicators, is effectively used for identifying the bearing fault severity due to its own advantages. However, it has suffered from issues of low computational efficiency and inaccurate coding during the calculation process. To address these problems, this paper proposed a fast and accurate LZCI for the recognition of bearing fault severity. The proposed method focused on enhancing the computational efficiency of LZCI by incorporating a data compression algorithm. This compression process not only improved the computational efficiency of LZCI, but also preserved the original shape of the signal. Then, the compressed signal was subjected to multiscale encoding. Finally, the Lempel-Ziv complexity value (LZCV) was calculated and standardized to reduce the influence of data length on LZCV. The proposed method was verified by two single-point fault datasets and two life-cycle datasets. The results showed that the data compression method can effectively compress data while maintaining the original shape, and the proposed method can improve the calculation efficiency and accuracy of LZCI.
Article
Full-text available
Nonoscillatory measures of brain activity such as the spectral slope and Lempel–Ziv complexity are affected by many neurological disorders and modulated by sleep. A multitude of frequency ranges, particularly a broadband (encompassing the full spectrum) and a narrowband approach, have been used especially for estimating the spectral slope. However, the effects of choosing different frequency ranges have not yet been explored in detail. Here, we evaluated the impact of sleep stage and task engagement (resting, attention, and memory) on slope and complexity in a narrowband (30–45 Hz) and broadband (1–45 Hz) frequency range in 28 healthy male human subjects (21.54 ± 1.90 years) using a within-subject design over 2 weeks with three recording nights and days per subject. We strived to determine how different brain states and frequency ranges affect slope and complexity and how the two measures perform in comparison. In the broadband range, the slope steepened, and complexity decreased continuously from wakefulness to N3 sleep. REM sleep, however, was best discriminated by the narrowband slope. Importantly, slope and complexity also differed between tasks during wakefulness. While narrowband complexity decreased with task engagement, the slope flattened in both frequency ranges. Interestingly, only the narrowband slope was positively correlated with task performance. Our results show that slope and complexity are sensitive indices of brain state variations during wakefulness and sleep. However, the spectral slope yields more information and could be used for a greater variety of research questions than Lempel–Ziv complexity, especially when a narrowband frequency range is used.
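As a minimal illustration of the band-dependent slope estimate discussed in this abstract (not the study's actual pipeline), the sketch below fits a straight line to log10(power) versus log10(frequency) from a Welch periodogram within a chosen band, e.g., broadband 1-45 Hz or narrowband 30-45 Hz; the PSD settings, the surrogate signal, and the plain least-squares fit are simplifying assumptions.

# Hedged sketch: spectral slope as the slope of a log-log linear fit to the PSD
# within a frequency band. Welch parameters and the surrogate signal are
# illustrative choices, not those of the cited study.
import numpy as np
from scipy.signal import welch

def spectral_slope(x: np.ndarray, fs: float, fmin: float, fmax: float) -> float:
    freqs, psd = welch(x, fs=fs, nperseg=int(4 * fs))
    band = (freqs >= fmin) & (freqs <= fmax)
    slope, _intercept = np.polyfit(np.log10(freqs[band]), np.log10(psd[band]), 1)
    return slope

fs = 250.0
rng = np.random.default_rng(4)
signal = np.cumsum(rng.standard_normal(int(60 * fs)))   # Brownian surrogate, slope near -2
print(round(spectral_slope(signal, fs, 1.0, 45.0), 2))   # broadband 1-45 Hz
print(round(spectral_slope(signal, fs, 30.0, 45.0), 2))  # narrowband 30-45 Hz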
Chapter
Autism spectrum disorder is an increasingly prevalent and debilitating neurodevelopmental condition and an electroencephalogram (EEG) diagnostic challenge. Despite large amounts of electrophysiological research over many decades, an EEG biomarker for autism spectrum disorder (ASD) has not been found. We hypothesized that reductions in complex dynamical system behaviour in the human central nervous system as part of the macroscale neuronal function during cognitive processes might be detectable in whole EEG for higher-risk ASD adults. In three studies, we compared the medians of correlation dimension, largest Lyapunov exponent, Higuchi’s fractal dimension, multiscale entropy, multifractal detrended fluctuation analysis and Kolmogorov complexity during resting, cognitive and social skill tasks in 20 EEG channels of 39 adults over a range of ASD risk. We found heterogeneous complexity distribution with clusters of hierarchical sequences pointing to potential cognitive processing differences, but no clear distinction based on ASD risk. We suggest that there is indication of statistically significant differences between complexity measures of brain states and tasks. Though replication of our studies is needed with a larger sample, we believe that our electrophysiological and analytic approach has potential as a biomarker for earlier ASD diagnosis.
Chapter
We give algorithms that, given a straight-line program (SLP) with g rules that generates (only) a text T[1..n], build within O(g) space the Lempel-Ziv (LZ) parse of T (of z phrases) in time O(n log² n) or in time O(gz log²(n/z)). We also show how to build a locally consistent grammar (LCG) of optimal size g_lc = O(δ log(n/δ)) from the SLP within O(g + g_lc) space and in O(n log g) time, where δ is the substring complexity measure of T. Finally, we show how to build the LZ parse of T from such an LCG within O(g_lc) space and in time O(z log² n log²(n/z)). All our results hold with high probability.
Chapter
We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure δ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are crucial in areas like Bioinformatics). The extension, called iterated SLPs (ISLPs), allows rules of the form A → Π_{i=k1..k2} B_1^(i^c1) ··· B_t^(i^ct), for which we show how to extract any substring of length λ, from the represented text T[1..n], in time O(λ + log² n log log n). This is the first compressed representation for repetitive texts breaking δ while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. As a byproduct, we extend Ganardi et al.'s technique to balance any SLP (so it has a derivation tree of logarithmic height) to a wide generalization of SLPs, including ISLPs.
Article
We present a no-go theorem for the distinguishability between quantum random numbers (i.e., random numbers generated quantum mechanically) and pseudorandom numbers (i.e., random numbers generated algorithmically). The theorem states that one cannot distinguish these two types of random numbers if the quantum random numbers are efficiently classically simulatable and the randomness measure used for the distinction is efficiently computable. We derive this theorem using the properties of cryptographic pseudorandom number generators, which are believed to exist in the field of cryptography. Our theorem is found to be consistent with the analyses on the actual data of quantum random numbers generated by IBM Quantum and also those obtained in the Innsbruck experiment for the Bell test, where the degrees of randomness of these two sets of quantum random numbers turn out to be essentially indistinguishable from those of the corresponding pseudorandom numbers. Previous observations on the algorithmic randomness of quantum random numbers are also discussed and reinterpreted in terms of our theorems and data analyses.
Article
Full-text available
Developing a drug or particular immunotherapy medication for a worldwide epidemic illness caused by viruses (current pandemic) necessitates comprehensive evaluation and annotation of the metagenomic datasets to filter nucleotide sequences quickly and efficiently. Because of the homologs' origin of aligning sequences, space complexity, and time complexity of the analyzing system, traditional sequence alignment procedures are unsuccessful. This necessitates employing an alignment-free sequencing approach in this research that solves the foregoing issue. We suggest a distance function that compresses performance metrics for automatically identifying short nucleotide sequences used by SARS coronavirus variants to identify critical features in genetic markers and genomic structure. This method provides easy recognition of data compressed by using a set of mathematical and computational tools in the study. We also show that by using our suggested technique to examine extremely short regions of nucleotide sequences, we can differentiate SARS-CoV-2 from SARS-CoV-1 viruses. Later, the Lipinski descriptor (rule of 5) was used to predict the drug-likeness of the target protein in SARS-CoV-2. A regression model using random forest was created to validate the machine learning model for computational analysis. This work was furthered by comparing the regressor model to other machine learning models using lazypredict, allowing scientists to swiftly and accurately identify and describe the SARS coronavirus strains.
Article
Corpus-based contrastive studies have successfully addressed the empirical study of crosslinguistic similarities and differences and may also contribute to understanding complexity across languages. This paper aims at (dis)proving whether the Spanish subjunctive mood shows greater complexity than its English correspondences as translations or sources of the Spanish subjunctive forms. It also explores a trade-off between language levels, i.e., whether higher morphological complexity is linked to syntactic and lexical complexity. The data come from a bidirectional English-Spanish corpus, and the word-alignment-based metric system has been used to quantify morphological complexity. Syntactic and lexical complexity were also investigated and computed using tests of statistical significance. Results corroborate that Spanish presents higher complexity in this verbal area in all three levels, morphological, syntactic, and lexical: the hypothesized trade-off between higher morphological complexity on the one hand, and lower syntactic and lexical complexity on the other is not validated by our data.
Preprint
Full-text available
Transcranial magnetic stimulation (TMS) is a frequently used intervention for brain modulation with highly promising scientific and therapeutic applications. Two shortcomings of TMS applications, however, are the high within-subject and between-subjects variability in response to stimulation, which undermine the robustness and reproducibility of results. A possible solution is to optimize individual responses to TMS by exploiting rapidly fluctuating state variables such as the phase and power of neural oscillations. However, there is widespread uncertainty concerning the appropriate frequency and/or phase to target. Here, we evaluate two different approaches which do not require a choice of frequency or phase but instead utilize properties of the broadband EEG signal to predict corticospinal excitability (CSE). Our results suggest that the spectral exponent (i.e., the steepness of the EEG 1/f background or aperiodic component) and the entropy or “complexity” of the EEG signal are both useful predictors of CSE above and beyond band-limited features, and may be deployed in brain state-dependent TMS applications.
Article
In a class of periodically driven systems, multifractal states in nonequilibrium conditions and robustness of dynamical localization when the driving is made aperiodic have received considerable attention. In this paper, we explore a family of one-dimensional Aubry-André-Harper models that are quasiperiodically kicked with protocols following different binary quasiperiodic sequences, which can be realized in ultracold atom systems. The relationship between the systems' localization properties and the sequences' mathematical features is established utilizing the Floquet theorem and the Baker-Campbell-Hausdorff formula. We investigate the multifractality and prethermalization of the eigenstates of the unitary evolution operator combined with an analysis of the transport properties of initially localized wave packets. We further contend that the quasiperiodically kicked Aubry-André-Harper model provides as rich a phase diagram as the periodic case but also brings the range of parameters needed to observe multifractal states and prethermalization into a regime more amenable to experiments.
Article
We propose a variant of the Kolmogorov concept of complexity which yields a common theory of finite and infinite random sequences. Processes are sequential coding schemes such as prefix coding schemes. The process complexity is the minimal length of the description of a sequence in a sequential coding scheme. The process complexity does not oscillate. We establish some concepts of effective tests which are proved to be equivalent.
Article
Kolmogorov has defined the conditional complexity of an object y when the object x is already given to us as the minimal length of a binary program which by means of x computes y on a certain asymptotically optimal machine. On the basis of this definition he has proposed to consider those elements of a given large finite population to be random whose complexity is maximal. Almost all elements of the population have a complexity which is close to the maximal value. In this paper it is shown that the random elements as defined by Kolmogorov possess all conceivable statistical properties of randomness. They can equivalently be considered as the elements which withstand a certain universal stochasticity test. The definition is extended to infinite binary sequences and it is shown that the non-random sequences form a maximal constructive null set. Finally, the Kollektivs introduced by von Mises obtain a definition which seems to satisfy all intuitive requirements.