Preprint

Multifractal test for nonlinearity of interactions
across scales in time series
Damian G. Kelty-Stephen1, Elizabeth Lane2, Lauren Bloomfield3, and Madhur Mangalam4
1Department of Psychology, State University of New York-New Paltz, New Paltz, NY, USA
2Department of Psychiatry, University of California-San Diego, San Diego, CA, USA
3Department of Psychology, Grinnell College, Grinnell, IA, USA
4Department of Physical Therapy, Movement and Rehabilitation Sciences, Northeastern
University, Boston, MA, USA
ORCIDs:
Damian G. Kelty-Stephen (0000-0001-7332-8486)
Madhur Mangalam (0000-0001-6369-0414)
E-mails: keltystd@newpaltz.edu; m.manglam@northeastern.edu
Abstract
The creativity and emergence of biological and psychological behavior tend to be
nonlinear, and correspondingly, biological and psychological measures contain degrees of
irregularity. The linear model might fail to reduce these measurements to a sum of
independent random factors (yielding a stable mean for the measurement), implying
nonlinear changes over time. The present work reviews some of the concepts implicated
in nonlinear changes over time and details the mathematical steps involved in their
identification. It introduces multifractality as a mathematical framework helpful in
determining whether and to what degree the measured series exhibits nonlinear changes
over time. These mathematical steps include multifractal analysis and surrogate data
production for resolving when multifractality entails nonlinear changes over time.
Ultimately, when measurements fail to fit the structures of the traditional linear model,
multifractal modeling allows making those nonlinear excursions explicit, that is, to come
up with a quantitative estimate of how strongly events may interact across timescales.
This estimate may serve some interests as merely a potentially statistically significant
indicator of independence failing to hold, but we suspect that this estimate might serve
more generally as a predictor of perceptuomotor or cognitive performance.
Keywords: fractal; Fourier; heterogeneity; long-range memory; Markov; multifractal
nonlinearity; non-Gaussian; self-report; surrogate testing
Introduction
We hope in this work to make the case that multifractal modeling is crucial for
psychological science—for theory no less than for data analysis. We will begin with a
familiar measure (i.e., self-report) and a more familiar feeling (i.e., wondering about
nonlinearity in data). Then, we will make the case that multifractality addresses what is
troubling about the familiar measure and elaborates on our scientific ability to act on the
familiar feeling.
“Do I want to understand the nonlinearity in my psychological measures?” is a
self-report measure readers may make at the outset of reading this manuscript.
Psychological sciences deal heavily with self-report as an often effective, expedient way
for a quick look at the underlying thought processes. Then again, self-report is no less a
critical filter on our own planning as researchers as we carry our research
work forward, no matter the project. For instance, “Am I planning the right study?” or
“Do I have the right measures?” are two critical self-report measures we can all relate to,
no matter the research domain. So let us acknowledge and explore the self-report
measure of whether each of us should want to explore nonlinearity in our psychological
measures. Most self-report measures in practice can prompt intriguing answers, and
though they do not always prompt the correct answer every time and across issues
(Jeong et al., 2018), they have a curious psychometric texture that may capture the
thought process in an interesting dynamic, which can be helpful in the proper context
(Baer, 2019).
The familiarity of the self-report hides the maybe alarming truth that self-report
poses severe challenges to our most familiar linear models. Long-lived psychological
measures like self-report exhibit a perfect storm of violated statistical assumptions to
prevent the linear model from linking any measures of cause and effect. Lurking below
the statistical concern of a well-behaved linear model is the more challenging question of
whether psychological causes and effects can be entirely linear. Do self-report measures
meet the necessary criteria for applying the linear models for assessing cause in the first
place? The answer is: in the long run, they do not. Perhaps they could meet those criteria
under contrived constraints when only looking at a small handful of self-reports.
However, for all we have known about the challenges posed by self-report measures, our
use of self-report measures has been anything but short-run: self-report measures have
persisted for over a century of psychological research (Baumeister et al., 2007).
Furthermore, if we consider how comfortable we may be consulting our feelings every
day and at every turn, it might feel somewhat surprising to realize that this most
intuitive and accessible kind of measure may be one of the least amenable to linear
modeling. Moreover, self-report is just the most accessible example of a more general
tangle of logic for psychological science making causal inference from its measurements.
In what follows, we aim towards a larger incompatibility: no matter the measurement,
psychological cause-and-effect relationships are time-sensitive, but linearly modeled
cause-and-effect relationships are not. Multifractal modeling is ready to fill this void with
a means to explicitly estimate the strength of interactions across time scales.
What a linear model needs in order to give us a valid test of cause
Mean, variance, and autocorrelation
Assuming that measures like self-report could support linear estimates of cause-and-
effect relationships, what would we need for a linear model? It is important to remember
the fundamental criteria for a linear model. It is ideal for linear modeling that measures
show: 1) stable mean, 2) stable variance, and 3) stable autocorrelation over time
(Lutkepohl, 2013; Mandic et al., 2008). These criteria often appear in the statistical
literature as the assumption of independent and identically distributed (i.i.d.) Gaussian
noise, with ‘independence’ implying the lack of memory or sequential structure and
‘identically distributed’ implying stable, similar variance. Measures follow a Gaussian
distribution when the very many constituent causes shaping them all add
together.
Where have we seen the autocorrelation? Three possibilities
Possibility 1: Autocorrelation encodes the past in terms of regular previous
intervals. The autocorrelation exists on the boundaries of many a statistical training—
psychologists doing factorial designs may never need it, and psychologists using time-
series designs may need it very often. So, a brief summary of possibilities is warranted.
First, the autocorrelation offers us a way to encode the correlation of a current
measurement with a previous measurement of the same process. The autocorrelation is
in effect a set of regression coefficients (i.e., indicating “autoregression”), one for each
possible time lag. Lag here means “how many previous measurements ago” with each
coefficient representing the relationship of current measurement with each past
measurement. So, for instance, let us say we are measuring response time (RT), and we
may realize that current-trial RT is positively related to RT on the previous two trials and
much more on the just-previous trial than on the trial before the just-previous one. That
would amount to, in this example, an autocorrelation with large positive lag-1 coefficient
and a smaller but still positive lag-2 coefficient. And perhaps, for the sake of the
example, we might imagine every third trial was somehow similar, priming one
another. That is, RT might decrease due to priming from three trials before, and this
relationship would manifest in a negative lag-3 coefficient.
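This lag-coefficient picture can be sketched numerically. Below is a minimal illustration, assuming Python with NumPy; the simulated RT series is hypothetical, an autoregressive process with weights chosen only to produce a large positive lag-1 coefficient and a smaller positive lag-2 coefficient:

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of series x at a given positive lag."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    # Correlate the series with itself shifted by `lag` samples,
    # normalized by the lag-0 (variance) term.
    return np.dot(xc[:-lag], xc[lag:]) / np.dot(xc, xc)

rng = np.random.default_rng(1)

# Hypothetical RT series: each trial carries over part of the previous
# two trials' RT plus fresh noise (an AR(2) process).
n = 5000
rt = np.zeros(n)
for t in range(2, n):
    rt[t] = 0.5 * rt[t - 1] + 0.2 * rt[t - 2] + rng.normal()

print(autocorr(rt, 1) > autocorr(rt, 2) > 0)  # large lag-1, smaller lag-2
```

For this toy process the theoretical lag-1 coefficient is 0.5/(1 − 0.2) ≈ 0.63 and the lag-2 coefficient ≈ 0.51; a negative lag-3 coefficient like the priming example would instead require a negative weight on the third-back trial.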
Possibility 2: Autocorrelation (or Fourier amplitudes) encodes the past in terms
of periodic cycles.
A second possibility is that the autocorrelation can encode regular cyclic change into
linear models. Indeed, some time-series approaches might avoid the autocorrelation
function by name, but they will often use the amplitude spectrum from the Fourier
transform. The amplitude spectrum (or for amplitude squared, the power spectrum)
encodes the size of the oscillations for a wide range of spectral frequencies (or inversely,
wavelength). Wavelength is just another way to specify lag: for instance, the
time it takes for a cycle to unfold indicates the time between similar rises and falls in the
measurement. It is thus no coincidence that the autocorrelation bears a one-to-one
relationship with the amplitude spectrum of the Fourier transform (Wiener, 1964).
Psychological examples of periodic cycles are circadian rhythms of wakefulness versus
rest, bouts of consumption/production as between famine and feast, oscillation of limbs
during entrainment to a metronomic or musical beat (e.g., Haken et al., 1985).
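That one-to-one relationship (the Wiener-Khinchin theorem) can be checked numerically in a few lines. This is a sketch assuming Python with NumPy, using the circular autocorrelation so that the identity holds up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=256)
n = len(x)

# Power spectrum: squared amplitudes from the Fourier transform.
power = np.abs(np.fft.fft(x)) ** 2

# Circular autocorrelation at every lag, computed directly in the time domain.
acf = np.array([np.dot(x, np.roll(x, -k)) for k in range(n)])

# The Fourier transform of the autocorrelation recovers the power spectrum.
print(np.allclose(np.fft.fft(acf).real, power))  # True
```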
The Fourier transform is an almost universally available description of a series.
Almost all series yield a Fourier series of amplitudes for oscillations of all possible
frequencies. This almost inevitable availability of a Fourier series reflects the fact that
oscillations are almost always present in measurements at some timescale (Bloomfield,
2004). One might not be able to estimate the Fourier series if the series had no
oscillations, that is, showed unbounded growth or decay, or complete stasis. However,
nonoscillatory series are
extremely rare and difficult to identify in practice. The lack of generative theory for
nonoscillatory processes goes hand in hand with the statistical problem of determining
when the measured oscillations might be reducible to different kinds of noise, for
example, whether measurement noise or noise modeled through moving-average
detrending (Dambros et al., 2019; Spencer-Smith, 1947; Wang et al., 2013). All
psychological or biological systems should recur and fluctuate (Riley & van Orden, 2005),
even when the simplest physical models are used to build theories of psychological processes
(Richardson, 1930). Hence, our measurements in psychology should be oscillatory and
thus have estimable Fourier transforms. The primary concern within the
psychological/biological domain is not the presence of oscillations but rather the
stationarity of these oscillatory modes (a question equivalent, given the one-to-one
relationship between the Fourier transform and the linear autocorrelation, to the
stationarity of the autocorrelation). A long-standing controversy about the applicability and interpretability
of Fourier modes is whether/how they converge (Bloomfield, 2004; Paley & Zygmund,
1930). It is much the same issue as whether a mean is stable for heavy-tailed probability
distribution functions: we can always calculate the arithmetic mean from a sample of
measurements, but the easily calculable mean of a heavy-tailed process may or may not
support the interpretation that the mean of a Gaussian or thin-tailed distribution would
(Richardson, 1926; Shlesinger et al., 1993). In the case of the Fourier then, the always-
(to-our-knowledge)-oscillatory psychological measures will always allow calculation of the
Fourier series, and the lingering questions are not about the existence of a frequency
domain but rather about the stability of the estimated amplitudes for the oscillatory
modes available to measurement (Singh et al., 2017).
Possibility 3: Linear models omitting autocorrelation assume zero memory
which is just zero autocorrelation. A third possibility of autocorrelation is an implicit
zero. When psychological research does not deal explicitly with autocorrelation, it is not a
denial of the role of autocorrelation but merely an implicit assumption of ‘no memory’ or
‘independence.’ We do not need to carry all those coefficients of the autocorrelation
function if we can just assume ‘no memory,’ that is, that all coefficients are zero. Another
way to think of zero memory is ‘white noise,’ a pattern of memoryless variability in which
measures oscillate at all timescales, and the magnitude of oscillations is comparable at
every timescale (Baxandall, 1968; Forgacs et al., 1971). Indeed, the always zero
autocorrelation function is mathematically equivalent to a set of sinusoidal oscillations
with similar amplitude for all oscillatory periods. In this sense, the standby assumption of
‘i.i.d. Gaussian’ variability is usually an assumption of additive white noise. White noise
itself is an elegant way to generate a Gaussian distribution, that is, to sum together
many sinusoidal oscillations of all available frequencies (or inversely, wavelengths) with
all uniform amplitudes (Pearson, 1905).
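That recipe for white noise can be sketched directly, assuming Python with NumPy; the series length and random seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1024
t = np.arange(n)

# Sum sinusoids at every available frequency, all with the same
# amplitude but with independent random phases.
freqs = np.arange(1, n // 2)
phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
x = sum(np.cos(2 * np.pi * f * t / n + p) for f, p in zip(freqs, phases))
x /= np.sqrt(len(freqs) / 2)  # each cosine contributes variance 1/2

# The sum is approximately Gaussian (many independent terms) and white:
# zero mean, unit variance, and near-zero lag-1 autocorrelation.
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(abs(x.mean()) < 1e-6, abs(x.var() - 1) < 1e-6, abs(lag1) < 0.1)  # True True True
```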
Nonstationarities and quick fixes to overcome them in linear modeling
Types of nonstationarities
The stationarities of mean, variance, and autocorrelation are each capable of failing alone.
Stationarity of mean could fail when there is a persistent trend in the mean, while
variance around that mean stays the same (Fig. 1A). Stationarity of variance could fail
separately from stationarity of mean: our self-report might reverberate differently across
time, all the while maintaining the long-run trait, much like the audio waveform of a
high-hat cymbal in a jazz drum solo, shimmering wildly into large positive and negative
micropascals of pressure at intervals but always centered around the zero marks in the
middle (Fig 1B). Similarly, the linear autocorrelation could be nonstationary without
necessarily changing the mean and variance (Fig. 1C). The schedule of events or
sequence of trials can shift without aggregate rise or fall and without change in
aggregate dispersion.
Quick fixes
If only one part of the measure is nonstationary, quick fixes can massage the measure
back into conformity with linear-modeling criteria. For instance, a series of response-time
measures can easily show a long sequence of quick responses or a bout of very long,
slow responses (Bills, 1927, 1931, 1935; Holden et al., 2009; Van Orden et al., 2003).
Logarithmic scaling is a quick and surefire way to bring the excursions of the mean into
polite restraint. Another example is the temporal structure of mouse-tracking trajectories
to study online decision processes. In this case, competing computational processes in
the computer’s operating system can produce rather spurious contingencies between
successive measurements of the mouse cursor over extremely brief timescales. Research
using mouse-tracking has developed elegant means of ‘time-normalizing,’ that is, coarse-
graining this measure and imposing, at a longer timescale, a more regular sequence on
poorly sequenced finer-scaled raw measurements (Kieslich et al., 2019).
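As a sketch of that log-scaling quick fix, assuming Python with NumPy and a hypothetical lognormal response-time series (the parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical response-time series: lognormal, so occasional very
# slow responses drag the raw mean and variance around.
rt = rng.lognormal(mean=-1.0, sigma=0.8, size=10000)

def skew(x):
    """Third standardized moment: near 0 for a symmetric distribution."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)

# Quick fix: logarithmic scaling pulls in the heavy right tail.
print(skew(rt) > 1)                 # raw series: strongly right-skewed
print(abs(skew(np.log(rt))) < 0.2)  # after the log: near-symmetric
```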
A perfect storm in our measurements: Failures to meet linear-modeling criteria
accrue and long-memory can compound the problem
Psychological experiences are unstable in the short term
Here, our ability to infer cause from familiar measurements implodes under the demands
of the linear model. Attempts to linearize a measurement series may only go so deep.
Failures to meet the required criteria can accrue, and the quick fixes may only uncover
persisting instability as in the case of self-report measures (Olthof, Hasselman, &
Lichtwarck-Aschoff, 2020). First, an ongoing series of self-report measures will exhibit
nonstationarity of the mean and variance. That is, the long-term changes in self-report
measures exhibit intermittent, irregular, and abrupt transitions between periods of
heightened self-report and periods of lowered self-report. Second, these transitions
can be so frequent that the valid prediction window ranges from 3 to 5 successive self-
report measures. So, despite all the expectable regular rhythms we might use to model
our self-reports, other less regular events might cause new variations. Nonstationarity
could be contagious, spreading from the mean to the autocorrelation function (Horvatic
et al., 2011).
Psychological experiences are unstable in the long term too
The strategy we mentioned of coarse-graining measures (e.g., in time-normalization)
reflects a more profound belief that fluctuations even out in the long run. This belief is a
core value of the linear model, with its expectations of stationarity. Indeed, we know
from cognitive psychology to beware of the ‘hot hand’ fallacy in which we might easily
see patterns where there are none in the long run (Gilovich et al., 1985). So, wary
cognitive psychologists may find it intuitive that, in the longer run, average behavior
starts showing less sequential variation and less sequential memory. However, the
complete opposite can emerge: when we coarse-grain our measurement series for
averages over longer time windows, our psychological-scientific measurement series can
show stronger sequential variations in the long run—or at least show sequential
variations resembling the shorter-run variations and that persist over a much longer
timescale than the premises of linear modeling suggest (Fig. 2).
The perhaps unintuitive possibility is that longer-term averages may be no more
stable than short-term averages, and the reason is 'long memory,' that is, initially short-
scale variability persisting across long scales.
Long-memory structure appears in self-report measures (Delignières et al., 2004;
Olthof, Hasselman, & Lichtwarck-Aschoff, 2020) as well as in various equally tried-and-
true psychological-scientific measures: response times, lexical decision, and word naming
(Gilden, 2001; Holden et al., 2009; Kello, 2013; Kello et al., 2008; Van Orden et al.,
2003). Long memory entails ‘fractal scaling’ (shortening the phrase ‘fractional scaling’)
which means that these measures exhibit an autocorrelation diminishing very slowly
with lag, that is, as a scale-invariant power function with a fractional exponent.
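To make this slow power-law decay concrete, here is a sketch, assuming Python with NumPy, that generates a long-memory series by spectral synthesis (power-law Fourier amplitudes with random phases, one standard construction) and checks that the autocorrelation dwindles with lag yet stays positive far out:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2 ** 14

# Spectral synthesis: Fourier amplitudes follow a power law over
# frequency; phases are random. 0 < beta < 1 yields long memory.
beta = 0.8
f = np.fft.rfftfreq(n)[1:]
spectrum = np.zeros(n // 2 + 1, dtype=complex)
spectrum[1:] = f ** (-beta / 2) * np.exp(1j * rng.uniform(0, 2 * np.pi, len(f)))
x = np.fft.irfft(spectrum, n=n)

def autocorr(x, lag):
    xc = x - np.mean(x)
    return np.dot(xc[:-lag], xc[lag:]) / np.dot(xc, xc)

# Unlike white noise, whose autocorrelation hovers near zero, this
# series stays correlated with itself even 100 samples back.
print(autocorr(x, 10) > autocorr(x, 100) > 0)  # True
```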
Memory in our measurements can mean that instability becomes permanent
This colorful long-memory might initially lull us into the idea that, for instance, more
memory could entail more predictability. However, we should clarify that here the
opposite can be true. Long-memory only means that variability grows similarly from
shorter to longer timescales. Now let us recall how grim our prospects are in that case: if
self-report measures show highly nonstationary mean and variance and a brief prediction
window, then long-memory suggests that what extends similarly over time is the
difficulty in predicting very far. Predictability falls off every 3 to 5 successive self-report
measures, as noted before (Olthof, Hasselman, & Lichtwarck-Aschoff, 2020), but then,
long-memory means that predictability suffers even further, repeatedly across time,
becoming ever more unfeasible as short-run instability echoes out across instabilities
over the longer run.
Psychological causes are time-sensitive, but strictly-linear causes are not
Readers on the fence about nonlinearity might find nothing new in these concerns. For
instance, we have long known about the instability of variance under the term
‘heteroscedasticity’ (Tabachnick & Fidell, 2007). If we have a quick fix, what is the harm?
However, our concern is that fitting our misbehaving measurements into a shape that
linear models will recognize may have diminishing returns. At a certain point, quick fixes
shaving off this or that nonstationarity ignore the possibility that these nonlinearities
reflect deeper truths about real causes underlying psychological processes. We can
squeeze our measurements into a shape that will fit a linear model only so long before
they cease to resemble the actual measured behavior.
Linear models freeze any effect of history
Linear modeling in psychology suffers from an incompatibility over time: psychological
processes are rooted in experiential past and arcing towards anticipated futures, and
linear modeling is ultimately time-insensitive. As elegant as linear modeling can be, it
always aims at the same additive and time-independent source of variance.
A question that immediately follows these considerations is: would the linearly-
modeled cause offered above even be recognizable to our knowledge of how cognition
works? Our own self-report would be ‘not necessarily.’ For instance, if our response-series
measured the time taken to read individual words in sequence (e.g., Wallot et al., 2014),
then the autocorrelation is an expression of how each word-reading time might depend
on previous words in the sequence of words having been read. Of course, it makes good
sense that we read each word considering the words before it. Moreover, long-memory
suggests that each current word-reading event always carries some distant effect of long,
long-past words. The power-law decay of autocorrelation in long-memory does mean that
the current values of our measure have a dwindling correlation to past values with larger
lags. Nonetheless, the scale-invariant shape of the power-law entails that the
autocorrelation dwindles without ever converging to zero. So, then, yes, we read each
word with a memory of how the story started.
But this linear-modeling story becomes more challenging when we consider how to interpret the
autocorrelation in our theories. The deep problem for strictly-linear psychology is that
linear autocorrelation entails a twofold stricture: first, the linearly-autocorrelated sensitivity
of the present to the experiential past never changes, and second, the linear autocorrelation
offers no larger context that might—at some point—redefine effects of past constituent
events. To the first part of the linear stricture, a stable autocorrelation is handy for
linear modeling, but trotting out the whole autocorrelation (N–1 coefficients for an N-
length measurement) to confirm causal modeling may be of limited theoretical
value (Gilden, 2009). The autocorrelation entails that the contribution of activity at past
lags is always independent (Fig. 3A). Here is where the interpretation for our word-
reading example may stop feeling recognizable: a stationary autocorrelation function
with long-memory implies that all past words matter, but they do so independently. That
is, the time that you, the reader, spent reading the word ‘independently’ at the end of
the last sentence depended on the time spent reading the words just before: ‘so,’ ‘do,’ ‘they,’ ‘but,’
‘matter,’ and so forth. But the catch is that phrases (or any event longer than a single
word) would have no causal force of their own—and here we catch a glimpse of the
context that we will need for any event to mean anything. For instance, there is no room
in a linear autocorrelation to indicate syntactic structure, such as the way the adjective ‘past’
might modify ‘words.’ Gone from the linearly-modeled account are regular features of reading.
Context matters, to risk stating the obvious. If a careless writer omitted any single
word, then the linear autocorrelation offers no phrasal context here to support the reader
in comprehending the meaning of each current word in the greater context of a narrative.
Here we have the second part of the linear stricture: the autocorrelation function is
definable only at one scale (e.g., the scale of individual words), and implicitly, that makes
impossible the use of one scale as context for processing information on another scale
(Kelty-Stephen & Wallot, 2017). And the lack of context or nesting of multiple scales is
unfortunate: to date, lexical priming research is clear that human readers depend on the
context at multiple different scales of experience (Troyer & McRae, 2021). Psychological
theory takes it for granted that intelligent use of information depends on
considering the information at one scale through a lens at another scale (Simon, 1969).
Minds grow and adapt; linear models do not.
Contextualized meaning is unavailable to linear modeling but may be available
to nonlinear modeling
What we take for granted could be explicit in our models. What we hope leads our
readers off the fence and into the rich potential of nonlinear modeling is precisely this
point: modeling nonlinearity might operationalize these interactions across scales, the
contextualizing of meaning. Indeed, we can agree that context matters in a broad class
of cases, but modeling nonlinearity offers the possibility that the contextualizations are
quantifiable—and testable for anyone doubting that they impact the measurement.
Interactions across timescales are not just ineffable truths but may find a quantitative
expression that can generalize and support formalism. The scaling relationships we have
noticed in the autocorrelations might not be just the coincidental sum of independently
estimable contributions (Fig. 3B). Instead, they might be shadows cast by a thoroughly
different nonlinear form of cause than articulated by linear modeling—one of causes
cascading across scale rather than hopping from one independent point in time to the
next. They might be control parameters governing or predicting the sequence of
psychological experiences as they evolve across time. Furthermore, this last possibility is
totally out of the reach of linear modeling: even the most hierarchical linear model will
fail to define a sequence if only because the linear model is time-symmetric and assumes
order does not matter. That is, linear-modeled context could not lead a process towards
any outcome it had not visited before.
What are our choices then? We see two major options. First, we could just give up
on the idea of contextualized cause because linear models are inapplicable. Or, second,
we could try to model the strength of nonlinearity in our measures. Our own self-report
measure is that the latter option feels more productive. Self-report measures are not
going away, and we have no intention of advocating a replacement. The same goes for
the psychological measures showing similar structure. In this vein, we have no interest in
recommending a wholesale jettisoning of standard psychological ontology or taxonomies.
We have no reason to doubt the reality of psychological experience as reported to or
shared with us by participants—whether through self-report or through any measurement
they consent to. For instance, the growth of multifractal modeling has not, in geophysics,
led to the proponents advocating the avoidance of traditional labels for multifractal
things, for example, ‘wave,’ ‘wind,’ ‘cloud,’ ‘storm,’ or ‘planet’ (Schertzer & Lovejoy,
2013). The only doubt we bring is for the analytical framework that we use to explain the
phenomenological reports. This manuscript is an invitation for all interested in facing up
to the limitations of the linear model’s portrayal of our field’s cherished measures—and in
hopping over the fence into the nonlinear territory.
Multifractal modeling allows probing nonlinear causal relationships across
scales
We hope that the preceding text might have convinced some of our readers that
‘nonlinearity’ is not simply a curiosity far afield from respectable, long-standing
psychological research. No, we hope that we might have shown that nonlinearity might
be rooted deep in our daily work. In this vein, we are indebted to Olthof et al.’s (2020)
work on self-report measures, which raises the concern without causing alarm. The rest
of the work aims to positively approach these issues, addressing the concepts of linear
and nonlinear changes in their own right. We will aim to nod to the psychological
examples but will try to keep this discussion on the logic and the math. We will discuss
the math through the example of rainfall. This example may be disappointingly
nonpsychological, but we use it for two reasons: (1) a nonpsychological example series
offers the concepts without cluttering the concepts with existing psychological theory.
Seeing a series that does not provide a comfortable home for our most cherished
intuitions can help us see the logic in conceptual terms. For instance, we can invent
mathematically helpful illustrations that may not make any sense for this or that specific
psychological example. (2) Rainfall works neatly with the logic of multifractal analysis.
Multifractal analysis involves ‘binning’ our measurement, that is, seeing how much of our
measurement falls into nonoverlapping subsets. Abstract as some of the following math
and concepts may be, we think it may be helpful to consider rainfall as an immensely
tangible example. For instance, the rain can fall into actual bins, and we do not have to
lean both into the abstraction of math and the abstraction of psychological theory.
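The binning idea can be previewed numerically. The toy example below, assuming Python with NumPy, builds a binomial multiplicative cascade (a textbook generator of multifractal structure, standing in for ‘rainfall’), bins it at several sizes, and estimates a mass exponent from the q-th moments of the bin proportions; the splitting weight 0.7 and the chosen bin sizes are arbitrary:

```python
import numpy as np

# Binomial multiplicative cascade: repeatedly split every bin's
# 'rainfall' unevenly (70% / 30%) into two sub-bins.
p = 0.7
x = np.ones(1)
for _ in range(12):
    x = np.concatenate([x * p, x * (1 - p)])

def mass_exponent(x, q, sizes=(4, 8, 16, 32, 64, 128)):
    """Slope of log sum(bin proportions ** q) against log bin size."""
    logs, logn = [], []
    for s in sizes:
        # Proportion of the total measure falling in each bin of s samples.
        prop = x.reshape(-1, s).sum(axis=1) / x.sum()
        logs.append(np.log(np.sum(prop ** q)))
        logn.append(np.log(s / len(x)))
    return np.polyfit(logn, logs, 1)[0]

# For a monofractal (e.g., a uniform measure), the exponent equals
# q - 1; the cascade's exponent at q = 2 falls visibly short of 1.
print(round(mass_exponent(x, 2), 2))  # 0.79, short of the monofractal value 1
```

Full multifractal analysis sweeps a range of q values and reads the curvature of the resulting exponents; this sketch shows only the binning-and-moments core of that procedure.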
The following text is a conceptual description of the logic, along with some
mathematical detail. Readers who find the conceptual treatment incompletely satisfying
may refer to our Supplementary Material examining the multifractality of a small handful
of speech audio waveforms, one of human speech and two of text-to-speech (TTS)
synthesizers reading a brief text (Fig. 4). All raw series and code are attached for
replicating the process.
Nonlinearity: Not simply curviness but a failure to reduce to a sum
The analysis of linear or nonlinear changes over time evaluates whether or not the
measured series can be effectively modeled as a sum of independent random factors
(Fig. 3A). Nonlinear changes can mean that the series is not well-modeled as merely a
sum. The question of whether it is a sum is a foundational mathematical issue and goes
deeper than the relatively superficial question of whether changes over time ‘look’ linear
to the eye. Linear changes over time can include curvilinearity; some curvilinear
trajectories are perfectly compatible with modeling the series as a sum. Time-series with
peaks and valleys can invite a polynomial model. Psychological examples of polynomial
structure include a quadratic serial-position curve, with a greater likelihood of recall for
the earlier and the later items in a list (Ranjith, 2012), or a linear decay of maze-
completion errors by Tolman’s rats once they were given a reward (Tolman & Honzik,
1930). Polynomial models include linear effects of time and several powers of the time
(e.g., quadratic and cubic for second and third powers). Critically, polynomials are linear
in the growth parameters, or sums after all—sums of integer exponents of time (e.g.,
linear growth is proportional to ‘time,’ which is just time raised to the power of 1, i.e.,
time^1; quadratic profiles are just the sum of ‘time^1’ with ‘time squared,’ which is just
time raised to the power of 2, i.e., ‘time^2’; cubic profiles are ‘time^3 + time^2 + time^1,’
and so forth). Choosing to test nonlinearity as a failure to reduce to a sum or a linear
model is mathematically deeper than just eyeballing the plots. That is, a curvy plot of
data points over time may still reduce to a sum of independent factors, but ‘seeming
linear’ does not guarantee that it is actually linear. The linear model (i.e., summing parts)
can make very many seemingly linear changes with time, but it can also produce very
many curvilinear profiles. So, it is important to distinguish between ‘seeming linear’ and
being a linear sum.
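To make concrete that a curvy trajectory can still be a linear model, consider a minimal sketch (hypothetical data, assuming numpy): a quadratic profile fit by ordinary least squares, where the design matrix is simply a sum of integer powers of time.

```python
import numpy as np

# Hypothetical curvilinear series: a quadratic trend plus white noise
rng = np.random.default_rng(1)
t = np.arange(100.0)
y = 2.0 + 0.5 * t - 0.004 * t**2 + rng.normal(0, 0.5, t.size)

# The 'curvy' model is linear in its parameters: y ~ b0*time^0 + b1*time^1 + b2*time^2
X = np.column_stack([t**0, t**1, t**2])      # a sum of integer powers of time
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The least-squares fit recovers coefficients near the generating values
print(beta)
```

The curviness lives entirely in the regressors; the model itself remains a sum of independently weighted parts.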
The failure of measured behavior to reduce to a sum has taken on
a growing urgency in the psychological sciences (e.g., Riley & van Orden, 2005).
Multifractality can provide deeper insight into this failure of a series to
reduce to a sum of independent random factors (Fig. 3B).
Beyond resemblance to old Gestalt wisdom that wholes differ from sums of parts,
estimates of multifractality have predicted outcomes in executive function, perception, or
cognition, such as in reaction time (Ihlen & Vereijken, 2010), gaze displacements (Kelty-
Stephen & Mirman, 2013), word reading times (Booth et al., 2018), speech audio
waveforms (Hasselman, 2015; Ward & Kelty-Stephen, 2018), rhythmic finger tapping
(Bell et al., 2019), gestural movements during the conversation (Ashenfelter et al.,
2009), electroencephalography (Kardan, Adam, et al., 2020), and functional magnetic
resonance imaging (Kardan, Layden, et al., 2020). Our purpose is not to review the
empirical meaning of multifractality in psychological terms; this question may not even
be answerable in full at present. Multifractality is the logical consequence of processes
that enlist interactions across timescales (Ihlen & Vereijken, 2010), suggesting that it is
essential to processes unfolding at many rates, such as Gottlieb’s (2002) probabilistic
epigenesis. However, the truth would be better served with a broader set of scholars
exploring the role of multifractality in psychological processes. So, our purpose is to
make the method more accessible.
This tutorial introduces multifractality as a mathematical framework helpful in
determining whether and to what degree a series exhibits nonlinear changes over time.
It is by no means the first to introduce multifractality; prior entries to the multifractal-
tutorial literature have sometimes taken a more conceptual perspective, introducing
nonlinearity over time as an interaction across multiple timescales (Kelty-Stephen et al.,
2013). Other tutorials have kept closer to detailing algorithmic steps through the use of
computational codes (Ihlen, 2012). The present work aims to tread a middle ground,
reviewing some of the concepts implicated in linear and nonlinear changes over time and
detailing the mathematical steps involved. These mathematical steps include multifractal
analysis and surrogate data production for resolving when multifractality entails nonlinear
interactions across timescales. The present work makes the case that multifractality may
be crucial for articulating cause and effect in psychology at large.
Multifractality: A type of nonlinearity for modeling processes that develop
through interactions across scales
Multifractality, a modeling framework developed in its current form about fifty years ago
(Halsey et al., 1986; Mandelbrot, 1976), is primarily a statistical description of
heterogeneity in how systems change across time. All mathematical frameworks work by
encoding more variability into a symbolic and logical structure. Multifractality is no
exception. What multifractality encodes is the heterogeneity, and it encodes this
heterogeneity as a range—maximum minus minimum—of fractional exponents. These
exponents represent the power-law growth of proportion with timescale. This relationship
between proportion and timescale is a pervading question whenever we observe a
changing system we want to understand: all time-varying processes vary with time, and
we are constantly dealing with the issue that a smaller sample of the whole process tells
us something but not everything about that whole process. So, the important question is,
“how long do we have to look before we see a representative sample of the time-varying
process?” The proportion of the process we can see will increase the longer we look, and
that proportion increases nonlinearly, with the proportion increasing as a power function
(also called a ‘power law’) of scale; that is, the proportion increases with scale^α.
Multifractality becomes useful when there is not simply one alpha value but when, for
various reasons outlined below, there may be many. That means that multifractality can
help us understand how and why our samples of observations may align with the broader
structure of the time-varying process. In summary, multifractality encodes heterogeneity
as the range—maximum minus minimum—of fractional exponents that govern the power
laws relating the observed proportions of heterogeneous changes to a specific timescale
(i.e., how a change in measurement relates to a proportional change in time).
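As a hedged illustration of such an exponent (hypothetical positive series, assuming numpy), we can estimate α by regressing log bin proportion on log timescale; for an evenly spread measure, the proportion grows as scale^1.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.5, 1.5, 1024)          # hypothetical positive measurements
total = x.sum()

scales = np.array([4, 8, 16, 32, 64, 128])
# Average proportion of the whole measure falling into one bin of each size
props = np.array([np.mean([x[i:i + s].sum() / total
                           for i in range(0, x.size - s + 1, s)])
                  for s in scales])

# Power law: proportion ~ scale**alpha, so a log-log regression estimates alpha
alpha = np.polyfit(np.log(scales), np.log(props), 1)[0]
print(round(alpha, 2))   # near 1 for this evenly spread measure
```

Multifractal analysis extends exactly this estimate to cases where no single α suffices.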
Multifractality arose from a long history of scientific curiosity about how fluid
processes generate complex patterns (Richardson, 1926; Turing, 1952) and remains one
of the leading ways to model fluid, nonlinear processes—as initially intended in
hydrodynamics (Schertzer & Lovejoy, 2004) and more recently as a framework for
understanding the fluid-structure of perception, action, and cognition (Dixon et al., 2012;
Kelty-Stephen, 2017; Kelty-Stephen et al., 2021). In what follows, we unpack both what
multifractality is and why it is helpful for quantifying nonlinear changes in series, with
specific examples from perception, action, and cognition.
Multifractality differs from but does not replace mean and standard deviation
What is the relationship between multifractality, mean, and standard deviation?
Multifractality sits apart from the more familiar descriptive statistics like mean and
standard deviation—indeed, it does not replace mean and standard deviation. A good
reason for the widespread use of mean and standard deviation is that they support a
wide range of inferential methods to test the effects of many types of hypothesized
causes. However, when our hypotheses about causes begin to probe the issue of changes
over time, mean and standard deviation no longer suffice. Mean and standard deviation
remain helpful and even necessary to statistical reasoning, but they fail to cover more
complex relationships that evolve continually. Therefore, the use of mean and standard
deviation alone does not test hypotheses about how systems change over time, that is,
with the sequence, the time-asymmetry, and the interactions over timescales that
nonlinear series can exhibit.
Why are mean and standard deviation not enough to model how a system
changes over time?
Stable mean and standard deviation are necessary to the linear model, but they are not
sufficient for fully specifying a linear model. There is a third and often lesser-known
component composing the linear model (Mandic et al., 2008). Changes with time require
us to acknowledge that the linear model has not just these two features but three
defining features: (1) mean, (2) standard deviation (i.e., square root of variance), and
(3) specification of the linear autocorrelation (or equivalently the amplitude spectrum of
the Fourier transform). The linear autocorrelation describes how a given time-varying
process correlates with past behavior (e.g., how behavior over the current month
resembles behavior over the past month). As noted above, autocorrelation can appear
as a function of regression coefficients across lags, as the amplitude spectrum of the
Fourier transform, or as the zero memory of ‘white noise.’
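To illustrate this third feature (a sketch with simulated data, assuming numpy), we can compute the lagged autocorrelation directly; for white noise, the coefficients at nonzero lags hover near zero.

```python
import numpy as np

def autocorr(x, lag):
    """Linear autocorrelation at a given lag (Pearson r of the series with its shifted self)."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

rng = np.random.default_rng(3)
white = rng.normal(0, 1, 5000)           # memoryless 'white noise'

coeffs = [autocorr(white, lag) for lag in range(1, 6)]
print([round(c, 3) for c in coeffs])     # all near 0: the past does not predict the present
```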
Linear models model measurements over time using the autocorrelation
function
What is entirely linear change in our dependent measures over time?
A linear model applied to a developing process assumes that it changes similarly over
time in the beginning, in the middle, and at the end of that process. This symmetry over
time that linearity assumes appears most clearly in the sinusoidal waves that the Fourier
transform uses to decompose a series—a sinusoidal wave oscillates around its midpoint,
extending into the future exactly as it had throughout its past.
Ironically, this elegant model for changes over time is locked into repeating the
same changes over time. Perhaps it is not irony so much as a premise so simple that it
sounds absurd when stated: the change over time does not itself change. The Fourier-
transform’s use of sinusoidal waves to model changes over time reflects the underlying
premise that all changes reverse (i.e., all changes over time balance out), a premise more
widely known as ‘regression to the mean,’ the idea that what changes over time will
hover around the mean (e.g., what goes up must come back down). Whether we use the
linear autocorrelation or the Fourier amplitude spectrum to quantify changes over time,
the entailment is the same: the linear model necessarily expects a relationship between
the past and the present that does not itself change.
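A small sketch of this last point (assuming numpy): the Fourier amplitude spectrum is blind to the arrow of time, so a series and its time-reversed copy share the same spectrum.

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(512)
# A sinusoid plus white noise: changes over time that repeat without changing
x = np.sin(2 * np.pi * 8 * t / 512) + 0.3 * rng.normal(0, 1, t.size)

amp_forward = np.abs(np.fft.rfft(x))
amp_backward = np.abs(np.fft.rfft(x[::-1]))

# The amplitude spectrum is identical for the series and its time-reversal:
# the linear description is symmetric in time
print(bool(np.allclose(amp_forward, amp_backward)))
# The dominant oscillatory mode sits at 8 cycles per series length
print(int(np.argmax(amp_forward[1:]) + 1))
```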
When should we use something other than the linear model to understand
changes in a system over time?
It is worth consulting nonlinear methods anytime the linear model fails to exhaust the
empirically observed variability. However, the linear model is an exceptionally compact
and effective statistical framework. It is important to note that multifractality remains
unfamiliar to some audiences because the statistical literature often frames nonlinear
problems with linear solutions. Let us imagine that a developing process exhibits different
changes over time at the beginning versus at the end. Different changes over time are
problematic, but the linear model is simple. So, we might go about finding a simple fix to
allow us to keep using it, for instance, by proposing a breakpoint where a sudden
event brought about an abrupt change, after which behavior followed an entirely
different pattern. For instance, we might make the same set of footpaths on the way to
work every day and back home every day, but an earthquake could suddenly damage or
change the footpaths with trees or structures along the way. After the earthquake, we
might take a radically different set of footpaths to get to and from work every day. So, if
it were our job to determine the best fitting model of our footpaths over very many days,
it might make a lot of sense to just fit one model for the pre-earthquake footsteps and a
separate model of post-earthquake footsteps.
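A minimal sketch of this find-a-breakpoint strategy (hypothetical data, assuming numpy): fit one linear model before the proposed break and a separate one after it.

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(200.0)
break_at = 100
# Hypothetical series whose trend flips abruptly at the breakpoint (the 'earthquake')
y = np.where(t < break_at, 1.0 + 0.20 * t, 60.0 - 0.30 * (t - break_at))
y = y + rng.normal(0, 1.0, t.size)

# One linear model per regime, split at the scientist's proposed breakpoint
slope_pre = np.polyfit(t[:break_at], y[:break_at], 1)[0]
slope_post = np.polyfit(t[break_at:], y[break_at:], 1)[0]
print(round(slope_pre, 2), round(slope_post, 2))   # roughly 0.2 and -0.3
```

Note that the breakpoint itself is chosen by the modeler, which is exactly the fingerprint the text worries about.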
This find-a-breakpoint strategy has rough-and-ready appeal, but it can become
troubling to generalize it. Crucially, this strategy essentially involves the scientist
deploying multiple timescales because one timescale of observation does not generalize
to the other. One way to remove the individual scientist’s fingerprints from besmirching
the model is fitting long-term predictors for different-sized time scales. For instance, if
we take the more psychological issue of how people spend money, predicting daily
spending behavior might involve looking at short-term predictors such as individuals’
previous days’ spending behavior. However, quite apart from these short-term, day-to-
day changes, longer-term trends may be more predictive. For instance, for a college
student, spending behavior may differ vastly during the school year from during the
summer. Summer months may offer the possibility of full-time employment, and so the
effect of previous days’ spending behavior may be entirely different during the longer-
term period of summer versus the rest of the year. Such long-term predictors are often
called ‘seasonal’ and are suggestive of cyclic repetition; for example, summer arrives reliably
at the same point on the yearly school calendar. The challenge is that identifying those
long-term predictors requires not only theory and intuition but also accounting for long-term
periodicities or bins in them. Certainly, once a student graduates or leaves school, then
the cyclic effect of summer may disappear as they begin to work full-time all year round.
At such a point, the relevant cyclic patterns useful for predicting spending behavior may
change. A big unknown throughout here: do we need to rely on an intelligent modeler to
identify specific independent timescales? Has the breakpoint-finding scientist not just
smudged more of their own fingerprints on the process? Or does the time-varying
process itself rely on interactions on timescales that are not necessarily independent?
At least two good reasons exist to wish for an alternative to strictly linear models
of changes over time (e.g., climate change or housing-mortgage crashes). First, ‘changes
over time’ may be noncyclically continuous; that is, changes may shift over time without
any simple breakpoints. The lack of cyclicity may be necessary because a system may
change without returning to the initial ‘normal.’ It is essential to underscore that any
expectations of ‘regression to the mean’ result from the linear assumption of temporal
symmetry. The temporal symmetry of linear models means they look the same played
backward into the past do played forward into the future (Lutkepohl, 2013). However,
there is no statistical guarantee that what goes up must come down as most measured
systems grow, mature and decay.
The second reason to wish for an alternative to strictly linear models is the
interaction among changes over time across cycles. For instance, roughly cyclic periods
like the year, the month, and the day can be easily identified. However, a daily routine
may vary considerably across the span of a month (e.g., around weekly or biweekly
salary payments), and it may vary further across different months in a year (e.g., as
holiday bonuses and time off allow various ways to behave). The necessity to account for
the interaction of various differently scaled factors over time has prompted the need for
multifractal modeling. While there are hierarchical linear models that can estimate
interactions involving short- and long-term effects, they estimate these interactions with
the same expectation of ‘stationarity’ as in simple linear models (Singer & Willett, 2003).
Examples of changes over time and how linear models can respond with
progressively less stable autocorrelations
Given the origins of multifractal modeling in fluid dynamics (Mandelbrot, 1974; Meneveau
& Sreenivasan, 1987), an apt example to consider is daily rainfall in a given region. Daily
rainfall can be measured in centimeters to examine how it changes over time. Reasons
for abrupt changes include elevation, humidity, and temperature. Reasons for more
sustained changes can include the seasons and movement of tectonic plates. Appearance
or disappearance of currents, winds, and vegetation can also impact daily rainfall.
Perfect stability. The linear model aptly applies to changes in daily rainfall over
time under many circumstances. The simplest measurements of daily rainfall include (1)
no rainfall (i.e., perfect drought), or (2) always the same amount of rainfall (Fig. 5A). In
both cases, the present looks perfectly like the past and the future. Such processes are
temporally symmetric. They have perfectly stable means: drought entails a zero mean
for the entire series, and the same amount of rainfall reflects a stable mean for the entire
series. Of note, the standard deviation is zero in both cases. Note that we cannot easily
characterize psychological variability, as in speech audio waveforms, with this profile. So, if
only to show the mathematical possibility of perfect stability, rainfall is a little more
accessible an example than many a psychological measure that we always expect to
fluctuate.
White noise. Another case perhaps more suitable to areas with more temperate
climates would be (3) uncorrelated random variation of daily rainfall, varying according to
‘white noise’ (Fig. 5B). White noise is the statistical term for the product of many
independent processes. Calling it ‘white’ reflects an almost poetic allusion to the fact that
some of the earliest uses of the Fourier transform involved its application to
electromagnetic radiation (i.e., light), and some models of white light have indicated a
broadband contribution of radiation oscillating at many visible frequencies (Baxandall,
1968; Forgacs et al., 1971). Therefore, in the long run, white noise epitomizes the
temporal symmetry characteristic of linear changes over time and the regression to the
mean. A histogram of a white-noise process will approach (over long timescales) a
Gaussian (or Normal) distribution with stable mean and standard deviation. This
Gaussian profile of white noise is a close statistical cousin of the binomial distribution
that a fair coin would generate for large samples of progressively longer sequences of
coin flips (Box et al., 1986). Crucially, white noise regresses to the mean in the long run
and is uncorrelated in time (here, ‘uncorrelated’ implies no correlation between rainfall
across days, weeks, or months). In this case, the average rainfall for one day, one week,
or one month is as good a predictor of the next day, week, or month, respectively, as of
any other day, week, or month, respectively. In other words, the average rainfall of one
time period predicts future time periods—in large part because all of this sequence is
statistically the same.
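A brief sketch of this stability (simulated ‘rainfall,’ assuming numpy): split a white-noise series into weeks, and every week’s mean hovers near the same global mean, so any one period predicts the others equally well.

```python
import numpy as np

rng = np.random.default_rng(6)
# Hypothetical daily rainfall varying as white noise around 10 cm
rain = rng.normal(10.0, 2.0, 7 * 52)          # one year of days

weekly_means = rain.reshape(52, 7).mean(axis=1)
# Every week is statistically the same: its mean predicts any other week
# about as well as it predicts the next one
print(round(rain.mean(), 1))                  # near the generating mean of 10
print(round(weekly_means.std(), 2))           # small spread around that mean
```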
Uniform seasonality. The measurement area may have a rainy season and,
therefore, a cyclical rainfall pattern (e.g., more in June and July than in April; Fig. 5C).
So long as this rainy season begins and ends reliably on the same exact dates, the linear
model will produce an adequate description of this rainfall. The changes over the year
would show a peak across months, but in this example, with perfect timing of the
seasons, these changes across months do not change from year to year.
Irregular seasonality. The above examples are rare cases in practice but help
illustrate the temporal symmetry of linear models. However, rainfall is often more
irregular (Fig. 5D). Rainy seasons can come late one year or early; they can come early
multiple years and late the following year. Also, wet or dry years can exist in which the
rainy season varies in intensity across years, decades, and centuries. When considering
these longer timescales, the Fourier transform can spread longer and longer sinusoidal
waves, and similarly, the linear autocorrelation can incorporate progressively longer lags
or waves. In any event, as rainfall is measured over long timescales, our linear model
can be complicated with progressively more factors. However, no matter how long the
timescale or added factors, the linear model’s constraint is that the cyclical patterns must
be regular across time.
Irregular seasonality and its potential connection with nonstationary
autocorrelation. This issue of irregular seasonality is conceptually the same as the
nonstationarity highlighted in Fig. 1C, where the temporal structure varied across time,
suggesting that we could need different autocorrelation functions from the beginning to
the middle, and from the middle to the end of the series. Here we have just the issue
noted above: a Fourier transform may still be calculated, but it may be difficult to
interpret. The Fourier transform will probe a series for its oscillatory modes and estimate
the amplitude at each frequency. However, if the oscillations change over time, the Fourier
alone is not sensitive to that change. The Fourier model (and so the linear model, more
generally) will be definable, but it will not be specific to the structure available in the
measurement.
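A sketch of this insensitivity (assuming numpy): a series that oscillates at one frequency in its first half and another in its second half yields a spectrum containing both peaks, but the spectrum alone cannot say which oscillation came first.

```python
import numpy as np

n = 512
t = np.arange(n)
# The oscillation changes over time: 8 cycles per series length in the first
# half, 32 cycles per series length in the second half
first = np.sin(2 * np.pi * 8 * t[:n // 2] / n)
second = np.sin(2 * np.pi * 32 * t[n // 2:] / n)
x = np.concatenate([first, second])

# The amplitude spectrum reports both oscillatory modes, but nothing in it
# says that the 8-cycle mode came first and the 32-cycle mode second
amp = np.abs(np.fft.rfft(x))
top_two = sorted(np.argsort(amp)[-2:].tolist())
print(top_two)   # the two dominant frequencies, with their timing lost
```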
Nonstationary autocorrelation is a crucial failure of linear modeling
Whereas the first three of our examples above are amenable to linear modeling, these
last two begin to shake the foundations of linear modeling. Certainly, irregular
seasonality might look regular at a longer time scale, and a noisy fit of the Fourier model
or the autocorrelation is not in itself a problem (see next section “No one ever said linear
models have to be perfect”). However, the destabilization of the autocorrelation is the
clearest statistical symptom of the linear model losing its foothold on cause-and-effect
relationships. Specifically, unstable autocorrelations definitely indicate that the effects of
past events are changing, and an extremely interesting possibility is that these changes
could depend on events at another scale. That is to say, long-term seasonality and short-
term variation are not fully separable for many behaving systems.
The choices for how to address this autocorrelational instability are plenty. What
we choose to do with measurements whose temporal structure changes over time will
follow our theoretical interests. Analyses that blend frequency information with time
information (e.g., when oscillatory modes fade in or out) can begin to take better stock
of the measurement’s structure (Singh et al., 2017). We find wavelet models like those
that blend frequency and time information very intriguing and useful. And indeed, some
of these wavelet models have elaborations that allow the calculation of multifractal
spectra (Ihlen & Vereijken, 2010). However, theoretical choices have steered us away
from these wavelet methods for two reasons. First, Chhabra and Jensen’s (1989) method
does not require the steps that wavelet-based methods do, that is, the Legendre
transformation whose first derivatives of wavelet-estimated root-mean-square across
values of q can conflate estimation error with multifractal structure (Zamir, 2003).
Second, whereas wavelets still aim to parse a series into the independent contributions of
independent timescales, we use surrogate testing specifically out of an interest in
assessing the strength of interactions across timescales. Surrogate testing is of course
available to wavelet-based estimates of multifractality (Ihlen & Vereijken, 2010), but we
only highlight the limitation of wavelet methods as such.
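As a hedged sketch of the surrogate idea (assuming numpy; the simplest phase-randomized variant, not necessarily the exact surrogate algorithm applied later): a surrogate keeps the linear structure, namely the Fourier amplitude spectrum, while scrambling the temporal sequence, so any difference in multifractality between the original and its surrogates implicates nonlinear interactions across timescales.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Keep the Fourier amplitudes (the linear structure), randomize the phases."""
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, spectrum.size)
    phases[0] = 0.0          # keep the zero-frequency (mean) term real
    phases[-1] = 0.0         # keep the Nyquist term real for even-length series
    return np.fft.irfft(np.abs(spectrum) * np.exp(1j * phases), n=len(x))

rng = np.random.default_rng(8)
x = np.cumsum(rng.normal(0, 1, 1024))     # hypothetical temporally structured series

s = phase_randomized_surrogate(x, rng)
# Same amplitude spectrum, hence the same linear autocorrelation...
print(bool(np.allclose(np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(s)))))
# ...but a different temporal sequence
print(bool(np.allclose(x, s)))
```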
No one ever said linear models have to be perfect
Prediction error is inevitable and expected
We are not claiming that multifractality is the remedy simply because linear models do not
predict perfectly. Perfectly valid linear models are expected to have error (i.e.,
differences between measured and predicted) across time. After all, no one expects the
empirical record to be perfectly regular. Thus, no linear model needs to demonstrate a perfect
fit. For instance, if our rainfall model predicted 26, 3, and 15 cm of rainfall next Monday,
Tuesday, and Wednesday, we might find that the measured rainfall turned out to be 19,
10, and 11 cm. That would entail errors in the prediction of 7, –7, and 4 for those three
days. Smaller errors make for more accurate predictions than larger errors. It would look
bizarre for a model to predict perfectly. And what makes a linear model valid is that the
errors exhibit a lack of temporal structure and so resemble white noise. That is, using
the linear autocorrelation to fit time-variability in the measurement voids any seeming
requirement that raw measurements of the developing process must always have the
same mean and variance. Said another way, the various methods of detrending and
modeling the autocorrelation can vanquish many of our worries about changing mean or
changing variance. It is only the prediction errors that must have a zero mean, stable
variance, and no correlation across time (i.e., autocorrelation coefficients near zero at all
nonzero lags). The model predicting the values of the series itself can have
all manner of structure to it (e.g., see examples of the autoregressive integrated
moving-average [ARIMA] models; Box et al., 1974; Box & Jenkins, 1968). Our complaint
is thus not with linear models predicting imperfectly; it is only with linear models whose
errors take an invalid form, that is, errors with time-varying mean, variances, or
autocorrelation. Behind this concern is the threat that the measurement series does not
meet assumptions of ergodicity and so fails to help infer a stable linear cause (Mangalam
& Kelty-Stephen, 2021, 2022).
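A brief sketch of checking whether a model’s errors take this valid form (hypothetical data, assuming numpy): after fitting, the residuals’ autocorrelations should fall inside a white-noise band of roughly plus or minus 1.96 divided by the square root of N.

```python
import numpy as np

rng = np.random.default_rng(9)
t = np.arange(300.0)
y = 5.0 + 0.1 * t + rng.normal(0, 1.0, t.size)   # hypothetical trend + white noise

# Fit the linear model and inspect its prediction errors
coef = np.polyfit(t, y, 1)
residuals = y - np.polyval(coef, t)

# Residual autocorrelations at lags 1-10; an approximate white-noise band
# around zero is +/- 1.96 / sqrt(N)
acfs = np.array([np.corrcoef(residuals[:-k], residuals[k:])[0, 1]
                 for k in range(1, 11)])
print(round(float(np.abs(acfs).max()), 2))       # small: errors resemble white noise
```

When this check fails, i.e., residual autocorrelations stray well outside the band, the series enters the territory where multifractal methods might help.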
Prediction must be good only on average, but on the same average throughout
What this means is that predictions need to be good only on average, but there needs to
be only a fair-coin’s worth of deviation between linear prediction and actual behavior. We
may know that predicted average can change, but linear predictions only expect the
same amount and type of error at any point from beginning to end of the series. Linear
prediction can manage limited degrees of irregularity in the mean, but it assumes that
the irregularity has the same form at the beginning, middle, and end of the series. Good
linear prediction is about minimizing the variance of predictions around a time-symmetric
portrayal of change with time. That is, linear modeling tackles time-variability by
assuming that the time-variability is the same across time. This concern is a point
specifically about the autocorrelation, which is the linear description of “how a measure
changes with time.” The more a given measurement series’ autocorrelation changes with
time, the weaker a strategy linear prediction becomes.
Linear modeling fails so long as prediction errors change with time
Time-symmetrical models can only explain time-symmetric measurements, and here is
where our patience with the linear model breaks. Most importantly, the linear model does
not permit prediction errors with temporal structure deviating from a white-noise,
regressing-to-the-mean process. We are not speaking here of simple failures of one or
another of the features, for example, trend (failure of stable mean), heteroscedasticity
(failure of stable variance), or linear autocorrelation (failure of memorylessness;
Tabachnik & Fidell, 2007). Indeed, we have long known about each of these failures.
These alone could be met with polynomial/sinusoidal detrending or logarithmic
transforms. Instead, we are raising the concern that, even when we use the known
stopgap measures to bandage these individual failures over, the residuals may persist in
misbehaving, and the variations in our measurements may reflect change that is not
reducible to independent, additive causes. At a certain point, the issue is that time-
asymmetry is not just a bug of measurement but a persistent feature of our
measurement that our best-fitting and always time-symmetric linear modeling structures
can rarely capture. Furthermore, a deeper issue is that our most ornate linear models
like fractional integration raise mind-boggling questions about how many causal time-
asymmetric factors can be added in, all while ignoring the plain fact that psychological
and biological experiences have a sequence, a developmental progression to them (Kelty-
Stephen & Wallot, 2017; Molenaar, 2008).
A major problem here for psychological science is in aiming for models to explain
and then predict cumulative progression. However, our frequently linear models are all
time-symmetric and time-invariant. Learning, remembering, forgetting, and reading
words in sequence—these processes depend on their sequence. We know in our
psychological theories that organisms should gather their processes together and then
break them down in systematic and context-dependent ways. However, our frequently
linear models only operate in a framework that suggests that sequence is irrelevant:
linear models add outcomes from constituent causal factors, and adding is commutative,
working the same way backward and forwards (Molenaar, 2008). So, as the organism
grows and learns and develops, it must mean that new causes come into effect or that
old causes find new ways to participate. If not, then learning and developing are just
reshuffling the same resources every time. However, we speak comfortably about
remembering or forgetting, gaining experience, or losing our faculties. Our theories
expect irreversibility almost as a premise, but the linear modeling strategy is
characteristically an analysis into independent parts in reversible relationships to each
other (Cariani, 1993). It is no wonder that psychological science often finds itself in the
position of finding measurement residuals flying off the rails of our predictive models
(Kelty-Stephen & Dixon, 2012). Time-symmetry cannot explain time-asymmetric
processes: linear models cannot give a voice to psychological theories of growth and
development.
When prediction errors begin to deviate from this uncorrelated random variation, it
signifies a systematic departure of the measurement series from the linear model. Isolated
points in time do not tip us off to something being amiss. It is a statistical symptom that
will only show up as we examine how the linear model compares to the series in the long
run. That departure might be sudden and abrupt; it may be continuous or intermittent.
However, in the long run, the issue is that the prediction errors could be correlated with
themselves across time—across those same lags we had seen in the autocorrelation
function above. It might be tolerable to measure a short series with a linear model.
However, with progressively longer time, the deviation from linearity becomes
progressively apparent as sums of independent timescales keep on failing to capture
nonsummative interactions across timescales. This departure of prediction errors from
white noise is the empirical margin within which multifractal methods might help.
How do we perform multifractal analysis?
The present work uses one of the most straightforward variants of multifractal analysis—
called Chhabra and Jensen’s (1989) direct method (the Appendix at the end of this article
provides the mathematical details of this method). This method builds on the foundation of
‘bin proportions’ (Fig. 6). We will unpack this idea of bin proportion as follows: ‘bins’
stand for subsections of the measured series and can also be called ‘time windows,’
‘limited samples,’ or ‘short snippets’ of the longer series. The question of concern is how
closely any single bin of the series (i.e., any small subset of measurements over time)
resembles other measurements over time (Fig. 6, top and bottom left). For example, will
one bin of the series look like another bin at some other time in the same measurement?
How do these subsets of measurements vary when looking at different timescales—do
measurements in one bin look like the measurements found in a longer bin?
Binning our time series to mimic time windows of observation
We will use the mathematical language of ‘bin proportion’—fractions to express
probabilities—to compare the amount of the measure within bins of different sizes. Bin
proportion is obtained by dividing “the amount of the measure in one bin” by “the
amount of all the measure across the entire series.” The unifying mathematical question
fundamental to multifractal analysis is: “how does bin proportion change with timescale?”
We will thus discuss bin proportion and timescale mathematically throughout. We will
encode bin proportion as P and the timescale as L for the bin's ‘length’ or size (Fig. 6, top
right).
Multifractal analysis probes three major features of bin proportions.
It considers the fact that the bin proportion P is sensitive to bin size L, and the P ~
L relationship allows us to estimate this sensitivity in terms of the singularity
strength α.
It considers the fact that heterogeneity in bin proportions can be sensitive to bin
size and that the relationship between Shannon entropy of bin proportions and bin
size allows us to estimate this sensitivity in terms of the ‘Hausdorff dimension’ f
(Note: Shannon entropy yields –f in this relationship, and although it is useful to
recognize the appearance of Shannon entropy in this calculation of heterogeneity,
the convention in multifractal analysis is to use negative Shannon entropy to
quantify the positive value f rather than the –f that the positive Shannon-entropy
formula would yield; Halsey et al., 1986).
It considers how the items in the earlier two points change together as this
procedure is generalized to emphasize different-sized fluctuations gradually, using
a q exponent to accentuate larger or smaller bin proportions.
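To make the notion of bin proportion concrete, here is a minimal Python sketch of our own (an illustration, not code from the Chhabra–Jensen paper) that splits a series into nonoverlapping bins of length L and returns each bin's proportion P of the total measure:

```python
import numpy as np

def bin_proportions(series, L):
    """Split a series into nonoverlapping bins of length L and return
    each bin's proportion P of the total measure (proportions sum to 1)."""
    series = np.asarray(series, dtype=float)
    n_bins = len(series) // L          # drop any incomplete final bin
    trimmed = series[:n_bins * L]
    bin_sums = trimmed.reshape(n_bins, L).sum(axis=1)
    return bin_sums / trimmed.sum()

# Ten days of perfectly uniform 'rainfall': each 2-day bin holds 20%.
P = bin_proportions(np.ones(10), 2)    # → [0.2, 0.2, 0.2, 0.2, 0.2]
```

Everything that follows builds on this operation: multifractal analysis asks how these proportions behave as L is varied.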
We would like to highlight that L and q operate complementarily. If the multifractal
analysis is envisioned as a mathematical microscope or telescope, then L and q can be
envisioned as the zoom and the focus. L helps us ratchet down or up in timescales to see
a smaller or bigger picture, respectively. But at any timescale, we may want our
telescopic view to accentuate different wavelengths of light. So, in this metaphorical
sense, if the multifractal analysis is a telescope pointed at the heavens, L is the
parameter that lets us zoom out to see a constellation in Earth’s autumn sky or zoom in
on this or that distant galaxy that the human eye could not resolve. And for this same
metaphorical telescope, q is a filter helping us to focus the reddish light (at the expense
of the blueish light) or the bluish-to-violet light (at the expense of the reddish light) and
the colors in between. In other words, L allows us to zoom in or zoom out between finer vs.
coarser timescales (e.g., milliseconds vs. centuries), that is, individual points of light
vs. whole-sky light, respectively, and q allows us to focus on the parts of the series with
more or larger events vs. parts with fewer or smaller events.
Setting up a rainfall example: Observed proportion P changes with observation
time L
The questions about L-dependent features and their q-dependent relationships can be
posed in terms of the daily rainfall example. As rainfall is observed over many days, the
amount of rainfall can be compared to the time length of measurement. For instance, let
us imagine that an impatient observer measures rainfall only for two days; if it rains the
same amount two days in a row, then each day represents half of the total rainfall over
two days. However, if it rains three times as much on the first day as it rains on the
second day, then the first day’s rainfall represents 75% of the total rainfall over two
days. Let us say that a more patient observer measures rainfall for ten days. In this case,
if it rains the same amount every day, then each day’s rainfall represents 10% of the
total rainfall over ten days.
Let us consider how these outcomes align with our examples of possible outcomes
for our series. If we assume the simplest case of perfect stability (e.g., perfectly uniform
amounts of rainfall across days), then we should predict that each day represents the
same proportion of all the total rainfall over multiple days and that the proportion of the
rainfall represented by a single day will decline with the increasing number of days.
Mathematically, we find the following relationship: each day’s proportion of uniform
rainfall = 100% of total observed rainfall / length L of time in days observed.
Equivalently, the proportion P contained in a bin grows as a power law of the bin’s
length L in time, that is, P ~ L. This relationship is estimated on logarithmically scaled axes
because logP ~ logL.
The singularity strength is the slope of the logP ~ logL relationship. Here,
‘singularity’ indicates that the power-law exhibits a single form across all timescales. That
is, no matter how many more days a perfectly uniformly distributed process is observed,
this same relationship holds. The larger the singularity strength, the more sensitive bin
proportion is to bin size.
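This slope can be checked numerically. The Python sketch below (our own hypothetical illustration, not the article's code) regresses the mean log bin proportion on log bin size for a perfectly uniform series; the fitted slope, i.e., the singularity strength, comes out to 1:

```python
import numpy as np

def mean_log_proportion(series, L):
    """Mean of log bin proportions for nonoverlapping bins of length L."""
    series = np.asarray(series, dtype=float)
    n_bins = len(series) // L
    trimmed = series[:n_bins * L]
    P = trimmed.reshape(n_bins, L).sum(axis=1) / trimmed.sum()
    return np.log(P).mean()

series = np.ones(1024)                  # perfectly uniform 'rainfall'
sizes = np.array([2, 4, 8, 16, 32, 64])
log_P = np.array([mean_log_proportion(series, L) for L in sizes])
slope = np.polyfit(np.log(sizes), log_P, 1)[0]   # ≈ 1.0 for a homogeneous series
```

For the uniform series, each bin of size L holds exactly L/1024 of the measure, so the logP ~ logL relationship is exactly linear with slope 1.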
Multifractality increases with heterogeneity across bins
Perfectly uniform changes over time yield zero multifractality
Just as white noise converges to a stable standard deviation in the long run, the same
singular relationship will be seen when considering longer bins (i.e., greater amounts of
time and so longer subsets of the series) of perfectly homogeneous time series. For
instance, pooling the rainfall over nonoverlapping 4 days (days 1 through 4 in one bin,
days 5 through 8 in another bin, days 9 through 12 in yet another bin, and so on…)
allows examining rainfall in 4-day increments. And for a very long series, the average 4-
day proportion of total rainfall will converge towards the P = L^1 relationship noted above.
Now, we explore the concept of singularity strength developed in the previous
section using the rainfall example. The singularity strength is a power-law exponent that
describes how bin proportions grow with bin size. In other terms, if only one short subset
of the series is considered, the singularity strength tells us how representative that
subset is of the entire series. In the simplest case of uniformity outlined above, the
singularity strength is 1. Singularity strength of 1 is effectively a baseline in which what
the bins represent is purely due to bin size because of the homogeneity of the series.
Homogeneity makes multifractality zero, whether homogeneity is perfect stability
or a white-noise addition of all possible frequency variations. If rainfall is perfectly
uniform or white noise, then at any bin size—any timescale—the proportion of the total
rainfall found in a random bin of that size can be predicted. For instance, looking at either
half of the series allows seeing half of the total rainfall; looking at any quarter of the
series allows seeing a quarter of the total rainfall. A one-to-one relationship exists
between the proportion of time and the measurement proportion for all timescales. So,
these timescales can be characterized by the same singularity strength: one and only
one power-law relationship exists between bin proportion and bin size (i.e., Bin
proportion ~ Bin size). The fact that only one power law describes this relationship (i.e.,
with exponent = 1) means that only one singularity strength exists, and so the range of
singularity strengths is zero, entailing an absence of multifractality.
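This prediction is easy to check numerically. The short Python sketch below (a hypothetical example of ours using pseudo-random 'rainfall') confirms that for a homogeneous white-noise-like series, any half holds about half of the measure and any quarter about a quarter:

```python
import numpy as np

rng = np.random.default_rng(0)
rain = rng.random(10_000)              # homogeneous, white-noise-like daily rainfall

total = rain.sum()
half = rain[:5_000].sum() / total      # ≈ 0.5
quarter = rain[:2_500].sum() / total   # ≈ 0.25
```

Proportion tracks timescale one-to-one here, which is exactly the single-power-law, zero-multifractality situation described above.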
Multifractal spectrum accrues width: Describing more heterogeneity requires
more singularity strengths
Singularity strength varies from 1 as the series becomes more heterogeneous. One way
to think about changes in singularity strength is as an inverse of representativeness. So, larger
singularity strength corresponds to steeper logP ~ logL relationships in which shorter
subsets of the observations contain less of the measure relative to the entire
series. For our rainfall example, larger singularity strength suggests that short glimpses
of rainfall over a few days contain much less rainfall than the long-run rainfall over the
years. On the other hand, smaller singularity strengths correspond to shallower logP ~
logL relationships in which shorter subsets contain relatively more of the measure
compared to the entire series—less than the larger bins can hold, but more than shorter
subsets would hold for more homogeneous series.
The heterogeneity that changes the singularity strength can be attributed to the
events characterizing the time-series variation: relatively short and explosive or relatively
long and slow-growing. Steeper-than-slope-of-1 logP ~ logL relationships mean that the
smaller bins (i.e., shorter subsets) contain much less of the measure than they would if
the series was homogeneous. In other words, the larger bins carry more weight; that is,
the series distributes its variability very slowly, such that the variations are slow-moving.
On the other hand, the shallower-than-slope-of-1 logP ~ logL relationships mean that the
smaller bins contain much more of the measure than they would if the series was
homogeneous. In other words, they indicate very abrupt and brief—but huge—spikes in
the series. Dramatic increases on very short timescales may saturate the bins at ever
smaller timescales in this case, leaving only slight variation to capture with the larger bins
(i.e., longer subsets). In sum, lower singularity strengths suggest the contribution of
brief but large variations, and higher singularity strengths suggest slower, more gradual
variations.
A key point to remember is that multifractality is entirely about the number of
singularity strengths estimated from the measured series. The most straightforward way
to estimate multifractality is in terms of the range of the singularity strengths (i.e.,
maximum – minimum). So, perfectly uniform rainfall and homogeneous white-noise
rainfall show zero multifractality, as in both cases, the bin proportion of the rainfall is a
single stable function of the observation timescale. We also need the Hausdorff
dimension f to complement singularity strength α, but first we need to solidify this
concept of singularity strength. For now, Hausdorff dimension f plays a less explicit role in
the issue of multifractality and multifractal spectrum width. Hausdorff dimension f is the
vertical position of points on the multifractal spectrum and has a deeper relevance to be
reviewed in a later section (“Estimating multifractality in terms of how the singularity
strength changes with magnitude” below; see also Section 9.2 in Supplementary
Material), that is, as affording a criterion for including or excluding singularity strengths α
from the multifractal spectrum. Specifically, as we detail later, singularity strength α for a
given value of q becomes part of the multifractal spectrum only under two
conditions: (1) the logP ~ logL relationship generating the estimate of α for the current
value of q is linearly stable; and (2) the Shannon-entropy ~ logL relationship generating
the Hausdorff f for the corresponding value of q is linearly stable as well (Fig. 6; bottom
right). The use of Shannon entropy allows us to offer the initial introduction that
Hausdorff dimension f speaks to how the statistical variety available in the measurement
changes as we coarse grain with progressively larger bins.
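As an initial numerical illustration of that last point, the Shannon entropy of bin proportions can be regressed on log bin size, and the negative of the slope estimates f. The Python sketch below is our own hypothetical example, evaluated at q = 1 (where mass equals proportion); it recovers f = 1 for a perfectly homogeneous series:

```python
import numpy as np

def hausdorff_f(series, sizes):
    """Estimate f as the negative slope of the Shannon entropy of bin
    proportions against log bin size (evaluated at q = 1)."""
    series = np.asarray(series, dtype=float)
    entropies = []
    for L in sizes:
        n_bins = len(series) // L
        trimmed = series[:n_bins * L]
        P = trimmed.reshape(n_bins, L).sum(axis=1) / trimmed.sum()
        entropies.append(-(P * np.log(P)).sum())   # Shannon entropy at size L
    return -np.polyfit(np.log(sizes), entropies, 1)[0]

f = hausdorff_f(np.ones(1024), [2, 4, 8, 16, 32])   # → 1.0 for a homogeneous series
```

For the homogeneous series, entropy falls by exactly log 2 each time bin size doubles, so statistical variety shrinks at a single, stable rate under coarse graining.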
The word ‘multifractality’ is a little more opaque than necessary—Mandelbrot
(2013) himself recognized as much and complained that other scholars had renamed his
modeling strategy poorly. This term is cluttered with the ‘fractal’ root, indicating a
fractional singularity strength (i.e., because the only singularity strength seen above so
far is equal to 1). Saying that singularity strengths can be fractional means that, whereas
Bin proportion ~ Bin size^1, it is equally possible to observe series for which Bin proportion
~ Bin size^1.3 or Bin proportion ~ Bin size^0.6. So, for some reasonable mathematical
possibilities, mean and standard deviation can suffice because, just like linear
autocorrelation or the Fourier transform, multifractality can be zero-valued. But for most
empirical series, the singularity strength can take on multiple and often fractional values.
The road to nonlinearity is paved with systematic error in linear parameters:
Three reasons that using linear fits on double-logarithmic space to fit power
law exponents is not a contradiction
A reviewer has asked what we expect could be a common concern: Is it a contradiction
that our proposed route to nonlinearity relies on linear fits of a power law? The answer is
‘no,’ and the reasons are many. First, just as we noted above in the section titled “Not
simply curviness but a failure to reduce to a sum,” apparent curviness is a poor guide to
nonlinearity. The nonlinearity of a power law has fooled many readers into the incorrect
conclusion that fractional Gaussian noise, a.k.a. ‘1/f’ or ‘pink noise,’ is nonlinear (Kelty-
Stephen & Wallot, 2017). In fact, fractional Gaussian noise is strictly linear
(Mandelbrot & Van Ness, 1968; Wagenmakers et al., 2004, 2005). Second, it is only the
variety of the possible linear slopes in this double-logarithmic space that gives us any
foothold into nonlinearity. The linear model has always allowed its parameters to vary
unsystematically. The only catch for the linear model is that it has no traction for
expecting systematic error. And the multifractal spectrum is a systematic kind of instability in
the autocorrelation. Third, nonlinearity is not even readable in the single patterning of
one multifractal spectrum. As we highlight below in the section “Multifractality is a
heterogeneity which may reveal deeper nonlinearity,” nonlinearity does not come into
focus except with comparison of the original series to synthetic series with matching
linear structure. At present, we are only introducing the analytical lens for eventually
finding nonlinearity.
Two equivalent ways to construe irregularity: Over different times or over
different magnitudes
Multifractality results from the measured series displaying irregular temporal sequence.
This point means that, suddenly, the measured process develops over time in a way that
does not resemble past changes over time. That is, multifractality grows beyond zero,
indexing temporal asymmetry. To put multifractality in contrast with more familiar time-series structure, white noise epitomizes time symmetry. So, multifractality contrasts with
white noise. And to foreshadow later sections, multifractality is evidence of nonlinearity
when it indexes time-asymmetry different from what a linear model of the measurement
would predict.
The changes exhibiting irregularity that registers multifractality (i.e., nonzero
range of singularity strength) can be conceived in two ways, and both yield
mathematically the same results. The first option is to think about irregularity as changes
in singularity strength over time. A rough way to characterize this option might be that,
for an irregular series, the first half of the series exhibits one power law P = L^a, and the
second half exhibits another power law P = L^b such that a ≠ b. When examining white-
noise rainfall, these halves would each contain the same proportion of the total rainfall.
The second option is that days with more rainfall show one power law P = L^c, and days
with less rainfall show another power law P = L^d, such that c ≠ d. When examining white-noise
rainfall, the days with less rainfall would show the same power-law as the days with more
rainfall. In both options, multifractality would be the nonzero absolute difference between
a and b or between c and d. In these brief examples of how series could change
irregularly, we do not suggest any reliable relationship between sequence and size. For
instance, there would be no cyclicity, and neither are we suggesting that the first half of
the series contains all of the larger daily rainfall values (i.e., that a = c). The goal is to
illustrate only that singularity strengths can vary with the measurements’ time and
magnitude. The next section will address how we estimate this variation.
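To give the 'change with time' option a concrete face, the Python sketch below (our own toy construction, using a deterministic binomial cascade as a stand-in for heterogeneous rainfall) estimates the logP ~ logL slope separately for a homogeneous first half and a heterogeneous second half of one series, yielding a ≠ b:

```python
import numpy as np

def binomial_cascade(depth, p):
    """Deterministic binomial cascade: a standard toy heterogeneous measure."""
    measure = np.array([1.0])
    for _ in range(depth):
        measure = np.concatenate([measure * p, measure * (1 - p)])
    return measure

def singularity_slope(segment, sizes):
    """Slope a of the mean logP ~ logL relationship within one segment."""
    segment = np.asarray(segment, dtype=float)
    log_P = []
    for L in sizes:
        n_bins = len(segment) // L
        trimmed = segment[:n_bins * L]
        P = trimmed.reshape(n_bins, L).sum(axis=1) / trimmed.sum()
        log_P.append(np.log(P).mean())
    return np.polyfit(np.log(sizes), log_P, 1)[0]

sizes = [2, 4, 8, 16, 32]
series = np.concatenate([np.full(1024, 1 / 1024),     # first half: homogeneous
                         binomial_cascade(10, 0.7)])  # second half: heterogeneous
a = singularity_slope(series[:1024], sizes)   # ≈ 1.00
b = singularity_slope(series[1024:], sizes)   # ≈ 1.13, so a ≠ b
```

The cascade's slope deviates from 1 precisely because its measure is clumped unevenly across bins, whereas the uniform half reproduces the homogeneous baseline.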
Despite the conceptual access to multifractality through these two modes (i.e.,
‘change with time’ or ‘change with magnitude’), many multifractal analyses assume the
latter mode as the default. That is, most multifractal analyses proceed by evaluating
singularity strengths for measurements of different sizes as opposed to across different
time windows. A good reason for this strategy is that evaluating changes in singularity
strength with time is often puzzling. For instance, in the examples above, illustrating the
singularity strength in perfectly uniform series required considering how proportions were
spread across the entire series of rainfall. To examine how singularity
strength changes with time, the measured series had to be chopped up into many
smaller series. There is nothing wrong with this approach, and it can be quite useful
(Grech & Pamuła, 2008), but there are some unresolved questions about how many
subsets to estimate and what length of subsets are going to provide informative
estimates (França et al., 2018). The good news is that various attempts converge on a
similar or even identical estimate of multifractality as the ‘change with magnitude’
method does, so it is not a question of accuracy (Ihlen, 2012). Both methods are equally
effective and easy to use. The ‘change with time’ variant of multifractal analysis is only
slightly less common than the ‘change with magnitude’ variant.
We can also draw on our earlier exposition of singularity strength as an index of
the changes unfolding in the series. We had noted above what it might mean if the
singularity strength was high or low, that is, with steep or shallow logP ~ logL
relationship. We now have a novel point to reveal: series can embody both high and low
singularity strengths—and multiple gradations in between. The very same series can
carry both brief-but-explosive and long-and-slow changes. Multifractal analysis serves as
a unified formalism to account for multiple kinds of events in heterogeneous form.
Indeed, it runs the risk that the ‘brief-but-explosive’ and ‘slow-and-gradual’ descriptions
of variation sound too redundant with oscillatory frequency (i.e., high and low,
respectively). However, if multifractality describes nothing else but frequency, then we
could simply return to the Fourier transform—and series with high-, middle-, and low-
frequencies could just be white noise.
The multifractal formalism offers a way to understand heterogeneity as potentially
comprising multiple timescales of activity without melting down into a homogeneous
mess. We have already seen that proportion P and bin size L can support an estimate of
a singularity strength. As suggested above, bin size acts as a lens that zooms in or out.
What remains now is to tease out these different kinds of variation by examining the
series using that other parameter q. With q, no matter which bin size we look at, we will
aim to bring events of different sizes into focus, much as our metaphorical telescope
above might bring different colored light into focus or defocus at whatever zoom we have
settled on, for example, the scale of the whole sky or the individual points of light.
Estimating multifractality in terms of how the singularity strength changes with
magnitude
The ‘change with magnitude’ calculation of multifractality is a mathematically compact
way to proceed. It might seem like evaluating singularity strengths for ‘larger’
measurements (e.g., for days with more rainfall) or for ‘smaller’ measurements (e.g., for
days with less rainfall) could be as slippery as the question of how often to check for
changes in the singularity strength over time—for instance, how should ‘larger’ vs. ‘smaller’ be
defined? These distinctions sound challenging to pin down and can be no more likely to be
correctly guessed than to identify all appropriate subsets to estimate in the ‘change with
time’ strategy above.
Using q to elaborate proportion into ‘mass’
Fortunately, the multifractal formalism manages to generalize the ‘larger’ vs. ‘smaller’
distinction into a more continuous framework by spreading this dichotomy across several
degrees of relative sizes. This strategy bypasses the need to rely on any single cutoff for
magnitude to determine the singularity strength. This feat is accomplished using an
exponent abbreviated as ‘q,’ a parameter that can be set to any real value (Fig. 7). There
is no correct value of q, and more values of q could potentially reveal new singularity
strengths and Hausdorff dimensions (we address the decision process for both later in
this section; see also Sections 4.3 and 16 in Supplementary
Material). The objective is not necessarily to uncover ‘all of the multifractality’ with ‘all of
the right q values.’ Instead, the objective is to show that singularity
strength and Hausdorff dimension vary with the values of q. Whatever value of q is used, it is
applied to the proportion estimated for any given bin, and applying the q exponent to a
bin proportion generates a bin ‘mass’ µ.
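A minimal Python sketch of this step (our own illustration of the q-weighted mass, assuming already-normalized bin proportions): raising each proportion to the power q and renormalizing yields the mass µ, which accentuates larger bins for q > 1 and smaller bins for q < 1:

```python
import numpy as np

def bin_mass(P, q):
    """Convert bin proportions P into normalized 'masses' mu by raising
    each proportion to the exponent q and renormalizing to sum to 1."""
    P = np.asarray(P, dtype=float)
    Pq = P ** q
    return Pq / Pq.sum()

P = np.array([0.6, 0.3, 0.1])
mass_q0 = bin_mass(P, 0)   # → [1/3, 1/3, 1/3]: every bin weighted equally
mass_q2 = bin_mass(P, 2)   # ≈ [0.783, 0.196, 0.022]: the largest bin dominates
```

Sweeping q across a range of real values is what replaces any single 'larger' vs. 'smaller' cutoff with a continuum of emphases.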
Although this usage of the term ‘mass’ is a convention of multifractal formalism
(Halsey et al., 1986), it may evoke some thoughts about time-series heterogeneity.
Specifically, it addresses the evenness of the distribution and clumping of events. Indeed,
this ‘mass’ gets at the