Multifractal test for nonlinearity of interactions
across scales in time series
Damian G. Kelty-Stephen1, Elizabeth Lane2, Lauren Bloomfield3, and
Madhur Mangalam4
1Department of Psychology, State University of New York-New Paltz, New Paltz,
NY, USA
2Department of Psychiatry, University of California-San Diego, San Diego, CA,
USA
3Department of Psychology, Grinnell College, Grinnell, IA, USA
4Department of Physical Therapy, Movement and Rehabilitation Sciences,
Northeastern University, Boston, MA, USA
ORCIDs:
Damian G. Kelty-Stephen (0000-0001-7332-8486)
Madhur Mangalam (0000-0001-6369-0414)
E-mails: keltystd@newpaltz.edu; m.manglam@northeastern.edu
Abstract
The creativity and emergence of biological and psychological behavior tend to be
nonlinear, and correspondingly, biological and psychological measures contain
degrees of irregularity. The linear model might fail to reduce these measurements
to a sum of independent random factors (yielding a stable mean for the
measurement), implying nonlinear changes over time. The present work reviews
some of the concepts implicated in nonlinear changes over time and details the
mathematical steps involved in their identification. It introduces multifractality as a
mathematical framework helpful in determining whether and to what degree the
measured series exhibits nonlinear changes over time. These mathematical steps
include multifractal analysis and surrogate data production for resolving when
multifractality entails nonlinear changes over time. Ultimately, when
measurements fail to fit the structures of the traditional linear model, multifractal
modeling allows making those nonlinear excursions explicit, that is, to come up
with a quantitative estimate of how strongly events may interact across timescales.
This estimate may serve some interests as merely a potentially statistically
significant indicator of independence failing to hold, but we suspect that this
estimate might serve more generally as a predictor of perceptuomotor or cognitive
performance.
Keywords: fractal; Fourier; heterogeneity; long-range memory; Markov;
multifractal nonlinearity; non-Gaussian; self-report; surrogate testing
Introduction
We hope in this work to make the case that multifractal modeling is crucial for
psychological science—for theory no less than for data analysis. We will begin with
a familiar measure (i.e., self-report) and a more familiar feeling (i.e., wondering
about nonlinearity in data). Then, we will make the case that multifractality
addresses what is troubling about the familiar measure and elaborates on our
scientific ability to act on the familiar feeling.
“Do I want to understand the nonlinearity in my psychological measures?” is
a self-report measure readers may make at the outset of reading this manuscript.
Psychological sciences deal heavily with self-report as an often effective, expedient
way for a quick look at the underlying thought processes. Then again, self-report is
no less a critical filter on our planning as researchers as we carry our research
work forward, no matter the project. For instance, “Am I
planning the right study?” or “Do I have the right measures?” are two critical self-
report measures we can all relate to, no matter the research domain. So let us
acknowledge and explore the self-report measure of whether each of us should want
to explore nonlinearity in our psychological measures. Most self-report measures in
practice can prompt intriguing answers, and though they do not always prompt the
correct answer every time and across issues (Jeong et al., 2018), they have a curious
psychometric texture that may capture the thought process in an interesting
dynamic, which can be helpful in the proper context (Baer, 2019).
The familiarity of self-report hides the perhaps alarming truth that self-
report poses severe challenges to our most familiar linear models. Long-lived
psychological measures like self-report exhibit a perfect storm of violated statistical
assumptions to prevent the linear model from linking any measures of cause and
effect. Lurking below the statistical concern of a well-behaved linear model is the
more challenging question of whether psychological causes and effects can be
entirely linear. Do self-report measures meet the necessary criteria for applying the
linear models for assessing cause in the first place? The answer is: in the long run,
they do not. Perhaps they could meet those criteria under contrived constraints
when only looking at a small handful of self-reports. However, for all we have
known about the challenges posed by self-report measures, our use of self-report
measures has been anything but short-run: self-report measures have persisted for
over a century of psychological research (Baumeister et al., 2007). Furthermore, if
we consider how comfortable we may be consulting our feelings every day and at
every turn, it might feel somewhat surprising to realize that this most intuitive and
accessible kind of measure may be one of the least amenable to linear modeling.
Moreover, self-report is just the most accessible example of a more general tangle
of logic for psychological science making causal inference from its measurements.
In what follows, we aim towards a larger incompatibility: no matter the
measurement, psychological cause-and-effect relationships are time-sensitive, but
linearly modeled cause-and-effect relationships are not. Multifractal modeling is
ready to fill this void with a means to explicitly estimate the strength of interactions
across time scales.
What a linear model needs in order to give us a valid test of cause
Mean, variance, and autocorrelation
Presuming that measures like self-report could support linear estimates of cause-
and-effect relationships, what would we need for a linear model? It is important to
remember the fundamental criteria for a linear model. It is ideal for linear
modeling that measures show: 1) stable mean, 2) stable variance, and 3) stable
autocorrelation over time (Lütkepohl, 2013; Mandic et al., 2008). These criteria
often appear in the statistical literature as the assumption of independent and
identically distributed (i.i.d.) Gaussian noise, with ‘independence’ implying the lack
of memory or sequential structure and ‘identically distributed’ implying stable,
similar variance. Measures show a Gaussian distribution when the very many
constituent causes shaping them all add together.
Where have we seen the autocorrelation? Three possibilities
Possibility 1: Autocorrelation encodes the past in terms of regular
previous intervals. The autocorrelation exists on the boundaries of many a
statistical training—psychologists doing factorial designs may never need it, and
psychologists using time-series designs may need it very often. So, a brief summary
of possibilities is warranted. First, the autocorrelation offers us a way to encode the
correlation of a current measurement with a previous measurement of the same
process. The autocorrelation is in effect a set of regression coefficients (i.e.,
indicating “autoregression”), one for each possible time lag. Lag here means “how
many previous measurements ago” with each coefficient representing the
relationship of current measurement with each past measurement. So, for instance,
let us say we are measuring response time (RT), and we may realize that current-
trial RT is positively related to RT on the previous two trials and much more on the
just-previous trial than on the trial before the just-previous one. That would
amount to, in this example, an autocorrelation with large positive lag-1 coefficient
and a smaller but still positive lag-2 coefficient. And perhaps, for the sake of the
example, we might imagine that every third trial was somehow similar to and
primed the current one. That is, RT might decrease due to priming from three
trials before, and this relationship would manifest in a negative lag-3 coefficient.
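A minimal sketch can make this lag-coefficient logic concrete. The simulated RT series below is hypothetical (no dataset from the text is assumed); it builds in a positive dependence on the previous two trials and then recovers the lag coefficients:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical RT series: each trial depends positively on the previous
# two trials, more strongly on the just-previous one (lag 1 > lag 2).
n = 2000
rt = np.zeros(n)
noise = rng.normal(0.0, 1.0, n)
for t in range(2, n):
    rt[t] = 0.5 * rt[t - 1] + 0.2 * rt[t - 2] + noise[t]

def autocorrelation(x, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array([np.sum(x[lag:] * x[:-lag]) / denom
                     for lag in range(1, max_lag + 1)])

acf = autocorrelation(rt, 3)
# acf[0] is the lag-1 coefficient, acf[1] the lag-2 coefficient, etc.
```

With these settings, the lag-1 coefficient comes out large and positive and the lag-2 coefficient smaller but still positive, matching the verbal example above.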
Possibility 2: Autocorrelation (or Fourier amplitudes) encodes the
past in terms of periodic cycles.
A second possibility is that the autocorrelation can encode regular cyclic change
into linear models. Indeed, some time-series approaches might avoid the
autocorrelation function by name, but they will often use the amplitude spectrum
from the Fourier transform. The amplitude spectrum (or for amplitude squared, the
power spectrum) encodes the size of the oscillations for a wide range of spectral
frequencies (or inversely, wavelength). Wavelength is just another way to
specify lag: for instance, the time it takes for a cycle to unfold indicates the time
between similar rises and falls in the measurement. It is thus no coincidence that
the autocorrelation bears a one-to-one relationship with the amplitude spectrum of
the Fourier transform (Wiener, 1964). Psychological examples of periodic cycles are
circadian rhythms of wakefulness versus rest, bouts of consumption/production as
between famine and feast, oscillation of limbs during entrainment to a metronomic
or musical beat (e.g., Haken et al., 1985).
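The one-to-one relationship between the autocorrelation and the amplitude (or power) spectrum is easy to verify numerically. The sketch below uses an arbitrary random series; it checks Wiener's theorem by comparing the power spectrum with the Fourier transform of the circular autocovariance:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=512)
x = x - x.mean()
n = len(x)

# Power spectrum straight from the Fourier transform of the series.
power = np.abs(np.fft.fft(x)) ** 2 / n

# Circular autocovariance at every lag (the autocorrelation up to scaling).
acov = np.array([np.sum(x * np.roll(x, lag)) / n for lag in range(n)])

# Wiener's theorem: the Fourier transform of the autocovariance
# reproduces the power spectrum, term for term.
power_from_acov = np.real(np.fft.fft(acov))
```

The two arrays agree to floating-point precision, which is why time-series work can move freely between the autocorrelation and the spectral description.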
The Fourier transform is an almost universally available description of a
series. Almost all series yield a Fourier series of amplitudes for oscillations of all
possible frequencies. This near-inevitable availability of a Fourier series reflects
the fact that oscillations are almost always present in measurements at some
timescale (Bloomfield, 2004). One might not be able to estimate the Fourier series
if the series had no oscillations, that is, if it showed unbounded growth or decay or
complete stasis.
However, nonoscillatory series are extremely rare and difficult to identify in
practice. The lack of generative theory for nonoscillatory processes goes hand in
hand with the statistical problem of determining when the measured oscillations
might be reducible to different kinds of noise, for example, whether measurement
noise or noise modeled through moving-average detrending (Dambros et al., 2019;
Spencer-Smith, 1947; Wang et al., 2013). All psychological or biological systems
should recur and fluctuate (Riley & van Orden, 2005)—even the simplest physical
models used to build theories of psychological processes (Richardson, 1930).
Hence, our measurements in psychology should be oscillatory and thus have
estimable Fourier transforms. The primary concern within the
psychological/biological domain is not the presence of oscillations but rather the
stationarity of these oscillatory modes (which, given the one-to-one relationship
between the Fourier transform and the linear autocorrelation, amounts to the
question of the stationarity of the autocorrelation). A long-standing controversy about the applicability and
interpretability of Fourier modes is whether/how they converge (Bloomfield, 2004;
Paley & Zygmund, 1930). It is much the same issue of a mean being stable for
heavy-tailed probability distribution functions: we can always calculate the
arithmetic mean from a sample of measurements, but the easily calculable mean of
a heavy-tailed process may or may not support the interpretation that the mean of a
Gaussian or thin-tailed distribution would (Richardson, 1926; Shlesinger et al.,
1993). In the case of the Fourier model, then, the always-(to-our-knowledge)-
oscillatory psychological measures will always allow calculation of the Fourier
series, and the lingering questions are not about the existence of a frequency
domain but rather about the stability of the estimated amplitudes for the oscillatory
modes available to measurement (Singh et al., 2017).
Possibility 3: Linear models omitting autocorrelation assume zero
memory, which is just zero autocorrelation. A third possibility is that the
autocorrelation is implicitly zero. When psychological research does not deal
explicitly with autocorrelation, it is not a denial of the role of autocorrelation but
merely an implicit assumption of ‘no memory’ or ‘independence.’ We do not need to
carry all those coefficients of the autocorrelation function if we can just assume ‘no
memory,’ that is, that all coefficients are zero. Another way to think of zero memory
is ‘white noise,’ a pattern of memoryless variability in which measures oscillate at
all timescales, and the magnitude of oscillations is comparable at every timescale
(Baxandall, 1968; Forgacs et al., 1971). Indeed, the always zero autocorrelation
function is mathematically equivalent to a set of sinusoidal oscillations with similar
amplitude for all oscillatory periods. In this sense, the standby assumption of ‘i.i.d.
Gaussian’ variability is usually an assumption of additive white noise. White noise
itself is an elegant way to generate a Gaussian distribution: summing together
many sinusoidal oscillations of all available frequencies (or inversely, wavelengths),
all with uniform amplitude, yields Gaussian variability (Pearson, 1905).
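Pearson's construction can be sketched directly. The following toy example (the number of samples and the seed are arbitrary choices) sums equal-amplitude sinusoids at every available frequency with independent random phases and checks that the result behaves like memoryless white noise:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1024
t = np.arange(n)

# One sinusoid per available frequency, all with the same unit amplitude
# but independent random phases.
freqs = np.arange(1, n // 2)
phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
signal = np.cos(2 * np.pi * np.outer(freqs, t) / n
                + phases[:, None]).sum(axis=0)
signal /= np.sqrt(freqs.size / 2.0)   # rescale to unit variance

# The sum behaves as white noise: near-zero mean, unit variance, and a
# negligible correlation between successive values.
lag1 = np.corrcoef(signal[:-1], signal[1:])[0, 1]
```

A histogram of `signal` would look Gaussian, and `lag1` comes out near zero: uniform amplitudes across all frequencies are exactly the 'no memory' assumption in spectral form.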
Nonstationarities and quick fixes to overcome them in linear modeling
Types of nonstationarities
Stationarity of the mean, the variance, and the autocorrelation are each capable of
failing alone. Stationarity of the mean could fail when there is a persistent trend in
the mean while the variance around that mean stays the same (Fig. 1A). Stationarity of variance
could fail separately from stationarity of mean: our self-report might reverberate
differently across time, all the while maintaining the long-run trait, much like the
audio waveform of a high-hat cymbal in a jazz drum solo, shimmering wildly into
large positive and negative micropascals of pressure at intervals but always
centered around the zero mark in the middle (Fig. 1B). Similarly, the linear
autocorrelation could be nonstationary without necessarily changing the mean and
variance (Fig. 1C). The schedule of events or sequence of trials can shift without
aggregate rise or fall and without change in aggregate dispersion.
Fig. 1. Examples of mean, variance, and linear autocorrelation failing to be stationary.
(A) Variability in mean. (B) Variability in variance. (C) Variability in autocorrelation.
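The three failure modes in Fig. 1 are straightforward to simulate. The sketch below uses arbitrary illustrative parameters, one series per panel:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
t = np.arange(n)

# (A) Nonstationary mean: a persistent linear trend with constant variance.
drifting_mean = 0.01 * t + rng.normal(0, 1, n)

# (B) Nonstationary variance: zero mean throughout, but the spread of the
# fluctuations swells and shrinks over time.
envelope = 1 + 0.9 * np.sin(2 * np.pi * t / n)
drifting_var = envelope * rng.normal(0, 1, n)

# (C) Nonstationary autocorrelation: same mean and variance, but the
# sequential structure switches halfway from independent to persistent.
first = rng.normal(0, 1, n // 2)
second = np.zeros(n // 2)
noise = rng.normal(0, 1, n // 2)
for i in range(1, n // 2):
    second[i] = 0.8 * second[i - 1] + noise[i]
second *= np.std(first) / np.std(second)   # match the variance
drifting_acf = np.concatenate([first, second])
```

Each series violates exactly one stationarity criterion, which is what makes the quick fixes discussed next seem so attractive at first.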
Quick fixes
If only one part of the measure is nonstationary, quick fixes can massage the
measure back into conformity with linear-modeling criteria. For instance, a series
of response-time measures can easily show a long sequence of quick responses or a
bout of very long, slow responses (Bills, 1927, 1931, 1935; Holden et al., 2009; Van
Orden et al., 2003). Logarithmic scaling is a quick and surefire way to bring the
excursions of the mean into polite restraint. Another example is the temporal
structure of mouse-tracking trajectories to study online decision processes. In this
case, competing computational processes in the computer’s operating system can
produce rather spurious contingencies between successive measurements of the
mouse cursor over extremely brief timescales. Research using mouse-tracking has
developed elegant means of ‘time-normalizing,’ that is, coarse-graining this
measure and imposing, at a longer timescale, a more regular sequence on poorly
sequenced finer-scaled raw measurements (Kieslich et al., 2019).
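Both quick fixes amount to one line each. In the sketch below, a lognormal toy series stands in for real response times (an assumption for illustration, not a claim about any dataset):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical RT series with occasional very slow responses: lognormal
# values produce the long right tail typical of response times.
rt = rng.lognormal(mean=6.0, sigma=0.6, size=1000)   # milliseconds

# Quick fix 1: logarithmic scaling reins in the extreme excursions.
log_rt = np.log(rt)

# Quick fix 2: coarse-graining ('time-normalizing'), replacing each
# non-overlapping window of 10 raw samples with its average.
window = 10
coarse = rt[: len(rt) // window * window].reshape(-1, window).mean(axis=1)
```

The log transform pulls the skewed raw values toward symmetry, and the coarse-grained series smooths over spurious fine-scale contingencies, just as in the mouse-tracking case.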
A perfect storm in our measurements: Failures to meet linear-modeling
criteria accrue and long-memory can compound the problem
Psychological experiences are unstable in the short term
Here, our ability to infer cause from familiar measurements implodes under the
demands of the linear model. Attempts to linearize a measurement series may only
go so deep. Failures to meet the required criteria can accrue, and the quick fixes
may only uncover persisting instability as in the case of self-report measures
(Olthof, Hasselman, & Lichtwarck-Aschoff, 2020). First, an ongoing series of self-
report measures will exhibit nonstationarity of the mean and variance. That is, the
long-term changes in self-report measures exhibit intermittent, irregular, and
abrupt transitions between periods of heightened and of lowered self-report
measures. Second, these transitions can be so frequent that the valid
prediction window ranges from 3 to 5 successive self-report measures. So, despite
all the expectable regular rhythms we might use to model our self-reports, other
less regular events might cause new variations. Nonstationarity could be
contagious, spreading from the mean to the autocorrelation function (Horvatic et
al., 2011).
Psychological experiences are unstable in the long term too
The strategy we mentioned of coarse-graining measures (e.g., in time-
normalization) reflects a more profound belief that fluctuations even out in the long
run. This belief is a core value of the linear model, with its expectations of
stationarity. Indeed, we know from cognitive psychology to beware of the ‘hot hand’
fallacy in which we might easily see patterns where there are none in the long run
(Gilovich et al., 1985). So, wary cognitive psychologists may find it intuitive that, in
the longer run, average behavior starts showing less sequential variation and less
sequential memory. However, the complete opposite can emerge: when we coarse-
grain our measurement series for averages over longer time windows, our
psychological-scientific measurement series can show stronger sequential
variations in the long run—or at least show sequential variations resembling the
shorter-run variations and that persist over a much longer timescale than the
premises of linear modeling suggest (Fig. 2).
Fig. 2. This plot depicts a linear series with ‘fractal structure’ (i.e., 1/f noise). This series
has an autocorrelation that ‘diverges,’ which means that the correlation between the
present with previous values dwindles only very slowly with greater lags between the
present and previous values, decreasing as a power law of timescale with a fractional
exponent. The autocorrelation with this power-law dwindling is not pictured
in this plot, but the example series in the time domain illustrates how, contrary to the
common assumption, variance may not necessarily stabilize in the longer run. First, we
discuss in text the practice of using coarse-graining to diminish the ‘noisiness’ of shorter-
timescale variation, but then we discuss how fractal structure may thwart this intuition
by presenting a case in which longer-timescale variation may not be much less than
shorter-timescale variation. The green arrows (from roughly 18 to 28 days and then from
55 to 65 days) highlight shorter timescales, and the green brackets indicate a rough
approximation of the range of values in those roughly 10-day spans. The blue arrows
(from 18 to 48 days and 64 to 104 days) highlight longer timescales, and the blue
brackets show a rough approximation of the range of values in those 30- and 40-day
spans.
The perhaps unintuitive possibility is that longer-term averages may be no
more stable than short-term averages, and the reason is 'long memory,' that is,
initially short-scale variability persisting across long scales.
Long-memory structure appears in self-report measures (Delignières et al.,
2004; Olthof, Hasselman, & Lichtwarck-Aschoff, 2020) as well as in various equally
tried-and-true psychological-scientific measures: response times, lexical decision,
and word naming (Gilden, 2001; Holden et al., 2009; Kello, 2013; Kello et al.,
2008; Van Orden et al., 2003). Long memory entails ‘fractal scaling’ (short for
‘fractional scaling’), which means that the autocorrelation of these measures
diminishes very slowly with lag, that is, as a scale-invariant power function with a
fractional exponent.
Memory in our measurements can mean that instability becomes
permanent
This colorful long-memory might initially lull us into the idea that, for instance,
more memory could entail more predictability. However, we should clarify that
here the opposite can be true. Long-memory only means that variability grows
similarly from shorter to longer timescales. Now let us recall how grim our
prospects are in that case: if self-report measures show highly nonstationary mean
and variance and a brief prediction window, then long-memory suggests that what
extends similarly over time is the difficulty in predicting very far. Predictability falls
off every 3 to 5 successive self-report measures as noted before (Olthof, Hasselman,
& Lichtwarck-Aschoff, 2020), but then, long-memory means that predictability
suffers even further, repeatedly across time, becoming ever more unfeasible as
short-run instability echoes out across instabilities over the longer run.
Psychological causes are time-sensitive, but strictly-linear causes are
not
Readers on the fence about nonlinearity might find nothing new in these concerns.
For instance, we have long known about the instability of variance under the term
‘heteroscedasticity’ (Tabachnick & Fidell, 2007). If we have a quick fix, what is the
harm? However, our concern is that fitting our misbehaving measurements into a
shape that linear models will recognize may have diminishing returns. At a certain
point, quick fixes shaving off this or that nonstationarity ignore the possibility that
these nonlinearities reflect deeper truths about real causes underlying
psychological processes. We can squeeze our measurements into a shape that will
fit a linear model only so long before they cease to resemble the actual measured
behavior.
Linear models freeze any effect of history
Linear modeling in psychology suffers from an incompatibility over time:
psychological processes are rooted in experiential past and arcing towards
anticipated futures, and linear modeling is ultimately time-insensitive. As elegant
as linear modeling can be, it always aims at the same additive and time-
independent source of variance.
A question that immediately follows these considerations is: would the
linearly-modeled cause offered above even be recognizable to our knowledge of how
cognition works? Our own self-report would be ‘not necessarily.’ For instance, if our
response-series measured the time taken to read individual words in sequence (e.g.,
Wallot et al., 2014), then the autocorrelation is an expression of how each word-
reading time might depend on previous words in the sequence of words having
been read. Of course, it makes good sense that we read each word considering the
words before it. Moreover, long-memory suggests that each current word-reading
event always carries some distant effect of long, long-past words. The power-law
decay of autocorrelation in long-memory does mean that the current values of our
measure have a dwindling correlation to past values with larger lags. Nonetheless,
the scale-invariant shape of the power-law entails that the autocorrelation dwindles
without ever converging to zero. So, then, yes, we read each word with a memory of
how the story started.
But this linear-modeling story becomes more challenging for how to interpret
the autocorrelation in our theories. The deep problem for strictly-linear psychology
is that linear autocorrelation entails a twofold stricture: first, linearly-
autocorrelated sensitivity to experiential past on the present never changes, and
second, the linear autocorrelation offers no larger context that might—at some
point—redefine effects of past constituent events. To the first part of the linear
stricture, a stable autocorrelation is handy for linear modeling, but trotting out
the whole autocorrelation (N–1 coefficients for an N-length measurement) to
confirm causal modeling may be of limited theoretical value (Gilden,
2009). The autocorrelation entails that the contribution of activity at past lags is
always independent (Fig. 3A). Here is where the interpretation for our word-
reading example may stop feeling recognizable: a stationary autocorrelation
function with long-memory implies that all past words matter, but they do so
independently. That is, the time that you, the reader, spent reading the word
‘independently’ at the end of the last sentence depended on the time spent reading
the words just before: ‘so,’ ‘do,’ ‘they,’ ‘but,’ ‘matter,’ and so forth. But the catch is that phrases
(or any event longer than a single word) would have no causal force of their own—
and here we catch a glimpse of the context that we will need for any event to mean
anything. For instance, there is no room in a linear autocorrelation to indicate
syntactic structure, such as the way the adjective ‘past’ modifies ‘words.’ Gone from the
linearly-modeled past are regular features of reading.
Fig. 3. Two perspectives about how a measured series is analyzed. (A) The linear
autoregressive perspective takes the premise that each measure in time entails the
summing of random and independent factors. (B) The multifractal perspective takes the
premise that each measure entails interactions among component processes at many
nested timescales.
Context matters, to risk stating the obvious. If a careless writer omitted any
single word, then the linear autocorrelation offers no phrasal context here to
support the reader in comprehending the meaning of each current word in the
greater context of a narrative. Here we have the second part of the linear
stricture: the autocorrelation function is definable only at one scale (e.g., the scale
of individual words), and implicitly, that makes impossible the use of one scale as
context for processing information on another scale (Kelty-Stephen & Wallot,
2017). And the lack of context or nesting of multiple scales is unfortunate: to date,
lexical priming research is clear that human readers depend on the context at
multiple different scales of experience (Troyer & McRae, 2021). Psychological
theory takes it for granted that intelligent use of information depends on
considering the information at one scale through a lens at another scale (Simon,
1969). Mind grows and adapts; linear models do not.
Contextualized meaning is unavailable to linear modeling but may be
available to nonlinear modeling
What we take for granted could be explicit in our models. What we hope leads our
readers off the fence and into the rich potential of nonlinear modeling is precisely
this point: modeling nonlinearity might operationalize these interactions across
scales, the contextualizing of meaning. Indeed, we can agree that context matters in
a broad class of cases, but modeling nonlinearity offers the possibility that the
contextualizations are quantifiable—and testable for anyone doubting that they
impact the measurement. Interactions across timescales are not just ineffable
truths but may find a quantitative expression that can generalize and support
formalism. The scaling relationships we have noticed in the autocorrelations might
not be just the coincidental sum of independently estimable contributions (Fig.
3B). Instead, they might be shadows cast by a thoroughly different nonlinear form
of cause than articulated by linear modeling—one of causes cascading across scale
rather than hopping from one independent point in time to the next. They might be
control parameters governing or predicting the sequence of psychological
experiences as they evolve across time. Furthermore, this last possibility is totally
out of the reach of linear modeling: even the most hierarchical linear model will fail
to define a sequence if only because the linear model is time-symmetric and
assumes order does not matter. That is, linear-modeled context could not lead a
process towards any outcome it had not visited before.
What are our choices then? We see two major options. First, we could just
give up on the idea of contextualized cause because linear models are inapplicable.
Or, second, we could try to model the strength of nonlinearity in our measures. Our
own self-report measure is that the latter option feels more productive. Self-report
measures are not going away, and we have no intention of advocating a
replacement. The same goes for the psychological measures showing similar
structure. In this vein, we have no interest in recommending a wholesale jettisoning
of standard psychological ontology or taxonomies. We have no reason to doubt the
reality of psychological experience as reported to or shared with us by participants
—whether through self-report or through any measurement they consent to. For
instance, the growth of multifractal modeling has not, in geophysics, led to the
proponents advocating the avoidance of traditional labels for multifractal things,
for example, ‘wave,’ ‘wind,’ ‘cloud,’ ‘storm,’ or ‘planet’ (Schertzer & Lovejoy, 2013).
The only doubt we bring is for the analytical framework that we use to explain the
phenomenological reports. This manuscript is an invitation for all interested in
facing up to the limitations of the linear model’s portrayal of our field’s cherished
measures—and in hopping over the fence into the nonlinear territory.
Multifractal modeling allows probing nonlinear causal relationships
across scales
We hope that the preceding text might have convinced some of our readers that
‘nonlinearity’ is not simply a curiosity far afield from respectable, long-standing
psychological research. No, we hope that we might have shown that nonlinearity
might be rooted deep in our daily work. In this vein, we are indebted to Olthof et
al.’s (2020) work on self-report measures, which raises the concern without causing
alarm. The rest of the work aims to positively approach these issues, addressing the
concepts of linear and nonlinear changes in their own right. We will aim to nod to
the psychological examples but will try to keep this discussion on the logic and the
math. We will discuss the math through the example of rainfall. This example may
be disappointingly nonpsychological, but we use it for two reasons: (1) a
nonpsychological example series offers the concepts without cluttering the concepts
with existing psychological theory. Seeing a series that does not provide a
comfortable home for our most cherished intuitions can help us see the logic in
conceptual terms. For instance, we can invent mathematically helpful illustrations
that may not make any sense for this or that specific psychological example. (2)
Rainfall works neatly with the logic of multifractal analysis. Multifractal analysis
involves ‘binning’ our measurement, that is, seeing how much of our measurement
falls into nonoverlapping subsets. Abstract as some of the following math and
concepts may be, we think it may be helpful to consider rainfall as an immensely
tangible example. For instance, the rain can fall into actual bins, and we do not
have to lean both into the abstraction of math and the abstraction of psychological
theory.
The following text is a conceptual description of the logic, along with some
mathematical detail. Readers who find the conceptual treatment incompletely
satisfying may refer to our Supplementary Material examining the multifractality of
a small handful of speech audio waveforms, one of human speech and two of text-
to-speech (TTS) synthesizers reading a brief text (Fig. 4). All raw series and code
are attached for replicating the process.
Fig. 4. Speech waveforms for the phrase “Sherine Valverde and her husband Alessandro
are determined to teach their baby.” (A) Human speaker (20-year-old female). (B)
Text-to-speech (TTS) synthesizer: Voice Dream’s Samantha, standard. (C) TTS
synthesizer: Voice Dream’s Samantha, enhanced.
Nonlinearity: Not simply curviness but a failure to reduce to a sum
The analysis of linear or nonlinear changes over time evaluates whether or not the
measured series can be effectively modeled as a sum of independent random factors
(Fig. 3A). Nonlinear changes can mean that the series is not well-modeled as
merely a sum. The question of whether it is a sum is a foundational mathematical
issue and goes deeper than the relatively superficial question of whether changes
over time ‘look’ linear to the eye. Linear changes over time can include
curvilinearity; some nonlinear trajectories are perfectly compatible with modeling
the series as a sum. Time-series with peaks and valleys can invite a polynomial
model. Psychological examples of polynomial structure include a quadratic serial-
position curve, with a greater likelihood of recall for the earlier and the later items
in a list (Ranjith, 2012), or a linear decay of maze-completion errors by Tolman’s
rats once they were given a reward (Tolman & Honzik, 1930). Polynomial models
include linear effects of time and several powers of the time (e.g., quadratic and
cubic for second and third powers). Critically, polynomials are linear in the growth
parameters, or sums after all—sums of integer exponents of time (e.g., linear
growth is proportional to ‘time,’ which is just time raised to the power of 1, i.e.,
‘time^1,’ but quadratic profiles are just the sum of ‘time^1’ with ‘time squared,’ which is
just time raised to the power of 2, i.e., ‘time^2,’ and cubic profiles are ‘time^3 + time^2 +
time^1,’ and so forth). Choosing to test nonlinearity as a failure to reduce to a sum or
a linear model is mathematically deeper than just eyeballing the plots. That is, a
curvy plot of data points over time may still reduce to a sum of independent factors,
but ‘seeming linear’ does not guarantee that it is actually linear. The linear model
(i.e., summing parts) can make very many seemingly linear changes with time, but
it can also produce very many curvilinear profiles. So, it is important to distinguish
between ‘seeming linear’ and being a linear sum.
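To make concrete why curvilinear profiles remain ‘linear’ in this sense, note that a quadratic trend can be fit exactly by ordinary linear least squares, because the model is still a weighted sum of terms (time^1 and time^2). The following is our own minimal sketch, with invented coefficients purely for illustration:

```python
import numpy as np

# A curvy (quadratic) trajectory: y = 2 + 3*t - 0.5*t^2.
t = np.arange(20, dtype=float)
y = 2.0 + 3.0 * t - 0.5 * t ** 2

# The design matrix treats time^1 and time^2 as separate additive predictors:
# the curvy profile is still a *linear* model, a weighted sum of terms.
X = np.column_stack([np.ones_like(t), t, t ** 2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # recovers [2.0, 3.0, -0.5]: ordinary linear least squares suffices
```

The curviness lives entirely in the predictors; the model itself never stops being a sum.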
The failure of a measured behavior’s nonlinearity to reduce to a sum has
taken on a growing urgency in the psychological sciences (e.g., Riley & van Orden,
2005). Multifractality can provide deeper insights into the failure of a series’
nonlinearity to reduce to a sum of independent random factors (Fig. 3B). Beyond
resemblance to old Gestalt wisdom that wholes
differ from sums of parts, estimates of multifractality have predicted outcomes in
executive function, perception, or cognition, such as in reaction time (Ihlen &
Vereijken, 2010), gaze displacements (Kelty-Stephen & Mirman, 2013), word
reading times (Booth et al., 2018), speech audio waveforms (Hasselman, 2015;
Ward & Kelty-Stephen, 2018), rhythmic finger tapping (Bell et al., 2019), gestural
movements during conversation (Ashenfelter et al., 2009),
electroencephalography (Kardan, Adam, et al., 2020), and functional magnetic
resonance imaging (Kardan, Layden, et al., 2020). Our purpose is not to review the
empirical meaning of multifractality in psychological terms; this question may not
even be answerable in full at present. Our point is: Multifractality is the logical
consequence of processes that enlist interactions across timescales (Ihlen &
Vereijken, 2010), suggesting that it is essential to processes unfolding at many
rates, such as Gottlieb’s (2002) probabilistic epigenesis. However, the truth would
be better served with a broader set of scholars exploring the role of multifractality
in psychological processes. So, our purpose is to make the method more accessible.
This tutorial introduces multifractality as a mathematical framework helpful
in determining whether and to what degree a series exhibits nonlinear changes over
time. It is by no means the first to introduce multifractality; prior entries to the
multifractal-tutorial literature have sometimes taken a more conceptual
perspective, introducing nonlinearity over time as an interaction across multiple
timescales (Kelty-Stephen et al., 2013). Other tutorials have kept closer to detailing
algorithmic steps through the use of computational codes (Ihlen, 2012). The
present work aims to tread a middle ground, reviewing some of the concepts
implicated in linear and nonlinear changes over time and detailing the
mathematical steps involved. These mathematical steps include multifractal
analysis and surrogate data production for resolving when multifractality entails
nonlinear interactions across timescales. The present work makes the case that
multifractality may be crucial for articulating cause and effect in psychology at
large.
Multifractality: A type of nonlinearity for modeling processes that
develop through interactions across scales
Multifractality, a modeling framework developed in its current form about fifty
years ago (Halsey et al., 1986; Mandelbrot, 1976), is primarily a statistical
description of heterogeneity in how systems change across time. All mathematical
frameworks work by encoding more variability into a symbolic and logical
structure. Multifractality is no exception. What multifractality encodes is the
heterogeneity, and it encodes this heterogeneity as a range—maximum minus
minimum—of fractional exponents. These exponents represent the power-law
growth of proportion with timescale. This relationship between proportion and
timescale is a pervading question for any time we observe a changing system we
want to understand: all time-varying processes vary with time, and we are
constantly dealing with the issue that a smaller sample of the whole process tells us
something but not everything about that whole process. So, the important question
is, “how long do we have to look before we see a representative sample of the time-
varying process?” The proportion of the process we can see will increase the longer
we look, and that proportion increases nonlinearly, that is, with the proportion
increasing as a power function (also called a ‘power law’) of scale, with the
proportion increasing with scale^α. Multifractality becomes useful when there is not
simply one alpha value but when, for various reasons outlined below, there may be
many. That means that multifractality can help us understand how and why our
samples of observations may align with the broader structure of the time-varying
process. In summary, multifractality encodes heterogeneity as the range—
maximum minus minimum—of fractional exponents that govern the power laws
relating the observed proportions of heterogeneous changes to a specific timescale
(i.e., how a change in measurement relates to a proportional change in time).
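The single-exponent case can be sketched in a few lines. This is our own toy illustration, not the full multifractal analysis developed later: for a perfectly homogeneous series, the proportion captured by a window grows as scale^α with α = 1, and a log-log regression recovers that exponent.

```python
import numpy as np

# A perfectly homogeneous 'measurement' spread over 1024 time points: the
# proportion of the whole falling in a window grows as scale^alpha, alpha = 1.
series = np.ones(1024)
total = series.sum()

scales = np.array([4, 8, 16, 32, 64, 128])
proportions = np.array([series[:s].sum() / total for s in scales])

# The power-law exponent alpha is the slope on log-log axes.
alpha = np.polyfit(np.log(scales), np.log(proportions), 1)[0]
print(alpha)  # 1.0 for a homogeneous series; heterogeneous series deviate from 1
```

Heterogeneous series yield different α values across regions and scales, and multifractality summarizes that heterogeneity as the range of such exponents.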
Multifractality arose from a long history of scientific curiosity about how fluid
processes generate complex patterns (Richardson, 1926; Turing, 1952) and remains
one of the leading ways to model fluid, nonlinear processes—as initially intended in
hydrodynamics (Schertzer & Lovejoy, 2004) and more recently as a framework for
understanding the fluid-structure of perception, action, and cognition (Dixon et al.,
2012; Kelty-Stephen, 2017; Kelty-Stephen et al., 2021). In what follows, we unpack
both what multifractality is and why it is helpful for quantifying nonlinear changes
in series, with specific examples from perception, action, and cognition.
Multifractality differs from but does not replace mean and standard
deviation
What is the relationship between multifractality, mean, and standard
deviation?
Multifractality sits apart from the more familiar descriptive statistics like mean and
standard deviation—indeed, it does not replace mean and standard deviation. A
good reason for the widespread use of mean and standard deviation is that they
support a wide range of inferential methods to test the effects of many types of
hypothesized causes. However, when our hypotheses about causes begin to probe
the issue of changes over time, mean and standard deviation no longer suffice.
Mean and standard deviation remain helpful and necessary to statistical
reasoning, but they fail to cover more complex relationships that evolve continually.
Therefore, the use of mean and standard deviation alone does not test hypotheses
about how systems change over time, that is, with the sequence, the time-
asymmetry, and the interactions over timescales that nonlinear series can exhibit.
Why are mean and standard deviation not enough to model how a
system changes over time?
Stable mean and standard deviation are necessary to the linear model, but they are
not sufficient for fully specifying a linear model. There is a third and often lesser-
known component composing the linear model, namely, the autocorrelation
(Mandic et al., 2008). Changes with time require us to acknowledge that the linear
model has not just two but three defining features: (1) mean, (2) standard deviation
(i.e., square root of variance), and (3) specification of the linear autocorrelation (or
equivalently the amplitude spectrum of the Fourier transform). The linear
autocorrelation describes how a given time-varying process correlates with past
behavior (e.g., how behavior over the current month resembles behavior over the
past month). As noted above, autocorrelation can appear through the function of
regression coefficients across lags, the amplitude spectrum of the Fourier
transform, or zero memory of ‘white noise.’
Linear models model measurements over time using the
autocorrelation function
What is entirely linear change in our dependent measures over time?
A linear model applied to a developing process assumes that it changes similarly
over time in the beginning, in the middle, and at the end of that process. This
symmetry over time that linearity assumes appears most clearly in the sinusoidal
waves that the Fourier transform uses to decompose a series—a sinusoidal wave
oscillates around its midpoint, extending into the future exactly as it had
throughout its past.
Ironically, this elegant model for changes over time is locked into repeating
the same changes over time. Perhaps it is not irony so much as a premise so simple
that it sounds absurd when stated: the change over time does not itself change. The
Fourier-transform’s use of sinusoidal waves to model changes over time reflects the
underlying premise that all changes reverse (e.g., all changes over time balance
out), more widely known as ‘regression to the mean,’ the idea that what changes
over time will hover around the mean (e.g., what goes up must come back down).
Whether we use the linear autocorrelation or the Fourier amplitude spectrum to
quantify changes over time, the entailment is the same: the linear model necessarily
expects a relationship between the past and the present that does not itself change.
When should we use something other than the linear model to
understand changes in a system over time?
It is worth consulting nonlinear methods anytime the linear model fails to exhaust
the empirically observed variability. However, the linear model is an exceptionally
compact and effective statistical framework. Different changes over time are
problematic for it, but the linear model is simple. So, the statistical literature can often frame
nonlinear problems with linear solutions. Let us imagine that a developing process
exhibits different changes over time at the beginning versus at the end. We might
look for a simple fix to allow us to keep using the linear model—for instance, by
finding a proposed breakpoint where a sudden event brought about an abrupt
change, after which point, behavior followed an entirely different pattern. For
instance, we might take the same set of footpaths on the way to work every day and
back home every day, but an earthquake could suddenly damage or change the
footpaths with trees or structures along the way. After the earthquake, we might
take a radically different set of footpaths to get to and from work every day. So, if it
were our job to determine the best fitting model of our footpaths over very many
days, it might make a lot of sense to just fit one linear model for the pre-earthquake
footsteps and a separate linear model of post-earthquake footsteps. This find-a-
breakpoint strategy has rough-and-ready appeal, but it can become troubling to
generalize it. Crucially, this strategy essentially involves the scientist deploying
their unmodeled awareness of multiple timescales because one timescale of
observation does not generalize to the other.
One way to remove the individual scientist’s fingerprints from besmirching
the model is fitting more objectively meaningful long-term predictors for different-
sized time scales. For instance, if we take the more psychological issue of how
people spend money, predicting daily spending behavior might involve looking at
short-term predictors such as individuals’ previous days’ spending behavior.
However, quite apart from these short-term, day-to-day changes, longer-term
trends may be more predictive. For instance, for a college student, spending
behavior may differ vastly during the school year from during the summer. Summer
months may offer the possibility of full-time employment, and so the effect of
previous days’ spending behavior may be entirely different during the longer-term
period of summer versus the rest of the year. Such long-term predictors are often
called ‘seasonal’ and suggestive of cyclic repetition, for example, summer arrives
reliably at the same point on the yearly school calendar. The challenge is that
identifying those long-term predictors requires not only theory and intuition but
also accounting for how these long-term predictors vary in meaning. Certainly, once
a student graduates or leaves school, then the cyclic effect of summer may
disappear as they begin to work full-time all year round. At such a point, the
relevant cyclic patterns useful for predicting spending behavior may change. A big
unknown throughout here: do we need to rely on an intelligent modeler to identify
specific independent timescales? Has the breakpoint-finding scientist not just
smudged more of their own interpretive fingerprints on the process? Or does the
time-varying process itself rely on interactions on timescales that are not
necessarily independent?
At least two good reasons exist to wish for an alternative to strictly linear
models of changes over time. First, ‘changes over time’ may be noncyclically
continuous; that is, changes may shift over time without any simple breakpoints.
The lack of cyclicity may be necessary because a system may change without
returning to the initial ‘normal.’ It is essential to underscore that any expectations
of ‘regression to the mean’ result from the linear assumption of temporal
symmetry. The temporal symmetry of linear models means they look the same
played backward into the past as played forward into the future (Lütkepohl, 2013).
However, there is no statistical guarantee that what goes up must come down as
most measured systems grow, mature, and decay (e.g., in climate change or
economic repercussions of a global housing-mortgage crash).
The second reason to wish for an alternative to strictly linear models is that,
even for cyclic change, changes over time may interact across cycles. For
instance, roughly cyclic periods like the year, the month, and the day can be easily
identified. However, a daily routine may vary considerably across the span of a
month (e.g., around weekly or biweekly salary payments), and it may vary further
across different months in a year (e.g., as holiday bonuses and time off allow
various ways to behave). The necessity to account for the interaction of various
differently scaled factors over time has prompted the need for multifractal
modeling. While there are hierarchical linear models that can estimate interactions
involving short- and long-term effects, they estimate these interactions with the
same expectation of stationarity as in simple linear models (Singer &
Willett, 2003).
Examples of changes over time and how linear models can respond with
progressively less stable autocorrelations
Given the origins of multifractal modeling in fluid dynamics (Mandelbrot, 1974;
Meneveau & Sreenivasan, 1987), an apt example to consider is daily rainfall in a
given region. Daily rainfall can be measured in centimeters to examine how it
changes over time. Reasons for abrupt changes include elevation, humidity, and
temperature. Reasons for more sustained changes can include the seasons and
movement of tectonic plates. Appearance or disappearance of currents, winds, and
vegetation can also impact daily rainfall.
Perfect stability. The linear model aptly applies to changes in daily rainfall
over time under many circumstances. The simplest measurements of daily rainfall
include (1) no rainfall (i.e., perfect drought), or (2) always the same amount of
rainfall (Fig. 5A). In both cases, the present looks perfectly like the past and the
future. Such processes are temporally symmetric. They have perfectly stable means:
drought entails a zero mean for the entire series, and the same amount of rainfall
reflects a stable mean for the entire series. Of note, the standard deviation is zero in
both cases. Note that we cannot easily characterize psychological variability, such
as in speech audio waveforms, with this profile. So, if only to show the mathematical
possibility of perfect stability, rainfall is a little more accessible an example than
many a psychological measure that we always expect to fluctuate.
Fig. 5. Possible series of daily rainfall in a given region. (A) Perfect stability. The
simplest measurements of daily rainfall include no rainfall (gray line) or always the same
amount of rainfall (red line). (B) White noise. Uncorrelated random variation of daily
rainfall, varying according to white noise. (C) Uniform seasonality. The measurement
area may have a rainy season and, therefore, a cyclical rainfall pattern (e.g., more in
June and July than in April). (D) Irregular seasonality. Rainy seasons can come late one
year or early; they can come early multiple years and late the following year. Also, not
shown here, but wet or dry years can exist in which the rainy season varies in intensity
across years, decades, and centuries.
White noise. Another case perhaps more suitable to areas with more
temperate climates would be (3) uncorrelated random variation of daily rainfall,
varying according to ‘white noise’ (Fig. 5B). White noise is the statistical term for
the product of many independent processes. Calling it ‘white’ reflects an almost
poetic allusion to the fact that some of the earliest uses of the Fourier transforms
involved the application to electromagnetic radiation (i.e., light), and some models
of white light have indicated a broadband contribution of radiation oscillating at
many visible frequencies (Baxandall, 1968; Forgacs et al., 1971). Therefore, in the
long run, white noise epitomizes the temporal symmetry characteristic of linear
changes over time and the regression to the mean. A histogram of a white-noise
process will approach (over long timescales) a Gaussian (or Normal) distribution
with stable mean and standard deviation. This Gaussian profile of white noise is a
close statistical cousin of the binomial distribution that a fair coin would generate
for large samples of progressively longer sequences of coin flips (Box et al., 1986).
Crucially, white noise regresses to the mean in the long run and is uncorrelated in
time (here, ‘uncorrelated’ implies no correlation between rainfall across days,
weeks, or months). In this case, the average rainfall for one day, one week, or one
month is as good a predictor of the next day, week, or month, respectively, as of any
other day, week, or month, respectively. In other words, the average rainfall of one
time period predicts future time periods—in large part because all of this sequence
is statistically the same.
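This white-noise scenario is easy to simulate. In the sketch below, the 10 cm mean and 2 cm daily standard deviation are arbitrary numbers of ours; the point is that monthly averages barely differ from one another, and consecutive days carry no memory:

```python
import numpy as np

rng = np.random.default_rng(42)

# Ten 'years' of daily rainfall varying as white noise around a 10 cm mean
# with a 2 cm standard deviation (numbers invented for illustration).
rain = 10.0 + 2.0 * rng.standard_normal(3650)

# Any one month's average predicts any other about equally well: monthly means
# barely vary relative to the day-to-day SD of 2.0.
monthly_means = rain[:3600].reshape(120, 30).mean(axis=1)
print(monthly_means.std())  # near 2 / sqrt(30), i.e., roughly 0.37

# Consecutive days are uncorrelated: no memory from one day to the next.
r = np.corrcoef(rain[:-1], rain[1:])[0, 1]
print(r)  # near 0
```

A histogram of this series would approach the Gaussian profile described above as the series lengthens.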
Uniform seasonality. The measurement area may have a rainy season
and, therefore, a cyclical rainfall pattern (e.g., more in June and July than in April;
Fig. 5C). So long as this rainy season begins and ends reliably with the exact dates,
the linear model will produce an adequate description of this rainfall. The changes
over the year would show a peak across months, but in this example, with perfect
timing of the seasons, these changes across months do not change from year to
year.
Irregular seasonality. The above examples are rare cases in practice but
help illustrate the temporal symmetry of linear models. However, rainfall is often
more irregular (Fig. 5D). Rainy seasons can come late one year or early; they can
come early multiple years and late the following year. Also, wet or dry years can
exist in which the rainy season varies in intensity across years, decades, and
centuries. When considering these longer timescales, the Fourier transform can
spread longer and longer sinusoidal waves, and similarly, the linear autocorrelation
can incorporate progressively longer lags or waves. In any event, as rainfall is
measured over long timescales, our linear model can be complicated with
progressively more factors. However, no matter how long the timescale or added
factors, the linear model’s constraint is that the cyclical patterns must be regular
across time.
Irregular seasonality and its potential connection with
nonstationary autocorrelation. This issue of irregular seasonality is
conceptually the same as the nonstationarity highlighted in Fig. 1C, where the
temporal structure varied across time, suggesting that we could need different
autocorrelation functions from the beginning to the middle, and from the middle to
the end of the series. Here we have just the issue noted above: a Fourier
transform may still be calculated, but it may be difficult to interpret. The Fourier
transform will probe a series for its oscillatory modes and estimate each frequency's
amplitude. However, if the oscillations change over time, the Fourier transform alone is not
sensitive to that change. The Fourier model (and so the linear model, more
generally) will be definable, but it will not be specific to the structure available in
the measurement.
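The amplitude spectrum’s insensitivity to when things happen can be demonstrated directly: circularly shifting a signal in time changes only the phases, never the amplitudes. A small sketch of ours, with an invented ‘rainy season’ burst:

```python
import numpy as np

n = 365
t = np.arange(n)

# A 'rainy season' burst of oscillation early in the year (15-day wet-dry cycle)...
burst = np.where(t < 60, np.sin(2 * np.pi * t / 15), 0.0)

# ...and the same burst arriving four months later (a circularly shifted copy).
late_burst = np.roll(burst, 120)

# The Fourier amplitude spectrum is identical for both: shifting in time only
# changes the phases. The spectrum records *which* frequencies occur, not *when*.
amp_early = np.abs(np.fft.rfft(burst))
amp_late = np.abs(np.fft.rfft(late_burst))
print(np.allclose(amp_early, amp_late))  # True
```

Two very different years of rainfall can thus share one amplitude spectrum, which is exactly the interpretive difficulty described above.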
Nonstationary autocorrelation is a crucial failure of linear modeling
Whereas the first three of our examples above are amenable to linear modeling,
these last two begin to shake the foundations of linear modeling. Certainly,
irregular seasonality might look regular at a longer time scale, and a noisy fit of the
Fourier model or the autocorrelation is not in itself a problem (see next section “No
one ever said linear models have to be perfect”). However, the destabilization of
the autocorrelation is the clearest statistical symptom of the linear model losing its
foothold on cause-and-effect relationships. Specifically, unstable autocorrelations
definitely indicate that the effects of past events are changing, and an extremely
interesting possibility is that these changes could depend on events at another
scale. That is to say, long-term seasonality and short-term variation are not fully
separable for many behaving systems.
The choices for how to address this autocorrelational instability are plenty.
What we choose to do with measurements whose temporal structure changes over
time will follow our theoretical interests. Analyses that blend frequency information
with time information (e.g., when oscillatory modes fade in or out) can begin to
take better stock of the measurement’s structure (Singh et al., 2017). We find
wavelet models like those that blend frequency and time information very
intriguing and useful. And indeed, some of these wavelet models have elaborations
that allow the calculation of multifractal spectra (Ihlen & Vereijken, 2010).
However, theoretical choices have steered us away from these wavelet methods for
two reasons. First, Chhabra and Jensen’s (1989) method does not require the steps
that wavelet-based methods do, that is, the Legendre transformation whose first
derivatives of wavelet-estimated root-mean-square across values of q can conflate
estimation error with multifractal structure (Zamir, 2003). Second, whereas
wavelets still aim to parse a series into the independent contributions of
independent timescales, we use surrogate testing specifically out of an interest in
assessing the strength of interactions across timescales. Surrogate testing is of
course available to wavelet-based estimates of multifractality (Ihlen & Vereijken,
2010), but we only highlight the limitation of wavelet methods as such.
No one ever said linear models have to be perfect
Prediction error is inevitable and expected
We are not claiming that multifractality is what to do because linear models are not
perfectly predicting. Perfectly valid linear models are expected to have error (i.e.,
differences between measured and predicted) across time. After all, no one expects
the empirical record to be perfectly regular. Thus, no linear model must
demonstrate a perfect fit. For instance, if our rainfall model predicted 26, 3, and 15
cm of rainfall next Monday, Tuesday, and Wednesday, we might find that the
measured rainfall turned out to be 19, 10, and 11 cm. That would entail errors in the
prediction of 7, –7, and 4 for those three days. Smaller errors entail more accurate
predictions than larger errors. It would look bizarre for a model to predict perfectly.
What makes a linear model valid is rather that the nonzero errors lack
temporal structure and so resemble white noise. That is, using the linear
autocorrelation to fit time-variability in the measurement voids any seeming
requirement that raw measurements of the developing process must always have
the same mean and variance. Said another way, the various methods of detrending
and modeling the autocorrelation can vanquish many of our worries about
changing mean or changing variance. It is only the prediction errors that must have
a zero mean, stable variance, and no correlation across time (i.e., autocorrelation
coefficients of zero across all nonzero lags). The model
predicting the values of the series itself can have all manner of structure to it (e.g.,
see examples of the autoregressive, integrated, moving-average [ARIMA] models;
Box et al., 1974; Box & Jenkins, 1968). Our complaint is thus not with linear
models predicting imperfectly; it is only with linear models whose errors take an
invalid form, that is, errors with time-varying mean, variances, or autocorrelation.
Behind this concern is the threat that the measurement series does not meet
assumptions of ergodicity and so fails to help infer a stable linear cause (Mangalam
& Kelty-Stephen, 2021, 2022).
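This requirement on the errors can be checked directly: after fitting the trend, the residuals should show zero mean and no autocorrelation. A sketch with a simulated rainfall series (the trend and noise parameters are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(500, dtype=float)

# Rainfall with a linear trend plus white noise: a legitimate target for a
# linear model, even though the raw series has a changing mean.
rain = 5.0 + 0.01 * t + rng.standard_normal(500)

# Fit the trend, then inspect the residuals: only the *errors* must behave
# like white noise (zero mean, stable variance, no autocorrelation).
coef = np.polyfit(t, rain, 1)
residuals = rain - np.polyval(coef, t)

centered = residuals - residuals.mean()
lag1_r = float(np.dot(centered[1:], centered[:-1]) / np.dot(centered, centered))
print(residuals.mean(), lag1_r)  # mean ~0 and lag-1 correlation ~0 for a valid fit
```

When residuals instead retain time-varying mean, variance, or autocorrelation, the linear model has taken the invalid form described above.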
Prediction only must be good on average but on the same average
throughout
What this means is that predictions need to be good only on average, but there
needs to be only a fair-coin’s worth of deviation between linear prediction and
actual behavior. We may know that the predicted average can change, but linear
predictions only expect the same amount and type of error at any point from
beginning to end of the series. Linear prediction can manage limited degrees of
irregularity in the mean, but it assumes that the irregularity has the same form at
the beginning, middle, and end of the series. Good linear prediction is about
minimizing the variance of predictions around a time-symmetric portrayal of
change with time. That is, linear modeling tackles time-variability by assuming that
the time-variability is the same across time. This concern is a point specifically
about the autocorrelation, which is the linear description of “how a measure
changes with time.” The more a given measurement series’ autocorrelation changes
with time, the weaker a strategy linear prediction becomes.
Linear modeling fails so long as prediction errors change with time
Time-symmetrical models can only explain time-symmetric measurements, and
here is where our patience with the linear model breaks. Most importantly, the
linear model does not permit prediction errors with temporal structure deviating
from white noise, regressing-to-the-mean process. We are not speaking here of
simple failures of one or another of the features, for example, trend (failure of
stable mean), heteroscedasticity (failure of stable variance), or linear
autocorrelation (failure of memorylessness; Tabachnik & Fidell, 2007). Indeed, we
have long known about each of these failures. These alone could be met with
polynomial/sinusoidal detrending or logarithmic transforms. Instead, we are
raising the concern that, even when we use the known stopgap measures to
bandage these individual failures over, the residuals may persist in misbehaving,
and the variations in our measurements may reflect the change that is not reducible
to independent, additive causes. At a certain point, the issue is that time-asymmetry is not just a bug of our measurements but a persistent feature of them, one that our best-fitting and always time-symmetric linear modeling structures can rarely capture. Furthermore, a deeper issue is that our most ornate linear models, like fractional integration, raise mind-boggling questions about how many causal time-asymmetric factors can be added in, all while ignoring the plain
fact that psychological and biological experiences have a sequence, a developmental
progression to them (Kelty-Stephen & Wallot, 2017; Molenaar, 2008).
A major problem here for psychological science is in aiming for models to
explain and then predict cumulative progression. However, our frequently linear
models are all time-symmetric and time-invariant. Learning, remembering,
forgetting, and reading words in sequence—these processes depend on their
sequence. We know in our psychological theories that organisms should gather
their processes together and then break them down in systematic and context-
dependent ways. However, our frequently linear models only operate in a
framework that suggests that sequence is irrelevant: linear models add outcomes
from constituent causal factors, and adding is commutative, working the same way
backward and forwards (Molenaar, 2008). So, as the organism grows and learns and develops, it must mean that new causes come into effect or that old causes find
new ways to participate. If not, then learning and developing are just reshuffling the
same resources every time. However, we speak comfortably about remembering or
forgetting, gaining experience, or losing our faculties. Our theories expect
irreversibility almost as a premise, but the linear modeling strategy is
characteristically an analysis into independent parts in reversible relationships to
each other (Cariani, 1993). It is no wonder that psychological science often finds
itself in the position of finding measurement residuals flying off the rails of our
predictive models (Kelty-Stephen & Dixon, 2012). Time-symmetry cannot explain
time-asymmetric processes: linear models cannot give a voice to psychological
theories of growth and development.
When prediction errors begin to deviate from this uncorrelated random variation, it signifies a systematic departure of the measurement series from the linear model. Isolated points in time alone do not tip us off to something being amiss. It is a statistical symptom that will only show up as we examine how the linear model
compares to the series in the long run. That departure might be sudden and abrupt;
it may be continuous or intermittent. However, in the long run, the issue is that the
prediction errors could be correlated with themselves across time—across those
same lags we had seen in the autocorrelation function above. It might be tolerable
to measure a short series with a linear model. However, over progressively longer times, the deviation from linearity becomes progressively more apparent as sums of
independent timescales keep on failing to capture nonsummative interactions
across timescales. This departure of prediction errors from white noise is the
empirical margin within which multifractal methods might help.
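One minimal way to see this symptom is to fit the best linear trend to a series whose fluctuations are themselves autocorrelated and then compare the residuals' autocorrelation function against the approximate 95% white-noise bounds, ±1.96/√n. The simulated series and settings below are illustrative assumptions, not any specific dataset from this literature.

```python
import numpy as np

def autocorr(x, max_lag):
    """Sample autocorrelation at lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)
# Fluctuations around the trend follow an AR(1) process, not white noise.
ar = np.zeros(n)
for i in range(1, n):
    ar[i] = 0.8 * ar[i - 1] + rng.normal()
series = 0.01 * t + ar

# Fit and remove the best linear trend, then inspect the residuals.
slope, intercept = np.polyfit(t, series, 1)
residuals = series - (slope * t + intercept)

acf = autocorr(residuals, max_lag=20)
band = 1.96 / np.sqrt(n)  # approximate 95% bounds for white-noise autocorrelation
n_outside = int(np.sum(np.abs(acf) > band))
print(f"{n_outside} of 20 lags exceed the white-noise band of +/-{band:.3f}")
```

Even with the trend perfectly removed, the residuals here remain correlated with themselves across many lags, which is precisely the long-run departure from the linear model described above.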
How do we perform multifractal analysis?
The present work uses one of the most straightforward variants of multifractal
analysis—called Chhabra and Jensen’s (1989) direct method (the Appendix at the
end of this article provides the mathematical details of this method). This method
builds on the foundation of ‘bin proportions’ (Fig. 6). We will unpack this idea of bin
proportion as follows: ‘bins’ stand for subsections of the measured series and can
also be called ‘time windows,’ ‘limited samples’ or ‘short snippets’ of the longer
series. The question of concern is how closely any single bin of the series (i.e., any
small subset of measurements over time) resembles other measurements over time
(Fig. 6, top and bottom left). For example, will one bin of the series look like
another bin at some other time in the same measurement? How do these subsets of
measurements vary when looking at different timescales—do measurements in one
bin look like the measurements found in a longer bin?
Fig. 6. Multifractal analysis is founded on the notion of ‘bin proportion,’ P, and proceeds by examining the statistics of the bin proportion of the series, using bin proportion and the Shannon entropy of bin proportion to calculate a ‘singularity strength,’ α, and a ‘Hausdorff dimension,’ f. The measured series appears in the top-left panel. ‘Bins’ stand for subsections of the measured series of size L, as schematized in the bottom-left panel. Bin proportion is obtained by dividing “the amount of the measure in one bin” by “the amount of all the measure across the entire series.” The slope of the linear regression of logarithmic bin proportion, logP(L), against logarithmic bin size, logL, equals a singularity strength, as shown in the top-right panel. Shannon entropy is negative bin proportion, P(L), multiplied by logarithmic bin proportion, logP(L), as shown in the bottom-right panel. Shannon entropy reduces with bin size, and the slope of the linear regression of Shannon entropy against logarithmic bin size, logL, estimates negative one times the Hausdorff dimension, f. The symbols ‘L’ and ‘P’ in the bottom-left panel have been included to schematize the fact that larger bins (i.e., larger ‘L’) have larger bin proportions (i.e., larger ‘P’), and ‘Entropy’ has been printed in smaller and larger text next to the larger and smaller ‘L’ to indicate the inverse relationship between L and entropy. In subsequent sections, Shannon entropy will be replaced with negative Shannon entropy to get rid of the negative sign next to f, and we will further generalize this α and f with a q parameter that can distinguish between bin proportion and fluctuation size.
Binning our time series to mimic time windows of observation
We will use the mathematical language of ‘bin proportion’—fractions to express
probabilities—to compare the amount of the measure within bins of different sizes.
Bin proportion is obtained by dividing “the amount of the measure in one bin” by
“the amount of all the measure across the entire series.” The unifying mathematical
question fundamental to multifractal analysis is: “how does bin proportion change
with timescale?” We will thus discuss bin proportion and timescale mathematically
throughout. We will encode bin proportion as P and the timescale as L for the bin's
‘length’ or size (Fig. 6, top right).
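The P-versus-L question above can be sketched numerically. The following toy example is illustrative only (the series, bin sizes, and random seed are our own assumptions, and this is a simplification of the full Chhabra-Jensen method, which introduces a q parameter): it computes bin proportions at several bin sizes, then estimates a singularity strength α from the slope of mean logP(L) against logL and a Hausdorff dimension f from the slope of Shannon entropy against logL. A near-uniform positive series should yield values near 1 for both.

```python
import numpy as np

def bin_proportions(series, L):
    """Proportion of the total measure falling in each non-overlapping bin of size L."""
    series = np.asarray(series, dtype=float)
    n_bins = len(series) // L
    trimmed = series[: n_bins * L]
    bin_sums = trimmed.reshape(n_bins, L).sum(axis=1)
    return bin_sums / trimmed.sum()

rng = np.random.default_rng(1)
series = rng.uniform(0.5, 1.5, size=4096)  # strictly positive toy "measure"

sizes = [4, 8, 16, 32, 64, 128]
props = [bin_proportions(series, L) for L in sizes]

# Singularity strength: slope of mean log P(L) against log L.
mean_logP = [np.mean(np.log(p)) for p in props]
alpha, _ = np.polyfit(np.log(sizes), mean_logP, 1)

# Hausdorff dimension: Shannon entropy of bin proportions falls off with log L
# at a rate of -f, so negate the regression slope.
entropy = [-np.sum(p * np.log(p)) for p in props]
f_dim = -np.polyfit(np.log(sizes), entropy, 1)[0]

print(f"alpha = {alpha:.2f}, f = {f_dim:.2f}")
```

A genuinely multifractal series would instead show a whole range of α and f values across the q parameter, rather than the single pair this homogeneous toy series produces.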
Multifractal analysis probes three major features of bin proportions:
1. It considers the fact that the bin proportion P is sensitive to bin size L, and the P ~ L relationship allows us to estimate this sensitivity in terms of the singularity strength α.
2. It considers the fact that heterogeneity in bin proportions can be sensitive to bin size and that the relationship between the Shannon entropy of bin proportions and bin size allows us to estimate this sensitivity in terms of the ‘Hausdorff dimension’ f (Note: Shannon entropy yields –f in this relationship, and although it is useful to recognize the appearance of Shannon entropy in this calculation of heterogeneity, the convention in multifractal analysis is to use negative Shannon entropy to quantify the positive value f rather than –f, which the positive Shannon-entropy formula would yield; Halsey et al., 1986).
3. It considers how the items in the earlier two points