Available via license: CC BY-NC-ND 4.0

Multifractal test for nonlinearity of interactions

across scales in time series

Damian G. Kelty-Stephen1, Elizabeth Lane2, Lauren Bloomfield3, and

Madhur Mangalam4

1Department of Psychology, State University of New York-New Paltz, New Paltz,

NY, USA

2Department of Psychiatry, University of California-San Diego, San Diego, CA,

USA

3Department of Psychology, Grinnell College, Grinnell, IA, USA

4Department of Physical Therapy, Movement and Rehabilitation Sciences,

Northeastern University, Boston, MA, USA

ORCIDs:

Damian G. Kelty-Stephen (0000-0001-7332-8486)

Madhur Mangalam (0000-0001-6369-0414)

E-mails: keltystd@newpaltz.edu; m.manglam@northeastern.edu

Abstract

The creativity and emergence of biological and psychological behavior tend to be

nonlinear, and correspondingly, biological and psychological measures contain

degrees of irregularity. The linear model might fail to reduce these measurements

to a sum of independent random factors (yielding a stable mean for the

measurement), implying nonlinear changes over time. The present work reviews

some of the concepts implicated in nonlinear changes over time and details the

mathematical steps involved in their identification. It introduces multifractality as a

mathematical framework helpful in determining whether and to what degree the

measured series exhibits nonlinear changes over time. These mathematical steps

include multifractal analysis and surrogate data production for resolving when

multifractality entails nonlinear changes over time. Ultimately, when

measurements fail to fit the structures of the traditional linear model, multifractal

modeling allows us to make those nonlinear excursions explicit, that is, to come up

with a quantitative estimate of how strongly events may interact across timescales.

This estimate may serve some interests as merely a potentially statistically

significant indicator of independence failing to hold, but we suspect that this

estimate might serve more generally as a predictor of perceptuomotor or cognitive

performance.

Keywords: fractal; Fourier; heterogeneity; long-range memory; Markov;

multifractal nonlinearity; non-Gaussian; self-report; surrogate testing

Introduction

We hope in this work to make the case that multifractal modeling is crucial for

psychological science—for theory no less than for data analysis. We will begin with

a familiar measure (i.e., self-report) and a more familiar feeling (i.e., wondering

about nonlinearity in data). Then, we will make the case that multifractality

addresses what is troubling about the familiar measure and elaborates on our

scientific ability to act on the familiar feeling.

“Do I want to understand the nonlinearity in my psychological measures?” is

a self-report measure readers may make at the outset of reading this manuscript.

Psychological sciences deal heavily with self-report as an often effective, expedient

way for a quick look at the underlying thought processes. Then again, self-report is

no less a critical filter on our planning as researchers as we carry our research work forward, no matter the project. For instance, “Am I

planning the right study?” or “Do I have the right measures?” are two critical self-

report measures we can all relate to, no matter the research domain. So let us

acknowledge and explore the self-report measure of whether each of us should want

to explore nonlinearity in our psychological measures. Most self-report measures in

practice can prompt intriguing answers, and though they do not always prompt the

correct answer every time and across issues (Jeong et al., 2018), they have a curious

psychometric texture that may capture the thought process in an interesting

dynamic, which can be helpful in the proper context (Baer, 2019).

The familiarity of the self-report hides the perhaps alarming truth that self-

report poses severe challenges to our most familiar linear models. Long-lived

psychological measures like self-report exhibit a perfect storm of violated statistical

assumptions to prevent the linear model from linking any measures of cause and

effect. Lurking below the statistical concern of a well-behaved linear model is the

more challenging question of whether psychological causes and effects can be

entirely linear. Do self-report measures meet the necessary criteria for applying the

linear models for assessing cause in the first place? The answer is: in the long run,

they do not. Perhaps they could meet those criteria under contrived constraints

when only looking at a small handful of self-reports. However, for all we have

known about the challenges posed by self-report measures, our use of self-report

measures has been anything but short-run: self-report measures have persisted for

over a century of psychological research (Baumeister et al., 2007). Furthermore, if

we consider how comfortable we may be consulting our feelings every day and at

every turn, it might feel somewhat surprising to realize that this most intuitive and

accessible kind of measure may be one of the least amenable to linear modeling.

Moreover, self-report is just the most accessible example of a more general tangle

of logic for psychological science making causal inference from its measurements.

In what follows, we aim towards a larger incompatibility: no matter the

measurement, psychological cause-and-effect relationships are time-sensitive, but

linearly modeled cause-and-effect relationships are not. Multifractal modeling is

ready to fill this void with a means to explicitly estimate the strength of interactions

across time scales.

What a linear model needs in order to give us a valid test of cause

Mean, variance, and autocorrelation

Assuming that measures like self-report could support linear estimates of cause-and-effect relationships, what would we need for a linear model? It is important to

remember the fundamental criteria for a linear model. It is ideal for linear

modeling that measures show: 1) stable mean, 2) stable variance, and 3) stable

autocorrelation over time (Lütkepohl, 2013; Mandic et al., 2008). These criteria

often appear in the statistical literature as the assumption of independent and

identically distributed (i.i.d.) Gaussian noise, with ‘independence’ implying the lack

of memory or sequential structure and ‘identically distributed’ implying stable,

similar variance. Measures show a Gaussian distribution when the very many constituent causes shaping them all add together.
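As an illustration, these three criteria can be probed crudely by comparing the two halves of a measured series. The Python sketch below, run on a simulated i.i.d. Gaussian series standing in for a well-behaved measure, is an illustrative diagnostic, not a formal stationarity test:

```python
import numpy as np

rng = np.random.default_rng(0)

def stationarity_checks(x):
    """Crude split-half diagnostics of the three linear-modeling criteria.

    Returns the difference in mean, the ratio of variances, and the
    difference in lag-1 autocorrelation between the two halves of the
    series. Values near 0 (near 1 for the ratio) are consistent with
    stationarity; this is an illustrative sketch, not a formal test.
    """
    half = len(x) // 2
    a, b = x[:half], x[half:2 * half]
    lag1 = lambda s: np.corrcoef(s[:-1], s[1:])[0, 1]
    return a.mean() - b.mean(), a.var() / b.var(), lag1(a) - lag1(b)

# An i.i.d. Gaussian series should pass all three checks approximately.
white = rng.normal(size=10_000)
mean_diff, var_ratio, lag1_diff = stationarity_checks(white)
```

A series violating any one criterion would show a large mean difference, a variance ratio far from 1, or a large autocorrelation difference, respectively.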

Where have we seen the autocorrelation? Three possibilities

Possibility 1: Autocorrelation encodes the past in terms of regular

previous intervals. The autocorrelation sits at the boundaries of much statistical training—psychologists doing factorial designs may never need it, and

psychologists using time-series designs may need it very often. So, a brief summary

of possibilities is warranted. First, the autocorrelation offers us a way to encode the

correlation of a current measurement with a previous measurement of the same

process. The autocorrelation is in effect a set of regression coefficients (i.e.,

indicating “autoregression”), one for each possible time lag. Lag here means “how

many previous measurements ago,” with each coefficient representing the relationship of the current measurement to the measurement at that lag. So, for instance,

let us say we are measuring response time (RT), and we may realize that current-

trial RT is positively related to RT on the previous two trials, and more strongly to the just-previous trial than to the trial before it. That would

amount to, in this example, an autocorrelation with large positive lag-1 coefficient

and a smaller but still positive lag-2 coefficient. And perhaps, for the sake of the

example, we might imagine that trials three apart were somehow similar, priming one another. That is, RT might decrease due to priming from three trials before, and this

relationship would manifest in a negative lag-3 coefficient.
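To make the lag-coefficient idea concrete, here is a minimal sketch estimating the first few autocorrelation coefficients of a simulated RT series. The series, its carryover strength of 0.6, and the `acf` helper are hypothetical illustrations, not data or code from any study:

```python
import numpy as np

rng = np.random.default_rng(1)

def acf(x, max_lag):
    """Sample autocorrelation coefficients at lags 1..max_lag."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

# Hypothetical response-time (RT) series with carryover from the previous
# trial (an AR(1) process): lag-1 correlation is large, lag-2 smaller.
n = 5000
rt = np.zeros(n)
for t in range(1, n):
    rt[t] = 0.6 * rt[t - 1] + rng.normal()

coeffs = acf(rt, 3)   # roughly 0.6, 0.36, 0.22: decaying positive lags
```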

Possibility 2: Autocorrelation (or Fourier amplitudes) encodes the

past in terms of periodic cycles.

A second possibility is that the autocorrelation can encode regular cyclic change

into linear models. Indeed, some time-series approaches might avoid the

autocorrelation function by name, but they will often use the amplitude spectrum

from the Fourier transform. The amplitude spectrum (or for amplitude squared, the

power spectrum) encodes the size of the oscillations for a wide range of spectral

frequencies (or, inversely, wavelength). Wavelength is just another way to

specify lag: for instance, the time it takes for a cycle to unfold indicates the time

between similar rises and falls in the measurement. It is thus no coincidence that

the autocorrelation bears a one-to-one relationship with the amplitude spectrum of

the Fourier transform (Wiener, 1964). Psychological examples of periodic cycles are

circadian rhythms of wakefulness versus rest, bouts of consumption/production as

between famine and feast, oscillation of limbs during entrainment to a metronomic

or musical beat (e.g., Haken et al., 1985).

The Fourier transform is an almost universally available description of a

series. Almost all series yield a Fourier series of amplitudes for oscillations of all

possible frequencies. This near-inevitable availability of a Fourier series reflects the fact that oscillations are almost always present in measurements at some timescale

(Bloomfield, 2004). One might not be able to estimate the Fourier series if the

series had no oscillations, that is, unbounded growth or decay, or complete stasis.

However, nonoscillatory series are extremely rare and difficult to identify in

practice. The lack of generative theory for nonoscillatory processes goes hand in

hand with the statistical problem of determining when the measured oscillations

might be reducible to different kinds of noise, for example, whether measurement

noise or noise modeled through moving-average detrending (Dambros et al., 2019;

Spencer-Smith, 1947; Wang et al., 2013). All psychological or biological systems

should recur and fluctuate (Riley & van Orden, 2005)—even the simplest physical

models used to build theories of psychological processes (Richardson, 1930).

Hence, our measurements in psychology should be oscillatory and thus have

estimable Fourier transforms. The primary concern within the

psychological/biological domain is not the presence of oscillations but rather the

stationarity of these oscillatory modes (which, given the one-to-one relationship of the Fourier transform and the linear autocorrelation, is equivalent to the question of the stationarity of the autocorrelation). A long-standing controversy about the applicability and

interpretability of Fourier modes is whether/how they converge (Bloomfield, 2004;

Paley & Zygmund, 1930). It is much the same issue of a mean being stable for

heavy-tailed probability distribution functions: we can always calculate the

arithmetic mean from a sample of measurements, but the easily calculable mean of

a heavy-tailed process may or may not support the interpretation that the mean of a

Gaussian or thin-tailed distribution would (Richardson, 1926; Shlesinger et al.,

1993). In the case of the Fourier model, then, the always-(to-our-knowledge)-

oscillatory psychological measures will always allow calculation of the Fourier

series, and the lingering questions are not about the existence of a frequency

domain but rather about the stability of the estimated amplitudes for the oscillatory

modes available to measurement (Singh et al., 2017).

Possibility 3: Linear models omitting autocorrelation assume zero

memory, which is just zero autocorrelation. A third possibility is an implicitly zero autocorrelation. When psychological research does not deal

explicitly with autocorrelation, it is not a denial of the role of autocorrelation but

merely an implicit assumption of ‘no memory’ or ‘independence.’ We do not need to

carry all those coefficients of the autocorrelation function if we can just assume ‘no

memory,’ that is, that all coefficients are zero. Another way to think of zero memory

is ‘white noise,’ a pattern of memoryless variability in which measures oscillate at

all timescales, and the magnitude of oscillations is comparable at every timescale

(Baxandall, 1968; Forgacs et al., 1971). Indeed, an everywhere-zero autocorrelation function is mathematically equivalent to a set of sinusoidal oscillations with similar

amplitude for all oscillatory periods. In this sense, the standby assumption of ‘i.i.d

Gaussian’ variability is usually an assumption of additive white noise. White noise

itself offers an elegant way to generate a Gaussian distribution: sum together many sinusoidal oscillations of all available frequencies (or, inversely, wavelengths) with uniform amplitudes (Pearson, 1905).
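This recipe can be carried out directly: summing sinusoids at every available frequency with identical amplitudes and random phases yields a memoryless, approximately Gaussian series. A minimal sketch, with all parameters illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4096
t = np.arange(n)

# Sum sinusoids at every available frequency with identical amplitudes and
# independent random phases: the classic recipe for white noise.
freqs = np.arange(1, n // 2)
phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
noise = np.zeros(n)
for f, p in zip(freqs, phases):
    noise += np.cos(2 * np.pi * f * t / n + p)
noise /= np.sqrt(len(freqs) / 2)   # each cosine has variance 1/2

# Memorylessness: the lag-1 autocorrelation is near zero.
lag1 = np.corrcoef(noise[:-1], noise[1:])[0, 1]

# Gaussianity: about 68% of samples fall within one standard deviation.
frac_within_1sd = np.mean(np.abs(noise - noise.mean()) <= noise.std())
```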

Nonstationarities and quick fixes to overcome them in linear modeling

Types of nonstationarities

Stationarity of the mean, of the variance, and of the autocorrelation can each fail alone. Stationarity of the mean could fail when there is a persistent trend in the mean,

while variance around that mean stays the same (Fig. 1A). Stationarity of variance

could fail separately from stationarity of mean: our self-report might reverberate

differently across time, all the while maintaining the long-run trait, much like the

audio waveform of a high-hat cymbal in a jazz drum solo, shimmering wildly into

large positive and negative micropascals of pressure at intervals but always

centered around the zero marks in the middle (Fig. 1B). Similarly, the linear

autocorrelation could be nonstationary without necessarily changing the mean and

variance (Fig. 1C). The schedule of events or sequence of trials can shift without

aggregate rise or fall and without change in aggregate dispersion.

Fig. 1. Examples of mean, variance, and linear autocorrelation failing to be stationary.

(A) Variability in mean. (B) Variability in variance. (C) Variability in autocorrelation.
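The three failure modes in Fig. 1 are easy to simulate. The sketch below constructs hypothetical series with a drifting mean, a swelling variance, and a mid-series switch in autocorrelation, each violating exactly one criterion; the specific parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
t = np.arange(n)

# (A) Nonstationary mean: a drifting trend with constant noise variance.
drift = 0.002 * t + rng.normal(size=n)

# (B) Nonstationary variance: zero mean, but the noise amplitude swells.
swell = (1 + 3 * t / n) * rng.normal(size=n)

# (C) Nonstationary autocorrelation: white noise in the first half, then
# strongly autocorrelated AR(1) noise (phi = 0.9) in the second half.
ar = np.zeros(n // 2)
for i in range(1, n // 2):
    ar[i] = 0.9 * ar[i - 1] + np.sqrt(1 - 0.9 ** 2) * rng.normal()
switch = np.concatenate([rng.normal(size=n // 2), ar])

# Split-half comparison of lag-1 autocorrelation exposes violation (C).
lag1 = lambda s: np.corrcoef(s[:-1], s[1:])[0, 1]
```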

Quick fixes

If only one part of the measure is nonstationary, quick fixes can massage the

measure back into conformity with linear-modeling criteria. For instance, a series

of response-time measures can easily show a long sequence of quick responses or a

bout of very long, slow responses (Bills, 1927, 1931, 1935; Holden et al., 2009; Van

Orden et al., 2003). Logarithmic scaling is a quick and surefire way to bring the

excursions of the mean into polite restraint. Another example is the temporal

structure of mouse-tracking trajectories to study online decision processes. In this

case, competing computational processes in the computer’s operating system can

produce rather spurious contingencies between successive measurements of the

mouse cursor over extremely brief timescales. Research using mouse-tracking has

developed elegant means of ‘time-normalizing,’ that is, coarse-graining this

measure and imposing, at a longer timescale, a more regular sequence on poorly

sequenced finer-scaled raw measurements (Kieslich et al., 2019).
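The logarithmic quick fix can be illustrated on simulated response times. Assuming, purely for illustration, that RTs are lognormal, the raw series is strongly right-skewed while its logarithm is approximately Gaussian:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical response times: lognormal, so occasional very slow trials
# skew the raw distribution heavily rightward.
rt = rng.lognormal(mean=-0.5, sigma=0.8, size=5000)

def skewness(x):
    """Standardized third moment: 0 for a symmetric distribution."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)

raw_skew = skewness(rt)          # strongly positive: long right tail
log_skew = skewness(np.log(rt))  # near zero: the lognormal becomes Gaussian
```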

A perfect storm in our measurements: Failures to meet linear-modeling

criteria accrue and long-memory can compound the problem

Psychological experiences are unstable in the short term

Here, our ability to infer cause from familiar measurements implodes under the

demands of the linear model. Attempts to linearize a measurement series may only

go so deep. Failures to meet the required criteria can accrue, and the quick fixes

may only uncover persisting instability as in the case of self-report measures

(Olthof, Hasselman, & Lichtwarck-Aschoff, 2020). First, an ongoing series of self-

report measures will exhibit nonstationarity of the mean and variance. That is, the

long-term changes in self-report measures exhibit intermittent, irregular, and

abrupt transitions between periods of heightened and of lowered self-report measures. Second, these transitions can be so frequent that the valid

prediction window ranges from 3 to 5 successive self-report measures. So, despite

all the expectable regular rhythms we might use to model our self-reports, other

less regular events might cause new variations. Nonstationarity could be

contagious, spreading from the mean to the autocorrelation function (Horvatic et

al., 2011).

Psychological experiences are unstable in the long term too

The strategy we mentioned of coarse-graining measures (e.g., in time-

normalization) reflects a more profound belief that fluctuations even out in the long

run. This belief is a core value of the linear model, with its expectations of

stationarity. Indeed, we know from cognitive psychology to beware of the ‘hot hand’

fallacy in which we might easily see patterns where there are none in the long run

(Gilovich et al., 1985). So, wary cognitive psychologists may find it intuitive that, in

the longer run, average behavior starts showing less sequential variation and less

sequential memory. However, the complete opposite can emerge: when we coarse-

grain our measurement series for averages over longer time windows, our

psychological-scientific measurement series can show stronger sequential

variations in the long run—or at least show sequential variations resembling the

shorter-run variations and that persist over a much longer timescale than the

premises of linear modeling suggest (Fig. 2).

Fig. 2. This plot depicts a linear series with ‘fractal structure’ (i.e., 1/f noise). This series

has an autocorrelation that ‘diverges,’ which means that the correlation between the

present and previous values dwindles only very slowly with greater lags, decreasing as a power law of the lag with a fractional exponent. The autocorrelation with this power-law dwindling is not pictured

in this plot, but the example series in the time domain illustrates how contrary to the

common assumption, variance may not necessarily stabilize in the longer run. First, we

discuss in text the practice of using coarse-graining to diminish the ‘noisiness’ of shorter-

timescale variation, but then we discuss how fractal structure may thwart this intuition

by presenting a case in which longer-timescale variation may not be much less than

shorter-timescale variation. The green arrows (from roughly 18 to 28 days and then from

55 to 65 days) highlight shorter timescales, and the green brackets indicate a rough

approximation of the range of values in those roughly 10-day spans. The blue arrows

(from 18 to 48 days and 64 to 104 days) highlight longer timescales, and the blue

brackets show a rough approximation of the range of values in those 30- and 40-day

spans.
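The point of Fig. 2 can be demonstrated numerically: coarse-graining tames white noise but not 1/f noise. The sketch below synthesizes both by assigning power-law amplitudes and random phases in the frequency domain (a standard recipe; the parameters are illustrative) and compares how much variation survives averaging within 64-sample windows:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2 ** 14

def spectral_noise(alpha):
    """Synthesize noise with power spectrum ~ 1/f^alpha via random phases.

    alpha = 0 gives white noise; alpha = 1 gives '1/f' (pink) noise.
    """
    freqs = np.fft.rfftfreq(n)[1:]              # drop the zero frequency
    amps = freqs ** (-alpha / 2)                # amplitude ~ f^(-alpha/2)
    phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
    x = np.fft.irfft(np.concatenate([[0], amps * np.exp(1j * phases)]), n)
    return x / x.std()

def coarse_sd(x, window):
    """SD of window-averages: the variation surviving coarse-graining."""
    means = x[: len(x) // window * window].reshape(-1, window).mean(axis=1)
    return means.std()

white, pink = spectral_noise(0.0), spectral_noise(1.0)

# White noise: averaging 64 samples shrinks the SD roughly eightfold.
# 1/f noise: longer-timescale variation persists far more stubbornly.
ratio_white = coarse_sd(white, 64) / white.std()
ratio_pink = coarse_sd(pink, 64) / pink.std()
```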

The perhaps unintuitive possibility is that longer-term averages may be no

more stable than short-term averages, and the reason is 'long memory,' that is,

initially short-scale variability persisting across long scales.

Long-memory structure appears in self-report measures (Delignières et al.,

2004; Olthof, Hasselman, & Lichtwarck-Aschoff, 2020) as well as in various equally

tried-and-true psychological-scientific measures: response times, lexical decision,

and word naming (Gilden, 2001; Holden et al., 2009; Kello, 2013; Kello et al.,

2008; Van Orden et al., 2003). Long memory entails ‘fractal scaling’ (‘fractal’ shortening ‘fractional’), which means that the autocorrelation of these measures diminishes very slowly with lag, that is, as a scale-invariant power function with a fractional exponent.
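The contrast between exponentially decaying short memory and power-law long memory can be sketched as follows: a 1/f-type series and an AR(1) process matched to the same lag-1 autocorrelation part ways at long lags. All parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2 ** 14

# Long-memory ('fractal') noise synthesized with a 1/f power spectrum.
freqs = np.fft.rfftfreq(n)[1:]
phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
spectrum = np.concatenate([[0], freqs ** -0.5 * np.exp(1j * phases)])
fractal = np.fft.irfft(spectrum, n)

def acf_at(x, k):
    """Sample autocorrelation at a single lag k."""
    x = x - x.mean()
    return np.dot(x[:-k], x[k:]) / np.dot(x, x)

# A short-memory AR(1) process matched to the same lag-1 autocorrelation.
phi = acf_at(fractal, 1)
ar1 = np.zeros(n)
for i in range(1, n):
    ar1[i] = phi * ar1[i - 1] + rng.normal()

# By lag 100 the AR(1) memory has decayed exponentially (phi ** 100) to
# nearly zero, while the fractal series dwindles only as a power law.
far_fractal, far_ar1 = acf_at(fractal, 100), acf_at(ar1, 100)
```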

Memory in our measurements can mean that instability becomes

permanent

This colorful long-memory might initially lull us into the idea that, for instance,

more memory could entail more predictability. However, we should clarify that

here the opposite can be true. Long-memory only means that variability grows

similarly from shorter to longer timescales. Now let us recall how grim our

prospects are in that case: if self-report measures show highly nonstationary mean

and variance and a brief prediction window, then long-memory suggests that what

extends similarly over time is the difficulty in predicting very far. Predictability falls

off every 3 to 5 successive self-report measures as noted before (Olthof, Hasselman,

& Lichtwarck-Aschoff, 2020), but then, long-memory means that predictability

suffers even further, repeatedly across time, becoming ever more unfeasible as

short-run instability echoes out across instabilities over the longer run.

Psychological causes are time-sensitive, but strictly-linear causes are

not

Readers on the fence about nonlinearity might find nothing new in these concerns.

For instance, we have long known about the instability of variance under the term

‘heteroscedasticity’ (Tabachnick & Fidell, 2007). If we have a quick fix, what is the

harm? However, our concern is that fitting our misbehaving measurements into a

shape that linear models will recognize may have diminishing returns. At a certain

point, quick fixes shaving off this or that nonstationarity ignore the possibility that

these nonlinearities reflect deeper truths about real causes underlying

psychological processes. We can squeeze our measurements into a shape that will

fit a linear model only so long before they cease to resemble the actual measured

behavior.

Linear models freeze any effect of history

Linear modeling in psychology suffers from an incompatibility over time:

psychological processes are rooted in experiential past and arcing towards

anticipated futures, and linear modeling is ultimately time-insensitive. As elegant

as linear modeling can be, it always aims at the same additive and time-

independent source of variance.

A question that immediately follows these considerations is: would the

linearly-modeled cause offered above even be recognizable to our knowledge of how

cognition works? Our own self-report would be ‘not necessarily.’ For instance, if our

response-series measured the time taken to read individual words in sequence (e.g.,

Wallot et al., 2014), then the autocorrelation is an expression of how each word-

reading time might depend on previous words in the sequence of words having

been read. Of course, it makes good sense that we read each word considering the

words before it. Moreover, long-memory suggests that each current word-reading

event always carries some distant effect of long, long-past words. The power-law

decay of autocorrelation in long-memory does mean that the current values of our

measure have a dwindling correlation to past values with larger lags. Nonetheless,

the scale-invariant shape of the power-law entails that the autocorrelation dwindles

without ever converging to zero. So, then, yes, we read each word with a memory of

how the story started.

But this linear-modeling story becomes more challenging for how to interpret

the autocorrelation in our theories. The deep problem for strictly-linear psychology

is that linear autocorrelation entails a twofold stricture: first, linearly-

autocorrelated sensitivity to experiential past on the present never changes, and

second, the linear autocorrelation offers no larger context that might—at some

point—redefine effects of past constituent events. To the first part of the linear

stricture, a stable autocorrelation is handy for linear modeling, but trotting out the whole autocorrelation (N–1 coefficients for an N-length measurement) to confirm causal modeling may be of limited theoretical value (Gilden,

2009). The autocorrelation entails that the contribution of activity at past lags is

always independent (Fig. 3A). Here is where the interpretation for our word-

reading example may stop feeling recognizable: a stationary autocorrelation

function with long-memory implies that all past words matter, but they do so

independently. That is, the time that you, the reader, spent reading the word ‘independently’ at the end of the last sentence depended on the time spent reading the words

just before: ‘so,’ ‘do,’ ‘they,’ ‘but,’ ‘matter,’ and so forth. But the catch is that phrases

(or any event longer than a single word) would have no causal force of their own—

and here we catch a glimpse of the context that we will need for any event to mean

anything. For instance, there is no room in a linear autocorrelation to indicate

syntactic structure, such as the way the adjective ‘past’ modifies ‘words.’ Gone from the

linearly-modeled past are regular features of reading.

Fig. 3. Two perspectives about how a measured series is analyzed. (A) The linear

autoregressive perspective takes the premise that each measure in time entails the

summing of random and independent factors. (B) The multifractal perspective takes the

premise that each measure entails interactions among component processes at many

nested timescales.

Context matters, to risk stating the obvious. If a careless writer omitted any

single word, then the linear autocorrelation offers no phrasal context here to

support the reader in comprehending the meaning of each current word in the

greater context of a narrative. Here we have the second constraint of the linear

structure: the autocorrelation function is definable only at one scale (e.g., the scale

of individual words), and implicitly, that rules out using one scale as

context for processing information on another scale (Kelty-Stephen & Wallot,

2017). And the lack of context or nesting of multiple scales is unfortunate: to date,

lexical priming research is clear that human readers depend on the context at

multiple different scales of experience (Troyer & McRae, 2021). Psychological

theory takes it for granted that intelligent use of information depends on

considering the information at one scale through a lens at another scale (Simon,

1969). Mind grows and adapts; linear models do not.

Contextualized meaning is unavailable to linear modeling but may be

available to nonlinear modeling

What we take for granted could be explicit in our models. What we hope leads our

readers off the fence and into the rich potential of nonlinear modeling is precisely

this point: modeling nonlinearity might operationalize these interactions across

scales, the contextualizing of meaning. Indeed, we can agree that context matters in

a broad class of cases, but modeling nonlinearity offers the possibility that the

contextualizations are quantifiable—and testable for anyone doubting that they

impact the measurement. Interactions across timescales are not just ineffable

truths but may find a quantitative expression that can generalize and support

formalism. The scaling relationships we have noticed in the autocorrelations might

not be just the coincidental sum of independently estimable contributions (Fig.

3B). Instead, they might be shadows cast by a thoroughly different nonlinear form

of cause than articulated by linear modeling—one of causes cascading across scale

rather than hopping from one independent point in time to the next. They might be

control parameters governing or predicting the sequence of psychological

experiences as they evolve across time. Furthermore, this last possibility is totally

out of the reach of linear modeling: even the most hierarchical linear model will fail

to define a sequence if only because the linear model is time-symmetric and

assumes order does not matter. That is, linear-modeled context could not lead a

process towards any outcome it had not visited before.

What are our choices then? We see two major options. First, we could just

give up on the idea of contextualized cause because linear models are inapplicable.

Or, second, we could try to model the strength of nonlinearity in our measures. Our

own self-report measure is that the latter option feels more productive. Self-report

measures are not going away, and we have no intention of advocating a

replacement. The same goes for the psychological measures showing similar

structure. In this vein, we have no interest in recommending a wholesale jettisoning

of standard psychological ontology or taxonomies. We have no reason to doubt the

reality of psychological experience as reported to or shared with us by participants

—whether through self-report or through any measurement they consent to. For

instance, the growth of multifractal modeling has not, in geophysics, led to the

proponents advocating the avoidance of traditional labels for multifractal things,

for example, ‘wave,’ ‘wind,’ ‘cloud,’ ‘storm,’ or ‘planet’ (Schertzer & Lovejoy, 2013).

The only doubt we bring is for the analytical framework that we use to explain the

phenomenological reports. This manuscript is an invitation for all interested in

facing up to the limitations of the linear model’s portrayal of our field’s cherished

measures—and in hopping over the fence into the nonlinear territory.

Multifractal modeling allows probing nonlinear causal relationships

across scales

We hope that the preceding text might have convinced some of our readers that

‘nonlinearity’ is not simply a curiosity far afield from respectable, long-standing

psychological research. No, we hope that we might have shown that nonlinearity

might be rooted deep in our daily work. In this vein, we are indebted to Olthof et

al.’s (2020) work on self-report measures, which raises the concern without causing

alarm. The rest of the work aims to positively approach these issues, addressing the

concepts of linear and nonlinear changes in their own right. We will aim to nod to

the psychological examples but will try to keep this discussion on the logic and the

math. We will discuss the math through the example of rainfall. This example may

be disappointingly nonpsychological, but we use it for two reasons: (1) a

nonpsychological example series offers the concepts without cluttering the concepts

with existing psychological theory. Seeing a series that does not provide a

comfortable home for our most cherished intuitions can help us see the logic in

conceptual terms. For instance, we can invent mathematically helpful illustrations

that may not make any sense for this or that specific psychological example. (2)

Rainfall works neatly with the logic of multifractal analysis. Multifractal analysis

involves ‘binning’ our measurement, that is, seeing how much of our measurement

falls into nonoverlapping subsets. Abstract as some of the following math and

concepts may be, we think it may be helpful to consider rainfall as an immensely

tangible example. For instance, the rain can fall into actual bins, and we do not

have to lean both into the abstraction of math and the abstraction of psychological

theory.

The following text is a conceptual description of the logic, along with some

mathematical detail. Readers who find the conceptual treatment incompletely

satisfying may refer to our Supplementary Material examining the multifractality of

a small handful of speech audio waveforms, one of human speech and two of text-

to-speech (TTS) synthesizers reading a brief text (Fig. 4). All raw series and code

are attached for replicating the process.

Fig. 4. Speech waveforms for the phrase “Sherine Valverde and her husband Alessandro

are determined to teach their baby.” (A) Human speaker (20-year-old female). (B)

Text-to-speech (TTS) synthesizer: Voice Dream’s Samantha, standard. (C) TTS

synthesizer: Voice Dream’s Samantha, enhanced.

Nonlinearity: Not simply curviness but a failure to reduce to a sum

The analysis of linear or nonlinear changes over time evaluates whether or not the

measured series can be effectively modeled as a sum of independent random factors

(Fig. 3A). Nonlinear changes can mean that the series is not well-modeled as

merely a sum. The question of whether it is a sum is a foundational mathematical

issue and goes deeper than the relatively superficial question of whether changes

over time ‘look’ linear to the eye. Linear changes over time can include

curvilinearity; some nonlinear trajectories are perfectly compatible with modeling

the series as a sum. Time-series with peaks and valleys can invite a polynomial

model. Psychological examples of polynomial structure include a quadratic serial-

position curve, with a greater likelihood of recall for the earlier and the later items

in a list (Ranjith, 2012), or a linear decay of maze-completion errors by Tolman’s

rats once they were given a reward (Tolman & Honzik, 1930). Polynomial models

include linear effects of time and several powers of the time (e.g., quadratic and

cubic for second and third powers). Critically, polynomials are linear in the growth parameters; they are sums after all, sums of integer powers of time (e.g., linear growth is proportional to ‘time’ raised to the power of 1, i.e., ‘time^1,’ quadratic profiles are just the sum of ‘time^1’ with ‘time^2,’ cubic profiles are ‘time^3 + time^2 + time^1,’ and so forth). Choosing to test nonlinearity as a failure to reduce to a sum or

a linear model is mathematically deeper than just eyeballing the plots. That is, a

curvy plot of data points over time may still reduce to a sum of independent factors, and a plot that ‘seems linear’ carries no guarantee that the underlying process is actually a linear sum.

(i.e., summing parts) can make very many seemingly linear changes with time, but

it can also produce very many curvilinear profiles. So, it is important to distinguish

between ‘seeming linear’ and being a linear sum.
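To make the point concrete, the following sketch (in Python with NumPy; the quadratic series and its coefficients are invented for illustration) shows that ordinary least squares, a strictly linear method, recovers a curvy quadratic trajectory exactly, precisely because the model is a sum of integer powers of time:

```python
import numpy as np

# A curvy (quadratic) trajectory: curvature alone does not make a series
# nonlinear in the modeling sense, because the model fit below is still a sum.
t = np.arange(50, dtype=float)
y = 2.0 + 0.5 * t - 0.01 * t**2  # sum of 'time^0', 'time^1', and 'time^2' terms

# Design matrix of integer powers of time: the model is linear in its
# coefficients even though the fitted curve is not a straight line.
X = np.column_stack([np.ones_like(t), t, t**2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(coef, 6))  # recovers the generating coefficients 2.0, 0.5, -0.01
```

The curve has a peak, yet the fit is perfect: ‘seeming curvy’ and being a linear sum are fully compatible.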

The failure of a measured behavior’s nonlinearity to reduce to a sum has taken on a growing urgency in the psychological sciences (e.g., Riley & van Orden, 2005). Multifractality can provide deeper insights into this failure of a series to reduce to a sum of independent random factors (Fig. 3B). Beyond resemblance to old Gestalt wisdom that wholes

differ from sums of parts, estimates of multifractality have predicted outcomes in

executive function, perception, or cognition, such as in reaction time (Ihlen &

Vereijken, 2010), gaze displacements (Kelty-Stephen & Mirman, 2013), word

reading times (Booth et al., 2018), speech audio waveforms (Hasselman, 2015;

Ward & Kelty-Stephen, 2018), rhythmic finger tapping (Bell et al., 2019), gestural

movements during the conversation (Ashenfelter et al., 2009),

electroencephalography (Kardan, Adam, et al., 2020), and functional magnetic

resonance imaging (Kardan, Layden, et al., 2020). Our purpose is not to review the

empirical meaning of multifractality in psychological terms; this question may not

even be answerable in full at present. Our point is: Multifractality is the logical

consequence of processes that enlist interactions across timescales (Ihlen &

Vereijken, 2010), suggesting that it is essential to processes unfolding at many

rates, such as Gottlieb’s (2002) probabilistic epigenesis. However, the truth would

be better served with a broader set of scholars exploring the role of multifractality

in psychological processes. So, our purpose is to make the method more accessible.

This tutorial introduces multifractality as a mathematical framework helpful

in determining whether and to what degree a series exhibits nonlinear changes over

time. It is by no means the first to introduce multifractality; prior entries to the

multifractal-tutorial literature have sometimes taken a more conceptual

perspective, introducing nonlinearity over time as an interaction across multiple

timescales (Kelty-Stephen et al., 2013). Other tutorials have kept closer to detailing

algorithmic steps through the use of computational codes (Ihlen, 2012). The

present work aims to tread a middle ground, reviewing some of the concepts

implicated in linear and nonlinear changes over time and detailing the

mathematical steps involved. These mathematical steps include multifractal

analysis and surrogate data production for resolving when multifractality entails

nonlinear interactions across timescales. The present work makes the case that

multifractality may be crucial for articulating cause and effect in psychology at

large.

Multifractality: A type of nonlinearity for modeling processes that

develop through interactions across scales

Multifractality, a modeling framework developed in its current form about fifty

years ago (Halsey et al., 1986; Mandelbrot, 1976), is primarily a statistical

description of heterogeneity in how systems change across time. All mathematical

frameworks work by encoding more variability into a symbolic and logical

structure. Multifractality is no exception. What multifractality encodes is the

heterogeneity, and it encodes this heterogeneity as a range—maximum minus

minimum—of fractional exponents. These exponents represent the power-law

growth of proportion and timescale. This relationship between proportion and

timescale is a pervading question for any time we observe a changing system we

want to understand: all time-varying processes vary with time, and we are

constantly dealing with the issue that a smaller sample of the whole process tells us

something but not everything about that whole process. So, the important question

is, “how long do we have to look before we see a representative sample of the time-

varying process?” The proportion of the process we can see will increase the longer

we look, and that proportion increases nonlinearly, that is, with the proportion

increasing as a power function (also called a ‘power law’) of scale, with the

proportion increasing as scale^α. Multifractality becomes useful when there is not

simply one alpha value but when, for various reasons outlined below, there may be

many. That means that multifractality can help us understand how and why our

samples of observations may align with the broader structure of the time-varying

process. In summary, multifractality encodes heterogeneity as the range—

maximum minus minimum—of fractional exponents that govern the power laws

relating the observed proportions of heterogeneous changes to a specific timescale

(i.e., how a change in measurement relates to a proportional change in time).
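As a toy illustration (Python/NumPy; the evenly spread ‘measure’ is hypothetical), the exponent α can be estimated as the slope of log proportion against log scale. A perfectly homogeneous measure yields a single exponent, α = 1; multifractality enters when different regions of a series yield a range of such slopes:

```python
import numpy as np

# Hypothetical, perfectly homogeneous 'rainfall' record: each of N time points
# holds an equal share of the total, so the proportion observed in a window
# grows as a power law of the window's timescale, proportion ~ scale^alpha.
N = 1024
measure = np.ones(N) / N  # proportions summing to 1

scales = np.array([4, 8, 16, 32, 64, 128])
props = np.array([measure[:s].sum() for s in scales])  # proportion per window

# alpha is the slope of log(proportion) against log(scale)
alpha = np.polyfit(np.log(scales), np.log(props), 1)[0]
print(round(alpha, 6))  # ~1.0: a single exponent for a homogeneous measure
```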

Multifractality arose from a long history of scientific curiosity about how fluid

processes generate complex patterns (Richardson, 1926; Turing, 1952) and remains

one of the leading ways to model fluid, nonlinear processes—as initially intended in

hydrodynamics (Schertzer & Lovejoy, 2004) and more recently as a framework for

understanding the fluid-structure of perception, action, and cognition (Dixon et al.,

2012; Kelty-Stephen, 2017; Kelty-Stephen et al., 2021). In what follows, we unpack

both what multifractality is and why it is helpful for quantifying nonlinear changes

in series, with specific examples from perception, action, and cognition.

Multifractality differs from but does not replace mean and standard

deviation

What is the relationship between multifractality, mean, and standard

deviation?

Multifractality sits apart from the more familiar descriptive statistics like mean and

standard deviation—indeed, it does not replace mean and standard deviation. A

good reason for the widespread use of mean and standard deviation is that they

support a wide range of inferential methods to test the effects of many types of

hypothesized causes. However, when our hypotheses about causes begin to probe

the issue of changes over time, mean and standard deviation no longer suffice.

Mean and standard deviation remain helpful and necessary to statistical reasoning but fail to cover more complex relationships that evolve continually.

Therefore, the use of mean and standard deviation alone does not test hypotheses

about how systems change over time, that is, with the sequence, the time-

asymmetry, and the interactions over timescales that nonlinear series can exhibit.

Why are mean and standard deviation not enough to model how a

system changes over time?

Stable mean and standard deviation are necessary to the linear model, but they are

not sufficient for fully specifying a linear model. There is a third and often lesser-

known component composing the linear model, namely, the autocorrelation

(Mandic et al., 2008). Changes with time require us to acknowledge that the linear

model has not just two but three defining features: (1) mean, (2) standard deviation

(i.e., square root of variance), and (3) specification of the linear autocorrelation (or

equivalently the amplitude spectrum of the Fourier transform). The linear

autocorrelation describes how a given time-varying process correlates with past

behavior (e.g., how behavior over the current month resembles behavior over the

past month). As noted above, autocorrelation can appear as a function of regression coefficients across lags, as the amplitude spectrum of the Fourier transform, or as the zero memory of ‘white noise.’
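A minimal sketch (Python/NumPy, on a simulated memoryless series rather than any of the measures discussed) of these two equivalent descriptions, the lagged autocorrelation and the Fourier amplitude spectrum:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=512)  # a simulated, memoryless measurement series


def autocorr(series, lag):
    """Correlation of the series with a copy of itself shifted by `lag`."""
    return np.corrcoef(series[:-lag], series[lag:])[0, 1]


# (3) The autocorrelation function: the third defining feature of the linear
# model, alongside the mean and the standard deviation.
acf = [autocorr(x, k) for k in (1, 2, 5, 10)]

# Equivalently, the amplitude spectrum of the Fourier transform carries the
# same second-order (linear) description of temporal structure.
amplitude = np.abs(np.fft.rfft(x - x.mean()))

print([round(r, 3) for r in acf])  # all near 0 for this memoryless series
print(len(amplitude))  # 257 amplitudes for a 512-sample series
```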

Linear models model measurements over time using the

autocorrelation function

What is entirely linear change in our dependent measures over time?

A linear model applied to a developing process assumes that it changes similarly

over time in the beginning, in the middle, and at the end of that process. This

symmetry over time that linearity assumes appears most clearly in the sinusoidal

waves that the Fourier transform uses to decompose a series—a sinusoidal wave

oscillates around its midpoint, extending into the future exactly as it had

throughout its past.

Ironically, this elegant model for changes over time is locked into repeating

the same changes over time. Perhaps it is not irony so much as a premise so simple that it sounds absurd when stated: the change over time does not itself change.

Fourier-transform’s use of sinusoidal waves to model changes over time reflects the

underlying premise that all changes reverse (i.e., all changes over time balance out), a premise more widely known as ‘regression to the mean’: the idea that what changes over time will hover around the mean (e.g., what goes up must come back down).

Whether we use the linear autocorrelation or the Fourier amplitude spectrum to

quantify changes over time, the entailment is the same: the linear model necessarily

expects a relationship between the past and the present that does not itself change.
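This temporal symmetry can be demonstrated directly: the autocorrelation of any series equals the autocorrelation of the same series reversed in time, so the linear description cannot tell the past from the future. A short sketch (Python/NumPy, on a simulated series):

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.cumsum(rng.normal(size=300))  # a simulated drifting 'measurement'


def autocorr(series, lag):
    return np.corrcoef(series[:-lag], series[lag:])[0, 1]


forward = [autocorr(x, k) for k in (1, 2, 3)]
backward = [autocorr(x[::-1], k) for k in (1, 2, 3)]

# The autocorrelation of the series played backward matches the
# autocorrelation played forward: the linear description is time-symmetric.
print(np.allclose(forward, backward))  # True
```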

When should we use something other than the linear model to

understand changes in a system over time?

It is worth consulting nonlinear methods anytime the linear model fails to exhaust

the empirically observed variability. However, the linear model is an exceptionally

compact and effective statistical framework. Changes that themselves differ over time are problematic, but the linear model is simple. So, the statistical literature often frames

nonlinear problems with linear solutions. Let us imagine that a developing process

exhibits different changes over time at the beginning versus at the end. We might

look for a simple fix to allow us to keep using the linear model—for instance, by

finding a proposed breakpoint where a sudden event brought about an abrupt

change, after which point, behavior followed an entirely different pattern. For

instance, we might take the same set of footpaths on the way to work every day and

back home every day, but an earthquake could suddenly damage or change the

footpaths with trees or structures along the way. After the earthquake, we might

take a radically different set of footpaths to get to and from work every day. So, if it

were our job to determine the best fitting model of our footpaths over very many

days, it might make a lot of sense to just fit one linear model for the pre-earthquake

footsteps and a separate linear model of post-earthquake footsteps. This find-a-

breakpoint strategy has rough-and-ready appeal, but it can become troubling to

generalize it. Crucially, this strategy essentially involves the scientist deploying

their unmodeled awareness of multiple timescales because one timescale of

observation does not generalize to the other.

One way to keep the individual scientist’s fingerprints off the model is fitting more objectively meaningful long-term predictors for different-sized timescales. For instance, if we take the more psychological issue of how people spend money, predicting daily spending behavior might involve looking at

short-term predictors such as individuals’ previous days’ spending behavior.

However, quite apart from these short-term, day-to-day changes, longer-term

trends may be more predictive. For instance, for a college student, spending

behavior may differ vastly during the school year from during the summer. Summer

months may offer the possibility of full-time employment, and so the effect of

previous days’ spending behavior may be entirely different during the longer-term

period of summer versus the rest of the year. Such long-term predictors are often

called ‘seasonal’ and suggestive of cyclic repetition, for example, summer arrives

reliably at the same point on the yearly school calendar. The challenge is that

identifying those long-term predictors requires not only theory and intuition but also accounting for how these long-term predictors vary in meaning. Certainly, once

a student graduates or leaves school, then the cyclic effect of summer may

disappear as they begin to work full-time all year round. At such a point, the

relevant cyclic patterns useful for predicting spending behavior may change. A big

unknown throughout here: do we need to rely on an intelligent modeler to identify

specific independent timescales? Has the breakpoint-finding scientist not just

smudged more of their own interpretive fingerprints on the process? Or does the

time-varying process itself rely on interactions on timescales that are not

necessarily independent?

At least two good reasons exist to wish for an alternative to strictly linear

models of changes over time. First, ‘changes over time’ may be noncyclically

continuous; that is, changes may shift over time without any simple breakpoints.

The lack of cyclicity may be necessary because a system may change without

returning to the initial ‘normal.’ It is essential to underscore that any expectations

of ‘regression to the mean’ result from the linear assumption of temporal

symmetry. The temporal symmetry of linear models means they look the same

played backward into the past as played forward into the future (Lutkepohl, 2013).

However, there is no statistical guarantee that what goes up must come down as

most measured systems grow, mature, and decay (e.g., in climate change or

economic repercussions of a global housing-mortgage crash).

The second reason to wish for an alternative to strictly linear models is that,

even with cyclic change, the changes over time can interact across cycles. For

instance, roughly cyclic periods like the year, the month, and the day can be easily

identified. However, a daily routine may vary considerably across the span of a

month (e.g., around weekly or biweekly salary payments), and it may vary further

across different months in a year (e.g., as holiday bonuses and time off allow

various ways to behave). The necessity to account for the interaction of various

differently scaled factors over time has prompted the need for multifractal

modeling. While there are hierarchical linear models that can estimate interactions

involving short- and long-term effects, they estimate these interactions with the

same expectation of “stationarity just like in simple linear models” (Singer &

Willett, 2003).

Examples of changes over time and how linear models can respond with

progressively less stable autocorrelations

Given the origins of multifractal modeling in fluid dynamics (Mandelbrot, 1974;

Meneveau & Sreenivasan, 1987), an apt example to consider is daily rainfall in a

given region. Daily rainfall can be measured in centimeters to examine how it

changes over time. Reasons for abrupt changes include elevation, humidity, and

temperature. Reasons for more sustained changes can include the seasons and

movement of tectonic plates. Appearance or disappearance of currents, winds, and

vegetation can also impact daily rainfall.

Perfect stability. The linear model aptly applies to changes in daily rainfall

over time under many circumstances. The simplest measurements of daily rainfall

include (1) no rainfall (i.e., perfect drought), or (2) always the same amount of

rainfall (Fig. 5A). In both cases, the present looks perfectly like the past and the

future. Such processes are temporally symmetric. They have perfectly stable means:

drought entails a zero mean for the entire series, and the same amount of rainfall

reflects a stable mean for the entire series. Of note, the standard deviation is zero in

both cases. Note that we cannot easily characterize psychological variability, as in speech audio waveforms, with this profile. So, if only to show the mathematical

possibility of perfect stability, rainfall is a little more accessible an example than

many a psychological measure that we always expect to fluctuate.

Fig. 5. Possible series of daily rainfall in a given region. (A) Perfect stability. The

simplest measurements of daily rainfall include no rainfall (gray line) or always the same

amount of rainfall (red line). (B) White noise. Uncorrelated random variation of daily

rainfall, varying according to white noise. (C) Uniform seasonality. The measurement

area may have a rainy season and, therefore, a cyclical rainfall pattern (e.g., more in

June and July than in April). (D) Irregular seasonality. Rainy seasons can come late one

year or early; they can come early multiple years and late the following year. Also, not

shown here, but wet or dry years can exist in which the rainy season varies in intensity

across years, decades, and centuries.

White noise. Another case perhaps more suitable to areas with more

temperate climates would be (3) uncorrelated random variation of daily rainfall,

varying according to ‘white noise’ (Fig. 5B). White noise is the statistical term for

the product of many independent processes. Calling it ‘white’ reflects an almost

poetic allusion to the fact that some of the earliest uses of the Fourier transforms

involved the application to electromagnetic radiation (i.e., light), and some models

of white light have indicated a broadband contribution of radiation oscillating at

many visible frequencies (Baxandall, 1968; Forgacs et al., 1971). Therefore, in the

long run, white noise epitomizes the temporal symmetry characteristic of linear

changes over time and the regression to the mean. A histogram of a white-noise

process will approach (over long timescales) a Gaussian (or Normal) distribution

with stable mean and standard deviation. This Gaussian profile of white noise is a

close statistical cousin of the binomial distribution that a fair coin would generate

for large samples of progressively longer sequences of coin flips (Box et al., 1986).

Crucially, white noise regresses to the mean in the long run and is uncorrelated in

time (here, ‘uncorrelated’ implies no correlation between rainfall across days,

weeks, or months). In this case, the average rainfall for one day, one week, or one

month is as good a predictor of the next day, week, or month, respectively, as of any

other day, week, or month, respectively. In other words, the average rainfall of one

time period predicts future time periods—in large part because all of this sequence

is statistically the same.
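A small simulation (Python/NumPy; the ‘10 cm mean, 2 cm spread’ rainfall numbers are invented) illustrates both properties of white noise named here, the stable Gaussian profile and the absence of correlation across days:

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented white-noise rainfall: independent day-to-day variation around a
# mean of 10 cm with a spread of 2 cm.
rain = 10 + 2 * rng.normal(size=5000)

# Regression to the mean: the long-run histogram approaches a Gaussian with
# a stable mean and standard deviation.
print(round(rain.mean(), 2), round(rain.std(), 2))  # near 10 and 2

# Uncorrelated in time: yesterday's rainfall tells us nothing about today's,
# so the lag-1 autocorrelation hovers near zero.
r1 = np.corrcoef(rain[:-1], rain[1:])[0, 1]
print(round(r1, 3))
```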

Uniform seasonality. The measurement area may have a rainy season

and, therefore, a cyclical rainfall pattern (e.g., more in June and July than in April;

Fig. 5C). So long as this rainy season begins and ends reliably with the exact dates,

the linear model will produce an adequate description of this rainfall. The changes

over the year would show a peak across months, but in this example, with perfect

timing of the seasons, these changes across months do not change from year to

year.

Irregular seasonality. The above examples are rare cases in practice but

help illustrate the temporal symmetry of linear models. However, rainfall is often

more irregular (Fig. 5D). Rainy seasons can come late one year or early; they can

come early multiple years and late the following year. Also, wet or dry years can

exist in which the rainy season varies in intensity across years, decades, and

centuries. When considering these longer timescales, the Fourier transform can

spread longer and longer sinusoidal waves, and similarly, the linear autocorrelation

can incorporate progressively longer lags or waves. In any event, as rainfall is

measured over long timescales, our linear model can be complicated with

progressively more factors. However, no matter how long the timescale or added

factors, the linear model’s constraint is that the cyclical patterns must be regular

across time.

Irregular seasonality and its potential connection with

nonstationary autocorrelation. This issue of irregular seasonality is

conceptually the same as the nonstationarity highlighted in Fig. 1C, where the

temporal structure varied across time, suggesting that we could need different

autocorrelation functions from the beginning to the middle, and from the middle to

the end of the series. Here we have just that issue noted above, and a Fourier

transform may be calculated, but that may be difficult to interpret. The Fourier

transform will probe a series for its oscillatory modes and estimate each frequency's

amplitudes. However, if the oscillations change over time, the Fourier alone is not

sensitive to that change. The Fourier model (and so the linear model, more

generally) will be definable, but it will not be specific to the structure available in

the measurement.
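A sketch of this failure (Python/NumPy; the two ‘seasons’ are invented sinusoids) shows how a single autocorrelation function misses a series whose oscillation changes midway:

```python
import numpy as np

# Invented 'irregular seasonality': the oscillation changes midway, so no
# single autocorrelation function describes the whole series.
t = np.arange(200)
x = np.concatenate([
    np.sin(2 * np.pi * t / 20),  # a 20-sample 'season'
    np.sin(2 * np.pi * t / 50),  # then a 50-sample 'season'
])


def autocorr(series, lag):
    return np.corrcoef(series[:-lag], series[lag:])[0, 1]


# Lag-10 autocorrelation: half a 20-sample cycle (strong anticorrelation)
# versus a fifth of a 50-sample cycle (positive correlation).
early = autocorr(x[:200], 10)
late = autocorr(x[200:], 10)
print(round(early, 2), round(late, 2))  # roughly -1.0 versus 0.3
```

A Fourier transform of the whole series would average these two regimes together, definable but not specific to either half.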

Nonstationary autocorrelation is a crucial failure of linear modeling

Whereas the first three of our examples above are amenable to linear modeling,

these last two begin to shake the foundations of linear modeling. Certainly,

irregular seasonality might look regular at a longer time scale, and a noisy fit of the

Fourier model or the autocorrelation is not in itself a problem (see next section “No one ever said linear models have to be perfect”). However, the destabilization of

the autocorrelation is the clearest statistical symptom of the linear model losing its

foothold on cause-and-effect relationships. Specifically, unstable autocorrelations

definitely indicate that the effects of past events are changing, and an extremely

interesting possibility is that these changes could depend on events at another

scale. That is to say, long-term seasonality and short-term variation are not fully

separable for many behaving systems.

The choices for how to address this autocorrelational instability are plentiful.

What we choose to do with measurements whose temporal structure changes over

time will follow our theoretical interests. Analyses that blend frequency information

with time information (e.g., when oscillatory modes fade in or out) can begin to

take better stock of the measurement’s structure (Singh et al., 2017). We find

wavelet models like those that blend frequency and time information very

intriguing and useful. And indeed, some of these wavelet models have elaborations

that allow the calculation of multifractal spectra (Ihlen & Vereijken, 2010).

However, theoretical choices have steered us away from these wavelet methods for

two reasons. First, Chhabra and Jensen’s (1989) method does not require the steps

that wavelet-based methods do, that is, the Legendre transformation whose first

derivatives of wavelet-estimated root-mean-square across values of q can conflate

estimation error with multifractal structure (Zamir, 2003). Second, whereas

wavelets still aim to parse a series into the independent contributions of

independent timescales, we use surrogate testing specifically out of an interest in

assessing the strength of interactions across timescales. Surrogate testing is of

course available to wavelet-based estimates of multifractality (Ihlen & Vereijken,

2010), but we only highlight the limitation of wavelet methods as such.

No one ever said linear models have to be perfect

Prediction error is inevitable and expected

We are not claiming that multifractality is what to do because linear models are not

perfectly predicting. Perfectly valid linear models are expected to have error (i.e.,

differences between measured and predicted) across time. After all, no one expects

the empirical record to be perfectly regular. No linear model must thus

demonstrate a perfect fit. For instance, if our rainfall model predicted 26, 3, and 15

cm of rainfall next Monday, Tuesday, and Wednesday, we might find that the

measured rainfall turned out to be 19, 10, and 11 cm. That would entail errors in the

prediction of 7, –7, and 4 cm for those three days. Smaller errors entail more accurate

predictions than larger errors. It would look bizarre for a model to predict perfectly.

And what makes a linear model valid is rather that the nonzero errors exhibit the

lack of temporal structure and so resemble white noise. That is, using the linear

autocorrelation to fit time-variability in the measurement voids any seeming

requirement that raw measurements of the developing process must always have

the same mean and variance. Said another way, the various methods of detrending

and modeling the autocorrelation can vanquish many of our worries about

changing mean or changing variance. It is only the prediction errors that must have

a zero mean, stable variance, and no correlation across time (i.e., autocorrelation coefficients of zero at all nonzero lags). The model

predicting the values of the series itself can have all manner of structure to it (e.g.,

see examples of the autoregressive, integrated, moving-average [ARIMA] models; Box et al., 1974; Box & Jenkins, 1968). Our complaint is thus not with linear

models predicting imperfectly; it is only with linear models whose errors take an

invalid form, that is, errors with time-varying mean, variances, or autocorrelation.

Behind this concern is the threat that the measurement series does not meet

assumptions of ergodicity and so fails to help infer a stable linear cause (Mangalam

& Kelty-Stephen, 2021, 2022).
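For completeness, the error arithmetic of the rainfall example above in Python (NumPy), with a reminder in comments of what would actually validate a linear model:

```python
import numpy as np

# The example's arithmetic: predicted versus measured daily rainfall (cm).
predicted = np.array([26, 3, 15])
measured = np.array([19, 10, 11])

errors = predicted - measured
print(errors.tolist())  # [7, -7, 4]

# What validates a linear model is not zero error but well-behaved error:
# residuals should resemble white noise (mean near zero, stable variance,
# no autocorrelation). Three days is far too short to test that; this block
# only illustrates the error arithmetic itself.
print(round(float(errors.mean()), 2))  # 1.33
```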

Prediction only must be good on average but on the same average

throughout

What this means is that predictions need to be good only on average, with only a fair-coin’s worth of deviation between linear prediction and actual behavior. We may know that the predicted average can change, but linear

predictions only expect the same amount and type of error at any point from

beginning to end of the series. Linear prediction can manage limited degrees of

irregularity in the mean, but it assumes that the irregularity has the same form at

the beginning, middle, and end of the series. Good linear prediction is about

minimizing the variance of predictions around a time-symmetric portrayal of

change with time. That is, linear modeling tackles time-variability by assuming that

the time-variability is the same across time. This concern is a point specifically

about the autocorrelation, which is the linear description of “how a measure

changes with time.” The more a given measurement series’ autocorrelation changes

with time, the weaker a strategy linear prediction becomes.

Linear modeling fails so long as prediction errors change with time

Time-symmetrical models can only explain time-symmetric measurements, and

here is where our patience with the linear model breaks. Most importantly, the

linear model does not permit prediction errors with temporal structure deviating

from a white-noise, regressing-to-the-mean process. We are not speaking here of

simple failures of one or another of the features, for example, trend (failure of

stable mean), heteroscedasticity (failure of stable variance), or linear

autocorrelation (failure of memorylessness; Tabachnick & Fidell, 2007). Indeed, we

have long known about each of these failures. These alone could be met with

polynomial/sinusoidal detrending or logarithmic transforms. Instead, we are

raising the concern that, even when we use the known stopgap measures to

bandage these individual failures over, the residuals may persist in misbehaving,

and the variations in our measurements may reflect change that is not reducible

to independent, additive causes. At a certain point, the issue is that time-

asymmetry is not just a bug of measurement but a persistent feature of our

measurement that our best-fitting and always time-symmetric linear modeling

structures can rarely capture. Furthermore, a deeper issue is that our most ornate

linear models like fractional integration raise mind-boggling questions about how

many causal time-asymmetric factors can be added in, all while ignoring the plain

fact that psychological and biological experiences have a sequence, a developmental

progression to them (Kelty-Stephen & Wallot, 2017; Molenaar, 2008).

A major problem here for psychological science is in aiming for models to

explain and then predict cumulative progression. However, our frequently linear

models are all time-symmetric and time-invariant. Learning, remembering,

forgetting, and reading words in sequence—these processes depend on their

sequence. We know in our psychological theories that organisms should gather

their processes together and then break them down in systematic and context-

dependent ways. However, our frequently linear models only operate in a

framework that suggests that sequence is irrelevant: linear models add outcomes

from constituent causal factors, and adding is commutative, working the same way

backward and forward (Molenaar, 2008). So, as the organism grows and learns

and develops, it must be that new causes come into effect or that old causes find

new ways to participate. If not, then learning and developing are just reshuffling the

same resources every time. However, we speak comfortably about remembering or

forgetting, gaining experience, or losing our faculties. Our theories expect

irreversibility almost as a premise, but the linear modeling strategy is

characteristically an analysis into independent parts in reversible relationships to

each other (Cariani, 1993). It is no wonder that psychological science often finds

itself in the position of finding measurement residuals flying off the rails of our

predictive models (Kelty-Stephen & Dixon, 2012). Time-symmetry cannot explain

time-asymmetric processes: linear models cannot give a voice to psychological

theories of growth and development.

When prediction errors begin to deviate from this uncorrelated random

variation, it signifies a systematic departure of the measurement series from the

linear model. Isolated points in time do not tip us off to something being amiss; it is

a statistical symptom that will only show up as we examine how the linear model

compares to the series in the long run. That departure might be sudden and abrupt;

it may be continuous or intermittent. However, in the long run, the issue is that the

prediction errors could be correlated with themselves across time—across those

same lags we had seen in the autocorrelation function above. It might be tolerable

to measure a short series with a linear model. However, over progressively longer

times, the deviation from linearity becomes progressively more apparent as sums of

independent timescales keep on failing to capture nonsummative interactions

across timescales. This departure of prediction errors from white noise is the

empirical margin within which multifractal methods might help.
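To illustrate that long-run margin, here is a sketch of our own (using a deterministic binomial cascade as a stand-in for interactions across timescales, and a Box-Pierce-style statistic as the whiteness diagnostic; neither choice comes from the article): residuals of a lag-1 linear fit to such a series stay far from white noise, and the departure statistic grows as the series lengthens.

```python
import numpy as np

def binomial_cascade(levels, p=0.7):
    """Deterministic multiplicative cascade: structure at every scale."""
    x = np.array([1.0])
    for _ in range(levels):
        x = np.ravel(np.column_stack([x * p, x * (1 - p)]))
    return x

def box_pierce(x, m=10):
    """Box-Pierce-style statistic on residuals of a fitted lag-1 predictor.

    Near m for white-noise residuals; it grows with n when residual
    autocorrelations persist across the first m lags.
    """
    phi = np.corrcoef(x[:-1], x[1:])[0, 1] * np.std(x[1:]) / np.std(x[:-1])
    resid = x[1:] - phi * x[:-1]
    resid = resid - resid.mean()
    n = len(resid)
    q = 0.0
    for k in range(1, m + 1):
        r_k = np.corrcoef(resid[:-k], resid[k:])[0, 1]
        q += r_k ** 2
    return n * q

rng = np.random.default_rng(3)
# Cascade residuals keep misbehaving, and the statistic grows with length;
# white-noise residuals of the same length stay near m = 10
print(box_pierce(binomial_cascade(8)), box_pierce(binomial_cascade(12)))
print(box_pierce(rng.standard_normal(4096)))
```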

How do we perform multifractal analysis?

The present work uses one of the most straightforward variants of multifractal

analysis—called Chhabra and Jensen’s (1989) direct method (the Appendix at the

end of this article provides the mathematical details of this method). This method

builds on the foundation of ‘bin proportions’ (Fig. 6). We will unpack this idea of bin

proportion as follows: ‘bins’ stand for subsections of the measured series and can

also be called ‘time windows,’ ‘limited samples’ or ‘short snippets’ of the longer

series. The question of concern is how closely any single bin of the series (i.e., any

small subset of measurements over time) resembles other measurements over time

(Fig. 6, top and bottom left). For example, will one bin of the series look like

another bin at some other time in the same measurement? How do these subsets of

measurements vary when looking at different timescales—do measurements in one

bin look like the measurements found in a longer bin?

Fig. 6. Multifractal analysis is founded on the notion of ‘bin proportion,’ P, and proceeds by

examining the statistics of the bin proportion of the series. Multifractal analysis proceeds by

using bin proportion and Shannon entropy of bin proportion to calculate a ‘singularity

strength,’ α, and a ‘Hausdorff dimension,’ f. The measured series appears in top-left panel.

‘Bins’ stand for subsections of the measured series of size L, as schematized in bottom-left

panel. Bin proportion is obtained by dividing “the amount of the measure in one bin” by “the

amount of all the measure across the entire series.” The slope of the linear regression of

logarithmic bin proportion, logP(L), against logarithmic bin size, logL, equals a singularity

strength, as shown in top-right panel. Shannon entropy is negative bin proportion, – P(L),

multiplied by logarithmic bin proportion, logP(L), as shown in bottom-right panel. Shannon

entropy reduces with bin size, and the slope of the linear regression of Shannon

entropy against logarithmic bin size, logL, estimates negative one times the Hausdorff

dimension, f. The symbols ‘L’ and ‘P’ in the bottom-left panel have been included to schematize

the fact that larger bins (i.e., larger ‘L’) have larger bin proportions (i.e., larger ‘P’), and

‘Entropy’ has been printed in smaller and larger text next to the larger and smaller ‘L’ to indicate

the inverse relationship between L and Entropy. In subsequent sections, Shannon entropy will

be replaced with negative Shannon entropy to get rid of the negative sign next to f, and we will

further generalize this α and f with a q parameter that can distinguish between bin proportion

and fluctuation size.

Binning our time series to mimic time windows of observation

We will use the mathematical language of ‘bin proportion’—fractions to express

probabilities—to compare the amount of the measure within bins of different sizes.

Bin proportion is obtained by dividing “the amount of the measure in one bin” by

“the amount of all the measure across the entire series.” The unifying mathematical

question fundamental to multifractal analysis is: “how does bin proportion change

with timescale?” We will thus discuss bin proportion and timescale mathematically

throughout. We will encode bin proportion as P and the timescale as L for the bin's

‘length’ or size (Fig. 6, top right).
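The two regressions schematized in Fig. 6 can be sketched numerically as follows (our illustration, not the authors' code; this unweighted version ignores the q parameter introduced later): regress mean logarithmic bin proportion, logP, against logL for the singularity strength α, and negative Shannon entropy against logL for the Hausdorff dimension f.

```python
import numpy as np

def alpha_and_f(series, bin_sizes):
    """Estimate a singularity strength (alpha) and a Hausdorff dimension (f)
    from bin proportions P at several bin sizes L (unweighted sketch)."""
    series = np.asarray(series, dtype=float)
    total = series.sum()
    log_L, mean_log_P, neg_entropy = [], [], []
    for L in bin_sizes:
        n_bins = len(series) // L
        # Bin proportion: amount of measure per bin over the total measure
        P = series[:n_bins * L].reshape(n_bins, L).sum(axis=1) / total
        log_L.append(np.log(L))
        mean_log_P.append(np.mean(np.log(P)))       # slope vs. logL -> alpha
        neg_entropy.append(np.sum(P * np.log(P)))   # slope vs. logL -> f
    alpha = np.polyfit(log_L, mean_log_P, 1)[0]
    f = np.polyfit(log_L, neg_entropy, 1)[0]
    return alpha, f

# Sanity check on a roughly evenly spread positive series (hypothetical data)
rng = np.random.default_rng(5)
alpha, f = alpha_and_f(rng.uniform(0.5, 1.5, 4096), [8, 16, 32, 64, 128])
print(alpha, f)
```

For a roughly even positive series, both slopes come out near 1, consistent with a one-dimensional, homogeneous measure; heterogeneous, cascade-like series yield other values.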

Multifractal analysis probes three major features of bin proportions.

•It considers the fact that the bin proportion P is sensitive to bin size L, and

the P ~ L relationship allows us to estimate this sensitivity in terms of the

singularity strength α.

•It considers the fact that heterogeneity in bin proportions can be sensitive to

bin size and that the relationship between Shannon entropy of bin

proportions and bin size allows us to estimate this sensitivity in terms of the

‘Hausdorff dimension’ f (Note: Shannon entropy yields –f in this relationship,

and although it is useful to recognize the appearance of Shannon entropy in

this calculation of heterogeneity, the convention in the multifractal analysis is

to use negative Shannon entropy to quantify the positive value f rather than –

f, which the positive Shannon-entropy formula would yield; Halsey et al.,

1986).

•It considers how the items in the earlier two points