Comparison of Time Series from Ecosystems and an Artificial Multi-Agent Network Based on Complexity Measures

Michael Hauhs¹, Jennifer Koch¹, and Holger Lange²

¹ Bayreuth Center for Ecology and Environmental Research (BayCEER), University of Bayreuth, D-95440 Bayreuth, Germany
{michael.hauhs, jennifer.koch}@bitoek.uni-bayreuth.de, http://www.bayceer.uni-bayreuth.de

² Norwegian Forest Research Institute (Skogforsk), N-1432 Ås, Norway
holger.lange@skogforsk.no, http://www.skogforsk.no

Abstract. We investigate ecosystem dynamics by analyzing time series of measured variables. The information content and the complexity of these data are quantified by methods from information theory. When applied to runoff (stream discharge) from catchments, the information/complexity relation reveals a simple non-trivial property for a large ensemble (more than 1800) of time series. This behaviour is so far not understood in hydrology. Using a multi-agent network receiving input resembling rainfall and producing output, we are able to reproduce the observed behaviour for the first time. The reconstruction is based on the identification and subsequent replacement of general patterns in the input. We thus consider runoff dynamics as the expression of an interactive learning problem of agents in an ecosystem.

1 Introduction

Ever since the invention of Tierra [1], Artificial Life at the level of whole (virtual) ecosystems has attracted ALife researchers. A minimal notion of an ecosystem requires that it is open to abiotic fluxes of mass and energy and contains life. In (real) terrestrial ecosystem research, hydrological headwater catchments are considered prototypical examples of whole ecosystems. They are defined as the basins of attraction for rainfall fields, i.e. as the units of water transfer from rainfall to runoff in a stream. They have the advantage that only abiotic criteria and observations are needed to identify and characterise their boundaries. A disadvantage is that they are often large units of study which are difficult to manipulate experimentally.

The input-output relationship between rainfall and runoff in catchments is often assumed to be describable by simple black box models [2]. However, these models capture only some linear short-term aspects of the dynamics. A closer look using nonlinear methods of time series analysis reveals strikingly intricate dynamics, in particular for runoff, on all time scales [3–5]. In this paper, we use methods from information theory to quantify information content and complexity as nonlinear measures of the runoff data. An important class of linear models (autoregressive models and their extensions) is fundamentally unable to reproduce these quantities from the observations, although these measures are short-term only and the fitted models do reproduce simple linear characteristics (autocorrelation structure) by design. Another observation is that the relationship between information and complexity for runoff follows a simple one-parametric curve, a nontrivial property which so far lacks an explanation from hydrological models or process understanding. It is currently unclear whether models based purely on hydrology and abiotic transport are able to reproduce this property. However, taking biological freedoms into account opens up new model classes which focus on behaviour directly.

We thus consider these ecosystems here as an ensemble of interacting organisms (agents), which receives input (rainfall) across the abiotic boundaries, manipulates it, and releases output (runoff). We use the term interaction in a technical sense as defined in [6]. We employ an artificial ecosystem consisting of a multi-agent network to study the influence of interaction upon time series generated by the network. Agents are able to make autonomous decisions depending on their internal strategy parameters. The agents are required to be capable of adapting to their local environment and of migrating between different localities [7], [3]. By following learning strategies that seek to identify repetitive patterns in the input, the agents may maximize their nutrient access or efficiency, e.g. with respect to reproduction, in an evolutionary setting. Identified patterns are "used" (extracted), and the input is thus transformed by substitutions. We will show in the following that pattern substitution is a key ingredient for reproducing the simple property observed in runoff time series.

2 Information and Complexity Measures

2.1 Quantifying information content and complexity in time series

To calculate the values for information content and complexity, time series have to be transformed into a symbol sequence (with λ indicating the size of the alphabet, here always λ = 2). Using the same transformation method renders the values for different time series comparable. The estimation of values for information content and complexity is based on subsequences of a certain length L, called words. For some measures, not only the frequency distribution of these words but also the transition probabilities between them are of interest [4].

An especially suitable information content quantifier for many environmental data sets is the Mean Information Gain (MIG) [4], [8]. This measure quantifies the information gained on average if L-word i is followed by (L+1)-word j, which differs from i only in the last symbol. With the transition probability $p_{L,i \to j} = n_{L+1,j} / n_{L,i}$ and the event frequency $p_{L,ij} = n_{L+1,j} / (N-L+1)$ used to estimate the weighted average, the Mean Information Gain is [8]:

$$H_G = -\sum_{i,j=1}^{\lambda^L} p_{L,ij} \, \log_2 p_{L,i \to j} \tag{1}$$
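As a rough illustration of Eq. (1), the following Python sketch estimates MIG from a symbol sequence. The function name `mean_information_gain` is hypothetical, and the sketch normalises by the number of (L+1)-words actually counted, which is an assumption of this example and may differ slightly from the normalisation used in [8].

```python
import math
from collections import Counter


def mean_information_gain(symbols, L=4):
    """Estimate the Mean Information Gain (in bits) of a symbol sequence."""
    n_ext = len(symbols) - L  # number of (L+1)-words in the sequence
    if n_ext <= 0:
        raise ValueError("sequence too short for the chosen word length")
    # Count every L-word that has a successor symbol, and every (L+1)-word.
    prefixes = Counter(tuple(symbols[k:k + L]) for k in range(n_ext))
    extended = Counter(tuple(symbols[k:k + L + 1]) for k in range(n_ext))
    mig = 0.0
    for word, count in extended.items():
        p_joint = count / n_ext               # event frequency p_{L,ij}
        p_trans = count / prefixes[word[:L]]  # transition probability p_{L,i->j}
        mig -= p_joint * math.log2(p_trans)
    return mig
```

A strictly periodic sequence yields a value of zero, since every word determines its successor, whereas a fair coin sequence yields a value close to one bit per symbol.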

Fluctuation Complexity (FC) [9] accounts for information loss at the transition from L-word i to L-word j. It is the statistical fluctuation of the net information gain:

$$\sigma_{FC}^2 = \sum_{i,j=1}^{\lambda^L} p_{L,ij} \left( \log_2 \frac{p_{L,i}}{p_{L,j}} \right)^2 \tag{2}$$
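Eq. (2) can be sketched in Python along the same lines. Here the transition i → j is taken to be the step between two successive overlapping L-words, and word probabilities are estimated by relative frequencies; both choices, as well as the function name `fluctuation_complexity`, are assumptions of this example.

```python
import math
from collections import Counter


def fluctuation_complexity(symbols, L=4):
    """Estimate the Fluctuation Complexity (in bits squared) of a symbol sequence."""
    words = [tuple(symbols[k:k + L]) for k in range(len(symbols) - L + 1)]
    if len(words) < 2:
        raise ValueError("sequence too short for the chosen word length")
    word_freq = Counter(words)
    # Transitions between successive overlapping L-words.
    transitions = Counter(zip(words, words[1:]))
    n_trans = len(words) - 1
    fc = 0.0
    for (w_i, w_j), count in transitions.items():
        p_ij = count / n_trans
        # The common normalisation cancels in the ratio p_{L,i} / p_{L,j}.
        net_gain = math.log2(word_freq[w_i] / word_freq[w_j])
        fc += p_ij * net_gain ** 2
    return fc
```

For a constant sequence only one word occurs, so the logarithm vanishes and the estimate is zero.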

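Rényi Complexity, as defined in [3], [5], is based on differences of Rényi entropies of conjugated orders:

$$C_R(\alpha) = \frac{2}{L \ln 2 \, (1-\alpha)} \left[ H_R(\alpha) - H_R\!\left(\frac{1}{\alpha}\right) \right], \tag{3}$$

with the Rényi entropy

$$H_R(\alpha) = \frac{1}{1-\alpha} \log_2 \sum_{i=1}^{n} p_i^{\alpha} \quad \text{for } \alpha \neq 1. \tag{4}$$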
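The two definitions can be written down directly. The sketch below assumes that the p_i are the relative frequencies of the L-words and reads Eq. (3) as a prefactor multiplying the difference of the two entropies of conjugated orders; the function names are hypothetical.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (in bits) for a probability vector, alpha != 1."""
    if alpha == 1:
        raise ValueError("alpha = 1 is excluded; the limit is the Shannon entropy")
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


def renyi_complexity(probs, alpha, L):
    """Renyi complexity: scaled difference of entropies of orders alpha and 1/alpha."""
    prefactor = 2.0 / (L * math.log(2) * (1.0 - alpha))
    return prefactor * (renyi_entropy(probs, alpha) - renyi_entropy(probs, 1.0 / alpha))
```

Because the Rényi entropy does not increase with α, the difference and the factor (1 − α) always carry the same sign, so the value is non-negative; it vanishes for a uniform word distribution.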

The short-term dynamics of natural and artificial time series are assessed using the concepts of randomness, information and complexity presented above. The methods were developed in information theory and statistical physics (Symbolic Dynamics, [10]). The information content of a time series is a monotonically increasing but nonlinear function of randomness; thus it is quantified by a first order measure. A second order measure is expected to show low values at a coarse sampling rate (data close to noise), low values as well at a very high sampling rate (redundant measurements), and a maximum somewhere in between [5]. This is in accordance with an intuitive notion of complexity. Here we choose the Fluctuation Complexity (FC), which is based on transition probabilities [9], as such a quantifier.

These two quantities have the desired features, as can be demonstrated e.g. for binary Bernoulli sequences (Fig. 1). MIG increases monotonically but nonlinearly with randomness and is more sensitive to structural changes in the region of low randomness; FC exhibits a maximum and vanishes for constant as well as completely random sequences. The combined result of the randomness and complexity analysis characterises a time series in MIG/FC plots, in which the two measures form the axes (see figures below). The black curve shown in these figures represents the theoretical maximum that can be attained by a random process. These figures will be used to assess the similarity between observed catchment behaviour and the behaviour attained in our simulations.

Any constraints in the dynamics or behavioural patterns will lower the MIG/FC value of a time series. It turns out that if this lowering occurs in a consistent manner across the range of randomness, its distance from the limiting curve is assessed by fitting the exponent α of the Rényi Complexity. In the limit of α = 1 the Rényi Complexity becomes identical with the FC [5].

[Figure 1. x-axis: Randomness [%], 0 to 100; y-axis: Information/Complexity, 0 to 2.0; curves: Complexity (FC) and Information (MIG).]

Fig. 1. Mean Information Gain (MIG, "Information") is chosen as first order measure (information) and Fluctuation Complexity (FC, "Complexity") as second order measure (complexity). These are shown here as functions of a randomness parameter (range p ∈ [0, 1/2], interpreted as a scale of randomness from 0% to 100%) in a binary Bernoulli process [3].

In the subsequent results all time series have been (statically) partitioned at the median into a binary alphabet. The statistics of words of different lengths from this alphabet are then used to calculate first and second order complexity measures and the distances from the limit curve. In all examples below, four-letter words were used.
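The symbolisation step can be sketched as follows. The example partitions a short, made-up series at its median and counts the overlapping four-letter words; the helper names and the convention that values above the median map to 1 are assumptions of this example.

```python
from collections import Counter
from statistics import median


def binarize_at_median(series):
    """Map each value to 1 if it lies above the series median, else to 0."""
    m = median(series)
    return [1 if x > m else 0 for x in series]


def word_counts(symbols, L=4):
    """Count all overlapping words of length L in a symbol sequence."""
    return Counter(tuple(symbols[k:k + L]) for k in range(len(symbols) - L + 1))


runoff = [0.4, 1.2, 3.5, 2.1, 0.9, 0.7, 0.6, 2.8, 1.9, 0.5]  # made-up values
symbols = binarize_at_median(runoff)
counts = word_counts(symbols, L=4)
```

A series of N values yields N − L + 1 overlapping words, here seven words from ten values.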

2.2 Application to runoﬀ data

A window technique was used to apply the measures locally in long-term time series. The window length for runoff data from catchments was typically 4 years at daily resolution. The length of artificial data sets generated without use of the agent network was 250000. The window length for results from the agent network was 3400. We inspected different window lengths from 25 to 20000 (not shown).
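A window technique of this kind might look as follows in Python. The text does not state by how much successive windows are shifted, so the `step` parameter and its value in the example are assumptions; `measure` stands for any function of a window, such as an information or complexity estimator.

```python
def windowed_measure(series, window, step, measure):
    """Apply `measure` to successive windows of fixed length and return the results."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [measure(series[start:start + window])
            for start in range(0, len(series) - window + 1, step)]


# Example: ten years of synthetic daily values, 4-year windows shifted by one year.
daily = [float(day % 365) for day in range(10 * 365)]
means = windowed_measure(daily, window=4 * 365, step=365,
                         measure=lambda w: sum(w) / len(w))
```

Each window then contributes one point to a MIG/FC diagram when the two measures are evaluated on it.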

When the measures are applied to long-term runoff data sets at daily resolution, a unique parametrisation of the Rényi Complexity results (Fig. 2, α = 1.28). So far it had not been possible to reconstruct artificial time series with such a property by stochastic or deterministic generators. One conjecture put forward to explain these difficulties interprets this behaviour as a signature of (indirect) interaction among the organisms within the catchment [11].

Fig. 2. Data sets collected from real-world ecosystems (hydrological catchments). The black curve gives the limiting case for α = 1. Each blue dot represents a long-term runoff data set (>30 years); green triangles are tropical catchments. The red line gives a fit of the Rényi Complexity for α = 1.28 [3].

2.3 Generating the signatures of universal Rényi Complexity with artificial systems

Here we tested this conjecture without and with a multi-agent network. The network is able to simulate a parallel decision process through the action of agents which affect a realised stochastic process, here the supply of external nutrients that limit growth and proliferation of the agent populations. Interaction among the agents in this network is indirect only [12]. Before employing the agent network we tested the effect of a two-step realisation process in which we first realised a time series with specified randomness (by choosing the Bernoulli parameter) and then identified and replaced (enhanced) patterns within these series. This first test without the network was necessary to show that a selective decision was sufficient for reconstruction of the observations and that this had to be done after the random process had been realised. This procedure is described below.

In order to assess the impact of (interactive) decisions on the information and complexity of time series, we generated examples of the Bernoulli process that forms the limiting curve in Figure 2 (for α = 1.0) with 250000 points drawn from a binary alphabet. We then searched these realisations for one or two general patterns (e.g. where the four-letter words, interpreted as integers, were monotonically increasing in 3 (4) subsequent overlapping words). Such local general patterns were replaced either by random sequences or by a unique pattern (a fixed permutation of the original sequence, used for all realisations in the range of randomness) while conserving the occurrences of letters. Figure 3 shows the results for local randomisation (α = 1.06) and pattern replacement (α = 1.32). Depending upon the pre-selected pattern, much higher values of α were possible.
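One possible reading of this two-step procedure is sketched below. It detects spans in which a run of successive overlapping four-letter words, read as binary integers, increases strictly, and rewrites each span either by a random shuffle or by a fixed permutation, both of which conserve the number of each letter. The exact pattern definition, the handling of overlapping matches and all names are assumptions of this example, not the procedure actually used for Figure 3.

```python
import random


def word_values(bits, L=4):
    """Integer value of each overlapping L-word, most significant bit first."""
    return [int("".join(str(b) for b in bits[k:k + L]), 2)
            for k in range(len(bits) - L + 1)]


def replace_patterns(bits, L=4, run=3, fixed_perm=None, rng=None):
    """Rewrite every span covered by `run` strictly increasing successive words."""
    rng = rng or random.Random()
    out = list(bits)
    values = word_values(bits, L)  # patterns are detected on the original series
    span = L + run - 1             # number of symbols covered by `run` words
    k = 0
    while k + run <= len(values):
        if all(values[k + m] < values[k + m + 1] for m in range(run - 1)):
            segment = out[k:k + span]
            if fixed_perm is None:
                rng.shuffle(segment)                       # local randomisation
            else:
                segment = [segment[q] for q in fixed_perm]  # unique pattern
            out[k:k + span] = segment
            k += span  # skip past the replaced span; matches do not overlap
        else:
            k += 1
    return out
```

For instance, the six symbols 0 0 0 1 1 1 contain the words 0001, 0011 and 0111 with values 1, 3 and 7, so the whole span qualifies and is rewritten.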

[Figure 3. x-axis: Mean Information Gain (MIG), 0 to 1; y-axis: Fluctuation Complexity (FC), 0 to 2; legend: Bernoulli-Curve.]

Fig. 3. Artificial time series with varying Rényi Complexity. The dash-dotted line represents the fit of the Rényi Complexity (α = 1.06) to Bernoulli process realisations over a range of randomness in which general patterns were replaced by random sequences. The dotted line (fit of the Rényi Complexity, α = 1.32) is the same for replacement by a fixed permutation of the original sequence. Dots in the lower right corner are not in line with the fitted Rényi curve. Different symbols stand for different Bernoulli parameters.

The resulting artificial time series thus included the complete set of short-range patterns that are produced by the stochastic process, except for one or two that were manually removed afterwards. The deviation in the random (lower right) corner of the diagram is the subject of further investigation. We can now look for a multi-agent simulation in which a similar signature is produced by indirect interaction among agents competing for nutrients. We use the agent network Pool-World for this purpose.

3 Multi-Agent Network: Pool-World

Pool-World [13] is a system in which agents interact indirectly by uptake of limited resources provided by the environment. Agents evolve under selective pressure expressed through the temporal and spatial variability of these resources in input time series. Here the input has been parameterised to represent ranges of simple or random behaviour.

The question is: what is the minimal interactive behaviour within Pool-World needed to generate time series with characteristics similar to those of real ecosystems? The criterion for comparing input to output is restricted to the position of these streams in the MIG/FC diagram (see above). It was shown in [13] that even with only random input of resources the evolving (interacting) agents reveal complex long-range dynamics. Here we study the short-term measures only.

Fig. 4. The network consists of places (dark-grey) between which agents (light-grey) may move. Resources (geometric symbols) have prespecified input functions for each of the places, including zero inputs.

The multi-agent network consists of a number of connected 0-dimensional places (Fig. 4). The agents may move along the connections, but they have to pay for it by using internal resources. Resources are represented by geometric symbols. The external supply of resources is provided by specified input functions, which may be simple, random or complex. Input streams may differ among places and among nutrient types (here indicated by shape). Agents compete for resources by taking them up through input interfaces with an individually specified resource affinity. Affinities may evolve during reproductive events. Agents reproduce non-sexually when a resource threshold is reached. They die when they fail to meet a minimum resource load or when they reach a maximum life span. Resources not taken up are "drained" from places by an exponential export function. Agents are only able to interact indirectly through uptake of resources. They are not equipped with memory in these runs. The only persistent adaptive state is the uptake preference parameter.
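To make these rules concrete, the following toy model follows the description above for a single place and a single resource type. It is not the Pool-World implementation (for which see [7], [13]); the uniform random input, the upkeep cost per step, the mutation of the affinity at reproduction and all parameter names and values are assumptions chosen only for illustration.

```python
import random


def simulate_place(steps, mean_input, seed=0, drain=0.3, threshold=4.0,
                   min_load=0.1, max_age=60, upkeep=0.2):
    """Toy single-place model; returns the outflow and population time series."""
    rng = random.Random(seed)
    pool = 0.0
    agents = [{"load": 1.0, "age": 0, "affinity": 0.5}]
    outflow, population = [], []
    for _ in range(steps):
        pool += rng.uniform(0.0, 2.0 * mean_input)  # random external supply
        rng.shuffle(agents)                         # uptake order varies per step
        for agent in agents:
            uptake = agent["affinity"] * pool       # indirect interaction via the shared pool
            pool -= uptake
            agent["load"] += uptake - upkeep
            agent["age"] += 1
        next_generation = []
        for agent in agents:
            if agent["load"] < min_load or agent["age"] > max_age:
                continue                            # death by starvation or old age
            if agent["load"] >= threshold:          # non-sexual reproduction
                agent["load"] /= 2.0
                affinity = min(1.0, max(0.0, agent["affinity"] + rng.gauss(0.0, 0.05)))
                next_generation.append({"load": agent["load"], "age": 0,
                                        "affinity": affinity})
            next_generation.append(agent)
        agents = next_generation
        drained = drain * pool                      # exponential export of unused resource
        pool -= drained
        outflow.append(drained)
        population.append(len(agents))
    return outflow, population
```

The two returned series correspond to the kinds of output analysed below: the drained resource per step and the population level per step.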

In former runs it had been shown that with these parameter settings specialists and generalists may evolve among agents. The former tend to stay in one place, specialising on one resource type, while the latter tend to move around more freely. The program is run as a Java application (on a single machine). For details see [7], [13].

Here we used simulations with up to four places. The output of "drained" resources was monitored (from each place of the network, but separately for each resource type), and population levels at each place were observed. We ran fifteen scenarios with one place, varying lifetime, reproduction rate, nutrient input fluxes and decay rate of the nutrients (which directly affects nutrient residence time in the places). Reproduction rates in all scenarios were chosen close to 1 in order to keep population levels low. Scenarios with two, three and four places, with different topologies and so-called "desert" places (almost no nutrient input), served to investigate migration. Nutrient-based migration was tested as well as population-based migration. Scenarios with four places included variations of the migration threshold. Altogether we analysed 195 output time series and 44 population time series with information and complexity measures.

Only some variations showed an effect at all. Changing migration from nutrient-based to population-based showed no effect, nor did varying the migration threshold. High decay rates caused the nutrients to leave the places quickly. Variations in lifetime and reproduction rate had especially strong effects: population adaptation to available nutrients happened much faster at higher lifetime and reproduction values.

4 Results

We tested between 1 and 4 places with varying connections and up to three nutrient input fluxes at each place. We used only those parts of the time series that follow the initial transient increases in population levels (Fig. 5). In these later phases the outflow levels are low due to strong competition among agents, migration and adaptation effects. In the typical example shown in Figure 5, long-term levels appear stationary, though bursts occurred (in population and in outflow time series).

[Figure 5. Panels: Outflow "B", Outflow "G", Outflow "R" and Population; x-axis: Data point [× 10^-4]; y-axis: Number.]

Fig. 5. Typical results for outflow and population time series (one place). Initial transient increases were cut; the remaining parts show low nutrient outflow levels, and two nutrient outflows ("B" and "R") show bursts.

[Figure 6. x-axis: Mean Information Gain (MIG), 0 to 1; y-axis: Fluctuation Complexity (FC), 0 to 2; legend: p_Input = 0.05, 0.1, 0.2, 0.5, 1, 2, 5.]

Fig. 6. The black curve gives the limiting case for α = 1. With increasing nutrient input, nutrient outflow positions in the information-complexity diagram shift along the limiting Bernoulli curve from simple to random. Each dot stands for one window of a time series with the corresponding nutrient input.

We show only one typical result here. Many similar series were produced with varying parameters for agents, nutrient inputs and network topologies.

The outflow of resources did not deviate from the maximum curve (α = 1.0) in any of the runs. By varying the input parameter (from 0.05 to 5; the higher the input, the higher the number of different values observed in the output time series) we were able to shift the position of the time series for nutrient export along the limiting curve from simple to random (Fig. 6). In this case we considered only results without any burst phenomena (see for example the bursts of the "R" or "B" nutrients in Figure 5 above). The bursts induced deviations from the limiting curve. These effects were transient. We therefore studied the information and complexity measures that result when two time series from different positions along the limiting curve are concatenated (Fig. 7).

These intermediate positions in the information-complexity diagram display transient behaviour when the window moves between the two concatenated time series and cannot be fitted by a consistent α-value. This effect is very different from the one observed in the catchment data set.

The last figure shows the population time series at various locations (Fig. 8). These time series have a similar characteristic deviation from the limiting curve across the various degrees of randomness (α = 1.22).

[Figure 7. x-axis: Mean Information Gain (MIG), 0 to 1; y-axis: Fluctuation Complexity (FC), 0 to 2.]

Fig. 7. Two different time series from the limit curve (solid dots) concatenated. When the measures are analysed by the window technique, intermediate positions below the limiting curve result. Dashed lines indicate the trajectories taken by transient α values when the window moves from one part of the concatenated time series to the other.

[Figure 8. x-axis: Mean Information Gain (MIG), 0 to 1; y-axis: Fluctuation Complexity (FC), 0 to 2; legend: Bernoulli-Curve, Rényi-Curve (α = 1.22), Population.]

Fig. 8. Population time series (about 40 locations). The dashed line represents the fit of the Rényi Complexity (α = 1.22).

5 Discussion and conclusion

We were able for the first time to artificially generate a signature in an information-complexity diagram similar to that of real-world data sets. However, when using a multi-agent network for this purpose, this signature appeared in the population dynamics of the agents and not in the nutrient export from the network. Clearly, more topologies and agent features could be tried out with this AL simulation. One drawback of the current setup has been that competition among agents was fierce and largely successful: hardly any of the nutrients were exported at all, and export rates were rather low (Fig. 5).

The two-step procedure, in which a random process is realised first and a selection/decision is then applied to it, was so far the only procedure capable of producing one-parametric reductions from the limiting curve. This indicates that the universal behaviour of runoff data is consistent with an interactive (online) influence of biota: they decide, depending on their innate strategies, based on the realised precipitation available. Rainfall is an example of the random process. Once it is realised as a local rainfall pattern, the vegetation decides and thus imprints this decision as a pattern upon the time series. The result is a set of behaviours that can be described as active removal or substitution of some of the realised patterns in the input, i.e. as an active filter. This is in accordance with the recent observation [14] that biological and physical time series from the same environmental system might behave qualitatively differently: whereas biological data exhibit nonlinear deterministic dynamics, the physical variables are best described in a linear stochastic framework. We believe that this can be modelled much more easily interactively.

So far no closed-form (analytical) model of transport processes in catchments has been successful at rigorously describing the physical process. The combination of a stochastic process with an active decision by agents may be a way to show why it has been so difficult to conceptualise water transport in ecosystems as a purely physical process. AL simulations have been criticised for being difficult to compare with observations from real Life. The above example demonstrates how an AL simulation can be linked to Life at the ecosystem scale, and that hydrological data sets can also be used in this context.

References

1. Ray, T. S.: An approach to the synthesis of life. In: Langton, C., Taylor, C., Farmer, J. D. and Rasmussen, S. (eds.): Artificial Life II, Santa Fe Institute Studies in the Sciences of Complexity, XI (1991) 371–408

2. Jakeman, A. J. and Hornberger, G. M.: How Much Complexity is Warranted in a Rainfall-Runoff Model? Water Resources Research, 29 (1993) 2637–2649

3. Lange, H.: Charakterisierung ökosystemarer Zeitreihen mit nichtlinearen Methoden. Bayreuther Forum Ökologie, 70, Bayreuth (1999)

4. Lange, H.: Time series analysis in ecology. Nature - Encyclopedia of Life Sciences (in press)

5. Wolf, F.: Berechnung von Information und Komplexität in Zeitreihen – Analyse des Wasserhaushalts von bewaldeten Einzugsgebieten. Bayreuther Forum Ökologie, 65, Bayreuth (1999)

6. Goldin, D., Smolka, S., Attie, P. and Sonderegger, E.: Turing Machines, Transition Systems, and Interaction. Information and Computation Journal, 194 (2004) 101–128

7. Glotzmann, T.: Ein agentenbasiertes Simulationssystem zur Entwicklung ökosystemarer Szenarien in strukturierten Umgebungen. Bayreuther Forum Ökologie, 102, Bayreuth (2003)

8. Wackerbauer, R., Witt, A., Atmanspacher, H., Kurths, J. and Scheingraber, H.: A Comparative Classification of Complexity Measures. Chaos, Solitons & Fractals 4(1) (1994) 133–173

9. Bates, J. E. and Shepard, H. K.: Measuring complexity using information fluctuation. Physics Letters A 172 (1993) 416–425

10. Lind, D. A. and Marcus, B.: An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, Cambridge (1995)

11. Hauhs, M. and Lange, H.: The modelling approach in ecosystem research and management. In: Horton, G. (ed.): 18th European Simulation Multiconference (2004) 276–282

12. Keil, D. and Goldin, D.: Modeling Indirect Interaction in Open Computational Systems. 1st Int'l Workshop on Theory and Practice of Open Computational Systems (TAPOCS) (2003)

13. Glotzmann, T., Lange, H. and Hauhs, M.: Population Dynamics under Spatially and Temporally Heterogeneous Resource Limitations in Multi-agent Networks. In: Banzhaf, W., Christaller, T., Dittrich, P., Kim, J. T. and Ziegler, J. (eds.): Advances in Artificial Life, 7th European Conference, ECAL 2003 (2003) 328–335

14. Hsieh, C., Glaser, S. M., Lucas, A. J. and Sugihara, G.: Distinguishing random environmental fluctuations from ecological catastrophes for the North Pacific Ocean. Nature 435 (2005) 336–340