Inferring Particle Interaction Physical Models
and Their Dynamical Properties
Ion Matei, Christos Mavridis, John S. Baras and Maksym Zhenirovskyy
Abstract: We propose a framework based on the port-Hamiltonian
modeling formalism aimed at learning interaction
models between particles (or networked systems) and dynamical
properties such as trajectory symmetries and conservation laws
of the ensemble (or swarm). The learning process is based
on approaches and platforms used for large scale optimization
and uses features such as automatic differentiation to compute
gradients of optimization loss functions. We showcase our
approach on the Cucker-Smale particle interaction model,
which is first represented in a port-Hamiltonian form, and
for which we re-discover the interaction model, and learn
dynamical properties that were previously proved analytically.
Our approach has the potential for discovering novel particle
cooperation rules that can be extracted and used in cooperative
control system applications.
I. Introduction
Extracting physical laws that govern a given system from
data is a central challenge in many diverse areas of science
and engineering. Most complex systems can be described
as discrete structures (graphs) with dynamical relations [1].
Such networked systems are ubiquitous and include multi-
body systems, chemical reaction networks, animal and UAV
swarms, and power systems. A fundamental challenge in
complex networked systems is to infer the laws of interaction
between particles and their dynamical properties [1]. The
problem has been approached either by using statistical
learning [2], [3], or by learning the parameters of equations
modeling the system. In [4] symbolic equations are generated
from the numerically calculated derivatives of the system
variables. In [5], [6] the constitutive equations of the physical
components of the system are learned using acausal
representations, while in [7] the order of fractional differential
equations modeling the system is estimated.
A general and powerful geometric framework to
model complex dynamical networked systems is the port-
Hamiltonian modeling formalism [8], [9], [10]. Port-
Hamiltonian systems are based on a known energy function
(Hamiltonian) and the interconnection of atomic structure
elements (e.g. inertias, springs and dampers for mechanical
systems) that interact by exchanging energy. They provide an
energy-consistent description of a physical system, having
the property that a power conservative interconnection of
port-Hamiltonian systems is again a port-Hamiltonian system
[11].
Ion Matei and Maksym Zhenirovskyy are with the Palo Alto Research
Center (PARC), Palo Alto, CA (emails: ion.matei@parc.com,
maksym.zhenirovskyy@parc.com). Christos Mavridis and John S. Baras
are with the Department of Electrical and Computer Engineering and the
Institute for Systems Research, University of Maryland, College Park, MD
(emails: mavridis@umd.edu, baras@isr.umd.edu).
This material is based upon work supported by the Defense Advanced Research
Projects Agency (DARPA) under Agreement No. HR00111990027.
In addition, the port-Hamiltonian system framework is
particularly suited for finding symmetries and conserved
quantities [12]. In particular, it allows one to find conserved
quantities, in addition to the Hamiltonian, called Casimir
functions [9], by examining conditions related to the port-
Hamiltonian system at hand, which can lead to model
simplification (reduction). Moreover, finding parameterized
symmetries, e.g., Lie groups of transformations, can lead to
data generation without experimentation, as well as provide
insight on the modeling equations of the system itself [13],
[14].
In this work, we are interested in models describing the
dynamics of swarms or particle ensembles (e.g. bird flocks),
which have been studied intensively through the years [15],
[16], [17]. We model the system of interacting particles as a
graph topology based on port-Hamiltonian components, and
investigate its dynamical properties, such as discrete symme-
tries of trajectories, Lie groups of invariance transformations
and conservation laws. To showcase our approach we use the
Cucker-Smale (CS) model [16] to generate training and test
data for the learning tasks. We apply large scale optimization
methods implemented on deep learning platforms to learn
the particle interaction model from data, and recover its
dynamical properties. Finally, we compare the results of our
method to the ones derived by the theoretical analysis.
The rest of the manuscript is organized as follows: Section
II introduces the CS dynamical model for particle interac-
tions, the port-Hamiltonian formalism, discrete symmetries
and the Lie groups of invariance transformations. Section
III describes the port-Hamiltonian representation of the CS
interaction model. In Section IV we prove theoretical results
on discrete symmetries, Lie groups of invariance transfor-
mations and conservation laws based on Casimir functions.
Section V describes optimization based learning algorithms
that are used to recover the particle interaction model, the
discrete and Lie symmetry maps and the conserved quanti-
ties. Finally, Section VI concludes the paper.
II. Preliminaries
In this section we first describe the CS dynamical model
used to showcase our approach. We then give a brief description
of the port-Hamiltonian formalism and introduce the notions
of symmetries and Lie groups of invariance transformations.
A. Cucker-Smale Particle Interaction Model
Let $i$ denote a particle in an ensemble of $N$ particles. The
CS particle interaction model [16] is given by $\dot{x}_i = v_i$ and
\[ \dot{v}_i = \frac{1}{N}\sum_{j=1}^{N} G(\|x_i - x_j\|)(v_j - v_i), \]
where a typical choice for the interaction function $G$ is
$G(r) = \frac{1}{(1+r^2)^{\gamma}}$. The above dynamics
ensures velocity alignment of all particles [16], [17]. An
extension of the original model comes from adding a potential
function [16], [17], resulting in the dynamics $\dot{x}_i = v_i$ and
\[ \dot{v}_i = \frac{1}{N}\sum_{j=1}^{N} G(\|x_i - x_j\|)(v_j - v_i) - \frac{1}{N}\sum_{j \neq i} \nabla U(\|x_i - x_j\|), \]
where the potential function takes the form
$U(r) = -C_A e^{-r/l_A} + C_R e^{-r/l_R}$,
with $C_A$, $C_R$, $l_A$, $l_R$ positive scalars. The above model can be
compactly written as
\[ \dot{x} = v, \tag{1} \]
\[ \dot{v} = G(x)v - \nabla U(x), \tag{2} \]
where $[x]_i = x_i$, $[v]_i = v_i$,
$[G(x)]_{i,i} = -\frac{1}{N}\sum_{j \neq i} G(\|x_i - x_j\|)$,
$[G(x)]_{i,j} = \frac{1}{N} G(\|x_i - x_j\|)$ for $i \neq j$, $[U(x)]_{i,i} = 0$, and
$[U(x)]_{i,j} = \frac{1}{N} U(\|x_i - x_j\|)$ for $i \neq j$.
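The per-particle dynamics above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' simulation code: it assumes the one-dimensional case, the Morse-type potential $U(r) = -C_A e^{-r/l_A} + C_R e^{-r/l_R}$, and a hypothetical function name `cs_rhs`; the parameter values in the usage lines are arbitrary.

```python
import math

def cs_rhs(x, v, gamma=0.15, CA=0.0, CR=0.0, lA=1.0, lR=1.0):
    """Right-hand side of the 1-D Cucker-Smale model with a Morse-type potential.

    x, v: lists of particle positions and velocities (length N).
    Returns (dx/dt, dv/dt) as two lists.
    """
    N = len(x)

    def G(r):
        # Interaction weight G(r) = 1 / (1 + r^2)^gamma
        return 1.0 / (1.0 + r * r) ** gamma

    def dU(r):
        # Derivative of U(r) = -CA*exp(-r/lA) + CR*exp(-r/lR) with respect to r
        return (CA / lA) * math.exp(-r / lA) - (CR / lR) * math.exp(-r / lR)

    dv = []
    for i in range(N):
        acc = 0.0
        for j in range(N):
            if j == i:
                continue
            d = x[i] - x[j]
            r = abs(d)
            acc += G(r) * (v[j] - v[i])      # velocity alignment term
            if r > 0.0:
                acc -= dU(r) * (d / r)       # gradient of U(|x_i - x_j|) w.r.t. x_i
        dv.append(acc / N)
    return list(v), dv

# Illustrative values (assumptions, not from the paper's experiments)
x = [0.0, 1.0, 3.0]
v = [1.0, -2.0, 0.5]
dx, dv = cs_rhs(x, v, CA=1.0, CR=0.5, lA=2.0, lR=0.5)
total = sum(dv)   # pairwise terms cancel, so this should vanish up to rounding
```

Because every pairwise contribution enters particles $i$ and $j$ with opposite signs, the accelerations sum to zero, which is a quick sanity check on any implementation of the model.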
B. Port-Hamiltonian Systems
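Consider a finite-dimensional linear state space $\mathcal{X}$
along with a Hamiltonian $H:\mathcal{X}\to\mathbb{R}_+$ defining energy
storage, and a set of pairs of effort and flow variables
$\{(e_i,f_i)\in\mathcal{E}_i\times\mathcal{F}_i,\ i\in\{S,R,P\}\}$, describing ports (ensembles
of elements) that interact by exchanging energy. Then, the
dynamics of a port-Hamiltonian system $\Sigma=(\mathcal{X},H,\mathcal{S},\mathcal{R},\mathcal{P},\mathcal{D})$
are defined by a Dirac structure $\mathcal{D}$ [9], [10] as
\[ (f_S,e_S,f_R,e_R,f_P,e_P)\in\mathcal{D} \Leftrightarrow e_S^T f_S + e_R^T f_R + e_P^T f_P = 0, \]
where (i) $\mathcal{S}=(f_S,e_S)\in\mathcal{F}_S\times\mathcal{E}_S=\mathcal{X}\times\mathcal{X}$ is an energy-storing
port, consisting of the union of all the energy-storing
elements of the system (e.g. inertias and springs in mechanical
systems), satisfying $f_S=-\dot{x}$, $e_S=\frac{\partial H}{\partial x}(x)$, $x\in\mathcal{X}$, such that
$\frac{d}{dt}H=-e_S^T f_S=e_R^T f_R+e_P^T f_P$, (ii) $\mathcal{R}=(f_R,e_R)\in\mathcal{F}_R\times\mathcal{E}_R$ is
an energy-dissipation (resistive) port, consisting of the union
of all the resistive elements of the system (e.g. dampers
in mechanical systems), satisfying $\langle e_R,f_R\rangle\le 0$ and, usually,
an input-output relation $f_R=-R(e_R)$, (iii) $\mathcal{P}=(f_P,e_P)\in\mathcal{F}_P\times\mathcal{E}_P$
is an external port modeling the interaction of the
system with the environment, consisting of a control port
$\mathcal{C}$ and an interconnection port $\mathcal{I}$, and (iv) $\mathcal{D}\subset\mathcal{F}\times\mathcal{E}=
\mathcal{F}_S\times\mathcal{E}_S\times\mathcal{F}_R\times\mathcal{E}_R\times\mathcal{F}_P\times\mathcal{E}_P$ is a central power-conserving
interconnection (energy-routing) structure (e.g. transformers
in electrical systems), satisfying $\langle e,f\rangle=0$, $\forall (f,e)\in\mathcal{D}$, and
$\dim\mathcal{D}=\dim\mathcal{F}$, where $\mathcal{E}=\mathcal{F}^*$, and the duality product $\langle e,f\rangle$
represents power.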
The basic property of port-Hamiltonian systems is that
the power-conserving interconnection of any number of port-
Hamiltonian systems is again a port-Hamiltonian system. An
important and useful special case is the class of input-state-output
port-Hamiltonian systems
\[ \dot{x}=[J(x)-R(x)]\frac{\partial H}{\partial x}(x)+g(x)u, \qquad y=g^T(x)\frac{\partial H}{\partial x}(x), \]
where $u$, $y$ are the input-output pairs
corresponding to the control port $\mathcal{C}$, $J(x)=-J^T(x)$ is
skew-symmetric, while the matrix $R(x)=R^T(x)\succeq 0$ specifies the
resistive structure.
C. Symmetries and Lie Group of Transformations
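Given a differential algebraic equation (DAE) $F(\dot{x},x)=0$
with $x\in\mathcal{X}\subseteq\mathbb{R}^n$, the map $\Psi:\mathcal{X}\to\mathcal{X}$ is a symmetry map
if it is a diffeomorphism and $\hat{x}=\Psi(x)$ is a solution of the
DAE, that is, $F(\dot{\hat{x}},\hat{x})=0$. Therefore the symmetry map must
obey the symmetry condition $F\left(\frac{\partial\Psi(x)}{\partial x}\dot{x},\Psi(x)\right)=0$.
A particular type of symmetry maps are Lie groups of
invariance transformations. The map $\Psi:\mathcal{X}\times S\to\mathcal{X}$, where
$S\subset\mathbb{R}$ is an interval with $0\in S$, along with a composition law
$\varphi:S\times S\to S$, defines a parametrized (Lie) symmetry [12],
and, in particular, a Lie group of invariance transformations,
if for any solution $x$ of the DAE, and for all $\varepsilon\in S$, $\hat{x}(t)=
\Psi(x(t),\varepsilon)$ is also a solution of the system, and (i) $\Psi(x,\varepsilon)$
is smooth in $x$ and analytic in $\varepsilon$, (ii) $\Psi(\cdot,\varepsilon)$ is an injection
for all $\varepsilon\in S$, (iii) $(S,\varphi)$ forms a group with identity element
zero, and $\varphi$ is analytic, (iv) $\Psi(x,0)=x$, $\forall x\in D$, and (v) if
$x_\varepsilon=\Psi(x,\varepsilon)$ and $x_\delta=\Psi(x_\varepsilon,\delta)$, then $x_\delta=\Psi(x,\varphi(\varepsilon,\delta))$. Using
the infinitesimal operator $X=\left.\frac{\partial\Psi(x,\varepsilon)}{\partial\varepsilon}\right|_{\varepsilon=0}\frac{\partial}{\partial x}=\eta(x)\frac{\partial}{\partial x}$, we have
that $\hat{x}=\Psi(x,\varepsilon)=e^{\varepsilon X}x$ and the symmetry condition becomes
$X'F(\dot{x},x)=0$, or
\[ \eta(x)^T\frac{\partial F}{\partial x}(\dot{x},x)+\dot{x}^T\frac{\partial\eta}{\partial x}(x)\frac{\partial F}{\partial\dot{x}}(\dot{x},x)=0. \]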
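As a small worked illustration of conditions (iv) and (v) and of the infinitesimal, consider a uniform translation of all coordinates. This example is chosen here for concreteness and is not taken from the source; $\mathbf{1}$ denotes the vector of all ones.

```latex
\begin{align*}
\Psi(x,\varepsilon) &= x + \varepsilon\,\mathbf{1}, &
\varphi(\varepsilon,\delta) &= \varepsilon + \delta,\\
\Psi(x,0) &= x, &
\Psi(\Psi(x,\varepsilon),\delta) &= x + (\varepsilon+\delta)\,\mathbf{1}
  = \Psi(x,\varphi(\varepsilon,\delta)),\\
\eta(x) &= \left.\frac{\partial \Psi(x,\varepsilon)}{\partial \varepsilon}\right|_{\varepsilon=0}
  = \mathbf{1}, &
\frac{\partial \eta}{\partial x}(x) &= 0 .
\end{align*}
```

Since $\partial\eta/\partial x = 0$, the symmetry condition reduces to $\mathbf{1}^T\frac{\partial F}{\partial x}(\dot{x},x)=0$, which holds whenever $F$ depends on $x$ only through differences of its entries.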
III. Port-Hamiltonian representation of the Cucker-Smale
interaction model
We introduce the notion of generalized mass-spring-damper
(gMSD) components. We typically consider masses
as having one port, and springs and dampers as having
two ports. Ports are component interfaces through which
energy is exchanged. The dynamical representations of the
gMSD components are as follows: mass $\dot{p}=f$, $v=\frac{\partial H}{\partial p}$; spring $\dot{q}=v$,
$f=\frac{\partial H}{\partial q}$; damper $f=R(q)v$. In the case of the mass, $p$ is
the momentum, $f$ is the force acting on the mass, $v$ is the
mass velocity and $H$ is the mass Hamiltonian function. In the
case of the spring, $q$ is the spring elongation (the difference
between the positions at the two ports), $v$ is the relative
velocity, $f$ is the force through the spring and $H$ denotes
the spring's Hamiltonian. In the case of the damper, $f$ is the
force through the damper, $q$ is the relative position of the
damper, $R$ is a resistive term as a function of $q$ and $v$ is the
relative velocity.
Proposition 3.1: The CS model with potential is equivalent
to a fully connected $N$-dimensional network of generalized
mass-spring-dampers, where each node $i$ in the
network is a mass, and each link $(i,j)$ a parallel composition
of a spring and damper. The Hamiltonian functions for
the masses and springs are given by $H(p)=\frac{1}{2}p^Tp$ and
$H(q)=\frac{1}{N}\left[-C_A e^{-\|q\|/l_A}+C_R e^{-\|q\|/l_R}\right]$, respectively, and the resistive
function of the damper is given by $R(q)=\frac{1}{N[1+\|q\|^2]^{\gamma}}$.
We show an example of this result for the one dimensional
case ($p,q\in\mathbb{R}$) and for the 3 particles case. The result
holds for the general case, but the notations become more
cumbersome. The fully connected topology of the gMSD
network is shown in Figure 1. We denote by $H_i$ and $H_{ij}$ the
Hamiltonian functions of the masses and springs, respectively.
We note that since we assume unitary masses, the
momenta are equal to the mass velocities, that is, $p_i=v_i$,
$i\in\{1,2,3\}$.
Fig. 1: Fully connected, 3-dimensional gMSD network
The forces through the links are the sum of the
forces through the dampers and springs, and are given by
$f_{ij}=\frac{\partial H_{ij}}{\partial q_{ij}}+R(q_{ij})(v_i-v_j)$, for $(i,j)\in\{(1,2),(2,3),(3,1)\}$. The
forces through the masses can be expressed as: $f_1=f_{31}-f_{12}$,
$f_2=f_{12}-f_{23}$ and $f_3=f_{23}-f_{31}$. We get the expressions for
the mass momenta dynamics as:
\[ \dot{p}_1=\frac{\partial H_{31}}{\partial q_{31}}-\frac{\partial H_{12}}{\partial q_{12}}+R(q_{31})(v_3-v_1)+R(q_{12})(v_2-v_1), \tag{3} \]
\[ \dot{p}_2=\frac{\partial H_{12}}{\partial q_{12}}-\frac{\partial H_{23}}{\partial q_{23}}+R(q_{12})(v_1-v_2)+R(q_{23})(v_3-v_2), \tag{4} \]
\[ \dot{p}_3=\frac{\partial H_{23}}{\partial q_{23}}-\frac{\partial H_{31}}{\partial q_{31}}+R(q_{23})(v_2-v_3)+R(q_{31})(v_1-v_3). \tag{5} \]
The dynamics for the spring elongations are
\[ \dot{q}_{ij}=v_i-v_j=\frac{\partial H_i}{\partial p_i}-\frac{\partial H_j}{\partial p_j} \tag{6} \]
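for $(i,j)\in\{(1,2),(2,3),(3,1)\}$. To recover the CS model with
potential, we replace the relative positions $q_{ij}$ with the
absolute positions, namely $q_{ij}=q_i-q_j$. Recalling that spring
potentials are symmetric functions, so that their gradients are odd, we get that
\[ \frac{\partial H_{31}}{\partial q_{31}}-\frac{\partial H_{12}}{\partial q_{12}}=-\frac{1}{3}\left(\nabla U(q_1-q_3)+\nabla U(q_1-q_2)\right), \tag{7} \]
\[ \frac{\partial H_{12}}{\partial q_{12}}-\frac{\partial H_{23}}{\partial q_{23}}=-\frac{1}{3}\left(\nabla U(q_2-q_1)+\nabla U(q_2-q_3)\right), \tag{8} \]
\[ \frac{\partial H_{23}}{\partial q_{23}}-\frac{\partial H_{31}}{\partial q_{31}}=-\frac{1}{3}\left(\nabla U(q_3-q_2)+\nabla U(q_3-q_1)\right). \tag{9} \]
Substituting (7)-(9) in (3)-(5), and recalling that under our
assumptions $p_i=v_i$, we recover exactly the CS model with
potential.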
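By introducing the notation $z^T=[p^T,q^T]$, with $p^T=
[p_1,p_2,p_3]$ and $q^T=[q_{12},q_{23},q_{31}]$, the equations (3)-(5) and
(6) can be expressed compactly as
\[ \dot{z}=[J(z)-\mathcal{R}(z)]\frac{\partial H(z)}{\partial z}, \tag{10} \]
where $H(z)=H_1(p_1)+H_2(p_2)+H_3(p_3)+H_{12}(q_{12})+
H_{23}(q_{23})+H_{31}(q_{31})$, and
\[ \mathcal{R}(z)=\begin{bmatrix} R(z) & 0\\ 0 & 0 \end{bmatrix} \]
with
\[ R(z)=\begin{bmatrix}
R(q_{12})+R(q_{31}) & -R(q_{12}) & -R(q_{31})\\
-R(q_{12}) & R(q_{12})+R(q_{23}) & -R(q_{23})\\
-R(q_{31}) & -R(q_{23}) & R(q_{31})+R(q_{23})
\end{bmatrix} \]
and where
\[ J(z)=\begin{bmatrix} 0 & J\\ -J^T & 0 \end{bmatrix},\quad\text{with } J=\begin{bmatrix}
-1 & 0 & 1\\
1 & -1 & 0\\
0 & 1 & -1
\end{bmatrix}. \]
We recognize equation (10) as the typical input-state-output
port-Hamiltonian system [9], [10].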
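The equivalence claimed above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it evaluates the right-hand side of (10) for three particles in one dimension and compares the momentum dynamics with a direct evaluation of the CS model with potential. The function names (`ph_rhs`, `cs_accel`, `dH_spring`) and the parameter values are hypothetical, and the Morse-type potential with a negative attractive term is assumed.

```python
import math

GAMMA, CA, CR, LA, LR = 0.15, 1.0, 0.5, 2.0, 0.5   # illustrative values
N = 3

def R(q):
    # Damper resistive function R(q) = 1 / (N * (1 + q^2)^gamma)
    return 1.0 / (N * (1.0 + q * q) ** GAMMA)

def dH_spring(q):
    # d/dq of H(q) = (1/N) * (-CA*exp(-|q|/lA) + CR*exp(-|q|/lR)); odd in q
    s = math.copysign(1.0, q)
    r = abs(q)
    return s * ((CA / LA) * math.exp(-r / LA) - (CR / LR) * math.exp(-r / LR)) / N

J = [[-1, 0, 1],
     [1, -1, 0],
     [0, 1, -1]]

def ph_rhs(p, q):
    """Evaluate z' = [J(z) - R(z)] dH/dz for z = (p, q), q = (q12, q23, q31)."""
    dHp = list(p)                              # H_i(p_i) = p_i^2 / 2
    dHq = [dH_spring(qk) for qk in q]
    r12, r23, r31 = R(q[0]), R(q[1]), R(q[2])
    Rm = [[r12 + r31, -r12, -r31],
          [-r12, r12 + r23, -r23],
          [-r31, -r23, r31 + r23]]
    dp = [sum(J[i][k] * dHq[k] for k in range(3))
          - sum(Rm[i][k] * dHp[k] for k in range(3)) for i in range(3)]
    dq = [-sum(J[k][i] * dHp[k] for k in range(3)) for i in range(3)]  # -J^T dH/dp
    return dp, dq

def cs_accel(x, v):
    """Direct evaluation of the CS acceleration with potential (1-D)."""
    acc = []
    for i in range(3):
        a = 0.0
        for j in range(3):
            if j != i:
                d = x[i] - x[j]
                a += (v[j] - v[i]) / (1.0 + d * d) ** GAMMA - N * dH_spring(d)
        acc.append(a / N)
    return acc

x = [0.0, 1.0, 3.0]
v = [1.0, -2.0, 0.5]
q = [x[0] - x[1], x[1] - x[2], x[2] - x[0]]     # q12, q23, q31
dp, dq = ph_rhs(v, q)                            # unit masses: p = v
acc = cs_accel(x, v)
```

For this configuration `dp` and `acc` agree to rounding error, and `dq` reproduces the velocity differences of (6). The same data can be used to confirm that the stored energy does not increase, since the skew-symmetric part contributes no power.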
IV. Dynamical properties of the Cucker-Smale model
In this section we introduce a set of maps for which
we demonstrate that they satisfy the required properties
for being symmetry or Lie symmetry maps. In addition,
we introduce a conserved quantity that differs from the
Hamiltonian function. The maps and the conserved quantity
will be rediscovered in the learning section. The symmetry
maps will be introduced for both the original CS model and
its port-Hamiltonian representation. We consider the 1-d case
($p\in\mathbb{R}^N$), since the results can be easily generalized to higher
dimensions.
A. Symmetry maps
The following result introduces a symmetry map for the
CS dynamics in port-Hamiltonian form.
Proposition 4.1: The map $\Gamma(p,q)=(p+\alpha\mathbf{1},q)$ for $\alpha\in\mathbb{R}$
is a symmetry map for the port-Hamiltonian dynamics (10).
For the CS dynamics with potential in its original form (1)-(2),
the symmetry map is slightly different, as shown next.
Proposition 4.2: The map $\Gamma(x,v,t)=(x+\alpha\mathbf{1}t+\beta\mathbf{1},v+
\alpha\mathbf{1},t)$ for $\alpha,\beta\in\mathbb{R}$ is a symmetry map for the CS dynamics
with potential (1)-(2).
B. Lie group of invariance transformations
As introduced in Section II-C, Lie groups of invariance
transformations [18], [19] are a particular type of symmetry
map with the form $\hat{z}=z+\varepsilon\eta(z,t)+O(\varepsilon^2)$. The following
result introduces the infinitesimal of the CS model in the
original form.
Proposition 4.3: The map $\eta(z,t)=\eta(x,v,t)=
\alpha t\,[\mathbf{1}^T,\mathbf{0}^T]^T+[\beta\mathbf{1}^T,\alpha\mathbf{1}^T]^T$, for all $\alpha,\beta\in\mathbb{R}$, is an
infinitesimal for the Lie group of invariance transformations
corresponding to the CS dynamics in its original form.
A similar result holds for the CS model in port-Hamiltonian
form, where the time dependence of the infinitesimal map
is no longer present.
Proposition 4.4: The map $\eta(z)=\eta(p,q)=\alpha\,[\mathbf{1}^T,\mathbf{0}^T]^T$, for
all $\alpha\in\mathbb{R}$, is an infinitesimal for the Lie group of invariance
transformations corresponding to the CS dynamics in
port-Hamiltonian form.
C. Conserved quantities
The port-Hamiltonian representation has the advantage of
providing at least one quantity that is conserved, namely
the Hamiltonian. In addition to the Hamiltonian function,
there are other quantities that are conserved. The following
results introduce such quantities for both the original and
port-Hamiltonian representation of the CS particle dynamics.
Proposition 4.5: The quantity $\mathbf{1}^Tv$ is conserved by the CS
dynamics (1)-(2), that is, $\mathbf{1}^T\dot{v}=0$, for all $t\ge 0$.
We can show similar results in the case of the
port-Hamiltonian representation. We will make use of the Casimir
functions, which represent the conserved quantities for
port-Hamiltonian systems.
Proposition 4.6: Any function of the form $C(p,q)=
\alpha\mathbf{1}^Tp+u^Tq+\beta$, where $u\in\mathrm{Null}(J)$ and $\alpha,\beta\in\mathbb{R}$, is a
conserved quantity for the CS dynamics in port-Hamiltonian
form (10), where $z^T=[p^T,q^T]$.
Remark 4.1: Note that in the 3 particle example, the
matrix $J$ is square and the null spaces of $J$ and $J^T$ are the
same. In general this is not true, since $J\in\mathbb{R}^{N\times M}$, where
$M=N(N-1)/2$. Hence, only the null space of $J^T$ is given
by $\{\alpha\mathbf{1},\ \alpha\in\mathbb{R}\}$.
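To make Remark 4.1 concrete, the following sketch builds a matrix $J$ for a fully connected network with $N>3$ particles and checks the two null-space conditions used in Proposition 4.6. The link ordering (pairs $i<j$) and the helper names are assumptions made here for illustration; the sign convention follows the columns of the 3-particle $J$ for links $(1,2)$ and $(2,3)$.

```python
def incidence(N):
    """Matrix J (N x M, M = N(N-1)/2) for the fully connected network.

    Column k corresponds to link (i, j), i < j, with elongation q_ij = q_i - q_j,
    and carries -1 in row i and +1 in row j.
    """
    pairs = [(i, j) for i in range(N) for j in range(i + 1, N)]
    J = [[0] * len(pairs) for _ in range(N)]
    for k, (i, j) in enumerate(pairs):
        J[i][k] = -1
        J[j][k] = 1
    return J, pairs

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

N = 4
J, pairs = incidence(N)

# The vector of ones lies in the null space of J^T
jt_ones = matvec(transpose(J), [1] * N)

# A closed cycle 0 -> 1 -> 2 -> 0 gives one vector u in the null space of J
u = [0] * len(pairs)
u[pairs.index((0, 1))] = 1
u[pairs.index((1, 2))] = 1
u[pairs.index((0, 2))] = -1      # link (0, 2) is traversed backwards
ju = matvec(J, u)
```

Here $J$ is $4\times 6$, so $\mathrm{Null}(J)$ and $\mathrm{Null}(J^T)$ live in spaces of different dimension, while `jt_ones` and `ju` are both zero vectors.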
V. Learning interaction models and their dynamical
properties
To demonstrate that we can indeed recover the theoretical
results proved in the previous sections, we consider an
example where twenty particles (N=20) evolve according
to the CS dynamics. We consider both the original and
port-Hamiltonian representation of the CS dynamics. The
particles operate in a two dimensional space, that is, the
(relative) position and velocity vectors of each particle have
dimension two. The training data were generated by simu-
lating the CS model with parameter γ=0.15, over the time
interval [0,40] sec, starting with random initial conditions
in the interval [0,10]. A realization of the CS simulation
results is shown in Figure 2, where we plot the particle
speed (norm of the velocity vector).
Fig. 2: Particle speed over time $\|v_i(t)\|$, $i\in\{1,\ldots,N\}$
The structure of the time series used for training is
$z^T=[x^T,y^T,v_x^T,v_y^T]$, where
$x,y,v_x,v_y\in\mathbb{R}^N$. In the port-Hamiltonian representation, the
structure is slightly different, namely $z^T=[p_x^T,p_y^T,q_x^T,q_y^T]$,
where $q_x,q_y\in\mathbb{R}^{N(N-1)/2}$ and $p_x,p_y\in\mathbb{R}^N$. The computation
of the gradients and Jacobians was done using automatic
differentiation. The learning problems were implemented using
the Python package Autograd [20] and the deep learning
platform PyTorch [21], which features automatic differentiation.
A. Particle interaction model
Our first task is to recover the interaction model between
particles. We consider the port-Hamiltonian representation
case, without potential, which can be obtained
by approximating the spring potential function with zero,
by appropriately choosing the parameters of the potential
function. Using the port-Hamiltonian formalism, this task
translates to learning the constitutive equation for a
generalized damper. In particular, we learn $F_{ij}=g(q_{ij}^2;w)\,\dot{q}_{ij}$,
which describes the force acting between two particles $i$, $j$,
where $q_{ij}$ is the relative position between the two particles.
We choose the map $g$ to be a neural network (NN) with
one hidden layer of size 12, whose output is given by
$y=W^{[1]}\tanh\left(W^{[0]}u+b^{[0]}\right)+b^{[1]}$, where the bracketed superscripts
denote the layer number. Hence we have a total of 37
parameters. Note that we can add a ReLU type of activation
on the last layer to impose a non-negative output of the
NN. To learn the parameters of the map $g$, we solve the
optimization problem $\min_w\frac{1}{n}\sum_{i=1}^{n}\|z(t_i)-\hat{z}(t_i;w)\|^2$, where $n$
is the number of time samples, $w=\{W^{[0]},b^{[0]},W^{[1]},b^{[1]}\}$
is the set of optimization variables, and $\hat{z}(t_i;w)$ are time samples
of the solution of (10) with the resistive term defined by
$R(q)=g(\|q\|^2;w)$, and no potential between particles. The
initial positions and velocities were uniformly drawn from the
interval [0,10]. We used the Autograd package and its Adam
algorithm implementation to solve the least squares problem
introduced above. The optimization was set to terminate
when the error reached a value smaller than $10^{-5}$. We compared the
trained interaction model with the "real" interaction model,
as shown in Figure 3.
Fig. 3: Comparison between the "real" (blue) and the trained
(dotted red) particle interaction models
We limited ourselves to relative
distances in $[-35,35]$ since this was the maximum
distance the particles reached between them over time. The
MSE between the trained and the "real" interaction curves
over the interval $[-35,35]$ is $1.3\times 10^{-4}$. We note that there is
some mismatch near zero due to the fact that the particles
never got close enough. Next, we tested the interaction model
on data not used in the training but whose initial conditions
have similar statistics as the initial conditions of the training
data. The test error $MSE_{test}(t_i)=\frac{1}{N}\|z(t_i)-\hat{z}(t_i)\|^2$, where $z(t_i)$, $\hat{z}(t_i)$
designate samples of the time series obtained with the "true"
and learned interaction models, respectively, is shown in
Figure 4. We note that the prediction error stabilizes to a
reasonably small value.
Fig. 4: The MSEs of the velocity vectors for test data
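The parameterization of the generalized damper can be sketched as follows. This is a minimal standard-library illustration of a scalar-input network with 12 hidden units, not the Autograd implementation used in the experiments; the dictionary layout, the initialization range and the function names are assumptions made here, and training (Adam on the trajectory loss) is not shown.

```python
import math
import random

HIDDEN = 12

def init_params(seed=0):
    rng = random.Random(seed)
    return {
        "W0": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)],  # input -> hidden (12 x 1)
        "b0": [0.0] * HIDDEN,                                   # hidden biases (12)
        "W1": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)],  # hidden -> output (1 x 12)
        "b1": 0.0,                                              # output bias (1)
    }

def g(u, w, nonneg=True):
    """y = W1 tanh(W0 u + b0) + b1, optionally passed through a ReLU."""
    hidden = [math.tanh(w0 * u + b0) for w0, b0 in zip(w["W0"], w["b0"])]
    y = sum(w1 * h for w1, h in zip(w["W1"], hidden)) + w["b1"]
    return max(y, 0.0) if nonneg else y

def damper_force(q_ij, dq_ij, w):
    # F_ij = g(q_ij^2; w) * dq_ij for a scalar relative position and velocity
    return g(q_ij * q_ij, w) * dq_ij

w = init_params()
n_params = len(w["W0"]) + len(w["b0"]) + len(w["W1"]) + 1   # 12 + 12 + 12 + 1
```

The parameter count matches the 37 parameters stated above, and the optional ReLU on the output keeps the learned resistive term non-negative, as a damper requires.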
B. Lie group of invariance transformations
The Lie group of transformations $\psi$ has a structure
of the form $\psi(z)=z+\varepsilon\eta(z;w)+O(\varepsilon^2)$, where $\eta(z;w)$ is
the infinitesimal of the transformation [18], [19]. We
consider a linear parameterization of the form $\eta(z;w)=
Az+b$, and the goal is to find the parameters of
the infinitesimal by solving the optimization problem
\[ \min_{A,b}\frac{1}{n\times N}\sum_{i=1}^{n}\left\|\frac{\partial\eta}{\partial z}(z^{(i)})f(z^{(i)})-\frac{\partial f}{\partial z}(z^{(i)})\eta(z^{(i)})\right\|^2, \]
where $n$ denotes the total number of vector samples. The optimization
problem was solved for the port-Hamiltonian representation,
using the Adam algorithm and Autograd to compute the
gradient of the cost function using automatic differentiation.
To improve the speed of the optimization algorithm we
computed offline the values for the maps $f(z)$ and $\frac{\partial f}{\partial z}$ at
each sample of the training data $z^{(i)}$. We generated 50 time
series describing the CS dynamics over the time interval
[0,40] sec, using $M=50$ initial condition vectors uniformly
drawn from [0,10], yielding roughly 5000 data samples.
We stopped the optimization process when the MSE loss
function reached $MSE_{train}=1.1\times 10^{-4}$. As a sanity check, we
looked at the structure of the learned $A$ and $b$. The structure
of $b$ matches what we would expect: equal values
for the first half of the vector (roughly 1.8049) and small
values for the second half ($<10^{-4}$). The entries of $A$, although
small, were not zero, which may be a result of the fact
that we limited the number of optimization iterations.
The test data were generated randomly, in a similar way as
the training data, using a time interval [0,80] sec, yielding
roughly 10000 samples. The longer time interval checks the
time extrapolation as well. As a metric we used the MSE,
applied on trajectories this time. We have two types of
trajectories. The first type, denoted by $z(t)$, is a trajectory
generated by solving the CS differential equations, with
initial conditions obtained by applying the learned symmetry
transformation to the initial conditions of the test data. The
second type, denoted by $\hat{z}(t)$, is obtained by applying the
learned symmetry map on the test data itself. Formally,
we define the metric
\[ MSE_{test}=\frac{1}{n\times M\times N}\sum_{i=1}^{M}\sum_{j=1}^{n}\|z^{(i)}(t_j)-\hat{z}^{(i)}(t_j)\|^2, \]
where $n$ is the number of time samples per time
series, $M$ is the number of time series, and $z^{(i)}(t_j)$ is the
vector of position and velocity coordinates at time $t_j$ of
the time series $i$. We obtained the following MSE for the
test data: $MSE_{test}=6.2\times 10^{-4}$. We also computed the MSE
evolution over time for the trajectories, where the averaging
was taken over the time series indices ($M$ of them) and
entries of the state vector, but not over time. The
result is shown in Figure 5.
Fig. 5: MSE particle velocities over time
We note that the prediction
error accumulates over time, which most likely comes from
the fact that the learned symmetry map was not exact, due
in part to the limited number of optimization iterations.
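The residual inside the loss above can be sketched for a small case. The code below is an illustration under stated assumptions rather than the authors' implementation: it uses three particles in one dimension in port-Hamiltonian coordinates without potential, replaces automatic differentiation by a central finite difference for the Jacobian-vector product $\frac{\partial f}{\partial z}\eta$, and normalizes only by the number of samples. All names are hypothetical.

```python
def f(z):
    """CS vector field in port-Hamiltonian coordinates (1-D, 3 particles, no potential).

    z = [p1, p2, p3, q12, q23, q31].
    """
    p, q = z[:3], z[3:]
    r = [1.0 / (3.0 * (1.0 + qk * qk) ** 0.15) for qk in q]   # R(q12), R(q23), R(q31)
    dp = [r[2] * (p[2] - p[0]) + r[0] * (p[1] - p[0]),
          r[0] * (p[0] - p[1]) + r[1] * (p[2] - p[1]),
          r[1] * (p[1] - p[2]) + r[2] * (p[0] - p[2])]
    dq = [p[0] - p[1], p[1] - p[2], p[2] - p[0]]
    return dp + dq

def eta(z, A, b):
    # Linear parameterization of the infinitesimal: eta(z) = A z + b
    return [sum(a * zk for a, zk in zip(row, z)) + bk for row, bk in zip(A, b)]

def residual(z, A, b, h=1e-6):
    """(d eta/dz) f(z) - (df/dz) eta(z), with df/dz applied by finite differences."""
    e = eta(z, A, b)
    fz = f(z)
    Af = [sum(a * fk for a, fk in zip(row, fz)) for row in A]   # d eta/dz = A
    zp = [zk + h * ek for zk, ek in zip(z, e)]
    zm = [zk - h * ek for zk, ek in zip(z, e)]
    Jf_eta = [(fp - fm) / (2.0 * h) for fp, fm in zip(f(zp), f(zm))]
    return [a - j for a, j in zip(Af, Jf_eta)]

def loss(samples, A, b):
    return sum(sum(r * r for r in residual(z, A, b)) for z in samples) / len(samples)

A0 = [[0.0] * 6 for _ in range(6)]
samples = [[1.0, -2.0, 0.5, -1.0, -2.0, 3.0],
           [0.3, 0.1, -0.7, 0.5, 1.5, -2.0]]
loss_sym = loss(samples, A0, [1.0, 1.0, 1.0, 0.0, 0.0, 0.0])   # eta of Proposition 4.4
loss_bad = loss(samples, A0, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0])   # shifts one momentum only
```

The infinitesimal of Proposition 4.4 drives the residual to zero up to finite-difference error, whereas shifting a single momentum does not, which mirrors the structure reported for the learned $b$.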
C. Symmetry maps
We repeat the learning process for the discrete symmetry
case, using this time the CS model in its original form.
We search for a map $\Gamma$ so that $\hat{z}=\Gamma(z,t)$ is a solution
of the CS ODE $\dot{z}=f(z)$, as well. We assume that the
time remains unchanged by the symmetry, hence no
map for the time is included. We consider a linear
parameterization of the symmetry map, $\Gamma(z,t)=Az+bt+c$,
which includes time dependence as well. To learn the map
parameters, we solve the following optimization problem
\[ \min_{A,b,c}\frac{1}{n\times N}\sum_{i=1}^{n}\left\|\frac{\partial\Gamma}{\partial z}(z^{(i)},t_i)f(z^{(i)})+\frac{\partial\Gamma}{\partial t}(z^{(i)},t_i)-f\left(\Gamma(z^{(i)},t_i)\right)\right\|^2. \]
We used the PyTorch deep learning platform to implement
the optimization process, using the same Adam algorithm
as in the case of the Lie symmetries group. PyTorch features
automatic differentiation as well, but has the advantage that
it can be used with graphics processing units (GPUs), when
the optimization problem can be parallelized. To give an
idea of why PyTorch can be more effective when scaling
up the problem in number of particles, Figure 6 shows a
comparison between the average time for an optimization
iteration of the PyTorch Adam algorithm when using CPU
and GPUs, as a function of the number of particles. We note
that unlike the CPU case, when using GPUs the average
iteration time grows linearly with the number of particles.
In addition, in terms of average iteration time when using
the CPU, PyTorch is superior to Autograd: 4.9 sec for
Autograd versus 2.2 sec for PyTorch for the 20 particle case,
for the same number of training samples.
Fig. 6: Average time for an Adam iteration when using CPU
(blue curve) and GPUs (red curve), as a function of the
number of particles
We use a similar
strategy to generate training and test data, as in the case
of the Lie symmetries group. We stopped the optimization
algorithms when the MSE reached $5.5\times 10^{-5}$. The MSE for
the test data was 0.007. The test data MSE as a function of
time is shown in Figure 7. The same phenomenon of error
accumulation over time is noticed as in the case of the Lie
symmetries group.
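For the linear parameterization, the residual in the loss above has a closed form, since $\partial\Gamma/\partial z = A$ and $\partial\Gamma/\partial t = b$. The sketch below evaluates it for the map of Proposition 4.2 on a small example. It is an illustration under stated assumptions (one dimension, three particles, no potential, hypothetical names and values), not the PyTorch implementation described in the text.

```python
def f(z, gamma=0.15):
    """CS vector field in original coordinates (1-D, no potential).

    z = [x_1, ..., x_N, v_1, ..., v_N].
    """
    N = len(z) // 2
    x, v = z[:N], z[N:]
    dv = [sum((v[j] - v[i]) / (1.0 + (x[i] - x[j]) ** 2) ** gamma
              for j in range(N)) / N
          for i in range(N)]
    return list(v) + dv

def gamma_map(z, t, A, b, c):
    # Linear symmetry map Gamma(z, t) = A z + b t + c
    return [sum(a * zk for a, zk in zip(row, z)) + bk * t + ck
            for row, bk, ck in zip(A, b, c)]

def residual(z, t, A, b, c):
    """dGamma/dz f(z) + dGamma/dt - f(Gamma(z, t)) for the linear map."""
    fz = f(z)
    lhs = [sum(a * fk for a, fk in zip(row, fz)) + bk for row, bk in zip(A, b)]
    rhs = f(gamma_map(z, t, A, b, c))
    return [l - r for l, r in zip(lhs, rhs)]

N = 3
alpha, beta = 0.7, -1.2                                   # illustrative values
I6 = [[1.0 if i == j else 0.0 for j in range(2 * N)] for i in range(2 * N)]
b_sym = [alpha] * N + [0.0] * N                           # x -> x + alpha*t*1 + beta*1
c_sym = [beta] * N + [alpha] * N                          # v -> v + alpha*1
z = [0.0, 1.0, 3.0, 1.0, -2.0, 0.5]
t = 2.5

res_sym = residual(z, t, I6, b_sym, c_sym)
res_bad = residual(z, t, I6, [0.0] * 6, [0.0, 0.0, 0.0, 1.0, 0.0, 0.0])
```

The residual vanishes (up to rounding) for the map of Proposition 4.2, and is clearly nonzero for a map that shifts a single velocity, which is the kind of candidate the optimization has to rule out.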
D. Conservation laws
In this section we demonstrate that we can recover
conserved quantities as introduced in Proposition 4.6,
whose statement can be easily generalized to the two
dimensional case. Namely, the Casimir functions have the
form $C(p_x,p_y,q_x,q_y)=\alpha_x\mathbf{1}^Tp_x+\alpha_y\mathbf{1}^Tp_y+u_x^Tq_x+u_y^Tq_y+
\beta$, with $\alpha_x,\alpha_y,\beta\in\mathbb{R}$ and $u_x,u_y\in\mathrm{Null}(J)$. To learn the Casimir
function $C(z;w)$, we solve the optimization problem
\[ \min_w\frac{1}{n}\sum_{i=1}^{n}\left\|J^T\frac{\partial C}{\partial p_x}(z^{(i)})\right\|^2+\left\|J^T\frac{\partial C}{\partial p_y}(z^{(i)})\right\|^2+\left\|J\frac{\partial C}{\partial q_x}(z^{(i)})\right\|^2+\left\|J\frac{\partial C}{\partial q_y}(z^{(i)})\right\|^2. \]
We considered two types of parameterizations:
a linear parameterization given by $C(p_x,p_y,q_x,q_y)=a_x^Tp_x+
a_y^Tp_y+b_x^Tq_x+b_y^Tq_y$ and a nonlinear parameterization given by
a neural network with a hidden layer of size $2N+N(N-1)$
defined by $C(z)=W^{[1]}\tanh\left(W^{[0]}z+b^{[0]}\right)+b^{[1]}$. For the
linear case, the partial derivatives of $C$ were hard-coded since
they are simple and do not depend on the training data.
Fig. 7: MSE particle velocities over time
We did, however, use Autograd to compute the
gradient of the loss function. We initialized the optimization
variables randomly, and we ran the Adam algorithm for 2500
iterations with a fixed step of 0.001. Each iteration in the
Adam algorithm takes roughly 3 msec. As a sanity check, we
looked at the structure of the learned vectors $a_x$ and $a_y$,
whose entries are shown in Figure 8. We note that we indeed
recovered the expected structure, namely $\alpha_x\mathbf{1}$ and $\alpha_y\mathbf{1}$.
Fig. 8: Entries of vectors $a_x$ and $a_y$
Another sanity check is to plot the evolution of
the Casimir function as a function of time, depicted in Figure
9, showing that it has a constant value of 1077.53. The value
of the Casimir function depends on the initial conditions for
both the training data and the optimization variables.
Fig. 9: Casimir function over time for the linear parameterization
We repeated the learning process for a nonlinear (neural network)
parameterization. In this case, we used Autograd to construct
functions that can be called to compute the partial derivatives
of the Casimir function. In addition to these Jacobians,
we used Autograd to generate the gradient of the loss
function. As a result, each iteration of the Adam algorithm
becomes slower, namely 3 sec. We ran the algorithm for
500 iterations starting from random initial conditions for the
optimization variables, selected around the zero value. The
Casimir function for the nonlinear parameterization, computed
at each point on the state trajectory, is shown in Figure 10,
where we notice that the function takes a constant value of
approximately -0.4641.
Fig. 10: Casimir function over time for the nonlinear parameterization
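The linear case can be reproduced in miniature. The sketch below is a simplified illustration, not the experiment reported above: it treats a single spatial dimension with $N=4$, uses plain gradient descent with hand-coded gradients in place of Adam and Autograd, and assumes a particular link ordering for $J$. The step size and iteration count are arbitrary choices made here.

```python
import random

def incidence(N):
    # Assumed link ordering: pairs (i, j) with i < j, -1 in row i and +1 in row j
    pairs = [(i, j) for i in range(N) for j in range(i + 1, N)]
    J = [[0.0] * len(pairs) for _ in range(N)]
    for k, (i, j) in enumerate(pairs):
        J[i][k], J[j][k] = -1.0, 1.0
    return J

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def casimir_loss(J, a, b):
    # For C = a^T p + b^T q: dC/dp = a and dC/dq = b, so the loss is data independent
    return (sum(r * r for r in matvec(transpose(J), a))
            + sum(r * r for r in matvec(J, b)))

def fit(J, a, b, step=0.05, iters=200):
    """Plain gradient descent on ||J^T a||^2 + ||J b||^2 (the source uses Adam)."""
    Jt = transpose(J)
    for _ in range(iters):
        grad_a = [2.0 * g for g in matvec(J, matvec(Jt, a))]   # 2 J J^T a
        grad_b = [2.0 * g for g in matvec(Jt, matvec(J, b))]   # 2 J^T J b
        a = [ai - step * gi for ai, gi in zip(a, grad_a)]
        b = [bi - step * gi for bi, gi in zip(b, grad_b)]
    return a, b

rng = random.Random(0)
N = 4
J = incidence(N)
a0 = [rng.uniform(-1.0, 1.0) for _ in range(N)]
b0 = [rng.uniform(-1.0, 1.0) for _ in range(N * (N - 1) // 2)]
a, b = fit(J, a0, b0)
final_loss = casimir_loss(J, a, b)
```

After the descent all entries of `a` coincide, which is the $\alpha\mathbf{1}$ structure of Proposition 4.6, and `b` satisfies $Jb\approx 0$. The common value of `a` depends on the random initialization, consistent with the remark that the value of the Casimir function depends on the initial optimization variables.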
VI. Conclusions
In this paper we proposed a framework based on port-
Hamiltonian modeling formalism, aimed at learning inter-
action models between particles and dynamical properties
such as trajectory symmetries and conservation laws of
ensembles (or swarms) using large-scale optimization
approaches. We built upon the Cucker-Smale particle inter-
action model, which we represented in a port-Hamiltonian
form, and for which we re-discovered the interaction model,
and learned the dynamical properties that were previously
proved analytically. Our approach can potentially be used
for discovering novel particle interaction rules which can
lead to new cooperative control system laws. The future
steps will include scaling up the problem to a very large
number of particles, considering non-linear parameterizations
for the symmetry maps, and re-casting the learning tasks in
a form that is compatible with parallel GPU computations
on deep learning platforms. In addition, we will explore if
symbolic computation of Jacobians together with automatic
differentiation of the loss function will lead to a significant
decrease in time per optimization iteration.
References
[1] J.S. Baras. A fresh look at network science: Interdependent multi-
graphs models inspired from statistical physics. In Proceedings of the
6th International Symposium on Communication, Control and Signal
Processing, pages 497–500, May 2014.
[2] F. Lu, M. Zhong, S. Tang, and M. Maggioni. Nonparametric inference
of interaction laws in systems of agents from trajectory data. arXiv
preprint arXiv:1812.06003, 2018.
[3] S.L. Brunton, J.L. Proctor, and J.N. Kutz. Discovering governing
equations from data by sparse identification of nonlinear dynam-
ical systems. Proceedings of the National Academy of Sciences,
113(15):3932–3937, 2016.
[4] J. Bongard and H. Lipson. Automated reverse engineering of nonlinear
dynamical systems. Proceedings of the National Academy of Sciences,
104(24):9943–9948, 2007.
[5] I. Matei, J. de Kleer, and R. Minhas. Learning constitutive equations
of physical components with constraints discovery. In 2018 Annual
American Control Conference (ACC), pages 4819–4824, June 2018.
[6] I. Matei, J. De Kleer, M. Zhenirovskyy, and A. Feldman. Learning
constitutive equations of physical components with predefined feasi-
bility conditions. In 2019 American Control Conference (ACC), pages
922–927, July 2019.
[7] Z. Mao, Z. Li, and G.E. Karniadakis. Nonlocal flocking dynamics:
Learning the fractional order of pdes from particle simulations. arXiv
preprint arXiv:1810.11596, 2018.
[8] A.J. van der Schaft and B.M. Maschke. Port-Hamiltonian systems on
graphs. SIAM Journal on Control and Optimization, 51(2):906–937,
2013.
[9] A.J. van der Schaft. Port-Hamiltonian systems: an introductory survey.
In M. Sanz-Sole, J. Soria, J.L. Varona, and J. Verdera, editors,
Proceedings of the International Congress of Mathematicians Vol. III,
number suppl 2, pages 1339–1365. European Mathematical Society
Publishing House (EMS Ph), 2006.
[10] A.J. van der Schaft and D. Jeltsema. Port-Hamiltonian systems theory:
An introductory overview. Foundations and Trends in Systems and
Control, 1(2-3):173–378, 2014.
[11] J. Cervera, A.J. van der Schaft, and A. Baños. Interconnection of
port-Hamiltonian systems and composition of Dirac structures. Automatica,
43(2):212–225, 2007.
[12] A. Mouchet. Applications of Noether conservation theorem to Hamil-
tonian systems. Annals of Physics, 372, 12 2015.
[13] J. Schwichtenberg. Physics from symmetry. Springer, 2015.
[14] J.S. Baras. Group invariance and symmetries in nonlinear control
and estimation. Nonlinear Control in the Year 2000, A. Isidori, F.
Lamnabhi-Lagarrigue, W. Respondek (Edts.), 1:137–171, December
2000.
[15] C.W. Reynolds. Flocks, herds and schools: A distributed behavioral
model. In ACM SIGGRAPH computer graphics, volume 21, pages
25–34. ACM, 1987.
[16] J.A. Carrillo, M. Fornasier, G. Toscani, and F. Vecil. Particle, kinetic,
and hydrodynamic models of swarming. Birkhäuser Boston, Boston,
2010.
[17] J.A. Carrillo, S. Martin, and V. Panferov. A new interaction potential
for swarming models. Physica D: Nonlinear Phenomena, 260:112–
126, 2013.
[18] G.W. Bluman and S.C. Anco. Symmetry and integration methods for
differential equations. Applied Mathematical Sciences, (154), 2002.
[19] A.F. Cheviakov, G.W. Bluman, and S.C. Anco. Applications of
symmetry methods to partial differential equations. Applied Mathematical
Sciences, (163), 2010.
[20] D. Maclaurin, D. Duvenaud, M. Johnson, and J. Townsend. Autograd.
https://github.com/HIPS/autograd, 2018.
[21] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito,
Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic
differentiation in PyTorch. 2017.