Inferring Particle Interaction Physical Models
and Their Dynamical Properties
Ion Matei, Christos Mavridis, John S. Baras and Maksym Zhenirovskyy
Abstract— We propose a framework based on the port-Hamiltonian modeling formalism aimed at learning interaction models between particles (or networked systems) and dynamical properties of the ensemble (or swarm), such as trajectory symmetries and conservation laws. The learning process is based on approaches and platforms used for large-scale optimization and uses features such as automatic differentiation to compute gradients of optimization loss functions. We showcase our approach on the Cucker-Smale particle interaction model, which is first represented in a port-Hamiltonian form, and for which we re-discover the interaction model and learn dynamical properties that were previously proved analytically. Our approach has the potential for discovering novel particle cooperation rules that can be extracted and used in cooperative control system applications.
I. Introduction
Extracting physical laws that govern a given system from
data is a central challenge in many diverse areas of science
and engineering. Most complex systems can be described
as discrete structures (graphs) with dynamical relations [1].
Such networked systems are ubiquitous and include multi-
body systems, chemical reaction networks, animal and UAV
swarms, and power systems. A fundamental challenge in
complex networked systems is to infer the laws of interaction
between particles and their dynamical properties [1]. The
problem has been approached either by using statistical learning [2], [3], or by learning the parameters of equations modeling the system. In [4], symbolic equations are generated from the numerically calculated derivatives of the system variables. In [5], [6], the constitutive equations of physical components of the system are learned using acausal representations, while in [7] the order of fractional differential equations modeling the system is estimated.
A general and powerful geometric framework to
model complex dynamical networked systems is the port-
Hamiltonian modeling formalism [8], [9], [10]. Port-
Hamiltonian systems are based on a known energy function
(Hamiltonian) and the interconnection of atomic structure
elements (e.g. inertias, springs and dampers for mechanical
systems) that interact by exchanging energy. They provide an
energy-consistent description of a physical system, having the property that a power-conservative interconnection of port-Hamiltonian systems is again a port-Hamiltonian system [11].

Ion Matei and Maksym Zhenirovskyy are with the Palo Alto Research Center (PARC), Palo Alto, CA (emails: ion.matei@parc.com, maksym.zhenirovskyy@parc.com). Christos Mavridis and John S. Baras are with the Department of Electrical and Computer Engineering and the Institute for Systems Research, University of Maryland, College Park, MD (emails: mavridis@umd.edu, baras@isr.umd.edu).
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00111990027.
In addition, the port-Hamiltonian system framework is particularly suited for finding symmetries and conserved quantities [12]. In particular, it allows one to find conserved quantities beyond the Hamiltonian, called Casimir functions [9], by examining conditions related to the port-Hamiltonian system at hand, which can lead to model simplification (reduction). Moreover, finding parameterized symmetries, e.g., Lie groups of transformations, can lead to data generation without experimentation, as well as provide insight into the modeling equations of the system itself [13], [14].
In this work, we are interested in models describing the
dynamics of swarms or particle ensembles (e.g. bird flocks),
which have been studied intensively through the years [15],
[16], [17]. We model the system of interacting particles as a
graph topology based on port-Hamiltonian components, and
investigate its dynamical properties, such as discrete symmetries of trajectories, Lie groups of invariance transformations and conservation laws. To showcase our approach we use the
Cucker-Smale (CS) model [16] to generate training and test
data for the learning tasks. We apply large scale optimization
methods implemented on deep learning platforms to learn
the particle interaction model from data, and recover its
dynamical properties. Finally, we compare the results of our
method to the ones derived by the theoretical analysis.
The rest of the manuscript is organized as follows: Section II introduces the CS dynamical model for particle interactions, the port-Hamiltonian formalism, discrete symmetries and the Lie groups of invariance transformations. Section III describes the port-Hamiltonian representation of the CS interaction model. In Section IV we prove theoretical results on discrete symmetries, Lie groups of invariance transformations and conservation laws based on Casimir functions. Section V describes optimization-based learning algorithms that are used to recover the particle interaction model, the discrete and Lie symmetry maps and the conserved quantities. Finally, Section VI concludes the paper.
II. Preliminaries
In this section we first describe the CS dynamical model used to showcase our approach, then give a brief description of the port-Hamiltonian formalism, and introduce the notions of symmetries and Lie groups of invariance transformations.
A. Cucker-Smale Particle Interaction Model
Let $i$ denote a particle in an ensemble of $N$ particles. The CS particle interaction model [16] is given by $\dot{x}_i = v_i$ and $\dot{v}_i = \frac{1}{N}\sum_{j=1}^{N} G(\|x_i - x_j\|)(v_j - v_i)$, where a typical choice for the interaction function $G$ is $G(r) = \frac{1}{(1+r^2)^{\gamma}}$. The above dynamics ensures velocity alignment of all particles [16], [17]. An extension of the original model comes from adding a potential function [16], [17], resulting in the dynamics $\dot{x}_i = v_i$ and $\dot{v}_i = \frac{1}{N}\sum_{j=1}^{N} G(\|x_i - x_j\|)(v_j - v_i) - \frac{1}{N}\sum_{j \neq i} \nabla U(\|x_i - x_j\|)$, where the potential function takes the form $U(r) = -C_A e^{-r/l_A} + C_R e^{-r/l_R}$, with $C_A, C_R, l_A, l_R$ positive scalars. The above model can be compactly written as
$$\dot{x} = v, \qquad (1)$$
$$\dot{v} = G(x)v - \nabla U(x), \qquad (2)$$
where $[x]_i = x_i$, $[v]_i = v_i$, $[G(x)]_{i,i} = -\frac{1}{N}\sum_{j=1}^{N} G(\|x_i - x_j\|)$, $[G(x)]_{i,j} = \frac{1}{N} G(\|x_i - x_j\|)$ for $i \neq j$, $[\nabla U(x)]_{i,i} = 0$, and $[\nabla U(x)]_{i,j} = \frac{1}{N} \nabla U(\|x_i - x_j\|)$ for $i \neq j$.
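As a concrete reference, the following is a minimal Python sketch of the one-dimensional CS dynamics with potential; the parameter values and the choice of SciPy's solve_ivp integrator are illustrative assumptions, not details from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

N, gamma = 20, 0.15
CA, CR, lA, lR = 1.0, 1.0, 1.0, 1.0   # illustrative potential parameters

def G(r):
    # interaction kernel G(r) = 1 / (1 + r^2)^gamma
    return 1.0 / (1.0 + r**2) ** gamma

def dU(r):
    # U'(r) for U(r) = -CA*exp(-r/lA) + CR*exp(-r/lR)
    return (CA / lA) * np.exp(-r / lA) - (CR / lR) * np.exp(-r / lR)

def cs_rhs(t, z):
    x, v = z[:N], z[N:]
    dx = x[:, None] - x[None, :]                    # pairwise x_i - x_j
    r = np.abs(dx)
    dv = (G(r) * (v[None, :] - v[:, None])).sum(axis=1) / N   # alignment term
    dv -= (np.sign(dx) * dU(r)).sum(axis=1) / N               # potential term
    return np.concatenate([v, dv])

z0 = np.random.uniform(0, 10, 2 * N)                # random x(0), v(0) in [0, 10]
sol = solve_ivp(cs_rhs, (0, 40), z0, max_step=0.1)
```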
B. Port-Hamiltonian Systems
Consider a finite-dimensional linear state space $\mathcal{X}$ along with a Hamiltonian $H: \mathcal{X} \to \mathbb{R}_+$ defining energy storage, and a set of pairs of effort and flow variables $\{(e_i, f_i) \in \mathcal{E}_i \times \mathcal{F}_i,\ i \in \{S, R, P\}\}$, describing ports (ensembles of elements) that interact by exchanging energy. Then the dynamics of a port-Hamiltonian system $\Sigma = (\mathcal{X}, H, S, R, P, \mathcal{D})$ are defined by a Dirac structure $\mathcal{D}$ [9], [10] as
$$(f_S, e_S, f_R, e_R, f_P, e_P) \in \mathcal{D} \Leftrightarrow e_S^T f_S + e_R^T f_R + e_P^T f_P = 0,$$
where (i) $S = (f_S, e_S) \in \mathcal{F}_S \times \mathcal{E}_S = \mathcal{X} \times \mathcal{X}$ is an energy-storing port, consisting of the union of all the energy-storing elements of the system (e.g. inertias and springs in mechanical systems), satisfying $f_S = -\dot{x}$, $e_S = \frac{\partial H}{\partial x}(x)$, $x \in \mathcal{X}$, such that $\frac{d}{dt} H = -e_S^T f_S = e_R^T f_R + e_P^T f_P$; (ii) $R = (f_R, e_R) \in \mathcal{F}_R \times \mathcal{E}_R$ is an energy-dissipation (resistive) port, consisting of the union of all the resistive elements of the system (e.g. dampers in mechanical systems), satisfying $\langle e_R, f_R \rangle \leq 0$ and, usually, an input-output relation $f_R = -R(e_R)$; (iii) $P = (f_P, e_P) \in \mathcal{F}_P \times \mathcal{E}_P$ is an external port modeling the interaction of the system with the environment, consisting of a control port $C$ and an interconnection port $I$; and (iv) $\mathcal{D} \subset \mathcal{F} \times \mathcal{E} = \mathcal{F}_S \times \mathcal{E}_S \times \mathcal{F}_R \times \mathcal{E}_R \times \mathcal{F}_P \times \mathcal{E}_P$ is a central power-conserving interconnection (energy-routing) structure (e.g. transformers in electrical systems), satisfying $\langle e, f \rangle = 0$, $\forall (f, e) \in \mathcal{D}$, and $\dim \mathcal{D} = \dim \mathcal{F}$, where $\mathcal{E} = \mathcal{F}^*$, and the duality product $\langle e, f \rangle$ represents power.

The basic property of port-Hamiltonian systems is that the power-conserving interconnection of any number of port-Hamiltonian systems is again a port-Hamiltonian system. An important and useful special case is the class of input-state-output port-Hamiltonian systems $\dot{x} = [J(x) - R(x)] \frac{\partial H}{\partial x}(x) + g(x)u$, $y = g^T(x) \frac{\partial H}{\partial x}(x)$, where $u, y$ are the input-output pairs corresponding to the control port $C$, $J(x) = -J^T(x)$ is skew-symmetric, while the matrix $R(x) = R^T(x) \geq 0$ specifies the resistive structure.
C. Symmetries and Lie Group of Transformations
Given a differential algebraic equation (DAE) $F(\dot{x}, x) = 0$ with $x \in \mathcal{X} \subseteq \mathbb{R}^n$, the map $\Psi: \mathcal{X} \to \mathcal{X}$ is a symmetry map if it is a diffeomorphism and $\hat{x} = \Psi(x)$ is a solution of the DAE, that is, $F(\dot{\hat{x}}, \hat{x}) = 0$. Therefore the symmetry map must obey the symmetry condition $F\left(\frac{\partial \Psi(x)}{\partial x} \dot{x}, \Psi(x)\right) = 0$.

A particular type of symmetry maps are Lie groups of invariance transformations. The map $\Psi: \mathcal{X} \times S \to \mathcal{X}$, where $S \subset \mathbb{R}$ is an interval with $0 \in S$, along with a composition law $\varphi: S \times S \to S$, defines a parametrized (Lie) symmetry [12], and, in particular, a Lie group of invariance transformations, if for any solution $x$ of the DAE and for all $\epsilon \in S$, $\hat{x}(t) = \Psi(x(t), \epsilon)$ is also a solution of the system, and (i) $\Psi(x, \epsilon)$ is smooth in $x$ and analytic in $\epsilon$; (ii) $\Psi(\cdot, \epsilon)$ is an injection for all $\epsilon \in S$; (iii) $(S, \varphi)$ forms a group with identity element zero, and $\varphi$ is analytic; (iv) $\Psi(x, 0) = x$, $\forall x \in \mathcal{X}$; and (v) if $x_\epsilon = \Psi(x, \epsilon)$ and $x_\delta = \Psi(x_\epsilon, \delta)$, then $x_\delta = \Psi(x, \varphi(\epsilon, \delta))$. Using the infinitesimal operator $X = \frac{\partial \Psi(x, \epsilon)}{\partial \epsilon}\Big|_{\epsilon=0} \frac{\partial}{\partial x} = \eta(x) \frac{\partial}{\partial x}$, we have that $\hat{x} = \Psi(x, \epsilon) = e^{\epsilon X} x$, and the symmetry condition becomes $X' F(\dot{x}, x) = 0$ (with $X'$ the prolongation of $X$ to $(\dot{x}, x)$), or $\eta(x)^T \frac{\partial F}{\partial x}(\dot{x}, x) + \dot{x}^T \frac{\partial \eta}{\partial x}(x) \frac{\partial F}{\partial \dot{x}}(\dot{x}, x) = 0$.
III. Port-Hamiltonian representation of the Cucker-Smale interaction model
We introduce the notion of generalized mass-spring-damper (gMSD) components. We typically consider masses as having one port, and springs and dampers as having two ports. Ports are component interfaces through which energy is exchanged. The dynamical representations of the gMSD components are as follows: mass $\dot{p} = f$, $v = \frac{\partial H}{\partial p}$; spring $\dot{q} = v$, $f = \frac{\partial H}{\partial q}$; damper $f = R(q)v$. In the case of the mass, $p$ is the momentum, $f$ is the force acting on the mass, $v$ is the mass velocity and $H$ is the mass Hamiltonian function. In the case of the spring, $q$ is the spring elongation (the difference between the positions at the two ports), $v$ is the relative velocity, $f$ is the force through the spring and $H$ denotes the spring's Hamiltonian. In the case of the damper, $f$ is the force through the damper, $q$ is the relative position of the damper, $R$ is a resistive term as a function of $q$, and $v$ is the relative velocity.

Proposition 3.1: The CS model with potential is equivalent to a fully connected $N$-dimensional network of generalized mass-spring-dampers, where each node $i$ in the network is a mass, and each link $(i, j)$ is a parallel composition of a spring and a damper. The Hamiltonian functions for the masses and springs are given by $H(p) = \frac{1}{2} p^T p$ and $H(q) = \frac{1}{N}\left[-C_A e^{-\|q\|/l_A} + C_R e^{-\|q\|/l_R}\right]$, respectively, and the resistive function of the damper is given by $R(q) = \frac{1}{N\left[1 + \|q\|^2\right]^{\gamma}}$.
We show an example of this result for the one-dimensional case ($p, q \in \mathbb{R}$) and for the 3-particle case. The result holds for the general case, but the notation becomes more cumbersome. The fully connected topology of the gMSD network is shown in Figure 1.

Fig. 1: Fully connected, 3-dimensional gMSD network

We denote by $H_i$ and $H_{ij}$ the Hamiltonian functions of the masses and springs, respectively. We note that since we assume unitary masses, the momenta are equal to the mass velocities, that is, $p_i = v_i$, $i \in \{1, 2, 3\}$. The forces through the links are the sums of the forces through the dampers and springs, and are given by $f_{ij} = \frac{\partial H_{ij}}{\partial q_{ij}} + R(q_{ij})(v_i - v_j)$ for $(i, j) \in \{(1,2), (2,3), (3,1)\}$. The forces through the masses can be expressed as $f_1 = f_{31} - f_{12}$, $f_2 = f_{12} - f_{23}$, and $f_3 = f_{23} - f_{31}$. We get the expressions for the mass momenta dynamics as
the mass momenta dynamics as:
˙p1=∂H31
∂q31
−∂H12
∂q12
+R(q31)(v3−v1)+R(q12 )(v2−v1),(3)
˙p2=∂H12
∂q12
−∂H23
∂q23
+R(q12)(v1−v2)+R(q23 )(v3−v2),(4)
˙p3=∂H23
∂q23
−∂H31
∂q31
+R(q23)(v2−v3)+R(q31 )(v1−v3).(5)
The dynamics for the spring elongations are
$$\dot{q}_{ij} = v_i - v_j = \frac{\partial H_i}{\partial p_i} - \frac{\partial H_j}{\partial p_j} \qquad (6)$$
for $(i, j) \in \{(1,2), (2,3), (3,1)\}$. To recover the CS model with potential, we replace the relative positions $q_{ij}$ with the absolute positions, namely $q_{ij} = q_i - q_j$. Recalling that spring potentials are symmetric functions, we get that
$$\frac{\partial H_{31}}{\partial q_{31}} - \frac{\partial H_{12}}{\partial q_{12}} = -\frac{1}{3}\left(\nabla U(q_1 - q_3) - \nabla U(q_1 - q_2)\right), \qquad (7)$$
$$\frac{\partial H_{12}}{\partial q_{12}} - \frac{\partial H_{23}}{\partial q_{23}} = -\frac{1}{3}\left(\nabla U(q_2 - q_1) - \nabla U(q_2 - q_3)\right), \qquad (8)$$
$$\frac{\partial H_{23}}{\partial q_{23}} - \frac{\partial H_{31}}{\partial q_{31}} = -\frac{1}{3}\left(\nabla U(q_3 - q_2) - \nabla U(q_3 - q_1)\right). \qquad (9)$$
Substituting (7)-(9) in (3)-(5), and recalling that under our assumptions $p_i = v_i$, we recover exactly the CS model with potential.
By introducing the notation $z^T = [p^T, q^T]$, with $p^T = [p_1, p_2, p_3]$ and $q^T = [q_{12}, q_{23}, q_{31}]$, the equations (3)-(5) and (6) can be expressed compactly as
$$\dot{z} = [J(z) - \mathcal{R}(z)] \frac{\partial H(z)}{\partial z}, \qquad (10)$$
where $H(z) = H_1(p_1) + H_2(p_2) + H_3(p_3) + H_{12}(q_{12}) + H_{23}(q_{23}) + H_{31}(q_{31})$, and
$$\mathcal{R}(z) = \begin{bmatrix} \bar{R}(z) & 0 \\ 0 & 0 \end{bmatrix}, \quad \text{with} \quad \bar{R}(z) = \begin{bmatrix} R(q_{12}) + R(q_{31}) & -R(q_{12}) & -R(q_{31}) \\ -R(q_{12}) & R(q_{12}) + R(q_{23}) & -R(q_{23}) \\ -R(q_{31}) & -R(q_{23}) & R(q_{31}) + R(q_{23}) \end{bmatrix},$$
and where
$$J(z) = \begin{bmatrix} 0 & J \\ -J^T & 0 \end{bmatrix}, \quad \text{with} \quad J = \begin{bmatrix} -1 & 0 & 1 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix}.$$
We recognize equation (10) as the typical input-state-output port-Hamiltonian system [9], [10].
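For reference, here is a minimal Python sketch of the structure matrices in (10) for the 3-particle, one-dimensional case; the zero spring potential is a simplifying assumption for illustration.

```python
import numpy as np

N, gamma = 3, 0.15
J = np.array([[-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])               # routing matrix J from (10)
J_z = np.block([[np.zeros((3, 3)), J],
                [-J.T, np.zeros((3, 3))]])

def R_damper(q):
    # resistive function from Proposition 3.1, 1-d case
    return 1.0 / (N * (1.0 + q**2) ** gamma)

def ph_rhs(z):
    p, q = z[:3], z[3:]                        # z^T = [p^T, q^T], q = (q12, q23, q31)
    r12, r23, r31 = R_damper(q[0]), R_damper(q[1]), R_damper(q[2])
    R_bar = np.array([[r12 + r31, -r12, -r31],
                      [-r12, r12 + r23, -r23],
                      [-r31, -r23, r31 + r23]])
    R_z = np.block([[R_bar, np.zeros((3, 3))],
                    [np.zeros((3, 3)), np.zeros((3, 3))]])
    dH = np.concatenate([p, np.zeros(3)])      # dH/dp = p; zero spring potential assumed
    return (J_z - R_z) @ dH
```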
IV. Dynamical properties of the Cucker-Smale model
In this section we introduce a set of maps and demonstrate that they satisfy the required properties for being symmetry or Lie symmetry maps. In addition, we introduce a conserved quantity that differs from the Hamiltonian function. The maps and the conserved quantity will be rediscovered in the learning section. The symmetry maps will be introduced for both the original CS model and its port-Hamiltonian representation. We consider the 1-d case ($p \in \mathbb{R}^N$), since the results can be easily generalized to higher dimensions.
A. Symmetry maps
The following result introduces a symmetry map for the CS dynamics in port-Hamiltonian form.

Proposition 4.1: The map $\Gamma(p, q) = (p + \alpha \mathbf{1}, q)$, for $\alpha \in \mathbb{R}$, is a symmetry map for the port-Hamiltonian dynamics (10).

For the CS dynamics with potential in its original form (1)-(2), the symmetry map is slightly different, as shown next.

Proposition 4.2: The map $\Gamma(x, v, t) = (x + \alpha \mathbf{1} t + \beta \mathbf{1}, v + \alpha \mathbf{1}, t)$, for $\alpha, \beta \in \mathbb{R}$, is a symmetry map for the CS dynamics with potential (1)-(2).
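Proposition 4.2 can be checked numerically, since the CS vector field depends only on relative positions and velocities, which the map $\Gamma$ leaves unchanged. The sketch below is such a check on the alignment-only (zero-potential) model; the state, parameters and shift values are arbitrary illustrative choices.

```python
import numpy as np

N, gamma = 20, 0.15

def cs_rhs(t, z):
    # alignment-only CS dynamics (zero potential), as in Section II-A
    x, v = z[:N], z[N:]
    r = np.abs(x[:, None] - x[None, :])
    dv = ((1.0 / (1.0 + r**2) ** gamma) * (v[None, :] - v[:, None])).sum(1) / N
    return np.concatenate([v, dv])

alpha, beta, t = 0.7, -1.3, 5.0
z = np.random.uniform(0, 10, 2 * N)                         # arbitrary state (x, v)
x, v = z[:N], z[N:]
z_hat = np.concatenate([x + alpha * t + beta, v + alpha])   # Gamma(x, v, t)

f, f_hat = cs_rhs(t, z), cs_rhs(t, z_hat)
# the transformed trajectory has time derivative (v + alpha*1, dv/dt):
assert np.allclose(f_hat, np.concatenate([f[:N] + alpha, f[N:]]))
```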
B. Lie group of invariance transformations
As introduced in Section II-C, the Lie groups of invariance transformations [18], [19] are a particular type of symmetry maps with the form $\hat{z} = z + \epsilon \eta(z, t) + O(\epsilon^2)$. The following result introduces the infinitesimal of the CS model in the original form.

Proposition 4.3: The map $\eta(z, t) = \eta(x, v, t) = \alpha t \left[\mathbf{1}^T, 0^T\right]^T + \left[\beta \mathbf{1}^T, \alpha \mathbf{1}^T\right]^T$, for all $\alpha, \beta \in \mathbb{R}$, is an infinitesimal for the Lie group of invariance transformations corresponding to the CS dynamics in its original form.

A similar result holds for the CS model in port-Hamiltonian form, where the time dependence of the infinitesimal map is no longer present.

Proposition 4.4: The map $\eta(z) = \eta(p, q) = \alpha \left[\mathbf{1}^T, 0^T\right]^T$, for all $\alpha \in \mathbb{R}$, is an infinitesimal for the Lie group of invariance transformations corresponding to the CS dynamics in port-Hamiltonian form.
C. Conserved quantities
The port-Hamiltonian representation has the advantage of providing at least one conserved quantity, namely the Hamiltonian. In addition to the Hamiltonian function, there are other quantities that are conserved. The following results introduce such quantities for both the original and the port-Hamiltonian representation of the CS particle dynamics.

Proposition 4.5: The quantity $\mathbf{1}^T v$ is conserved by the CS dynamics (1)-(2), that is, $\mathbf{1}^T \dot{v} = 0$ for all $t \geq 0$.

We can show similar results in the case of the port-Hamiltonian representation. We will make use of the Casimir functions, which represent the conserved quantities for port-Hamiltonian systems.

Proposition 4.6: Any function of the form $C(p, q) = \alpha \mathbf{1}^T p + u^T q + \beta$, where $u \in \text{Null}(J)$ and $\alpha, \beta \in \mathbb{R}$, is a conserved quantity for the CS dynamics in port-Hamiltonian form (10), where $z^T = [p^T, q^T]$.

Remark 4.1: Note that in the 3-particle example the matrix $J$ is square, and the null spaces of $J$ and $J^T$ coincide. In general this is not true, since $J \in \mathbb{R}^{N \times M}$, where $M = N(N-1)/2$. Hence, only the null space of $J^T$ is given by $\{\alpha \mathbf{1}, \alpha \in \mathbb{R}\}$.
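Remark 4.1 is easy to verify numerically. The sketch below builds $J$ as a signed incidence matrix over all particle pairs (an assumed convention consistent with the 3-particle example; the null spaces do not depend on the sign choice) and computes both null spaces.

```python
import numpy as np
from itertools import combinations
from scipy.linalg import null_space

N = 4
pairs = list(combinations(range(N), 2))    # M = N(N-1)/2 = 6 links
J = np.zeros((N, len(pairs)))
for k, (i, j) in enumerate(pairs):
    J[i, k], J[j, k] = -1.0, 1.0           # link k carries q_ij = q_i - q_j

print(null_space(J.T).shape)   # (4, 1): Null(J^T) is spanned by the ones vector
print(null_space(J).shape)     # (6, 3): Null(J) is larger when N > 3
```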
V. Learning interaction models and their dynamical properties
To demonstrate that we can indeed recover the theoretical results proved in the previous sections, we consider an example where twenty particles ($N = 20$) evolve according to the CS dynamics. We consider both the original and the port-Hamiltonian representation of the CS dynamics. The particles operate in a two-dimensional space, that is, the (relative) position and velocity vectors of each particle have dimension two. The training data were generated by simulating the CS model with parameter $\gamma = 0.15$, over the time interval $[0, 40]$ sec, starting with random initial conditions in the interval $[0, 10]$. A realization of the CS simulation is shown in Figure 2, where we plot the particle speed (norm of the velocity vector).

Fig. 2: Particle speed over time $\|v_i(t)\|$, $i \in \{1, \ldots, N\}$

The structure of the time series used for training is $z^T = [x^T, y^T, v_x^T, v_y^T]$, where $x, y, v_x, v_y \in \mathbb{R}^N$. In the port-Hamiltonian representation, the structure is slightly different, namely $z^T = [p_x^T, p_y^T, q_x^T, q_y^T]$, where $q_x, q_y \in \mathbb{R}^{\frac{N(N-1)}{2}}$ and $p_x, p_y \in \mathbb{R}^N$. The computation of the gradients and Jacobians was done using automatic differentiation. The learning problems were implemented using the Python package Autograd [20] and the deep learning platform PyTorch [21], both featuring automatic differentiation.
A. Particle interaction model
Our first task is to recover the interaction model between particles. We consider the port-Hamiltonian representation case, without potential, which can be obtained by approximating the spring potential function with zero, by appropriately choosing the parameters of the potential function. Using the port-Hamiltonian formalism, this task translates to learning the constitutive equation of a generalized damper. In particular, we learn $F_{ij} = g(q_{ij}^2; w) \dot{q}_{ij}$, which describes the force acting between two particles $i, j$, where $q_{ij}$ is the relative position between the two particles. We choose the map $g$ to be a neural network (NN) with one hidden layer of size 12, whose output is given by $y = W^{[1]} \tanh\left(W^{[0]} u + b^{[0]}\right) + b^{[1]}$, where the weight exponents denote the layer number. Hence we have a total of 37 parameters. Note that we can add a ReLU type of activation on the last layer to impose a non-negative output of the NN. To learn the parameters of the map $g$, we solve the optimization problem $\min_w \frac{1}{n} \sum_{i=1}^{n} \|z(t_i) - \hat{z}(t_i; w)\|^2$, where $n$ is the number of time samples, $w = \{W^{[0]}, b^{[0]}, W^{[1]}, b^{[1]}\}$ is the set of optimization variables, and $\hat{z}(t_i; w)$ are time samples of the solution of (10) with the resistive term defined by $R(q) = g(\|q\|^2; w)$ and no potential between particles. The initial positions and velocities were uniformly drawn from the interval $[0, 10]$. We used the Autograd package and its Adam algorithm implementation to solve the least-squares problem introduced above. The optimization was set to terminate when the error reached a value smaller than $10^{-5}$. We compared the trained interaction model with the "real" interaction model, as shown in Figure 3.

Fig. 3: Comparison between the "real" (blue) and the trained (dotted red) particle interaction models

We limited ourselves to a relative distance between $[-35, 35]$, since this was the maximum distance the particles reached between them over time. The MSE between the trained and the "real" interaction curves over the interval $[-35, 35]$ is $1.3 \times 10^{-4}$. We note that there is some mismatch near zero, due to the fact that the particles never got close enough. Next, we tested the interaction model on data not used in the training, but whose initial conditions have similar statistics to the initial conditions of the training data. The quantity $MSE_{test}(t_i) = \frac{1}{N} \|z(t_i) - \hat{z}(t_i)\|^2$, where $z(t_i)$, $\hat{z}(t_i)$ designate samples of the time series obtained with the "true" and learned interaction models, respectively, is shown in Figure 4. We note that the prediction error stabilizes to a reasonably small value.
Fig. 4: The MSEs of the velocity vectors for test data
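As a rough illustration of this learning task, the following condensed PyTorch sketch parameterizes the damper with the one-hidden-layer network described above (12 tanh units, 37 parameters for a scalar input) and fits it by trajectory matching; the explicit Euler rollout and the placeholder data names (x0, v0, ts, z_data) are our assumptions, since the paper does not specify the integrator.

```python
import torch

class DamperNet(torch.nn.Module):
    # g(q^2; w): one hidden layer of 12 tanh units, scalar in/out (37 parameters)
    def __init__(self, hidden=12):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, 1))

    def forward(self, q_sq):
        return self.net(q_sq)

def rollout(g, x0, v0, ts):
    # explicit Euler integration of x' = v, v' = (1/N) sum_j g(|x_i-x_j|^2)(v_j - v_i)
    x, v, traj = x0, v0, []
    N = x0.shape[0]
    for k in range(len(ts) - 1):
        dt = ts[k + 1] - ts[k]
        dx = x[:, None] - x[None, :]
        w = g((dx ** 2).reshape(-1, 1)).reshape(N, N)
        dv = (w * (v[None, :] - v[:, None])).sum(1) / N
        x, v = x + dt * v, v + dt * dv
        traj.append(torch.cat([x, v]))
    return torch.stack(traj)

g = DamperNet()
opt = torch.optim.Adam(g.parameters(), lr=1e-3)
# with x0, v0, ts and a reference trajectory z_data from the true CS model:
# loss = ((rollout(g, x0, v0, ts) - z_data) ** 2).mean()
# opt.zero_grad(); loss.backward(); opt.step()
```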
B. Lie group of invariance transformations
The Lie group of transformations $\psi$ has a structure of the form $\psi(z) = z + \epsilon \eta(z; w) + O(\epsilon^2)$, where $\eta(z; w)$ is the infinitesimal of the transformation [18], [19]. We consider a linear parameterization of the form $\eta(z; w) = Az + b$, and the goal is to find the parameters of the infinitesimal by solving the optimization problem $\min_{A,b} \frac{1}{n \times N} \sum_{i=1}^{n} \left\| \frac{\partial \eta}{\partial z}(z^{(i)}) f(z^{(i)}) - \frac{\partial f}{\partial z}(z^{(i)}) \eta(z^{(i)}) \right\|^2$, where $n$ denotes the total number of vector samples. The optimization problem was solved for the port-Hamiltonian representation, using the Adam algorithm and Autograd to compute the gradient of the cost function using automatic differentiation. To improve the speed of the optimization algorithm, we computed offline the values of the maps $f(z)$ and $\frac{\partial f}{\partial z}(z)$ at each sample of the training data $z^{(i)}$. We generated 50 time series describing the CS dynamics over the time interval $[0, 40]$ sec, using $M = 50$ initial condition vectors uniformly drawn from $[0, 10]$, generating roughly 5000 data samples. We stopped the optimization process when the MSE loss function reached $MSE_{train} = 1.1 \times 10^{-4}$. As a sanity check, we looked at the structure of the learned $A$ and $b$. The structure of $b$ is according to what we would expect: same values for the first half of the vector (of roughly 1.8049) and small values for the second half ($< 10^{-4}$). The entries of $A$, although small, were not zero, which may be a result of the fact that we limited the number of optimization iterations.

The test data were generated randomly, in a similar way to the training data, using a time interval of $[0, 80]$ sec, generating roughly 10000 samples. The longer time interval checks the time extrapolation as well. As a metric we used the MSE, applied this time to trajectories. We have two types of trajectories. The first type, denoted by $z(t)$, is a trajectory generated by solving the CS differential equations, with initial conditions obtained by applying the learned symmetry transformation to the initial conditions of the test data. The second type, denoted by $\hat{z}(t)$, is obtained by applying the learned symmetry map to the test data itself. Formally, we define the metric $MSE_{test} = \frac{1}{n \times M \times N} \sum_{i=1}^{M} \sum_{j=1}^{n} \|z^{(i)}(t_j) - \hat{z}^{(i)}(t_j)\|^2$, where $n$ is the number of time samples per time series, $M$ is the number of time series, and $z^{(i)}(t_j)$ is the vector of position and velocity coordinates at time $t_j$ of time series $i$. We obtained the following MSE for the test data: $MSE_{test} = 6.2 \times 10^{-4}$. We also computed the MSE evolution over time for the trajectories, where the averaging was taken over the time series indices ($M$ of them) and the entries of the state vector, but not over time. The result is shown in Figure 5. We note that the prediction error accumulates over time, which most likely comes from the fact that the learned symmetry map was not exact, due in part to the limited number of optimization iterations.

Fig. 5: MSE of particle velocities over time
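A minimal Autograd sketch of the loss used in this subsection is given below, with the linear parameterization $\eta(z; w) = Az + b$ and the samples $f(z^{(i)})$ and Jacobians $\frac{\partial f}{\partial z}(z^{(i)})$ precomputed offline, as described above; the random placeholder data, dimensions, and random initialization are assumptions for illustration.

```python
import autograd.numpy as np
from autograd import grad
from autograd.misc.optimizers import adam

n, d = 200, 6                                  # placeholder sample count / state dim
z_samples = np.random.randn(n, d)              # placeholder states z^(i)
f_vals = np.random.randn(n, d)                 # placeholder f(z^(i)), precomputed offline
Jf_vals = np.random.randn(n, d, d)             # placeholder df/dz(z^(i)), precomputed offline

def loss(w, it):
    A, b = w[:d * d].reshape(d, d), w[d * d:]
    eta = z_samples @ A.T + b                  # eta(z) = A z + b
    # residual of the infinitesimal symmetry condition:
    # (d eta / dz) f(z) - (df / dz) eta(z), with d eta / dz = A
    res = f_vals @ A.T - np.einsum('nij,nj->ni', Jf_vals, eta)
    return np.mean(res ** 2)

w0 = 0.1 * np.random.randn(d * d + d)          # random initialization
w_opt = adam(grad(loss), w0, step_size=0.001, num_iters=500)
```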
C. Symmetry maps
We repeat the learning process for the discrete symmetry case, using this time the CS model in its original form. We search for a map $\Gamma$ so that $\hat{z} = \Gamma(z, t)$ is a solution of the CS ODE $\dot{z} = f(z)$ as well. We assume that the time remains unchanged by the symmetry; hence no map for the time is included. We consider a linear parameterization of the symmetry map, $\Gamma(z, t) = Az + bt + c$, which includes time dependence as well. To learn the map parameters, we solve the following optimization problem: $\min_{A,b,c} \frac{1}{n \times N} \sum_{i=1}^{n} \left\| \frac{\partial \Gamma}{\partial z}(z^{(i)}, t_i) f(z^{(i)}) + \frac{\partial \Gamma}{\partial t}(z^{(i)}, t_i) - f\left(\Gamma(z^{(i)}, t_i)\right) \right\|^2$. We used the PyTorch deep learning platform to implement the optimization process, using the same Adam algorithm as in the case of the Lie symmetry group. PyTorch features automatic differentiation as well, but has the advantage that it can be used with graphics processing units (GPUs) when the optimization problem can be parallelized. To give an idea of why PyTorch can be more effective when scaling up the problem in the number of particles, Figure 6 shows a comparison between the average time for an optimization iteration of PyTorch's Adam algorithm when using the CPU and GPUs, as a function of the number of particles. We note that, unlike the CPU case, when using GPUs the average iteration time grows linearly with the number of particles. In addition, in terms of average iteration time when using the CPU, PyTorch is superior to Autograd: 4.9 sec for Autograd versus 2.2 sec for PyTorch for the 20-particle case, for the same number of training samples.

Fig. 6: Average time for an Adam iteration when using the CPU (blue curve) and GPUs (red curve), as a function of the number of particles

We use a similar strategy to generate training and test data as in the case of the Lie symmetry group. We stopped the optimization algorithm when the MSE reached $5.5 \times 10^{-5}$. The MSE for the test data was 0.007. The test data MSE as a function of time is shown in Figure 7. The same phenomenon of error accumulation over time is noticed as in the case of the Lie symmetry group.

Fig. 7: MSE of particle velocities over time
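The sketch below illustrates, in PyTorch, how the symmetry-map loss can be set up as a batched computation that moves to a GPU when one is available; the placeholder data tensors and the zero stand-in for the vector field f are assumptions, not the paper's implementation.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
n, d = 5000, 6
z = torch.randn(n, d, device=device)       # placeholder state samples z^(i)
t = torch.rand(n, 1, device=device)        # placeholder sample times t_i
f_z = torch.randn(n, d, device=device)     # placeholder f(z^(i)), precomputed

def f(zz):
    # stand-in for the CS vector field evaluated at Gamma(z, t)
    return torch.zeros_like(zz)

A = (0.1 * torch.randn(d, d, device=device)).requires_grad_()
b = torch.zeros(d, device=device, requires_grad=True)
c = torch.zeros(d, device=device, requires_grad=True)
opt = torch.optim.Adam([A, b, c], lr=1e-3)

for it in range(2500):
    gamma = z @ A.T + t * b + c            # Gamma(z, t) = A z + b t + c
    # residual of (dGamma/dz) f(z) + dGamma/dt - f(Gamma(z, t)), with dGamma/dz = A
    res = f_z @ A.T + b - f(gamma)
    loss = (res ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```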
D. Conservation laws
In this section we demonstrate that we can recover conserved quantities as introduced in Proposition 4.6, whose statement can be easily generalized to the two-dimensional case. Namely, the Casimir functions have the form $C(p_x, p_y, q_x, q_y) = \alpha_x \mathbf{1}^T p_x + \alpha_y \mathbf{1}^T p_y + u_x^T q_x + u_y^T q_y + \beta$, for all $\alpha_x, \alpha_y \in \mathbb{R}$ and all $u_x, u_y \in \text{Null}(J)$. To learn the Casimir function $C(z; w)$, we solve the optimization problem $\min_w \frac{1}{n} \sum_{i=1}^{n} \left\| J^T \frac{\partial C}{\partial p_x}(z^{(i)}) \right\|^2 + \left\| J^T \frac{\partial C}{\partial p_y}(z^{(i)}) \right\|^2 + \left\| J \frac{\partial C}{\partial q_x}(z^{(i)}) \right\|^2 + \left\| J \frac{\partial C}{\partial q_y}(z^{(i)}) \right\|^2$. We considered two types of parameterizations: a linear parameterization given by $C(p_x, p_y, q_x, q_y) = a_x^T p_x + a_y^T p_y + b_x^T q_x + b_y^T q_y$, and a nonlinear parameterization given by a neural network with a hidden layer of size $2N + N(N-1)$, defined by $C(z) = W^{[1]} \tanh\left(W^{[0]} z + b^{[0]}\right) + b^{[1]}$. For the linear case, the partial derivatives of $C$ were hard-coded, since they are simple and do not depend on the training data. We did, though, use Autograd to compute analytically the gradient of the loss function. We initialized the optimization variables randomly, and we ran the Adam algorithm for 2500 iterations with a fixed step of 0.001. Each iteration of the Adam algorithm takes roughly 3 msec. As a sanity check, we looked at the structure of the learned vectors $a_x$ and $a_y$, whose entries are shown in Figure 8. We note that we indeed recovered the expected structure, namely $\alpha_x \mathbf{1}$ and $\alpha_y \mathbf{1}$.

Fig. 8: Entries of vectors $a_x$ and $a_y$

Another sanity check is to plot the evolution of the Casimir function as a function of time, depicted in Figure 9, showing that it has a constant value of 1077.53. The value of the Casimir function depends on the initial conditions for both the training data and the optimization variables.

Fig. 9: Casimir function over time for the linear parametrization

We repeated the learning process for a nonlinear (neural network) parameterization. In this case, we used Autograd to construct functions that can be called to compute the partial derivatives of the Casimir function. In addition to these Jacobians, we used Autograd to generate the gradient of the loss function. As a result, each iteration of the Adam algorithm becomes slower, namely 3 sec. We ran the algorithm for 500 iterations, starting from random initial conditions for the optimization variables, selected around the zero value. The Casimir function for the nonlinear parametrization, computed at each point on the state trajectory, is shown in Figure 10, where we notice that the function takes a constant value of approximately -0.4641.

Fig. 10: Casimir function over time for the nonlinear parameterization
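For the linear parameterization, the loss above reduces to a data-independent quadratic in the parameter vectors, which the following Autograd sketch minimizes with Adam; the sign convention for $J$ and the initialization scale are assumptions. Because gradient descent leaves the null-space components of the random initialization untouched, the learned $a_x, a_y$ converge to multiples of $\mathbf{1}$ and $b_x, b_y$ to elements of $\text{Null}(J)$, matching the structure recovered in Figure 8.

```python
import autograd.numpy as np
from autograd import grad
from autograd.misc.optimizers import adam
from itertools import combinations

N = 20
pairs = list(combinations(range(N), 2))
M = len(pairs)                                 # M = N(N-1)/2 = 190
J = np.zeros((N, M))
for k, (i, j) in enumerate(pairs):             # assumed sign convention
    J[i, k], J[j, k] = -1.0, 1.0

def loss(w, it):
    ax, ay = w[:N], w[N:2 * N]
    bx, by = w[2 * N:2 * N + M], w[2 * N + M:]
    # hard-coded partial derivatives of the linear Casimir candidate
    return (np.sum((J.T @ ax) ** 2) + np.sum((J.T @ ay) ** 2)
            + np.sum((J @ bx) ** 2) + np.sum((J @ by) ** 2))

w0 = 0.1 * np.random.randn(2 * N + 2 * M)      # random initialization
w_opt = adam(grad(loss), w0, step_size=0.001, num_iters=2500)
```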
VI. Conclusions
In this paper we proposed a framework based on the port-Hamiltonian modeling formalism, aimed at learning interaction models between particles and dynamical properties, such as trajectory symmetries and conservation laws of ensembles (or swarms), using large-scale optimization approaches. We built upon the Cucker-Smale particle interaction model, which we represented in a port-Hamiltonian form, and for which we re-discovered the interaction model and learned the dynamical properties that were previously proved analytically. Our approach can potentially be used for discovering novel particle interaction rules, which can lead to new cooperative control system laws. Future steps will include scaling up the problem to a very large number of particles, considering nonlinear parameterizations for the symmetry maps, and re-casting the learning tasks in a form that is compatible with parallel GPU computations on deep learning platforms. In addition, we will explore whether symbolic computation of the Jacobians, together with automatic differentiation of the loss function, will lead to a significant decrease in time per optimization iteration.
References
[1] J.S. Baras. A fresh look at network science: Interdependent multigraphs models inspired from statistical physics. In Proceedings of the 6th International Symposium on Communication, Control and Signal Processing, pages 497–500, May 2014.
[2] F. Lu, M. Zhong, S. Tang, and M. Maggioni. Nonparametric inference
of interaction laws in systems of agents from trajectory data. arXiv
preprint arXiv:1812.06003, 2018.
[3] S.L. Brunton, J.L. Proctor, and J.N. Kutz. Discovering governing
equations from data by sparse identification of nonlinear dynam-
ical systems. Proceedings of the National Academy of Sciences,
113(15):3932–3937, 2016.
[4] J. Bongard and H. Lipson. Automated reverse engineering of nonlinear
dynamical systems. Proceedings of the National Academy of Sciences,
104(24):9943–9948, 2007.
[5] I. Matei, J. de Kleer, and R. Minhas. Learning constitutive equations
of physical components with constraints discovery. In 2018 Annual
American Control Conference (ACC), pages 4819–4824, June 2018.
[6] I. Matei, J. De Kleer, M. Zhenirovskyy, and A. Feldman. Learning
constitutive equations of physical components with predefined feasi-
bility conditions. In 2019 American Control Conference (ACC), pages
922–927, July 2019.
[7] Z. Mao, Z. Li, and G.E. Karniadakis. Nonlocal flocking dynamics:
Learning the fractional order of pdes from particle simulations. arXiv
preprint arXiv:1810.11596, 2018.
[8] A.J. van der Schaft and B.M. Maschke. Port-Hamiltonian systems on graphs. SIAM Journal on Control and Optimization, 51(2):906–937, 2013.
[9] A.J. van der Schaft. Port-Hamiltonian systems: an introductory survey.
In M. Sanz-Sole, J. Soria, J.L. Varona, and J. Verdera, editors,
Proceedings of the International Congress of Mathematicians Vol. III,
number suppl 2, pages 1339–1365. European Mathematical Society
Publishing House (EMS Ph), 2006.
[10] A.J. van der Schaft and D. Jeltsema. Port-Hamiltonian systems theory: An introductory overview. Foundations and Trends in Systems and Control, 1(2-3):173–378, 2014.
[11] J. Cervera, A.J. van der Schaft, and A. Baños. Interconnection of port-Hamiltonian systems and composition of Dirac structures. Automatica, 43(2):212–225, 2007.
[12] A. Mouchet. Applications of Noether conservation theorem to Hamiltonian systems. Annals of Physics, 372, December 2015.
[13] J. Schwichtenberg. Physics from symmetry. Springer, 2015.
[14] J.S. Baras. Group invariance and symmetries in nonlinear control
and estimation. Nonlinear Control in the Year 2000, A. Isidori, F.
Lamnabhi-Lagarrigue, W. Respondek (Edts.), 1:137–171, December
2000.
[15] C.W. Reynolds. Flocks, herds and schools: A distributed behavioral
model. In ACM SIGGRAPH computer graphics, volume 21, pages
25–34. ACM, 1987.
[16] J.A. Carrillo, M. Fornasier, G. Toscani, and F. Vecil. Particle, kinetic, and hydrodynamic models of swarming. Birkhäuser Boston, Boston, 2010.
[17] J.A. Carrillo, S. Martin, and V. Panferov. A new interaction potential
for swarming models. Physica D: Nonlinear Phenomena, 260:112–
126, 2013.
[18] G.W. Bluman and S.C. Anco. Symmetry and integration methods for
differential equations. Applied Mathematical Sciences, (154), 2002.
[19] G.W. Bluman, A.F. Cheviakov, and S.C. Anco. Applications of symmetry methods to partial differential equations. Applied Mathematical Sciences, (163), 2010.
[20] D. Maclaurin, D. Duvenaud, M. Johnson, and J. Townsend. Autograd.
https://github.com/HIPS/autograd, 2018.
[21] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in PyTorch. 2017.