Formal Methods in System Design (2021) 58:126–159
https://doi.org/10.1007/s10703-020-00358-w
Gray-box monitoring of hyperproperties with an application
to privacy
Sandro Stucki¹ · César Sánchez² · Gerardo Schneider¹ · Borzoo Bonakdarpour³
Received: 28 February 2020 / Accepted: 9 December 2020 / Published online: 2 February 2021
© The Author(s) 2021
Abstract
Runtime verification is a complementary approach to testing, model checking and other static
verification techniques to verify software properties. Monitorability characterizes what can
be verified (monitored) at run time. Different definitions of monitorability have been given
both for trace properties and for hyperproperties (properties defined over sets of traces), but
these definitions usually cover only some aspects of what is important when characterizing
the notion of monitorability. The first contribution of this paper is a refinement of classic
notions of monitorability both for trace properties and hyperproperties, taking into account,
among other things, the computability of the monitor. A second contribution of our work is
to show that black-box monitoring of HyperLTL (a logic for hyperproperties) is in general
unfeasible, and to suggest a gray-box approach in which we combine static and runtime
verification. The main idea is to call a static verifier as an oracle at run time allowing, in
some cases, to give a final verdict for properties that are considered to be non-monitorable
under a black-box approach. Our third contribution is the instantiation of this solution to a
privacy property called distributed data minimization which cannot be verified using black-
box runtime verification. We use an SMT-based static verifier as an oracle at run time. We
have implemented our gray-box approach for monitoring data minimization into the proof-
of-concept tool Minion. We describe the tool and apply it to a few case studies to show its
feasibility.
Keywords Runtime verification · Monitorability · Hyperproperty · LTL · HyperLTL · Data
minimization · Security · Privacy
This research has been partially supported by the United States NSF SaTC Award 1813388, by the Swedish
Research Council (Vetenskapsrådet) under Grant 2015-04154 “PolUser”, by the Madrid Regional
Government under Project S2018/TCS-4339 “BLOQUES-CM”, by EU H2020 Project 731535 “Elastest”,
and by Spanish National Project PGC2018-102210-B-100 “BOSCO”.
1 Introduction
Imagine yourself organizing an international conference on formal methods with parallel
tracks spread over multiple conference venues. While the caterers prepare beverages and
snacks for the early morning coffee break on the first day of the conference, you find yourself
pondering the following questions:
1. Will there be enough coffee for the participants of all the different tracks during the
upcoming coffee break?
2. Will the coffee at the different venues be served simultaneously?
3. If one of the venues runs out of coffee, will there still be coffee at one of the other venues?
Questions like these, involving different possible behaviors of local state across multiple
parts of a complex system, are called hyperproperties. The above questions can be compactly
formalized in the following three HyperLTL formulas:
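(1) $\forall \pi.\ \Box\, \mathit{coffee}_{\pi}$    (2) $\forall \pi. \forall \pi'.\ \Box(\mathit{coffee}_{\pi} \leftrightarrow \mathit{coffee}_{\pi'})$    (3) $\forall \pi. \exists \pi'.\ \Box(\neg \mathit{coffee}_{\pi} \rightarrow \mathit{coffee}_{\pi'})$
where $\pi$ and $\pi'$ denote conference venues and $\mathit{coffee}_{\pi}$ states that coffee is available at venue $\pi$.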
In this paper, we investigate the monitorability of such hyperproperties. Typically, one
considers a property to be monitorable if it is possible to reach a scenario (within a finite
number of steps) which definitely satisfies or violates the property. For example, we can
easily imagine future scenarios where questions 1 and 2 have definitive answers. When any
of the conference venues runs out of coffee during a break, we have detected a violation
of (1). Even if the empty coffee dispensers are later replenished, the property, being a safety
hyperproperty, remains permanently violated. Similarly, any pair of venues where one runs
out of coffee before the other constitutes a permanent violation of (2). Thus, monitorability
is easy to establish for (1) and (2). Monitorability of (3), on the other hand, is surprisingly
subtle. According to earlier work on the topic [3], the property should not be monitorable;
yet we can build a monitor for (3) in practice. This apparent contradiction and its resolution
are the subject of the first half of this paper.
Monitoring hyperproperties is of interest beyond the coffee breaks of academic meetings.
Most notably, hyperproperties arise naturally when characterizing the security and privacy
of information systems. Indeed, monitoring systems for security and privacy violations at run
time is one of the main motivations behind our work. As a case in point, we apply our insights
on monitoring hyperproperties to a particular privacy hyperproperty called distributed data
minimality (DDM) in the second half of this paper.
Beyond this particular application, our theoretical results apply to the broader context of
runtime verification (RV). Runtime verification is a computing analysis technique based on
observing executions of a system to check its expected behavior against predefined properties.
When using RV, one first generates a monitor from a specification, ideally automatically, and
then uses the monitor to analyze the behavior of a system under study. RV is considered to be
“a practical application of formal verification” and “a less ad-hoc approach complementing
conventional testing and debugging” [40]. Unlike static formal verification, RV sacrifices
completeness since it can only analyze (a finite collection of) finite traces of a system under
observation, which are typically (a finite set of) prefixes of (a potentially infinite set of)
potentially unbounded computations. Despite this apparent limitation, there are properties
that can only be verified at run time. See [25,30,40] for surveys on RV, and the recent book
[9].
When applying static verification techniques to complex systems, one is forced to consider
decidability issues. Similar considerations apply to runtime verification, where the proper-
ties of interest are those that are monitorable. Monitorability was first defined by Pnueli and
Zaks as follows. A property expressed in LTL is monitorable (after observing a prefix trace
$u$) if there is an extension of $u$ that would violate or satisfy the property [38]. We call this
notion semantic black-box monitorability. It is semantic because this notion of monitorability
defines a decision problem (the existence of a satisfying or violating trace extension) without
requiring a corresponding decision procedure. It is black-box because this definition only
considers the property without further information about the program/system being moni-
tored, so every extended observation is possible and must be considered. As initial research
in RV originated in the model checking community, it was natural to consider the problem
of monitoring LTL formulae. The problem of monitoring LTL (for similar notions of mon-
itorability) is quite well-studied [13,26,38] (see also [10]). For other settings, though, these
“classical” definitions are not very suitable: a property may be semantically monitorable even
though no algorithm to monitor that property exists.
A hyperproperty is a property defined over sets of traces (i.e. a hyperproperty defines a
set of sets of traces). Therefore, monitoring hyperproperties implies reasoning about multiple
traces simultaneously (seminal work on monitoring hyperproperties includes [3,17,24]).
Most security properties are hyperproperties [18], including confidentiality, integrity, non-
interference, non-inference, etc. This is also the case for privacy properties, such as data
minimization [7,36]. The notion of monitorability for hyperproperties in [3] extends the orig-
inal definition by not only considering whether extensions of an observed trace would violate
or satisfy the property, but also considering extensions of the set of observed traces. A big
limitation of the existing notions of monitorability in the literature is that they completely
ignore the role of the system being monitored.
In this paper, we consider a more fine-grained concept of monitorability providing a more
comprehensive landscape of different aspects to be considered along the three dimensions of
the cube depicted in Fig. 1. The first dimension of the cube expresses that the monitor can
either reason about single traces, or about multiple traces simultaneously (the trace/hyper
dimension). The second dimension is concerned with how much we know about the system
being monitored (the black/white dimension). If we have full knowledge of the system and its
analysis is completely precise, we call this white-box monitoring. Black-box monitoring refers
to the classic approach of assuming zero knowledge about the system and crafting general
monitors that provide sound verdicts for every system. White- and black-box monitoring
are the extreme ends of a spectrum. Between them, we find various degrees of gray-box
monitoring, in which the monitor uses some information about the system (approximate sets
of executions). This partial knowledge may be given by a model of the system in addition
to the observed finite execution. Note that we may or may not have access to the source
code of the system being monitored; what is important is to have a model of it. The third
dimension (computability) considers the computability limitations of the monitors themselves
as programs, that is, it focuses on the computational power of the monitors.
Fig. 1 The monitorability cube. Axes: trace/hyper, black/white, computability. Earlier work placed in the cube: [3,15,26], [12,13,23,24,38], [43]
We prove in this paper that a large class of hyperproperties that involve quantifier alternations
is not black-box monitorable in general. To work around this discouraging result,
we propose a gray-box approach based on a combination of static and runtime verification
that allows us to still give a definitive verdict (violation or satisfaction) for certain properties
that are not black-box monitorable. In particular, our approach uses static verification over a
model of the system as an oracle at run time: given the set of already observed (finite) traces,
the oracle is used to try to reach a verdict concerning further (not yet seen) traces.
We then apply our approach to a specific privacy property called data minimality, more
specifically distributed data minimality (DDM). The principle of data minimization (defined
in Article 5 of the EU General Data Protection Regulation [21]) stipulates that only data that is
(semantically) used by a program should be collected and processed. When data is collected
from independent sources, the principle is called distributed data minimization. DDM can
be expressed as a formula in HyperLTL with one quantifier alternation, of the form ∀∀∃∃ϕ.
It has been shown in [36] that a stronger version of DDM is black-box monitorable for
violations of the property, but nothing is said about DDM, though it is left implicit that it is
non-monitorable (“formulas with alternating quantifiers are not monitorable in general”).
Our approach to monitor violations of DDM is as follows. We create a gray-box monitor
that dynamically observes and collects traces for the negation of DDM (∃∃∀∀¬ϕ), which are
then considered to be potential witnesses for the existential part. The monitor then invokes an
oracle (a model extracted from the source code in the form of its symbolic execution tree, on
which we use an SMT solver) to soundly decide the universally quantified inner sub-formula.
Our approach is sound but the monitor may give an inconclusive answer, depending on the
precision of the static verifier. We present a proof-of-concept gray-box monitor for DDM
called Minion¹ and we apply it to a few case studies to show its feasibility.
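The monitoring loop described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the Minion implementation: the class name `GrayBoxMonitor`, the `Oracle` interface (a callable returning `True` when the universally quantified part is proven for the given witnesses, and `None` when the static verifier cannot decide) and the `toy_oracle` stand-in are all hypothetical. In an actual gray-box monitor the oracle would query a model of the system, for instance via an SMT solver.

```python
from itertools import product
from typing import Callable, Hashable, Optional, Set, Tuple

Trace = Tuple[Hashable, ...]
# Hypothetical oracle interface: True means the universally quantified part
# is proven for these witnesses; None means the verifier could not decide.
Oracle = Callable[[Tuple[Trace, ...]], Optional[bool]]


class GrayBoxMonitor:
    """Collects observed traces as candidate witnesses for the existential
    quantifiers and delegates the universal part to a static oracle."""

    def __init__(self, oracle: Oracle, arity: int = 2) -> None:
        self.oracle = oracle
        self.arity = arity              # number of existential quantifiers
        self.observed: Set[Trace] = set()

    def observe(self, trace: Trace) -> str:
        self.observed.add(trace)
        for witnesses in product(self.observed, repeat=self.arity):
            if trace not in witnesses:
                continue  # tuples without the new trace were checked earlier
            if self.oracle(witnesses) is True:
                return "violation"      # the negated property is satisfied
        return "inconclusive"           # sound, but no verdict yet


def toy_oracle(witnesses: Tuple[Trace, ...]) -> Optional[bool]:
    # Stand-in with no semantic meaning: "proves" the inner formula
    # whenever the two witnesses start with different letters.
    first, second = witnesses
    return True if first[0] != second[0] else None


monitor = GrayBoxMonitor(toy_oracle)
print(monitor.observe(("x", 1)))  # inconclusive
print(monitor.observe(("y", 1)))  # violation
```

The sketch only ever reports a violation or an inconclusive answer, which mirrors the soundness remark above: an imprecise oracle can delay a verdict but should never produce a wrong one.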
This paper is a revised and extended version of a paper presented at the 23rd International
Symposium on Formal Methods (FM’19) [41]. Besides including more detailed examples,
and full proofs of our theoretical results, we have added a new section describing our tool
Minion in more detail, and new theoretical results as explained below. In summary, the
contributions of this paper are:
(1) A generalized version of HyperLTL parametrized over relational structures, with a richer,
more expressive core logic that is better suited to reasoning about security hyperproperties
(Sect. 2).
(2) A unified semantic framework of monitorability abstracting over any particular choice
of specification logic. By instantiating our framework to LTL or HyperLTL, we obtain
the familiar notions of monitorability for trace- and hyperproperties (Sect. 2).
(3) A novel and richer definition of monitorability that, besides the choice of specification
logic, also considers different degrees of information about the system being monitored
(gray-box monitoring), and the computational power of the monitor (computability)
(Sect. 3).
(4) A method for gray-box monitoring by enhancing runtime verification with oracles that
use static analysis and static verification in order to enable the monitoring of
properties that are non-monitorable in a black-box manner (Sect. 3).
(5) We express DDM as a hyperproperty and study its monitorability using our new gray-box
approach. In particular, the oracle is based on symbolic execution and SMT solvers
(Sect. 4).
(6) We describe a proof-of-concept implementation of our gray-box monitor for DDM,
apply it to some representative examples, and present an empirical evaluation (Sect. 5).
¹ Freely available online at https://github.com/sstucki/minion/.
We start by recapitulating LTL and HyperLTL and summarizing existing notions of mon-
itorability. Comparison with related work and our conclusions are presented in the last two
sections of the paper.
2 Background
Let $\Sigma$ be a set, called the alphabet. We call each element of $\Sigma$ a letter (or an event). Throughout
the paper, $\Sigma^\omega$ denotes the set of all infinite sequences (called traces) over $\Sigma$, and $\Sigma^*$ denotes
the set of all finite traces over $\Sigma$. For a trace $t \in \Sigma^\omega$ (or $t \in \Sigma^*$), $t[i]$ denotes the $i$th element
of $t$, where $i \in \mathbb{N}$. We use $|t|$ to denote the length (finite or infinite) of trace $t$. Also, $t[i,j]$
denotes the subtrace of $t$ from position $i$ up to and including position $j$ (or the empty trace $\epsilon$ if $i > j$ or if
$i > |t|$). In this manner $t[0,i]$ denotes the prefix of $t$ up to and including $i$ and $t[i,..]$ denotes
the suffix of $t$ from $i$ (including $i$).
Given a set $X$, we use $\mathcal{P}(X)$ for the set of subsets of $X$ and $\mathcal{P}_{fin}(X)$ for the set of finite
subsets of $X$. Let $u$ be a finite trace and $t$ a finite or infinite trace. We denote the concatenation
of $u$ and $t$ by $ut$. Also, $u \preceq t$ denotes the fact that $u$ is a prefix of $t$. Given a finite set $U$
of finite traces and an arbitrary set $W$ of finite or infinite traces, we say that $W$ extends $U$
(written $U \preceq W$) if, for all $u \in U$, there is a $v \in W$, such that $u \preceq v$. Note that every trace
in $U$ is extended by some trace in $W$ (we call these trace extensions), and that $W$ may also
contain additional traces with no prefix in $U$ (we call these set extensions).
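The two extension orders can be made concrete with a short sketch. It assumes, purely for illustration, that all traces are finite and represented as Python tuples of letters; the function names `is_prefix` and `extends` are ours.

```python
from typing import Hashable, Iterable, Tuple

Trace = Tuple[Hashable, ...]  # a finite trace: a tuple of letters


def is_prefix(u: Trace, t: Trace) -> bool:
    """u is a prefix of t (both finite in this sketch)."""
    return len(u) <= len(t) and t[:len(u)] == u


def extends(W: Iterable[Trace], U: Iterable[Trace]) -> bool:
    """W extends U: every trace in U is a prefix of some trace in W."""
    W = list(W)
    return all(any(is_prefix(u, v) for v in W) for u in U)


U = {("a",), ("a", "b")}
W = {("a", "b", "c"), ("d",)}  # ("d",) is a set extension: no prefix in U

print(extends(W, U))  # True
print(extends(U, W))  # False
```

Here the single trace `("a", "b", "c")` is a trace extension of both elements of `U`, while `("d",)` is an additional trace that `W` is free to contain.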
2.1 LTL and relational HyperLTL
We now briefly introduce LTL and HyperLTL. Let AP be a finite set of atomic propositions
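and define the alphabet as $\Sigma = 2^{AP}$. The syntax of LTL [37] is:
$$\varphi ::= a \mid \neg\varphi \mid \varphi \lor \varphi \mid \bigcirc\varphi \mid \varphi \,\mathcal{U}\, \varphi$$
where $a \in AP$. The semantics of LTL is given by associating to a formula the set of traces
$t \in \Sigma^\omega$ that it accepts:
$$\begin{aligned}
t &\models a && \text{iff } a \in t[0] \\
t &\models \neg\varphi && \text{iff } t \not\models \varphi \\
t &\models \varphi_1 \lor \varphi_2 && \text{iff } t \models \varphi_1 \text{ or } t \models \varphi_2 \\
t &\models \bigcirc\varphi && \text{iff } t[1,..] \models \varphi \\
t &\models \varphi_1 \,\mathcal{U}\, \varphi_2 && \text{iff for some } i,\ t[i,..] \models \varphi_2 \text{ and for all } j < i,\ t[j,..] \models \varphi_1
\end{aligned}$$
We will also use the usual derived operators $\Diamond\varphi \equiv \mathit{true} \,\mathcal{U}\, \varphi$ and $\Box\varphi \equiv \neg\Diamond\neg\varphi$.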
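To make the semantics above executable, the following sketch evaluates LTL formulas on ultimately periodic traces (a finite prefix followed by a finite loop repeated forever). Restricting to such "lasso" traces is our assumption to keep the evaluation finite; arbitrary traces in $\Sigma^\omega$ have no finite representation. The tuple encoding of formulas and the names `sat_positions`, `holds`, `eventually` and `always` are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

Event = FrozenSet[str]  # the set of atomic propositions holding at a position
Formula = Tuple         # nested tuples, e.g. ("until", f, g)


def sat_positions(phi: Formula, events: Sequence[Event], loop_start: int) -> List[bool]:
    """For each position i of the lasso, decide whether the suffix at i satisfies phi."""
    n = len(events)
    succ = lambda i: i + 1 if i + 1 < n else loop_start  # last position loops back
    op = phi[0]
    if op == "true":
        return [True] * n
    if op == "ap":
        return [phi[1] in e for e in events]
    if op == "not":
        return [not b for b in sat_positions(phi[1], events, loop_start)]
    if op == "or":
        left = sat_positions(phi[1], events, loop_start)
        right = sat_positions(phi[2], events, loop_start)
        return [a or b for a, b in zip(left, right)]
    if op == "next":
        inner = sat_positions(phi[1], events, loop_start)
        return [inner[succ(i)] for i in range(n)]
    if op == "until":
        left = sat_positions(phi[1], events, loop_start)
        sat = sat_positions(phi[2], events, loop_start)
        changed = True
        while changed:  # least fixpoint: sat[i] = right[i] or (left[i] and sat[succ(i)])
            changed = False
            for i in range(n):
                if not sat[i] and left[i] and sat[succ(i)]:
                    sat[i] = changed = True
        return sat
    raise ValueError(f"unknown operator: {op!r}")


def holds(phi: Formula, prefix: Sequence[Event], loop: Sequence[Event]) -> bool:
    """Does the trace prefix . loop^omega satisfy phi?"""
    if not loop:
        raise ValueError("the loop must contain at least one event")
    return sat_positions(phi, list(prefix) + list(loop), len(prefix))[0]


def eventually(phi: Formula) -> Formula:
    return ("until", ("true",), phi)


def always(phi: Formula) -> Formula:
    return ("not", eventually(("not", phi)))


a = ("ap", "a")
A, E = frozenset({"a"}), frozenset()
print(holds(eventually(a), [E, A], [E]))       # True
print(holds(always(a), [E, A], [E]))           # False
print(holds(always(eventually(a)), [E], [A, E]))  # True
```

The until case is computed as a least fixpoint over the finitely many lasso positions, which matches the requirement that the right-hand formula must actually hold at some position.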
All properties expressible in LTL are trace properties (each individual trace satisfies
the property or not, independently of any other trace). Some important properties, such as
information-flow security policies (including confidentiality, integrity and secrecy), cannot
be expressed as trace properties but require reasoning about two (or more) independent
executions (perhaps from different inputs) simultaneously. Such properties are called hyper-
properties [18]. HyperLTL [19] is a temporal logic for hyperproperties that extends LTL by
allowing explicit quantification over execution traces.
We present here a generalized version of HyperLTL with a richer core logic than those
commonly found in the literature. Whereas HyperLTL is typically defined with a propositional
core logic similar to that of LTL, here we allow formulas to use atomic relation symbols of
non-zero arity. This extension is motivated by relational properties from security and privacy,
which involve comparisons of values ranging over unbounded I/O domains across multiple
executions of a system. For example, the well-known non-interference property may be
expressed symbolically as
$$\forall \pi_1. \forall \pi_2.\ \mathit{in}(\pi_1) =_L \mathit{in}(\pi_2) \rightarrow \mathit{out}(\pi_1) =_L \mathit{out}(\pi_2).$$
Non-interference is clearly a hyperproperty and, as such, has been a target for run-time
verification via HyperLTL along with other, similar security properties [3,36]. Yet, statements
like $\mathit{in}(\pi_1) =_L \mathit{in}(\pi_2)$ cannot be expressed in propositional logic in general and are
therefore, strictly speaking, beyond the scope of HyperLTL. The above is true unless $\mathit{in}(\pi)$ and
$\mathit{out}(\pi)$ range over finite I/O domains. That said, even if realistic systems operate on finite
data in practice, encoding a property such as non-interference as a propositional formula is
impractical for large/unbounded I/O domains (text, video, etc.). To overcome this limitation,
we strengthen the core of HyperLTL to a predicate logic parametrized by a given σ-structure.
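Definition 1 (signatures and structures) A (relational) signature $\sigma$ is a pair $\sigma = (S, \mathrm{ar})$,
where $S$ is a set of relation symbols and $\mathrm{ar} : S \to \mathbb{N}$ assigns an arity $\mathrm{ar}(r)$ to each $r \in S$. A
$\sigma$-structure is a pair $(A, I)$, where $A$ is a set, called the domain, and $I$ is an interpretation
function assigning to each $r \in S$ an $\mathrm{ar}(r)$-ary relation $I(r) \subseteq A^{\mathrm{ar}(r)}$. A (relational) structure
$\mathcal{A}$ is a triple $\mathcal{A} = (\sigma_{\mathcal{A}}, A_{\mathcal{A}}, I_{\mathcal{A}})$ consisting of a relational signature $\sigma_{\mathcal{A}}$ and its interpretation.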
For simplicity, we restrict ourselves to single-sorted relational structures (that is, the set A
is the domain interpreting the only sort), which is sufficient for the purpose of this paper.
However, the definition of Relational HyperLTL given below can easily be adapted to many-
sorted structures with standard techniques. Similarly, our results can be easily extended to
cover function symbols. We leave these extensions and a thorough discussion of the logical
and proof-theoretic properties of HyperLTL over a structure $\mathcal{A}$ for future work.
Let $V_o$ be a finite set of object variables and $V_t$ a countably infinite set of trace variables.
Object variables $x, y, z, \ldots \in V_o$ denote the observables of a system, e.g. the value
of a counter, the temperature measured by some sensor, or the latest input received by a
reactive system. Trace variables $\pi, \tau, \pi', \ldots \in V_t$ denote traces obtained by observing
independent runs of some system or concurrent runs of individual subsystems. The syntax
of Relational HyperLTL for a signature $\sigma$ is:
$$\begin{aligned}
\varphi &::= \forall\pi.\,\varphi \mid \exists\pi.\,\varphi \mid \psi && \text{(formula)} \\
\psi &::= r(e, \ldots, e) \mid \neg\psi \mid \psi \lor \psi \mid \bigcirc\psi \mid \psi \,\mathcal{U}\, \psi \quad \text{for } r \in S && \text{(temporal formula)} \\
e &::= x_\pi && \text{(expression)}
\end{aligned}$$
To interpret HyperLTL formulas over a structure $\mathcal{A}$, we use the alphabet $\Sigma = A^{V_o}$. That
is, events are valuations of the object variables in the domain of $\mathcal{A}$. A trace assignment
$\Pi : V_t \rightharpoonup \Sigma^\omega$ is a partial function mapping trace variables to infinite traces. We use $\Pi_\emptyset$ to
denote the empty assignment, and $\Pi[\pi \mapsto t]$ for the same function as $\Pi$, except that $\pi$ is
mapped to trace $t$. The semantics of HyperLTL (over $\mathcal{A}$) is defined by associating formulas
with pairs $(T, \Pi)$, where $T$ is a set of traces and $\Pi$ is a trace assignment:
$$\begin{aligned}
T, \Pi &\models \forall\pi.\varphi && \text{iff for all } t \in T, \text{ we have } T, \Pi[\pi \mapsto t] \models \varphi \\
T, \Pi &\models \exists\pi.\varphi && \text{iff there exists } t \in T, \text{ such that } T, \Pi[\pi \mapsto t] \models \varphi \\
T, \Pi &\models \psi && \text{iff } \Pi \models \psi
\end{aligned}$$
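The semantics of the temporal inner formulas is defined in terms of the traces associated with
each path (here $\Pi[i,..]$ denotes the map that assigns $\pi$ to $t[i,..]$ if $\Pi(\pi) = t$):
$$\begin{aligned}
\Pi &\models r(e_1, \ldots, e_n) && \text{iff } I_{\mathcal{A}}(r)([\![e_1]\!]_\Pi, \ldots, [\![e_n]\!]_\Pi) \text{ for } n = \mathrm{ar}(r) \\
\Pi &\models \psi_1 \lor \psi_2 && \text{iff } \Pi \models \psi_1 \text{ or } \Pi \models \psi_2 \\
\Pi &\models \neg\psi && \text{iff } \Pi \not\models \psi \\
\Pi &\models \bigcirc\psi && \text{iff } \Pi[1,..] \models \psi \\
\Pi &\models \psi_1 \,\mathcal{U}\, \psi_2 && \text{iff for some } i,\ \Pi[i,..] \models \psi_2, \text{ and for all } j < i,\ \Pi[j,..] \models \psi_1
\end{aligned}$$
The semantics of an expression $e$ is simply the value associated with the corresponding pair
of trace and object variables:
$$[\![x_\pi]\!]_\Pi = \Pi(\pi)[0](x)$$
We say that a set $T$ of traces satisfies a HyperLTL formula $\varphi$ (denoted $T \models \varphi$) if and only if
$T, \Pi_\emptyset \models \varphi$. In the rest of the paper, for brevity, we refer to ‘Relational HyperLTL’ as just
‘HyperLTL’.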
The propositional semantics introduced for HyperLTL (see [19]) is a special case of the
definition above, choosing $V_o = AP$ and the Boolean model $\mathcal{A} = \mathbf{2}$, which is defined as
$$\sigma_{\mathbf{2}} = (\{T\}, \mathrm{ar}(T) = 1) \qquad A_{\mathbf{2}} = \{\top, \bot\} \qquad I_{\mathbf{2}}(T)(a) \text{ iff } a = \top.$$
Since there is only one relation symbol in this structure, we may as well drop it and simply
write $a_\pi$ instead of $T(a_\pi)$, thus recovering the usual syntax of (propositional) HyperLTL. Note
that in this domain all atomic relations can be constructed using Boolean connectives. For
example, “$\pi$ and $\pi'$ agree on the value of $a$” can be written $(a_\pi \leftrightarrow a_{\pi'})$. In the remainder of
the paper, we refer to HyperLTL over a structure $\mathcal{A}$ as HyperLTL$_{\mathcal{A}}$. If no particular structure
is specified, we assume the usual syntax and semantics of HyperLTL$_{\mathbf{2}}$.
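Example 1 Recall the HyperLTL formulas (2) and (3) from the introduction
$$(2)\ \varphi_2 = \forall \pi. \forall \pi'.\ \Box(\mathit{coffee}_{\pi} \leftrightarrow \mathit{coffee}_{\pi'}) \qquad (3)\ \varphi_3 = \forall \pi. \exists \pi'.\ \Box(\neg \mathit{coffee}_{\pi} \rightarrow \mathit{coffee}_{\pi'})$$
and let $T = \{t_1, t_2, t_3\}$, where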
t1={ }{ ,}{ }··· t2={ ,}{ }{} · · · t3={ }{ }{ }··· .
and where the ellipses ‘···’ indicate that the last event is repeated indefinitely. Although
traces $t_1$ and $t_2$ together satisfy (2), $t_3$ does not agree with the other two, i.e. $\mathit{coffee} \in t_3[2]$ but
$\mathit{coffee} \notin t_1[2]$ and $\mathit{coffee} \notin t_2[2]$. Hence, $T \not\models \varphi_2$. On the other hand, $T \models \varphi_3$ because $t_3$ ensures
that there is always coffee somewhere.
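The reasoning in Example 1 can be replayed mechanically. In the sketch below, the concrete events are hypothetical stand-ins chosen only to match the description in the example (in particular, the second proposition `tea` is our invention), and a trace is a finite list of events whose last event is understood to repeat indefinitely. Under that assumption, checking an invariant of a state predicate only requires inspecting finitely many positions.

```python
from itertools import product


def at(trace, i):
    """Event at position i; the last listed event repeats forever."""
    return trace[min(i, len(trace) - 1)]


def always(beta, *traces):
    """Check that the state predicate beta holds at every position."""
    horizon = max(len(t) for t in traces)  # later positions repeat the last one
    return all(beta(*(at(t, i) for t in traces)) for i in range(horizon))


def forall_forall(beta, T):
    return all(always(beta, s, t) for s, t in product(T, repeat=2))


def forall_exists(beta, T):
    return all(any(always(beta, s, t) for t in T) for s in T)


C, X = "coffee", "tea"           # "tea" is a hypothetical second proposition
t1 = [{C}, {C, X}, {X}]          # hypothetical events, consistent with Example 1
t2 = [{C, X}, {C}, set()]
t3 = [{C}, {C}, {C}]
T = [t1, t2, t3]

phi2 = lambda p, q: (C in p) == (C in q)    # coffee at both venues or at neither
phi3 = lambda p, q: (C in p) or (C in q)    # no coffee at pi implies coffee at pi'

print(forall_forall(phi2, [t1, t2]))  # True
print(forall_forall(phi2, T))         # False
print(forall_exists(phi3, T))         # True
```

Note that this evaluates the formulas over a fixed, fully known set of traces. It is the satisfaction relation of the logic, not a monitor, which only ever sees finite observations.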
2.2 Semantic monitorability
Runtime verification is concerned with (1) generating a monitor from a formal specification
ϕ, and (2) using the monitor to detect whether or not ϕholds by observing events generated
by the system at run time. Informally, monitorability refers to the feasibility of monitoring
a property. Some properties are non-monitorable because no finite observation can lead to a
conclusive verdict. We now present some abstract definitions to encompass previous notions
of monitorability in a general way. These definitions are made concrete by instantiating them,
for example, to traces (for trace properties) or sets of traces (for hyperproperties), see Ex. 2
below.
– Observation. We refer to the finite information provided dynamically to the monitor up
to a given instant as an observation. We use $O$ and $O'$ to denote individual observations
and $\mathcal{O}$ to denote the set of all possible observations. The set $\mathcal{O}$ is equipped with an
extension order $O \preceq O'$.
– System behavior. We use $\mathcal{B}$ to denote the universe of all possible behaviors of a system.
A behavior $B \in \mathcal{B}$ may, in general, be an infinite piece of information whereas an
observation is always finite. By abuse of notation, $O \preceq B$ denotes that observation
$O \in \mathcal{O}$ can be extended to a behavior $B$.
– Property. In this abstract sense, a property $P$ is a predicate on $\mathcal{B}$: some behaviors satisfy
the property $P$, while the rest violate it. We write $P(B)$ if the behavior $B$ satisfies property
$P$, and $\neg P(B)$ otherwise. We will use logics to specify properties of behaviors.
Example 2 When monitoring trace properties such as LTL, the set of observations is $\mathcal{O} = \Sigma^*$
because an observation is a finite trace $O \in \Sigma^*$. The order $\preceq$ is the prefix relation on strings.
The set of behaviors is $\mathcal{B} = \Sigma^\omega$. Every formula $\varphi$ induces a property $P_\varphi$ on traces via its
semantics: $P_\varphi(t)$ iff $t \models \varphi$.
When monitoring hyperproperties expressed in a logic such as HyperLTL, an observation
is a finite set of finite traces $O \subseteq \Sigma^*$, that is, $\mathcal{O} = \mathcal{P}_{fin}(\Sigma^*)$. The order $\preceq$ is the prefix
relation for finite sets of finite traces defined in the beginning of this section. That is, $O \preceq O'$
whenever for all $t \in O$ there is a $t' \in O'$ such that $t \preceq t'$. Behaviors are now sets of infinite
traces, so $\mathcal{B} = \mathcal{P}(\Sigma^\omega)$. Finally, the semantics $\cdot \models \varphi$ of a HyperLTL formula $\varphi$ defines a
property on sets of traces.
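We adapt the notions of satisfaction and violation of a property $P$ from behaviors to
observations as follows. An observation $O \in \mathcal{O}$ permanently satisfies a property $P$ if every
behavior $B \in \mathcal{B}$ that extends $O$ satisfies $P$:
$$P(O) \text{ iff for all } B \in \mathcal{B} \text{ such that } O \preceq B \text{ we have } P(B).$$
Similarly, an observation $O$ permanently violates a property $P$ if every extension $B \in \mathcal{B}$ of
$O$ violates $P$:
$$\neg P(O) \text{ iff for all } B \in \mathcal{B} \text{ such that } O \preceq B \text{ we have } \neg P(B).$$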
This slight abuse of notation is justified by the fact that an observation permanently violates
a property if and only if it permanently satisfies the complement property (and vice versa).
Obviously, no observation can permanently violate and permanently satisfy a given property.
However, many observations neither permanently satisfy nor violate a given property, as
different extensions of the observation have different outcomes with respect to the property.
These definitions can be instantiated to logics like LTL and HyperLTL. Given an LTL
formula $\varphi$, a finite prefix $u$ permanently satisfies $\varphi$ if $v \models \varphi$ for all infinite extensions $v \succeq u$,
and $u$ permanently violates $\varphi$ if $v \not\models \varphi$ for all infinite extensions $v \succeq u$. We write $u \models_s \varphi$
if $u$ permanently satisfies $\varphi$, and $u \models_v \varphi$ if $u$ permanently violates $\varphi$. The definitions and
notations for permanent satisfaction and violation of HyperLTL formulas are obtained by
replacing finite traces $u$ with finite sets of finite traces $U$, and infinite traces $v$ with sets of infinite traces $V$.
To monitor a system for satisfaction (or violation) of a formula ϕis to decide whether a
finite observation permanently satisfies (resp. violates) ϕ.
Definition 2 (semantic black-box monitorability) A property $P$ is (semantically) black-box
monitorable if, for every observation $O$, there exists an extended observation $O' \succeq O$, such
that $P(O')$ or $\neg P(O')$.
Note that Def. 2 states that for every observation $O$ there is hope of reaching a final verdict
by extending the observation. Instantiating Def. 2 for LTL with finite traces as observations
($\mathcal{O} = \Sigma^*$ and $\mathcal{B} = \Sigma^\omega$) leads to the traditional definition of monitorability for LTL by Pnueli
and Zaks [38] (see also [13,26]) that states that a property is monitorable after observing $O$
if $O$ is not an ugly prefix, where a prefix is ugly if it cannot be extended into a definite verdict.
The fact that $O$ is not an ugly prefix is called $PZ(O)$ (for Pnueli-Zaks) in [2]. Note that a
prefix is hopeless precisely when it is ugly. See Sect. 6 for a longer discussion on the different
notions of monitorability, particularly the taxonomy discussed in [2].
Similarly, instantiating Def. 2 for HyperLTL with observations as finite sets of finite traces
leads to monitorability as introduced by Agrawal and Bonakdarpour [3].
Example 3 The LTL formula $\Box\Diamond a$ is not (semantically) black-box monitorable since it
requires an infinite-length observation, while formulas $\Box a$ and $\Diamond a$ are monitorable: $\Box a$
is monitorable for violations as the monitor will flag a violation as soon as $a$ is not seen,
while an (acceptance) monitor for $\Diamond a$ will signal acceptance when observing an $a$. Similarly,
$\forall \pi. \forall \tau.\ \Box(\mathit{coffee}_{\pi} \rightarrow \mathit{coffee}_{\tau})$ is (semantically) black-box monitorable for violation (it suffices to
identify a case where there is coffee in the $\pi$ trace but not in the corresponding place
in the $\tau$ trace). On the other hand, $\forall \pi. \exists \tau.\ \Box(\neg \mathit{coffee}_{\pi} \rightarrow \mathit{coffee}_{\tau})$ is not (semantically) black-box
monitorable (neither for acceptance nor for violation) due to the quantifier alternation. We
will prove this claim in detail in Sect. 3.
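The monitorable cases of Example 3 admit very simple black-box monitors, sketched below. We assume events are Python sets of proposition names, an LTL observation is a finite list of events, and a hyperproperty observation is a finite collection of such lists; the `Verdict` type and the function names are hypothetical. Each monitor returns a definite verdict only when the observation permanently violates or satisfies the formula.

```python
from enum import Enum


class Verdict(Enum):
    SATISFIED = "permanently satisfied"
    VIOLATED = "permanently violated"
    UNKNOWN = "inconclusive"


def monitor_always(a, observation):
    """Monitor 'always a': an event without a is a permanent violation."""
    if any(a not in event for event in observation):
        return Verdict.VIOLATED
    return Verdict.UNKNOWN  # some extension may still drop a


def monitor_eventually(a, observation):
    """Monitor 'eventually a': observing a once is permanent satisfaction."""
    if any(a in event for event in observation):
        return Verdict.SATISFIED
    return Verdict.UNKNOWN  # a may still occur later


def monitor_forall_forall(a, observation):
    """Monitor 'for all pi, tau: always (a_pi implies a_tau)' for violations."""
    for s in observation:
        for t in observation:
            # zip stops at the shorter trace: only commonly observed positions count
            for event_s, event_t in zip(s, t):
                if a in event_s and a not in event_t:
                    return Verdict.VIOLATED
    return Verdict.UNKNOWN


print(monitor_always("a", [{"a"}, set()]))      # Verdict.VIOLATED
print(monitor_eventually("a", [set(), {"a"}]))  # Verdict.SATISFIED
print(monitor_forall_forall("coffee", [[{"coffee"}, {"coffee"}],
                                        [{"coffee"}, set()]]))  # Verdict.VIOLATED
```

No analogous monitor exists for the formula with the quantifier alternation: whatever finite set of traces has been observed, a later trace may supply the missing witness or a new counterexample.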
3 Improved monitorability by gray-box monitoring
Most of the previous definitions of monitorability in the literature assume one or a combina-
tion of the following:
– the logics are limited to trace logics, not considering hyperproperties [13,26,38];
– the system under analysis is black-box in the sense that the monitoring process must
consider every further observation as a plausible extension [3,13,26], that is, the monitor
makes decisions only based on the observed traces and it does not take into account the
set of possible traces that the system can exhibit;
– the logics are tractable, in the sense that the decision problems of satisfiability, liveness,
etc. are decidable for these logics. For example, one can decide the existence of an
extended observation that satisfies or falsifies the given property [3,13,26,38].
We present here a more general notion of monitorability by challenging these assumptions.
First, in Sect. 3.1 we prove negative results about the limitations of monitoring hyperproperties
in a black-box manner. Then, in Sect. 3.2 we introduce the refined notions of gray-box
monitors and study the relation between their monitoring power with respect to semantic
monitorability.
3.1 The limitations of (black-box) monitoring hyperproperties
Earlier work on monitoring hyperproperties is restricted to the quantifier alternation-free
fragment, that is, to properties that use only universal or only existential trace quantifiers.
We now establish an impossibility result about the monitorability of formulas of the form
$\forall \pi. \exists \pi'.\ \Box\beta$, where $\beta$ is a state predicate. For simplicity, we present the result for two
traces, but it can easily be extended to arbitrarily many traces (two or more) and $\forall^+\exists^+$
hyperproperties. Intuitively speaking, the impossibility result works by first fixing a class of
formulas, and then extending any observation both into a satisfying behavior (typically the
set of all traces) and into a violating behavior (one that fails to have a corresponding trace
for the inner quantifier). Therefore, it is hopeless to monitor such formulas as one is unable
to produce a definite verdict no matter what the observation is. Note that new observations
make it easier to find corresponding observations for the inner $\exists$, but create a further
challenge for the outer $\forall$.
For a pair of traces π and π′, a state predicate β is a relation over x_π and x_π′, or a
Boolean combination of such relations, that contains no temporal operators. Given a
pair of valuations v, v′, we write β(v, v′) as a shorthand for {π ↦ vvv···, π′ ↦ v′v′v′···} ⊨ β,
i.e. β evaluated at v and v′. For example, a predicate β relating a_π and b_π′, for
Vo = AP = {a, b}, depends on the valuation of a and b at the first state of paths π and π′,
respectively.
A predicate β is reflexive if β(v, v) holds for all valuations v. A predicate β is serial
if, for all v, there is a v′ such that β(v, v′) holds. For example, the predicate β ≡ ¬☕_π → ☕_π′
is non-reflexive (β(∅, ∅) is false) and serial (β(v, {☕}) holds for any v).
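As an illustration, reflexivity and seriality can be checked by enumeration when the set of valuations is finite. The following Python sketch is not part of the original development: the function names are hypothetical, valuations are encoded as sets of true propositions, and `coffee_beta` is an assumed reading of the coffee predicate used in the running example.

```python
from itertools import chain, combinations

def valuations(props):
    """All valuations over props, each encoded as the frozenset of true propositions."""
    items = sorted(props)
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def is_reflexive(beta, vals):
    """beta(v, v) holds for every valuation v."""
    return all(beta(v, v) for v in vals)

def is_serial(beta, vals):
    """Every valuation v has some partner w with beta(v, w)."""
    return all(any(beta(v, w) for w in vals) for v in vals)

def coffee_beta(v, w):
    # Assumed reading of the running example: no coffee in pi implies coffee in pi'.
    return ("coffee" in v) or ("coffee" in w)

VALS = valuations({"coffee"})
print(is_reflexive(coffee_beta, VALS))  # False
print(is_serial(coffee_beta, VALS))     # True
```

A predicate that is serial but not reflexive, as here, is exactly the situation in which no definite black-box verdict can be reached.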
The following theorem captures precisely the monitorability of a safety ∀∃ hyperproperty.
Theorem 1 A HyperLTL_A formula of the form ϕ = ∀π.∃π′.□β is (semantically) black-box
monitorable if and only if β is reflexive or non-serial.
Proof Let ϕ = ∀π.∃π′.□β. We show the two directions separately.
(⇒) Assume ϕ is black-box monitorable. Then there must be a finite set V of finite traces
(an extension of the empty set) such that V ⊨^s ϕ or V ⊨^v ϕ. We analyze the two cases
in turn.
– Case V ⊨^s ϕ. Pick an arbitrary w ∈ Σ and let u = www··· be the trace repeating
  w indefinitely. Extend V into the set T = {vu | v ∈ V} of infinite traces. Let i be
  the length of the longest prefix in V. Then t[i] = w for all t ∈ T. Since V ⊨^s ϕ,
  we have T ⊨ ϕ and, in particular, t[i], t′[i] ⊨ β for some t, t′ ∈ T. Hence β(w, w)
  for any w, so β is reflexive.
– Case V ⊨^v ϕ. Because Σ^ω extends V, we must have Σ^ω ⊭ ϕ. Now assume that
  β is serial. Then the universal set Σ^ω is a model of ϕ because we can construct, for
  every trace s ∈ Σ^ω, a trace t ∈ Σ^ω such that β(s[i], t[i]) for any i. Hence β cannot
  be serial.
(⇐) Either β is reflexive or non-serial. We consider each case in turn.
– If β is reflexive then ϕ holds for every non-empty set of infinite words by picking
  the same trace to instantiate π and π′. Therefore ϕ is black-box monitorable (in fact,
  guaranteed to be permanently satisfied for any observation).
– Assume that β is non-serial. Then there must be a valuation v such that ¬β(v, v′)
  for any v′. Consider an arbitrary observation U and extend one u ∈ U into uv.
  The resulting observation permanently violates ϕ because there is no candidate for π′ that
  matches uv at the position where v occurs.
This concludes the proof.
In the proof of Theorem 1, the existence of an observation that permanently violates or
permanently satisfies ϕ is sufficient to show reflexivity (for satisfaction) and non-seriality
(for violation) of the state predicate. Similarly, in the other direction non-seriality implies
black-box monitorability because for every observation there is an extended observation that
is a permanent violation. However, reflexivity implies a stronger notion of monitorability
than that defined in Def. 2, as every observation is a permanent satisfaction. In other words,
a future observation with a definite verdict is not merely possible, it is guaranteed.
The fragment of ∀∃ properties captured by Theorem 1 is very general. First, the temporal
fragment considered in Theorem 1 is safety, which comprises the simplest properties in terms
of the temporal operators considered. Second, every binary predicate can be turned into a
non-reflexive predicate by distinguishing the traces being related. In fact, many relational
properties, such as non-interference and DDM, contain a tacit assumption that only distinct
traces are being related and therefore the relation is non-reflexive. Seriality simply establishes
that β cannot be falsified by only observing the local valuation of one of the traces. Intuitively,
a non-serial predicate can be falsified by looking only at one of the traces, so the property is
not a proper hyperproperty as it is not truly relational.
The proof of Theorem 1 makes essential use of the □ operator in the property ∀π.∃π′.□β.
One may be tempted to conclude that non-monitorability is therefore mainly a problem for
temporal formulas. However, this is not the case. The following theorem shows that black-box
monitorability is also impossible for a large class of completely non-temporal formulas.
We say that a subset V of valuations is a sink of a state predicate β if, for all valuations
u, either β(u, u) holds or there is a v ∈ V such that β(u, v) holds. If, in addition, V is
finite, we say V is a finite sink.
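The sink condition can be tested mechanically on a finite sample of valuations. The Python sketch below is only an illustration under stated assumptions: `is_sink` and the predicate `differ` are hypothetical names, and checking a finite sample does not by itself establish a sink over an infinite domain (for `differ` the argument does generalize, since every value differs from 0 or from 1).

```python
def is_sink(beta, candidates, vals):
    """candidates is a sink of beta over vals: each u is related to itself or to some v."""
    return all(beta(u, u) or any(beta(u, v) for v in candidates) for u in vals)

def differ(u, v):
    # Hypothetical state predicate: the two valuations are distinct.
    return u != v

SAMPLE = range(100)  # finite sample of an infinite domain such as the naturals
print(is_sink(differ, {0}, SAMPLE))     # False: 0 is not related to itself or to 0
print(is_sink(differ, {0, 1}, SAMPLE))  # True: every u differs from 0 or from 1
```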
Theorem 2 A HyperLTL_A formula of the form ϕ = ∀π.∃π′.β is (semantically) black-box
monitorable if and only if β is non-serial or has a finite sink.
Proof Let ϕ = ∀π.∃π′.β. We show the two directions separately.
(⇒) Assume ϕ is black-box monitorable. Then, there must be a finite V (extending the
empty set) such that V ⊨^s ϕ or V ⊨^v ϕ. We analyze the two cases.
– Case V ⊨^s ϕ. Let V[0] = {v[0] | v ∈ V}. We show that V[0] is a finite sink
  of β. Let T = {vvv··· | v ∈ V}. Pick an arbitrary u ∈ Σ, let t = uuu···, and
  T′ = T ∪ {t}. Because V permanently satisfies ϕ and V ⪯ T′, we have T′ ⊨ ϕ.
  In particular, {π ↦ t, π′ ↦ t′} ⊨ β for t′ = t or t′ ∈ T. If t′ = t then β(u, u),
  otherwise β(u, v[0]) for some v ∈ V. Hence V[0] is a finite sink of β.
– Case V ⊨^v ϕ. Because Σ^ω extends V, we must have Σ^ω ⊭ ϕ. Now assume that
  β is serial. Then the universal set Σ^ω is a model of ϕ because we can construct, for
  every trace s ∈ Σ^ω, a trace t ∈ Σ^ω such that β(s[0], t[0]). Hence β cannot be serial.
(⇐) Either β is non-serial or β has a finite sink. We consider each case.
– Assume β is non-serial. Then there must be a valuation u such that ¬β(u, v)
  for any v. Hence, given any finite set U of finite traces, U ∪ {u} permanently
  violates ϕ.
– Assume β has a finite sink W. Then W is a finite set of finite traces (each trace
  being a singleton) and W permanently satisfies ϕ since, for any trace t, there is a
  w ∈ W ∪ {t[0]} such that β(t[0], w) holds. Given any other finite set U of finite
  traces, let V = U ∪ W. Then V permanently satisfies ϕ.
The practical consequence of Theorems 1 and 2 is that many hyperproperties involving one
quantifier alternation cannot be monitored. We also note that Theorems 1 and 2 establish the
minimal case for monitorability of HyperLTL formulas. Indeed, generalizing the ∀∃ fragment
to single-alternation fragments with more quantifiers (∀+∃+) does not change the proofs of
Theorems 1 and 2.
Note that neither Theorem 1 nor Theorem 2 makes any assumptions on the choice of
the structure A over which β is defined. Hence, the theorems hold for propositional
HyperLTL as well as for any extension to richer core logics. However, for finite domains A
(and finite sets of object variables Vo), Theorem 2 is trivial because the alphabet Σ = A^Vo
is a finite sink of any serial state predicate β. This is true, in particular, for the propositional
semantics of HyperLTL where Σ = 2^AP. For richer core logics, on the other hand, Theorem 2
is non-trivial. We will see an example of this in Sect. 4, where we analyze monitorability of
DDM.
Finally, note that even though the proofs of Theorem 1 and Theorem 2 have a very similar
structure, neither of them subsumes the other.
3.2 Gray-box monitoring and the notions of sound and perfect monitors
We want to expand the spectrum of properties we could eventually monitor, so we propose
to exploit knowledge about the set of traces that the system can produce (gray-box or white-
box monitoring). Given a system that can produce the set of system behaviors S ⊆ ℬ, we
parametrize the notions of permanent satisfaction and permanent violation to consider only
behaviors in S:
  P^s_S(O) iff for all B ∈ S such that O ⪯ B we have P(B),
  P^v_S(O) iff for all B ∈ S such that O ⪯ B we have ¬P(B).
First, we extend the definition of monitorability (Def. 2 above) to consider the system under
observation.
Definition 3 (semantic gray-box monitorability) A property P is semantically gray-box
monitorable for a system S if every observation O has an extended observation O′ ⪰ O in
S, such that P^s_S(O′) or P^v_S(O′).
Example 4 Recall the hyperproperty ϕ3 = ∀π.∃τ.□(¬☕_π → ☕_τ) from our introductory
example. We have already established that this property is not semantically black-box mon-
itorable because the state predicate ¬☕_π → ☕_τ is non-reflexive and serial. A closer look at
the proof of Theorem 1 reveals the exact culprit: ϕ3 is not monitorable for violation because
we cannot rule out the possibility that there will be a set extension in the future that adds
the trace t = {☕}{☕}{☕}···, which ensures that there is always coffee somewhere. This
set extension, though permitted in theory, seems dubious in practice. The number of cof-
fee dispensers is generally finite, even at the largest of conferences. If we constrain system
behaviors T ∈ S to contain at most n traces, i.e. |T| ≤ n for all T ∈ S, the property ϕ3
becomes semantically gray-box monitorable for violation. To see this, let U be an arbitrary
finite observation (containing at most n traces). To show that ϕ3 is gray-box monitorable
for violation, we need to exhibit a finite extension V of U that permanently violates ϕ3. Let
U′ be a set extension of U of size |U′| = n. Clearly, such an extension always exists
since Σ* = {{☕}, ∅}* contains an unbounded supply of distinct finite traces that can be
added to U. Denote the length of the longest trace in U′ by l and let V be the set of finite
traces obtained by padding every trace in U′ with ∅-events until the resulting trace v has
length |v| = l + 1. Then V ⪰ U′ ⪰ U and V is a set of n traces, all of length l + 1, such that
☕ ∉ v[l] for any v ∈ V. Any extension T of V in S is necessarily a trace extension (as
opposed to a set extension). Hence, ☕ ∉ t[l] for any trace t in any such T, and therefore
V ⊨^v_S ϕ3.
Gray-box monitorability strictly subsumes black-box monitorability as the latter corresponds
to the special case where S = ℬ. Hence, we will omit the qualifier “gray-box” when there is
no risk of confusion. In cases where S ⊂ ℬ, black-box monitorability is a (strictly) stronger
property. This is a corollary of the following lemma.
Lemma 1 Given a pair of systems S, S′ such that S′ ⊆ S, any property P that is semantically
monitorable for S is also semantically monitorable for S′.
Proof Let P be a property that is semantically monitorable for S. To show that P is also
semantically monitorable for S′, let O ∈ 𝒪 be an arbitrary observation. By assumption, O
must have a finite extension O′ ⪰ O that either permanently satisfies or violates P for S.
Assume P^s_S(O′) (the case for P^v_S(O′) is analogous). Because any B ∈ S′ is also in S, we
have P(B) for any B ⪰ O′ in S′, and hence P^s_S′(O′).
Lemma 1 says that semantically monitorable properties P of a system S remain monitorable
for subsystems S′ ⊆ S, but the converse does not hold in general: P need not be semantically
monitorable for super-systems S′ ⊇ S. Ex. 4 illustrates this: ϕ3 is semantically monitorable
for S = {T | |T| ≤ n} but not for P(Σ^ω) ⊋ S.
Following Def. 3, monitors must now analyze and decide properties of extended observa-
tions of a particular system. This, in turn, may not be computationally tractable for sufficiently
rich system descriptions. To study this issue, we now introduce a notion of monitors that
considers S and the computational power of monitors (the diagonal dimension in Fig. 1). A
monitor for a property P and a set of traces S is a computable function M_S : 𝒪 → {⊤, ⊥, ?}
that, given a finite observation O, decides a verdict for P:
– ⊤ indicates success,
– ⊥ indicates failure, and
– ? indicates that the monitor cannot declare a definite verdict given only O.
To avoid clutter, we write M instead of M_S when the system is clear from the context. The
following definition captures when a monitor for a property P can produce a definite answer.
Definition 4 (sound monitor) Given a property P and a set of behaviors S, a monitor M is
sound if, for every observation O ∈ 𝒪,
1. if M(O) = ⊤, then P^s_S(O),
2. if M(O) = ⊥, then P^v_S(O).
If a monitor is not sound then it is possible that an extension of O forces M to change a ⊤
to a ⊥ verdict, or vice-versa. The function that always outputs ‘?’ is a sound monitor for
any property, but this is the least informative monitor. A perfect monitor precisely outputs
whether satisfaction or violation is inevitable, which is the most informative monitor.
Definition 5 (perfect monitor) Given a property P and a set of traces S, a monitor M is
perfect if, for every observation O ∈ 𝒪,
1. if P^s_S(O), then M(O) = ⊤,
2. if P^v_S(O), then M(O) = ⊥,
3. otherwise M(O) = ?.
Obviously, a perfect monitor is sound. Similar definitions of perfect monitor only for sat-
isfaction (resp. violation) can be given by forcing the precise outcome only for satisfaction
(resp. violation).
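For a finite system whose behaviors are finite sets of singleton traces, a perfect monitor can be obtained by plain enumeration, since an extension is then just a superset. The following Python sketch is an illustration under these assumptions only: the system `SYSTEM`, the property `all_even`, and the verdict encoding are hypothetical, and observations that no behavior of the system extends are rejected.

```python
from itertools import combinations

TOP, BOTTOM, UNKNOWN = "top", "bottom", "?"

def perfect_monitor(prop, system, obs):
    """Brute-force perfect monitor for a finite system of finite behaviors."""
    extensions = [b for b in system if obs <= b]  # behaviors in S that extend obs
    if not extensions:
        raise ValueError("observation is not compatible with the system")
    if all(prop(b) for b in extensions):
        return TOP       # permanent satisfaction
    if not any(prop(b) for b in extensions):
        return BOTTOM    # permanent violation
    return UNKNOWN

# Hypothetical system: behaviors are sets of at most two events drawn from {0, 1, 2}.
SYSTEM = [frozenset(c) for k in range(3) for c in combinations(range(3), k)]

def all_even(behavior):
    return all(event % 2 == 0 for event in behavior)

print(perfect_monitor(all_even, SYSTEM, frozenset({1})))     # bottom
print(perfect_monitor(all_even, SYSTEM, frozenset({0})))     # ?
print(perfect_monitor(all_even, SYSTEM, frozenset({0, 2})))  # top
```

The last verdict relies on the size bound of the system: without it, a further odd event could always be added and no observation would permanently satisfy the property.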
Example 5 The following function is a monitor for ϕ3 from Ex. 4:
  M(U) = ?  if |U| < n,
         ?  if, for all i < max{|u| | u ∈ U}, there is a u ∈ U s.t. ☕ ∈ u[i],
         ⊥  otherwise,
where n is the maximum number of traces allowed by S, i.e. n = max{|T| | T ∈ S}.
Clearly, M is computable. Furthermore, M is a sound monitor for ϕ3, that is, U ⊨^v_S ϕ3
whenever M(U) = ⊥. For M to produce a ⊥ verdict, we must have |U| = n and there
must be an i such that ☕ ∉ u[i] for all u ∈ U. This means that any extension T of U in
S is necessarily a trace extension, and therefore ☕ ∉ t[i] for every trace t ∈ T. Hence, U
permanently violates ϕ3.
If n < ∞, then M is also a perfect monitor for ϕ3. First, note that no finite observation U
can permanently satisfy ϕ3 because we can extend every such U into a set of infinite traces that
violates ϕ3 by appending an infinite sequence of “¬☕” events to each trace in U. It remains
to show that U ⊭^v_S ϕ3 whenever M(U) = ?. There are two cases: if |U| < n, then we
can always extend U into a set of traces T ∈ S that contains the trace t = {☕}{☕}{☕}···
guaranteeing an indefinite supply of coffee; if, on the other hand, |U| = n but there has
always been coffee somewhere in U, i.e. for all i < max{|u| | u ∈ U}, there is some u ∈ U
such that ☕ ∈ u[i], then we can extend U into a set of infinite traces T = {ut | u ∈ U},
where again t = {☕}{☕}{☕}···. In either case, T ∈ S and T ⊨ ϕ3.
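The monitor of Example 5 is simple enough to sketch in code. The Python version below is a hedged illustration, not the authors' implementation: traces are encoded as lists of sets of propositions, the proposition name `"coffee"` and the verdict strings are assumptions, and positions beyond the end of a shorter trace are treated conservatively as positions where coffee may still appear.

```python
def monitor(observation, n):
    """Sketch of the monitor M from Example 5.

    observation: list of distinct finite traces, each a list of sets of propositions.
    n: assumed upper bound on the number of traces in any system behavior.
    """
    traces = list(observation)
    if len(traces) < n:
        return "?"  # a trace with coffee forever could still be added
    longest = max((len(u) for u in traces), default=0)
    for i in range(longest):
        # Conservative reading: a trace shorter than i + 1 may still serve coffee at i.
        covered = any(i >= len(u) or "coffee" in u[i] for u in traces)
        if not covered:
            return "bottom"  # no trace has coffee at position i, and none can be added
    return "?"

print(monitor([[{"coffee"}], [set()]], n=2))         # ?
print(monitor([[set()], [set(), {"coffee"}]], n=2))  # bottom
```

In the second call both traces are defined at position 0 and neither has coffee there, so the bound n = 2 rules out any rescue by a further trace.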
A black-box monitor assumes that every behavior is potentially possible, that is, S = ℬ.
If the monitor uses approximate information about the actual system, then we say it is gray-
box. We use white-box when the monitor can reason with absolute precision about the set of
traces of the system. In some cases, for example to decide instantiations of a ∀ quantifier, a
satisfaction verdict that is taken from S can be concluded for all over-approximations (dually
under-approximations for violation and for ∃).
Using Defs. 4 and 5, we now add the computability aspect to capture a stronger definition
of monitorability. Abusing notation, we use O ∈ S to say that the observation O can be
extended to a trace allowed by the system.
Definition 6 (strong monitorability) A property P is strongly monitorable for a system S
if there is a sound monitor M such that for all observations O ∈ 𝒪, there is an extended
observation O′ ⪰ O with O′ ∈ S for which either M(O′) = ⊤ or M(O′) = ⊥.
It is easy to see that if a property is strongly monitorable (for S) then it must be semantically
monitorable. But the converse does not hold: in rich domains, some semantically gray-
box monitorable properties may not be strongly monitorable. One simple example is the
observation of deterministic Turing machines. Consider the problem of monitoring whether a
given deterministic Turing machine with a given input writes 1 on the tape infinitely many
times, or only a finite number of times. Since there is only one trace, at a given point in time,
there is only one behavior (because the machine is deterministic). Therefore the property
is semantically monitorable. However, this property is not strongly monitorable as no such
monitor exists.
In what follows we will use the term monitorability to refer to strong monitorability
when there is no risk of confusion. Also, we say that a property is strongly monitorable
for satisfaction if the extension O′ with M(O′) = ⊤ always exists (and analogously for
violation).
Lemma 2 If P is strongly monitorable for a system S, then P is semantically (gray-box)
monitorable for S.
A property may not be monitorable in a black-box manner, but monitorable in a gray-box
manner. In the realm of monitoring of LTL properties, strong and semantic monitorability
coincide for finite state systems (see [43]), both in the black-box and in the gray-box case,
because model-checking and the problem of deciding
whether a state of a Büchi automaton is live are decidable.
Following the idea sketched in [15], we propose to use a combination of static analysis
and runtime verification to monitor violations of ∀+∃+ properties (or dually, satisfactions of
∃+∀+).
The main idea is to collect candidates for the outer ∀ part dynamically (traces from the real
observed execution) and then use static analysis at runtime to over-approximate the inner ∃
quantifiers. For the latter we consider traces coming not only from the monitored system but
also from other sources: they could be generated from a formal specification of the system,
or from a model obtained from the system via symbolic execution and predicate abstraction.
The key insight of our approach is that the current real execution of the system accounts for
the outermost quantifier, while static analysis/verification is applied to explore runs of the
model accounting for the innermost quantifier.
4 Monitoring distributed data minimality
In this section, we describe how to monitor DDM, which can be expressed as a hyperproperty
of the form ∀+∃+. The non-monitorability result from Sect. 3.1 can be generalized
to ∀+∃+ hyperproperties. In the particular case of DDM, although we mainly deal with the
input/output relation of functions and are not concerned with infinite temporal behavior,
we still need to handle possibly infinite set extensions S for black-box monitoring. In the
remainder of this section, we discuss the following, seemingly contradictory aspects of DDM:
(P1) DDM is not semantically black-box monitorable,
(P2) DDM is semantically white-box monitorable (for programs that are not DDM),
(P3) checking DDM statically is undecidable,
(P4) DDM is strongly gray-box monitorable for violation, and we give a sound monitor.
The apparent contradictions are resolved by careful analysis of DDM along the different
dimensions of the monitorability cube (Fig. 1).
We will show how to monitor DDM and similar hyperproperties using a gray-box
approach. In our approach, a monitor can decide the existence of traces at run time using a
limited form of static analysis. The static analyzer receives the finite observation O collected
by the monitor, but not the future system behavior. Instead it must reason under the assump-
tion that any system behavior in S that is compatible with O may eventually occur. For
example, given an ∃∀ formula, the outer existential quantifier is instantiated with a concrete
set U of runtime traces, while possible extensions of U provided by static analysis can be
used to instantiate the inner universal quantifier.
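To make the division of labor concrete, the following Python sketch mimics the gray-box scheme for detecting DDM violations on small finite domains. It is an illustration under strong assumptions, not the tool described in the paper: exhaustive enumeration (`oracle_distinguishes`) stands in for the static verifier, input positions are 0-based, and the function `g` and all other names are hypothetical.

```python
from itertools import product

def oracle_distinguishes(f, domains, i, x, y):
    """Stand-in for the static verifier: is there a z with f(z[i:=x]) != f(z[i:=y])?"""
    others = [d for k, d in enumerate(domains) if k != i]
    for rest in product(*others):
        if f(*rest[:i], x, *rest[i:]) != f(*rest[:i], y, *rest[i:]):
            return True
    return False

def violation_monitor(observed_inputs, f, domains):
    """Return 'bottom' if two observed inputs already witness a violation, else '?'."""
    for i in range(len(domains)):
        seen = sorted({inputs[i] for inputs in observed_inputs})
        for x in seen:
            for y in seen:
                if x != y and not oracle_distinguishes(f, domains, i, x, y):
                    return "bottom"
    return "?"

def g(a, b):
    # Hypothetical non-minimal function: inputs 0 and 1 at position 0 are never told apart.
    return a // 2 + b

DOMAINS = [range(4), range(4)]
print(violation_monitor([(0, 0), (1, 3)], g, DOMAINS))  # bottom
print(violation_monitor([(0, 0), (2, 0)], g, DOMAINS))  # ?
```

The observed runs supply the candidates for the outer quantifiers, while the oracle answers the question about all possible runs that no finite observation could settle on its own.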
Fig. 2 A program for computing the total fee of a trip on a toll road
4.1 DDM preliminaries
We briefly recapitulate the concepts of data minimization and data minimality [7]. Before
giving their formal definitions, let us illustrate the ideas behind these concepts on the short
Java program shown in Fig. 2. Consider the method rate. Its purpose is to compute the
baseline rate to be paid by the driver of a vehicle on a toll road. The rate depends on the time of
day and the number of passengers in the vehicle. The range of the output is {56,70,72,90},
and consequently the data processor does not need to know the precise hour of the day, nor the
exact number of passengers. A vehicle might pass a toll station at any time between 9am and
5pm to be subject to the higher daytime rates (72, 90), and at any other time to benefit from
the lower nighttime rates (56, 70). Also, any vehicle occupied by three or more passengers is
eligible for a 20% carpool discount. Giving the actual hour and number of passengers violates
the principle of data minimality because more information than necessary is collected. Data
minimization is the process of ensuring that the range of inputs provided is reduced, such
that different inputs result in different outputs.
Formally, given a function f : I → O, the problem of data minimization consists in
finding a preprocessor function p : I → I, such that f = f ∘ p and p = p ∘ p. The goal of
p is to limit the information available to f while preserving the behavior of f. There are many
possible such preprocessors (e.g. the identity function), which can be ordered according to the
information they disclose, that is, according to the subset relation on their kernels. The kernel
ker(p) of a function p is defined as the equivalence relation (x, y) ∈ ker(p) iff p(x) = p(y).
The smaller ker(p) is, the more information p discloses. The identity function is the worst
preprocessor since it discloses all information (its kernel is equality, the least equivalence
relation). An optimal preprocessor, or minimizer, is one that discloses the least amount of
information.
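The two preprocessor conditions can be checked by enumeration on a small input space. The Python sketch below is a hypothetical reconstruction from the prose only: the body of `rate`, the daytime window (9:00 up to 17:00), the passenger range, and the preprocessor `preprocess` are all assumptions made for illustration and are not taken from Fig. 2.

```python
def rate(hour, passengers):
    # Hypothetical reconstruction from the prose; the exact daytime window is an assumption.
    base = 90 if 9 <= hour < 17 else 70
    return base * 4 // 5 if passengers >= 3 else base  # 20% carpool discount

def preprocess(hour, passengers):
    """Candidate preprocessor p: keep only what rate() actually depends on."""
    return (9 if 9 <= hour < 17 else 0, 3 if passengers >= 3 else 1)

INPUTS = [(h, k) for h in range(24) for k in range(1, 9)]

# f = f . p: preprocessing does not change the computed rate.
preserves_f = all(rate(*preprocess(*x)) == rate(*x) for x in INPUTS)
# p = p . p: preprocessing twice is the same as preprocessing once.
idempotent = all(preprocess(*preprocess(*x)) == preprocess(*x) for x in INPUTS)
classes = {preprocess(*x) for x in INPUTS}  # one representative per class of ker(p)
outputs = {rate(*x) for x in INPUTS}
print(preserves_f, idempotent, len(classes), sorted(outputs))
```

Under these assumptions the preprocessor has as many equivalence classes as the function has outputs, so it discloses no more than the output itself.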
A function f is monolithic data-minimal (MDM), if it fulfills either of the following
equivalent conditions:
1. the identity function is a minimizer for f,
2. f is injective.
Seeing as MDM is just injectivity (by Condition 2), one may wonder why we went through the
effort of defining preprocessors and their information order only to arrive at an equivalent but
more convoluted definition (Condition 1). The reason is that, besides MDM, there are other
variants of data minimality (e.g. DDM), whose logical characterizations are not as intuitive.
The advantage of Condition 1 is that it is an information-flow-based characterization that can
be generalized to more complicated settings in a relatively straightforward fashion (as we will
see shortly). Condition 2, on the other hand, is a purely logical or data-based characterization
suitable for implementation in e.g. a monitor.
MDM is the strongest form of data minimality, where one assumes that all input data
is provided by a single source and thus a single preprocessor can be used to minimize the
function. In a distributed setting, the concept of minimization is more complex as input data
may be collected from multiple independent sources. Consider the method fee in Fig. 2.
This method computes the total fee for a trip on a toll road, based on the hours at which a
vehicle passes three consecutive toll stations, and on the number of passengers in the vehicle.
The overall fee depends on the total time spent on the toll road, which is data collected from
all three toll stations. In particular, if a vehicle enters a section of the toll road during a
low-rate early morning hour, but fails to reach the next station before 9am, the driver will be
charged the more expensive daytime rate for the entire section. DDM requires minimizing
each input parameter (i.e. the information collected at separate toll stations) individually.
A preprocessor located at any given toll station can easily minimize the individual inputs
(hour, passengers) at that station. But an individual preprocessor cannot guarantee
minimization with respect to the overall fee since it has no information about the input data
collected at the other stations. DDM therefore constitutes merely a “best effort” to minimize
inputs given the inherently distributed nature of the system.
As a second example, consider a web-based auction system that accepts bids from n
bidders, represented by distinct input domains I1, ..., In, and where concrete bids xk ∈ Ik
are submitted remotely. The auction system must compute the function m(x1, ..., xn) =
max_k {xk}, which is clearly non-injective and, hence, non-MDM. In this case, a single, mono-
lithic minimizer cannot be used since different bidders need not have any knowledge of each
other’s bids. Instead, bidders must try to minimize the information contained in their bid
locally, in a distributed way, before submitting it to the auction.
The problem of distributed data minimization consists in building a collection p1, ..., pn
of n independent preprocessors pk : Ik → Ik for a given function f : I1 × ··· × In → O,
such that their product p(x1, ..., xn) = (p1(x1), ..., pn(xn)) is a preprocessor for f. Such
preprocessors are called distributed, and a distributed preprocessor for f that discloses the
least amount of information is called a distributed minimizer for f. Then, one can generalize
the (information-flow) notion of data-minimality to the distributed setting as follows. The
function f is distributed data-minimal (DDM) if the identity function is a distributed mini-
mizer for f. For example, the maximum function m defined above is DDM. As for MDM,
there is an equivalent, data-based characterization of DDM described next.
Proposition 1 (distributed data minimality [7,35]) A function f is DDM iff, for all input
positions k and all x, y ∈ Ik such that x ≠ y, there is some z ∈ I, such that f(z[k ↦ x]) ≠
f(z[k ↦ y]).
We refer the interested reader to Antignac et al. [7] for a more detailed discussion of data
minimality and for a detailed proof of Proposition 1.
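For functions over small finite domains, the condition of Proposition 1 can be checked exhaustively. The following Python sketch is only an illustration: `is_ddm` is a hypothetical helper, positions are 0-based, and the two example functions (the built-in `max` and a clamped sum) are chosen here for demonstration. Static checking in general is a different matter, as the domains are typically infinite.

```python
from itertools import product

def is_ddm(f, domains):
    """Check the condition of Proposition 1 by enumeration over finite domains."""
    for k, dom in enumerate(domains):
        others = [d for j, d in enumerate(domains) if j != k]
        for x, y in product(dom, repeat=2):
            if x == y:
                continue
            distinguished = any(
                f(*z[:k], x, *z[k:]) != f(*z[:k], y, *z[k:])
                for z in product(*others)
            )
            if not distinguished:
                return False  # x and y at position k are never told apart
    return True

print(is_ddm(max, [range(3), range(3)]))                         # True
print(is_ddm(lambda a, b: min(a, 1) + b, [range(3), range(3)]))  # False
```

For a unary function the list of other domains is empty, so the check degenerates to injectivity.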
The alternative characterization of DDM given in Prop. 1 serves as our formal specification
for exploring the monitorability of DDM. In the following, we assume that the function
f : I1 × ··· × In → O has at least two arguments (n ≥ 2). Note that for unary functions, DDM
coincides with MDM. Since MDM is a ∀+-property (involving no quantifier alternations),
most of the challenges to monitorability discussed here do not apply [36]. We also assume,
without loss of generality, that the function f being monitored has only nontrivial input
domains, i.e. |Ik| ≥ 2 for all k = 1, ..., n. If Ik is trivial then this constant input can be
ignored. Finally, note that checking DDM statically is undecidable (P3) for sufficiently rich
programming languages [7].
4.2 DDM as a hyperproperty
We consider data-minimality for total functions f : I → O. The domain of interpretation A_f
is the set of possible input-output (I/O) pairs of f, i.e. A_f = I × O. We do not distinguish
between different observables (each I/O pair is an observation in itself). Hence our alphabet,
or set of events, is just Σ_f = A_f = I × O, i.e. the set of object variables Vo is a singleton
set. Since a single I/O pair u ∈ Σ_f captures an entire run of f, we restrict ourselves
to observing singleton traces, i.e. traces of length |u| = 1. In other words, we ignore any
temporal aspects associated with the computation of f. This allows us to use first-order
predicate logic, without any temporal modalities, as our specification logic.
DDM is a hyperproperty, expressed as a predicate over sets of traces, even though the
traces are I/O pairs. The set of observable behaviors 𝒪_f of a given f consists of all finite
sets of I/O pairs, 𝒪_f = P_fin(Σ_f). The set of all possible system behaviors ℬ_f = P(Σ_f)
additionally includes infinite sets of I/O pairs.
Example 6 Let f : N × N → N be the addition function on natural numbers, f(x, y) = x + y.
Then I = N × N, O = N, and a valid trace u ∈ Σ_f takes the form u = ((x, y), z), where
x, y and z are all naturals. Both U = {((1, 2), 3), ((2, 1), 3)} and V = {((1, 1), 3)} are
considered observable behaviors U, V ∈ 𝒪_f, even though V does not correspond to a valid
system behavior since f(1, 1) ≠ 3. Remember that we do not discriminate between valid
and invalid system behaviors in a black-box setting.
For a function f with n inputs, we define the set of binary relation symbols
  S_f = {output} ∪ ⋃_{i=1}^{n} {same_i, almost_i}.
This induces a signature σ_f = (S_f, ar(r) = 2). Given a tuple x = (x1, x2, ..., xn), we write
proj_i(x) or simply x_i for its i-th projection. Given an I/O pair u = (x, y), we use in(u) for
the input component and out(u) for the output component (i.e. in(u) = x and out(u) = y).
The interpretation of the relations in S_f is
  I(output)(u, v)    iff  out(u) = out(v)             (u and v agree on their output),
  I(same_i)(u, v)    iff  in(u)_i = in(v)_i           (u and v agree on the i-th input),
  I(almost_i)(u, v)  iff  ⋀_{k≠i} in(u)_k = in(v)_k   (u and v agree on all but the i-th input).
Example 7 Let Π = {π ↦ ((1, 2), 3), π′ ↦ ((2, 1), 3)}. Then Π ⊨ output(π, π′), but
Π ⊭ same_1(π, π′) and Π ⊭ almost_1(π, π′).
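The three relations are straightforward to encode on I/O pairs represented as nested tuples. The Python sketch below is an illustration only; the helper names `inp` and `out` are hypothetical stand-ins for the in and out projections (`in` is a reserved word in Python), and positions are 1-based as in the text.

```python
def inp(u):
    return u[0]

def out(u):
    return u[1]

def output(u, v):
    return out(u) == out(v)

def same(i, u, v):
    return inp(u)[i - 1] == inp(v)[i - 1]  # positions are 1-based, as in the text

def almost(i, u, v):
    pairs = enumerate(zip(inp(u), inp(v)), start=1)
    return all(a == b for k, (a, b) in pairs if k != i)

pi, pi_prime = ((1, 2), 3), ((2, 1), 3)
print(output(pi, pi_prime), same(1, pi, pi_prime), almost(1, pi, pi_prime))
# True False False
```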
We define DDM for input argument i as follows:
  ϕi = ∀π.∀π′.∃τ.∃τ′. ¬same_i(π, π′) → (same_i(π, τ) ∧ same_i(π′, τ′) ∧
         almost_i(τ, τ′) ∧ ¬output(τ, τ′))
In words: given any pair of traces π and π′, if the inputs of π and π′ differ in their i-th
position, then there must be some common values z for the remaining inputs, such that the
outputs of f for in(τ) = z[i ↦ in(π)_i] and in(τ′) = z[i ↦ in(π′)_i] differ. Note that z does
not appear in ϕi directly, instead it is determined implicitly by the (existentially quantified)
traces τ and τ′. Finally, distributed data minimality for f is defined as
  ϕdm = ⋀_{i=1}^{n} ϕi.
The property ϕdm follows the same structure as the logical characterization of DDM given
in Prop. 1. The universally quantified variables range over the possible inputs at position i,
while the existentially quantified variables τ and τ′ range over the other inputs and the outputs.
Note also that, given the input coordinates of π, π′, and τ, all the output coordinates, as well
as the input coordinates of τ′, are uniquely determined. Note that even though ϕdm is not in
prenex normal form, it is a finite conjunction of ∀∀∃∃ formulas in prenex normal form, so a
finite number of monitors can be built and executed in parallel, one per input argument.
Example 8 Consider again U = {((1, 2), 3), ((2, 1), 3)} and V = {((1, 1), 3)} from Ex. 6.
Then, V ⊨ ϕdm trivially holds, but U ⊭ ϕdm because when Π(π) ≠ Π(π′) there is no
choice of Π(τ), Π(τ′) ∈ U for which Π ⊨ ¬output(τ, τ′) holds.
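The two verdicts of Example 8 can be reproduced by evaluating the formula directly over a finite set of I/O pairs. The Python sketch below is an illustration under the assumption that all four quantifiers range over the given finite set; the function names are hypothetical and positions are 1-based.

```python
def same(i, u, v):
    return u[0][i - 1] == v[0][i - 1]

def almost(i, u, v):
    return all(a == b for k, (a, b) in enumerate(zip(u[0], v[0]), start=1) if k != i)

def output(u, v):
    return u[1] == v[1]

def holds_phi_i(traces, i):
    """Evaluate phi_i with all four quantifiers ranging over the finite set traces."""
    return all(
        same(i, p, q) or any(
            same(i, p, t) and same(i, q, s) and almost(i, t, s) and not output(t, s)
            for t in traces for s in traces
        )
        for p in traces for q in traces
    )

def holds_phi_dm(traces, n):
    return all(holds_phi_i(traces, i) for i in range(1, n + 1))

U = {((1, 2), 3), ((2, 1), 3)}
V = {((1, 1), 3)}
print(holds_phi_dm(U, 2), holds_phi_dm(V, 2))  # False True
```

Note that the evaluation never consults the function that produced the pairs, which is why V is accepted even though it is not a valid behavior of addition.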
Note that, in the above example, V ⊨ ϕdm holds despite the fact that V is not a valid behavior
of the example function f(x, y) = x + y. Indeed, whether or not U ⊨ ϕdm holds for a given
U is independent of the choice of f. In particular, Σ_f ⊨ ϕdm, for any choice of f regardless
of whether f is data-minimal or not. This is already a hint that the notion of semantic black-
box monitorability is too weak to be useful when monitoring ϕdm. Since Σ_f is a model of
ϕdm, no observation U can have an extension that permanently violates ϕdm. As we will see
shortly, gray-box monitoring does not suffer from this limitation. Monitorability of DDM
for violations becomes possible once we exclude potential models such as Σ_f which do not
correspond to valid system behaviors.
Remark Note that although our definition and approach work for general (reactive) systems,
the DDM example is admittedly a non-reactive system with traces of length 1. This, however,
is not a limitation of the approach. Extending DDM for reactive systems is left as future work.
4.3 Properties of DDM
Since ϕdm is a ∀+∃+ property, it should not come as a surprise that it is not semantically
black-box monitorable in general (P1). Lemma 3 below essentially follows from Theorem 2.
Lemma 3 (black-box non-monitorability) Assume f : I → O, then ϕdm is semantically
black-box monitorable iff I is finite.
Proof Although Theorem 2 does not apply exactly to the current setting (ϕdm has four
instead of two quantifiers), the proof goes through with only minor adjustments. We first
establish that ϕdm is serial (in the appropriate sense), from which it follows that ϕdm has
a finite sink if and only if Σ_f is finite. The remainder of the proof then follows the same
argument as that of Theorem 2.
Without loss of generality, assume that f is surjective. (Otherwise, replace the codomain
O with the image of f in the remainder of the proof.)
We first show that ϕdm is serial. Assume I and O each contain at least two elements.
Smaller I/O domains correspond to degenerate cases for which semantic black-box
monitorability is easy to show, so we omit them here. Let u, u′ be arbitrary I/O pairs, o ≠ o′ ∈ O
a pair of distinct outputs, and i an arbitrary input position. Define v = (in(u), o) and
v′ = (in(u)[i → in(u′)_i], o′). Then u, u′ and v, v′ are all in ⊤f, and it is easy to check
that ϕdm holds if the quantified variables are instantiated to these traces in the given order.
In other words, ϕdm is serial in the intuitive sense: for any instantiation of the universally
quantified variables to valuations in ⊤f, there is at least one corresponding instantiation of
the existentially quantified variables. (Note that this makes ⊤f a sink of ϕdm in a similar
sense.)
If I is finite, then (by assumption) so is ⊤f, and hence ϕdm has a “finite sink”. Concretely,
⊤f is a finite extension of any finite observation U, and it is the largest such observation, so
it permanently satisfies ϕdm. It follows that ϕdm is semantically black-box monitorable for
satisfaction.
Assume instead that I is infinite, and let U be any finite set of traces. To show that U
neither permanently satisfies nor permanently violates ϕdm, it is sufficient to exhibit a pair of
extensions Ts, Tv ⊇ U that satisfy and violate ϕdm, respectively. For Ts, we pick Ts = ⊤f.
Since ϕdm is serial, Ts ⊨^s ϕdm.
We have to work slightly harder to construct Tv. Intuitively, we must show that ϕdm does
not have a “finite sink”. Concretely, we show that any finite observation U can be extended
with a pair of traces v, v′ such that U ∪ {v, v′} violates ϕdm.
Since I is infinite but U is finite, there must be an input position i and a pair of distinct
elements x ≠ x′ ∈ I_i such that no trace in U has x or x′ as its i-th input. Pick some
arbitrary trace w ∈ ⊤f, and let v = w[i → x] and v′ = w[i → x′]. By construction,
v, v′ ∉ U, so Tv = U ∪ {v, v′} is a strict extension of U. To show that Tv does indeed
violate ϕdm, it is sufficient to show that Tv ⊨^v ϕ_i. Pick v, v′ to instantiate π and π′. Then
in(v)_i = x ≠ x′ = in(v′)_i by construction, but there is no way to instantiate τ and τ′: since
they have to agree with π and π′ on the i-th input position, the only candidates are v and v′,
but out(v) = out(v′) by construction.
Perhaps surprisingly, ϕdm is semantically white-box monitorable for violations (P2). That
is, if f is not DDM, there is hope to detect it. To make this statement more precise, we first need
to identify the set of valid system behaviors Sf of f. We define #f = {(x, y) | f(x) = y}
to be the set of I/O pairs that correspond to executions of f. Then Sf = P(#f) precisely
characterizes the set of valid system behaviors.
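The definition of valid behaviors can be made concrete for a finite sample of inputs. The following Python sketch is illustrative only: the helper names `graph` and `is_valid_behavior` are hypothetical, and restricting f to finite input domains is an assumption made to keep the example executable. It reuses the observations U and V of Example 8.

```python
from itertools import product


def graph(f, domains):
    """I/O pairs (x, f(x)) of f over a finite product of input domains."""
    return {(x, f(*x)) for x in product(*domains)}


def is_valid_behavior(f, traces):
    """A set of traces is a valid behavior iff every pair (x, y) satisfies f(x) = y."""
    return all(f(*x) == y for x, y in traces)


def add(x, y):
    # The example function f(x, y) = x + y discussed after Example 8.
    return x + y


U = {((1, 2), 3), ((2, 1), 3)}
V = {((1, 1), 3)}
```

Under these assumptions, `is_valid_behavior(add, U)` holds while `is_valid_behavior(add, V)` does not, which mirrors the earlier remark that V is not a valid behavior of f(x, y) = x + y.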
Example 9 Define g : ℕ × ℕ → ℕ as g(x, y) = x, i.e. g simply ignores its second argument.
Then #g = {((x, y), x) | x, y ∈ ℕ}. It is easy to show that DDM is white-box monitorable
for g. Any finite set of valid traces U can be extended to include a pair of traces u, u′ that only
differ in their second input value, e.g. u = ((1,1), 1) and u′ = ((1,2), 1). Now, consider any
T ∈ Sf that extends U ∪ {u, u′}. Clearly, T cannot contain any trace v for which in(v)_1 = 1
but out(v) ≠ 1 as that would constitute an invalid system behavior. But T would have to
contain such a trace to be a model of ϕ2. Hence, T ⊭ ϕdm for any such T, which means
U ∪ {u, u′} permanently violates ϕdm.
Note the crucial use of information about g in the above example: it is the restriction to valid
extensions T ∈ Sf that excludes trivial models such as ⊤f and thereby restores (semantic)
monitorability for violations. The apparent conflict between (P1) and (P2) is thus resolved.
With the extra information that gray-box monitoring affords, we can make more precise
claims about properties like DDM: whether or not a property is monitorable may, for instance,
depend on whether the property actually holds for the system under scrutiny. Concretely, for
the case of DDM, we show the following.
Theorem 3 Given a function f : I → O, the formula ϕdm is semantically gray-box
monitorable in Sf if and only if either f is distributed non-minimal or the input domain I is
finite.
Theorem 3 follows from the following two auxiliary lemmas.
Lemma 4 (semantic violation) If f is not DDM, then ϕdm is semantically monitorable for
violation (in Sf).
Proof Assume a finite set of traces U ∈ Sf. We need to show that there is a finite extension
V ⊇ U permitted by Sf that permanently violates ϕdm. First, note that the task is trivial if I
is finite: we simply pick V = #f, i.e. the set of all possible executions, which is also finite.
The only finite extension of V permitted by Sf is the complete set of traces #f itself, and
since f is not distributed minimal, ϕdm cannot hold for #f.
Assume instead that I is infinite. Since f is distributed non-minimal, there must be some
input position i and some pair of distinct inputs x ≠ x′ ∈ I_i, such that f(z[i → x]) =
f(z[i → x′]) for any choice of z ∈ I. Let y = z[i → x] and y′ = z[i → x′] for an arbitrary
z ∈ I. Then any set W ∈ Sf that contains the traces u = (y, f(y)) and u′ = (y′, f(y′))
violates ϕdm. To see this, assume instead that W ⊨^s_{Sf} ϕdm. Then there must be traces
v, v′ ∈ W that agree on all but the i-th input, such that f(in(v)[i → x]) ≠ f(in(v′)[i → x′]),
thus contradicting non-minimality of f. Hence, by picking V = U ∪ {u, u′}, we have
V ⊨^v_{Sf} ϕdm.
Lemma 5 (semantic satisfaction) If f : I → O is DDM, then ϕdm is semantically monitorable
for satisfaction (in Sf) if and only if I is finite.
Proof First, if I is finite the result follows by picking V = #f. Assume now that f is
distributed minimal, ϕdm is semantically monitorable for satisfaction, and I is infinite. Let
U ∈ Sf be some non-empty, finite set of traces with some distinguished element u ∈ U. Since
ϕdm is monitorable for satisfaction, there must be a finite extension V ⊇ U that permanently
satisfies ϕdm. To arrive at a contradiction, it suffices to construct a finite extension W ⊇ V
that does not satisfy ϕdm.
Pick an input position i for which I_i is infinite. Such an i must exist because otherwise
I would be a Cartesian product of finitely many finite sets and hence finite, whereas I is
infinite by assumption. Next, pick a pair of distinct elements x ≠ x′ ∈ I_i such that there are
no traces in V with x or x′ as their i-th input. Such x, x′ must also exist because I_i is
infinite but V is finite. Finally, pick an input position j ≠ i, and a y ∈ I_j such that
y ≠ in(u)_j. Such a y must exist for I_j to be non-trivial.
Now let z=in(u)[i→ x],z=in(u)[i→ x,jy]and w=(z,f(z)),w=
(z,f(z)).Thenwand ware clearly valid traces, i.e. w, w#
f,butw, w/Vsince w
and whave xand xas their i-th inputs, respectively. Let W=V∪{w, w}. By construction,
¬samei(π, π )holds if we instantiate πand πto wand w, respectively, but there is no
pair of traces v, vWto instantiate τ,τin such a way that samei(π, τ ),same
i)and
almosti(τ , τ )all hold simultaneously. The former force the choice τ→ wand τ→ w
but, by construction, in(w) j= in(w)j. Hence W| sϕdm and we arrive at a contradiction.
Intuitively, Theorem 3 means that, when I is infinite, f cannot be monitored for satisfaction.
Note that the semantic monitorability property established by Theorem 3 is independent of
whether we can actually decide DDM for the given f. We address the question of strong
monitorability later on in this section.
If I is finite, it is easy to strengthen Theorem 3 by providing a perfect monitor Mdm for
ϕdm. Since f is assumed to be a total function with a finite domain, we can simply check
the validity of ϕdm for every set of traces U ⊆ #f and tabulate the result. To do so, the ∀ and ∃
quantifiers in ϕdm can be converted into conjunctions and disjunctions over U.
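The conversion of quantifiers into finite conjunctions and disjunctions can be sketched in Python as follows. This is a minimal illustration, not the tabulating monitor itself. It assumes that ϕdm has the shape suggested by the proofs above: for every input position i and all traces π, π′ that differ on their i-th input, there are traces τ, τ′ with same_i(π, τ), same_i(π′, τ′), almost_i(τ, τ′) and different outputs. The function names are hypothetical, and traces are encoded as pairs of an input tuple and an output.

```python
def same(i, t, u):
    """Traces t and u agree on the i-th input."""
    return t[0][i] == u[0][i]


def almost(i, t, u):
    """Traces t and u agree on every input except possibly the i-th."""
    return all(a == b for k, (a, b) in enumerate(zip(t[0], u[0])) if k != i)


def holds_dm(traces):
    """Evaluate the assumed shape of the formula on a finite trace set.

    Universal quantifiers become all(...), existential ones become any(...).
    """
    traces = list(traces)
    if not traces:
        return True
    n = len(traces[0][0])  # number of input positions
    return all(
        same(i, p, q) or any(
            same(i, p, t) and same(i, q, s) and almost(i, t, s) and t[1] != s[1]
            for t in traces for s in traces)
        for i in range(n) for p in traces for q in traces)
```

On the observations of Example 8, this check accepts V = {((1,1), 3)} and rejects U = {((1,2), 3), ((2,1), 3)}, in line with the discussion there.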
Corollary 1 For f : I → O with finite I, ϕdm is strongly monitorable in Sf.
If I is infinite, then ϕdm is not semantically monitorable for satisfaction, but we can still hope
to build a sound monitor for violation of ϕdm.
4.4 Building a gray-box monitor for DDM
In what follows, we assume a computable function capable of deciding DDM only for some
instances. This function, which we call oracle, will serve as the basis for a sound monitor for
DDM (P4). This monitor will detect some, but not all, violations of DDM when given sets
of observed traces, thus resolving the apparent tension between (P3) and (P4).
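Given f : I_1 × ··· × I_n → O, we define the predicate ϕf as
$$\varphi_f(i, x, y) \;=\; \exists z \in I.\ f(z[i \to x]) \neq f(z[i \to y]), \tag{1}$$
and assume a total computable function Nf,i : I_i × I_i → {⊤, ⊥, ?} such that
$$N_{f,i}(x, y) = \begin{cases} \top \text{ or } ? & \text{if } \varphi_f(i, x, y) \text{ holds},\\ \bot \text{ or } ? & \text{otherwise.} \end{cases}$$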
The function Nf,i acts as our oracle to instantiate the existential quantifiers in ϕdm. As
discussed earlier, such oracles may be implemented by statically analyzing the system under
observation (here, the function f). In our proof-of-concept implementation, we extract ϕf
from f using symbolic execution, and we use an SMT solver to compute Nf,i (see Sect. 5
for details).
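We now define a monitor Mdm for ϕdm as follows:
$$M_{dm}(U) = \begin{cases} ? & \text{if } f(\mathrm{in}(u)) \neq \mathrm{out}(u) \text{ for some } u \in U,\\ ? & \text{if } \bigwedge_{i=1}^{n} \bigwedge_{u,u' \in U} N_{f,i}(\mathrm{in}(u)_i, \mathrm{in}(u')_i) \neq \bot,\\ \bot & \text{otherwise.} \end{cases}$$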
Intuitively, the monitor Mdm(U) checks the set of traces U for violations of DDM by verifying
two conditions: the first condition ensures the consistency of U, i.e. that every trace in U
does in fact correspond to a valid execution of f; the second condition is necessary for U
not to permanently violate ϕdm. Hence, if it fails, U must permanently violate ϕdm. Since
Nf,i is computable, so is Mdm. Note that Mdm never gives a positive verdict ⊤. This is a
consequence of Theorem 3: if f is DDM, then ϕdm is not monitorable in Sf. In other words,
DDM is not monitorable for satisfaction.
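The interplay between oracle and monitor can be illustrated with a small Python sketch. It is not the SMT-based oracle used by Minion: `make_oracle` is a hypothetical stand-in that searches a finite candidate set for a witness z and only answers "BOT" when the caller declares that set to be exhaustive. The monitor skips pairs of traces that agree on the i-th input, which is an assumption mirroring the ¬same_i guard of ϕdm rather than something the definition of Mdm above states explicitly.

```python
from itertools import product

TOP, BOT, UNKNOWN = "TOP", "BOT", "?"


def make_oracle(f, domains, exhaustive):
    """Build a three-valued oracle N(i, x, y) from a finite candidate set.

    `exhaustive` must only be True if `domains` covers the whole input domain;
    otherwise the absence of a witness is reported as UNKNOWN, never as BOT.
    """
    def oracle(i, x, y):
        for z in product(*domains):
            zx = z[:i] + (x,) + z[i + 1:]
            zy = z[:i] + (y,) + z[i + 1:]
            if f(*zx) != f(*zy):
                return TOP  # witness found: the two inputs are distinguishable
        return BOT if exhaustive else UNKNOWN
    return oracle


def monitor(f, traces, oracle):
    """Return BOT for a detected violation and UNKNOWN otherwise (never TOP)."""
    traces = list(traces)
    # First condition: an inconsistent trace yields an inconclusive verdict.
    if any(f(*inputs) != output for inputs, output in traces):
        return UNKNOWN
    n = len(traces[0][0]) if traces else 0
    # Second condition: look for a pair of observed inputs the oracle refutes.
    for i in range(n):
        for (u_in, _), (v_in, _) in product(traces, repeat=2):
            if u_in[i] != v_in[i] and oracle(i, u_in[i], v_in[i]) == BOT:
                return BOT
    return UNKNOWN
```

For f(x, y) = x over a small exhaustive domain, the traces ((1, 2), 1) and ((3, 4), 3) lead to the verdict BOT, because no choice of x distinguishes the second inputs 2 and 4. With a non-exhaustive candidate set, the same traces yield an inconclusive verdict, which keeps the sketch sound at the cost of missing violations.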
The second condition in the definition of Mdm is an approximation of ϕdm: the universal
quantifiers are replaced by conjunctions over the finite set of input traces U, while the
existential quantifiers are replaced by a single quantifier ranging over all of #f (not just U).
This approximation is justified formally by the following theorem.
Theorem 4 (soundness) The monitor Mdm is sound. Formally,
1. U ⊨^s_{Sf} ϕdm if Mdm(U) = ⊤, and
2. U ⊨^v_{Sf} ϕdm if Mdm(U) = ⊥.
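Proof The monitor never gives a verdict ⊤, so the first half of the theorem (satisfaction) holds
vacuously. For the second part (violation), we have
$$M_{dm}(U) = \bot \;\Leftrightarrow\; U \in S_f \,\wedge\, \bigvee_{i=1}^{n} \bigvee_{u,u' \in U} N_{f,i}(\mathrm{in}(u)_i, \mathrm{in}(u')_i) = \bot,$$
and
$$\begin{aligned}
& \bigvee_{i=1}^{n} \bigvee_{u,u' \in U} N_{f,i}(\mathrm{in}(u)_i, \mathrm{in}(u')_i) = \bot \\
\Rightarrow\ & \exists i\, \exists u,u' \in U.\ \neg\varphi_f(i, \mathrm{in}(u)_i, \mathrm{in}(u')_i) \\
\Leftrightarrow\ & \exists i\, \exists u,u' \in U.\ \forall z \in I.\ f(z[i \to \mathrm{in}(u)_i]) = f(z[i \to \mathrm{in}(u')_i]) \\
\Leftrightarrow\ & \exists i\, \exists u,u' \in U.\ \forall w \in \#f.\ f(\mathrm{in}(w)[i \to \mathrm{in}(u)_i]) = f(\mathrm{in}(w)[i \to \mathrm{in}(u')_i]) \\
\Rightarrow\ & \forall V \in S_f.\ U \subseteq V \Rightarrow \exists i\, \exists u,u' \in V.\ \forall w \in V.\ f(\mathrm{in}(w)[i \to \mathrm{in}(u)_i]) = f(\mathrm{in}(w)[i \to \mathrm{in}(u')_i]) \\
\Rightarrow\ & U \models^{v}_{S_f} \varphi_{dm}.
\end{aligned}$$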
In the next section, we describe a prototype implementation of Mdm .
5 Implementation and empirical evaluation
We have implemented the ideas described in Sect. 4 in a proof-of-concept monitor for DDM
called Minion. The monitor uses the symbolic execution API and the SMT backend of the KeY
deductive verification system [4,27] to extract logical characterizations of Java programs. It
then extends them to first-order formulas over sets of observed traces, and checks the result
using the state-of-the-art SMT solver Z3 [32,33]. Minion is written in Scala and provides a
simple command-line interface. Its source code is freely available online at
https://github.com/sstucki/minion/.
In the remainder of this section, we describe Minion in more detail and illustrate its
behavior on concrete example programs. We start with a high-level summary of symbolic
execution and how to use it for extracting logical characterizations of programs. We present
the different monitoring strategies implemented in Minion and illustrate how our tool deals
with complex control flow such as loops. We conclude the section by presenting a short
empirical evaluation of Minion, where we tested the two monitoring strategies for DDM
implemented in our tool on different workloads. Note that this evaluation does not constitute
a performance benchmark as performance has not been the focus of our proof-of-concept
implementation.
5.1 Extracting program characterizations via symbolic execution
Minion extracts logical characterizations of programs via symbolic execution, a process that
systematically explores the execution paths of a program on all possible inputs [16,28].
During symbolic execution, program inputs are kept abstract (symbolic) and the program
state (i.e. the values of local variables and memory locations) is represented by expressions
over these abstract inputs. When the program branches on an abstract value, the symbolic
execution engine explores both branches, keeping track of the path conditions that must
hold in the respective branches. The output of symbolic execution is a so-called symbolic
execution tree. Each branch of the tree corresponds to a possible execution path p of the
program, and each leaf summarizes the final state s_p of the corresponding path and the path
condition c_p under which it was obtained. The symbolic execution tree thus constitutes a
finite representation of all possible program behaviors, from which we can extract a
first-order formula ψ = ⋁_p (c_p ∧ s_p) describing the overall program behavior (cf. [7], Def. 8).
To build a gray-box monitor for DDM as described in Sect. 4.4, it is then easy to extend ψ
into the formula ϕf defined in (1) and submit it to an SMT solver such as Z3 to check its
satisfiability.
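How a symbolic execution tree turns into a logical characterization can be illustrated on a toy program. The sketch below is hand-written and does not use the KeY API: the list `LEAVES` is a hypothetical encoding of the two leaves of the tree for an absolute-value function, and ψ is read as the disjunction over all paths p of c_p ∧ s_p, which is an assumption about the exact shape of the formula.

```python
# Leaves of a hand-written symbolic execution tree for the toy program
#     return x if x >= 0 else -x
# Each leaf pairs a path condition c_p with a description s_p of the result.
LEAVES = [
    (lambda x: x >= 0, lambda x, out: out == x),
    (lambda x: x < 0, lambda x, out: out == -x),
]


def psi(leaves, x, out):
    """Evaluate the disjunction over paths of (c_p and s_p) on a concrete I/O pair."""
    return any(c(x) and s(x, out) for c, s in leaves)
```

Evaluating `psi(LEAVES, -3, 3)` succeeds because the second path applies, whereas `psi(LEAVES, -3, -3)` fails. A monitor can use the same formula to reject inconsistent traces before querying the solver.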
The main challenge in using symbolic execution is to deal effectively with path explosion.
Consecutive branches in a program can and, in practice, do increase the number of paths
exponentially, especially when the program contains loops or recursive function calls. Sym-
bolic execution engines implement various strategies to work around this issue, but these
are beyond the scope of this paper. We refer the interested reader to the excellent survey by
Baldoni et al. [8] for a general discussion of the topic, and to the KeY book [4] for con-
crete strategies implemented in the KeY symbolic execution engine used by Minion. We will
briefly return to the question of how Minion deals with loops in Sect. 5.3.
5.2 Monitoring strategies
Our tool can monitor Java programs for both monolithic and distributed data minimality
(MDM and DDM). We use the Java program from Fig. 2 in Sect. 4.1 as a running example to
illustrate the use of Minion for monitoring MDM and DDM. Consider the method fee of
the class Toll. The method does not contain any problematic control-flow constructs, such
as loops, recursion or exceptions, and always returns a result. We can therefore safely treat
fee as a (total) function. When running Minion on the method fee, the tool first builds
the symbolic execution tree of the method using the KeY API and translates it into a logical
specification suitable for dispatch to an SMT solver. This specification corresponds to the
predicate ϕfee defined in Sect. 4.4. Then, the monitor reads and parses traces in CSV format
from an input file or standard input. Whenever Minion parses a new trace, it rechecks the entire
set of traces read thus far for violation. In this way, the tool supports both online and offline
monitoring. The number and format of the inputs in a trace is determined automatically from
the method signature. Figure 3a shows example traces for the fee method. Columns 1–4
correspond to the parameters h1,h2,h3 and p, respectively, while column 5 contains the
result computed by fee for the given values.
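The online re-checking loop described above can be sketched as follows. This is an illustration under stated assumptions, not Minion's actual parser: it assumes integer-valued columns with the result in the last column, and the names `parse_trace` and `monitor_online` as well as the `check` callback are hypothetical.

```python
import csv
import io


def parse_trace(row):
    """Split a CSV row into (inputs, output); the last column is the result."""
    *inputs, output = (int(cell) for cell in row)
    return tuple(inputs), output


def monitor_online(lines, check):
    """Yield the verdict of check(traces) after each newly parsed trace.

    The whole set of traces read so far is re-checked on every new line,
    so the same loop serves a file (offline) or a stream (online).
    """
    traces = []
    for row in csv.reader(lines):
        if not row:
            continue  # ignore blank lines
        traces.append(parse_trace(row))
        yield check(traces)
```

Any verdict function over a list of traces can be plugged in as `check`. Because the function is a generator, a verdict becomes available as soon as the corresponding line has been read.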
By default, Minion monitors traces for DDM. Thus, when processing the traces given in
Fig. 3a, it signals a violation after reading the second line because fee(20, h2, h3, p) =
fee(2, h2, h3, p) irrespective of the choice of h2, h3, and p. In contrast, all traces listed in
Fig. 3b are accepted by Minion since they have been preprocessed by a distributed minimizer.
Alternatively, Minion can be instructed to monitor traces for MDM in which case a violation
is signaled when processing the last line of Fig. 3b, whereas all traces in Fig. 3c are accepted.
Fig. 3 Raw and minimized traces generated from Toll.java: (a) unprocessed, (b) distributed
minimal, (c) monolithic minimal
5.2.1 Lazy vs. eager monitoring
Perhaps surprisingly, there are cases where Minion will detect a violation of DDM whereas
it will not detect a violation of MDM. Consider the function f(x, y) = x. Since f simply
ignores its second argument, it is clearly neither distributed nor monolithic minimal. When
monitoring the pair of traces (1, 2, 1) and (3, 4, 3) for DDM, Minion detects a violation
because f(x, 2) = f(x, 4) for any choice of x. Note, however, that this situation does not
appear among the observed traces since the two values for y in the respective traces differ. The
tool reports a violation because a common value for x is found by our oracle when monitoring
for DDM. When monitoring for MDM, Minion does not detect the violation, because in this
case there is no need to invoke the oracle.
Whether or not this is the intended behavior of the monitor depends on the assumption of
whether the traces are collected from a program f or from the combined program f ∘ p (p
being a minimizer). In the latter case, some combinations of inputs may never be observed as
the inputs have been minimized. On the other hand, if traces are not considered preprocessed,
we may wish to explore the behavior of f more exhaustively. For this purpose, Minion can
be instructed to monitor a set of traces eagerly for MDM, resp. lazily for DDM. For the
former, Minion considers not just the observed traces, but any combination of observed input
values, even if that combination does not actually correspond to an observed trace. For the
latter, Minion only considers combinations of inputs originating from traces with the same
result value. For example, for the pair of input traces (1, 2, 1) and (3, 4, 3), Minion is able to
find a violation in eager MDM mode since f(1, 2) = f(1, 4), but not in lazy DDM mode
since f(1, 2) ≠ f(3, 4).
5.3 Loops and loop invariants
Thus far, we have only considered simple programs without computational effects (such as
exceptions or mutable state) and whose control flow does not include loops or recursive
calls. These programs could safely be treated as (total) functions. Monitoring programs with
loops for data minimality is more challenging, both conceptually and practically, because
such programs can, in principle, be partial (i.e. non-terminating). To ensure correctness, we
require all programs to be terminating and free from global side effects (though they may
use e.g. exceptions and mutable state locally).
Minion provides two strategies for dealing with loops:
1. obtain an approximate logical characterization by unrolling loops to a fixed depth;
2. annotate loops with loop invariants, which allows KeY’s symbolic execution engine to
give a complete characterization of the method.
Fig. 4 A naive division algorithm for positive integers
The first strategy is suitable for loops with a bounded number of iterations. Intuitively, loop
unrolling (also known as loop unwinding) replaces a loop with m copies of its loop body,
where m is a fixed number called the unrolling depth. If the number of loop iterations is
unbounded, this method results in an approximate specification by treating the program as
partial: program executions exceeding the maximum number of iterations m are considered
as returning an undefined result. The second option (loop invariants) is more flexible but
also requires more work by the programmer who needs to specify logical invariants that hold
during and after the execution of the loop. The KeY symbolic execution engine supports
loop invariants specified in the Java Modeling Language (JML) [29] to verify termination
and generate a logical specification for the program state during and after the loop [4]. This
specification can then be used to characterize the overall result of the program precisely.
Consider, for example, the method posDiv given in Fig. 4, which implements naive
integer division by repeatedly subtracting y from x and counting how many times this is
possible. Our simple symbolic tree method cannot extract a complete logical characterization
of posDiv automatically. Instead, one of the aforementioned strategies must be used. In
the first case, choosing a large unrolling depth leads to high symbolic execution times because
many copies of the loop body are generated and analyzed. A low unrolling depth, on the other
hand, may affect accuracy. If the concrete values for x and y read from an input trace result in
a number of loop iterations n below the unrolling depth m, the resulting logical characterization
is exact, but if n > m the characterization becomes an over-approximation and the monitor
may fail to detect non-minimal traces. Indeed, it may not even detect inconsistent traces, i.e.
traces where the expected output differs from the actual output computed at run time.
These false negatives can be avoided by annotating loops with JML loop invariants, as
shown in Fig. 4. This invariant specifies that at every loop iteration the remainder r is
non-negative and, when added to the iteration counter q times the divisor y, equals the original
dividend x. With this invariant, the symbolic execution terminates quickly and without cutting
off any branches, and Minion is able to extract a logical characterization for posDiv, which
asserts that the eventual result q of the method must satisfy the equation q·y = x − r for
some r such that 0 ≤ r < y. This is sufficient to correctly monitor any traces generated from
posDiv for violation of DDM.
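The two strategies can be contrasted on a small Python rendering of naive division. The code below is a hypothetical reconstruction for illustration only: it is neither the Java method of Fig. 4 nor output of KeY, it assumes x ≥ 0 and y > 0, and it models the "undefined result" of bounded unrolling by returning `None`.

```python
def pos_div_unrolled(x, y, depth):
    """Naive division with the loop unrolled `depth` times.

    Returns None (standing in for 'undefined') when more than `depth`
    iterations would be needed, as in the approximate characterization.
    """
    q, r = 0, x
    for _ in range(depth):
        if r < y:
            return q
        q, r = q + 1, r - y
    return q if r < y else None


def pos_div_with_invariant(x, y):
    """Naive division whose loop invariant is checked on every iteration."""
    q, r = 0, x
    while r >= y:
        assert r >= 0 and q * y + r == x  # loop invariant
        q, r = q + 1, r - y
    assert 0 <= r < y and q * y == x - r  # characterization of the result
    return q
```

With x = 7 and y = 2, three iterations are needed, so an unrolling depth of 2 leaves the result undefined while a depth of 3 or more gives the exact quotient 3. The invariant-based version needs no bound, and its final assertion is the equation q·y = x − r with 0 ≤ r < y stated above.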
For an in-depth discussion about the effective use of loop unrolling and loop invariants,
we refer the interested reader to the KeY book [4, Sect. 3.7.2].
5.4 Empirical evaluation
We evaluated Minion on a small set of programs to compare the output and relative run-time
performance of its eager and lazy modes. Note that our primary goal in implementing Minion
was to demonstrate the feasibility of our approach for gray-box monitoring. No attempt was
made to optimize the performance of Minion and, accordingly, the goal of our evaluation is not
to demonstrate scalability of our tool nor to establish a performance benchmark. Rather, the
evaluation is meant to illustrate the behavior of the two DDM monitoring strategies (eager vs.
lazy) under different workloads. These workloads have been chosen to deliberately exacerbate
the differences between the two strategies.
Table 1 Mean running times and verdicts of Minion monitoring four different Java methods
Method | Symb. exec. (s) | K1: random, Eager (s) | K1 Lazy (s) | K1 E L | K2: DDMin, Eager (s) | K2 Lazy (s) | K2 E L | K3: MDMin, Eager (s) | K3 Lazy (s) | K3 E L
T1 | 30.9 ± 1.5 | 0.6 ± 0.2 | 0.6 ± 0.1 | ⊥ ⊥ | 51.2 ± 1.2 | 30.3 ± 7.2 | ? ? | 0.9 ± 0.8 | 5.4 ± 0.2 | ?
T2 | 9.2 ± 0.7 | 0.4 ± 0.1 | 0.4 ± 0.0 | ⊥ ⊥ | 17.3 ± 0.8 | 13.9 ± 2.2 | ? ? | 17.3 ± 0.7 | 3.7 ± 0.3 | ? ?
CA | 1.8 ± 0.1 | 0.5 ± 0.2 | 0.4 ± 0.2 | ⊥ ⊥ | 19.2 ± 2.0 | 14.8 ± 1.7 | ? ? | 13.6 ± 5.4 | 3.6 ± 0.1 | ? ?
LA | 3.4 ± 0.4 | 0.4 ± 0.2 | 0.3 ± 0.0 | ⊥ ⊥ | 136.6 ± 37.1 | 3.7 ± 0.3 | ? ?
For our evaluation, we ran Minion on four Java methods: the fee method from Fig. 2
(T1), a variant of that method that computes the fee on a road with only two toll stations
instead of three (T2), as well as the CreditApp (CA) and LoyaltyApp (LA) programs
introduced in [6]. Each method was monitored for DDM violation using three kinds of input
traces:
(K1) random input values that respect the input specifications of the methods;
(K2) traces from (K1) minimized using a distributed data minimizer (DDMin);
(K3) traces from (K1) minimized using a monolithic data minimizer (MDMin).
The traces shown in Fig. 3 are subsets of the inputs generated for T1. We generated
10 instances of each kind, accounting for a total of 30 trace sets, each containing exactly
100 traces. All experiments were performed on a MacBook Pro with a 3.1 GHz Intel Core i5
processor and 16 GB of memory, running macOS 10.14. The results are summarized in
Table 1.
Table 1 shows the mean running time and standard deviation in seconds, as well as the
verdicts produced by Minion. The second column of the table reports the initial time spent on
symbolic execution before any traces are processed. T1 incurs a higher symbolic execution time
because it features several multiply-nested branches. The remaining columns report the execution
times and verdict of the actual monitor.
The performance of eager and lazy monitoring is similar on random (K1) inputs because
all cases have small (finite) input and output domains. Both approaches detect violations after
processing only a few traces since even a small number of random traces cover a substantial
part of the I/O domains, including those cases where violations occur.
As expected, the verdicts for the DDMin (K2) traces are inconclusive since DDM is not
monitorable for satisfaction in general (by Lemma 5). Lazy monitoring does consistently
better than eager monitoring on DDMin (K2) inputs because the eager strategy checks more
input combinations before it (inevitably) comes to the same conclusion as the lazy strategy.
The differences are relatively small for T1, T2 and CA because the ranges of these methods
are small (<10 elements). There is a bigger difference for LA, where the range is larger.
The performance of lazy monitoring on MDMin traces (K3) is consistently better than
on DDMin traces (K2) because lazy DDM monitoring and MDM monitoring coincide for
MDMin traces (no SMT invocations are necessary). On the other hand, the performance of
eager monitoring for MDMin traces may change drastically depending on whether or not
the traces are also DDMin (which need not be the case). If they are, then eager monitoring
has the same performance for MDMin traces as for DDMin traces. If they are not, the eager
monitor might detect a violation early in the input set, cutting the overall execution time.
6 Related work
In this section, we review the related literature. The most relevant works are arguably the
approaches developed for monitoring hyperproperties [3,17,24]. While we will compare and
contrast our approach with these papers in detail later in this section, the main difference is
that those techniques evaluate a formula based only on the current observation, providing
an answer under the hypothesis that the observation is the full behavior of the system. The
approach we propose in this paper, on the other hand, reasons not only about the executions
observed so far, but also about other potential executions of the system that are not necessarily
observed at run time.
6.1 LTL and monitorability for trace logics
Pnueli and Zaks [38] introduced the concept of monitorability for an LTL formula ϕ after
observing a prefix O as the existence of an extension of O that permanently satisfies or
permanently violates ϕ. It is known that the set of monitorable LTL properties is a superset
of the union of safety and co-safety properties [12,13] and that it is also a superset of the
set of obligation properties [22,23]. More recently, Havelund and Peled [26] introduced a
finer-grained taxonomy distinguishing between always finitely satisfiable (resp. refutable),
and sometimes finitely satisfiable (resp. refutable) where only some prefixes are required
to be monitorable (for satisfaction). Their taxonomy also describes the relation between
monitorability and classical safety properties. This more fine-grained distinction adds a new
dimension to the monitorability cube in Fig. 1, which we will study in the future.
The recent work by Aceto et al. [2] builds a framework comparing the different notions
of monitorability considered in the literature. For example, the original work by Pnueli and
Zaks can be instantiated to mean that a property is monitorable when there is an observation
O for which there is hope to monitor (that is, PZ(O) holds for some O). This notion is called
∃PZ(O). Dually, one can consider ∀PZ(O), which requires that there is hope to monitor the
property for every observation O. Aceto et al. also study the existence of monitors for a
variation of the fix-point logic recHML that subsumes LTL as well as for a branching time
variation [1,2].
While all the notions mentioned above ignore the system, predictive monitoring [43]
considers the traces allowed in a given finite state system. None of the work mentioned in this
subsection considers the trace/hyper and the computability dimensions of the monitorability
cube in Fig. 1.
6.2 Monitoring hyperproperties
Previous techniques proposed for monitoring hyperproperties either (1) are limited to
fragments for which definite verdicts can be given, or (2) given a finite set of finite traces M and
a HyperLTL formula ϕ, compute whether or not M satisfies ϕ for a finite length semantics of
the temporal sub-formula. Note that for (2) future observations of new traces or extensions
of observed traces may change the computed verdict. In the following, we refer to this notion
as snapshot monitoring to distinguish it from monitoring with respect to possible extended
observations as studied in this paper.
Monitoring hyperproperties was first studied by Agrawal and Bonakdarpour [3], who
provided the first notion of monitorability for HyperLTL (generalizing [38]) and gave an
algorithm for a fragment of alternation-free HyperLTL. The monitoring algorithm only covers
the fragment of k-safety HyperLTL formulas where the relation between traces does not
involve temporal operators. Brett et al. later generalized the work in [3] to the full fragment
of alternation-free formulas using formula rewriting [17], which can also monitor alternating
formulas but again for snapshot monitoring (i.e. the verdict is computed with respect to a
fixed finite set of finite traces).
A more general approach has been proposed by Finkbeiner et al. [24] who provide an
automata-based algorithm for monitoring HyperLTL for a given trace set, enabling the
computation of a verdict even for alternating formulas (still for snapshot monitoring). More
specifically, they show that deciding monitorability (the definition given in [3]) for
alternation-free HyperLTL is PSPACE-complete while the problem is undecidable in general. The
monitoring algorithm generates an automaton where transitions are annotated by the trace
variables in the input HyperLTL formula. They also study the role of certain formulas in
achieving efficient monitoring algorithms, for example exploiting symmetries in the formula.
The complexity of (snapshot) monitoring different fragments of HyperLTL was studied in
detail by Bonakdarpour and Finkbeiner [14] for Kripke structures with specific topologies,
like trees, acyclic graphs, etc.
The idea of gray-box monitoring for hyperproperties (with the help of static information
about the system under study) as a means for handling non-monitorable formulas was first
proposed by Bonakdarpour et al. [15]. That paper also suggested the potential of
incorporating abstract interpretation, symbolic execution, etc., in order to reason about
quantifier alternation in HyperLTL formulas at run time. Our paper explores in detail one of these
potential approaches by employing symbolic execution to extract a model from the source
code, and an SMT solver to reason about alternating HyperLTL formulas.
When one focuses on classical monitorability (where verdicts are permanent, as opposed
to the changing verdicts of snapshot monitoring), much less is known for HyperLTL than
for LTL [26]. The results in Sect. 3 of this paper are a step in the direction of providing a
similarly fine-grained landscape of monitorability concepts for HyperLTL.
6.3 Data minimization
A formal definition of data minimization in terms of strong dependency and derived concepts,
and the concept of a data minimizer defined as a preprocessor to the data processor were first
proposed by Antignac et al. [7]. In that paper, the authors considered both the monolithic
and two versions of the distributed cases (including the weaker version considered in our
paper), and they provided a proof-of-concept implementation to obtain data minimizers for
a given program. The latter only works in practice if enough annotations in the form
of pre/post-conditions and invariants are supplied. The approach in that paper is semantics-
based, so finding a distributed minimizer is undecidable in general.
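As a rough illustration of a data minimizer acting as a preprocessor in the monolithic case, consider the Python sketch below. It assumes a small finite input domain that can be enumerated, and it assumes that a minimizer maps each input to a fixed representative of the set of inputs yielding the same output. The implementation in [7] works symbolically instead, and the names build_minimizer and program are hypothetical.

```python
def build_minimizer(program, domain):
    """Build a preprocessor that sends every input to one representative
    per output class of `program`, by enumerating a finite `domain`."""
    representative = {}
    for x in domain:
        # Keep the first input seen for each distinct output.
        representative.setdefault(program(x), x)
    # For simplicity the minimizer calls `program` itself; a real
    # minimizer would use a precomputed partition of the inputs.
    return lambda x: representative[program(x)]

if __name__ == "__main__":
    # Hypothetical data processor that only needs to know adulthood.
    is_adult = lambda age: age >= 18
    minimize = build_minimizer(is_adult, range(120))
    print(minimize(42), minimize(7))
```

By construction, running the program on the minimized input gives the same output as running it on the original input, while the processor sees only one value per output class.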
Formal and rigorous approaches to privacy have been advocated for some time (e.g. [42]),
but the data minimization principle has not been precisely defined in the past except for the
aforementioned paper. The main reason for this lack of work in the area is that it is a very
hard problem. Indeed, the data minimization principle refers to both data at collection time
and at processing time, as well as what is the explicit consent given by the data subject for
which specific purposes. There currently is no specification language to correctly formalize
all these aspects of data minimization, let alone to verify or enforce them. The best that can
be done today is to consider some aspects of the definition and address them gradually. The
state of the art consists of the papers already mentioned ([7, 36]) and some recent initial attempts to
formalize the notion of purpose (e.g. [11]).
A closely related work is the notion of minimal exposure [5], which consists in performing
a preprocessing of information on the client’s side to give only the data needed to benefit from
a service. The concept of data minimality is also closely related to information flow [20].
Equivalence partitioning by using symbolic execution was first introduced for test case
generation by Richardson and Clarke [39], and later used by the KeY theorem prover [4].
Symbolic execution has limitations, especially when it comes to handling loops. Although
they are a main concern in theory and for some applications, while loops do not seem to be as
widespread as for loops in practice. For instance, Malacaria et al. have been able to perform
a symbolic execution-based verification of non-interference security properties from the C
source of the real-world OpenSSL library [31].
In this paper, we have focused on a weaker version of DDM which is not semantically
black-box monitorable in general. For stronger versions (cf. [7]) it was shown by Pinisetty et
al. [36] that the property is not (semantically) monitorable for satisfaction in general, but it
is for violation. Their paper also discusses the runtime verification problem for other similar
safety hyperproperties in the context of deterministic programs.
7 Conclusion and future work
In this paper, we have addressed the issue of monitorability with four main contributions.
First, we have rigorously investigated the notion of monitorability, providing a new definition
that is more general than previous definitions in the literature. Our definition considers the
following dimensions: (1) a distinction between black-box and gray-box monitoring, (2)
the nature of the property (trace properties or hyperproperties), and (3) computability aspects of
the monitor as a program. Second, we have shown that many hyperproperties that involve
quantifier alternation are non-monitorable in a black-box manner and proposed a technique
that allows such properties to be monitored, in certain cases, by using oracles at
run time. Such oracles are essentially static verifiers that may be called to further assist the
runtime monitor in its decision on whether the property is violated or not. Third, we have
considered a privacy property known to be non-monitorable using a black-box approach, and
shown that we can use our gray-box technique to enable monitorability for violations of the
property. The property under consideration is distributed data minimality (DDM), which may
be expressed in HyperLTL involving one quantifier alternation. Our methodology to monitor
violations of DDM is based on a model extracted from the program being monitored in the
form of its symbolic execution tree, and the use of an SMT solver as an oracle. Finally, we
have implemented a tool (Minion) and applied it to a number of representative examples to
assess the feasibility of our approach.
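The interplay between the runtime monitor and the oracle can be sketched in Python as follows. This is a simplified illustration and not the implementation of Minion. It assumes that the weaker notion of DDM requires that, for every input position k and every two distinct values x and y at that position, some choice of the remaining inputs makes the outputs differ. The SMT-based oracle is replaced by exhaustive search over small finite domains, and the names make_oracle and DDMMonitor are hypothetical.

```python
from itertools import product

def make_oracle(program, domains):
    """Stand-in for the static verifier: decides by exhaustive search
    (not by an SMT solver) whether two values at position k can be
    told apart by the output for some choice of the other inputs."""
    def distinguishable(k, x, y):
        others = [d for i, d in enumerate(domains) if i != k]
        for rest in product(*others):
            args_x = rest[:k] + (x,) + rest[k:]
            args_y = rest[:k] + (y,) + rest[k:]
            if program(*args_x) != program(*args_y):
                return True
        return False
    return distinguishable

class DDMMonitor:
    """Gray-box monitor for violations of the assumed DDM definition."""
    def __init__(self, oracle, arity):
        self.oracle = oracle
        self.seen = [set() for _ in range(arity)]
        self.verdict = "unknown"

    def observe(self, inputs):
        """Process the inputs of one observed execution."""
        if self.verdict == "violation":
            return self.verdict  # a violation verdict is permanent
        for k, x in enumerate(inputs):
            for y in self.seen[k]:
                # Two distinct observed values that no choice of the other
                # inputs can distinguish witness a violation.
                if x != y and not self.oracle(k, x, y):
                    self.verdict = "violation"
                    return self.verdict
            self.seen[k].add(x)
        return self.verdict

if __name__ == "__main__":
    # Hypothetical program that ignores the lowest bit of its first input.
    prog = lambda a, b: a // 2 + b
    monitor = DDMMonitor(make_oracle(prog, [range(4), range(4)]), arity=2)
    print(monitor.observe((0, 0)), monitor.observe((1, 0)))
```

The monitor handles the universally quantified part by comparing observed executions, while the existential part, which cannot be settled from observed traces alone, is delegated to the oracle. The monitor never reports satisfaction, in line with monitoring for violations only.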
Our work is a first step towards more ambitious tasks. First, since we have to model
data, we presented a variation of HyperLTL that includes relational predicate symbols. A
full study, including the comparison of the expressive power of relational HyperLTL versus
the conventional HyperLTL that only considers monadic predicate symbols, is an interesting
theoretical problem for future research. Second, the non-monitorability results consider the
simplest temporal formulas (safety). Future work includes capturing the conditions under
which other temporal formulas are non-monitorable. Another direction is to extend the pro-
posed methodology to other hyperproperties that consider traces not from the same system,
particularly in the concurrent and distributed setting. Indeed, most properties about concur-
rent and distributed systems are hyperproperties, and we would like to explore how to use
our technique in the absence of total information when only having a local view (e.g. when
only having access to the execution of a subset of the processors on a multi-core
system).
In our proof-of-concept implementation, we use a combination of symbolic execution and
an SMT solver as an oracle. In this case, we assume access to the source code of the
program being monitored (to extract the symbolic execution tree), but that may not always be
possible. As an alternative, we plan to explore the possibility of using, for instance, a model
checker over a model of the system. We could also use bounded model checking as our verifier
at run time by combining over- and under-approximated methods to deal with universal and
existential quantifiers in HyperLTL formulas. Another interesting problem is to apply gray-
box monitoring to hyperproperties with real-valued signals (e.g. HyperSTL [34]). Finally,
we would like to extend the definition of (distributed) data minimality to reactive systems,
and study what monitorability means in such a setting. Our preliminary study in this domain
shows that having the right definition for reactive DDM is quite challenging.
Funding Open Access funding provided by University of Gothenburg.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if changes were made. The images or other third party material in this article are included in the
article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is
not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
1. Aceto L, Achilleos A, Francalanza A, Ingólfsdóttir A, Lehtinen K (2019a) Adventures in monitorability:
from branching to linear time and back again. Proc ACM Program Lang (POPL’19) 3:52:1–52:29. https://
doi.org/10.1145/3290365
2. Aceto L, Achilleos A, Francalanza A, Ingólfsdóttir A, Lehtinen K (2019b) An operational guide to
monitorability. In: Proceedings of the 17th international conference on software engineering and for-
mal methods (SEFM’19), LNCS, vol 11724. Springer, pp 433–453. https://doi.org/10.1007/978-3-030-30446-1_23
3. Agrawal S, Bonakdarpour B (2016) Runtime verification of k-safety hyperproperties in HyperLTL. In:
Proceedings of the IEEE 29th Computer Security Foundations (CSF’16). IEEE CS Press, pp 239–252.
https://doi.org/10.1109/CSF.2016.24
4. Ahrendt W, Beckert B, Bubel R, Hähnle R, Schmitt PH, Ulbrich M (eds) (2016) Deductive software
verification–the KeY book–from theory to practice, vol 10001. LNCS. Springer, Berlin. https://doi.org/
10.1007/978-3-319-49812-6
5. Anciaux N, Nguyen B, Vazirgiannis M (2012) Limiting data collection in application forms: a real-case
application of a founding privacy principle. In: Tenth annual international conference on privacy, security
and trust (PST’12). IEEE, pp 59–66. https://doi.org/10.1109/PST.2012.6297920
6. Antignac T, Sands D, Schneider G (2016) Data minimisation: a language-based approach (long version).
CoRR arXiv:1611.05642
7. Antignac T, Sands D, Schneider G (2017) Data minimisation: a language-based approach. In: Proceedings
of the 32nd IFIP TC 11 international conference on ICT systems security and privacy protection (SEC’17),
IFIP AICT, vol 502. Springer, pp 442–456. https://doi.org/10.1007/978-3-319-58469-0_30
8. Baldoni R, Coppa E, D’Elia DC, Demetrescu C, Finocchi I (2018) A survey of symbolic execution
techniques. ACM Comput Surv. https://doi.org/10.1145/3182657
9. Bartocci E, Falcone Y (eds) (2018) Lectures on runtime verification—introductory and advanced topics,
vol 10457. LNCS. Springer, Berlin. https://doi.org/10.1007/978-3-319-75632-5
10. Bartocci E, Falcone Y, Francalanza A, Reger G (2018) Lectures on runtime verification, LNCS, vol 10457.
Springer, Chap Introduction to runtime verification, pp 1–33. https://doi.org/10.1007/978-3-319-75632-
5
11. Basin DA, Debois S, Hildebrandt TT (2018) On purpose and by necessity: compliance under the GDPR.
In: Meiklejohn S, Sako K (eds) Financial cryptography and data security, LNCS, vol 10957. Springer, pp
20–37. https://doi.org/10.1007/978-3-662-58387-6_2
12. Bauer A, Leucker M, Schallhart C (2007) The good, the bad, and the ugly—but how ugly is ugly? In:
Proceedings of the 7th international workshop on runtime verification (RV’07), LNCS, vol 4839. Springer,
pp 126–138. https://doi.org/10.1007/978-3-540-77395-5_11
13. Bauer A, Leucker M, Schallhart C (2011) Runtime verification for LTL and TLTL. ACM T Softw Eng
Methodol 20(4):14. https://doi.org/10.1145/2000799.2000800
14. Bonakdarpour B, Finkbeiner B (2018) The complexity of monitoring hyperproperties. In: Proceedings of
the IEEE 31st computer security foundations symposium (CSF’18). IEEE, pp 162–174. https://doi.org/
10.1109/CSF.2018.00019
15. Bonakdarpour B, Sánchez C, Schneider G (2018) Monitoring hyperproperties by combining static analysis
and runtime verification. In: Proceedings of the 8th international symposium on leveraging applications
of formal methods, verification and validation (ISoLA’18), Part II, LNCS, vol 11245. Springer, pp 8–27.
https://doi.org/10.1007/978-3-030-03421-4_2
16. Boyer RS, Elspas B, Levitt KN (1975) SELECT—a formal system for testing and debugging programs
by symbolic execution. In: Proceedings of the international conference on reliable software. ACM, pp
234–245. https://doi.org/10.1145/800027.808445
17. Brett N, Siddique U, Bonakdarpour B (2017) Rewriting-based runtime verification for alternation-free
HyperLTL. In: Proceedings of the 23rd international conference on tools and algorithms for the con-
struction and analysis of systems (TACAS’17), LNCS, vol 10206. Springer, pp 77–93. https://doi.org/10.
1007/978-3-662-54580-5_5
18. Clarkson MR, Schneider FB (2010) Hyperproperties. J Comput Secur 18(6):1157–1210. https://doi.org/
10.3233/JCS-2009-0393
19. Clarkson MR, Finkbeiner B, Koleini M, Micinski KK, Rabe MN, Sánchez C (2014) Temporal logics for
hyperproperties. In: Proceedings of the third international conference on principles of security and trust
(POST’14), LNCS, vol 8414. Springer, pp 265–284. https://doi.org/10.1007/978-3-642-54792-8_15
20. Cohen E (1977) Information transmission in computational systems. SIGOPS Oper Syst Rev 11(5):133–
139. https://doi.org/10.1145/1067625.806556
21. European Parliament, Council of the European Union (2016) Regulation (EU) 2016/679 of the European
Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the
processing of personal data and on the free movement of such data, and repealing directive 95/46/EC
(General Data Protection Regulation). Offic J Eur Union L(119):1–88
22. Falcone Y, Fernandez JC, Mounier L (2009) Runtime verification of safety-progress properties. In: Pro-
ceedings of the 9th international workshop on runtime verification (RV’09), LNCS, vol 5779. Springer,
pp 40–59. https://doi.org/10.1007/978-3-642-04694-0_4
23. Falcone Y, Fernandez JC, Mounier L (2012) What can you verify and enforce at runtime? Int J Softw
Tools Technol Transf (STTT) 14(3):349–382. https://doi.org/10.1007/s10009-011-0196-8
24. Finkbeiner B, Hahn C, Stenger M, Tentrup L (2017) Monitoring hyperproperties. In: Proceedings of the
17th international conference on runtime verification (RV’17), LNCS, vol 10548. Springer, pp 190–207.
https://doi.org/10.1007/978-3-319-67531-2_12
25. Havelund K, Goldberg A (2005) Verify your runs. In: Proceedings of the First IFIP TC 2/WG 2.3 con-
ference on verified software: theories, tools, experiments (VSTTE’05), LNCS, vol 4171. Springer, pp
374–383. https://doi.org/10.1007/978-3-540-69149-5_40
26. Havelund K, Peled D (2018) Runtime verification: from propositional to first-order temporal logic. In:
Proceedings of the 18th international conference on runtime verification (RV’18), LNCS, vol 11237.
Springer, pp 90–112. https://doi.org/10.1007/978-3-030-03769-7_7
27. KeY contributors (accessed 25 Feb 2020) The KeY project. https://www.key-project.org
28. King JC (1976) Symbolic execution and program testing. Commun ACM 19(7):385–394. https://doi.org/
10.1145/360248.360252
29. Leavens GT, Poll E, Clifton C, Cheon Y, Ruby C, Cok DR, Müller P, Kiniry J, Chalin P, Zimmerman
DM (2013) JML reference manual. Department of Computer Science, Iowa State University. http://www.
jmlspecs.org
30. Leucker M, Schallhart C (2009) A brief account of runtime verification. J Log Algebr Program 78(5):293–
303. https://doi.org/10.1016/j.jlap.2008.08.004
31. Malacaria P, Tautschnig M, Distefano D (2016) Information leakage analysis of complex C code and its
application to OpenSSL. In: Proceedings of the 7th international symposium on leveraging applications
of formal methods, verification and validation (ISoLA’16), Part I, LNCS, vol 9952. Springer, pp 909–925.
https://doi.org/10.1007/