# Steffen Lauritzen

University of Copenhagen · Department of Mathematical Sciences

## About

183 publications · 24,068 reads · 18,173 citations

Introduction

Steffen Lauritzen currently works at the Department of Mathematical Sciences, University of Copenhagen.

Additional affiliations

November 2014 - present

May 2004 - September 2014

September 1981 - April 2004

## Publications

In Gaussian graphical models, the likelihood equations must typically be solved iteratively, for example by iterative proportional scaling. However, this method may not scale well to models with many variables because it involves repeated inversion of large matrices. We present a version of the algorithm which avoids these inversions, resulting in...

The notion of multivariate total positivity has proved to be useful in finance and psychology but may be too restrictive in other applications. In this paper we propose a concept of local association, where highly connected components in a graphical model are positively associated and study its properties. Our main motivation comes from gene expres...

We address the identifiability and estimation of recursive max‐linear structural equation models represented by an edge‐weighted directed acyclic graph (DAG). Such models are generally unidentifiable and we identify the whole class of DAGs and edge weights corresponding to a given observational distribution. For estimation, standard likelihood the...

Motivated by extreme value theory, max-linear Bayesian networks have been recently introduced and studied as an alternative to linear structural equation models. However, for max-linear systems the classical independence results for Bayesian networks are far from exhausting valid conditional independence statements. We use tropical linear algebra t...

We study Bayesian networks based on max-linear structural equations as introduced in Gissibl and Klüppelberg (2018) and provide a summary of their independence properties. In particular, we emphasize that distributions for such networks are generally not faithful to the independence model determined by their associated directed acyclic graph. In ad...

Following Ressel (1985, 2008) this note attempts to understand graph limits (Lovász and Szegedy 2006) in terms of harmonic analysis on semigroups (Berg et al. 1984), thereby providing an alternative derivation of de Finetti's theorem for random exchangeable graphs.

We study binary distributions that are multivariate totally positive of order 2 (MTP2). Binary distributions can be represented as an exponential family and we show that MTP2 exponential families are convex. Moreover, MTP2 quadratic exponential families, which contain ferromagnetic Ising models and attractive Gaussian graphical models, are defined...
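
The MTP2 condition mentioned in this abstract can be checked by brute force for a small binary distribution. The sketch below assumes the standard definition, p(x)p(y) ≤ p(x∧y)p(x∨y) with coordinatewise minimum and maximum. The function name `is_mtp2` and the two example distributions are illustrative and are not taken from the paper.

```python
from itertools import product


def is_mtp2(p, tol=1e-12):
    """Check p(x) * p(y) <= p(min(x, y)) * p(max(x, y)) for all pairs x, y.

    `p` maps every binary tuple of a fixed length to its probability.
    """
    states = list(p)
    for x, y in product(states, repeat=2):
        meet = tuple(min(a, b) for a, b in zip(x, y))  # coordinatewise minimum
        join = tuple(max(a, b) for a, b in zip(x, y))  # coordinatewise maximum
        if p[x] * p[y] > p[meet] * p[join] + tol:
            return False
    return True


# Hypothetical example: two positively associated binary variables.
positive = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
# Hypothetical example: two negatively associated binary variables.
negative = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 0.1}

print(is_mtp2(positive), is_mtp2(negative))
```

The check enumerates all pairs of states, so it is only practical for a handful of variables.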

We study Bayesian networks based on max-linear structural equations as introduced in Gissibl and Klüppelberg [16] and provide a summary of their independence properties. In particular we emphasize that distributions for such networks are generally not faithful to the independence model determined by their associated directed acyclic graph. In add...

We address the identifiability and estimation of recursive max-linear structural equation models represented by an edge-weighted directed acyclic graph (DAG). Such models are generally unidentifiable and we identify the whole class of DAGs and edge weights corresponding to a given observational distribution. For estimation, standard likelihood theor...

We derive representation theorems for exchangeable distributions on finite and infinite graphs using elementary arguments based on geometric and graph-theoretic concepts. Our results elucidate some of the key differences, and their implications, between statistical network models that are finitely exchangeable and models that define a consistent se...

We discuss properties of distributions that are multivariate totally positive of order two (MTP2) related to conditional independence. In particular, we show that any independence model generated by an MTP2 distribution is a compositional semi-graphoid which is upward-stable and singleton-transitive. In addition, we prove that any MTP2 distribution...

We analyze the problem of maximum likelihood estimation for Gaussian distributions that are multivariate totally positive of order two (MTP2). By exploiting connections to phylogenetics and single-linkage clustering, we give a simple proof that the maximum likelihood estimator (MLE) for such distributions exists based on at least 2 observations, ir...

We study conditional independence relationships for random networks and their interplay with exchangeability. We show that, for finitely exchangeable network models, the empirical subgraph densities are maximum likelihood estimates of their theoretical counterparts. We then characterize all possible Markov structures for finitely exchangeable rando...

Several types of graph with different conditional independence interpretations, also known as Markov properties, have been proposed and used in graphical models. In this paper we unify these Markov properties by introducing a class of graphs with four types of edge (lines, arrows, arcs, and dashes) and a single separation criterion. We...

We discuss properties of distributions that are multivariate totally positive
of order two (MTP2) related to conditional independence. In particular, we show
that any independence model generated by an MTP2 distribution is a
compositional semigraphoid which is upward-stable and singleton-transitive. In
addition, we prove that any MTP2 distribution...

We present a framework for fingerprint matching based on marked point process
models. An efficient Monte Carlo algorithm is developed to calculate the
marginal likelihood ratio for the hypothesis that two observed prints originate
from the same finger against the hypothesis that they originate from different
fingers. Our model achieves good perform...

In many families of distributions, maximum likelihood estimation is
intractable because the normalization constant for the density which enters
into the likelihood function is difficult to compute. The score matching
estimator of Hyvärinen (2005) provides an alternative where this
normalization constant is not needed. The corresponding estimating...

Statistical analysis of DNA mixtures is known to pose computational
challenges due to the enormous state space of possible DNA profiles. We propose
a Bayesian network representation for genotypes, allowing computations to be
performed locally involving only a few alleles at each step. In addition, we
describe a general method for computing the expe...

The paper describes aHUGIN, a tool for creating adaptive systems. aHUGIN is an extension of the HUGIN shell, and is based on the methods reported by Spiegelhalter and Lauritzen (1990a). The adaptive systems resulting from aHUGIN are able to adjust the conditional probabilities in the model. A short analysis of the adaptation task is given and the...

We present a statistical model for the quantitative peak information obtained
from an electropherogram of a forensic DNA sample. Our model directly describes
peak height information and the dropout of an allele is interpreted as failure
for its associated peak to be observed above a detection threshold. Stutter and
dropin are readily represented in...

We present a new approach to the solution of decision problems formulated as
influence diagrams. The approach converts the influence diagram into a simpler
structure, the LImited Memory Influence Diagram (LIMID), where only the
requisite information for the computation of optimal policies is depicted.
Because the requisite information is explicitly...

Discussion of "Latent variable graphical model selection via convex
optimization" by Venkat Chandrasekaran, Pablo A. Parrilo and Alan S. Willsky
[arXiv:1008.1290].

We describe an expert system, MAIES, developed for analysing forensic
identification problems involving DNA mixture traces using quantitative peak
area information. Peak area information is represented by conditional Gaussian
distributions, and inference based on exact junction tree propagation
ascertains whether individuals, whose profiles have be...

We use a close connection between the theory of Markov fields and that of log-linear interaction models for contingency tables to define and investigate a new class of models for such tables, graphical models. These models are hierarchical models that can be represented by a simple, undirected graph on as many vertices as the dimension of the corre...

Graphical models in their modern form have been around since the late 1970s and appear today in many areas of the sciences. Along with the ongoing developments of graphical models, a number of different graphical modeling software programs have been written over the years. In recent years many of these software developments have taken place within...

This chapter describes graphical models for multivariate continuous data based on the Gaussian (normal) distribution. We gently introduce the undirected models by examining the partial correlation structure of two sets of data, one relating to meat composition of pig carcasses and the other to body fat measurements. We then give a concise expositio...

In this chapter we introduce graphs as mathematical objects, show how to work with them using R and explain how they are related to statistical models. We focus mainly on undirected graphs and directed acyclic graphs (DAGs), but also briefly treat chain graphs, that have both undirected and directed edges. Key concepts such as clique, path, separat...

This chapter describes graphical models for mixed data, that is to say, with both discrete and continuous variables. Such data are frequently met in practice. The models are based on the conditional Gaussian distribution: that is, conditional on the discrete variables, the continuous variables are Gaussian with mean depending on the discrete variab...

This chapter deals with Bayesian networks. The term usually refers to graphical models (most often with discrete variables) based on directed acyclic graphs (DAGs), applied in the expert system context. The emphasis differs somewhat from ordinary statistical modeling, since the DAG is usually taken as known and the focus is on efficient calculation...

This chapter describes methods suitable for high-dimensional graphical modeling. Recent years have seen intense interest in applying graphical modeling techniques to data of high dimension: by this we mean from hundreds to tens of thousands of variables. Such data arise routinely in fields such as molecular biology. We first describe two typical da...

This chapter describes graphical models for multivariate discrete (categorical) data. It starts out by describing various different ways in which such data may be represented in R—for example, as contingency tables—and how to convert between these representations. It then gives a concise exposition of the theory of hierarchical log-linear models, w...

This chapter provides a brief introduction to the use of Bayesian graphical models in R. In these models, parameters are treated as random quantities on an equal footing with the random variables. This allows complex stochastic systems to be modeled, often using Markov chain Monte Carlo (MCMC) sampling methods. We first consider a series of examples,...

Instrumental variables are widely used for the identification of the causal effect of one random variable on another under
unobserved confounding. The distribution of the observable variables for a discrete instrumental variable model satisfies
certain inequalities but no conditional independence relations. Such models are usually tested by checkin...

In this paper we unify the Markov theory of a variety of different types of
graphs used in graphical Markov models by introducing the class of loopless
mixed graphs, and show that all independence models induced by $m$-separation
on such graphs are compositional graphoids. We focus in particular on the
subclass of ribbonless graphs which as special...

In Cowell et al. (2007), a Bayesian network for analysis of mixed traces of
DNA was presented using gamma distributions for modelling peak sizes in the
electropherogram. It was demonstrated that the analysis was sensitive to the
choice of a variance factor and hence this should be adapted to any new trace
analysed. In the present paper we discuss h...

A scoring rule is a loss function measuring the quality of a quoted
probability distribution $Q$ for a random variable $X$, in the light of the
realized outcome $x$ of $X$; it is proper if the expected score, under any
distribution $P$ for $X$, is minimized by quoting $Q=P$. Using the fact that
any differentiable proper scoring rule on a finite sam...
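
The definition of propriety stated in this abstract can be illustrated numerically for a binary outcome. The sketch below is not from the paper. It assumes the usual log and Brier scores as examples of proper rules and a "linear" score as a standard improper contrast, and it searches a grid of quotes for the one that minimizes the expected score. All function names are hypothetical.

```python
import math


def log_score(q, x):
    """Log score: negative log of the quoted probability of the outcome."""
    return -math.log(q[x])


def brier_score(q, x):
    """Brier score: squared distance between the quote and the outcome indicator."""
    return sum((q_k - (1.0 if k == x else 0.0)) ** 2 for k, q_k in enumerate(q))


def linear_score(q, x):
    """Improper contrast: negative quoted probability of the outcome."""
    return -q[x]


def expected_score(score, p, q):
    """Expected score of quoting q when the outcome is drawn from p."""
    return sum(p_x * score(q, x) for x, p_x in enumerate(p))


def best_quote_on_grid(score, p, steps=100):
    """Return the t on a grid in (0, 1) whose quote (1 - t, t) minimizes the expected score."""
    grid = [i / steps for i in range(1, steps)]
    return min(grid, key=lambda t: expected_score(score, p, (1 - t, t)))


p = (0.7, 0.3)  # assumed true distribution of a binary outcome
print(best_quote_on_grid(log_score, p))     # minimized at the true probability
print(best_quote_on_grid(brier_score, p))   # minimized at the true probability
print(best_quote_on_grid(linear_score, p))  # pushed to the edge of the grid
```

For the two proper rules the best quote coincides with the true probability of the second outcome, whereas the linear score rewards an extreme quote.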

We investigate proper scoring rules for continuous distributions on the real
line. It is known that the log score is the only such rule that depends on the
quoted density only through its value at the outcome that materializes. Here we
allow further dependence on a finite number $m$ of derivatives of the density
at the outcome, and describe a large...

We study the problem of estimability of means in undirected graphical
Gaussian models with symmetry restrictions represented by a colored graph.
Following on from previous studies, we partition the variables into sets of
vertices whose corresponding means are restricted to being identical. We find a
necessary and sufficient condition on the partiti...

This paper presents a coherent probabilistic framework for taking account of allelic dropout, stutter bands and silent alleles when interpreting STR DNA profiles from a mixture sample using peak size information arising from a PCR analysis. This information can be exploited for evaluating the evidential strength for a hypothesis that DNA from a par...

This paper develops a general framework to support the combination of information from independent but related experiments, by introducing a formal way of combining statistical models represented by families of distributions. A typical example is the combination of multivariate Gaussian families respecting conditional independence constraints, i....

We introduce new types of graphical Gaussian models by placing symmetry restrictions on the concentration or correlation matrix. The models can be represented by coloured graphs, where parameters that are associated with edges or vertices of the same colour are restricted to being identical. We study the properties of such models and derive the n...

Taking peak area information into account when analysing STR DNA mixtures is acknowledged to be a difficult task. There have been a number of non-probabilistic approaches proposed in the literature, and some have been incorporated into computer systems, but comparatively little has been published from a probabilistic perspective. Here we briefly re...

Forensic inference from genetic markers uses highly polymorphic multi-locus genotypes. Measures of informativeness can aid in selecting efficient genetic markers. Existing measures do not account for multiple sources of genetic variation (i.e. mutation, silent alleles, etc.) and they are not directly applicable to complex identification problems. U...

In this chapter, graphical models are introduced and used as a natural way to formulate and address problems in genetics and related areas. Local computational algorithms on graphical models are presented and their relationship with the traditional peeling algorithms discussed. The potential of graphical model representations is explored and illust...

We present a statistical methodology for making inferences about mutation rates from paternity casework. This takes account of a number of sources of potential bias, including hidden mutation, incomplete family triplets, uncertain paternity status and differing maternal and paternal mutation rates, while allowing a wide variety of mutation models....

This article is concerned with binary random matrices that are exchangeable with the probability of any finite submatrix only depending on its row and column sums. We describe basic representations of such matrices both in the case of full row and column exchangeability and the case of weak exchangeability. Finally the results are interpreted in t...

In this paper we present the R package gRc for statistical inference in graphical Gaussian models in which symmetry restrictions have been imposed on the concentration or partial correlation matrix. The models are represented by coloured graphs where parameters associated with edges or vertices of same colour are restricted to being identical. We d...

We present a new methodology for analysing forensic identification problems involving DNA mixture traces where several individuals may have contributed to the trace. The model used for identification and separation of DNA mixtures is based on a gamma distribution for peak area values. In this paper we illustrate the gamma model and apply it on sev...

We introduce a new methodology, based upon probabilistic expert systems, for analysing forensic identification problems involving DNA mixture traces using quantitative peak area information. Peak area is modelled with conditional Gaussian distributions. The expert system can be used for ascertaining whether individuals, whose profiles have been mea...

This article reviews aspects of significance testing. The problem of detecting a specific signal against background noise from observed Poisson counts of events is used as a basic example throughout. In particular we discuss issues of using alternative test-statistics, unbinned likelihood fits, and comparing unweighted and weighted histograms. We poin...

In this paper we show how to represent with object-oriented Bayesian networks the mathematical model described in Cowell et al. (2006a), for identification problems involving DNA mixture traces. We present detailed descriptions of each component class used to build up the networks, and we apply the networks to an example.

A decision problem is defined in terms of an outcome space, an action space and a loss function. Starting from these simple ingredients, we can construct: Proper Scoring Rule; Entropy Function; Divergence Function; Riemannian Metric; and Unbiased Estimating Equation. We illustrate these for the case of a Riemannian outcome space. From an abstract v...

In this paper we introduce restricted concentration models (RCMs) as a class of graphical models for the multivariate Gaussian distribution in which some elements of the concentration matrix are restricted to being identical. An estimation algorithm for RCMs, which is guaranteed to converge to the maximum likelihood estimate,...

HUGIN Expert is a small company writing software that can be used to create expert systems, using probability in the guise of graphical models. Steffen Lauritzen describes his part in the genesis and development of the company.

Aalborg University

First, let me congratulate both authors on two fine papers which illuminate important aspects of causal inference. I have only a little to say about Professor Arjas' paper which specifically illuminates the aspect of time and causality in an excellent way. I will therefore concentrate on the concepts described by Professor Rubin which se...

This paper introduces graphical models as a natural environment in which to formulate and solve problems in genetics and related areas. Particular emphasis is given to the relationships among various local computation algorithms which have been developed within the hitherto mostly separate areas of graphical models and genetics. The potential of gr...

We show how probabilistic expert systems can be used to structure and solve complex cases of forensic identification involving DNA traces that might be mixtures of several DNA profiles. In particular, this approach can readily handle cases where the number of contributors to the mixture cannot be regarded as known in advance. The flexible modularit...

The visual exploration of large databases raises a number of unresolved inference problems and calls for new interaction patterns between multiple disciplines—both at the conceptual and technical level. We present an approach that is based on the interaction of four disciplines: database systems, statistical analyses, perceptual and cognitive psych...

We derive a simple inequality for the probability of observing a given DNA profile when assuming a fixed number of unknown persons have contributed to the mixed stain. We then show how this inequality can be used to obtain an upper bound for the number of unknown contributors needed to be considered.

Thorvald Nicolai Thiele was a brilliant Danish researcher of the 19th Century. He was a professor of Astronomy at the University of Copenhagen and the founder of Hafnia, the first Danish private insurance company. Thiele worked in astronomy, mathematics, actuarial science, and statistics; his most spectacular contributions were in the latter two ar...

Chain graphs are a natural generalization of directed acyclic graphs and undirected graphs. However, the apparent simplicity of chain graphs belies the subtlety of the conditional independence hypotheses that they represent. There are many simple and apparently plausible, but ultimately fallacious, interpretations of chain graphs that are often inv...

this paper. A pig breeder is growing pigs for a period of four months and subsequently selling them. During this period the pig may or may not develop a certain disease

This paper describes an algorithm for maximising a conditional likelihood function when the corresponding unconditional likelihood
function is more easily maximised. The algorithm is similar to the EM algorithm but different as the parameters rather than
the data are augmented and the conditional rather than the marginal likelihood function is maxi...

We introduce the notion of LImited Memory Influence Diagram (LIMID) to describe multi-stage decision problems where the traditional assumption of no forgetting is relaxed. This can be relevant in situations with multiple decision makers or when decisions must be prescribed under memory constraints, for example in partially observed Markov decision...

We investigate two approaches to constructing compatible prior laws over alternative models: 'projection' and 'conditioning'. Each of these is shown to require additional inputs. We suggest that these can be chosen in a natural way in each case, leading to 'Kullback-Leibler projection' and 'Jeffreys conditioning'. We recommend the former for the c...

Bayesian inference for concave distribution functions is investigated. This is made by transforming a mixture of Dirichlet processes on the space of distribution functions to the space of concave distribution functions. We give a method for sampling from the posterior distribution using a Polya urn scheme in combination with a Markov chain Monte Ca...

This paper presents several models for investigating whether the HLA allogenotypes DR1/Br, DR3 and DR10 are genetic markers for a predisposition of experiencing unexplained recurrent foetal losses. A total of 199 women from 113 families answered questionnaires concerning their pregnancies and 145 of these women were HLA typed. The analysis of the d...

We introduce a class of multivariate dispersion models suitable as error distributions for generalized linear models with multivariate non-normal responses. The models preserve some of the main properties of the multivariate normal distribution, and include the elliptically contoured distributions and certain other known distributions as special...

This note is concerned with the class of hierarchical interaction models for mixed discrete and continuous variables as defined by Edwards (1990) and modified by Lauritzen (1996). In particular it is shown that any hierarchical log-linear interaction model can be generated by selection on a set of response variables in a directed Markov model ove...

The introduction of Bayesian networks (Pearl 1986b) and associated local computation algorithms (Lauritzen and Spiegelhalter 1988, Shenoy and Shafer 1990, Jensen, Lauritzen and Olesen 1990) has initiated a renewed interest in understanding causal concepts in connection with modelling complex stochastic systems. It has become clear tha...

This article describes the basic ideas and algorithms behind specification and inference in probabilistic networks based on directed acyclic graphs, undirected graphs, and chain graphs. Let us start with observing an expert at work. Her domain of expertise is a well-defined part of the world. She may be a physician examining a patient, she may be a...

This article describes a propagation scheme for Bayesian networks with conditional Gaussian distributions that does not have the numerical weaknesses of the scheme derived in Lauritzen (Journal of the American Statistical Association 87: 1098–1108, 1992).
The propagation architecture is that of Lauritzen and Spiegelhalter (Journal of the Royal Sta...

this paper was far too advanced for the contemporaries to appreciate. The contents have been described in further detail elsewhere (Lauritzen 1981).