arXiv:1011.1717v1 [stat.AP] 8 Nov 2010

The Annals of Applied Statistics

2010, Vol. 4, No. 2, 533–534

DOI: 10.1214/10-AOAS365

© Institute of Mathematical Statistics, 2010

INTRODUCTION TO PAPERS ON THE MODELING AND

ANALYSIS OF NETWORK DATA—II

By Stephen E. Fienberg

Carnegie Mellon University

This issue of The Annals of Applied Statistics (Volume 4, No. 2) contains the second part of a Special Section on the topic of network modeling. The first part consisted of seven papers and appeared with a general introduction [Fienberg (2010)] in Volume 4, No. 1. In Part II we include a diverse collection of eight additional papers with applications spanning biological, informational and social networks, using techniques such as kriging, anomaly detection and variational approximations, as well as the study of latent structure in both static and dynamical networks:

• In A State-Space Mixed Membership Blockmodel for Dynamic Network Tomography, Xing, Fu and Song combine earlier approaches involving mixed membership stochastic blockmodels for static networks with state-space models for trajectories. They use the new dynamic modeling approach to analyze Sampson’s network of novices in a monastery, the email communication network among Enron employees, and a rewiring gene interaction network over the life cycle of the fruit fly.

• In Maximum Likelihood Estimation for Social Network Dynamics, Snijders, Koskinen and Schweinberger develop a likelihood-based approach to network panel data with an underlying Markov continuous-time stochastic actor-oriented process. They use the new methods to reanalyze a friendship network between 32 freshman students in a given discipline at a Dutch university, observed over six waves at three-week intervals beginning at the start of the academic year.

• Xu, Dyer and Owen use semi-supervised learning on network graphs, in which response variables observed at some nodes are used to estimate missing values at other nodes by exploiting an underlying correlation structure among nearby nodes. The methods they employ in Empirical Stationary Correlations for Semi-supervised Learning on Graphs are rooted in ideas about kriging emanating from geostatistics. They compare their methods to ones proposed earlier using a data set containing the number of web links between UK universities in 2002, and the WebKB data set containing webpages collected from computer science departments of various US universities in 1997.

Received May 2010.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Statistics, 2010, Vol. 4, No. 2, 533–534. This reprint differs from the original in pagination and typographic detail.

• In Ranking Relations Using Analogies in Biological and Information Networks, Silva, Heller, Ghahramani and Airoldi explore the problem of ranking relations in network-like settings based on a similarity criterion underlying Bayesian sets, drawing on ideas of analogy items in test batteries such as the SAT. They too analyze the WebKB collection, as well as the problem of ranking protein–protein interactions using the MIPS database for the proteins in budding yeast.

• Heard, Weston, Platanioti and Hand fuse discrete-time counting models to carry out Bayesian Anomaly Detection Methods for Social Networks using data from the European Commission Joint Research Centre’s European Media Monitor web intelligence service, which provides real-time press and media summaries to Commission cabinets and services, including a breaking news and alerting service. They also study simulated cell phone data from the VAST Mini Challenge covering a fictional ten-day period on an island, narrowed to 400 unique cell phones during this period.

• James, Zhou, Zhu and Sabatti study Sparse Regulation Networks in genetic contexts, using prior information about the network structure in conjunction with observed gene expression data to estimate the transcription regulatory network for E. coli. Their approach uses L1 penalties on the network to ensure a sparse structure.

• Zanghi, Picard, Miele and Ambroise explore Strategies for Online Inference of Model-Based Clustering in Large and Growing Networks. Their online EM-based algorithms offer a good trade-off between precision and speed when estimating parameters for mixture distributions applied to data from the political websphere during the 2008 US political campaign.

• Mariadassou, Robin and Vacher, in Uncovering Latent Structure in Valued Graphs: A Variational Approach, use variational approximations to likelihood mixture models where the network connections are weighted values instead of simple 0–1 entries. They use their method to analyze interaction networks of tree and fungal species.

REFERENCE

Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data. Ann. Appl. Statist. 4 1–4.

Department of Statistics

and Machine Learning Department

Carnegie Mellon University

Pittsburgh, Pennsylvania 15213

USA

E-mail: fienberg@stat.cmu.edu