Content uploaded by Edward K. Kao

Author content

All content in this area was uploaded by Edward K. Kao on Aug 25, 2014

Content may be subject to copyright.


Network Discovery Using

Wide-Area Surveillance Data

Steven T. Smith, Andrew Silberfarb, Scott Philips, Edward K. Kao, and Christian Anderson*

MIT Lincoln Laboratory; 244 Wood Street; Lexington, MA 02420

{ stsmith, drews, scott.philips, edward.kao, christian.anderson }@ll.mit.edu

Abstract—Network discovery of clandestine groups and their

organization is a primary objective of wide-area surveillance

systems. An overall approach and workflow to discover a foreground network embedded within a much larger background, using vehicle tracks observed in wide-area video surveillance data, is presented and analyzed in this paper. The approach

consists of four steps, each with its own specific algorithms:

vehicle tracking, destination detection, cued graph exploration,

and cued graph detection. Cued graph exploration on the

simulated insurgent network data is shown to discover 87% of the

foreground graph using only 0.5% of the total tracks (the graph's total size). Graph detection on the explored graphs is shown to

achieve a probability of detection of 87% with a 1.5% false

alarm probability. We use wide-area, aerial video imagery and

a simulated vehicle network data set that contains a clandestine

insurgent network to evaluate algorithm performance. The pro-

posed approach offers signiﬁcant improvements in human analyst

efﬁciency by cueing analysts to examine the most signiﬁcant parts

of wide-area surveillance data.

Keywords: Network discovery, graph detection, wide-area

surveillance, tracking, graph sampling, spectral detection.

I. PROBLEM STATEMENT

Network discovery of clandestine groups and their organiza-

tion is a primary objective of wide-area surveillance systems.

Discovering such networks hidden within a sea of normal

activity is a very challenging problem [16]. Building reliable,

high-conﬁdence graphs representing networks of actors within

the ﬁeld of view is difﬁcult because the dynamic links between

network nodes have varying reliability, and because the total

number of links—the size of the underlying network—is very

large. With these realities in mind, an overall approach and

workﬂow to discover a foreground network embedded within

a much larger background using wide-area surveillance data

is presented and analyzed in this paper. The input to this

process is surveillance data, and the output is a semi-automated

estimate of the foreground network of interest represented as

a graph whose vertices or nodes represent geographical sites

within the ﬁeld of view, and whose edges represent vehicle

tracks between the source and destination nodes. The methods

developed in this paper to address this problem are general

enough to apply to a variety of measurement modes. The

paper will focus on aerial video data; an example image is

shown in Figure 1. The problem is one of discrete detection

*This work is sponsored by the United States Department of Defense under

Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions

and recommendations are those of the author and are not necessarily endorsed

by the United States Government.

Figure 1. Example image of a wide-area surveillance video from the AFRL

CLIF data set [1].

and estimation: estimate a graph within a scene given a

collection of time-sequenced imagery and detect a foreground

subgraph within the estimated graph. The detected subgraph

represents the discovered foreground network in the presence

of a larger background network. The distinction between

“foreground” and “background” networks is problem-speciﬁc

and necessarily depends upon external cues.

Advances in sensor and digital storage technology, as well

as significant reductions in the cost of both, enable the collection of large amounts of surveillance data. For example, video

imagery of a 10 km-by-10 km scene can yield Terabytes of

data per hour. Exploiting these large data sets demands inten-

sive processing capabilities, highly efﬁcient, low-complexity

algorithms, and the ability to focus on the subset of the data

that is most relevant. The challenge in exploiting this data

is not in the frontend sensor data or processing, but in the

backend determination of which small fraction of the data

is relevant to human analysts. Because analyst resources are

limited and expensive, automated or semi-automated tools that

cue the analysts offer the potential for signiﬁcant performance

gains in the task of constructing foreground graphs from

surveillance data.

Given a time series of wide-area surveillance imagery, the

14th International Conference on Information Fusion

Chicago, Illinois, USA, July 5-8, 2011

978-0-9824438-3-5 ©2011 ISIF

[Workflow: Vehicle Detection and Tracking → Destination Detection and Clustering → Graph Exploration → Graph Detection]

Figure 2. Workﬂow for network discovery from wide-area surveillance data.

frontend problem is to detect the moving objects within the

ﬁeld of view, typically individuals or vehicles, then associate

and track these movers to provide frame-to-frame state esti-

mates of their position and dynamics. Because the objective

is to form a graph whose vertices and edges depend upon

the tracks themselves, the accuracy of the graph exploration and network detection steps is highly sensitive both to errors in the identification of track destinations and to tracking errors such as breaks and swaps. Misidentification of a track

destination leads directly to incorrect association to other sites

while the correct site is potentially ignored. Therefore, such a

system demands very low probabilities of site/track misassociation and track error, because such errors propagate and accumulate through the exploration and detection steps. In practice, track

breaks and crossings are a problem, especially in target dense

environments. Fully automated solutions, though desirable,

suffer from the practical problems of incorrect associations.

Tracking is currently implemented as a semi-automated pro-

cess that requires analysts to examine and repair any tracks

deemed to be relevant to the construction of the foreground

graph. Prioritization of which tracks analysts should examine

then becomes a key part of the workﬂow involved in network

discovery.

This paper is organized into sections that describe each

part of the proposed workﬂow (Figure 2): the overall ap-

proach (Section II), vehicle tracking (Section III), destination

detection and clustering (Section IV), cued graph exploration

and prioritization (Section V), and cued graph detection (Sec-

tion VI).

II. APPROACH AND WORKFLOW

We propose a semi-automated approach to the problem

of network discovery that allocates human and computer

resources to the tasks for which they are best suited. The

huge volume of video data is ﬁrst processed to extract tracks

and determine destinations. External cues identify the part of

the scene that is of interest. This allows the human analyst

to explore a graph that contains the foreground network and

ultimately to detect this network within the explored graph.

Distinct algorithms are used for each step in this workﬂow,

which comprises four basic steps: tracking movers, destination detection and clustering, cued graph exploration with

prioritization, and cued graph detection. Tracking involves

detecting movers within the ﬁeld of view and tracking these

using a feature-aided association algorithm/tracker.

Error-free tracker performance is not assumed or required,

as analysts will be used to assess track quality to ensure

that high-conﬁdence vertices and edges are added to the

graph, thereby avoiding the accumulation and propagation

of errors in graph construction. Once tracks are available,

their destinations are determined, and these destinations are

clustered into hypothesized sites. Ultimately, a network will

be constructed between geographical sites within the

scene; these sites are deﬁned by the output of the destination

clustering method described in Section IV, or predeﬁned using

existing knowledge, or deﬁned by an analyst using external

information. This processing provides the raw ingredients for

the exploration of the graph, and the analyst may promote

these sites and tracks as vertices and edges in the candi-

date foreground network. As more nodes are added to the

graph, these nodes are considered candidate foreground nodes

because of their association with the cued foreground node.

A graph exploration algorithm is run to prioritize each track

emanating from the new nodes. In the simplest case, without

explicitly modeling the foreground network, this exploration

algorithm simply computes the distance between the node a

track emanates from and the cued foreground node. More

sophisticated models of the foreground network may use time,

spatial proximity, network topology, and site and track features

to compute track priority. As analysts promote more vertices

and edges to the constructed graph, a community detection

algorithm can be used to label each vertex as part of the

foreground subgraph.

The process of graph exploration yields a network contain-

ing some fraction of the foreground network, yet does not

make the ﬁnal distinction between foreground and background

sites. This binary decision—graph detection—is the ﬁnal step

of the process. Note that from the perspective of hypothesis

testing, the difference between graph exploration and graph

detection is that graph exploration yields a decision about

which speciﬁc vertices and edges should be added to the graph

or not, whereas graph detection yields a decision about which

of the graph vertices belong to the foreground network.

The metrics used to evaluate performance in this process

are related to standard detection metrics such as the receiver

operating characteristic (ROC) [15]. For graph exploration, the

objective is to uncover as many of the foreground sites as

possible while minimizing the number of tracks examined and

background sites investigated. A superior graph exploration

algorithm discovers the same number of foreground vertices

but with a lower number of tracks or background vertices

examined (Figures 8 and 9 in Section V). As usual, detection

performance evaluation requires knowledge of the true fore-

ground network, which is available either from knowledge of

a simulation, experimental setup, or, ideally, a nontrivial real-

world example thoroughly examined by experienced analysts.

Classically, the ROC is determined by three key operating


Figure 3. Simulated insurgent network graph comprised of 4,478 locations

and 116,720 tracks. The foreground subgraph is shown using red nodes and

edges, and the background graph is shown using blue nodes and gray edges.

For clarity, only a partial graph is shown here.

parameters: the probability of detection (PD), probability of

false alarm (PFA), and signal-to-noise ratio (SNR), and algo-

rithms are typically evaluated by comparing PDs at a constant

false alarm rate (CFAR), or by comparing PFAs at a speciﬁc

PD, all at a ﬁxed SNR. In the context of graph exploration,

the percentage of the true foreground network sites uncovered

corresponds to PD, the number of background sites uncovered

corresponds to PFA, and the number of tracks required to

construct this fraction of the graph corresponds to the SNR.

Note that the percentage of foreground sites uncovered is

necessarily monotonic with the number of tracks explored,

just as PD is monotonic with classical SNR.
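The correspondence above suggests a simple way to score one exploration run. The sketch below (hypothetical data and function names, not the authors' evaluation code) computes the PD analogue, the background-site count (the PFA analogue), and the track budget (the SNR analogue) for an explored graph:

```python
# Sketch of the graph-exploration metrics described above (hypothetical data):
# PD   ~ fraction of true foreground sites uncovered,
# PFA  ~ number of background sites uncovered,
# "SNR" ~ number of tracks examined to construct the explored graph.

def exploration_metrics(explored_sites, foreground_sites, tracks_examined):
    """Compute PD/PFA-style metrics for one exploration run."""
    explored = set(explored_sites)
    foreground = set(foreground_sites)
    pd = len(explored & foreground) / len(foreground)  # fraction of foreground found
    background_hits = len(explored - foreground)       # background sites uncovered
    return pd, background_hits, tracks_examined

# Hypothetical example: 31 foreground sites, as in the simulated data set.
foreground = set(range(31))
explored = set(range(27)) | {100, 101, 102}            # 27 foreground + 3 background
pd, fa, cost = exploration_metrics(explored, foreground, tracks_examined=580)
print(pd, fa, cost)  # 27/31 of the foreground found, 3 background sites, 580 tracks
```

Because the fraction of foreground uncovered only grows as tracks are added, sweeping `tracks_examined` traces out the monotonic curve described above.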

Two independent datasets are used to evaluate performance

in this paper. The ﬁrst consists of video imagery acquired

by a wide-area airborne sensor over a metropolitan area as

part of a collection called Project Bluegrass. Vehicle tracks

generated using this data were used in the destination detection

algorithms described in Section IV. The second dataset is

comprised of simulated vehicle motion data made available by

the National Geospatial-Intelligence Agency (NGA). This data

is derived from a scripted scenario that contains a clandestine

insurgent network and will be called the simulated insurgent

network data (Figure 3). The simulated data covers a 48-hour

time period and consists of approximately 116,720 vehicle

tracks between 4,478 locations made by 4,623 individual

actors. Of these, 31 locations and 22 actors are part of

the insurgent network. The simulated tracks in this data are

perfectly accurate and unambiguous. Consequently, there are

no complications arising from imperfect tracks, and destination

detection and clustering are not necessary. The insurgent

network data is therefore appropriate for evaluating graph

exploration and detection algorithms (Sections V and VI).

III. VEHICLE TRACKING

Tracking of many vehicles in wide-area video scenes is

necessary to perform graph construction; however, accurate

video tracking by analysts is a manually intensive process,

requiring an analyst to focus on individual vehicles and

follow them from start to ﬁnish. This would seem to be a

task ideally suited for automation; however, current tracking

algorithms yield insufﬁciently accurate results to produce

reliable graphs. Tracks for network construction must connect

two destinations correctly, i.e., they must track vehicles from

source to destination. Vehicle tracking algorithms often make

mistakes in tracking vehicles through obscurations or in heavy

trafﬁc environments. In these cases, automated algorithms

often swap tracks between vehicles, introducing false connec-

tions between unassociated destinations. Furthermore, the fact

that tracks are often composed of hundreds of independent

detections provides many chances for errors to occur. Even an

algorithm with 99% accuracy would likely make one or more

errors per track given the large number of opportunities for

error; therefore, current automated methods cannot be relied

upon to provide accurate, fully automated tracks for graph

construction. However, the automated tracks are very useful in

semi-automated tracking because relevant but imperfect tracks

can be repaired by analysts. Tracking algorithm speciﬁcs and

performance evaluation are beyond the scope of this paper.

Nonetheless, imperfect tracks are also useful for destination

detection, as described in Section IV.

IV. DESTINATION DETECTION

Construction of vehicle movement graphs requires a map-

ping from the behavior of real world vehicle motions to the

nodes and edges of a graph. Edges are generally deﬁned as

tracks that link two destination points, e.g., a house and a

gas station. Nodes are then the set of discrete destinations in

the underlying vehicle movement network. Because vehicles

arriving at the same destination do not stop at identical

locations, automated detection of clusters of stopping vehi-

cles is needed to infer vehicle destinations. Additionally, the

automated algorithm provides information about destination

type, including the size of the destination and the number of

stops associated with it. This additional contextual information

can be used to better exploit the vehicle movement data.

Throughout this section “stop” refers to any time a vehicle

under track is not moving. Hence any start, stop, or pause of

a tracked vehicle is declared a stop for the purpose of graph

construction.

The most natural way to deﬁne a destination is to have

a human look at a map and classify areas into destinations,

such as parking lots and driveways. Any regions that are not

classiﬁed as some destination are then referred to as transient

regions, e.g., roads. There are some dual use regions, such as


roadsides, whose use may depend upon the day, time of day or

other extraneous factors. Ignoring this complication, we tasked

human analysts to create destination truth sets based on video

imagery from the Bluegrass data set.

The goal of this section is to reproduce the manually

classiﬁed destinations as closely as possible using solely

automated techniques. Manual segmentation uses contextual

data which may not be available to a computer program.

Instead, the automated algorithm uses the motion data from

nearby vehicles. A destination is a place where groups of

vehicles stop for prolonged periods of time.

Implementation of an automated algorithm requires two

steps. First, a way to discriminate destination stops, i.e., stops

occurring at a destination, is needed. Often, vehicles will stop

at intersections or for trafﬁc related reasons; while these stops

can have a long duration, they are not of interest in declaring

destinations. Given the set of all destination stops, we need a

way of identifying the discrete destinations that are visited by

these stops. That is, we must identify driveways and parking

lots, rather than generic stopping regions. These problems are

addressed in order.

Discrimination of destination stops from transient stops is

automatically performed using hypothesis testing with con-

textual information. Speciﬁcally, a computer can analyze the

behavior of nearby trafﬁc and compute the probability that a

stop at a given position is a destination stop or a transient

stop. If much of the trafﬁc previously observed to be traveling

near a position has been traveling at high speed, then it is

unlikely that a stop at that position is a destination stop.

Conversely, if almost all of the trafﬁc near that position

has high decelerations, travels slowly, and stops frequently,

a car stopping at that position is likely to have arrived at its

destination. Technically, we use contextual data from observed

trafﬁc to create a Gaussian-mixture prior with ﬁxed variance

for the probability that a car stopping at position x has reached

a destination,

P(H_d \mid x) = \frac{\sum_{y \in Y_d} N(|y - x|;\, \sigma)}{\sum_{y \in Y} N(|y - x|;\, \sigma)},  (1)

where H_d is the destination hypothesis, Y_d is a set of likely destination stops, and Y is the set of all stops. In the current algorithm, we use the tracker's internal state at termination of the track to choose the set Y_d. The features of the specific stop under test, including duration and abruptness, collectively Θ, are then used to update this prior based on the specifics of the stop to arrive at the posterior probability that a given stop is at a destination,

P(H_d \mid \Theta, x) = \frac{P(\Theta \mid H_d)\, P(H_d \mid x)}{\sum_{H \in \{H_d, H_t\}} P(\Theta \mid H)\, P(H \mid x)},  (2)

a consequence of Bayes' rule and the conditional independence of Θ and x given H_d; H_t is the transient hypothesis, d denotes a destination stop, and t denotes a transient stop. A likelihood test is then used to classify all stops into either transients or destinations.
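As an illustration, Eqs. (1) and (2) can be implemented directly. The sketch below is a simplified one-dimensional version with hypothetical stop positions; the stop-feature likelihoods P(Θ|H) are assumed to be given rather than learned:

```python
import math

def gaussian(dist, sigma):
    # Isotropic Gaussian kernel N(|y - x|; sigma); the normalizing constant
    # cancels in the ratio of Eq. (1), so it is omitted.
    return math.exp(-dist**2 / (2.0 * sigma**2))

def prior_destination(x, all_stops, likely_dest_stops, sigma):
    """Eq. (1): Gaussian-mixture prior P(H_d | x) built from observed stops."""
    num = sum(gaussian(abs(y - x), sigma) for y in likely_dest_stops)
    den = sum(gaussian(abs(y - x), sigma) for y in all_stops)
    return num / den

def posterior_destination(lik_dest, lik_transient, prior_d):
    """Eq. (2): posterior P(H_d | Theta, x) via Bayes' rule, given the
    stop-feature likelihoods P(Theta | H_d) and P(Theta | H_t)."""
    num = lik_dest * prior_d
    return num / (num + lik_transient * (1.0 - prior_d))

# Hypothetical 1-D example: stops along a road; Y_d chosen from tracker state.
Y = [0.0, 1.0, 2.0, 10.0, 11.0]   # all observed stops
Y_d = [10.0, 11.0]                # likely destination stops
p = prior_destination(10.5, Y, Y_d, sigma=1.0)
post = posterior_destination(lik_dest=0.8, lik_transient=0.2, prior_d=p)
print(p, post)  # prior and posterior near the destination cluster are close to 1
```

A stop at x = 10.5, surrounded only by likely destination stops, receives a prior near one, and the feature likelihoods then sharpen or soften that belief.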

Figure 4. Graphical model of the clustering algorithm. The hyperparameters m, σ_0, and θ specify how the clusters, k, are generated. The cluster parameters µ_k, Σ_k, and p_k then specify how the individual data points x_n are generated, based on which cluster the points belong to, z_n.

The above algorithm was tested on the Bluegrass data

against manually declared stop locations. The probability of

detecting destination stops was 54%, at a false alarm rate

of 1% on all track ends. Stops were missed either from an

inability to track persistently or failure of the hypothesis test.

Note that this is an initial implementation of the algorithm

which we intend to improve. Specifically, contextual information was learned only from nearby stops, whereas information from nearby moving vehicles would seem to provide additional necessary information (e.g., stopping in the middle of a busy road is contraindicated).

The identiﬁed destination stops must now be clustered into

discrete destinations. There is not a single correct way to

perform this clustering, as the edges of parking lots need not be

well deﬁned. Here the goal will be to obtain a clustering which

closely matches manual clustering results on the Bluegrass

data, and which scales gracefully as more data is added and

as the area under consideration is increased.

To perform this clustering, we ﬁrst construct a simple

generative model for destinations and then use inference

techniques to optimize the model parameters which include the

locations and sizes of the destinations. A graphical diagram

of this model is presented in Figure 4. In the simple model

we approximate a destination as a Gaussian distribution of

destination stops. In other words, given a destination's mean, µ_k, and its covariance, Σ_k, the stops associated with this destination will be Gaussian distributed as N(µ_k, Σ_k). The

means of these Gaussians are assumed to be distributed

uniformly over the area of interest, and the variances according

to an inverse Wishart distribution. Each destination will also

have an associated rate of vehicle stops. One expects this rate

to vary between destinations as more vehicles stop in a parking

lot than in a driveway. Ignoring the total rate of vehicle stops

at all destinations we only consider the probability that a given

stopping vehicle will stop at a specific destination, p_k. By

only modeling this probability we allow the algorithm to scale

gracefully as the total number of stops considered increases.

This probability of a stop being at a given destination is

modeled using a Dirichlet process prior [5].
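To make the generative model concrete, the following sketch draws synthetic stops from a simplified version of the Figure 4 model: cluster assignments z_n follow a Chinese-restaurant draw from the Dirichlet process with concentration θ, cluster centers µ_k are uniform over the area, and stops are Gaussian about their center. The inverse-Wishart covariance prior is replaced here by a fixed isotropic width for brevity, and all names are illustrative:

```python
import random

def sample_stops(n, theta, area=1000.0, width=10.0, seed=0):
    """Draw n destination stops from a simplified Figure 4 model:
    Dirichlet-process cluster assignments (Chinese-restaurant draw with
    concentration theta), uniform cluster centers, Gaussian stops."""
    rng = random.Random(seed)
    centers, counts, stops, labels = [], [], [], []
    for _ in range(n):
        # CRP: join cluster k w.p. n_k/(n+theta); open a new one w.p. theta/(n+theta)
        r = rng.uniform(0.0, sum(counts) + theta)
        k, acc = None, 0.0
        for i, c in enumerate(counts):
            acc += c
            if r < acc:
                k = i
                break
        if k is None:  # open a new destination cluster
            centers.append((rng.uniform(0, area), rng.uniform(0, area)))
            counts.append(0)
            k = len(counts) - 1
        counts[k] += 1
        cx, cy = centers[k]
        stops.append((rng.gauss(cx, width), rng.gauss(cy, width)))
        labels.append(k)
    return stops, labels

stops, labels = sample_stops(200, theta=2.0)
print(len(stops), len(set(labels)))  # 200 stops spread over a handful of destinations
```

The rich-get-richer behavior of the Chinese-restaurant draw mirrors the intuition in the text: busy parking lots accumulate many stops while driveways accumulate few, without fixing the number of destinations in advance.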


This generative model allows us to jointly optimize the

number of destinations along with their locations, widths, and

densities. Starting from Bayes’ rule, we can integrate out the

nuisance parameters and obtain a close form solution for the

probability that any given set of destinations is correct [8].

The cost of ﬁtting a given dataset partitioned as { x } to this

model is then C = −log P({x}), which depends upon the model's hyperparameters m, σ_0, θ. This cost can be expressed

with m = 3 as

C(\{x\}, \sigma_0, \theta) = -\sum_k \left[ \frac{n_k + 2}{2} \log\det\!\left(\sigma_0^2 I + n_k \Sigma_k\right) + 2\log\Gamma(n_k) + \log\theta \right],  (3)

where Σ_k is the sample covariance matrix of the points in cluster k, n_k is the number of points in cluster k, and Γ(n) = (n − 1)! is the gamma function. We use a variational Bayesian expectation maximization (VBEM) algorithm [2], [8] to minimize the cost for the observed data, which produces discrete destination clusters specified by p_k, µ_k, Σ_k, denoting the density, location, and width of the destination, respectively.

The EM algorithm is initialized by randomly selecting points

and placing them into clusters of size σ_0. The iterations of the EM algorithm then allow the points to move between clusters (i.e., changing their membership z_n), allow clusters to merge,

and allow new clusters to be formed by single points. Each

iteration reduces the cost. When the cost no longer changes,

the algorithm terminates and the parameters of the resulting

clusters then specify the discrete destinations.
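As a sketch (assuming the bracketing shown in Eq. (3) and two-dimensional stop positions), the cost of a candidate partition can be evaluated as follows; this is an illustration of the score, not the VBEM implementation itself:

```python
import math

def cluster_cost(clusters, sigma0, theta):
    """Evaluate the clustering cost of Eq. (3) (m = 3) for a candidate
    partition. `clusters` is a list of clusters, each a list of (x, y) points."""
    total = 0.0
    for pts in clusters:
        n = len(pts)
        mx = sum(p[0] for p in pts) / n
        my = sum(p[1] for p in pts) / n
        # Sample covariance Sigma_k (2x2); biased estimator for brevity.
        sxx = sum((p[0] - mx) ** 2 for p in pts) / n
        syy = sum((p[1] - my) ** 2 for p in pts) / n
        sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
        # det(sigma0^2 I + n Sigma_k) for the 2x2 case.
        det = (sigma0**2 + n * sxx) * (sigma0**2 + n * syy) - (n * sxy) ** 2
        # Per-cluster score; lgamma(n) = log Gamma(n) = log (n-1)!.
        total += (n + 2) / 2 * math.log(det) + 2 * math.lgamma(n) + math.log(theta)
    return -total

# Hypothetical partition: a tight 3-point cluster and a 2-point cluster.
c = cluster_cost([[(0, 0), (1, 0), (0, 1)], [(10, 10), (11, 10)]],
                 sigma0=1.0, theta=0.5)
print(c)
```

Because the score is a sum of independent per-cluster terms, candidate moves in the EM loop (reassigning a point, merging two clusters) only require re-evaluating the clusters they touch, which is what lets the procedure scale gracefully.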

Note that the cost function of Eq. (3) has several desirable

properties. First, it is the sum over independent cluster scores,

ensuring that the clustering procedure will scale gracefully as

clusters whose centers are far away from a given cluster will

not perturb its score. Second, the clusters in the model do not

interact, so that the parameters of the cluster may be estimated

solely from the points assigned to that cluster. Finally, the

cluster score has a minimum at a number of clusters that is

substantially less than the number of data points (except for

small σ_0). This property provides a parsimonious description

which we would expect, given the limited number of parking

lots and driveways present in the world.

There are three key parameters needed to specify the clustering score, σ_0, θ, m, corresponding to the initial cluster width,

the cluster density, and the strength of the prior on cluster

width, respectively. The results are somewhat insensitive to

the initial prior strength, and so we arbitrarily set m = 3. The

cluster density, θ, is set by requiring that if two points are

separated by a distance of 2σ_0, then the score for clustering them together should equal the score for clustering them separately. Thus, if two isolated points are closer together than 2σ_0, they will form a single cluster, while if they are further apart they will form two distinct clusters. Finally, the

base width parameter σ_0 can be tuned to choose the clustering size. Large values of σ_0 will result in overclustering, while small values will induce significant underclustering. In practice, we choose


Figure 5. Example clustering of an intersection in Albuquerque, NM. Yellow

boxes are notional manual clustering. Blue circles are notional automated

clusters (one sigma ellipses). Red dots are the notional stops/data points that

are clustered. Actual clustering was evaluated using Bluegrass data.

[Figure 6 plot: information coverage, with Information Completeness on the y-axis and False Information Ratio on the x-axis]

Figure 6. Information coverage (IC) plot of the performance of the

clustering algorithm [7]. The y-axis indicates the mutual information between

the automated and manual destination declarations normalized by the total

entropy. The x-axis is the conditional entropy of the automated output given

the manual destinations again normalized by total entropy. The free parameter

in the IC plot is σ_0, and the starred operating point is at σ_0 = 30 m.

σ_0 to be around 30 m, the approximate separation between

houses. This choice allows driveways for adjacent houses

to be clustered separately, while still permitting parking lots

to be clustered together. However, two immediately adjacent

driveways could be clustered together, and large parking lots

may be broken up if the distribution of cars is insufﬁciently

dense or insufﬁciently uniform.

Figure 5 shows an example clustering result on vehicle destinations around an urban intersection. Quantitative evaluation

of the automated clustering performance is done by comparing

it to the human clustering on the Bluegrass data set, shown

in Figure 6. The comparison is done using an information-theoretic metric [7]. A performance curve is traced out by

varying σ_0. At the optimal point, with false alarms and missed

detections weighted equally, the automated clustering captures

above 95% of the information from manual clustering; the

amount of false information introduced by the automated

clustering is less than 2% of the total information needed to

identify destinations. As noted, the optimal initial cluster size

is about the size of a residence, and comparable results are

obtained at nearby initial sizes. This result is expected, as the

initial cluster size is only a suggestion and the actual cluster

width is learned by the algorithm.

V. CUED GRAPH EXPLORATION

After vehicle destinations have been identiﬁed and clus-

tered into geographical sites of interest, a method is needed

to connect these sites together through vehicle tracks. Sites

may be speciﬁed by either the destination clustering method

just described, or by an analyst using external information. By

identifying relationships between sites, it is possible to build a

network containing the subnetwork associated with the original

cue. As discussed in Section III, automatic tracking of cars in

dense urban environments is subject to many types of errors.

This means that any connection between two locations will

require a semi-automated approach with a human in the loop.

The challenge with any semi-automated approach is that

while humans can reduce the problem of track errors, the time

required to hand-verify every track throughout an entire city

is prohibitively expensive. Therefore, an approach to prioritize

analyst tasking is needed to focus effort on following the

vehicles most relevant to constructing the foreground network.

In the case of community detection, relevant vehicles are

vehicles most likely to be part of the community of interest.

Though not the focus of this paper, alternate schemes could

consider relevance from an information theoretic approach,

whereby relevant vehicles are ones that provide maximum

information content for discriminating the foreground from

the background network.

The proposed algorithm is a cued graph exploration ap-

proach whereby a location of interest is identiﬁed (a cue) and

a graph grown beginning at the cue location. This cue location

is assumed to be known a priori and represents a site used

by the foreground community. Beginning with a cue allows

analysts to focus their attention on vehicles in a local region

(in graph space) around a known foreground location.

The graph exploration algorithm is based on the breadth-

ﬁrst search (BFS) algorithm [13]. In breadth-ﬁrst search, a

graph is initially formed by following all vehicles departing

or arriving at the cue location (node). Then for each location

(node) found in step one, all vehicles are followed again. The

priority assigned to exploring edge E in the graph is therefore

given by

\mathrm{Priority}(E) = \min\, d\big(v(E), V_c\big),  (4)

Figure 7. Graphs constructed by following vehicles from one location to

another. The left ﬁgures show vehicle movement overlayed on the aerial

imagery while the right ﬁgures show the same movement represented as a

graph. From top to bottom each ﬁgure shows the graph at various stages of

exploration using BFS. The foreground subgraph is shown using red nodes,

and the background graph is shown using gray vertices. The cue vertex is

shown in yellow.

where d is the standard graph distance between vertices, v(E)

are the vertices of edge E, and V_c is the cued vertex. This

procedure is repeated until some ﬁxed number of tracks are

explored. Nodes at the same distance away from the cue are

explored in random order and vehicles departing any given

node are followed in random order.
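The priority rule of Eq. (4) can be sketched with a standard BFS distance computation; the toy graph and names below are hypothetical:

```python
from collections import deque

def bfs_distances(adj, cue):
    """Standard graph distances d(v, V_c) from the cue vertex."""
    dist = {cue: 0}
    q = deque([cue])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def edge_priority(edge, dist):
    """Eq. (4): priority of exploring edge E is the minimum graph distance
    from its endpoints v(E) to the cue vertex (lower value = explore first)."""
    u, v = edge
    return min(dist.get(u, float("inf")), dist.get(v, float("inf")))

# Hypothetical toy graph: cue node 0, chain 0-1-2-3 plus branch 1-4.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
d = bfs_distances(adj, cue=0)
print(edge_priority((1, 2), d), edge_priority((2, 3), d))  # 1 2
```

Edges touching nodes nearer the cue receive lower (better) priority, so exploration expands the graph ring by ring around the cued site, exactly as in Figure 7.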

Figure 7 shows three graphs at various stages of exploration

using BFS on the simulated insurgent network data. Note that

while the vehicles may traverse long distances in physical

space, all locations discovered are within one, two, or three

hops from the cue node.

While BFS is good at exploring a neighborhood of nodes

surrounding a cue node, the ﬁnal graph in Figure 7 demon-

strates a major drawback of this approach. The vast majority

of time in the ﬁnal graph is spent exploring tracks leaving a

handful of high degree nodes. Under a ﬁxed time constraint,


this would not be an efficient use of human resources. This bias of BFS towards high-degree nodes has also been observed in a number of other studies [3], [9].

In order to combat the bias towards high degree nodes,

a degree-weighted BFS approach is implemented. In degree-

weighted BFS, nodes at the same distance from a cue node

are explored in order of their degree, with low degree nodes

explored ﬁrst and high degree nodes explored last. Additional

models of node relevancy may also be used, depending upon

the observable information available for each node and track.

Because nodes represent clustered track destinations, it is

possible to estimate the degree of each node by counting the

number of destinations in each cluster, thereby providing an

estimate of node degree before the exploration stage.
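A minimal sketch of degree-weighted BFS under these assumptions (estimated degrees supplied up front, one analyst-verified track per edge followed; all names are illustrative, not the authors' implementation):

```python
import heapq

def degree_weighted_bfs(adj, est_degree, cue, max_tracks):
    """Sketch of degree-weighted BFS: among nodes at the same distance from
    the cue, low (estimated) degree nodes are explored first. `est_degree`
    may come from counting destinations per cluster before exploration."""
    explored_nodes, examined = {cue}, 0
    # Priority tuple: (distance from cue, estimated degree, node id).
    frontier = [(0, est_degree.get(cue, 0), cue)]
    visited = {cue}
    while frontier and examined < max_tracks:
        dist, _, u = heapq.heappop(frontier)
        for v in adj[u]:
            if examined >= max_tracks:
                break
            examined += 1                  # one track followed per edge
            explored_nodes.add(v)
            if v not in visited:
                visited.add(v)
                heapq.heappush(frontier, (dist + 1, est_degree.get(v, 0), v))
    return explored_nodes, examined

# Hypothetical graph: node 2 is high degree, node 1 is low degree.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3, 4, 5], 3: [1, 2], 4: [2], 5: [2]}
deg = {u: len(vs) for u, vs in adj.items()}
nodes, tracks = degree_weighted_bfs(adj, deg, cue=0, max_tracks=4)
print(sorted(nodes), tracks)  # [0, 1, 2, 3] 4
```

With a budget of four tracks, the low-degree node 1 is expanded before the high-degree node 2, so node 3 is reached without first spending tracks on node 2's many edges; this is the behavior that yields the savings reported in Figure 8.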

The exploration strategies are compared using two metrics.

One metric measures time required to explore the graph (hu-

man resources), and another measures efﬁciency at uncovering

foreground relative to background (ROC Analysis). Three

search strategies, random walk, BFS and degree-weighted

BFS, are compared using each metric. The random walk

strategy provides a baseline and represents a human following

random vehicles beginning at the cue node.

Figure 8 shows the percentage of the foreground network

found as a function of the number of tracks examined, which

is a proxy for the amount of human time required to uncover a

certain percentage of the foreground network. Figure 8 shows

that a local search in graph space such as BFS uncovers more

of the foreground network faster than an undirected random

search. Better still is degree-weighted BFS, which finds 80% of the foreground in approximately 250 tracks while standard BFS requires nearly 800. This represents a significant, threefold savings in human resources.

Figure 9 shows the percentage of foreground network found

against the percentage of the background network found.

This is similar to a traditional ROC curve indicating PD/PFA

performance regardless of the human time required. The results seen in Figure 9 are comparable to those of Figure 8, with degree-weighted BFS outperforming the other search methods.

VI. CUED GRAPH DETECTION

Network detection is the final step of the network discovery workflow. It provides discrimination between the foreground

and background network on the graph constructed from the

previous steps. The output of the detection step determines

the ﬁnal overall performance of the system.

Intuitively, the connectivity between nodes in the foreground network is stronger than the connectivity from background nodes to the foreground network. Discrimination based on

this structural difference has been demonstrated using spectral

methods by Newman [14]. Similarly, we perform detection

based on the projection of a graph into the subspace spanned

by a few eigenvectors of its modularity matrix B,

    B = A − kk^T / (2|E|),    (5)

Figure 8. Percentage of foreground network found as a function of vehicle tracks examined. Random search is shown in thin blue, BFS in medium green, and degree-weighted BFS in thick red.

Figure 9. Percentage of foreground network found as a function of background network found. Random search is shown in thin blue, BFS in medium green, and degree-weighted BFS in thick red.

where A is the observed adjacency matrix, k is the vector of

node degrees, and |E| is the total number of edges. The mod-

ularity matrix can be interpreted as the difference between the

observed and the expected number of edges between any pair

of nodes. Performing the eigen-decomposition B =: UΛU^T provides eigenvectors U as candidate bases. Following the algorithm described by Miller et al. [10]–[12], the

principal eigenvectors that have the largest components along

the dimension of the cue node are selected to form the

detection subspace. Nodes mapped to this subspace are then detected by thresholding their Euclidean norm. By varying the detection threshold, a receiver operating characteristic (ROC) curve is

generated for each constructed graph with varying number of

explored tracks (Figure 10).
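A minimal sketch of this spectral detection step, assuming an undirected 0/1 adjacency matrix and NumPy; the function names, the subspace dimension `n_eig`, and the use of largest-magnitude cue components as the selection criterion are illustrative assumptions, following the description of Miller et al. only in outline.

```python
import numpy as np

def modularity_matrix(A):
    """Modularity matrix B = A - k k^T / (2|E|) of Eq. (5), for a
    symmetric 0/1 adjacency matrix A, where k is the degree vector."""
    k = A.sum(axis=1)
    return A - np.outer(k, k) / k.sum()     # k.sum() equals 2|E|

def cued_spectral_scores(A, cue, n_eig=2):
    """Score each node by its Euclidean norm after projection onto the
    n_eig eigenvectors of B having the largest-magnitude components
    along the cue node's dimension; thresholding these scores over a
    range of thresholds traces out the detection ROC."""
    B = modularity_matrix(A)
    _, U = np.linalg.eigh(B)                # B = U diag(w) U^T
    idx = np.argsort(-np.abs(U[cue, :]))[:n_eig]
    return np.linalg.norm(U[:, idx], axis=1)
```

Note that every row of B sums to zero, consistent with its interpretation as observed-minus-expected edge counts under the degree-preserving null model.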

Figure 10. Detection ROC on graphs with varying numbers of explored tracks, using degree-weighted breadth-first search. The curves indicate detection performance on graphs with 150 (thin blue), 600 (medium green), and 1000 (thick red) tracks explored. [Axes: Background Found (PFA, %) vs. Foreground Found (PD, %).]

For each of the detection ROCs, the top right corner of the curve represents the detection performance if all explored nodes are declared as the foreground. Starting with the smallest

graph with only 150 explored tracks, 35% of the foreground

network is missed because the entire network cannot be

reached with this relatively small number of tracks. The

optimal performance is seen when 600 tracks (i.e., 0.5% of

the total tracks) are explored, where the detector achieves 87%

PD and 1.5% PFA. At this sampling level, the constructed

graph includes the majority of the foreground network as well

as additional information on the network topology, leading to

better discrimination in the detection step. Beyond 600 tracks,

the overall performance plateaus so that additional exploration

does not yield greater performance. Our detection analysis

highlights the advantage of coupling the detection step with the

exploration step, which achieves higher overall performance.

The detection step provides false alarm mitigation because

many of the explored nodes are actually part of the background

network. The exploration step provides optimal inclusion of

the foreground network while minimizing the size of the

constructed graph, reducing the analyst workload needed for

graph construction.

VII. CONCLUSIONS

This paper presents an end-to-end approach to network discovery from wide-area surveillance data. A traffic

network graph is constructed in a semi-automated fashion

where human analysts are aided with automated algorithms

such as tracking, destination detection, site clustering, graph

exploration, and graph detection. We demonstrate efﬁcient

graph exploration starting from a cue node and good ﬁnal

detection performance on the simulated insurgent network

dataset comprised of 4,478 locations and 116,720 tracks. The

degree-weighted breadth-ﬁrst search model for node relevancy

is shown to uncover 80% of the foreground network in

approximately 250 tracks while standard BFS requires nearly

800, representing a significant, threefold potential savings

in human resources. Detection performance on the simulated

foreground graph is shown to be 87% PD and 1.5% PFA.

The strength of this approach centers on the ability to focus

analysis on a very small part of the immense video data

stream. While improvements can be made to every step of the

processing chain, we believe this approach provides a novel

and promising paradigm for conducting clandestine network

discovery.

REFERENCES

[1] AFRL CLIF 2007 dataset over Ohio State University. <https://www.sdms.afrl.af.mil/datasets/clif2007>.
[2] M. J. Beal. Variational Algorithms for Approximate Bayesian Inference. Ph.D. Thesis, Gatsby Computational Neuroscience Unit, University College London, 2003.
[3] L. Becchetti, C. Castillo, D. Donato, and A. Fazzone. "A comparison of sampling techniques for web graph characterization," in Proc. Workshop on Link Analysis (LinkKDD), Philadelphia, PA, 2006.
[4] D. M. Blei, A. Ng, and M. I. Jordan. "Latent Dirichlet Allocation," Journal of Machine Learning Research 3: 993–1022, 2003.
[5] T. S. Ferguson. "A Bayesian analysis of some nonparametric problems," Annals of Statistics 1: 209–230, 1973.
[6] C. Godsil and G. Royle. Algebraic Graph Theory. New York: Springer-Verlag, 2001.
[7] R. S. Holt, P. A. Mastromarino, E. K. Kao, and M. B. Hurley. "Information theoretic approach for performance evaluation of multi-class assignment systems," in Proc. Signal Proc., Sensor Fusion, and Target Recognition XIX (SPIE), ed. Ivan Kadar, Orlando, FL, April 2010.
[8] D. Koller and N. Friedman. Probabilistic Graphical Models. Cambridge, MA: The MIT Press, 2009.
[9] M. Kurant, A. Markopoulou, and P. Thiran. "On the bias of BFS (Breadth First Search)," in Proc. 22nd Intl. Teletraffic Congress (ITC), Amsterdam, Netherlands, 2010.
[10] B. A. Miller, M. S. Beard, and N. T. Bliss. "Eigenspace analysis for threat detection in social networks," to appear, Fusion, 2011.
[11] B. A. Miller, N. T. Bliss, and P. J. Wolfe. "Toward signal processing theory for graphs and other non-Euclidean data," in Proc. IEEE Intl. Conf. Acoustics, Speech and Signal Processing, pp. 5414–5417, 2010.
[12] B. A. Miller, N. T. Bliss, and P. J. Wolfe. "Subgraph detection using eigenvector L1 norms," in Proc. 2010 Neural Information Processing Systems (NIPS), Vancouver, Canada, 2010.
[13] M. E. J. Newman. Networks: An Introduction. Oxford University Press, 2010.
[14] M. E. J. Newman. "Finding community structure in networks using the eigenvectors of matrices," Phys. Rev. E 74 (3), 2006.
[15] H. L. Van Trees. Detection, Estimation, and Modulation Theory, Part 1. New York: John Wiley and Sons, 1968.
[16] J. Xu and H. Chen. "The topology of dark networks," Comm. ACM 51 (10): 58–65, 2008.
