Experimental and Numerical Investigations of Novel Architectures Applied to Compressive Imaging Systems

ABSTRACT
Experimental and Numerical Investigations of Novel Architectures Applied to Compressive
Imaging Systems
by
Matthew Adam Turner
A recent breakthrough in information theory known as compressive sensing is one
component of an ongoing revolution in data acquisition and processing that guides one to acquire
less data yet still recover the same amount of information as traditional techniques, meaning
fewer resources such as time, detector cost, or power are required. Starting from these basic
principles, this thesis explores the application of these techniques to imaging. The first
laboratory example we introduce is a simple infrared camera. Then we discuss the application of
compressive sensing techniques to hyperspectral microscopy, specifically Raman microscopy,
which should prove to be a powerful technique to bring the acquisition time for such
microscopies down from hours to minutes. Next we explore a novel sensing architecture that
uses partial circulant matrices as sensing matrices, which results in a simplified, more robust
imaging system. The results of these imaging experiments lead to questions about the
performance and fundamental nature of sparse signal recovery with partial circulant compressive
sensing matrices. Thus, we present a suite of numerical experiments that show
some surprising and suggestive results that could stimulate further theoretical and applied
research on partial circulant compressive sensing matrices. We conclude with a look ahead
to adaptive sensing procedures that allow real-time, interactive optical signal processing to
further reduce the resource demands of an imaging system.
First and foremost, I must thank my family still on this Earth, and those passed on to
the next life. I am deeply indebted to their unending encouragement, dedication, and love.
Indeed this achievement owes much to my parents who have always driven me to persevere.
Next, I must express my deepest appreciation to my research adviser Kevin F. Kelly who
encouraged me to apply for the NSF IGERT fellowship, provided space for me to work and
grow as a researcher, and supported my ventures out into side projects, allowing me to
find my own research path. Thanks also to my Kelly Lab-mates, especially Lina Xu, Yun Li,
and Ting Sun from the camera lab. Cheers to Chad Byers for drinking coca tea and many
beers with me in Bogotá. Thanks to Chaitra Rai for being a great friend who always had
an open ear. Thanks to Corey Slavonic for many good conversations on walks for coffee, at
lunch, or any of the many times I locked myself out of the office or lab.
Thanks very much to Wotao Yin whose class taught me how important high-dimensional
geometry is to understanding compressive sensing and sparse recovery. Indeed, his class
caused me to retreat at first because it was so difficult, but in that difficulty I regained a
grip and was able to view anew the beauty of geometry, which in high school first attracted
me to mathematics.
Thanks to Richard Baraniuk for being such a positive character in the Rice Electrical and
Computer Engineering Department. Because of him we have an all-star lineup of speakers
every semester. His group members have been immensely helpful to me in the process of my
Master’s work.
I must also acknowledge Roger Moye of the Rice Shared Computing Grid, who patiently
answered my questions on Unix programming, batch scripts, and parallel computing, and who
responded promptly when MATLAB or something else was not working on the cluster. If not
for his help, Chapter 4 would not exist.
Finally, a very huge thank you to my wonderful friends who have also been bandmates,
mentors, roommates, and psychiatrists throughout these past four years. Life would be hell
without you.
Contents
Abstract
List of Illustrations
List of Tables
1 Motivation and Introduction
1.1 Overview
1.2 Introduction to the Mathematics of Compressive Sensing
1.2.1 Matrix representation of a system of equations
1.3 Infrared Imaging via Compressive Sampling
1.3.1 Single-Pixel Camera General Setup
1.3.2 Image Recovery with TVAL3
1.4 Other Applications and Outline of the Thesis
2 Compressive Microscopy
2.1 Compressive Microscopy
2.1.1 Raster Scanning Microscope Systems
2.1.2 Compressive Microscopy Setup
2.2 Raman Imaging
2.2.1 The Raman Effect
2.2.2 Determining Chemical Structure from the Raman Spectrum of a Material
2.2.3 Raster Scanning Raman Microscopy
2.2.4 Laser-illuminated Compressive Sensing Microscope System
3 Circulant Matrices for Compressive Imaging
3.1 Theory of circulant matrices for imaging
3.1.1 Properties of circulant matrices
3.2 Imaging with Partial Circulant Measurement Matrices
3.2.1 Imaging Setup
3.2.2 Formally Describing How to Build Φ from Φ
3.3 Imaging Results
3.4 Conclusion and Discussion
4 Experimental Investigation of Subsampled Circulant Matrices for Compressive Sensing
4.1 Background
4.1.1 Linear Programming
4.1.2 Polytope Geometry
4.1.3 The Connection between Polytope Geometry and Convex Optimization
4.2 Experimental setup
4.3 Phase Diagrams for Select n with Explanations
4.3.1 Basis Pursuit Results
4.3.2 Linear Program
4.3.3 Some Coherence Statistics
4.4 Discussion and Future Work
5 Future Directions and Conclusion
5.1 Adaptive Sensing
5.2 Conclusion
Bibliography
Illustrations
1.1 Detail of the DMD along with a photograph of a DMD displaying a 32 × 32
permuted Walsh-Hadamard measurement vector, and a series of M
128 × 128 measurement vectors above it.
1.2 Illustration of imaging system with DMD detail
1.3 Infrared camera setup on the optical table
1.4 n = 256 × 256 images of charcoal-painted “IR” behind acrylic paint as
imaged by our compressive imaging system with λ = 1450 nm
2.1 Raster Scanning Hyperspectral Microscopy Setup
2.2 A simple addition of our single-pixel camera to a standalone Zeiss
microscope system.
2.3 Reconstruction of n = 128 × 128 image of AF Test Target 1951 with δ = .95
and default TVAL3 parameters β, µ.
2.4 Method for determining best parameters for TVAL3 reconstruction. We see
that various choices for β, µ result not only in different quality
reconstructions, but also different reconstruction times (in seconds) for each,
as indicated below each reconstruction.
2.5 The Raman spectrum of silicon dioxide substrate and graphite. A typical
measurement (compressive or not) in hyperspectral microscopy could be like
either one of these, or have two peaks together, or possibly contain any
number of peaks corresponding to different materials present in the sample.
2.6 Visible light view of the graphite flake on silicon substrate to be imaged via
the Raman effect.
2.7 Visible light view of the graphite flake on silicon substrate to be imaged via
the Raman effect. The resolution is n = 33 × 33.
2.8 Raman image recovered from simulated compressive measurements for some
values of δ = m/n.
2.9 Time for Rec PC to recover solution x as a function of 1 − δ, where δ is the
subsampling ratio. The dependence is non-linear.
2.10 Laser-illuminated compressive sensing experimental setup
2.11 Images of the smallest target on the AF Test Target 1951-A. The bars are
2.2 µm wide. Images taken with 100x/.9NA Zeiss EC Epiplan-Neofluar
lens. Small field of view, high magnification.
2.12 An alternative laser-illuminated compressive sensing experimental setup
3.1 $\phi^{(1)} \in \mathbb{R}^{1024}$ reshaped to 2D. White squares represent $\phi^{(1)}_k = 1$, black squares
represent $\phi^{(1)}_k = 0$, $k = 1, \ldots, n$.
3.2 Four copies of the seed vector $\phi^{(1)}$ patterned onto an optical plate. By
shifting a selection mask (represented by the red box) to select one
measurement vector at a time, we generate, or ‘select,’ measurement basis
vectors from Φ, reshaped to 32 × 32.
3.3 By shifting one row or column of the mask at a time, we can generate all
n = N × N rows of a block circulant matrix Φ. The optical system is
identical to the DMD-based setup, where a lens L2 focuses an image of the
scene, represented by the arrow, onto the mask. The light that is allowed to
pass, corresponding to an ‘on’ pixel, or $\phi_{ij} = 1$, is collected by the lens L1
and directed towards the photodetector for measurement.
3.4 Filled points in these plots indicate the location of the selection mask for
individual measurements in terms of row and column shifts. Thus, there are
more row shifts than column shifts for the sequential method and an equal
number of row and column shifts for the box method. The random path
shows some structure since the mask is only allowed to step one row or
column shift to generate the next measurement basis vector in the sequence,
and random is just that.
3.5 Difference between taking measurement vectors from Φ sequentially (left
column) and according to a random path (right column) for a few
subsampling ratios. Note the reconstruction with random path measurement
vectors is relatively high quality even at a very low subsampling ratio δ.
Data acquired by Lina Xu.
3.6 Relative mean square error (normalized squared difference between
reconstructed image for a given δ and the one reconstructed with δ = 1) for
the four methods of generating the measurement basis Φ
3.7 Time to solve the underlying optimization problem and recover an image for
various undersampling ratios, δ, for the four methods of generating the
measurement basis Φ.
4.1 The crosspolytope in three dimensions. There are six vertices, or 0-faces,
twelve line segments, or 1-faces, and eight 2-faces, or what we commonly call
a face.
4.2 (BP) n = 32 × 32, s
4.3 (BP) n = 32 × 32, r
4.4 (BP) n = 32 × 32, Φ has Gaussian entries
4.5 (LP) n = 32 × 32, s
4.6 (LP) n = 32 × 32, r
4.7 (LP) n = 33 × 33, s
4.8 (LP) n = 31 × 31, s
4.9 (LP) n = 30 × 30, s
4.10 (LP) n = 16 × 16, s
4.11 (LP) n = 27 × 27, s
4.12 µ(Φ) for various resolutions and for random and sequential methods for a
series of values δ. Perhaps unexpectedly, the coherence for sequential-type Φ
is lower than for random-type; however, the deviation from the mean is larger
for sequential than for random.
5.1 A series of 1-bit adaptive measurements. An alternative description would
be a binary search, where on each measurement we split the portion of the
DMD where the ‘on’ pixel could be and ask ‘Which half of this active space
has the ‘on’ pixel?’
5.2 Sequence of all log2(1024) = 10 32 × 32 binary adaptive measurement
vectors. The 0th measurement vector, where all the mirrors are ‘on,’ is omitted.
5.3 Sequence of all 4 8 × 8 binary adaptive measurement vectors. The 0th
measurement vector, where all the mirrors are ‘on,’ is omitted. Theoretically
mplog2n, however with only n = 64, we do not achieve this dramatic of
an improvement.
5.4 Sequence of all log2(log2 1024) = 3 8 × 8, 22i grayscale levels for the ith
measurement adaptive measurement vectors. The 0th measurement vector,
where all the mirrors are ‘on,’ is omitted.
5.5 Results from sending 64 different gray levels for the DMD to display along
with the actual measurement recorded at the DMD. We quantified the
spectrometer reading by ‘sum’ and ‘max’, ‘sum’ meaning we summed over
all wavelength bins and divided by the number of pixels, and for ‘max’ we
took the maximum value over all wavelength bins as we did for the
measurements presented in the rest of the chapter.
Tables
5.1 Table showing expected and actual measurement values for the ith
measurement with the i gray levels scheme, taken as the maximum of the
peak of a spectrometer reading.
5.2 Table showing expected and actual measurement values for the ith
measurement, taken as the maximum of the peak of a spectrometer reading.
Chapter 1
Motivation and Introduction
1.1 Overview
The physical sciences, like all aspects of society, are strained under the effects of the
“data deluge,” which could be described as our species’ overwhelming ability to acquire
or create data coupled with our relatively weak skill in extracting useful information from
that data. In a raster scanning, hyperspectral microscope, such as a darkfield, Raman, or
Fourier-transform infrared microscope, large amounts of data are collected, and oftentimes
acquisition is either time-intensive, or in the case of infrared microscopy and spectroscopy,
monetarily expensive because exotic, non-silicon based detectors are required. This thesis is
a description, exploration, and exposition of how a new theory in signal processing, known
as compressed sensing, can be applied to microscopy and other imaging systems in order to
minimize these and other costs. Compressed sensing (CS) is a mathematical jewel itself. A
groundswell of mathematical and engineering work has risen exploring the implications and
theoretical applications of this theory. Based on the applications it has already found, CS
could prove to be one of the most useful developments in mathematics so far this century.
In this chapter, we introduce some notation we will need to describe compressive imaging
systems, followed by the introduction of our compressive infrared imaging system where we
will further illustrate the connection between the measurement formalism and the physical
camera system. We will see that compressive imaging reduces to solving an underdetermined
set of equations, unsolvable by elimination methods. In order to solve these equations, we
regularize the problem, in other words, we apply a priori information so that the deficit of
information is sufficiently reduced, and we recover an image as we would have otherwise.
1.2 Introduction to the Mathematics of Compressive Sensing
1.2.1 Matrix representation of a system of equations
In order to describe our measurement systems mathematically, we need to be able to
efficiently write large systems of equations. In the sequel, signal and image will be used
interchangeably, although there are philosophical differences: mainly that a signal can be
exactly known and recovered, but an image will always be an approximation to “true
reality,” reality itself being a point of philosophical debate. Generally, we use the word ‘signal’
when referring to an arbitrary $x \in \mathbb{R}^n$ and ‘image’ when referring specifically to a signal
approximated by an imaging system.
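We now demonstrate how to compactly write large systems of equations in matrix
notation by means of a simple example. Consider the system of equations
$$
\begin{aligned}
3x_1 + 2x_2 &= -5 \\
3x_1 - 9x_2 &= 3
\end{aligned}
$$
with $x_1, x_2 \in \mathbb{R}$.
This is represented in matrix form by
$$ y = \Phi x \tag{1.1} $$
where
$$
y = \begin{pmatrix} -5 \\ 3 \end{pmatrix}, \quad
x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad \text{and} \quad
\Phi = \begin{pmatrix} 3 & 2 \\ 3 & -9 \end{pmatrix}.
$$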
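This allows us to efficiently describe an arbitrary number of equations and an arbitrary
number of unknowns: the general system
$$
\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}}_{y \in \mathbb{R}^m}
=
\underbrace{\begin{pmatrix}
\phi_{11} & \phi_{12} & \cdots & \phi_{1n} \\
\phi_{21} & \phi_{22} & \cdots & \phi_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\phi_{m1} & \phi_{m2} & \cdots & \phi_{mn}
\end{pmatrix}}_{\Phi \in \mathbb{R}^{m \times n}}
\underbrace{\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}}_{x \in \mathbb{R}^n}
\tag{1.2}
$$
is still just $y = \Phi x$. Using this matrix notation for a system of equations, we can describe
the acquisition of data that results in an image. For pixel-array or raster scan imaging,
$$
\Phi = I = \delta_{ij}, \quad \text{with } \delta_{ij} =
\begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases},
$$
or
$$
I = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & 1
\end{pmatrix}.
$$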
By simply reshaping the resulting vector $y$ to be a rectangular 2D array, we recover an image
of the scene. Each row of $\Phi$, which we denote $\phi^{(i)}$, probes the $i$th discretized point of $x$. If
we regard $x$ as the scene to be imaged, note that $x$ is not discrete until we impose some grid
on it. The value $y_i = \langle \phi^{(i)}, x \rangle$ measures the brightness corresponding to each pixel. We call
$\phi^{(i)}$ the $i$th measurement vector. If we write a column of $\Phi$ as $\phi^c_i$, so that
$\Phi = (\phi^c_1, \ldots, \phi^c_n)$, then we get the representation of the signal $x$,
$$
y = \phi^c_1 x_1 + \phi^c_2 x_2 + \cdots + \phi^c_n x_n = \sum_{i=1}^{n} \phi^c_i x_i. \tag{1.3}
$$
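If $\Phi = I$,
$$
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_m \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} x_1
+ \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} x_2
+ \cdots
+ \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} x_n. \tag{1.4}
$$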
Thus we see that the measurements $y$ may be viewed as a weighted sum of the columns of $\Phi$
where the weights are the discretized points of the scene, $x$.
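The two readings of $y = \Phi x$ described above (inner products with rows, and a weighted sum of columns) can be checked on a toy case. The following is a minimal sketch in plain Python; the three-pixel scene, the small binary matrix, and all function names are illustrative assumptions, not values or code from the thesis.

```python
# Hypothetical 3-pixel scene and a 2x3 binary measurement matrix (illustrative values).
x = [4.0, 1.0, 2.5]
Phi = [[1, 0, 1],
       [0, 1, 1]]


def measure_by_rows(Phi, x):
    """Row view: y_i is the inner product of the i-th measurement vector with x."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in Phi]


def measure_by_columns(Phi, x):
    """Column view (Equation 1.3): y is the sum of the columns of Phi weighted by x_j."""
    m = len(Phi)
    y = [0.0] * m
    for j, xj in enumerate(x):
        for i in range(m):
            y[i] += Phi[i][j] * xj
    return y


def identity(n):
    """Raster-scan measurement matrix: row i probes only pixel i."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


y_rows = measure_by_rows(Phi, x)
y_cols = measure_by_columns(Phi, x)
raster = measure_by_rows(identity(len(x)), x)
```

Both views give the same measurements, and with the identity matrix the measurements are simply the pixel values themselves, which is the raster-scan case.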
In this work, compressive imaging is achieved through techniques deriving from transform
imaging, where more than one pixel is probed at a time, or, in other words, each measurement
vector $\phi^{(i)}$ has many 1s in it. In the example to follow, $\sum_{j=1}^{n} \phi^{(i)}_j = \frac{n}{2}$, $i = 1, \ldots, m$, and
$\phi^{(i)} \in \{0,1\}^n$. For compressive imaging, however, we do not have $\Phi \in \mathbb{R}^{n \times n}$, but instead
$\Phi \in \mathbb{R}^{m \times n}$ with $m < n$. In other words, we have fewer equations than unknowns. To quantify
the amount of undersampling, define the undersampling, or equivalently compression, ratio
$$ \delta = \frac{m}{n}. $$
The magic of compressive sensing is that we are still able to recover at least a good
approximation to $x$, if not $x$ exactly, from the underdetermined set of equations arising from our
measurements $y = \Phi x$. Before explaining how to recover an image from these
underdetermined equations, here is an example of compressive imaging in action that also serves to
further solidify the notation to be used throughout the rest of this thesis.
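To see why an underdetermined system cannot be solved by elimination alone, consider the following sketch in plain Python. It builds binary measurement vectors with exactly $n/2$ ones each and then shows two different scenes that produce identical measurements. The helper names, the seed, and the toy values are assumptions made for illustration only.

```python
import random


def random_half_on_matrix(m, n, seed=0):
    """Return m rows, each a {0,1} vector of length n with exactly n // 2 ones."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        on = set(rng.sample(range(n), n // 2))
        rows.append([1 if j in on else 0 for j in range(n)])
    return rows


def measure(Phi, x):
    """Compute y = Phi x as a list of inner products."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in Phi]


# A 16-pixel scene sampled with only 4 measurement vectors.
n, m = 16, 4
Phi = random_half_on_matrix(m, n)
delta = m / n  # undersampling ratio

# Hand-built example with m = 2 < n = 4: two different scenes, same measurements.
Phi_small = [[1, 1, 0, 0],
             [0, 1, 1, 0]]
x_sparse = [0, 2, 0, 0]   # one nonzero pixel
x_dense = [2, 0, 2, 7]    # a different scene entirely
same = measure(Phi_small, x_sparse) == measure(Phi_small, x_dense)
```

Here `same` is true, so the measurements alone cannot distinguish the two scenes. Some prior knowledge, for example a preference for the sparser candidate, is needed to pick one.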
1.3 Infrared Imaging via Compressive Sampling
As an example to prepare for the sequel in which we explore more advanced imaging systems,
we introduce the ‘Rice single-pixel camera.’ Specifically, this camera is an infrared camera.
Infrared imaging is one of the useful applications of compressive sensing, especially in remote
sensing and data-fusion settings where detector cost is not the only cost to be
mitigated. Infrared imaging has well-known applications in missile technologies,
and also in night vision and surveillance. Predator drones, for example,
routinely acquire their targets for assassination with highly advanced infrared cameras [1]. A
more interesting application may be to combine multiple views for enhanced video sensing via
multiple compressive streams [2], and perhaps even to perform some sort of compressive data fusion
with radar imaging systems on drones to develop a full view of a scene, both indoors via
“through-the-wall radar imaging” [3] and outdoors. A further bonus is that data collected
via compressive imaging is naturally encrypted as well as compressed with no on-board
computing. This will become clearer once we better understand compressive imaging.
1.3.1 Single-Pixel Camera General Setup
Here we begin our introduction of the infrared camera system by introducing the optical
element that displays the measurement vectors $\phi^{(i)}$. This is the digital micromirror device
(DMD) from Texas Instruments, Inc., shown in Figure 1.1 displaying one measurement vector
with an illustration of a series of $m$ measurement vectors, again corresponding to the $m$ rows
of the measurement matrix $\Phi$. The white pixels correspond to $\phi_{ij} = 1$, or we say this is an
‘on’ pixel, and black corresponds to $\phi_{ij} = 0$, or an ‘off’ pixel.
As in Figure 1.2, we focus our target scene, in this case the blue painted card as in
Figure 1.3, onto the digital micromirror device (DMD). The DMD has 1024 × 768 mirrors
with a diagonal length of 13.6 µm, so the maximum resolution in pixels one can attain in
this setup is $n = 768 \times 1024$. The size of the mirrors is compared with an ant leg in Figure
1.2. To acquire images at other resolutions, we operate mirrors in blocks to constitute one
pixel. The light from the scene is collected by a lens and focused onto the DMD. The DMD
displays reshaped rows of the measurement matrix $\Phi$, and the photodetector converts
light intensity to a voltage that is sent to and stored on a computer. The $i$th voltage measurement
serves as the inner product $y_i = \langle \phi^{(i)}, x \rangle$ to yield the set of measurements $y = \Phi x$. An ‘on’
pixel, or $\phi_{ij} = 1$, directs light towards the detector, and an ‘off’ pixel directs light away from
the detector. The mirrors are fixed to only flip ±12° away from parallel with the face of the
DMD. Thus, as in the transform imaging case described above, the DMD encodes the scene,
and the photodetector measures the total intensity of light reflected towards the detector
from the DMD for each measurement vector.
Figure 1.1 : Detail of the DMD along with a photograph of a DMD displaying a 32 × 32
permuted Walsh-Hadamard measurement vector, and a series of M 128 × 128 measurement
vectors above it.
Our measurement matrix is a partial permuted Hadamard measurement matrix, which
we write as
$$ \Phi = R P S_n \tag{1.5} $$
where
$$ S_n = f(H_n). \tag{1.6} $$
Here $H_n$ is the Hadamard matrix of order $n$ and $f$ is a function on $A \in \mathbb{R}^{n \times n}$, $a_{ij}$ being the
elements of $A$, such that
$$
f(a_{ij}) = \begin{cases} 1 & \text{if } a_{ij} = 1 \\ 0 & \text{otherwise.} \end{cases} \tag{1.7}
$$
This follows, but is not identical to, the construction of “S matrices” in Harwit and Sloane [4].
The purpose is to allow us to use Hadamard matrices with a single detector, meaning our
measurement system can only implement measurement matrices $\Phi$ with elements
$\phi_{ij} \in \{0,1\}$. $P$ is an operator that permutes the columns of $S_n$ and $R$ selects a set of rows
indexed by the entries of the set $\Omega$. We take $|\Omega| = m$, which determines the number of
measurements we acquire, and traditionally we have set $\Omega$ to be $m$ integers taken at random
from the set of natural numbers less than or equal to $n$ without replacement. Compressive
sensing theory so far yields stronger guarantees for signal recovery when there is some element
of randomness in the measurement vectors, thus the random permutations of the columns
and then random selection of the rows [5,6].
Figure 1.2 : Illustration of imaging system with DMD detail
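The construction $\Phi = R P S_n$ can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses the Sylvester construction, so $n$ must be a power of two; rows are indexed from 0 rather than 1; and the function names and seed are hypothetical rather than taken from the thesis code.

```python
import random


def hadamard(n):
    """Sylvester-type Hadamard matrix with entries +1/-1; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two for this construction")
    H = [[1]]
    while len(H) < n:
        # Doubling step: [[H, H], [H, -H]]
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def partial_permuted_hadamard(n, m, seed=0):
    """Return (Phi, omega): m randomly chosen rows of the column-permuted 0/1 matrix S_n."""
    rng = random.Random(seed)
    # S_n = f(H_n): entries equal to +1 stay 1, everything else becomes 0.
    S = [[1 if v == 1 else 0 for v in row] for row in hadamard(n)]
    # P: random permutation of the columns.
    perm = list(range(n))
    rng.shuffle(perm)
    SP = [[row[j] for j in perm] for row in S]
    # R: keep the rows indexed by omega, drawn at random without replacement.
    omega = sorted(rng.sample(range(n), m))
    return [SP[i] for i in omega], omega


Phi, omega = partial_permuted_hadamard(n=16, m=4)
```

Every row of the result is a binary pattern that a mirror array could display. In this construction all rows except the all-ones first row of $H_n$ have exactly $n/2$ ones, consistent with the row sums quoted earlier.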
Once we have the measurement matrix $\Phi$, we sequentially display each row, or measurement
vector, from $\Phi$, collecting a measurement $y_i$ for each measurement vector. Once we
have displayed all $m$ vectors $\phi^{(i)}$ to acquire the set of equations $y = \Phi x$, we employ the TVAL3
reconstruction algorithm to recover an image. We implement this scheme as shown in Figure
1.3 and as illustrated in Figure 1.2.
As in Figure 1.2, we focus our target scene, in this case the blue painted card as in
Figure 1.3, onto the digital micromirror device (DMD). The blue square on the card is acrylic
paint. Below the acrylic paint are the letters “IR” in charcoal. We illuminate the card with
an array of 1450 nm infrared light-emitting diodes. Some of that light is able to penetrate
the acrylic paint to either reflect off the card or be absorbed by the charcoal letters. The
light reflecting from the scene is imaged onto the DMD, then collected by another lens, and
the total intensity of light coming from the DMD is measured for $m$ different measurement
vectors by a Hamamatsu photodetector (model no. G6122) sensitive to wavelengths from
1–2 µm, with peak sensitivity at λ = 1.95 µm.
The results of some compressive measurements for δ = .1 and δ = .075 are shown in
Figure 1.4.
Figure 1.3 : Infrared camera setup on the optical table
Figure 1.4 : n = 256 × 256 images of charcoal-painted “IR” behind acrylic paint as imaged
by our compressive imaging system with λ = 1450 nm. Panels: δ ≈ .1, m = 6550; δ ≈ .075, m = 4900.
1.3.2 Image Recovery with TVAL3
After we acquire the measurements, $y$, the challenge is to recover an image that represents
the scene $x$. To do so we regularize the problem, in this case by solving
$$
\text{(TV)} \quad \min_{x} \sum_{i=1}^{n} \|D_i x\|_2 \quad \text{subject to } y = \Phi x \tag{1.8}
$$
where $D_i$ is a “local finite-difference operator” such that $D_i x \in \mathbb{R}^2$. Regularization is the
mathematical process of applying a priori information, in this case knowledge that the
solution $x$ should have a small total variation (TV), $\sum_{i=1}^{n} \|D_i x\|_2$. This is one of many
possible methods of regularization. We will explore two alternative methods in Chapter 4.
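To give a feel for the quantity being minimized in (TV), the sketch below computes $\sum_i \|D_i x\|_2$ for two small images in plain Python. It assumes forward differences in the horizontal and vertical directions and takes a difference to be zero where a pixel has no right or lower neighbor; that boundary choice and the function name are assumptions of this sketch, not a description of TVAL3 itself.

```python
import math


def total_variation(img):
    """Isotropic TV: sum over pixels of the 2-norm of (horizontal, vertical) forward differences."""
    rows, cols = len(img), len(img[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            # D_i x is the pair (dx, dy); a missing neighbor contributes a zero difference.
            dx = img[r][c + 1] - img[r][c] if c + 1 < cols else 0.0
            dy = img[r + 1][c] - img[r][c] if r + 1 < rows else 0.0
            tv += math.hypot(dx, dy)
    return tv


# Piecewise-constant image: a single vertical edge.
flat = [[0, 0, 1, 1] for _ in range(4)]
# Highly oscillating image: a checkerboard.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]

tv_flat = total_variation(flat)
tv_checker = total_variation(checker)
```

The image with a single edge has a much smaller total variation than the checkerboard, which is the sense in which the TV prior favors images made of smooth regions separated by a few sharp edges.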
To solve this we use TVAL3 of Li, Yin, and Zhang, which stands for total variation
minimization by augmented Lagrangian and alternating direction algorithms [7]. TVAL3
traces its roots back to a seminal paper by Courant in 1943 on the quadratic penalty method
[8]. Physicists will be familiar with the general principle of TVAL3, namely the augmented
Lagrangian, which is virtually identical to the Lagrangian formalism of dynamics. The
augmented Lagrangian method is used to solve the problem of Equation 1.8, (TV), as follows.
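First, rewrite (TV) in the equivalent form
$$
\min_{x, w_i} \sum_{i=1}^{n} \|w_i\|_2 \quad \text{subject to } y = \Phi x \text{ and } D_i x = w_i. \tag{1.9}
$$
The augmented Lagrangian for the rewritten problem is given by
$$
L(x, \lambda, \nu) = \sum_{i=1}^{n} \left( \|w_i\|_2 - \nu_i^T (D_i x - w_i) + \frac{\beta_i}{2} \|D_i x - w_i\|_2^2 \right)
- \lambda^T (\Phi x - y) + \frac{\mu}{2} \|\Phi x - y\|_2^2. \tag{1.10}
$$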
TVAL3 solves this problem through a so-called “alternating direction algorithm” developed
by Wang, Yang, Yin, and Zhang specifically designed to solve TV minimization problems
in imaging [9]. For further discussion of TVAL3 including the augmented Lagrangian
formalism, see [10] and [11]. Briefly, the algorithm finds a new approximation for $x$ given
Lagrange multipliers $\lambda$ and $\nu$, and then finds optimal Lagrange multipliers with the updated
$x$, which then are used in another iteration to find a new approximation for $x$, and so on
until $\nabla L(x, \lambda, \nu) < \varepsilon_{tol}$, where $\varepsilon_{tol}$ is a user-defined tolerance parameter, which says that
$L(x, \lambda, \nu)$ has reached its global minimum, guaranteed by a global convergence theorem [11].
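The alternation between updating the unknown and updating the multipliers can be illustrated on a one-variable toy problem. The sketch below is not TVAL3: it minimizes $x^2$ subject to $x - 1 = 0$ with an augmented Lagrangian of the same form as Equation 1.10, $L(x, \lambda) = x^2 - \lambda (x - 1) + \frac{\mu}{2}(x - 1)^2$. The toy objective, the closed-form inner minimization, and stopping on the constraint residual (rather than on the gradient of $L$) are all simplifying assumptions.

```python
def augmented_lagrangian_toy(mu=2.0, tol=1e-10, max_iter=100):
    """Minimize x**2 subject to x - 1 = 0 by alternating x and multiplier updates."""
    lam = 0.0
    x = 0.0
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        # x-update: setting dL/dx = 2x - lam + mu*(x - 1) = 0 gives the exact minimizer.
        x = (lam + mu) / (2.0 + mu)
        residual = x - 1.0
        if abs(residual) < tol:
            break
        # Multiplier update with the same sign convention as Equation 1.10.
        lam -= mu * residual
    return x, lam, iterations


x_opt, lam_opt, n_iter = augmented_lagrangian_toy()
```

For this toy problem the constrained minimizer is $x = 1$ with multiplier $\lambda = 2$, and the iterates approach those values geometrically. A finite penalty $\mu$ suffices because the multiplier update absorbs the remaining constraint violation, which is the practical advantage over a pure quadratic penalty method.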
We used the TVAL3 algorithm with the default options for the coefficients $\beta_i$ and $\mu$ in
Equation 1.10, and default stopping tolerance $\varepsilon_{tol}$, to obtain the 256 × 256 images in Figure
1.4. We achieve very high compression ratios, with subsampling ratios δ = .1 and δ = .075,
corresponding to m = 6550 and m = 4900 respectively.
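As a quick check of these numbers (simple arithmetic added for illustration, using only the values quoted above):

```latex
n = 256 \times 256 = 65{,}536, \qquad
0.1\,n = 6553.6, \qquad
0.075\,n = 4915.2,
\qquad
\frac{6550}{65{,}536} \approx 0.0999, \qquad
\frac{4900}{65{,}536} \approx 0.0748 .
```

The reported values of m are therefore rounded, and they correspond to roughly 10 and 13 times fewer measurements than pixels.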
1.4 Other Applications and Outline of the Thesis
By way of the above example, it should be clear that compressive sensing is a powerful
method to reduce the resources required to acquire information. There are many applications
for compressive sensing outside of imaging. Much of the groundwork for compressive sensing
had been laid down before the breakthroughs by Candès et al. [5] and Donoho [6]. The
paradigm shift came out of the recognition that successful recovery of a signal depends on
a quantifiable dependence between how we acquire a signal and the structure of that signal.
Compressive sensing grew out of efforts to find the simplest accurate representation of a
signal.
Since CS guides one to acquire more information with fewer resources, it has found a wide
range of applications in science and engineering. Because measurements are already in a
compressed form, no on-board compression is required. As such, there have been proposed
CS systems for astronomy [12] and hyperspectral remote sensing [13,14]. Other examples in
physics include radio astronomy-based cosmology [15], and quantum state tomography [16],
which allows one to more accurately determine the state of a collection of, for example,
electron spins, which has applications in quantum computing.
Compressive sensing has also been applied to medicine and biology. CS holds much
promise to improve magnetic resonance imaging because of the greatly reduced acquisition
time it allows [17, 18]. Thus, when the patient is less able to control their movements, as
with children, or if time is of the essence as it often is in medicine, CS techniques could be
of immense benefit.
Shental, et al [19], identify carriers of “rare variants” of disease via CS techniques applied
to group testing. Erlich, et al [20], applied a similar method to identify genetic disease
in Ashkenazi Jews, and then extended this work to “Compressed Genotyping” to identify
genetic variation of any sort in any number of individuals [21]. Machines known as DNA
1.4. OTHER APPLICATIONS AND OUTLINE OF THE THESIS 14
microarrays determine the sequence of base pairs in DNA. Dai, et al., [22] and Sheikh,
et al., [23] both explore the application of compressive sensing to DNA sequencing with
DNA microarrays. One of the most interesting applications of compressive sensing straddles
the disciplines of biology and physics. AlQuraishi and McAdams suggest a method for using
compressive sensing to effectively learn a physical model for protein-DNA interactions, which
could have important applications in drug delivery, disease treatment, and fundamental
genomics [24].
In the sequel, we explore the application of compressive sensing to microscopic imaging
in Chapter 2, introduce a novel compressive imaging system based on circulant matrices in
Chapter 3, and then study numerically the efficacy of circulant matrices for general com-
pressive sensing in Chapter 4. The final chapter will be dedicated to final thoughts and
future directions with some preliminary data from adaptive sensing experiments, where the
$i$th measurement vector depends on the result $y_{i-1} = \langle \phi^{(i-1)}, x \rangle$. Although the work here is
dedicated to imaging, many of the results, especially those of Chapter 4, might have parallel
applications or provide insight into problems from the varied fields of application mentioned above.
Chapter 2
Compressive Microscopy
Now we turn our focus from novel imaging architectures and their characterization to con-
crete applications to real-world imaging systems. Raman imaging is a prototypical example
of a measurement system that benefits substantially from reducing the time required to ac-
quire a Raman microscopic image. In our lab, and as is common in other laboratories as
well, high-resolution raster scan Raman images require a few to tens of hours of acquisition
time. Each location must be probed individually, resulting in a measurement akin to that
described in . Below we will discuss in more detail the nature of acquiring data for Raman
microscopy.
2.1 Compressive Microscopy
In this section, we describe the various forms our microscope system could take, as well
as the measurement formalism we need to describe them. To do that we first introduce
the standard, non-compressive raster scanning system. Then we will introduce compressive
microscope systems, including a simple method to choose the best parameters for TVAL3,
first introduced in Section 1.3.2, that can easily be extended to other imaging systems.
2.1.1 Raster Scanning Microscope Systems
A typical, simplified raster scanning microscope setup is shown in Figure 2.1. A laser is sent
through a beamsplitting mirror and into the back of an objective lens, which focuses the laser
light, ideally to a diffraction-limited point. Reflected and/or scattered light is collected by the
same objective lens, then directed back through the beamsplitter to either a photodetector,
in the case of standard imaging, or a spectrometer for hyperspectral imaging (i.e. imaging with
more than the three red, green, and blue color channels). We use the same idealized measurement
formalism as before. In the simple raster scan imaging system we have the measurements,
$y$, given as

$$y = Ix, \tag{2.1}$$

where $I$ is the $n \times n$ identity matrix. As in a pixel array, each discretized point in the
scene is sampled by one row of the measurement matrix $\Phi = I$, and the measurement $y_i$ is proportional
to the number of photons registered by the detector during the acquisition time, $t_{\mathrm{acq}}$. In
practice it is stored on the computer as the voltage reading from a photodetector, caused by
incident photons dislodging electrons. To physically sample each point in the image,
the sample stage moves in the “x-y” directions (not to be confused with the $x$ and $y$ vectors
from our measurement formalism) in discrete steps, the size of which determines the spatial
resolution of the image. All that is needed to recover an image from the set of measurements
$y = \Phi x$ when $\Phi$ is the identity is a proper accounting of which point was illuminated when.

Figure 2.1 : Raster Scanning Hyperspectral Microscopy Setup
If instead we are performing hyperspectral microscopy, then instead of $y \in \mathbb{R}^{n}$ we have
$y \in \mathbb{R}^{n \times n_{\mathrm{spec}}}$, where $n_{\mathrm{spec}}$ is the resolution of the spectrometer. The spectrometer works like a
prism to disperse the incoming light scattered or reflected by the sample, and then measures
how much light at a set of discrete wavelengths is present at each point. In fact,
the light is dispersed by a diffraction grating, a reflective optical element cut with grooves.
Each wavelength of light reflects at a different angle, causing the prism-like dispersion. A
highly rectangular CCD pixel array is calibrated so that the light striking a portion of it is
registered as, for example, $\lambda = 632.5$ nm. The Ocean Optics USB4000 spectrometer we use
is sensitive to light from 200-1100 nm with a Toshiba TCD1304AP Linear CCD array that
has a resolution of 3648 pixels, or 3648 possible wavelength bins. Although the spectrometer
is sensitive to light in this wide wavelength range, oftentimes a higher resolution in $\lambda$ is
exchanged for a smaller range of wavelength values to be probed. This technique is useful for
Raman imaging, discussed below, but also for fluorescence microscopy, a popular technique
used extensively in biological and medical research that allows for identification of various
parts of a cell by functionalizing fluorophores, proteins that absorb then emit at specific
wavelengths, to attach to specific parts of a cell. Before we discuss the Raman effect and its
usefulness in more detail, we describe how we modify a white-light illumination microscope
to employ compressive sensing, as well as some calibration data. We will introduce two
options for a microscope setup for laser-illumination in Section ??.
Figure 2.2 : A simple addition of our single-pixel camera to a standalone Zeiss microscope
system.
2.1.2 Compressive Microscopy Setup
As described in the introduction, we will acquire measurements $y = \Phi x$ with $\Phi$ being
a randomly permuted and subsampled Hadamard matrix as given in Equation 1.5. The
experimental setup is essentially the same, except our image is collected using microscope
optics. The simplest method for implementing compressive microscopy is to collect
light from a prebuilt microscope, as shown in Figure 2.2. Here, an image collected by the
internal optics of the microscope is projected out from the exit aperture, collected by
a lens, then sent to the DMD via a rotated mirror. Then an eyepiece collects the light
corresponding to $y_i = \langle \phi^{(i)}, x \rangle$ and focuses it onto the photodetector, which, as discussed
previously, converts light intensity to an analog voltage signal, which is then converted to
a digital signal at the analog-to-digital converter (ADC) and saved on the computer.
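As a rough illustration of this kind of measurement matrix, the following sketch builds a subsampled, column-permuted Hadamard matrix and applies it to a test vector. The exact construction of Equation 1.5 is not reproduced here, so the Sylvester construction, the choice to permute columns and subsample rows, and the names `sylvester_hadamard` and `subsampled_hadamard` are all assumptions made for illustration.

```python
import numpy as np

def sylvester_hadamard(n):
    """Return the n x n Hadamard matrix with +1/-1 entries (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def subsampled_hadamard(n, delta, seed=0):
    """Keep m = round(delta * n) random rows of a column-permuted Hadamard matrix."""
    rng = np.random.default_rng(seed)
    m = int(round(delta * n))
    H = sylvester_hadamard(n)
    cols = rng.permutation(n)                     # random column permutation (assumed)
    rows = rng.choice(n, size=m, replace=False)   # random row subset (assumed)
    return H[rows][:, cols]

n, delta = 64, 0.5
Phi = subsampled_hadamard(n, delta)
x = np.random.default_rng(1).random(n)   # stand-in for a vectorized scene
y = Phi @ x                              # m compressive measurements
```

The rows of `Phi` remain mutually orthogonal after permutation and subsampling, which is one reason Hadamard-based patterns are convenient.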
Some calibration images of the standard Air Force test target AF-1951 taken with this
setup are shown below. With $\delta = 0.95$ and again the default parameters for $\beta$ and $\mu$ we
acquired the image in Figure ??. The smallest bars are 2.2 µm wide, which sets the field
of view to about 40 µm$^2$. The image is of fair quality, at best. In fact, especially in low-light
situations, the default parameters, or even the recommended range of parameters, are not optimal.
To determine the best choice of $\beta$ and $\mu$, we solve and plot a series of solutions $x$ to the
optimization problem (TV) in Equation 1.8, as in Figure 2.4. Then, either using some mathematical heuristic
or simply by visual inspection, one may choose the optimal parameters.
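The parameter search described above amounts to a grid sweep. The sketch below shows one way to organize it: it runs a user-supplied solver over a grid of $(\beta, \mu)$ values and records each result with its runtime. The function name `sweep_parameters`, the grid values, and the stand-in solver `dummy_reconstruct` are hypothetical; a real sweep would wrap a call to TVAL3 or another total-variation solver.

```python
import itertools
import time

def sweep_parameters(reconstruct, y, betas, mus):
    """Run reconstruct(y, beta, mu) over a parameter grid, timing each call."""
    results = []
    for beta, mu in itertools.product(betas, mus):
        start = time.perf_counter()
        x_hat = reconstruct(y, beta, mu)
        elapsed = time.perf_counter() - start
        results.append({"beta": beta, "mu": mu, "x": x_hat, "seconds": elapsed})
    return results

def dummy_reconstruct(y, beta, mu):
    """Placeholder solver so the sketch runs; it is NOT a TV reconstruction."""
    return [value * mu / (mu + beta) for value in y]

# Illustrative grid only; sensible ranges depend on the solver and the data.
grid = sweep_parameters(dummy_reconstruct, [1.0, 2.0],
                        betas=[2**5, 2**7], mus=[2**8, 2**10])
```

Each entry of `grid` can then be plotted side by side, with the runtime printed beneath it, for visual comparison.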
2.2 Raman Imaging
2.2.1 The Raman Effect
The Raman effect was first discovered by Indian physicist C.V. Raman in 1928. Briefly, the
Raman effect is an optical, quantum mechanical effect that probes the vibrational properties
of a material, be it in the gas, liquid, or solid phase. As is common knowledge, some of the
light reflected from a material is the same wavelength as the incident light; the technical
term is Rayleigh scattering. However, if the material is ‘Raman-active,’ there will also be
other wavelengths of light ‘reflected’ as well. More accurately, this light is scattered instead
of reflected, and we say it is Raman-scattered light. For readers familiar with fluorescence,
this may sound like fluorescence, but it is not the same effect. The only true similarity is
that both effects are mediated by the absorption of quanta of light, photons, by the negative
charge carriers of a material, electrons.

Figure 2.3 : Reconstruction of $n = 128 \times 128$ image of AF Test Target 1951 with $\delta = 0.95$
and default TVAL3 parameters $\beta$, $\mu$.
Figure 2.4 : Method for determining best parameters for TVAL3 reconstruction. We see that
various choices for $\beta$, $\mu$ result not only in different quality reconstructions, but also different
reconstruction times (in seconds) for each, as indicated below each reconstruction.

One of the most basic principles of physics, the conservation of energy, is the route by
which Raman scattering occurs. Photons of different wavelength have
different energy. A beam of light consisting of 100 ultraviolet photons can be more energetic
than a beam of light with 1,000 infrared photons. However, in optics we would say that
the infrared beam is more intense, since there are more photons. The energy $E$ of a single
photon is directly proportional to the angular frequency of the photon, $\omega$,

$$E = \hbar\omega. \tag{2.2}$$
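To make the comparison of beam energies concrete, the short sketch below evaluates $E = \hbar\omega = hc/\lambda$ for two beams. The wavelengths (200 nm for the ultraviolet photons and 10 µm for the infrared photons) are illustrative assumptions; with infrared wavelengths closer to the visible, the 1,000-photon beam would carry more energy.

```python
H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact SI value)
C_LIGHT = 299792458.0       # speed of light in m/s (exact SI value)

def photon_energy_joules(wavelength_m):
    """Energy of one photon, E = h c / lambda, equivalent to E = hbar * omega."""
    return H_PLANCK * C_LIGHT / wavelength_m

def beam_energy_joules(n_photons, wavelength_m):
    """Total energy of n_photons identical photons."""
    return n_photons * photon_energy_joules(wavelength_m)

uv_beam = beam_energy_joules(100, 200e-9)    # 100 photons at an assumed 200 nm
ir_beam = beam_energy_joules(1000, 10e-6)    # 1,000 photons at an assumed 10 um
print(f"UV beam: {uv_beam:.3e} J, IR beam: {ir_beam:.3e} J")
```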
In the Raman effect, a beam of incident light of a certain energy, i.e. ‘color’,
enters the material and light of a different energy exits. Therefore, energy was lost or
gained. With the Raman effect it is more likely to observe a lower energy, or red-shifted,
photon, so let’s assume energy was lost. Where did this energy go?
The answer is that the light stimulated vibrations in the material. To understand this
better, let’s consider a diatomic molecule, say O$_2$ for example, as a pair of weights on a
spring. The nuclei are the weights, and the chemical bond (sharing of two electrons between
the nuclei) is the spring. The spring has a stiffness, $K$, corresponding to the strength of the
chemical bond. Assume that light, written as an electric field

$$E(t) = E_0 \cos(\omega t + \delta_k), \tag{2.3}$$

is incident on the molecule. In general the polarization vector of a diatomic molecule may
be written as

$$p = \alpha E, \tag{2.4}$$

where $\alpha$ is the polarizability, or susceptibility to a change in polarization. In general it is an
arbitrary tensor; however, in this case we assume it acts in only one dimension and we take
it to only linear order,

$$\alpha(x) = \alpha(0) + \left.\frac{d\alpha}{dx}\right|_{x=0} x. \tag{2.5}$$

We assume that $x$ is the solution of a simple harmonic oscillator, so that

$$x(t) = A\cos(\omega_1 t), \tag{2.6}$$

where

$$\omega_1 = \sqrt{\frac{K}{\mu}} \tag{2.7}$$

and $\mu$ is the reduced mass of the two nuclei. Combining Equations (2.3) through (2.6), and
writing $\alpha_0 = \alpha(0)$ and $\alpha'(0) = \left.\frac{d\alpha}{dx}\right|_{x=0}$, we get

$$p(t) = \alpha_0 E_0 \cos(\omega t + \delta_k) + A\,\alpha'(0)\,E_0 \cos(\omega t + \delta_k)\cos(\omega_1 t). \tag{2.8}$$

Using the trigonometric identity $\cos a \cos b = \tfrac{1}{2}\left[\cos(a - b) + \cos(a + b)\right]$ we write

$$p(t) = \alpha_0 E_0 \cos(\omega t + \delta_k) + \tfrac{1}{2} A\,\alpha'(0)\,E_0 \left\{ \cos\left[(\omega - \omega_1)t + \delta_k\right] + \cos\left[(\omega + \omega_1)t + \delta_k\right] \right\}. \tag{2.9}$$
Thus the polarization changes as a function of time, meaning that we have the acceleration of
charge at the frequency $\omega$, corresponding to Rayleigh scattering (common
reflection), and at the Raman-scattered frequencies $\omega_R = \omega \pm \omega_1$ [25]. The minus sign
says that light is scattered at a frequency less than the incident frequency, so with energy

$$E = \hbar(\omega - \omega_1).$$

Since in terms of color this is a shift towards the ‘red’ end of the electromagnetic spectrum,
we call this photon red-shifted. The plus sign corresponds to a blue-shift, or equivalently a
gain in energy for the incident photons,

$$E = \hbar(\omega + \omega_1).$$

The photons red-shifted via the Raman effect are called Stokes-shifted, and the blue-shifted
photons are called anti-Stokes, after the famous Irish-born nineteenth-century physicist Sir
George Stokes. There is a roughly $10^{-6}$ chance that an incident photon will be Stokes-scattered,
and about a $10^{-8}$ chance that an incident photon will be anti-Stokes scattered.
Thus, the fact that Raman originally observed this previously anomalous effect with modest
optics and sunlight is quite amazing. Today, Raman spectroscopy and microscopy are performed
with nearly single-frequency laser light, an array of precision optics, and cooled CCD
spectrometers. Although it is such a weak effect, it is a very powerful method for probing
the chemical structure of a sample, as we will see in the following section.
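The appearance of the shifted frequencies $\omega \pm \omega_1$ can be checked numerically. The sketch below builds $p(t) = [\alpha(0) + A\alpha'(0)\cos(\omega_1 t)]\,E_0\cos(\omega t)$ with $E_0 = 1$ and $\delta_k = 0$, then inspects its spectrum. The frequencies (50 Hz and 8 Hz), the modulation depth of 0.2, and the sampling settings are arbitrary assumptions chosen so that every tone falls exactly on an FFT bin; they are not physical values.

```python
import numpy as np

def polarization_spectrum(f0=50.0, f1=8.0, alpha0=1.0, mod_depth=0.2,
                          fs=1024.0, n_samples=4096):
    """One-sided amplitude spectrum of a polarizability-modulated field."""
    t = np.arange(n_samples) / fs
    field = np.cos(2 * np.pi * f0 * t)                       # incident field E(t)
    alpha = alpha0 + mod_depth * np.cos(2 * np.pi * f1 * t)  # alpha(0) + A alpha'(0) cos(w1 t)
    p = alpha * field                                        # p(t) = alpha(t) E(t)
    amplitude = 2 * np.abs(np.fft.rfft(p)) / n_samples
    freqs = np.fft.rfftfreq(n_samples, d=1 / fs)
    return freqs, amplitude

freqs, amplitude = polarization_spectrum()
peaks = freqs[amplitude > 1e-6]
print(peaks)   # frequencies at which p(t) has appreciable spectral weight
```

The central line plays the role of Rayleigh scattering, and the two sidebands play the roles of the Stokes and anti-Stokes lines, each with half the modulation amplitude, as in Equation 2.9.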
2.2.2 Determining Chemical Structure from the Raman Spectrum of a Material
The astute and informed reader will note that in order for the above derivation to be valid,
the first derivative of the polarizability must satisfy $\alpha'(0) = \left.\frac{d\alpha}{dx}\right|_{x=0} \neq 0$. This defines a so-called selection
rule for Raman scattering: the first derivative of the polarizability with respect to space
must be nonzero. For the stretching vibration of the O$_2$ molecule, $\alpha'(0) \neq 0$, so this
vibration is Raman-active even though it does not appear in the infrared spectrum. Indeed, one of the first important
applications of Raman scattering was to show that carbon dioxide, CO$_2$,
had hitherto unknown vibrational modes, or in terms of Equation 2.9, unknown $\omega_1$ [26, 27].
Our above derivation accounted only for vibrations in one dimension, but as also mentioned
above, the polarizability $\alpha$ is a tensor in general, so there could be nine total nonzero partial
first derivatives with respect to space. Rotational modes are also allowed, so that if
an incident photon causes a molecule to rotate, that might also be reflected in the Raman
spectrum through a blue- or red-shifted peak.
The example of CO$_2$ is also important for us because before the Raman spectrum was
acquired for that molecule, its infrared (IR) spectrum, which also tells us about vibrational
modes, but for a different set of selection rules, was known. The vibrational modes discovered
by Raman spectroscopy differed from those discovered by infrared spectroscopy, but the
vibrations were on the same order of energy. This illustrates how Raman spectroscopy
allows one to use light in the visible part of the electromagnetic spectrum, with wavelength
$\lambda \approx 400$ to $700$ nm, to probe energies that would correspond to photons in the infrared,
with wavelengths from about 0.8 µm to 100 µm and larger. To detect such photons requires more
exotic light sources as well as more exotic detectors compared to visible light. Thus, Raman
spectroscopy provides a simpler, complementary method for investigating the vibrations of
molecules.
Raman spectroscopy is not limited to probing vibrations of molecules. It is also possible
to determine the vibrational modes of solids, known as phonons, as in quanta of sound, just
like the photon is a quantum of light. We model a solid as a lattice of masses connected by
springs instead of just two or three masses connected by springs in the case of molecules in
either the gaseous or liquid state. When light is incident on the lattice, it either reflects as in
Rayleigh scattering, or it creates a phonon, which again is a vibration that travels through
the solid, and a lower-energy photon. It is sensible to call this a quantum of sound, because it
is precisely vibrations of nuclei, transmitted through electron-electron interactions, that are
responsible for the majority of thermal conductivity and transmission of sound in materials.
Among the many important applications of Raman scattering in solids are stress and
strain analysis for silicon technologies and the investigation of the properties of graphene and
carbon nanotubes, including how many layers of graphene are present in a graphene sample,
how many layers comprise a nanotube, and what diameters of nanotubes are present
in a sample. We choose these two examples because, as a proof of concept of the applicability
of compressive sensing techniques to Raman microscopy, we show that compressive sensing
is indeed effective for laser-illumination based microscopy, and that, through simulations on
actual Raman microscopy data, compressive sensing may be used to discriminate graphite
from silicon, which suggests it might be effective for more complex samples.
2.2.3 Raster Scanning Raman Microscopy
Raman microscopy is a powerful experimental technique to determine the spatial distribution
of substances in a sample. We will show a simple example of this with experimental data
acquired on the commercial Renishaw Raman microscopy system, followed by two proposed
architectures for laser-illuminated microscopy with the DMD. The example is graphite on
silicon dioxide, which is a toy model for a more interesting system, graphene on silicon
dioxide. Graphene is a single layer of carbon atoms in a 2D lattice; however, the term graphene
is also used to describe several layers stacked on top of one another. Roughly speaking, the
material is called graphene up to about ten layers, beyond which it is simply graphite. Graphite and
graphene share similar features in their Raman spectra since both are carbon allotropes,
and in fact Raman spectroscopy is one method for determining whether a carbon sample
is single-layer, double-layer, or more-layered graphene or simply graphite. Graphene is
a widely studied material because of its novel conduction properties, strength, and overall
novelty, and was the subject of a recent Nobel Prize in Physics. Silicon also has two prominent
peaks in its Raman spectrum, and the Raman spectra of both silicon and graphite are
shown in Figure 2.5. We focus only on the so-called ‘G’ peak of graphite with wavenumber
$k \approx 1590\ \mathrm{cm}^{-1}$, and the well-known silicon peak at $k = 520\ \mathrm{cm}^{-1}$.
We image the boxed region with the graphite flake shown in Figure 2.6 by sensing how intense
the Raman ‘G’ peak is. If the peak is not there, then we know that it is the silicon substrate
at that point, and if we observe the ‘G’ peak then we know it is graphite. By raster scanning
over the sample and acquiring a spectrum as in Figure 2.5 at each point, as explained in Section 2.1.1, we
can build a Raman image of the region of interest. The Raman image acquired by raster
scanning is shown in Figure 2.7. The brighter the pixel, the larger the maximum value of
the ‘G’ peak. To test the efficacy of compressive imaging for Raman imaging, we simulate
compressive acquisition where $\Phi \neq I$ and $\Phi \in \mathbb{R}^{m \times n}$ with $m < n$. Instead of having the rows of $\Phi$ be
randomly permuted Walsh-Hadamard vectors, $\Phi$ is a partial circulant matrix, as explained
in Chapter 3. We hold off further discussion of the details of partial circulants for now. To
reconstruct an image from measurements taken with partial circulant $\Phi$, we use the Rec PC
algorithm of Yin, et al., [28], also to be further discussed in Chapter 3. We only need to say
here that we used default settings for user-defined parameters and that the reconstruction
algorithm recovers the solution $x$ according to the same problem (TV) in Equation 1.8.
We look at the reconstruction of $x$ from the compressive measurements for a few values of $\delta$ in
Figure 2.8. Data on convergence across a range of $\delta$ values is given in Figure 2.9. We see
that computational resources increase non-linearly with $\delta$, as expected.

Figure 2.5 : The Raman spectrum of silicon dioxide substrate and graphite. Panels: silicon Raman
spectrum, $k = 520\ \mathrm{cm}^{-1}$ peak; graphite Raman spectrum, ‘G’ peak. A typical measurement
(compressive or not) in hyperspectral microscopy could be like either one of these,
or have two peaks together, or possibly contain any number of peaks corresponding
to different materials present in the sample.
Figure 2.6 : Visible light view of the graphite flake on silicon substrate to be imaged via the
Raman effect.
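The step from a stack of spectra to a Raman image can be sketched as follows: for each pixel, take the maximum intensity inside a window around the peak of interest. The array layout `(ny, nx, nspec)`, the 50 cm$^{-1}$ half-width, the Lorentzian line shapes, and the tiny synthetic $2 \times 2$ scene are all assumptions for illustration, not the processing applied to the Renishaw data.

```python
import numpy as np

def peak_image(cube, wavenumbers, center, half_width=50.0):
    """Map each pixel to its maximum intensity within center +/- half_width.

    cube has shape (ny, nx, nspec); wavenumbers has shape (nspec,) in cm^-1.
    """
    band = np.abs(wavenumbers - center) <= half_width
    if not band.any():
        raise ValueError("no spectral samples fall inside the requested band")
    return cube[:, :, band].max(axis=2)

def lorentzian(k, k0, width=15.0):
    """Unit-height Lorentzian line centered at k0 (synthetic line shape)."""
    return 1.0 / (1.0 + ((k - k0) / width) ** 2)

# Synthetic 2 x 2 scene: graphite on the diagonal, bare substrate elsewhere.
wavenumbers = np.linspace(400.0, 1800.0, 141)
silicon = lorentzian(wavenumbers, 520.0)
graphite = lorentzian(wavenumbers, 1590.0)
cube = np.empty((2, 2, wavenumbers.size))
cube[0, 0] = graphite
cube[0, 1] = silicon
cube[1, 0] = silicon
cube[1, 1] = graphite

g_image = peak_image(cube, wavenumbers, center=1590.0)   # bright where graphite is
```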
2.2.4 Laser-illuminated Compressive Sensing Microscope System
To finish this chapter, we present preliminary data from our home-built compressive sensing
microscope system, whose optics on the table are shown in Figure 2.10. With this setup we obtained the
images shown in Figure 2.11.
With the proper equipment, it is clearly possible to acquire Raman microscopic images
in a fraction of the time it would take with traditional methods. Because of some not yet
understood spectral phenomena arising from the diffractive properties of the DMD, it may
be better to construct the microscope as we have, as shown in Figure 2.12, where instead of
implementing the measurement vectors by patterning light coming from the scene, the DMD
structures the laser light sent to illuminate the scene, which is mathematically equivalent.
Studer, et al. [29], employ essentially the same setup for fluorescence microscopy for biology.

Figure 2.7 : Raman image of the graphite flake on silicon substrate acquired by raster
scanning. The resolution is $n = 33 \times 33$.
Figure 2.8 : Raman image recovered from simulated compressive measurements for some
values of $\delta = m/n$.
Figure 2.9 : Time for Rec PC to recover solution $x$ as a function of $1 - \delta$, where $\delta$ is the
subsampling ratio. The dependence is non-linear.
Figure 2.10 : Laser-illuminated compressive sensing experimental setup
Figure 2.11 : Images of the smallest target on the AF Test Target 1951-A. Panels: AF target
bars at $n = 128 \times 128$ and at $n = 256 \times 256$. The bars are
2.2 µm wide. Images taken with a 100x/0.9 NA Zeiss EC Epiplan-Neofluar lens. Small field of
view, high magnification.
Figure 2.12 : An alternative laser-illuminated compressive sensing experimental setup
Chapter 3
Circulant Matrices for Compressive Imaging
In this chapter we explore the use of circulant matrices for imaging, employing measurement
matrices $\Phi$ whose $m$ rows are taken from an $n \times n$ circulant or block-circulant matrix
that we will denote $\bar{\Phi}$. We will define a circulant matrix mathematically, explain its utility,
and show how circulant matrices can result in a more versatile, efficient imaging system.
Much of this material will also serve as an introduction for Chapter 4. The work presented
in Chapter 4 was motivated by the results of our imaging experiments presented in this
chapter.
3.1 Theory of circulant matrices for imaging
3.1.1 Properties of circulant matrices
Circulant matrices “underpin elementary harmonic analysis” (Aldrovandi, 2001 [30]) because
of their special relationship to the Fourier transform. This relationship enables us to more
carefully design our measurement matrices, but maintain a fast matrix-vector multiply in
the form of the Fourier transform. To see why, let us explicitly write a circulant matrix.
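A circulant matrix is a matrix $C \in \mathbb{R}^{n \times n}$ with entries $t_i \in \mathbb{R}$, $i = 0, 1, \dots, n-1$, such
that

$$C = \begin{pmatrix}
t_0 & t_{n-1} & t_{n-2} & \cdots & t_1 \\
t_1 & t_0 & t_{n-1} & \cdots & t_2 \\
t_2 & t_1 & t_0 & \cdots & t_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
t_{n-1} & t_{n-2} & t_{n-3} & \cdots & t_0
\end{pmatrix}. \tag{3.1}$$

Such a matrix is also sometimes referred to as a convolution matrix. To see why, consider
the first column of $C$,

$$c = (t_0, t_1, \dots, t_{n-1}),$$

so that $c \in \mathbb{R}^n$. Define the (circular) convolution operator for vectors $a, b \in \mathbb{R}^n$ to be

$$(a * b)_k = \sum_{i=0}^{n-1} a_i b_{k-i}, \quad k = 0, 1, \dots, n-1, \tag{3.2}$$

with the index $k - i$ taken modulo $n$; then we see that for $x \in \mathbb{R}^n$

$$Cx = c * x. \tag{3.3}$$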
This fact allows a fast matrix-vector multiply on a binary computer via the fast Fourier
transform (FFT). Let $F \in \mathbb{C}^{n \times n}$ be the Fourier transform matrix

$$F_{jl} = \frac{1}{n} e^{2\pi i (j-1)(l-1)/n} \quad (j, l = 1, 2, \dots, n).$$

Now apply the identity $I = F^{-1}F$ to the right hand side of Equation 3.3 and use the
convolution rule for the Fourier transform to obtain

$$\begin{aligned}
Cx &= c * x \\
   &= F^{-1}F(c * x) \\
   &= F^{-1}\left[(Fc) \odot (Fx)\right] \\
   &= F^{-1} D F x,
\end{aligned}$$

where $\odot$ denotes the entrywise product, constant factors from the normalization of $F$ are
absorbed into $D$, and $D = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n) = \mathrm{diag}(\lambda)$. The vector $\lambda$ is not only the Fourier transform of the vector $c$,
but also the set of eigenvalues of the matrix $C$, which reveals one more remarkable property of
circulant matrices, namely that they are diagonalized by the Fourier transform, or, equivalently,
the eigenvectors of a circulant matrix are the columns of the Fourier matrix, $F$. For more
details see [30–32]. For an interesting application to machine multiplication for two numbers
with arbitrary digits and precision, see Knuth, 1981 [33].
So, the utility of circulant matrices for compressive sensing should now be clear. All we
need to do is define a seed vector $c$, compute its Fourier transform, which gives us $\lambda$, and
perform two FFTs modulated by the entries of $\lambda$. Thus, the computation of the matrix-vector
multiply $Cx$ will not take $O(n^2)$ operations, but instead $O(n \log n)$ operations. Even just
for $n = 128 \times 128 = 16384$, the resolution of the images we present below, we see a substantial
decrease in the number of operations since $n^2 \approx 2.7 \times 10^8$ and $n \log_2 n \approx 2.3 \times 10^5$, three
orders of magnitude difference. This savings is essential for an efficient recovery algorithm.
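The FFT shortcut can be verified directly on a small example. The sketch below forms a circulant matrix explicitly, taking the seed vector $c$ to be its first column so that $C_{jk} = c_{(j-k) \bmod n}$, and compares the dense product with the FFT-based one. The helper names and the size $n = 256$ are assumptions for illustration; NumPy's unnormalized FFT convention is used, so no extra scale factor appears.

```python
import numpy as np

def circulant_from_first_column(c):
    """Dense circulant matrix with C[j, k] = c[(j - k) mod n]."""
    c = np.asarray(c)
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

def circulant_matvec_fft(c, x):
    """Compute C x in O(n log n) as ifft(fft(c) * fft(x))."""
    lam = np.fft.fft(c)                       # eigenvalues of C
    return np.real(np.fft.ifft(lam * np.fft.fft(x)))

rng = np.random.default_rng(0)
n = 256
c = rng.standard_normal(n)
x = rng.standard_normal(n)

C = circulant_from_first_column(c)
dense = C @ x                        # O(n^2) reference
fast = circulant_matvec_fft(c, x)    # O(n log n)
print(np.max(np.abs(dense - fast)))  # difference at the level of rounding error
```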
3.2 Imaging with Partial Circulant Measurement Matrices
In this section we follow the structure of Chapter 1 and introduce the imaging system before
diving too deeply into the mathematics. The advantage we gain from circulants for imaging
is that we may pattern four copies of the first row, or seed vector, $\phi^{(1)}$, of the measurement
matrix $\Phi$ onto an optical plate. An $n = 32 \times 32$ example of $\phi^{(1)}$ is shown in Figure 3.1.
By shifting the plate, we generate rows of a block circulant matrix, to be explained in more
detail below.

Figure 3.1 : $\phi^{(1)} \in \mathbb{R}^{1024}$ reshaped to 2D. White squares represent $\phi^{(1)}_k = 1$, black squares
represent $\phi^{(1)}_k = 0$, $k = 1, \dots, n$.
3.2.1 Imaging Setup
The imaging setup is identical to those introduced in the previous chapters, except now
instead of the DMD directing light towards or away from a photodetector via reflection, we
have an optical plate patterned with a mask to either allow the light to pass through or block
light from a pixel. This corresponds to $\phi_{ij} \in \{1, 0\}$, again with a 1 being represented by
white in Figures 3.1 and 3.3, and furthermore, as before, $\phi_{ij} = 1$ is an element that allows
light to pass to the detector, and $\phi_{ij} = 0$ is an element that blocks light.
On a single optical plate we pattern four copies of $\phi^{(1)}$ to make a $2N \times 2N$ pixel grid. By
overlaying an $N \times N$ selection mask, shown in dashed green in Figure 3.2, we can generate
all $n$ measurement vectors by moving the selection mask. Each selected $N \times N$ square
corresponds to a row of $\Phi$, as illustrated in Figure 3.2. In practice, we would not actually
move the selection mask since this would also entail moving the photodetector. Instead we
move the optical plate itself, as illustrated in Figure 3.3.
Figure 3.2 : Four copies of the seed vector $\phi^{(1)}$ patterned onto an optical plate. By shifting
a selection mask (represented by the red box) to select one measurement vector at a time,
we generate, or ‘select,’ measurement basis vectors from $\Phi$, reshaped to $32 \times 32$. Panels
show $\phi^{(1)}$, $\phi^{(14)}$, and $\phi^{(32)}$.
We define

$$\Omega = \{\omega_1, \dots, \omega_m\} \subseteq \{0, 1, \dots, N-1\} \times \{0, 1, \dots, N-1\} \tag{3.4}$$

to represent the number of column shifts and row shifts of the optical plate used to acquire
the measurements $y$. If we acquire $m$ measurements $y_i = \langle \phi^{(i)}, x \rangle$, then $|\Omega| = m$. There are
four different methods of creating $\Omega$ that we explore in this work:
1. Sequential: Starting with the selection mask in the lower left corner, shift the selection
mask one row at a time, $N-1$ times. Then shift by one column and again perform
$N-1$ row shifts. Repeat until $m$ unique regions are selected by the selection mask.
This is illustrated in Figure 3.2.
2. Box: Do an equal number ($\lceil \sqrt{m} \rceil$) of row and column shifts.
3. Random: Select $m$ shifts $\omega_i$ at random.
4. Random Walk: Restrict the plate to shift one row or column at a time, but take the
step at random. The result is a random walk. At this time we do not optimize for
self-crossings and require that $m$ unique points are generated instead of $m$ steps taken.

Figure 3.3 : By shifting one row or column of the mask at a time, we can generate all
$n = N \times N$ rows of a block circulant matrix $\bar{\Phi}$. The optical system is identical to the
DMD-based setup, where a lens L2 focuses an image of the scene, represented by the arrow,
onto the mask. The light that the mask allows to pass, corresponding to an ‘on’ pixel, or $\phi_{ij} = 1$, is
collected by the lens L1 and directed towards the photodetector for measurement. The diagram
labels the column shift, the row shift, the lenses L1 and L2, and the photodetector.
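The four ways of choosing the shift set can be sketched in a few lines. Each function below returns a list of $m$ distinct pairs `(row_shift, col_shift)`. Several details are assumptions made so the sketch is concrete: the box method keeps the first $m$ points of a $\lceil \sqrt{m} \rceil \times \lceil \sqrt{m} \rceil$ box, the random walk starts at $(0, 0)$ and wraps around the edges of the $N \times N$ grid, and all function names are hypothetical.

```python
import math
import numpy as np

def sequential_shifts(N, m):
    """All row shifts for one column, then move over one column, and so on."""
    if m > N * N:
        raise ValueError("m cannot exceed N * N")
    return [(i % N, i // N) for i in range(m)]

def box_shifts(N, m):
    """First m points of a square box of side ceil(sqrt(m)) (truncation assumed)."""
    side = math.ceil(math.sqrt(m))
    if side > N:
        raise ValueError("box does not fit inside the N x N grid")
    return [(r, c) for c in range(side) for r in range(side)][:m]

def random_shifts(N, m, rng):
    """m distinct shifts drawn uniformly at random without replacement."""
    flat = rng.choice(N * N, size=m, replace=False)
    return [(int(i % N), int(i // N)) for i in flat]

def random_walk_shifts(N, m, rng):
    """Step one row or column at a time until m unique points are visited."""
    if m > N * N:
        raise ValueError("m cannot exceed N * N")
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    pos = (0, 0)
    visited, seen = [pos], {pos}
    while len(visited) < m:
        dr, dc = steps[rng.integers(4)]
        pos = ((pos[0] + dr) % N, (pos[1] + dc) % N)   # wraparound is an assumption
        if pos not in seen:                            # self-crossings add no new point
            seen.add(pos)
            visited.append(pos)
    return visited

rng = np.random.default_rng(0)
N, m = 32, 200
shift_sets = {
    "sequential": sequential_shifts(N, m),
    "box": box_shifts(N, m),
    "random": random_shifts(N, m, rng),
    "random walk": random_walk_shifts(N, m, rng),
}
```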
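3.2.2 Formally Describing How to Build $\Phi$ from $\bar{\Phi}$
Recall the restriction operator $R_\Omega$ from Section 1.3 that selects the rows indexed by the set
$\Omega$; for example, if $\Omega = \{1, 2\}$ then

$$R_\Omega A = R_\Omega \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{14} \\
a_{21} & a_{22} & \cdots & a_{24} \\
\vdots & \vdots & \ddots & \vdots \\
a_{41} & a_{42} & \cdots & a_{44}
\end{pmatrix}
= R_\Omega \begin{pmatrix} a^{(1)} \\ a^{(2)} \\ a^{(3)} \\ a^{(4)} \end{pmatrix}
= \begin{pmatrix} a^{(1)} \\ a^{(2)} \end{pmatrix}. \tag{3.5}$$

Thus, we can compactly write building our measurement matrix $\Phi \in \mathbb{R}^{m \times n}$ from a circulant
matrix $\bar{\Phi} \in \mathbb{R}^{n \times n}$ as

$$\Phi = R_\Omega \bar{\Phi}. \tag{3.6}$$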
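In code, the restriction operator is simply row selection, and combining it with the FFT gives the partial circulant measurements without ever forming the full matrix. The sketch below assumes zero-based indices (so the set $\{1, 2\}$ of the text becomes `[0, 1]`) and takes the seed vector `c` to be the first column of the circulant matrix; the function names are hypothetical.

```python
import numpy as np

def restrict_rows(A, omega):
    """Apply the restriction operator: keep the rows of A indexed by omega."""
    return np.asarray(A)[list(omega), :]

def partial_circulant_measure(c, x, omega):
    """Compute the restricted product of a circulant matrix with x via the FFT."""
    full = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))  # all n entries of C x
    return full[list(omega)]

# The 4 x 4 example of the text, with the set {1, 2} written zero-based.
A = np.arange(16).reshape(4, 4)
selected = restrict_rows(A, [0, 1])

# Consistency check against an explicitly built circulant matrix.
rng = np.random.default_rng(0)
n = 16
c = rng.standard_normal(n)
x = rng.standard_normal(n)
omega = [0, 3, 7, 12]
C = np.array([[c[(j - k) % n] for k in range(n)] for j in range(n)])
y_dense = restrict_rows(C, omega) @ x
y_fast = partial_circulant_measure(c, x, omega)
```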
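In our application, where $\Omega$ corresponds to coordinates of row and column shifts of the
optical plate, we cannot directly apply $R_\Omega$ as above. Here instead the coordinates in $\Omega$ define
which row of the block circulant matrix $\bar{\Phi}$ will be taken. To see how this works, let us write
the matrix

$$M = \begin{pmatrix}
a & b & c & a & b & c \\
e & f & g & e & f & g \\
h & i & j & h & i & j \\
a & b & c & a & b & c \\
e & f & g & e & f & g \\
h & i & j & h & i & j
\end{pmatrix}, \tag{3.7}$$

a $3 \times 3$ analogue of the pattern on the optical plate. In this toy example, then,

$$\phi^{(1)} = \begin{pmatrix} a & b & c \\ e & f & g \\ h & i & j \end{pmatrix},$$

which is the block under the selection mask in its starting position, the lower left corner of $M$.
Of course $\omega_1 = (0, 0)$, no shifts.
If we implement the sequential method beginning with a column shift, the selection mask
would next select the red elements,

$$M = \begin{pmatrix}
a & b & c & a & b & c \\
e & f & g & e & f & g \\
h & i & j & h & i & j \\
a & {\color{red} b} & {\color{red} c} & {\color{red} a} & b & c \\
e & {\color{red} f} & {\color{red} g} & {\color{red} e} & f & g \\
h & {\color{red} i} & {\color{red} j} & {\color{red} h} & i & j
\end{pmatrix}, \tag{3.8}$$

meaning

$$\phi^{(2)} = \begin{pmatrix} b & c & a \\ f & g & e \\ i & j & h \end{pmatrix}$$

and $\omega_2 = (0, 1)$. Continuing on we have

$$\phi^{(3)} = \begin{pmatrix} c & a & b \\ g & e & f \\ j & h & i \end{pmatrix}.$$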
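If we want to reshape these as rows of $\Phi$ to implement $y = \Phi x$ we would have

$$\Phi = \begin{pmatrix}
a & b & c & e & f & g & h & i & j \\
b & c & a & f & g & e & i & j & h \\
c & a & b & g & e & f & j & h & i
\end{pmatrix}.$$

Define

$$\alpha = \begin{pmatrix} a & b & c \\ b & c & a \\ c & a & b \end{pmatrix}, \quad
\beta = \begin{pmatrix} e & f & g \\ f & g & e \\ g & e & f \end{pmatrix}, \quad \text{and} \quad
\gamma = \begin{pmatrix} h & i & j \\ i & j & h \\ j & h & i \end{pmatrix}.$$

Then

$$\Phi = \begin{pmatrix} \alpha & \beta & \gamma \end{pmatrix}.$$

Continuing like this we can write the matrix of all reshaped measurement vectors generated
by such shifts as

$$\bar{\Phi} = \begin{pmatrix}
\alpha & \beta & \gamma \\
\gamma & \alpha & \beta \\
\beta & \gamma & \alpha
\end{pmatrix}. \tag{3.9}$$

Just as $C$ was called a convolution matrix in 1D, this matrix $\bar{\Phi}$ is a 2D convolution matrix.
$\bar{\Phi}$ is not circulant as in 1D, but block circulant. We still have

$$\bar{\Phi} x = F^{-1} D F x$$

as discussed above; however, now $F$ and $F^{-1}$ are the 2D Fourier and inverse Fourier transforms
[34].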
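The 2D statement can also be checked numerically. In the sketch below, each measurement vector is the seed pattern cyclically shifted by a number of rows and columns, and the full set of $N \times N$ measurements is computed once by brute force and once with 2D FFTs. Two conventions are assumptions: a shift moves the pattern up or to the left, as in the toy example where one column shift turns $(a\ b\ c)$ into $(b\ c\ a)$, and NumPy's unnormalized FFT is used. With these conventions the measurements form a circular cross-correlation, which is a convolution with a flipped image, hence the complex conjugate.

```python
import numpy as np

def shifted_mask(seed, row_shift, col_shift):
    """Seed pattern seen through the selection mask after the given shifts."""
    return np.roll(seed, shift=(-row_shift, -col_shift), axis=(0, 1))

def all_measurements_direct(seed, image):
    """y[r, s] = <shifted_mask(seed, r, s), image> for every shift, O(N^4)."""
    N = seed.shape[0]
    y = np.empty((N, N))
    for r in range(N):
        for s in range(N):
            y[r, s] = np.sum(shifted_mask(seed, r, s) * image)
    return y

def all_measurements_fft(seed, image):
    """Same measurements via 2D FFTs in O(N^2 log N)."""
    spectrum = np.fft.fft2(seed) * np.conj(np.fft.fft2(image))
    return np.real(np.fft.ifft2(spectrum))

rng = np.random.default_rng(0)
N = 8
seed_mask = rng.integers(0, 2, size=(N, N))   # binary 0/1 seed pattern
image = rng.random((N, N))                    # stand-in scene

y_direct = all_measurements_direct(seed_mask, image)
y_fft = all_measurements_fft(seed_mask, image)
```

Selecting the entries of `y_fft` at the shifts in the chosen set then gives the $m$ compressive measurements.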
Figure 3.4 : Filled points in these plots indicate the location of the selection mask for
individual measurements in terms of row and column shifts. Panels: Sequential, Box, Random
Path, and Random. Thus, there are more row shifts
than column shifts for the sequential method and an equal number of row and column shifts
for the box method. The random path shows some structure since the mask is only allowed to
step one row or column shift to generate the next measurement basis vector in the sequence,
and random is just that.
3.3 Imaging Results
In this section we show results from imaging with each of those four methods. Our imaging
system is the same as in Section 1.3, except the illumination source is different. We use a
broadband lamp here. The Hamamatsu detector is the same. We use the digital micromirror
device (DMD) to simulate the mask motions as a proof-of-concept for this imaging scheme.
In order to recover an image from the measurements $y = \Phi x$, we use the Rec PC (PC
stands for ‘partial circulant’) algorithm of Yin, et al., [28]. Essentially it is the same as
TVAL3, but with some modification, most notably to accommodate the circulant measurement
matrices. One other difference is that it offers an explicit handling of both total variation
minimization and $\ell_1$ minimization, along with the usual fidelity constraint. Rec PC finds $x^*$
such that

$$\text{(Rec PC)} \qquad x^* = \arg\min_x \; \alpha \sum_{i=1}^{n} \|D_i x\| + \frac{\mu}{2} \|\Phi x - y\|^2. \tag{3.10}$$

For the images below, we set α = 102 and $\mu = 1$. Again, the operator $D_i \in \mathbb{R}^{2 \times n}$ is a
discrete gradient operator.
As shown in Figure 3.5, the random path method outperforms the sequential method.
Furthermore, comparing all four methods introduced here, the relative error, defined as
$$\text{Relative Error} = \frac{\|x_{\delta=1} - x_{\delta}\|_2^2}{\|x_{\delta=1}\|_2^2}, \tag{3.11}$$
where $x_{\delta=1}$ is the solution when m = n, or δ = 1, declines equally quickly for both random
and random path, converging nonlinearly, while sequential and box converge approximately
linearly with increasing δ (Figure 3.6). Furthermore, it appears that the computational problem (Rec PC)
in Equation 3.10 is more difficult for box or sequential patterns than for the randomized patterns,
since the algorithm takes longer to converge in these cases, as shown in Figure 3.7.
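The relative error of Equation 3.11 is straightforward to compute. The sketch below assumes the two reconstructions are available as arrays of equal size; the function and argument names are hypothetical.

```python
import numpy as np


def relative_error(x_full, x_delta):
    """Squared l2 distance to the delta = 1 reconstruction, normalized by its squared l2 norm."""
    x_full = np.asarray(x_full, dtype=float).ravel()
    x_delta = np.asarray(x_delta, dtype=float).ravel()
    return np.sum((x_full - x_delta) ** 2) / np.sum(x_full ** 2)


# Toy usage: a two-pixel "image" and a slightly perturbed reconstruction.
print(relative_error([3.0, 4.0], [3.0, 3.0]))
```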
Figure 3.5 : Difference between taking measurement vectors from Φ sequentially (left column)
and according to a random path (right column) for a few subsampling ratios. Note that
the reconstruction with random path measurement vectors is of relatively high quality even at
a very low subsampling ratio δ. Data acquired by Lina Xu.
Figure 3.6 : Relative mean square error (normalized squared difference between the reconstructed
image for a given δ and the one reconstructed with δ = 1) for the four methods of generating
the measurement basis Φ.
Figure 3.7 : Time to solve the underlying optimization problem and recover an image for
various undersampling ratios, δ, for the four methods of generating the measurement basis Φ.
3.4 Conclusion and Discussion
In this chapter we have demonstrated the feasibility and utility of circulant matrices for
imaging. Apparently, when the measurement vectors are chosen in a sequential fashion,
the recovery problem is more difficult than if the measurement vectors are chosen with an
element of randomness. We may be tempted to generalize this phenomenon and say that
in all cases, a sequentially-built Φ results in worse recovery than randomly-generated Φ.
However, results in Chapter 4 suggest this is not the case.
The dependence on shift type we see here may be due to the fact that our figure of
merit is the magnitude of the derivative, the total variation. When we minimize a cost
function that includes total variation, we are asking the recovery algorithm to maximize
piecewise constancy. Thus, we acquire more information with a new measurement only
if the measurement vector is not probing the same piecewise constant areas. By shifting
sequentially, we probe the same piecewise constant area with the same set of measurement
vector pixels $\varphi^{(i)}_{jk} = 1$, and thus acquire less new information per measurement for most
values of δ. The results shown in the next chapter support this claim.
Chapter 4
Experimental Investigation of Subsampled Circulant
Matrices for Compressive Sensing
As we saw in the previous chapter, there is a marked difference in signal recovery when the
measurement matrix Φ is subsampled via a restriction operator R whose index set is
chosen either randomly or sequentially. However, in that chapter we used a reconstruction
algorithm based on total-variation (TV) minimization coupled with the usual least-squares
condition. One possible, and quite plausible, explanation for this behavior is that a
sequentially-built Φ gathers less information per measurement for the first measurements,
and thus requires more measurements to match the performance of a randomly-built Φ.
The explanation is plausible because TV minimization finds images with the largest possible
regions of small derivative, so it favors piecewise constancy. An intuitive argument is that if a
set of measurement vectors samples identical regions of near-constant light intensity from a
scene, then the measurements acquired would be somewhat redundant.
If this were the end of the story, then for $\ell_1$ minimization with strict equality constraints,
which we reiterate serves as a proxy for $\ell_0$ minimization, we might expect that
reconstruction would not be sensitive to how we build Φ. In this chapter we show results that defy
that expectation. Furthermore, these results suggest that the powerful precise undersampling
theorems proposed, developed, and derived for Gaussian random measurement matrices by
Donoho and Tanner in a series of papers [37–43] apply to partial circulant measurement
matrices as well.
Our results indicate that partial circulant or partial block circulant matrices might well
belong to what Donoho and Tanner call the universality class of Gaussian matrices for phase
transitions with respect to signal recovery, or, as we will see shortly, phase transitions in
the number of faces of certain polytopes after projection by a measurement matrix Φ. These
results show, perhaps surprisingly, that, as Yin et al. posited and demonstrated for
a few test cases [28], partial circulant matrices, with entries taken from {0, 1} no less, are as
effective as Gaussian random measurement matrices. In the following we consider
whether, for a set of measurements y = Φx, the underlying signal x can be exactly recovered via
alternative regularization techniques. We define the undersampling phase space as
$$(\delta, \rho) \in [0, 1]^2,$$
where
$$\delta = \frac{m}{n}, \quad \text{and} \quad \rho = \frac{k}{m}.$$
Here k is the number of nonzeros in the vector x, or $\|x\|_0 = k$.
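To make the phase-space coordinates concrete, the following sketch converts between problem sizes (n, m, k) and the point (δ, ρ). The function names and the rounding rule used to obtain integer m and k are assumptions for illustration only.

```python
def undersampling_point(n, m, k):
    """Return (delta, rho) = (m / n, k / m) for signal length n, m measurements, k nonzeros."""
    if not (0 < k <= m <= n):
        raise ValueError("expected 0 < k <= m <= n")
    return m / n, k / m


def problem_size(n, delta, rho):
    """Map a phase-space point back to integer (m, k); rounding is a choice of this sketch."""
    m = max(1, round(delta * n))
    k = max(1, round(rho * m))
    return m, k


delta, rho = undersampling_point(n=1024, m=256, k=32)
m, k = problem_size(1024, delta, rho)
print(delta, rho, m, k)
```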
To close the introduction to this chapter, we highlight the first of two ‘research
challenges’ laid out by Donoho and Tanner [37]:
“Characterize universality classes of Gaussian phase transitions. We have shown
that many ‘random’ matrix ensembles yield phase transitions matching those of
Gaussian matrices. Characterize the precise universality class of such matrices.”
Not only do our experiments have a clear practical value, but they also
suggest that most of the partial circulant measurement matrices built from seed vectors
$\varphi^{(1)}$ with entries from {0, 1} in equal number follow the same phase transition in probability of success as
the Gaussian measurement matrices reported in the series of papers by Donoho and Tanner.
However, we have found evidence suggesting that not all constructions yield Gaussian-like
phase transitions.
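The construction described above can be sketched as follows for the plain (one-dimensional) circulant case: a seed vector with equally many zeros and ones generates a full circulant matrix, and a restriction keeps m of its rows, chosen either sequentially or at random. This is an illustrative assumption about the construction, not the block circulant matrices produced by two-dimensional mask shifts in Chapter 3; the function name and the `mode` argument are hypothetical.

```python
import numpy as np
from scipy.linalg import circulant


def partial_circulant(seed, m, mode="random", rng=None):
    """Keep m rows of the circulant matrix whose rows are cyclic shifts of `seed`."""
    seed = np.asarray(seed, dtype=float)
    n = seed.size
    # scipy's circulant() puts the seed in the first column; transpose so row i
    # is the seed cyclically shifted by i positions.
    full = circulant(seed).T
    if mode == "sequential":
        rows = np.arange(m)
    elif mode == "random":
        rng = np.random.default_rng() if rng is None else rng
        rows = np.sort(rng.choice(n, size=m, replace=False))
    else:
        raise ValueError("mode must be 'sequential' or 'random'")
    return full[rows]


rng = np.random.default_rng(0)
n, m = 16, 6
seed = rng.permutation(np.repeat([0.0, 1.0], n // 2))   # equal numbers of 0s and 1s
Phi_seq = partial_circulant(seed, m, mode="sequential")
Phi_rnd = partial_circulant(seed, m, mode="random", rng=rng)
print(Phi_seq.shape, Phi_rnd.shape)
```

Every row is a shift of the same seed, so each measurement vector contains exactly n/2 ones regardless of how the rows are selected.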
When we build partial block circulant matrices as with a selection mask moving over
a patterned optical plate in a sequential fashion, as explained in Section 3.2.2, and n is a
perfect square such that $n = 2^w$, where $w \in \mathbb{Z}^+$, we see degradations in performance for three
specific values of δ that arise for the two values of w we tested, w = 10 and w = 8. Thus, we may
have found a candidate ensemble whose phase transition does not match that of Gaussian
measurement matrices. Investigation of
this counterexample could yield insight into the connection between linear programming and
polytope geometry. Perhaps we are seeing evidence that would address the second research
challenge from [37],
“Discover new transitions for (LP) and (BP). Many but not all matrix ensembles
yield phase transitions matching those of Gaussian matrices. Discover more
examples which do not, and which are also interesting matrix ensembles, either
because the phase transition is better or because the matrix is explicit and
deterministic.”
Before we show our results, we introduce the convex programming problems we solve,
the linear program (LP) and basis pursuit (BP). These problems are just two alternative
methods of regularizing the problem of solving for x in y = Φx. We also present some of the
basic facts and concepts from polytope geometry. With that background in order, we present
the main theoretical result from about seven years' worth of work by Donoho and Tanner,
and discuss how that result, which has no explicit connection to sparse recovery, can be
used to predict the success rate for sparse recovery based on the properties of the choice of
measurement matrix.
Donoho and Tanner use the problem code ‘(P1)’ instead of (BP).
4.1 Background
4.1.1 Linear Programming
LP stands for the linear program, which is to find an x such that
$$x = \min\ \mathbf{1}^T x \quad \text{subject to} \quad y = \Phi x, \quad x \ge 0. \tag{4.1}$$
The basis pursuit problem (BP) is, find x such that
$$x = \min\ \|x\|_1 \quad \text{subject to} \quad y = \Phi x. \tag{4.2}$$
Clearly LP is just BP with the additional constraint that all entries of x be nonnegative.
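Both problems can be handed to a generic linear programming solver. The sketch below uses SciPy's `linprog`; (BP) is rewritten as a linear program through the standard split x = u − v with u, v ≥ 0, so that the $\ell_1$ norm becomes a linear cost. The function names, the problem sizes, and the Gaussian stand-in matrix are assumptions made for illustration and are not the solver or settings used for the experiments reported in this chapter.

```python
import numpy as np
from scipy.optimize import linprog


def solve_lp(Phi, y):
    """(LP): minimize 1^T x subject to Phi x = y, x >= 0."""
    n = Phi.shape[1]
    res = linprog(c=np.ones(n), A_eq=Phi, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


def solve_bp(Phi, y):
    """(BP): minimize ||x||_1 subject to Phi x = y, via x = u - v with u, v >= 0."""
    n = Phi.shape[1]
    res = linprog(c=np.ones(2 * n), A_eq=np.hstack([Phi, -Phi]), b_eq=y,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


rng = np.random.default_rng(0)
n, m, k = 40, 20, 3
Phi = rng.standard_normal((m, n))                 # Gaussian stand-in matrix
support = rng.choice(n, size=k, replace=False)

x_nonneg = np.zeros(n)
x_nonneg[support] = rng.uniform(1.0, 2.0, size=k)   # k-sparse, nonnegative
x_signed = x_nonneg.copy()
x_signed[support[0]] *= -1.0                        # k-sparse, mixed signs

x_lp = solve_lp(Phi, Phi @ x_nonneg)
x_bp = solve_bp(Phi, Phi @ x_signed)
print("LP recovery error:", np.linalg.norm(x_lp - x_nonneg))
print("BP recovery error:", np.linalg.norm(x_bp - x_signed))
```

Whether the printed errors are near zero depends on where (δ, ρ) falls relative to the phase transition; repeating such trials over a grid of (δ, ρ) is one way to estimate a success probability empirically.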
What began as curiosity about whether circulant matrices yield Gaussian phase
transitions for these problems may in fact meet both criteria of Donoho and Tanner’s
challenge: discovering a phase transition that does not match the Gaussian ensemble, for
an ensemble that is interesting because the matrix is explicit and deterministic.
4.1.2 Polytope Geometry
In order to understand the work of Donoho and Tanner, it is necessary to know only a few
key definitions from convex polytope geometry. Although non-convex polytopes exist, we
have no occasion to consider them here, so all future references to polytopes are understood
to be references to convex polytopes.
A set $C \subseteq \mathbb{R}^n$ is convex if for any points $z_1, z_2 \in C$,
$$t z_1 + (1 - t) z_2 \in C, \quad t \in [0, 1]$$
as well, or in other words, a convex set contains the line segment between any two points in
the set. If we have a set of points S, the convex hull of S is given by
$$\operatorname{conv} S = \{\, t_1 z_1 + \cdots + t_k z_k : z_i \in S,\ t_i \ge 0,\ i = 1, \ldots, k,\ t_1 + \cdots + t_k = 1 \,\},$$
cf. [10]. In words, the convex hull is the smallest convex set that contains all points in the set
S.
A polytope is the convex hull of a finite set of points in $\mathbb{R}^n$. A polytope may also be defined
as a polyhedron that is bounded, though we mention this only for completeness. For a more
detailed and technical description of polytopes, see Ziegler’s Lectures on Polytopes [44] and
Grünbaum’s more advanced Convex Polytopes [45].
The most basic polytope is the standard n-simplex,
$$T^{n-1} = \{x \in \mathbb{R}^n : \mathbf{1}^T x \le 1,\ x \ge 0\}. \tag{4.3}$$
In two dimensions, the simplex is an isosceles triangle, and in three dimensions, the simplex
is a tetrahedron.
We also define the crosspolytope as
$$C^n = \Big\{x \in \mathbb{R}^n : \sum_{i=1}^{n} |x_i| \le 1\Big\}. \tag{4.4}$$
For n = 3, the crosspolytope is the octahedron. Already there are hints of a connection to
(LP) and (BP) [46].
The most essential feature of polytopes we will be concerned with is the number of faces
of dimension k. For a polytope Q, we write the number of k-dimensional faces as $f_k(Q)$.
$f_0(Q)$ is the number of vertices of Q, $f_1(Q)$ is the number of edges, $f_2(Q)$ the number of
two-dimensional faces, and so on. If $Q = C^3$, the cross-polytope (Equation 4.4) in three
dimensions, i.e., the convex body with vertices at the unit vectors $\pm e_1$, $\pm e_2$, and $\pm e_3$, we have
$$C^3 = \Big\{v \in \mathbb{R}^3 : \|v\|_1 = \sum_{i=1}^{3} |v_i| \le 1\Big\}.$$
As can be counted in Figure 4.1, the face counts of $C^3$ are
$$f_0(C^3) = 6, \tag{4.5}$$
$$f_1(C^3) = 12, \tag{4.6}$$
$$f_2(C^3) = 8. \tag{4.7}$$
For short, we refer to a k-dimensional face as a k-face. It is no coincidence that we refer to
the k-sparsity of a signal and a k-face. Indeed, as we will see, the connection is exactly what
enables precise undersampling theorems.
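The face counts of the octahedron can also be checked numerically. The sketch below builds $C^3$ as the convex hull of the six points $\pm e_i$ with SciPy's `ConvexHull` and counts vertices, edges, and triangular facets; the helper name is hypothetical. It also compares against the standard closed form for the cross-polytope, $f_k(C^n) = 2^{k+1}\binom{n}{k+1}$, which is a known combinatorial fact rather than a result of this thesis.

```python
from itertools import combinations
from math import comb

import numpy as np
from scipy.spatial import ConvexHull


def cross_polytope_vertices(n):
    """Vertices +e_i and -e_i of the cross-polytope in R^n."""
    eye = np.eye(n)
    return np.vstack([eye, -eye])


hull = ConvexHull(cross_polytope_vertices(3))
f0 = len(hull.vertices)        # vertices of the hull
f2 = len(hull.simplices)       # facets; every facet of the octahedron is a triangle
# Each edge is shared by two triangles, so collect the distinct vertex pairs.
edges = {tuple(sorted(map(int, pair)))
         for tri in hull.simplices
         for pair in combinations(tri, 2)}
f1 = len(edges)

closed_form = [2 ** (j + 1) * comb(3, j + 1) for j in range(3)]
print((f0, f1, f2), closed_form)
```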
4.1.3 The Connection between Polytope Geometry and Convex Optimization
Donoho and Tanner, 2010 [37], present the following theorem that is purely the result of
combinatorial geometry with no explicit connection to linear programming. We state it here
in full with some paraphrasing and modification of notation and note that in that paper
they cite their own work for the proof [42,43,47,48], which is well beyond the scope of this
paper:
Theorem 4.1
Phase Transition for Face Counts of Gaussian Randomly Projected Polytopes
Let the m × n random matrix A have i.i.d. N(0, 1) Gaussian elements. Consider sequences
of triples (n, m, k) where m = δn, k = ρm, and n → ∞. There are functions $\rho_B(\delta; Q)$ for