Learning classifiers from distributed, semantically heterogeneous, autonomous
data sources
by
Doina Caragea
A dissertation submitted to the graduate faculty
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Major: Computer Science
Program of Study Committee:
Vasant Honavar, Major Professor
Dianne Cook
Drena Dobbs
David Fernandez-Baca
Leslie Miller
Iowa State University
Ames, Iowa
2004
Copyright © Doina Caragea, 2004. All rights reserved.
Graduate College
Iowa State University
This is to certify that the doctoral dissertation of
Doina Caragea
has met the dissertation requirements of Iowa State University
Major Professor
For the Major Program
TABLE OF CONTENTS
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Traditional Machine Learning Limitations . . . . . . . . . . . . . . . . . . . . 4
1.3 Our Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4.1 Distributed Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4.2 Information Integration . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.4.3 Learning Classifiers from Heterogeneous Data . . . . . . . . . . . . . . 18
1.5 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2 LEARNING CLASSIFIERS FROM DATA . . . . . . . . . . . . . . . . . 23
2.1 Machine Learning Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Learning from Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Examples of Algorithms for Learning from Data . . . . . . . . . . . . . . . . 27
2.3.1 Naive Bayes Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.2 Decision Tree Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3.3 Perceptron Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.4 Support Vector Machines and Related Large Margin Classifiers . . . . 33
2.3.5 k Nearest Neighbors Classifiers . . . . . . . . . . . . . . . . . . . . . . 40
2.4 Decomposition of Learning Algorithms into Information Extraction and
Hypothesis Generation Components . . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Sufficient Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6 Examples of Sufficient Statistics . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.6.1 Sufficient Statistics for Naive Bayes Classifiers . . . . . . . . . . . . . . 47
2.6.2 Sufficient Statistics for Decision Trees . . . . . . . . . . . . . . . . . . 47
2.6.3 Sufficient Statistics for Perceptron Algorithm . . . . . . . . . . . . . . 48
2.6.4 Sufficient Statistics for SVM . . . . . . . . . . . . . . . . . . . . . . . . 50
2.6.5 Sufficient Statistics for k-NN . . . . . . . . . . . . . . . . . . . . . . . 51
2.7 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3 LEARNING CLASSIFIERS FROM DISTRIBUTED DATA . . . . . . . 54
3.1 Learning from Distributed Data . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.2 General Strategy for Learning from Distributed Data . . . . . . . . . . . . . 58
3.3 Algorithms for Learning Classifiers from Distributed Data . . . . . . . . . . . 60
3.3.1 Learning Naive Bayes Classifiers from Distributed Data . . . . . . . . 61
3.3.2 Learning Decision Tree Classifiers from Distributed Data . . . . . . . . 68
3.3.3 Horizontally Fragmented Distributed Data . . . . . . . . . . . . . . . . 68
3.3.4 Learning Threshold Functions from Distributed Data . . . . . . . . . . 78
3.3.5 Learning Support Vector Machines from Distributed Data . . . . . . . 84
3.3.6 Learning k Nearest Neighbor Classifiers from Distributed Data . . . . 92
3.4 Statistical Query Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.4.1 Operator Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4 LEARNING CLASSIFIERS FROM SEMANTICALLY HETEROGENEOUS
DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.1 Integration of the Data at the Semantic Level . . . . . . . . . . . . . . . . . . 107
4.1.1 Motivating Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.1.2 Ontology Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.1.3 Ontology Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.1.4 Ontology-Extended Data Sources . . . . . . . . . . . . . . . . . . . . . 119
4.2 Ontology-Extended Query Operators . . . . . . . . . . . . . . . . . . . . . . . 123
4.2.1 Ontology-Extended Primitive Operators . . . . . . . . . . . . . . . . . 124
4.2.2 Ontology-Extended Statistical Operators . . . . . . . . . . . . . . . . . 126
4.3 Semantic Heterogeneity and Statistical Queries . . . . . . . . . . . . . . . . . . 127
4.4 Algorithms for Learning Classifiers from Heterogeneous Distributed Data . . . 129
4.4.1 Naive Bayes Classifiers from Heterogeneous Data . . . . . . . . . . . . 132
4.4.2 Decision Tree Induction from Heterogeneous Data . . . . . . . . . . . . 133
4.4.3 Support Vector Machines from Heterogeneous Data . . . . . . . . . . . 133
4.4.4 Learning Threshold Functions from Heterogeneous Data . . . . . . . . 135
4.4.5 k-Nearest Neighbors Classifiers from Heterogeneous Data . . . . . . . . 135
4.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5 SUFFICIENT STATISTICS GATHERING . . . . . . . . . . . . . . . . . 139
5.1 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.2 Central Resource Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.3 Query Answering Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.4 Query Optimization Component . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.4.1 Optimization Problem Definition . . . . . . . . . . . . . . . . . . . . . 146
5.4.2 Planning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.5 Sufficient Statistics Gathering: Example . . . . . . . . . . . . . . . . . . . . . 151
5.6 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
6 INDUS: A FEDERATED QUERY-CENTRIC APPROACH TO
LEARNING CLASSIFIERS FROM DISTRIBUTED HETERO-
GENEOUS AUTONOMOUS DATA SOURCES . . . . . . . . . . . . . . 156
6.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.2 From Weka to AirlDM to INDUS . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.3 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.3.1 Data Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.3.2 Learning NB Classifiers from Distributed Data . . . . . . . . . . . . . . 162
6.3.3 Learning NB Classifiers from Heterogeneous Distributed Data . . . . . 163
6.4 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
7.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
GLOSSARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
INDEX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
LIST OF TABLES
2.1 Data set D: Decide EnjoySport based on Weather Data . . . . . . . . 47
4.1 Data set D1: Weather Data collected by company C1 . . . . . . . . . 108
4.2 Data set D2: Weather Data collected by company C2 . . . . . . . . . 108
4.3 Mappings from H1(is-a) and H2(is-a) (corresponding to the data sets
D1 and D2) to HU(is-a) found using a name matching strategy . . . . 118
4.4 Mappings from H1(is-a) and H2(is-a) (corresponding to the data sets
D1 and D2, respectively) to HU(is-a) found from equality constraints . 118
6.1 Learning from distributed UCI/CENSUS-INCOME data sources . . . 163
6.2 Learning from heterogeneous UCI/ADULT data sources . . . . . . . . 167
LIST OF FIGURES
1.1 Example of a scenario that calls for knowledge acquisition from
autonomous, distributed, semantically heterogeneous data sources:
discovery of protein sequence-structure-function relationships using
information from the PROSITE, MEROPS, and SWISSPROT repositories
of protein sequence, structure, and function data. O1 and O2 are two
user ontologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Learning revisited: identify sufficient statistics, gather the sufficient
statistics and generate the current algorithm output . . . . . . . . . . 6
1.3 Exact learning from distributed data: distribute the statistical query
among the distributed data sets and compose their answers . . . . . . 6
1.4 Learning from semantically heterogeneous distributed data: each data
source has an associated ontology and the user provides a global
ontology and mappings from the local ontologies to the global ontology . 7
1.5 INDUS: INtelligent Data Understanding System . . . . . . . . . . . . 8
2.1 Learning algorithm: (top) learning component; (bottom) classification
component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2 Naive Bayes classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 ID3 algorithm: a greedy algorithm that grows the tree top-down, by
selecting the best attribute at each step (according to the information
gain). The growth of the tree stops when all the training examples are
correctly classified . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4 Linearly separable data set . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5 The Perceptron algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Maximum margin classifier . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7 Non-linearly separable data mapped to a feature space where it
becomes linearly separable . . . . . . . . . . . . . . . . . . . . . . . . 37
2.8 Support Vector Machines algorithm . . . . . . . . . . . . . . . . . . . 38
2.9 The Dual Perceptron algorithm . . . . . . . . . . . . . . . . . . . . . 39
2.10 Decision boundary induced by the 1 nearest neighbor classifier . . . . 41
2.11 The k Nearest Neighbors algorithm . . . . . . . . . . . . . . . . . . . 42
2.12 Learning revisited: identify sufficient statistics, gather the sufficient
statistics and generate the current algorithm output . . . . . . . . . . 43
2.13 Naive Bayes classifiers learning as information extraction and
hypothesis generation: the algorithm asks a joint count statistical query
for each attribute in order to construct the classifier . . . . . . . . . . 48
2.14 Decision Tree learning as information extraction and hypothesis
generation: for each node, the algorithm asks a joint count statistical
query and chooses the best attribute according to the count distribution . 49
2.15 The Perceptron algorithm as information extraction and hypothesis
generation: at each iteration i + 1, the current weight wi+1(D) is
updated based on the refinement sufficient statistic s(D, wi(D)) . . . . 50
2.16 The SVM algorithm as information extraction and hypothesis
generation: the algorithm asks for the support vectors and their
associated weights, and the weight w is computed based on this
information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.17 k-NN Algorithm as information extraction and hypothesis generation:
for each example x the algorithm asks for the k nearest neighbors
and computes the classification h(x) by taking a majority vote over
these neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.1 Data fragmentation: (Left) Horizontally fragmented data (Right)
Vertically fragmented data . . . . . . . . . . . . . . . . . . . . . . . . 54
3.2 Multi-relational database . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3 Exact distributed learning: distribute the statistical query among the
distributed data sets and compose their answers. (a) Eager learning
(b) Lazy learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.4 Distributed statistics gathering: (Left) Serial (Right) Parallel . . . . . 59
3.5 Learning Naive Bayes classifiers from horizontally distributed data:
the algorithm asks a joint count statistical query for each attribute in
order to construct the classifier. Each query is decomposed into
sub-queries, which are sent to the distributed data sources, and the
answers to sub-queries are composed and sent back to the learning
algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.6 Naive Bayes classifier from horizontally fragmented data . . . . . . . . 63
3.7 Naive Bayes classifier from vertically fragmented data . . . . . . . . . 66
3.8 Learning Decision Tree classifiers from horizontally fragmented
distributed data: for each node, the algorithm asks a joint count
statistical query, the query is decomposed into sub-queries and sent to
the distributed data sources, and the resulting counts are added up and
sent back to the learning algorithm. One iteration is shown . . . . . . 70
3.9 Decision Tree classifiers: finding the best attribute for split when data
are horizontally fragmented . . . . . . . . . . . . . . . . . . . . . . . . 71
3.10 Decision Tree classifiers: finding the best attribute for split when data
are vertically fragmented . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.11 Learning Threshold Functions from horizontally distributed data: the
algorithm asks a statistical query, the query is decomposed into
sub-queries which are subsequently sent to the distributed data sources,
and the final result is sent back to the learning algorithm. One
iteration i is shown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.12 The Perceptron algorithm when data is horizontally fragmented . . . . 80
3.13 Learning SVM from horizontally distributed data: the algorithm asks
a statistical query, the query is decomposed into sub-queries which are
sent to the distributed data sources, the results are composed, and the
final result is sent back to the learning algorithm . . . . . . . . . . . . 85
3.14 Naive SVM from horizontally fragmented distributed data . . . . . . . 85
3.15 Counterexample to naive SVM from distributed data . . . . . . . . . . 86
3.16 Convex hull based SVM learning from horizontally fragmented
distributed data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.17 Exact and efficient LSVM learning from horizontally fragmented
distributed data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.18 Learning k-NN classifiers from horizontally fragmented distributed
data: the algorithm asks a statistical query, the query is decomposed
into sub-queries which are sent to the distributed data sources, results
are composed, and the final result is sent to the learning algorithm . . 93
3.19 Algorithm for learning k Nearest Neighbors classifiers from
horizontally fragmented distributed data . . . . . . . . . . . . . . . . . 94
3.20 Algorithm for k Nearest Neighbors classifiers from vertically
fragmented distributed data . . . . . . . . . . . . . . . . . . . . . . . . 97
4.1 Learning from semantically heterogeneous distributed data: each data
source has an associated ontology and the user provides a user ontology
and mappings from the data source ontologies to the user ontology . . 106
4.2 The ontology (part-of and is-a hierarchies) associated with the data
set D1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3 The ontology (part-of and is-a hierarchies) associated with the data
set D2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.4 User ontology OU, which represents an integration of the hierarchies
corresponding to the data sources D1 and D2 in the weather domain . 113
4.5 Algorithm for finding mappings between a set of data source hierarchies
and a user hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.6 Algorithm for checking the consistency of a set of partial injective
mappings with a set of interoperation constraints and with the order
preservation property . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.7 The AVTs corresponding to the Prec attribute in the ontologies O1,
O2 and OU, associated with the data sources D1 and D2 and a user,
respectively (after the names have been matched) . . . . . . . . . . . . 129
5.1 The architecture of a system for gathering sufficient statistics from
distributed heterogeneous autonomous data sources . . . . . . . . . . 140
5.2 Central resource repository: registration of data sources, learning
algorithms, iterators and users . . . . . . . . . . . . . . . . . . . . . . 141
5.3 Simple user workflow examples . . . . . . . . . . . . . . . . . . . . . . 143
5.4 Internal translation of the workflows in Figure 5.3 according to the
semantics imposed by the user ontology . . . . . . . . . . . . . . . . . 143
5.5 Example of RDF file for a data source (Prosite) described by name,
URI, schema and operators allowed by the data source . . . . . . . . . 144
5.6 Query answering engine . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.7 Query optimization (planning) algorithm . . . . . . . . . . . . . . . . 149
5.8 Operator placement algorithm . . . . . . . . . . . . . . . . . . . . . . 150
5.9 (Left) User workflow Naive Bayes example (Right) User workflow
internal translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.10 The four plans found by the query optimizer for Naive Bayes example.
The operators below the dotted line are executed at the remote data
sources, and the operators above the dotted line are executed at the
central place . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.1 INDUS: Intelligent Data Understanding System. Three data sources
are shown: PROSITE, MEROPS and SWISSPROT together with
their associated ontologies. Ontologies O1 and O2 are two different
user ontologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.2 AirlDM: data source independent learning algorithms by means of
sufficient statistics and wrappers . . . . . . . . . . . . . . . . . . . . . 159
6.3 Taxonomy for the attribute Occupation in user (test) data. The filled
nodes represent the level of abstraction specified by the user . . . . . . 164
6.4 Taxonomy for the attribute Occupation in the data set Dh1. The
filled nodes represent the level of abstraction determined by the user
cut. The values Priv-house-serv, Other-service, Machine-op-inspct, and
Farming-fishing are overspecified with respect to the user cut . . . . . 165
6.5 Taxonomy for the attribute Occupation in the data set Dh2. The filled
nodes represent the level of abstraction determined by the user cut.
The value (Sales+Tech-support) is underspecified with respect to the
user cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
ACKNOWLEDGMENTS
I express my gratitude to my advisor Dr. Vasant Honavar for guiding the research presented
in this dissertation throughout my Ph.D. student years. He has been a constant source of
motivation and encouragement. He helped me to develop my own views and opinions about
the research I undertook. I thank him for always being accessible and for providing invaluable
feedback on my work. I also thank him for providing funding for my research and for helping
me to receive an IBM Fellowship two years in a row. I also thank him for organizing the AI
seminar, which brought up thoughtful discussions and helped me to broaden my views about Artificial
Intelligence. Most importantly, I thank him for his friendship and for his encouragement when
I felt confused or overwhelmed.
I give my warm thanks to Dr. Dianne Cook for introducing me to the wonderful world
of visualization, for being a close collaborator and a model for me, as well as a good
friend. Thanks also go to Dr. Drena Dobbs for introducing me to the world of molecular
biology and motivating me to pursue a minor in bioinformatics, and to Dr. Leslie Miller
and Dr. David Fernandez-Baca for being the first two people that I interacted with in my
first semester at Iowa State University. They both helped me overcome my fears of graduate
school. I thank everyone on my committee for fruitful interactions and for their support.
I am grateful to Adrian Silvescu for motivating me to go to graduate school, for collabo-
rating with me on several projects, for his great ideas and enthusiasm, and for being one of
my best friends ever.
Thanks to all the students who were present in the Artificial Intelligence lab at Iowa State
University while I was there. They were great colleagues and provided me with a friendly
environment to work in. Thanks to Jyotishman Pathak for closely collaborating with me on
ontology-extended data sources and ontology-extended workflow components during the last
year of my Ph.D. studies. Thanks to Facundo Bromberg for interesting discussions about
multi-agent systems. Thanks to Dae-Ki Kang for letting me use his tools for generating
taxonomies and for teaching me about AT&T graphviz. Thanks to Jun Zhang for the useful
discussions about partially specified data. Thanks to Jie Bao for discussions and insights
about ontology management. Thanks to Changui Yan and Carson Andorf for useful
discussions about biological data sources. Thanks to Oksana Yakhnenko for helping with the
implementation of AirlDM. Finally, thanks to Jaime Reinoso-Castillo for the first INDUS
prototype.
It has been an honor to be part of the Computer Science Department at Iowa State
University. There are many individuals in the Computer Science Department that I would like to
thank for their direct or indirect contribution to my research or education. Special thanks to
Dr. Jack Lutz for introducing me to the fascinating field of Kolmogorov complexity. I thoroughly
enjoyed the two courses and the seminars I took with Dr. Lutz.
I also thank the current as well as previous staff members in the Computer Science Department,
especially Linda Dutton, Pei-Lin Shi and Melanie Eckhart. Their kind and generous assistance
with various tasks was of great importance during my years at Iowa State University.
I am grateful to Dr. John Mayfield for financial support from the Graduate College and
to Sam Ellis from IBM Rochester for his assistance in obtaining the IBM graduate fellowship.
The research described in this thesis was supported in part by grants from the National
Science Foundation (NSF 0219699) and the National Institutes of Health (NIH GM066387) to
Vasant Honavar.
Above all, I am fortunate to have family and friends who have provided so much support,
encouragement and love during my Ph.D. years and otherwise. Thanks to my parents,
Alexandra and Paul Caragea, to my sister Cornelia Caragea and to my cousin Petruta Caragea.
Thanks to my friends Pia Sindile, Nicoleta Roman, Simona Verga, Veronica Nicolae, Calin
Anton, Liviu Badea, Mircea Neagu, Marius Vilcu, Cristina and Marcel Popescu, Petrica and
Mirela Vlad, Anna Atramentova, Laura Hamilton, Carol Hand, Barbara Gwiasda, Emiko Fu-
rukawa, Shireen Choobineh, JoAnn Kovar, Mike Collyer, and especially to Charles Archer for
helping me believe that I was able to complete this thesis.