Michael S. Waterman

University of Southern California, Los Angeles, California, United States

Are you Michael S. Waterman?

Claim your profile

Publications (170)533.92 Total impact

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: The DiseaseConnect (http://disease-connect.org) is a web server for analysis and visualization of a comprehensive knowledge on mechanism-based disease connectivity. The traditional disease classification system groups diseases with similar clinical symptoms and phenotypic traits. Thus, diseases with entirely different pathologies could be grouped together, leading to a similar treatment design. Such problems could be avoided if diseases were classified based on their molecular mechanisms. Connecting diseases with similar pathological mechanisms could inspire novel strategies on the effective repositioning of existing drugs and therapies. Although there have been several studies attempting to generate disease connectivity networks, they have not yet utilized the enormous and rapidly growing public repositories of disease-related omics data and literature, two primary resources capable of providing insights into disease connections at an unprecedented level of detail. Our DiseaseConnect, the first public web server, integrates comprehensive omics and literature data, including a large amount of gene expression data, Genome-Wide Association Studies catalog, and text-mined knowledge, to discover disease-disease connectivity via common molecular mechanisms. Moreover, the clinical comorbidity data and a comprehensive compilation of known drug-disease relationships are additionally utilized for advancing the understanding of the disease landscape and for facilitating the mechanism-based development of new drug treatments.
    Nucleic Acids Research 06/2014; · 8.81 Impact Factor
  • [Show abstract] [Hide abstract]
    ABSTRACT: With the development of next-generation sequencing (NGS) technologies, a large amount of short read data has been generated. Assembly of these short reads can be challenging for genomes and metagenomes without template sequences, making alignment-based genome sequence comparison difficult. In addition, sequence reads from NGS can come from different regions of various genomes and they may not be alignable. Sequence signature-based methods for genome comparison based on the frequencies of word patterns in genomes and metagenomes can potentially be useful for the analysis of short reads data from NGS. Here we review the recent development of alignment-free genome and metagenome comparison based on the frequencies of word patterns with emphasis on the dissimilarity measures between sequences, the statistical power of these measures when two sequences are related and the applications of these measures to NGS data.
    Briefings in Bioinformatics 09/2013; · 5.30 Impact Factor
  • Ehsan Behnam, Michael S Waterman, Andrew D Smith
    [Show abstract] [Hide abstract]
    ABSTRACT: Abstract Local alignment-free sequence comparison arises in the context of identifying similar segments of sequences that may not be alignable in the traditional sense. We propose a randomized approximation algorithm that is both accurate and efficient. We show that under D2 and its important variant [Formula: see text] as the similarity measure, local alignment-free comparison between a pair of sequences can be formulated as the problem of finding the maximum bichromatic dot product between two sets of points in high dimensions. We introduce a geometric framework that reduces this problem to that of finding the bichromatic closest pair (BCP), allowing the properties of the underlying metric to be leveraged. Local alignment-free sequence comparison can be solved by making a quadratic number of alignment-free substring comparisons. We show both theoretically and through empirical results on simulated data that our approximation algorithm requires a subquadratic number of such comparisons and trades only a small amount of accuracy to achieve this efficiency. Therefore, our algorithm can extend the current usage of alignment-free-based methods and can also be regarded as a substitute for local alignment algorithms in many biological studies.
    Journal of computational biology: a journal of computational molecular cell biology 07/2013; 20(7):471-85. · 1.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: To an RNA pseudoknot structure is naturally associated a topological surface, which has its associated genus, and structures can thus be classified by the genus. Based on earlier work of Harer-Zagier, we compute the generating function [Formula: see text] for the number [Formula: see text] of those structures of fixed genus [Formula: see text] and minimum stack size [Formula: see text] with [Formula: see text] nucleotides so that no two consecutive nucleotides are basepaired and show that [Formula: see text] is algebraic. In particular, we prove that [Formula: see text], where [Formula: see text]. Thus, for stack size at least two, the genus only enters through the sub-exponential factor, and the slow growth rate compared to the number of RNA molecules implies the existence of neutral networks of distinct molecules with the same structure of any genus. Certain RNA structures called shapes are shown to be in natural one-to-one correspondence with the cells in the Penner-Strebel decomposition of Riemann's moduli space of a surface of genus [Formula: see text] with one boundary component, thus providing a link between RNA enumerative problems and the geometry of Riemann's moduli space.
    Journal of Mathematical Biology 10/2012; · 2.37 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Next generation sequencing (NGS) technologies are now widely used in many biological studies. In NGS, sequence reads are randomly sampled from the genome sequence of interest. Most computational approaches for NGS data first map the reads to the genome and then analyze the data based on the mapped reads. Since many organisms have unknown genome sequences and many reads cannot be uniquely mapped to the genomes even if the genome sequences are known, alternative analytical methods are needed for the study of NGS data. Here we suggest using word patterns to analyze NGS data. Word pattern counting (the study of the probabilistic distribution of the number of occurrences of word patterns in one or multiple long sequences) has played an important role in molecular sequence analysis. However, no studies are available on the distribution of the number of occurrences of word patterns in NGS reads. In this article, we build probabilistic models for the background sequence and the sampling process of the sequence reads from the genome. Based on the models, we provide normal and compound Poisson approximations for the number of occurrences of word patterns from the sequence reads, with bounds on the approximation error. The main challenge is to consider the randomness in generating the long background sequence, as well as in the sampling of the reads using NGS. We show the accuracy of these approximations under a variety of conditions for different patterns with various characteristics. Under realistic assumptions, the compound Poisson approximation seems to outperform the normal approximation in most situations. These approximate distributions can be used to evaluate the statistical significance of the occurrence of patterns from NGS data. The theory and the computational algorithm for calculating the approximate distributions are then used to analyze ChIP-Seq data using transcription factor GABP. Software is available online (www-rcf.usc.edu/∼fsun/Programs/NGS_motif_power/NGS_motif_power.html). In addition, Supplementary Material can be found online (www.liebertonline.com/cmb).
    Journal of computational biology: a journal of computational molecular cell biology 06/2012; 19(6):839-54. · 1.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D*2 and D(s)2 showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D*2 and D(s)2 by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model.
    Journal of Theoretical Biology 06/2011; 284(1):106-16. · 2.35 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: The rapid accumulation of biological networks poses new challenges and calls for powerful integrative analysis tools. Most existing methods capable of simultaneously analyzing a large number of networks were primarily designed for unweighted networks, and cannot easily be extended to weighted networks. However, it is known that transforming weighted into unweighted networks by dichotomizing the edges of weighted networks with a threshold generally leads to information loss. We have developed a novel, tensor-based computational framework for mining recurrent heavy subgraphs in a large set of massive weighted networks. Specifically, we formulate the recurrent heavy subgraph identification problem as a heavy 3D subtensor discovery problem with sparse constraints. We describe an effective approach to solving this problem by designing a multi-stage, convex relaxation protocol, and a non-uniform edge sampling technique. We applied our method to 130 co-expression networks, and identified 11,394 recurrent heavy subgraphs, grouped into 2,810 families. We demonstrated that the identified subgraphs represent meaningful biological modules by validating against a large set of compiled biological knowledge bases. We also showed that the likelihood for a heavy subgraph to be meaningful increases significantly with its recurrence in multiple networks, highlighting the importance of the integrative approach to biological network analysis. Moreover, our approach based on weighted graphs detects many patterns that would be overlooked using unweighted graphs. In addition, we identified a large number of modules that occur predominately under specific phenotypes. This analysis resulted in a genome-wide mapping of gene network modules onto the phenome. Finally, by comparing module activities across many datasets, we discovered high-order dynamic cooperativeness in protein complex networks and transcriptional regulatory networks.
    PLoS Computational Biology 06/2011; 7(6):e1001106. · 4.87 Impact Factor
  • Source
    Lu Meng, Fengzhu Sun, Xuegong Zhang, Michael S Waterman
    [Show abstract] [Hide abstract]
    ABSTRACT: Sequence alignment depends on the scoring function that defines similarity between pairs of letters. For local alignment, the computational algorithm searches for the most similar segments in the sequences according to the scoring function. The choice of this scoring function is important for correctly detecting segments of interest. We formulate sequence alignment as a hypothesis testing problem, and conduct extensive simulation experiments to study the relationship between the scoring function and the distribution of aligned pairs within the aligned segment under this framework. We cut through the many ways to construct scoring functions and showed that any scoring function with negative expectation used in local alignment corresponds to a hypothesis test between the background distribution of sequence letters and a statistical distribution of letter pairs determined by the scoring function. The results indicate that the log-likelihood ratio scoring function is statistically most powerful and has the highest accuracy for detecting the segments of interest that are defined by the statistical distribution of aligned letter pairs.
    Journal of computational biology: a journal of computational molecular cell biology 05/2011; 18(5):677-91. · 1.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: A linear chord diagram canonically determines a fatgraph and hence has an associated genus $g$. We compute the natural generating function ${\bf C}_g(z)=\sum_{n\geq 0} {\bf c}_g(n)z^n$ for the number ${\bf c}_g(n)$ of linear chord diagrams of fixed genus $g\geq 1$ with a given number $n\geq 0$ of chords and find the remarkably simple formula ${\bf C}_g(z)=z^{2g}R_g(z) (1-4z)^{{1\over 2}-3g}$, where $R_g(z)$ is a polynomial of degree at most $g-1$ with integral coefficients satisfying $R_g({1\over 4})\neq 0$ and $R_g(0) = {\bf c}_g(2g)\neq 0.$ In particular, ${\bf C}_g(z)$ is algebraic over $\mathbb C(z)$, which generalizes the corresponding classical fact for the generating function ${\bf C}_0(z)$ of the Catalan numbers. As a corollary, we also calculate a related generating function germaine to the enumeration of knotted RNA secondary structures, which is again found to be algebraic.
    10/2010;
  • Source
    Lin Wan, Gesine Reinert, Fengzhu Sun, Michael S Waterman
    [Show abstract] [Hide abstract]
    ABSTRACT: Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy to use program. The power is assessed under two alternative hidden Markov models; the first one assumes that the two sequences share a common motif, whereas the second model is a pattern transfer model; the null model is that the two sequences are composed of independent and identically distributed letters and they are independent. Under the first alternative model, the means of the tuple counts in the individual sequences change, whereas under the second alternative model, the marginal means are the same as under the null model. Using the limit distributions of the count statistics under the null and the alternative models, we find that generally, asymptotically D2S has the largest power, followed by D2*, whereas the power of D2 can even be zero in some cases. In contrast, even for sequences of length 140,000 bp, in simulations D2* generally has the largest power. Under the first alternative model of a shared motif, the power of D2*approaches 100% when sufficiently many motifs are shared, and we recommend the use of D2* for such practical applications. Under the second alternative model of pattern transfer,the power for all three count statistics does not increase with sequence length when the sequence is sufficiently long, and hence none of the three statistics under consideration canbe recommended in such a situation. We illustrate the approach on 323 transcription factor binding motifs with length at most 10 from JASPAR CORE (October 12, 2009 version),verifying that D2* is generally more powerful than D2. The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb.
    Journal of computational biology: a journal of computational molecular cell biology 10/2010; 17(11):1467-90. · 1.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Variation in genome structure is an important source of human genetic polymorphism: It affects a large proportion of the genome and has a variety of phenotypic consequences relevant to health and disease. In spite of this, human genome structure variation is incompletely characterized due to a lack of approaches for discovering a broad range of structural variants in a global, comprehensive fashion. We addressed this gap with Optical Mapping, a high-throughput, high-resolution single-molecule system for studying genome structure. We used Optical Mapping to create genome-wide restriction maps of a complete hydatidiform mole and three lymphoblast-derived cell lines, and we validated the approach by demonstrating a strong concordance with existing methods. We also describe thousands of new variants with sizes ranging from kb to Mb.
    Proceedings of the National Academy of Sciences 06/2010; 107(24):10848-53. · 9.81 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: The identification of binding sites of transcription factors (TF) and other regulatory regions, referred to as motifs, located in a set of molecular sequences is of fundamental importance in genomic research. Many computational and experimental approaches have been developed to locate motifs. The set of sequences of interest can be concatenated to form a long sequence of length n. One of the successful approaches for motif discovery is to identify statistically over- or under-represented patterns in this long sequence. A pattern refers to a fixed word W over the alphabet. In the example of interest, W is a word in the set of patterns of the motif. Despite extensive studies on motif discovery, no studies have been carried out on the power of detecting statistically over- or under-represented patterns Here we address the issue of how the known presence of random instances of a known motif affects the power of detecting patterns, such as patterns within the motif. Let N(W)(n) be the number of possibly overlapping occurrences of a pattern W in the sequence that contains instances of a known motif; such a sequence is modeled here by a Hidden Markov Model (HMM). First, efficient computational methods for calculating the mean and variance of N(W)(n) are developed. Second, efficient computational methods for calculating parameters involved in the normal approximation of N(W)(n) for frequent patterns and compound Poisson approximation of N(W)(n) for rare patterns are developed. Third, an easy to use web program is developed to calculate the power of detecting patterns and the program is used to study the power of detection in several interesting biological examples.
    Journal of computational biology: a journal of computational molecular cell biology 04/2010; 17(4):581-92. · 1.69 Impact Factor
  • Source
    David C Schwartz, Michael S Waterman
    [Show abstract] [Hide abstract]
    ABSTRACT: New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands.
    Journal of Computer Science and Technology 01/2010; 25(1):3-9. · 0.48 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Complex human diseases are often caused by multiple mutations, each of which contributes only a minor effect to the disease phenotype. To study the basis for these complex phenotypes, we developed a network-based approach to identify coexpression modules specifically activated in particular phenotypes. We integrated these modules, protein-protein interaction data, Gene Ontology annotations, and our database of gene-phenotype associations derived from literature to predict novel human gene-phenotype associations. Our systematic predictions provide us with the opportunity to perform a global analysis of human gene pleiotropy and its underlying regulatory mechanisms. We applied this method to 338 microarray datasets, covering 178 phenotype classes, and identified 193,145 phenotype-specific coexpression modules. We trained random forest classifiers for each phenotype and predicted a total of 6,558 gene-phenotype associations. We showed that 40.9% genes are pleiotropic, highlighting that pleiotropy is more prevalent than previously expected. We collected 77 ChIP-chip datasets studying 69 transcription factors binding over 16,000 targets under various phenotypic conditions. Utilizing this unique data source, we confirmed that dynamic transcriptional regulation is an important force driving the formation of phenotype specific gene modules. We created a genome-wide gene to phenotype mapping that has many potential implications, including providing potential new drug targets and uncovering the basis for human disease phenotypes. Our analysis of these phenotype-specific coexpression modules reveals a high prevalence of gene pleiotropy, and suggests that phenotype-specific transcription factor binding may contribute to phenotypic diversity. All resources from our study are made freely available on our online Phenotype Prediction Database.
    BMC Bioinformatics 01/2010; 11 Suppl 1:S62. · 3.02 Impact Factor
  • Source
    Gesine Reinert, David Chew, Fengzhu Sun, Michael S Waterman
    [Show abstract] [Hide abstract]
    ABSTRACT: Large-scale comparison of the similarities between two biological sequences is a major issue in computational biology; a fast method, the D(2) statistic, relies on the comparison of the k-tuple content for both sequences. Although it has been known for some years that the D(2) statistic is not suitable for this task, as it tends to be dominated by single-sequence noise, to date no suitable adjustments have been proposed. In this article, we suggest two new variants of the D(2) word count statistic, which we call D(2)(S) and D(2)(*). For D(2)(S), which is a self-standardized statistic, we show that the statistic is asymptotically normally distributed, when sequence lengths tend to infinity, and not dominated by the noise in the individual sequences. The second statistic, D(2)(*), outperforms D(2)(S) in terms of power for detecting the relatedness between the two sequences in our examples; but although it is straightforward to simulate from the asymptotic distribution of D(2)(*), we cannot provide a closed form for power calculations.
    Journal of computational biology: a journal of computational molecular cell biology 12/2009; 16(12):1615-34. · 1.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
    Science 11/2009; 326(5956):1112-1115. · 31.20 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
    Science 11/2009; 326(5956):1112-1115. · 31.20 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
    Science 11/2009; 326(5956):1112-5. · 31.20 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/ approximately 23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/ approximately 2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.
    PLoS Genetics 11/2009; 5(11):e1000711. · 8.52 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Although many studies have been successful in the discovery of cooperating groups of genes, mapping these groups to phenotypes has proved a much more challenging task. In this article, we present the first genome-wide mapping of gene coexpression modules onto the phenome. We annotated coexpression networks from 136 microarray datasets with phenotypes from the Unified Medical Language System (UMLS). We then designed an efficient graph-based simulated annealing approach to identify coexpression modules frequently and specifically occurring in datasets related to individual phenotypes. By requiring phenotype-specific recurrence, we ensure the robustness of our findings. We discovered 118,772 modules specific to 42 phenotypes, and developed validation tests combining Gene Ontology, GeneRIF and UMLS. Our method is generally applicable to any kind of abundant network data with defined phenotype association, and thus paves the way for genome-wide, gene network-phenotype maps.
    Journal of computational biology: a journal of computational molecular cell biology 08/2009; 16(8):1023-34. · 1.69 Impact Factor

Publication Stats

6k Citations
533.92 Total Impact Points

Institutions

  • 1983–2014
    • University of Southern California
      • • Division of Molecular and Computational Biology
      • • Department of Biological Sciences
      • • Department of Mathematics
      Los Angeles, California, United States
  • 2012
    • Aarhus University
      • Centre for Quantum Geometry of Moduli Spaces QGM
      Aars, Region North Jutland, Denmark
  • 2010–2012
    • Shandong University
      • School of Mathematics and Statistics
      Jinan, Shandong Sheng, China
  • 2011
    • South China University of Technology
      Shengcheng, Guangdong, China
  • 2009–2011
    • Tsinghua University
      • Department of Automation
      Beijing, Beijing Shi, China
    • Harvard Medical School
      • Department of Genetics
      Boston, MA, United States
    • Iowa State University
      Ames, Iowa, United States
  • 2009–2010
    • University of Wisconsin, Madison
      • Department of Chemistry
      Madison, MS, United States
  • 2007–2009
    • University of Oxford
      • Department of Statistics
      Oxford, ENG, United Kingdom
    • Yonsei University
      • Department of Computer Science
      Seoul, Seoul, South Korea
  • 1985–2007
    • University of California, Los Angeles
      • Department of Mathematics
      Los Angeles, California, United States
    • The Rockefeller University
      New York City, New York, United States
  • 2005
    • University of Alabama at Birmingham
      • Department of Biostatistics
      Birmingham, AL, United States
  • 2002
    • Celera
      Alameda, California, United States
  • 2001
    • University of California, San Diego
      • Department of Computer Science and Engineering (CSE)
      San Diego, CA, United States
  • 1997–2001
    • Emory University
      Atlanta, Georgia, United States
  • 1998
    • National Institutes of Health
      Maryland, United States
  • 1995
    • Pennsylvania State University
      • Department of Electrical Engineering
      University Park, MD, United States
  • 1994
    • The University of Memphis
      Memphis, Tennessee, United States
  • 1992
    • Whitehead Institute for Biomedical Research
      Cambridge, Massachusetts, United States
    • Cornell University
      Ithaca, New York, United States
  • 1991
    • California Institute of Technology
      • Jet Propulsion Laboratory
      Pasadena, California, United States
  • 1990
    • Lawrence Livermore National Laboratory
      Livermore, California, United States
  • 1986
    • Harvard University
      Cambridge, Massachusetts, United States
  • 1981
    • Northern Michigan University
      Marquette, Michigan, United States