Lucila Ohno-Machado

University of California, San Diego, San Diego, California, United States

Publications (265) · 491.17 Total impact

  • Yuan Wu · Xiaoqian Jiang · Shuang Wang · Wenchao Jiang · Pinghao Li · Lucila Ohno-Machado
    ABSTRACT: Multi-category response models are very important complements to binary logistic models in medical decision-making. Decomposing model construction by aggregating computation developed at different sites is necessary when data cannot be moved outside institutions due to privacy or other concerns. Such decomposition makes it possible to conduct grid computing to protect the privacy of individual observations. This paper proposes two grid multi-category response models for ordinal and multinomial logistic regressions. Grid computation to test model assumptions is also developed for these two types of models. In addition, we present grid methods for goodness-of-fit assessment and for classification performance evaluation. Simulation results show that the grid models produce the same results as those obtained from corresponding centralized models, demonstrating that it is possible to build models using multi-center data without losing accuracy or transmitting observation-level data. Two real data sets are used to evaluate the performance of our proposed grid models. The grid fitting method offers a practical solution for resolving privacy and other issues caused by pooling all data in a central site. The proposed method is applicable for various likelihood estimation problems, including other generalized linear models.
    BMC Medical Informatics and Decision Making 12/2015; 15(1). DOI:10.1186/s12911-015-0133-y · 1.50 Impact Factor
  • Chia-Lun Lu · Shuang Wang · Zhanglong Ji · Yuan Wu · Li Xiong · Xiaoqian Jiang · Lucila Ohno-Machado
    ABSTRACT: The Cox proportional hazards model is a widely used method for analyzing survival data. Achieving sufficient statistical power in a survival analysis usually requires a large amount of data. Data sharing across institutions could be a potential workaround for providing this added power. The authors develop a web service for distributed Cox model learning (WebDISCO), which focuses on the proof-of-concept and algorithm development for federated survival analysis. The sensitive patient-level data can be processed locally and only the less-sensitive intermediate statistics are exchanged to build a global Cox model. Mathematical derivation shows that the proposed distributed algorithm is identical to the centralized Cox model. The authors evaluated the proposed framework at the University of California, San Diego (UCSD), Emory, and Duke. The experimental results show that both distributed and centralized models result in near-identical model coefficients with differences in the range [Formula: see text] to [Formula: see text]. The results confirm the mathematical derivation and show that the implementation of the distributed model can achieve the same results as the centralized implementation. The proposed method serves as a proof of concept, in which a publicly available dataset was used to evaluate the performance. The authors do not intend to suggest that this method can resolve policy and engineering issues related to the federated use of institutional data, but the results should serve as evidence of the technical feasibility of the proposed approach. Conclusions WebDISCO (Web-based Distributed Cox Regression Model) provides a proof-of-concept web service that implements a distributed algorithm to conduct distributed survival analysis without sharing patient level data. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
    Journal of the American Medical Informatics Association 07/2015; DOI:10.1093/jamia/ocv083 · 3.93 Impact Factor
  • ABSTRACT: Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner. The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding additional important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network. The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws. Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
    Journal of the American Medical Informatics Association 07/2015; DOI:10.1093/jamia/ocv017 · 3.93 Impact Factor
  • Katherine K Kim · Jill G Joseph · Lucila Ohno-Machado
    ABSTRACT: New models of healthcare delivery such as accountable care organizations and patient-centered medical homes seek to improve quality, access, and cost. They rely on a robust, secure technology infrastructure provided by health information exchanges (HIEs) and distributed research networks and the willingness of patients to share their data. There are few large, in-depth studies of US consumers’ views on privacy, security, and consent in electronic data sharing for healthcare and research together. Objective This paper addresses this gap, reporting on a survey which asks about California consumers’ views of data sharing for healthcare and research together. Materials and Methods The survey conducted was a representative, random-digit dial telephone survey of 800 Californians, performed in Spanish and English. Results There is a great deal of concern that HIEs will worsen privacy (40.3%) and security (42.5%). Consumers are in favor of electronic data sharing but elements of transparency are important: individual control, who has access, and the purpose for use of data. Respondents were more likely to agree to share deidentified information for research than to share identified information for healthcare (76.2% vs 57.3%, p < .001). Discussion While consumers show willingness to share health information electronically, they value individual control and privacy. Responsiveness to these needs, rather than mere reliance on Health Insurance Portability and Accountability Act (HIPAA), may improve support of data networks. Conclusion Responsiveness to the public’s concerns regarding their health information is a pre-requisite for patient-centeredness. This is one of the first in-depth studies of attitudes about electronic data sharing that compares attitudes of the same individual towards healthcare and research.
    Journal of the American Medical Informatics Association 03/2015; DOI:10.1093/jamia/ocv014 · 3.93 Impact Factor
  • PLoS ONE 03/2015; 10(3):e0121507. DOI:10.1371/journal.pone.0121507 · 3.23 Impact Factor
  • Xiaoqian Jiang · Yuan Wu · Keith Marsolo · Lucila Ohno-Machado
    ABSTRACT: We describe functional specifications and practicalities in the software development process for a web service that allows the construction of the multivariate logistic regression model, Grid Logistic Regression (GLORE), by aggregating partial estimates from distributed sites, with no exchange of patient-level data. We recently developed and published a web service for model construction and data analysis in a distributed environment. This recent paper provided an overview of the system that is useful for users, but included very few details that are relevant for biomedical informatics developers or network security personnel who may be interested in implementing this or similar systems. We focus here on how the system was conceived and implemented. We followed a two-stage development approach by first implementing the backbone system and incrementally improving the user experience through interactions with potential users during the development. Our system went through various stages such as concept proof, algorithm validation, user interface development, and system testing. We used the Zoho Project management system to track tasks and milestones. We leveraged Google Code and Apache Subversion to share code among team members, and developed an applet-servlet architecture to support the cross platform deployment. During the development process, we encountered challenges such as Information Technology (IT) infrastructure gaps and limited team experience in user-interface design. We figured out solutions as well as enabling factors to support the translation of an innovative privacy-preserving, distributed modeling technology into a working prototype. Using GLORE (a distributed model that we developed earlier) as a pilot example, we demonstrated the feasibility of building and integrating distributed modeling technology into a usable framework that can support privacy-preserving, distributed data analysis among researchers at geographically dispersed institutes.
    12/2014; 2(1):1053. DOI:10.13063/2327-9214.1053
  • ABSTRACT: To answer the need for the rigorous protection of biomedical data, we organized the Critical Assessment of Data Privacy and Protection initiative as a community effort to evaluate privacy-preserving dissemination techniques for biomedical data. We focused on the challenge of sharing aggregate human genomic data (e.g., allele frequencies) in a way that preserves the privacy of the data donors, without undermining the utility of genome-wide association studies (GWAS) or impeding their dissemination. Specifically, we designed two problems for disseminating the raw data and the analysis outcome, respectively, based on publicly available data from HapMap and from the Personal Genome Project. A total of six teams participated in the challenges. The final results were presented at a workshop of the iDASH (integrating Data for Analysis, 'anonymization,' and SHaring) National Center for Biomedical Computing. We report the results of the challenge and our findings about the current genome privacy protection techniques.
    BMC Medical Informatics and Decision Making 12/2014; 14(Suppl 1):S1. DOI:10.1186/1472-6947-14-S1-S1 · 1.50 Impact Factor
  • Lucila Ohno-Machado
    Journal of the American Medical Informatics Association 11/2014; 21(6):954-6. DOI:10.1136/amiajnl-2014-NovEditorial · 3.93 Impact Factor
  • Yongan Zhao · Xiaofeng Wang · Xiaoqian Jiang · Lucila Ohno-Machado · Haixu Tang
    ABSTRACT: Objective To propose a new approach to privacy preserving data selection, which helps the data users access human genomic datasets efficiently without undermining patients’ privacy. Methods Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data user can test-run arbitrary association-test algorithms, including those not known to the data owner a priori. We developed a suite of new techniques, including a pilot-data generation approach that leverages the linkage disequilibrium in the human genome to preserve both the utility of the data and the privacy of the patients, and a utility evaluation method that helps the user assess the value of the real data from its pilot version with high confidence. Results We evaluated our approach on real human genomic data using four popular association tests. Our study shows that the proposed approach can help data users make the right choices in most cases. Conclusions Even though the pilot data cannot be directly used for scientific discovery, it provides a useful indication of which datasets are more likely to be useful to data users, who can therefore approach the appropriate data owners to gain access to the data.
    Journal of the American Medical Informatics Association 10/2014; DOI:10.1136/amiajnl-2014-003043 · 3.93 Impact Factor
  • Petra Stepanowsky · Eric Levy · Jihoon Kim · Xiaoqian Jiang · Lucila Ohno-Machado
    ABSTRACT: MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate gene expression through base pairing with messenger RNAs. Due to the interest in studying miRNA dysregulation in disease and limits of validated miRNA references, identification of novel miRNAs is a critical task. The performance of different models to predict novel miRNAs varies with the features chosen as predictors. However, no study has systematically compared published feature sets. We constructed a comprehensive feature set using the minimum free energy of the secondary structure of precursor miRNAs, a set of nucleotide-structure triplets, and additional extracted sequence and structure characteristics. We then compared the predictive value of our comprehensive feature set to those from three previously published studies, using logistic regression and random forest classifiers. We found that classifiers containing as few as seven highly predictive features are able to predict novel precursor miRNAs as well as classifiers that use larger feature sets. In a real data set, our method correctly identified the holdout miRNAs relevant to renal cancer.
    Cancer informatics 10/2014; 13(Suppl 1):95-102. DOI:10.4137/CIN.S13877
  • ABSTRACT: MicroRNAs (miRNAs) are a class of small (∼22 nucleotides) non-coding RNAs that post-transcriptionally regulate gene expression by interacting with target mRNAs. A majority of miRNAs is located within intronic or exonic regions of protein-coding genes (host genes), and increasing evidence suggests a functional relationship between these miRNAs and their host genes. Here, we introduce miRIAD, a web-service to facilitate the analysis of genomic and structural features of intragenic miRNAs and their host genes for five species (human, rhesus monkey, mouse, chicken and opossum). miRIAD contains the genomic classification of all miRNAs (inter- and intragenic), as well as classification of all protein-coding genes into host or non-host genes (depending on whether they contain an intragenic miRNA or not). We collected and processed public data from several sources to provide a clear visualization of relevant knowledge related to intragenic miRNAs, such as host gene function, genomic context, names of and references to intragenic miRNAs, miRNA binding sites, clusters of intragenic miRNAs, miRNA and host gene expression across different tissues and expression correlation for intragenic miRNAs and their host genes. Protein–protein interaction data are also presented for functional network analysis of host genes. In summary, miRIAD was designed to help the research community to explore, in a user-friendly environment, intragenic miRNAs, their host genes and functional annotations with minimal effort, facilitating hypothesis generation and in-silico validations. Database URL:
    Database The Journal of Biological Databases and Curation 10/2014; 2014. DOI:10.1093/database/bau099 · 4.46 Impact Factor
  • David W Bates · Suchi Saria · Lucila Ohno-Machado · Anand Shah · Gabriel Escobar
    ABSTRACT: The US health care system is rapidly adopting electronic health records, which will dramatically increase the quantity of clinical data that are available electronically. Simultaneously, rapid progress has been made in clinical analytics (techniques for analyzing large quantities of data and gleaning new insights from that analysis), which is part of what is known as big data. As a result, there are unprecedented opportunities to use big data to reduce the costs of health care in the United States. We present six use cases (that is, key examples) where some of the clearest opportunities exist to reduce costs through the use of big data: high-cost patients, readmissions, triage, decompensation (when a patient's condition worsens), adverse events, and treatment optimization for diseases affecting multiple organ systems. We discuss the types of insights that are likely to emerge from clinical analytics, the types of data needed to obtain such insights, and the infrastructure (analytics, algorithms, registries, assessment scores, monitoring devices, and so forth) that organizations will need to perform the necessary analyses and to implement changes that will improve care while reducing costs. Our findings have policy implications for regulatory oversight, ways to address privacy concerns, and the support of research on analytics.
    Health Affairs 07/2014; 33(7):1123-31. DOI:10.1377/hlthaff.2014.0041 · 4.32 Impact Factor
  • Haoran Li · Li Xiong · Lucila Ohno-Machado · Xiaoqian Jiang
    ABSTRACT: Data sharing is challenging but important for healthcare research. Methods for privacy-preserving data dissemination based on the rigorous differential privacy standard have been developed but they did not consider the characteristics of biomedical data and make full use of the available information. This often results in too much noise in the final outputs. We hypothesized that this situation can be alleviated by leveraging a small portion of open-consented data to improve utility without sacrificing privacy. We developed a hybrid privacy-preserving differentially private support vector machine (SVM) model that uses public data and private data together. Our model leverages the RBF kernel and can handle nonlinearly separable cases. Experiments showed that this approach outperforms two baselines: (1) SVMs that only use public data, and (2) differentially private SVMs that are built from private data. Our method demonstrated very close performance metrics compared to nonprivate SVMs trained on the private data.
    06/2014; 2014:1-10. DOI:10.1155/2014/827371
  • ABSTRACT: Summary: MAGI is a web service for fast MicroRNA-Seq data analysis in a graphics processing unit (GPU) infrastructure. Using just a browser, users have access to results as web reports in just a few hours, a >600% end-to-end performance improvement over state of the art. MAGI’s salient features are (i) transfer of large input files in native FASTA with Qualities (FASTQ) format through drag-and-drop operations, (ii) rapid prediction of microRNA target genes leveraging parallel computing with GPU devices, (iii) all-in-one analytics with novel feature extraction, statistical test for differential expression and diagnostic plot generation for quality control and (iv) interactive visualization and exploration of results in web reports that are readily available for publication. Availability and implementation: MAGI relies on the Node.js JavaScript framework, along with NVIDIA CUDA C, PHP: Hypertext Preprocessor (PHP), Perl and R. It is freely available at j5kim@ucsd.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
    Bioinformatics 06/2014; 30(19). DOI:10.1093/bioinformatics/btu377 · 4.62 Impact Factor
  • Zhanglong Ji · Xiaoqian Jiang · Shuang Wang · Li Xiong · Lucila Ohno-Machado
    ABSTRACT: Background Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.
    BMC Medical Genomics 05/2014; 7(Suppl 1):S14. DOI:10.1186/1755-8794-7-S1-S14 · 3.91 Impact Factor
  • Shuang Wang · Jihoon Kim · Xiaoqian Jiang · Stefan F Brunner · Lucila Ohno-Machado
    ABSTRACT: Background Non-coding sequences such as microRNAs have important roles in disease processes. Computational microRNA target identification (CMTI) is becoming increasingly important since traditional experimental methods for target identification pose many difficulties. These methods are time-consuming, costly, and often need guidance from computational methods to narrow down candidate genes anyway. However, most CMTI methods are computationally demanding, since they need to handle not only several million query microRNA and reference RNA pairs, but also several million nucleotide comparisons within each given pair. Thus, the need to perform microRNA identification at such large scale has increased the demand for parallel computing. Methods Although most CMTI programs (e.g., the miRanda algorithm) are based on a modified Smith-Waterman (SW) algorithm, the existing parallel SW implementations (e.g., CUDASW++ 2.0/3.0, SWIPE) are unable to meet this demand in CMTI tasks. We present CUDA-miRanda, a fast microRNA target identification algorithm that takes advantage of massively parallel computing on Graphics Processing Units (GPU) using NVIDIA's Compute Unified Device Architecture (CUDA). CUDA-miRanda specifically focuses on the local alignment of short (i.e., ≤ 32 nucleotides) sequences against longer reference sequences (e.g., 20K nucleotides). Moreover, the proposed algorithm is able to report multiple alignments (up to 191 top scores) and the corresponding traceback sequences for any given (query sequence, reference sequence) pair. Results Speeds over 5.36 Giga Cell Updates Per Second (GCUPs) are achieved on a server with 4 NVIDIA Tesla M2090 GPUs. Compared to the original miRanda algorithm, which is evaluated on an Intel Xeon E5620@2.4 GHz CPU, the experimental results show up to 166 times performance gains in terms of execution time. In addition, we have verified that the exact same targets were predicted in both CUDA-miRanda and the original miRanda implementations through multiple test datasets. Conclusions We offer a GPU-based alternative to high performance compute (HPC) that can be developed locally at a relatively small cost. The community of GPU developers in the biomedical research community, particularly for genome analysis, is still growing. With increasing shared resources, this community will be able to advance CMTI in a very significant manner. Our source code is available at
    BMC Medical Genomics 05/2014; 7(Suppl 1):S9. DOI:10.1186/1755-8794-7-S1-S9 · 3.91 Impact Factor
  • ABSTRACT: This article describes the patient-centered Scalable National Network for Effectiveness Research (pSCANNER), which is part of the recently formed PCORnet, a national network composed of learning healthcare systems and patient-powered research networks funded by the Patient Centered Outcomes Research Institute (PCORI). It is designed to be a stakeholder-governed federated network that uses a distributed architecture to integrate data from three existing networks covering over 21 million patients in all 50 states: (1) VA Informatics and Computing Infrastructure (VINCI), with data from Veteran Health Administration's 151 inpatient and 909 ambulatory care and community-based outpatient clinics; (2) the University of California Research exchange (UC-ReX) network, with data from UC Davis, Irvine, Los Angeles, San Francisco, and San Diego; and (3) SCANNER, a consortium of UCSD, Tennessee VA, and three federally qualified health systems in the Los Angeles area supplemented with claims and health information exchange data, led by the University of Southern California. Initial use cases will focus on three conditions: (1) congestive heart failure; (2) Kawasaki disease; (3) obesity. Stakeholders, such as patients, clinicians, and health service researchers, will be engaged to prioritize research questions to be answered through the network. We will use a privacy-preserving distributed computation model with synchronous and asynchronous modes. The distributed system will be based on a common data model that allows the construction and evaluation of distributed multivariate models for a variety of statistical analyses.
    Journal of the American Medical Informatics Association 04/2014; 21(4). DOI:10.1136/amiajnl-2014-002751 · 3.93 Impact Factor
  • ABSTRACT: Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility’s security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in a record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, in this paper, we propose a collaborative filtering inspired approach to predicting inappropriate accesses. Our solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized “fingerprint” based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, shows not only significantly improved performance compared to existing methods, but also provides insights as to what indicates an inappropriate access.
    Machine Learning 04/2014; 95(1). DOI:10.1007/s10994-013-5376-1 · 1.69 Impact Factor
  • Son Doan · Mike Conway · Tu Minh Phuong · Lucila Ohno-Machado
    ABSTRACT: In modern electronic medical records (EMR) much of the clinically important data - signs and symptoms, symptom severity, disease status, etc. - are not provided in structured data fields, but rather are encoded in clinician generated narrative text. Natural language processing (NLP) provides a means of "unlocking" this important data source for applications in clinical decision support, quality assurance, and public health. This chapter provides an overview of representative NLP systems in biomedicine based on a unified architectural view. A general architecture in an NLP system consists of two main components: background knowledge that includes biomedical knowledge resources and a framework that integrates NLP tools to process text. Systems differ in both components, which we will review briefly. Additionally, challenges facing current research efforts in biomedical NLP include the paucity of large, publicly available annotated corpora, although initiatives that facilitate data sharing, system evaluation, and collaborative work between researchers in clinical NLP are starting to emerge.
    Methods in molecular biology (Clifton, N.J.) 01/2014; 1168. DOI:10.1007/978-1-4939-0847-9_16 · 1.29 Impact Factor
  • Elizabeth A Bell · Lucila Ohno-Machado · M Adela Grando
    ABSTRACT: We interviewed 70 healthy volunteers to understand their choices about how the information in their health record should be shared for research. Twenty-eight survey questions captured individual preferences of healthy volunteers. The results showed that respondents felt comfortable participating in research if they were given choices about which portions of their medical data would be shared, and with whom those data would be shared. Respondents indicated a strong preference towards controlling access to specific data (83%), and a large proportion (68%) indicated concern about the possibility of their data being used by for-profit entities. The results suggest that transparency in the process of sharing is an important factor in the decision to share clinical data for research.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 01/2014; 2014:1699-708.

Publication Stats

5k Citations
491.17 Total Impact Points


  • 2009–2015
    • University of California, San Diego
      • Department of Medicine
      San Diego, California, United States
  • 2014
    • Duke University
      Durham, North Carolina, United States
  • 2013
    • Toyota Technological Institute at Chicago
      Chicago, Illinois, United States
  • 1993–2013
    • Stanford Medicine
      • Department of Pediatrics
      Stanford, California, United States
  • 2012
    • University of Pittsburgh
      Pittsburgh, Pennsylvania, United States
  • 1996–2012
    • Harvard Medical School
      • Department of Radiology
      • Division of Emergency Medicine
      • Department of Genetics
      Boston, Massachusetts, United States
  • 2010
    • The University of Hong Kong
      Hong Kong, Hong Kong
  • 2001–2009
    • Harvard University
      Cambridge, Massachusetts, United States
  • 2001–2008
    • Massachusetts Institute of Technology
      • Computer Science and Artificial Intelligence Laboratory
      • Division of Health Sciences and Technology
      Cambridge, Massachusetts, United States
  • 1996–2008
    • Brigham and Women's Hospital
      • Decision Systems Group
      • Department of Radiology
      Boston, Massachusetts, United States
  • 2005–2006
    • Universidade Federal de São Paulo
      São Paulo, São Paulo, Brazil
  • 2003–2005
    • Teikyo University Hospital
      Edo, Tōkyō, Japan
  • 2004
    • Partners HealthCare
      Boston, Massachusetts, United States
    • University of Massachusetts Boston
      Boston, Massachusetts, United States
  • 2002
    • Federal University of Rio de Janeiro
      Rio de Janeiro, Rio de Janeiro, Brazil
    • University of Hertfordshire
      • School of Computer Science
      Hatfield, England, United Kingdom
  • 1999–2002
    • Stanford University
      Palo Alto, California, United States
  • 2000
    • Consorcio Hospital General Universitario de Valencia
      • Departamento de Cardiología
      Valencia, Valencia, Spain
  • 1997
    • University of Southern California
      • Department of Chemical Engineering and Materials Science
      Los Angeles, California, United States