Stephane M Meystre
University of Applied Sciences and Arts of Southern Switzerland (SUPSI) · Institute of Digital Technologies for Personalised Healthcare (MeDiTech)

MD, PhD, FACMI, FIAHSI, FAMIA

About

177
Publications
29,632
Reads
3,720
Citations
Introduction
I am a physician and medical informaticist with interests and background in applications of “artificial intelligence” and more specifically machine learning and natural language processing (NLP) to support more effective clinical care and enable reuse of existing (unstructured and structured) clinical information for research applications, all while addressing responsible AI requirements (privacy-preserving, ethical, reproducible, explainable, efficient, secure, unbiased).
Additional affiliations
January 2022 - September 2023
OnePlanet Research Center
Position
  • Scientific Director
July 2020 - December 2021
Medical University of South Carolina
Position
  • Professor (Full)
August 2016 - June 2020
Medical University of South Carolina
Position
  • Professor (Associate)
Education
September 2002 - May 2005
University of Utah
Field of study
  • Medical Informatics
September 2001 - June 2002
University of California, Davis
Field of study
  • Medical Informatics
September 1992 - December 1998
University of Lausanne
Field of study
  • Medicine

Publications (177)
Article
Full-text available
Objective This paper describes a new congestive heart failure (CHF) treatment performance measure information extraction system – CHIEF – developed as part of the Automated Data Acquisition for Heart Failure project, a Veterans Health Administration project aiming at improving the detection of patients not receiving recommended care for CHF. Design...
Article
Full-text available
The adoption of Electronic Health Records is growing at a fast pace, and this growth results in very large quantities of patient clinical information becoming available in electronic format, with tremendous potentials, but also equally growing concern for patient confidentiality breaches. De-identification of patient information has been proposed a...
Article
Full-text available
Clinical text de-identification can potentially overlap with clinical information such as medical problems or treatments, therefore causing this information to be lost. In this study, we focused on the analysis of the overlap between the 2010 i2b2 NLP challenge concept annotations, with the PHI annotations of our best-of-breed clinical text de-iden...
Article
Full-text available
In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Institutional Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be consid...
Article
Full-text available
In this study, we evaluate the performance of a Natural Language Processing (NLP) application designed to extract medical problems from narrative text clinical documents. The documents come from a patient's electronic medical record and medical problems are proposed for inclusion in the patient's electronic problem list. This application has been d...
Conference Paper
Full-text available
Large language models have existed only for half a decade, but their recent dramatic evolution has made impressive performance and promises broadly visible to a large public and lifted the veil on numerous issues and potential serious problems. A leap in popularity happened when OpenAI opened access to a new chatbot, ChatGPT, in November 2022. It off...
Article
Background Clinical natural language processing (NLP) researchers need access to directly comparable evaluation results for applications such as text deidentification across a range of corpus types and the means to easily test new systems or corpora within the same framework. Current systems, reported metrics, and the personally identifiable inform...
Article
Objectives Generative large language models (LLMs) are a subset of transformer-based neural network architecture models. LLMs have successfully leveraged a combination of an increased number of parameters, improvements in computational efficiency, and large pre-training datasets to perform a wide spectrum of natural language processing (NLP) tasks...
Article
Full-text available
Clinical data de-identification offers patient data privacy protection and eases reuse of clinical data. As an open-source solution to de-identify unstructured clinical text with high accuracy, CliniDeID applies an ensemble method combining deep and shallow machine learning with rule-based algorithms. It reached high recall and precision when recen...
Preprint
BACKGROUND Clinical Natural Language Processing (NLP) researchers need access to directly comparable evaluation results for applications such as text de-identification across a range of corpus types and the means to easily test new systems or corpora within the same framework. Current systems, reported metrics, and personally identifiable informati...
Conference Paper
Introduction The increased use and adoption of Electronic Health Records (EHR), and parallel growth in patient data available for secondary use by clinicians, researchers, and operational purposes, all cause patient data privacy protection to become an increasingly important requirement and expectation. The laws protecting patient privacy and confi...
Chapter
Clinical research, being patient-oriented, is based predominantly on clinical data—symptoms reported by patients, observations of patients made by healthcare providers, radiological images, and various metrics, including laboratory measurements that reflect physiological functions. Recently, however, a new type of data—genes and their products—has...
Conference Paper
Full-text available
The automatic de-identification of clinical narrative text offers efficient patient data privacy protection and eases reuse of clinical data. CliniDeID applies an ensemble method combining deep and shallow machine learning with rule-based algorithms and is released as free and open-source solution to de-identify unstructured clinical text with high...
Article
Full-text available
More than 40% of the adult population suffers from functional gastrointestinal disorders, now considered disorders of the "gut-brain axis" (GBA) interactions, a very complex bidirectional neural, endocrine, immune, and humoral communication system modulated by the microbiota. To help discover, understand, and manage GBA disorders, the OnePlanet res...
Article
Full-text available
Background To advance new therapies into clinical care, clinical trials must recruit enough participants. Yet, many trials fail to do so, leading to delays, early trial termination, and wasted resources. Under-enrolling trials make it impossible to draw conclusions about the efficacy of new therapies. An oft-cited reason for insufficient enrollment...
Conference Paper
Full-text available
Introduction: Multi-classifier ensemble methods are a means for combining multiple classifiers into a stronger meta-classifier. Past research lacks re-usable implementations of post-hoc ensemble generation tools, the subclass of methods focused on model composition. We hypothesize that efficient post-hoc ensemble generation will increase model reusa...
Chapter
Full-text available
A new natural language processing (NLP) application for COVID-19 related information extraction from clinical text notes is being developed as part of our pandemic response efforts. This NLP application called DECOVRI (Data Extraction for COVID-19 Related Information) will be released as a free and open source tool to convert unstructured notes int...
Chapter
Full-text available
We report on the performance evaluation of machine learning (ML) and Natural Language Processing (NLP) based section header classification. The section header classification task was performed as a two-pass system. The first pass detects a section header while the second pass classifies it. Recall, precision, and F1-measure metrics were reported...
Preprint
Full-text available
Background: To advance new therapies into clinical care, clinical trials must recruit enough participants. Yet, many trials fail to do so, leading to delays, early trial termination, and wasted resources. Under-enrolling trials make it impossible to draw conclusions about the efficacy of new therapies. An oft-cited reason for insufficient enrollmen...
Article
Full-text available
Objective The COVID-19 pandemic response at MUSC included virtual care visits for patients with suspected SARS-CoV-2 infection. The telehealth system used for these visits only exports a text note to integrate with the EHR, but structured and coded information about COVID-19 (e.g., exposure, risk factors, symptoms) was needed to support clinical ca...
Article
Full-text available
Objective: The COVID-19 pandemic response at MUSC included virtual care visits for patients with suspected SARS-CoV-2 infection. The telehealth system used for these visits only exports a text note to integrate with the EHR, but structured and coded information about COVID-19 (e.g., exposure, risk factors, symptoms) was needed to support clinical c...
Article
Full-text available
Importance The National COVID Cohort Collaborative (N3C) is a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative COVID-19 cohort to date. This multicenter data set can support robust evidence-based development of predictive and diagnostic tools and inform clinical care and policy....
Article
Full-text available
Background Family history information is important to assess the risk of inherited medical conditions. Natural language processing has the potential to extract this information from unstructured free-text notes to improve patient care and decision making. We describe the end-to-end information extraction system the Medical University of South Carol...
Article
Full-text available
De-identification of electronic health record narratives is a fundamental task applying natural language processing to better protect patient information privacy. We explore different types of ensemble learning methods to improve clinical text de-identification. We present two ensemble-based approaches for combining multiple predictive models. The fi...
Preprint
BACKGROUND Family history information is important to assess the risk of inherited medical conditions. Natural language processing has the potential to extract this information from unstructured free-text notes to improve patient care and decision-making. We describe the end-to-end information extraction system the Medical University of South Carol...
Article
Full-text available
Background: COVID-19 challenges and needs required health systems to rapidly redesign the delivery of care. Objective: To describe our approach in using health information technology to provide a continuum of services during the COVID-19 pandemic. Materials and methods: Our health system deployed four COVID-19 telehealth programs, and four bio...
Article
Full-text available
A growing quantity of health data is being stored in Electronic Health Records (EHR). The free-text section of these clinical notes contains important patient and treatment information for research but also contains Personally Identifiable Information (PII), which cannot be freely shared within the research community without compromising patient co...
Article
Full-text available
We sought to evaluate the context of potential implementation of an automated quality measurement system for inpatients with heart failure in the U.S. Department of Veterans Affairs (VA). The research methodology was guided by the Promoting Action on Research Implementation in Health Sciences (PARIHS) framework and the sociotechnical model of healt...
Article
Full-text available
Objective: In an effort to improve the efficiency of computer algorithms applied to screening for COVID-19 testing, we used natural language processing (NLP) and artificial intelligence (AI)-based methods with unstructured patient data collected through telehealth visits. Methods: After segmenting and parsing documents, we conducted analysis of...
Conference Paper
A growing quantity of health data is being stored in Electronic Health Records (EHR). The free-text section of these clinical notes contains important patient and treatment information for research but also contains Personally Identifiable Information (PII), which cannot be freely shared within the research community without compromising patient...
Article
Objective: Accurate and complete information about medications and related information is crucial for effective clinical decision support and precise health care. Recognition and reduction of adverse drug events is also central to effective patient care. The goal of this research is the development of a natural language processing (NLP) system to...
Conference Paper
Clinical concept normalization consists in associating a phrase identified as a clinical concept with a concept found in a standard medical terminology. As defined for the 2019 National NLP Clinical Challenge (n2c2) third track, the meaning of a given medical concept mentioned in some clinical narrative text must be determined by assigning a concep...
Article
This study focuses on the extraction of medical problems mentioned in electronic health records to support disease management. We experimented with a variety of information extraction methods based on rules, on knowledge bases, and on machine learning, and combined them in an ensemble method approach. A new dataset drawn from cancer patient medical r...
Article
Full-text available
Automated extraction of patient trial eligibility for clinical research studies can increase enrollment at a decreased time and money cost. We have developed a modular trial eligibility pipeline including patient-batched processing and an internal webservice backed by a uimaFIT pipeline as part of a multi-phase approach to include note-batched proc...
Article
Full-text available
Clinical text de-identification enables collaborative research while protecting patient privacy and confidentiality; however, concerns persist about the reduction in the utility of the de-identified text for information extraction and machine learning tasks. In the context of a deep learning experiment to detect altered mental status in emergency d...
Article
Introduction: Insufficient patient enrollment in clinical trials remains a serious and costly problem and is often considered the most critical issue to solve for the clinical trials community. In this project, we assessed the feasibility of automatically detecting a patient's eligibility for a sample of breast cancer clinical trials by mapping co...
Chapter
Clinical research, being patient-oriented, is based predominantly on clinical data – symptoms reported by patients, observations of patients made by health-care providers, radiological images, and various metrics, including laboratory measurements that reflect physiological functions. Recently, however, a new type of data – genes and their products...
Article
Full-text available
Text de-identification is an application of clinical natural language processing that offers significant efficiency and scalability advantages. Hence, various learning algorithms have been applied to this task to yield better performance. Instead of choosing the best individual learning algorithm, we aim to improve de-identification by constructing...
Conference Paper
Text de-identification is an application of clinical natural language processing that offers significant efficiency and scalability advantages. We present three different ensemble methods that combine multiple de-identification models trained from deep learning, shallow learning, and rule-based approaches. Each model is capable of automated de-identification without manual m...
Conference Paper
The adoption of Electronic Health Record (EHR) systems is growing at a fast pace in the U.S., and this growth results in very large quantities of patient clinical data becoming available in electronic format, with tremendous potential, coupled with growing concern for patient confidentiality breaches. Secondary use of clinical data is essential to...
Conference Paper
Full-text available
The growing ecosystem of natural language processing (NLP) tools introduces a growing evaluation problem. Both developers and users need consistent tools to evaluate performance regardless of development environment, across teams, and between annotation schemata (i.e., annotation category definitions). Our motivation for developing ETUDE (Evaluat...
Conference Paper
Artificial Intelligence (AI) is developing at a fast pace in healthcare, enabled by cheap powerful computing resources and an important growth in patient information becoming available in electronic format. In healthcare, AI has already been applied in multiple domains, often either enabling decision support or providing the data analysis and knowl...
Article
Full-text available
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically...
Article
Full-text available
Quality reporting that relies on coded administrative data alone may not completely and accurately depict providers' performance. To assess this concern with a test case, we developed and evaluated a natural language processing (NLP) approach to identify falls risk screenings documented in clinical notes of patients without coded falls risk screeni...
Article
Classifying relations between pairs of medical concepts in clinical texts is a crucial task to acquire empirical evidence relevant to patient care. Due to limited labeled data and extremely unbalanced class distributions, medical relation classification systems struggle to achieve good performance on less common relation types, which capture valuab...
Conference Paper
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typicall...
Article
Full-text available
Background: We developed an accurate, stakeholder-informed, automated, natural language processing (NLP) system to measure the quality of heart failure (HF) inpatient care, and explored the potential for adoption of this system within an integrated health care system. Objective: To accurately automate a United States Department of Veterans Affai...
Conference Paper
Quality reporting that relies on coded administrative data alone may not completely and accurately depict providers’ performance. To assess this concern with a test case, we developed and evaluated a natural language processing (NLP) approach to identify falls risk screenings documented in clinical notes of patients without coded falls risk screeni...
Conference Paper
Full-text available
Classifying relations between pairs of medical concepts in clinical texts is a crucial task to acquire empirical evidence relevant to patient care. Due to limited labeled data and extremely unbalanced class distributions, medical relation classification systems struggle to achieve good performance on less common relation types, which capture valuab...
Article
Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selec...
Conference Paper
Terminologies or ontologies to describe patient-reported information are lacking. The development and maintenance of ontologies is usually a manual, lengthy, and resource-intensive process. To support the development of medical specialty-specific ontologies, we created a semi-automated ontology development and management system (SEAM). SEAM support...
Article
Full-text available
Background: Community-acquired pneumonia is a leading cause of pediatric morbidity. Administrative data are often used to conduct comparative effectiveness research (CER) with sufficient sample sizes to enhance detection of important outcomes. However, such studies are prone to misclassification errors because of the variable accuracy of discharge...